Clyde Street

Learning, Teaching, Performing


1 Comment

Off to the ASTN Annual Conference 2 November

I am travelling to Geelong tomorrow to attend the first annual conference of the Australian Sports Technologies Network (ASTN).

You can find a copy of the program here.

The conference is preceded by a Sports Tech Commercialisation Boot Camp.

University partners in the ASTN have an opportunity to present some background about their interest in Sport Technology.

This is the presentation I will make on behalf of the University of Canberra.


Leave a comment

Dealing With Data Deluge

I have spent much of the last two days in conversations with coaches about personalising learning environments for athletes and their colleagues.

I think this ability to personalise coaching and modulate training is a characteristic of the (+) of the coach I discussed here.

A recurring theme in conversations has been the growth in pervasive sensing data in training and competition environments. I have become increasingly interested in how computational intelligence might help with these data.

Whilst contemplating these issues, I received an alert from The Scholarly Kitchen to Todd Carpenter’s Does All Science Need to be Preserved? Do We Need to Save Every Last Data Point?

In his post Todd observes:

There are at present few best practices for managing and curating data. Libraries have developed, over the decades, processes and plans for how to curate an information collection and to “de-accession” (i.e., discard) unwanted or unnecessary content. At this stage in the development of an infrastructure for data management, there is no good understanding of how to curate a data collection. This problem is compounded by the fact that we are generating far more data than we have capacity to store or analyze effectively.

He notes “the much deeper questions of large datasets and what to preserve, at what level of detail and granularity, and whether all data is equally important to preserve are questions that have yet to be fully addressed”.

Todd pointed to Kelvin Droegemeier’s presentation, A Strategy for Dynamically Adaptive Weather Prediction: Cyberinfrastructure Reacting to the Atmosphere, delivered to the U.S. National Academies Board on Research Data and Information (a copy of the presentation is here).

In his presentation, Kelvin asked a fundamental research question: “Can we better understand the atmosphere, educate more effectively about it, and forecast more accurately if we adapt our technologies and approaches to the weather as it occurs?”.

To do so, Kelvin noted the need to “Revolutionize the ability of scientists, students, and operational practitioners to observe, analyze, predict, understand, and respond to intense local weather by interacting with it dynamically and adaptively in real time”. He emphasised the need for adaptive systems and the provision of a service-oriented architecture.

This architecture for Linked Environments for Atmospheric Discovery is outlined in slide 19 of his presentation.

In his discussion of Kelvin’s paper, and the issue of volume of data, Todd suggests:

One can certainly maintain the highest grain data if in retrospect it was an extraordinary discovery or event. However, if fine grain detail was collected and nothing of consequence occurred, does that fine-grain detail need to be preserved? Probably not, without some other specific reason to do so. Obviously, this is a simplification, since you will want to retain some version of the data collected for re-analysis, but the raw data and the resolution of that data need not be preserved on an ongoing basis.

I do think these are issues of importance for sport as volumes of pervasive sensing data are acquired. I see significant parallels between the prospective study of injury risk and Kelvin’s discussion of local weather variation.
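Todd’s rule of thumb (keep the fine grain when something extraordinary occurred, thin it otherwise) maps directly onto athlete monitoring. Here is a minimal sketch of such an event-triggered retention policy in Python; the Sample type, the upstream event detector and all the thresholds are my own assumptions for illustration, not anything proposed in Todd’s post:

```python
from dataclasses import dataclass

@dataclass
class Sample:
    t: float      # timestamp in seconds
    value: float  # e.g. an accelerometer magnitude

def curate(samples, event_times, window=5.0, keep_every=10):
    """Keep full-resolution data near events; thin the rest.

    samples: time-ordered Sample objects
    event_times: timestamps flagged by an upstream detector
        (injury, collision, personal best - whatever 'of
        consequence' means in context)
    window: seconds either side of an event to keep in full
    keep_every: retention stride for uneventful stretches
    """
    curated = []
    for i, s in enumerate(samples):
        near_event = any(abs(s.t - e) <= window for e in event_times)
        if near_event or i % keep_every == 0:
            curated.append(s)
    return curated

# Toy run: a 100 Hz stream over 60 s with one flagged incident at t = 30 s.
stream = [Sample(i / 100.0, 0.0) for i in range(6000)]
kept = curate(stream, event_times=[30.0])
print(f"{len(stream)} samples -> {len(kept)} retained")
```

The design choice to debate is what counts as an event worth full resolution: an injury, a personal best, or an anomalous load spike, for example.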

There are important issues related to curation too. I see Todd’s post as an excellent introduction to the granularity of data and the decisions we make about the costs and benefits of collecting and storing data. A recent paper (Balli and Korukoğlu, 2012) raises an interesting question about how early these data can be collected for talent identification purposes.

Photo Credit

The British Coach Giving a Few Weight Lifting Hints


1 Comment

Goal-Line Technology Update: May 2012

iSportConnect carried a story today about the use of goal-line technology in football.

The story indicates that:

  • England’s friendly against Belgium at Wembley on 2 June will be used to test the Hawk-Eye goal-line technology system.
  • Independent testers from the Swiss Federal Laboratories for Materials Science and Technology and representatives from FIFA will monitor the system.
  • It will not be available for use by the match officials on 2 June.
  • The Hawk-Eye system was tested in an English minor league cup final between Eastleigh and AFC Totton in April.
  • An alternative system, GoalRef, has undergone trials at two Danish league matches this month.
  • The International Football Association Board will review reports from these trials when it considers the introduction of goal-line technology at its special meeting in July.

Photo Credit

Goal!


Leave a comment

Enterprise Computing Conference: ECC@UC

The Faculty of Information Sciences and Engineering is hosting an Enterprise Computing Conference with IBM in the Ann Harding Centre on the University campus (15-16 May).

The program is here.

Senator Kate Lundy opened the Conference. In her introduction she noted the growth of enterprise and cloud computing and their role in the digital economy. She underscored the transformational potential of these approaches.

Senator Lundy discussed the potential of the National Broadband Network infrastructure in supporting innovation. She reminded delegates about George Gilder’s Law: “bandwidth grows at least three times faster than computer power.”

In the next part of her talk Senator Lundy discussed the skills required to optimise connectivity. She noted the IBM and University of Melbourne partnership and indicated the potential of a Marist/IBM model at the University of Canberra.

In her conclusion Senator Lundy affirmed the importance of collaboration to develop the skills required for a digital economy to harvest the benefits of enterprise and cloud computing. She exhorted delegates to use the conference to explore the transformational opportunities available from a connected society.



2 Comments

Probability and Sensemaking

This post evolved out of two links I received yesterday.

The first came from Stephen Downes’ discussion of evidence-based decision-making; the second from an IBM post, The Next 5 in 5.

Both links extended my interest in a probabilistic approach to decision-making.

Stephen linked to S. L. Zabell’s discussion of Rudolf Carnap’s Logic of Inductive Inference. I noted Zabell’s observation that:

Carnap’s interest in logical probability was primarily as a tool, a tool to be used in understanding the quantitative confirmation of an hypothesis based on evidence and, more generally, in rational decision making.

The paper is a detailed mathematical account of Carnap’s work and is way beyond my allocation of logical-mathematical intelligence. However, I did note:

Carnap recognized the limited utility of the inductive inferences that the continuum of inductive methods provided, and sought to extend his analysis to the case of analogical inductive inference: an observation of a given type makes more probable not merely observations of the exact same type but also observations of a “similar” type. The challenge lies both in making precise the meaning of “similar”, and in being able to then derive the corresponding continua (p.294).

in the application of inductive logic to a given knowledge situation, the total evidence available must be taken as basis for determining the degree of confirmation (p.298).

Carnap advanced his view of “the problem of probability”. Noting a “bewildering multiplicity” of theories that had been advanced over the course of more than two and a half centuries, Carnap suggested one had to carefully steer between the Scylla and Charybdis of assuming either too few or too many underlying explicanda, and settled on just two. These two underlying concepts Carnap called probability1 and probability2: degree of confirmation versus relative frequency in the long run (p.300).
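Zabell’s account makes the continuum of inductive methods concrete: the probability that the next observation is of type i, given n_i observations of that type among n, is (n_i + λ/k)/(n + λ) for k possible types. A toy Python rendering of that rule as I read it from the paper (the example numbers are mine, not from Zabell):

```python
def carnap_c(counts, outcome, lam, k=None):
    """Carnap's continuum of inductive methods:

        c_lambda(next = outcome | evidence) = (n_i + lam/k) / (n + lam)

    counts: dict of outcome -> observed count (the evidence)
    lam:    lambda; 0 gives the bare relative frequency (the
            probability2 flavour), larger values lean on the
            'logical' prior over the k possible outcomes.
    """
    k = k or len(counts)
    n = sum(counts.values())
    return (counts.get(outcome, 0) + lam / k) / (n + lam)

evidence = {"success": 7, "failure": 3}
for lam in (0.0, 2.0, 100.0):
    print(lam, round(carnap_c(evidence, "success", lam), 3))
# 0.0   -> 0.7   (straight relative frequency)
# 2.0   -> 0.667 (Laplace's rule of succession for k = 2)
# 100.0 -> 0.518 (the evidence is increasingly discounted)
```

A single λ thus interpolates between the two explicanda, which is one way to see why Carnap could settle on just two underlying concepts.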

Zabell concludes that “Carnap’s most lasting influence was more subtle but also more important: he largely shaped the way current philosophy views the nature and role of probability, in particular its widespread acceptance of the Bayesian paradigm” (p.305).

Stephen Downes cited Zabell in a discussion of an evidence-based approach to demonstrating value in the future of libraries. In this discussion Brian Kelly argues that:

We do need to continue to gather evidence of the value of services, and not just library services. But we need to understand that the evidence will not necessarily justify a continuation of established approaches to providing services. And if evidence is found which supports the view that libraries will be extinct by 2020 then the implications need to be openly and widely discussed.

I picked up on Brian’s assertion that “the evidence will not necessarily justify a continuation of established approaches to providing services” and pondered the implications for my use of probabilistic approaches to sport performance. Rudolf Carnap’s work, via S. L. Zabell’s synthesis, has pushed me to consider how much evidence I need to support an approach to performance in sport that can describe, predict, model and transform. I am going to follow up on Brian Skyrms and Jack Good too, particularly in relation to ‘total evidence’!

By coincidence, just as I was coming up for air after an early morning visit to induction and evidence, I received an Economist alert to IBM’s 5 in 5 predictions (at the end of each year, IBM examines market and societal trends expected to transform our lives, as well as emerging technologies from IBM’s global labs, to develop a multi-year forecast).

The fifth item in the list is Big Data & sensemaking engines start feeling like best friends. Jeff Jonas writes “Here at IBM we are working on sensemaking technologies where the data finds the data, and relevance finds you. Drawing on data points you approve (your calendar, routes you drive, etc.), predictions from such technology will seem ingenious.”

He adds that:

This new era of Big Data is going to materially change what is possible in terms of prediction. Much like the difference between a big pile of puzzle pieces versus the pieces already in some state of assembly – the latter required to reveal pictures. This is information in context, and while some pieces may be missing; some may be duplicates; others have errors; and a few are professionally fabricated lies – nonetheless, what can be known only emerges as the pieces come together (data finding data).
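To make the puzzle-piece metaphor concrete for myself I sketched a toy version of ‘data finds data’: each arriving record is pulled into an existing context whenever it shares an attribute value with something already seen. This is only my illustration of the idea, not Jonas’s G2, and the records are invented:

```python
from collections import defaultdict

class ContextAccumulator:
    """Toy 'data finds data': each arriving record is linked to any
    earlier record that shares an attribute value with it, so related
    pieces assemble as they arrive rather than waiting for a query."""

    def __init__(self):
        self.parent = {}   # record id -> representative (union-find)
        self.index = {}    # attribute value -> a record id seen with it

    def _find(self, r):
        while self.parent[r] != r:
            self.parent[r] = self.parent[self.parent[r]]  # path halving
            r = self.parent[r]
        return r

    def add(self, rid, attrs):
        self.parent[rid] = rid
        for value in attrs:
            if value in self.index:
                # the new data 'finds' the old data via the shared value
                self.parent[self._find(rid)] = self._find(self.index[value])
            self.index[value] = rid

    def contexts(self):
        groups = defaultdict(set)
        for r in self.parent:
            groups[self._find(r)].add(r)
        return list(groups.values())

acc = ContextAccumulator()
acc.add(0, {"j.smith@example.com", "0412 000 111"})
acc.add(1, {"0412 000 111", "12 High St"})
acc.add(2, {"someone.else@example.com"})
acc.add(3, {"12 High St", "J Smith"})
print(acc.contexts())  # records 0, 1 and 3 assemble into one context
```

Even in this toy, the picture only emerges as the pieces come together: no single record says that J Smith at 12 High St owns that email address, but the assembled context does.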

Earlier in the year Jeff wrote about massively scalable sensemaking analytics. His post has links to his other writing in this area:

Sensemaking Systems Must be Expert Counting Systems; Data Finds Data; Context Accumulation; Sequence Neutrality and Information Colocation; and new techniques to harness the Big Data/New Physics phenomenon.

Jeff’s G2 system embodies Privacy by Design (PbD) and has seven features: Full Attribution; Data Tethering; Analytics in the Anonymized Data Space; Tamper-Resistant Audit Logs; False Negative Favoring Methods; Self-Correcting False Positives; and Information Transfer Accounting.

In 2010 Jeff discussed sensemaking and observed that:

Man continues to chase the notion that systems should be capable of digesting daunting volumes of data and making sufficient sense of this data such that novel, specific, and accurate insight can be derived without direct human involvement.

I found his explanation of the role of expert counting in sensemaking very helpful. I am disappointed that I have not accessed Jeff’s work until now.

It is fascinating that two early morning links can open up such a rich vein of discovery. At the moment I am particularly interested in how records can be used to inform decision making and what constitutes necessary and sufficient evidence to transform performance.

I have a lot of New Year reading to do!

Photo Credits

What’s That? (97)

Sensemaking

Computer Wire Art


1 Comment

Growing Sport Analytics

I have been thinking a great deal about transforming performance and leading ahead of the curve this year. By coincidence this year is ending with the screening of Moneyball.

I have been interested in particular in how secondary data can inform and support coaches and the coaching process.

One of the catalysts in my thinking has been Usama Fayyad. I heard Usama speak at a knowledge discovery in databases conference in Sydney in 2005. His 1996 paper From Data Mining to Knowledge Discovery in Databases co-written with Gregory Piatetsky-Shapiro and Padhraic Smyth was my first engagement with a domain of enquiry that has become a primary focus for me.

In their 1996 paper the authors point out that:

Across a wide variety of fields, data are being collected and accumulated at a dramatic pace. There is an urgent need for a new generation of computational theories and tools to assist humans in extracting useful information (knowledge) from the rapidly growing volumes of digital data. These theories and tools are the subject of the emerging field of knowledge discovery in databases (KDD).

I spent much of the 1980s and 90s collecting data about performance in rugby union and a number of other sports. All of these data were collected with hand notation systems. My interest, then and now, is in the patterns of observable individual and team behaviour.

I saw an early copy of Michael Lewis’s book Moneyball: The Art of Winning an Unfair Game (2003) and was attracted intuitively to the power of sabermetrics. I saw in Bill James’s work the passion that fired Charles Reep in his observations of association football. Michael, Bill and Charles had a predecessor in Hugh Fullerton, who in 1910 wrote about The Inside Game of baseball.

In his paper, Hugh Fullerton points out:

Last season (1909) I arranged with scorers to record hits of various kinds, and secured the scores thus kept on 40 Central League games, 26 American Association games, and fourteen college games to compare with major league scores kept in the same manner. In the college games one grounder in every 8 1/3 passed the infielders. In the Central League one in 10 7/12, in the American Association one in 12 2/43, and in the American and National Leagues (45 games of my own scoring) one in every 15 3/16.

He adds that:

The figures were amazing, as they followed so closely the classification of the leagues. They proved that there is a reason for the “class”, but the proof is not found in the mathematics, but in two words (unless you hyphenate them), “team work.”
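Fullerton’s fractions are easier to compare once converted to percentages. A quick check in Python, taking the figures exactly as quoted (the league labels are mine):

```python
from fractions import Fraction

# Fullerton's 1910 figures, as quoted: one grounder in every N
# passed the infielders.
one_in_n = {
    "College games":            Fraction(25, 3),    # 8 1/3
    "Central League":           Fraction(127, 12),  # 10 7/12
    "American Association":     Fraction(518, 43),  # 12 2/43
    "American/National League": Fraction(243, 16),  # 15 3/16
}
for league, n in one_in_n.items():
    print(f"{league:25s} {float(100 / n):4.1f}% of grounders get through")
```

The escape rate falls monotonically, from 12.0% in the college games through 9.4% and 8.3% to 6.6% in the majors, which is precisely the “classification of the leagues” Fullerton describes.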

For Hugh Fullerton the inside game is “the art of getting the hits that ‘he couldn’t have got anyhow’”.

102 years on from Hugh Fullerton’s research there is a growing community of practice in sport analytics. A recent example of this growth was a Sports Analytics conference held in Manchester in November.

Speakers at the conference included: Bill Gerrard, Raffaele Poli, Simon Wilson, David Fallows, Gavin Fleig, Ed Sulley, Ian Lenagan, Rob Lowe, Fergus Connolly, Ian Graham, Nick Broad and Steve Houston.

Usama Fayyad, Gregory Piatetsky-Shapiro and Padhraic Smyth point out that KDD is “the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data”.

I believe they have provided a fundamental guide to the family resemblances that characterise the analysis of sport performance:

Here, data are a set of facts (for example, cases in a database), and pattern is an expression in some language describing a subset of the data or a model applicable to the subset. Hence, in our usage here, extracting a pattern also designates fitting a model to data; finding structure from data; or, in general, making any high-level description of a set of data. The term process implies that KDD comprises many steps, which involve data preparation, search for patterns, knowledge evaluation, and refinement, all repeated in multiple iterations. By nontrivial, we mean that some search or inference is involved; that is, it is not a straightforward computation of predefined quantities like computing the average value of a set of numbers.
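Their definition reads almost like pseudocode: prepare the data, search for patterns, evaluate, refine, repeat. A toy single pass of that loop over hand-notated possession strings of the kind my notation systems produced (the event codes and thresholds are invented for illustration):

```python
from collections import Counter

# Hand-notated possessions, in the spirit of 1980s-90s notation
# systems. The event codes are invented: P = pass, R = ruck,
# K = kick, T = tackle, S = scrum.
possessions = ["PPRK", "PRPK", "PPRT", "SPPRK", "PPK"]

def kdd_pass(sequences, length=3, min_support=2):
    """A single iteration of the KDD loop on notational data.

    preparation: slice possessions into fixed-length event n-grams
    search:      count the candidate patterns
    evaluation:  keep those meeting a minimum support threshold
    (refinement would adjust length/min_support and repeat)
    """
    counts = Counter(
        seq[i:i + length]
        for seq in sequences
        for i in range(len(seq) - length + 1)
    )
    return {p: c for p, c in counts.items() if c >= min_support}

print(kdd_pass(possessions))  # -> {'PPR': 3, 'PRK': 2}
```

The nontrivial part, as the authors insist, is everything around the counting: deciding what to notate, judging which recurrent patterns are valid and useful, and iterating.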

I am hopeful that 2012 will provide opportunities to share data throughout the community of practice that is sport analytics.

Photo Credits

Miracles

Australian bowler, Bill O’Reilly, demonstrates his famous grip


2 Comments

Launch of the Human-Centred Computing Laboratory

This Wednesday sees the official launch of the Human-Centred Computing Laboratory (HCC Lab) at the University of Canberra.

Guests and colleagues will be welcomed by Michael Wagner, Director of the HCC Lab, by Dharmendra Sharma, the Dean of the Faculty of Information Sciences and Engineering and by Frances Shannon, the Deputy Vice-Chancellor (Research) of the University.

The guest of honour, Senator Kate Lundy, will launch the HCC Lab.

Thereafter there will be a full-day workshop on Biometrics, Vision and Sensing, and Sport Information and Informatics. Kuldip Paliwal (Griffith University) and Michael Breakspear will give the keynote addresses. Kuldip will discuss Modulation-domain speech enhancement and Michael’s topic is Quantitative assessment in psychiatry using embedding technologies.

Other presentations will be given by Michael Wagner (anger detection), Denis Burnham (speech), Roland Goecke (facial expression), Yuko Kinoshita (voice comparison), Girija Chetty (information fusion), Tania Churchill (training adaptation in sport) and me (sport information and informatics).

My paper is titled Greater Than The Sum Of Our Parts? I will report on my work to date with colleagues in the Faculty of Information Sciences and Engineering, consider the opportunities for research and practice in Sports Information and Sports Informatics, and conclude with a discussion of open access, interdisciplinary research in a connected University.

My key points will be:

  • The importance of collaborative enquiry
  • The increasing synthesis of sport information and sport informatics
  • The opportunities for open access to research processes and outcomes
  • The interdisciplinary riches inherent in a human-centred computing laboratory

I suggest that the observation, recording and analysis of performance provide the catalysts for interdisciplinary endeavour in the HCC Lab.

Some images to illustrate work underway:

Photo Credit

Banksy

Machine Learning