The Big Data Summit was held at the University of Technology, Sydney, on 28 October 2014. The keynote was presented by Dr Usama Fayyad, and I had the pleasure of introducing him. Here is my introduction.

Good morning and welcome to our keynote presentation.

It is indeed an honour to introduce our most distinguished keynote presenter, Dr Usama Fayyad.

Today, Usama is Chief Data Officer and Group Managing Director at Barclays Bank in London, where his responsibilities include data governance, information risk management, and the data infrastructure for business intelligence, data warehousing, big data and analytics technologies across the Barclays Group globally.

…and that is just the beginning of a page of achievements and contributions that Usama has made to data mining, knowledge discovery, data science, and big data. I invite you to visit the Big Data Summit web page to read a brief account of Usama’s journey. This is a journey from his PhD at the University of Michigan in 1991, through NASA’s Jet Propulsion Laboratory, into the pre-burn of Microsoft, and propelled to becoming the industry’s first Chief Data Officer with Yahoo. Usama also has a string of tech startups in the US, in Jordan, and beyond, to his name.

Clearly the one-page overview is only a brief summary of his achievements and of the multitude of industry and academic contributions and awards he has received over very many years. I first met Usama in 1995 in Montreal.

Here in Australia we had established the first data mining research group in 1993, located within CSIRO in Canberra, at the Australian National University. We were working closely with industry and government, and particularly with Dr Warwick Graco’s group at the Health Insurance Commission, which was pioneering the use of data mining in practice in Australia.

We had heard of Usama as a data mining pioneer whom we should invite to Australia to share his wisdom with us as we developed our data mining capabilities.

I attended IJCAI in Montreal that year. This also saw the establishment of the first of the KDD conferences, which Usama has been central in initiating and nurturing. Usama visited us in CSIRO in Canberra in 1995, and also visited UTS in Sydney before and after the visit. This was the first of many visits to Australia. It was also a landmark moment when, in Sydney, he took the challenge to shave his head, thus creating the image of the man we know so well today.

Obviously we were well advised by Usama in CSIRO and developed a strong data mining capability that has fed into many other teams in research and practice throughout Australia.

Over the years, from every talk of Usama’s that I have attended, I have had the pleasure of coming away with new ideas and new insights about the technology, about the industry, about research directions, and always with good humour. Last year at this very same Big Data Summit I appreciated gaining current insights around analytics-in-database developments and where the industry was heading.

I am again keenly looking forward to Usama’s keynote, and I welcome Usama to the stage.

Rattle 3.1.0 is now available on CRAN. This represents ongoing bug fixes and functionality improvements.

Specific updates include:

  • Numerous updates of plots to use ggplot2 rather than base graphics: ROC curves, riskchart, box plots, histogram plots, pairs plot, Benfords. Advanced Graphics is now the default, reverting to traditional graphics where needed. The migration to ggplot2 is ongoing.
  • Added new Benfords functionality.
  • Added a rescale option to kmeans.
  • New psfchart() for evaluation.
  • New function normVarNames() to normalise variable names to a standard preferred style.
  • Evaluate -> Error Matrix has been updated to report averaged class error and to report class errors.
  • Evaluate -> PrvOb plot bug fix for non-missing data.
  • INSTALL: Remove old INSTALL file – visit for installation instructions.
  • plotNetwork() has been removed – not used by Rattle and generally of limited use. See for the code.
  • No longer report repository revision number in version or about.
  • Miscellaneous bug fixes and stability improvements.
  • weatherAUS dataset is up-to-date.
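
As an illustration of the Error Matrix item above, here is a minimal Python sketch of one common way to derive per-class errors and their unweighted average from a matrix of counts. It is not Rattle's R code: the function names and example counts are hypothetical, and treating the averaged class error as the plain mean of the per-class error rates is an assumption made for illustration.

```python
def class_errors(matrix):
    """Per-class error rates for a square error (confusion) matrix.

    Assumes matrix[i][j] counts observations of actual class i
    that were predicted as class j.
    """
    errors = []
    for i, row in enumerate(matrix):
        total = sum(row)
        # A class with no observations has no defined error; report 0.0.
        errors.append(0.0 if total == 0 else (total - row[i]) / total)
    return errors


def averaged_class_error(matrix):
    """Unweighted mean of the per-class error rates (assumed definition)."""
    errors = class_errors(matrix)
    return sum(errors) / len(errors)


def overall_error(matrix):
    """Proportion of all observations that were misclassified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return (total - correct) / total


# Hypothetical counts: rows are actual classes, columns are predictions.
counts = [[90, 10],
          [15, 35]]

print(class_errors(counts))
print(averaged_class_error(counts))
print(overall_error(counts))
```

With unbalanced classes the averaged class error can differ noticeably from the overall error, because each class contributes equally regardless of how many observations it has.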

To display a pairs plot, go to the Explore tab, choose the Distributions option, and click Execute.

Click the Box Plot check box for MinTemp.

On the Model tab, choose the Decision Tree option, then Execute, then the Draw button.

On the Evaluate tab, choose the Risk Chart option to evaluate the performance of the classification model.

Welcome to the New Togaware Presence.

Here you will find resources for the Data Scientist.

The site is under development. More material is available from