Entries by


Rattle 5.1 Released

Rattle, an open source GUI for data science and machine learning using R, has been updated to version 5.1 on CRAN and is available for download now. As always, the latest updates to Rattle are available from Bitbucket. A small but important update means that Rattle now works with the latest version of RGtk2 on […]

Setting up R for ML Tutorial

Preparation: For our Machine Learning in R tutorial, each participant is requested to install or obtain access to the free (as in libre) and open source software R and Rattle. Please complete this prior to the session itself, or else let me know of any issues you have installing. An Azure (cloud) Data Science Virtual […]

The Essentials of Data Science

My new book is now available from Amazon. From the cover: The Essentials of Data Science: Knowledge Discovery Using R presents the concepts of data science through a hands-on approach using free and open source software. It systematically drives an accessible journey through data analysis and machine learning to discover and share knowledge from data. […]

Open Source R on the Azure Ubuntu Data Science Virtual Machine

Data scientists rely on the freedom to innovate that is afforded by open source software. We often deploy an open source software stack based on Ubuntu GNU/Linux and the R Statistical Software. This provides a powerful environment for the management, wrangling, analysis, modeling, and presentation of data within a tool that supports machine learning and […]

Sharing our R Programs — With Style

This was originally shared as a Revolution Analytics Blog Post on 25th October 2016. Programming is an art and a way we express ourselves. As we write our programs we should keep in mind that someone else is very likely to be reading them. We can facilitate the accessibility of our programs through a clear presentation […]


Rattle 5.0.0 Alpha Released – ggraptR and Microsoft R Support

I have released an alpha version of Rattle with two significant updates. Eugene Dubossarsky and his team have been working on a Shiny interface to generate ggplot2 graphics interactively. It is a package called ggraptR. This is now available through Rattle’s Explore tab by choosing the Interactive option. In line with Rattle’s philosophy of teaching programming […]

A Grammar of Machine Learning: graml

Data scientists have access to a grammar for preparing data (Hadley Wickham’s tidyr package in R), a grammar for data wrangling (dplyr), and a grammar for graphics (ggplot2). At an R event hosted by CSIRO in Canberra in 2011, Hadley noted that we are missing a grammar for machine learning. At the time I doodled […]


Data Science explained Simply

A 5-video series called Data Science for Beginners has been released by Microsoft. It introduces practical data science concepts to a non-technical audience… making data science accessible – keeping the language clear and simple as an entry point to understanding data science.

http://aka.ms/data-science-for-beginners-1
http://aka.ms/data-science-for-beginners-2
http://aka.ms/data-science-for-beginners-3
http://aka.ms/data-science-for-beginners-4
http://aka.ms/data-science-for-beginners-5

Graham @ Microsoft