I have released an alpha version of Rattle with two significant updates.

Eugene Dubossarsky and his team have been working on a Shiny interface to generate ggplot2 graphics interactively. It is a package called ggraptR. This is now available through Rattle’s Explore tab by choosing the Interactive option.

In line with Rattle’s philosophy of teaching the programming of data by exposing all code through Rattle’s Log tab, ggraptR has a button to generate the code for the plot. Click the Generate Plot Code button, copy the resulting code, and paste it into the R console, a knitr document, or a Jupyter notebook. Execute the code to reproduce the plot, then fine-tune it further if you like.
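As a rough illustration of that workflow, here is a minimal sketch of the kind of ggplot2 code you might paste into an R session and then adjust. It is not actual ggraptR output: the dataset (mtcars), the aesthetics, and the labels are all assumptions chosen for the example.

```r
# Illustrative only: NOT real ggraptR output. The dataset (mtcars) and
# the chosen aesthetics are assumptions made for this example.
library(ggplot2)

# The sort of scatter plot code one might copy from a code generator.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(size = 2) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")

print(p)

# Fine-tuning after pasting: add a linear trend per group and a lighter theme.
p_tuned <- p +
  geom_smooth(method = "lm", se = FALSE) +
  theme_minimal()

print(p_tuned)
```

Because the plot is an ordinary ggplot object, further layers, scales, and themes can be added with `+` in the usual way.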

The current alpha version has a few niggles that are being sorted out but it is already worth giving it a try.

The second major update is initial support for Microsoft R Server, so that Rattle can now handle datasets of any size. From Rattle’s Data tab, choose an XDF file to load.

A sample of the full (generally big) dataset is loaded into memory, but many of the usual operations are performed on the XDF dataset on disk. For example, build a decision tree and Rattle will automatically choose rxDTree() for the XDF dataset instead of rpart().
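The choice between the two tree builders can be sketched roughly as follows. This is not Rattle’s internal code: the helper name `build_tree` and the rule of dispatching on a `.xdf` file path are assumptions made for illustration, and the rxDTree() branch only works where RevoScaleR (Microsoft R Server or R Client) is installed. The in-memory branch uses the kyphosis data that ships with rpart.

```r
library(rpart)

# Hypothetical helper (not part of Rattle): choose the tree builder
# according to where the data lives.
build_tree <- function(formula, data) {
  # Assumption: an on-disk dataset is given as a path ending in .xdf.
  on_disk <- is.character(data) &&
    grepl("\\.xdf$", data, ignore.case = TRUE)

  if (on_disk) {
    # RevoScaleR is only available with Microsoft R Server / R Client.
    if (!requireNamespace("RevoScaleR", quietly = TRUE)) {
      stop("RevoScaleR is required to build a tree on an XDF file")
    }
    RevoScaleR::rxDTree(formula, data = RevoScaleR::RxXdfData(data))
  } else {
    # Ordinary in-memory data frame: fall back to rpart.
    rpart(formula, data = data, method = "class")
  }
}

# In-memory example with a binary target.
model <- build_tree(Kyphosis ~ Age + Number + Start, kyphosis)
print(model)
```

With an XDF file, the call would instead look like `build_tree(target ~ ., "mydata.xdf")`, where both the target name and the file name are placeholders.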

Visualise the tree as usual.

Performance evaluation is also currently supported.
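For readers unfamiliar with this kind of evaluation, here is a minimal sketch of an error matrix for a binary classifier. It uses an in-memory rpart model on the kyphosis data that ships with rpart, not an XDF dataset, and it is not the code Rattle writes to its Log tab. For brevity it scores the training data itself, which gives an optimistic error estimate.

```r
library(rpart)

# Fit a small classification tree on a binary target.
model <- rpart(Kyphosis ~ Age + Number + Start,
               data = kyphosis, method = "class")

# Predicted classes. Scoring the training data keeps the example short
# but makes the error estimate optimistic.
predicted <- predict(model, newdata = kyphosis, type = "class")

# Error matrix: actual classes in rows, predicted classes in columns.
error_matrix <- table(Actual = kyphosis$Kyphosis, Predicted = predicted)
print(error_matrix)

# Overall error rate: proportion of observations off the diagonal.
overall_error <- 1 - sum(diag(error_matrix)) / sum(error_matrix)
print(overall_error)
```

In practice the matrix would be computed on a held-out validation or test partition rather than on the training data.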

Do check the Log tab to review the commands that were executed underneath.

This is an initial release. There’s still plenty of functionality to expose. Currently implemented for Binary Classification:

  • Data: Load XDF file;
  • Explore: Subset the dataset for interactive exploration;
  • Models: rxDTree, rxDForest;
  • Evaluate: Error Matrix, Risk Chart.

Still to come:

  • Data: Import CSV;
  • Models: boosting, neural network, svm.

You can try this new version out either by using Microsoft R Client on MS/Windows or by firing up an Azure Linux Data Science Virtual Machine, which comes with the developer version of Microsoft R Server installed. Then upgrade the pre-installed Rattle to this new release.

> install.packages(c("rattle", "devtools"))
> devtools::install_bitbucket("kayontoga/rattle")

Graham @ Microsoft

Microsoft has released a five-video series called Data Science for Beginners. It introduces practical data science concepts to a non-technical audience, making data science accessible by keeping the language clear and simple as an entry point to understanding the field.

http://aka.ms/data-science-for-beginners-1
http://aka.ms/data-science-for-beginners-2
http://aka.ms/data-science-for-beginners-3
http://aka.ms/data-science-for-beginners-4
http://aka.ms/data-science-for-beginners-5

Graham @ Microsoft