I’ve had a few enquiries lately on the relationship between Rattle and the WebFOCUS predictive analytics component called RStat from Information Builders (WebFOCUS is their widely used Business Intelligence suite).

I even had an approach recently from an Australian company offering to provide a demonstration of RStat to see if it might be something we could use for our Data Mining.

Yes, RStat is a fork of Rattle. We developed the initial fork through Togaware back in 2008 or thereabouts. It is an extension to Rattle that allows direct integration into WebFOCUS. Those familiar with Rattle will recognise the following plots generated from RStat.

A WebFOCUS Business Intelligence user can launch RStat as an integrated component of the suite. That is a nice option and users can seamlessly deploy the power of R to do their predictive analytics.

One of the first benefits of the integration is that it allows data to be imported directly and seamlessly into RStat from WebFOCUS. The significance of this is that WebFOCUS has an admirable collection of data import modules and so data from pretty much any source can be integrated into WebFOCUS and thus into RStat (Rattle).

The other significant addition found in RStat is the ability to directly export R models as C code into WebFOCUS. These then become compilable C code modules and hence are first class WebFOCUS objects. These objects are then deployed within the WebFOCUS environment. Anyone familiar with the challenge of deploying R models (or Python models) into a production environment will recognise the significance of this functionality – models are automatically deployable into production without the recoding requirements we as Data Scientists often face.

Information Builders now maintain RStat. Also have a look at the RStat Fact Sheet for details.

Cheers

I’ve moved the hosting of the open source Rattle GUI for doing Data Mining with R onto Bitbucket using git. Developers can now clone the repository, modify the code, and submit pull requests.

Visit https://bitbucket.org/kayontoga/rattle

Connect-R is a relatively new service which acts as a marketplace for matching requests for improvements to R packages with developers who may be able to implement them. There is also the option to crowdfund the development. I’ve started encouraging users of Rattle to add feature requests through Connect-R.

A number of new features have now been added to Rattle through Connect-R, with the developer receiving payment for the work. The implementations were well done and I readily included them in the Rattle release code.

Rattle 3.1.0 is now available on CRAN. This represents ongoing bug fixes and functionality improvements.

Specific updates include:

  • Numerous updates of plots to use ggplot2 rather than base graphics: ROC curves, riskchart, box plots, histogram plots, pairs plot, Benfords. Advanced Graphics is now the default, reverting to traditional graphics where needed. The migration to ggplot2 is ongoing.
  • Added new Benfords functionality.
  • Added a rescale option to kmeans.
  • New psfchart() for evaluation.
  • New function normVarNames() to normalise variable names to a standard preferred style.
  • Evaluate -> Error Matrix has been updated to report averaged class error and to report class errors.
  • Evaluate -> PrvOb plot bug fix for non-missing data.
  • INSTALL: Remove old INSTALL file – visit rattle.togaware.com for installation instructions.
  • plotNetwork() has been removed – not used by Rattle and generally of limited use. See onepager.togaware.com for the code.
  • No longer report repository revision number in version or about.
  • Miscellaneous bug fixes and stability improvements.
  • weatherAUS dataset is up-to-date.
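
To make the Error Matrix change a little more concrete, here is a minimal sketch of how per-class errors and an averaged class error can be derived from a matrix of counts. It is written in Python for illustration only and is not Rattle's own code. It assumes rows are actual classes, columns are predicted classes, and that the averaged class error is the unweighted mean of the per-class errors. The function names and the example counts are made up.

```python
# Illustrative sketch only: not Rattle's implementation.
# Assumption: rows are actual classes, columns are predicted classes.

def class_errors(matrix):
    """Return the error rate for each actual class (None if the class is empty)."""
    errors = []
    for i, row in enumerate(matrix):
        total = sum(row)
        # The diagonal cell holds the correctly classified observations.
        errors.append(None if total == 0 else 1 - row[i] / total)
    return errors


def averaged_class_error(matrix):
    """Unweighted mean of the per-class error rates that are defined."""
    defined = [e for e in class_errors(matrix) if e is not None]
    return sum(defined) / len(defined)


def overall_error(matrix):
    """Proportion of all observations that fall off the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 1 - correct / total


# Made-up counts for a two-class problem with unbalanced classes.
counts = [[90, 10],
          [20, 30]]

print(class_errors(counts))
print(averaged_class_error(counts))
print(overall_error(counts))
```

With these made-up counts the class errors are 10% and 40%, so the averaged class error is 25% while the overall error is 20%. The gap shows why reporting both is useful when one class dominates the data.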

Display a pairs plot under the Explore tab by choosing the Distributions option and clicking Execute.

Click the Box Plot check box for MinTemp.

On the Model tab, choose the Decision Tree option, then Execute, then the Draw button.

On the Evaluate tab, choose the Risk Chart option to evaluate the performance of the classification model.
