Togaware DATA MINING
Desktop Survival Guide
by Graham Williams

Interacting with R

The basic, yet also very powerful, paradigm for interacting with R is that of writing sentences in a language. For this you need an editor, ideally one that supports R syntax highlighting in colour, parenthesis checking, command completion, and code evaluation by R. One such editor, and highly recommended, is Emacs with the ESS package. Whichever editor you prefer, a favoured mode of operation is to write your R sentences, using your editor, into a file, and then ask R to evaluate the instructions you have provided. Such a practice will ensure you work efficiently and capture the results of your data understanding, data cleaning, data transformations, and data mining. It will also ensure your work is repeatable: as your data changes, you can simply re-run your processes, as expressed in your script files.
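As a minimal sketch of this mode of operation, the following writes a few R sentences into a script file and then asks R to evaluate them with source(). The file name summarise.R, the use of the built-in iris dataset, and the variable petal.summary are assumptions made purely for illustration. In practice you would type the script into your editor and save it yourself rather than generate it with writeLines().

```r
# Create a small script file. Normally this file would be written and
# saved from your editor; writeLines() is used here only so that the
# example is self-contained. The file name is hypothetical.
script <- file.path(tempdir(), "summarise.R")
writeLines(c(
  "# summarise.R: load a dataset and summarise one variable",
  "data(iris)",
  "petal.summary <- summary(iris$Petal.Length)",
  "print(petal.summary)"
), script)

# Ask R to evaluate every sentence in the file, echoing each one
# as it is run.
source(script, echo=TRUE)
```

If the data changes, nothing needs to be retyped: calling source() on the same file again repeats the whole process.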

While graphical user interfaces (GUIs) provide an easy path into using a tool, you very quickly lose the ability to capture your processes and, by staying within the GUI, you soon find that you lack the functionality and flexibility that a full language would provide. Indeed, you also tend to end up not understanding what it is you are doing, and can easily fall into statistical traps! GUIs are good for helping you remember the commands to perform specific tasks, but a GUI can often end up getting in the way, rather than helping.

Compare this to writing books, which still fundamentally involves putting words into sentences in a document. So it is with writing data mining stories. The tools available provide much help in writing our stories, but we still need to put the sentences together.

And as with any story, we will be writing ours for others to read, not just for the computer to evaluate. So always write your R code with the intention that others will want to read it. They will!



Copyright © Togaware Pty Ltd
Brought to you by Togaware. This page generated: Sunday, 22 August 2010