Togaware DATA MINING
Desktop Survival Guide
by Graham Williams

Basic Data Summary

Once a dataset has been loaded into Rattle, we can start to get an idea of the shape of the data from the simple summary that is displayed, as in Figure 5.1. For example, the first variable, Date, is recognised as a unique identifier for each observation: it has 366 unique values, which equals the number of observations. Rattle uses a heuristic to flag such variables as identifiers.
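
As a rough illustration of the kind of check such a heuristic might perform, the sketch below flags a column as an identifier when its number of distinct values equals the number of observations. This is an assumption made for illustration, not Rattle's actual implementation; the function name is_identifier and the sample values are hypothetical.

```python
def is_identifier(values):
    """Return True when every observation holds a distinct, non-missing value."""
    present = [v for v in values if v is not None]
    return (
        len(values) > 0
        and len(present) == len(values)       # no missing values
        and len(set(present)) == len(values)  # one distinct value per observation
    )

# Hypothetical sample columns (not the data shown in Figure 5.1).
dates = ["2007-11-01", "2007-11-02", "2007-11-03", "2007-11-04"]
wind_dirs = ["NW", "ENE", "NW", "NW"]

print(is_identifier(dates))      # True: 4 distinct values for 4 observations
print(is_identifier(wind_dirs))  # False: "NW" repeats
```

A check this simple would also flag any numeric measurement that happens never to repeat, so a practical heuristic would likely need further conditions.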

The next five variables are all identified as Numeric, followed by the Categoric WindGustDir, and so on.

The comment column also reports the number of missing values for particular variables. Dealing with missing values is covered in Chapter [*].
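
To make the idea of a per-variable summary concrete, here is a minimal sketch that reports a type, a count of unique values, and a count of missing values for each column. It assumes missing values are represented as None and that a column is Numeric when all of its present values are numbers. The function summarise, the Temperature column, and all sample values are hypothetical; this is not how Rattle computes its summary.

```python
def summarise(column):
    """Summarise one column: inferred type, unique count, missing count."""
    present = [v for v in column if v is not None]
    numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    return {
        "type": "Numeric" if numeric else "Categoric",
        "unique": len(set(present)),
        "missing": len(column) - len(present),
    }

# Hypothetical sample data; None marks a missing value.
data = {
    "Temperature": [8.0, 14.0, None, 13.3],
    "WindGustDir": ["NW", None, "NW", "ENE"],
}

for name, column in data.items():
    print(name, summarise(column))
```

Here each column has one missing value, so the missing count is 1 for both, while the type and unique counts differ.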



Copyright © Togaware Pty Ltd
Brought to you by Togaware. This page generated: Sunday, 22 August 2010