DATA MINING
Desktop Survival Guide by Graham Williams
One of the simplest and most common ways of sharing data today is the comma-separated values (CSV) format. CSV has become a standard file format for exchanging data between many different applications. CSV files, which usually have a .csv extension, can be exported and imported by spreadsheets and databases, including OpenOffice Calc, Gnumeric, MS/Excel, SAS/Enterprise Miner, Teradata, Netezza, and many other applications. For these reasons, CSV is a good option for importing data into Rattle. The downside is that a CSV file does not contain explicit metadata (i.e., data about the data, including whether the data is numeric or categoric). Without this metadata, R sometimes determines the wrong data type for a particular column. This is not usually fatal, and we can help R along when loading the data by using commands in the R Console.
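As a minimal sketch of what helping R along might look like in the R Console, the example below loads the same CSV file twice: once letting R guess the column types, and once declaring them through the colClasses argument of read.csv. The file contents and the column names (id, postcode, income) are hypothetical and chosen only for illustration; they are not taken from any dataset used in this guide.

```r
# Write a small, hypothetical CSV to a temporary file so the sketch is
# self-contained. The column names and values are illustrative only.
csv_file <- tempfile(fileext = ".csv")
writeLines(c("id,postcode,income",
             "1,2600,52000",
             "2,2601,61000.5",
             "3,0800,48000"),
           csv_file)

# Default load: with no metadata in the file, R guesses each column's
# type from its contents, so the postcode column is treated as a number.
guessed <- read.csv(csv_file)
print(class(guessed$postcode))

# Helping R along: declare the intended type of each column explicitly,
# so that postcode is loaded as a categoric variable (a factor).
typed <- read.csv(csv_file,
                  colClasses = c(id = "integer",
                                 postcode = "factor",
                                 income = "numeric"))
print(class(typed$postcode))
print(levels(typed$postcode))
```

In the first load the postcode column is read as a number, which also discards the leading zero of 0800. In the second load it is kept as a categoric variable with its original values intact. Declaring only the columns that R gets wrong, and leaving the rest to be guessed, is an equally reasonable choice.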