DATA MINING
Desktop Survival Guide by Graham Williams
Setting Up a Data Source Name
The key to using ODBC is to know (or to set up) the data source name (DSN) for each of your databases. Setting up DSNs is outside the scope of Rattle, being a configuration task performed through your operating system. Under GNU/Linux, for example, using the unixodbc package, the ODBC drivers are typically registered in /etc/odbcinst.ini and the system DSNs are defined in /etc/odbc.ini. Under MS/Windows the control panel provides access to a DSN tool.
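To make the idea of a DSN concrete, here is a minimal Python sketch that lists the DSN names found in odbc.ini-style text. It is only an illustration and not part of Rattle: the sample file contents, the DSN names (warehouse, audit), the driver name and the host are all invented, and the helper list_dsns is hypothetical. It assumes the common layout in which each bracketed section is one DSN, with an optional [ODBC Data Sources] index section.

```python
import configparser

# Hypothetical odbc.ini contents. The DSN names, driver name and server
# are invented for illustration only.
SAMPLE_ODBC_INI = """
[ODBC Data Sources]
warehouse = Example warehouse DSN
audit = Example audit DSN

[warehouse]
Driver = ExampleDriver
Server = db.example.org
Database = sales

[audit]
Driver = ExampleDriver
Server = db.example.org
Database = audit
"""


def list_dsns(ini_text):
    """Return the DSN names defined in odbc.ini-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # Every section except the optional index section is treated as a DSN.
    return [name for name in parser.sections() if name != "ODBC Data Sources"]


print(list_dsns(SAMPLE_ODBC_INI))
```

A name returned by such a listing is what would be typed into the DSN text entry in Rattle.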
Within Rattle we specify a known DSN by typing the name into the text entry. Once that is done, we press the Enter key and Rattle will attempt to connect. This may require a username and password to be supplied. For a Teradata Warehouse connection you will be presented with a dialog box.
If the connection is successful we will find a list of available tables in the Table combobox.
We can choose a Table, and also include a limit on the number of rows that we wish to load into Rattle. This allows us to get a smaller sample of the data for testing purposes before loading up a large dataset. If the Row Limit is set to 0 then all of the rows from the table are retrieved. Unfortunately there is no SQL standard for limiting the number of rows returned from a query. For the Teradata and Netezza warehouses the SQL keyword is LIMIT and this is what is used by Rattle.
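The Row Limit behaviour described above can be sketched as a small query builder. This is a hedged Python illustration, not Rattle's own code (Rattle is written in R): the function build_query and the table name sales are hypothetical, and the sketch assumes the target database accepts a trailing LIMIT clause.

```python
import re

# Simple identifier, optionally schema-qualified (for example mydb.sales).
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def build_query(table, row_limit=0):
    """Build a SELECT for `table`; a row_limit of 0 means retrieve all rows."""
    # The table name is pasted into the SQL text, so accept only plain
    # identifiers to avoid building a malformed or unsafe query.
    if not _TABLE_NAME.fullmatch(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if row_limit < 0:
        raise ValueError("row_limit must be 0 (no limit) or a positive integer")

    query = f"SELECT * FROM {table}"
    if row_limit > 0:
        # Assumes the database understands LIMIT; others use different syntax.
        query += f" LIMIT {row_limit}"
    return query


print(build_query("sales"))        # all rows
print(build_query("sales", 1000))  # first 1000 rows only
```

Because the row-limiting syntax differs between databases (Microsoft SQL Server, for instance, uses TOP), a builder like this would need a per-database variant to be portable.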
Copyright © Togaware Pty Ltd Support further development through the purchase of the PDF version of the book.