DATA MINING
Desktop Survival Guide by Graham Williams
A binning function is provided by Rattle, coded by
Daniele Medri. The Rattle interface provides an option to choose
between Quantile binning, KMeans binning, and
Equal Width binning. For each option the default number of
bins is 4, and we can change this to suit our needs. The generated
variables are prefixed with BIN_QUn_, BIN_KMn_, or BIN_EWn_
respectively, with n replaced with the number of bins. Because the
number of bins is part of the name, we can create multiple binnings
for any variable.
An example of why we might want to do this is to visualise data. A mosaic plot, for example, is only useful for categoric data, and so we could turn Sunshine into a categoric variable by binning. (To do: also discuss binning to show box plots for different targets.)
Note that quantile binning is the same as equal count binning.
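The difference between equal width and quantile (equal count) binning can be seen in a minimal sketch. This is not Rattle's implementation: it is plain Python, and the function names and the Sunshine values are made up for illustration. The quantile version assigns bins by rank, which is a simplification that may split tied values across bins.

```python
def equal_width_bins(values, bins=4):
    """Assign each value to one of `bins` intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        # All values are identical, so everything falls in the first bin.
        return [0 for _ in values]
    # The maximum would land in bin number `bins`, so clamp it into the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def quantile_bins(values, bins=4):
    """Assign each value to one of `bins` groups of roughly equal count."""
    # Positions of the values in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * bins // len(values)
    return labels


# Hypothetical hours of sunshine, skewed towards low values.
sunshine = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0, 9.0, 12.0]

print(equal_width_bins(sunshine))  # [0, 0, 0, 0, 1, 1, 3, 3]
print(quantile_bins(sunshine))     # [0, 0, 1, 1, 2, 2, 3, 3]
```

With skewed data the equal width bins hold 4, 2, 0 and 2 observations, while the quantile bins each hold 2. Either result could then be treated as a categoric variable for a mosaic plot.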
Copyright © Togaware Pty Ltd