DATA MINING
Desktop Survival Guide by Graham Williams
Removing Outliers
Tests for outliers have largely been superseded by the use of robust methods. Outlier tests are poor in that outliers tend to damage results long before they are detected. Robust methods attempt to compensate for outliers rather than reject them. Random forest modelling helps avoid the issue of outliers.
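As a small illustration of what "compensating" means, the sketch below compares the mean and standard deviation with two robust counterparts, the median and the median absolute deviation (MAD). The vector x is made up for this example and is not taken from the wine data; the value 30 plays the role of an outlier.

```r
# Made-up sample for illustration only; 30 plays the role of an outlier.
x <- c(2.1, 2.3, 2.4, 2.2, 2.5, 2.3, 30)

# Classical estimates are pulled strongly towards the extreme value.
mean(x)
sd(x)

# Robust estimates stay close to the bulk of the data.
median(x)
mad(x)
```

The robust summaries describe the six typical values well without the extreme value first having to be detected and removed.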
You can get a list of the values that the boxplot function considers to be outliers:
> load("wine.RData")
> bp <- boxplot(wine$Ash, plot=FALSE)
> bp$out
[1] 3.22 1.36 3.23
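If the flagged values are to be dropped, one possible approach is sketched below. By default boxplot flags values lying more than 1.5 times the spread between the hinges beyond either hinge, and boxplot.stats performs the same calculation without any plotting. The vector ash is made up so that the example is self-contained (wine.RData is not assumed to be available), and the name ash.clean is only illustrative.

```r
# Made-up values for illustration; substitute wine$Ash once wine.RData is loaded.
ash <- c(2.43, 2.14, 2.67, 2.50, 2.87, 2.45, 2.61, 2.17, 2.27, 2.30, 3.50, 1.20)

# Values flagged by the default 1.5 * hinge-spread rule used by boxplot().
out <- boxplot.stats(ash)$out

# Keep only the observations that were not flagged.
ash.clean <- ash[!ash %in% out]

out
length(ash.clean)
```

Keep in mind the caution above: removing values this way is a judgment call, and a robust method applied to the full data is often the safer choice.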