DATA MINING
Desktop Survival Guide by Graham Williams
Examples
Here's an example using the iris data:
> library(randomForest)
> iris.rf <- randomForest(Species ~ ., iris, sampsize=c(10, 20, 10))
You can also name the classes in the sampsize specification:
> samples <- c(setosa=10, versicolor=20, virginica=10)
> iris.rf <- randomForest(Species ~ ., iris, sampsize=samples)
You can also stratify the sampling on a variable other than the class
label, to even up the distribution across the levels of that variable.
Andy Liaw gives the example of multi-center clinical trial data, where
you want to draw the same number of patients from each center to grow
each tree. You can do something like:
> randomForest(..., strata=center, sampsize=rep(min(table(center)), nlevels(center)))
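To see what the sampsize expression in that call evaluates to, here is a minimal sketch in base R. The center factor below is made-up illustrative data, not taken from any real trial; only table, min, nlevels and rep are used.

```r
# Hypothetical center labels for nine patients (illustrative data only).
center <- factor(c("A", "A", "A", "B", "B", "C", "C", "C", "C"))

# Number of patients at each center.
counts <- table(center)
print(counts)

# One entry per center, each equal to the size of the smallest center,
# so every tree draws the same number of patients from each center.
sampsize <- rep(min(counts), nlevels(center))
print(sampsize)
# [1] 2 2 2
```

The smallest center (B) has two patients, so each of the three centers contributes two patients per tree.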
To be confident that the random forest score is simply the proportion of trees voting for the positive class, we can build one tree, then multiple trees, and see what we get. Note that we use the Rattle generated commands, as listed in the Log tab, and thus we use the Rattle internal variables.
First build a single tree:
> set.seed(123)
> crs$rf <- randomForest(as.factor(Adjusted) ~ .,
+     data=crs$dataset[crs$sample,c(2:10,13)],
+     ntree=1, importance=TRUE, na.action=na.omit)
> crs$pr <- predict(crs$rf, crs$dataset[-crs$sample, c(2:10,13)], type="prob")[,2]
> summary(as.factor(crs$pr))
   0    1 NA's
 423  139   38
Now build two trees and rerun the code:
> set.seed(123)
> crs$rf <- randomForest(as.factor(Adjusted) ~ .,
+     data=crs$dataset[crs$sample,c(2:10,13)],
+     ntree=2, importance=TRUE, na.action=na.omit)
> crs$pr <- predict(crs$rf, crs$dataset[-crs$sample, c(2:10,13)], type="prob")[,2]
> summary(as.factor(crs$pr))
   0  0.5    1 NA's
 353  124   85   38
And then four trees:
> set.seed(123)
> crs$rf <- randomForest(as.factor(Adjusted) ~ .,
+     data=crs$dataset[crs$sample,c(2:10,13)],
+     ntree=4, importance=TRUE, na.action=na.omit)
> crs$pr <- predict(crs$rf, crs$dataset[-crs$sample, c(2:10,13)], type="prob")[,2]
> summary(as.factor(crs$pr))
   0 0.25  0.5 0.75    1 NA's
 293   98   68   62   41   38
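In each of the three runs above, the distinct scores are multiples of 1/ntree. The same check can be sketched without Rattle. The example below is an assumption-laden stand-in: it uses the iris data with a made-up binary target (virginica or not) rather than the Rattle dataset, and the names d, train, target and votes are hypothetical. It assumes the randomForest package is installed.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(123)

# Hypothetical binary target derived from iris (not the Rattle data):
# 1 if the flower is virginica, 0 otherwise.
d <- iris[, 1:4]
d$target <- factor(ifelse(iris$Species == "virginica", 1, 0))

# Train on 100 rows and score the remaining 50.
train <- sample(nrow(d), 100)

for (n in c(1, 2, 4)) {
  rf <- randomForest(target ~ ., data = d[train, ], ntree = n)
  pr <- predict(rf, d[-train, ], type = "prob")[, 2]

  # If the score is the fraction of trees voting for class 1,
  # then score * n must be a whole number of trees between 0 and n.
  votes <- pr * n
  stopifnot(all(abs(votes - round(votes)) < 1e-8))

  cat("ntree =", n, "\n")
  print(table(pr))
}
```

If any score fell between the multiples of 1/n, the stopifnot call would halt the loop, so a clean run supports the interpretation of the score as a vote proportion.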
Copyright © Togaware Pty Ltd. Support further development through the purchase of the PDF version of the book.