DATA MINING
Desktop Survival Guide by Graham Williams |
|||||
Cost Based Learning |
An alternative is to use cost sensitive learning where the algorithm used by the model builder itself is modified. This approaches introduces numeric modifications to any formula used to estimate the error in our model. Mis-classifying a positive example as a negative, a false negative, (e.g., identifying a fraudulent case as not fraudulent) is more ``costly'' than a false positive. In health, for example, we do not want to miss cases of true cancer, and might find it somewhat more acceptable to momentarily investigate cases that turn out not to be cancer, simply because, missing the cancer may lead to premature death. A model builder will take into account the different costs of the outcome classes and build a model that does not so readily dismiss the very much under represented outcome.