Chapter summary:
Statistical classifiers only work x% of the time. x is inversely proportional to what the theory says it should be.
clsfy contains several classes for representing, using and training
statistical classifiers. Input data is represented by vnl_vector<double>.
Output classes are represented by integers [0..n_classes).
All classifiers support the classification of sample vectors and the
estimation of class probabilities. All classifiers are derived from
clsfy_classifier_base.
unsigned n_dims() const
    Dimensionality of the vector space of inputs.
unsigned n_classes() const
    The number of possible output classes. If n_classes() == 1, this indicates a binary classifier. In this case, most functions return values associated with just the positive (1st) class. As far as the interface is concerned, a binary classifier is distinct from a multiclass classifier with n_classes() == 2.
unsigned classify(x) const
    Most likely class of vector x.
void class_probabilities(vcl_vector<double> & outputs, x) const
    Estimate of the a-posteriori class probabilities for vector x. If the classifier is binary (i.e. n_classes() == 1), only a single value is returned: the probability of being in class 1, also called the positive class.
double log_l(x)
    If the classifier is binary, an estimate of the a-posteriori log likelihood of being in class 1.
The classifiers all support IO via vsl_b_read, vsl_b_write, and
vsl_print_summary.
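The binary convention described above can be sketched with a small stand-in class. This is not the clsfy API: sketch_binary_hyperplane is a hypothetical name, it uses std::vector rather than vnl_vector, and it assumes that log_l can be treated as a log-odds score that a logistic function maps to a probability.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a binary classifier; NOT the clsfy API.
struct sketch_binary_hyperplane
{
  std::vector<double> weights;  // normal to the separating plane
  double bias;                  // offset of the plane along that normal

  unsigned n_dims() const { return static_cast<unsigned>(weights.size()); }

  // 1 signals "binary": only the positive class is reported on.
  unsigned n_classes() const { return 1; }

  // Assumption: log_l is treated here as the log-odds score w.x - bias.
  double log_l(const std::vector<double>& x) const
  {
    assert(x.size() == weights.size());
    double sum = -bias;
    for (std::size_t i = 0; i < x.size(); ++i) sum += weights[i] * x[i];
    return sum;
  }

  // Binary convention: outputs holds a single value, P(class 1 | x).
  void class_probabilities(std::vector<double>& outputs,
                           const std::vector<double>& x) const
  {
    outputs.assign(1, 1.0 / (1.0 + std::exp(-log_l(x))));
  }

  // Class 1 if the positive class is at least as likely as not, else class 0.
  unsigned classify(const std::vector<double>& x) const
  {
    return log_l(x) >= 0.0 ? 1u : 0u;
  }
};
```

Calling class_probabilities on this sketch always resizes outputs to one element, whereas a two-class multiclass classifier would be expected to fill two.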
The classifier training algorithms are embedded within the classes derived
from clsfy_builder_base.
clsfy_classifier_base* new_classifier()
    Create a new classifier of the appropriate type on the heap and return a pointer to it.
double build(model, training_inputs, training_outputs, n_classes)
    Train the classifier from the data supplied.
The concrete builders have attributes that can be modified to control the training process. They should all have default values for these attributes which may allow you to build a classifier without understanding too much about it.
The builders all support IO via vsl_b_read, vsl_b_write, and
vsl_print_summary.
This code is an example of the strategy pattern (Gamma et al., Design Patterns, Addison-Wesley, 1995). It is possible to write code that builds and uses a classifier without that code knowing what sort of classifier is being used. Both builders and classifiers can be saved and loaded by base class pointer.
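A minimal, library-free sketch of this arrangement follows. All names prefixed with sketch_ are hypothetical and only mirror the roles of the clsfy base classes; the threshold classifier is an invented toy strategy, and std::unique_ptr is used in place of a raw pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

typedef std::vector<std::vector<double> > sketch_inputs;

// Hypothetical base classes mirroring the classifier and builder roles.
struct sketch_classifier_base
{
  virtual ~sketch_classifier_base() {}
  virtual unsigned classify(const std::vector<double>& x) const = 0;
};

struct sketch_builder_base
{
  virtual ~sketch_builder_base() {}
  virtual std::unique_ptr<sketch_classifier_base> new_classifier() const = 0;
  virtual void build(sketch_classifier_base& model,
                     const sketch_inputs& inputs,
                     const std::vector<unsigned>& outputs) const = 0;
};

// One concrete strategy: threshold the first coordinate.
struct sketch_threshold_classifier : sketch_classifier_base
{
  double threshold = 0.0;
  unsigned classify(const std::vector<double>& x) const override
  {
    return x[0] > threshold ? 1u : 0u;
  }
};

struct sketch_threshold_builder : sketch_builder_base
{
  std::unique_ptr<sketch_classifier_base> new_classifier() const override
  {
    return std::unique_ptr<sketch_classifier_base>(new sketch_threshold_classifier);
  }

  // Puts the threshold midway between the two class means of x[0].
  // Assumes class 1 has the larger mean and both classes are present.
  void build(sketch_classifier_base& model,
             const sketch_inputs& inputs,
             const std::vector<unsigned>& outputs) const override
  {
    sketch_threshold_classifier& m =
      dynamic_cast<sketch_threshold_classifier&>(model);
    double sum[2] = {0.0, 0.0};
    unsigned count[2] = {0, 0};
    for (std::size_t i = 0; i < inputs.size(); ++i)
    {
      const unsigned c = outputs[i] ? 1u : 0u;
      sum[c] += inputs[i][0];
      ++count[c];
    }
    assert(count[0] > 0 && count[1] > 0);
    m.threshold = 0.5 * (sum[0] / count[0] + sum[1] / count[1]);
  }
};

// Client code: it sees only the base classes, so any builder can be passed in.
unsigned count_training_errors(const sketch_builder_base& builder,
                               const sketch_inputs& inputs,
                               const std::vector<unsigned>& outputs)
{
  std::unique_ptr<sketch_classifier_base> model = builder.new_classifier();
  builder.build(*model, inputs, outputs);
  unsigned errors = 0;
  for (std::size_t i = 0; i < inputs.size(); ++i)
    if (model->classify(inputs[i]) != outputs[i]) ++errors;
  return errors;
}
```

Swapping in a different builder changes the kind of classifier that count_training_errors trains, without any change to that function.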
clsfy_binary_hyperplane
    Simple two-class classifier, where the class boundary is a plane (or a line in two dimensions).
clsfy_binary_hyperplane_ls_builder
    Train a hyperplane classifier using least squares.
clsfy_pdf_classifier
    A binary classifier that takes a single PDF to describe the positive class (class number 1). The boundary is set on an iso-probability contour.
clsfy_k_nearest_neighbour
    One of the simplest and most effective classifiers around. Don't wait until the end of your PhD before comparing your algorithm with this one.
clsfy_rbf_parzen
    A Parzen window classifier that uses a radial basis kernel at each training point.
clsfy_random_classifier
    Useful for testing, this classifier outputs a preferred class independent of the input data.
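To show why k-nearest-neighbour makes such a convenient baseline, here is a minimal sketch of the technique in standard C++. It is not the clsfy_k_nearest_neighbour implementation: the function names are hypothetical, it uses squared Euclidean distance with a plain majority vote, and n_classes here is simply the number of distinct labels rather than the binary convention of n_classes() == 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Squared Euclidean distance between two vectors of equal length.
double sketch_sq_dist(const std::vector<double>& a, const std::vector<double>& b)
{
  assert(a.size() == b.size());
  double d = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    const double diff = a[i] - b[i];
    d += diff * diff;
  }
  return d;
}

// Majority vote among the k training points nearest to x.
// Ties in the vote go to the lowest class index.
unsigned sketch_knn_classify(const std::vector<std::vector<double> >& train_inputs,
                             const std::vector<unsigned>& train_outputs,
                             unsigned n_classes, std::size_t k,
                             const std::vector<double>& x)
{
  assert(train_inputs.size() == train_outputs.size());
  assert(k >= 1 && k <= train_inputs.size());

  // Pair each training point's distance to x with its class label.
  std::vector<std::pair<double, unsigned> > dist_and_class;
  for (std::size_t i = 0; i < train_inputs.size(); ++i)
    dist_and_class.push_back(
      std::make_pair(sketch_sq_dist(train_inputs[i], x), train_outputs[i]));

  // Only the k smallest distances need to be in order.
  std::partial_sort(dist_and_class.begin(),
                    dist_and_class.begin() + static_cast<std::ptrdiff_t>(k),
                    dist_and_class.end());

  // Count votes per class among the k nearest neighbours.
  std::vector<unsigned> votes(n_classes, 0);
  for (std::size_t i = 0; i < k; ++i)
  {
    assert(dist_and_class[i].second < n_classes);
    ++votes[dist_and_class[i].second];
  }
  return static_cast<unsigned>(
    std::max_element(votes.begin(), votes.end()) - votes.begin());
}
```

There is no training step beyond storing the data, which is what makes it cheap to run as a comparison; the cost is a full pass over the training set for every query.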
Suppose we wish to build a classifier from a set of labelled vectors, then count how many of those training vectors the classifier assigns to the wrong class.
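#include <vcl_iostream.h>
#include <vcl_vector.h>
#include <vnl/vnl_vector.h>
#include <mbl/mbl_data_array_wrapper.h>
#include <clsfy/clsfy_binary_hyperplane_ls_builder.h>

vcl_vector<vnl_vector<double> > data_inputs(n);
vcl_vector<unsigned> data_targets(n);
// Load in the vectors
....
// Create an iterator object to pass the data in.
// (mbl_data_wrapper is an abstract base class; mbl_data_array_wrapper
// is the concrete wrapper around a vector.)
mbl_data_array_wrapper<vnl_vector<double> > v_data(data_inputs);
// Define what type of builder to use. In this case we want a hyperplane.
clsfy_binary_hyperplane_ls_builder builder;
// Generate an untrained model of the matching type on the heap
clsfy_classifier_base *classifier = builder.new_classifier();
// We could have created it directly using
// clsfy_binary_hyperplane classifier;
// Build the model from the data.
// (build() is documented in this chapter as also taking n_classes; check
// clsfy_builder_base.h for the exact signature in your version.)
builder.build(*classifier, v_data, data_targets);
vsl_print_summary(vcl_cout, classifier);
// Now count the errors on the training set
unsigned error = 0;
for (unsigned i=0; i<data_inputs.size(); ++i)
{
  if (classifier->classify(data_inputs[i]) != data_targets[i])
    error++;
}
vcl_cout << "Training error " << error << " out of " << n << " samples" << vcl_endl;
// Tidy up
delete classifier;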
This document was generated on May, 1 2013 using texi2html 1.76.