A framework for similarity-based methods.

Wlodzislaw Duch, Karol Grudziński,
Department of Informatics, Nicolaus Copernicus University,
Grudziadzka 5, 87-100 Torun, Poland.
E-mail: wduch@fizyka.umk.pl

Second Polish Conference on Theory and Applications of Artificial Intelligence,
Łódź, 28-30 Sept. 1998, pp. 33-60

Similarity-based methods (SBM) are a generalization of the minimal distance (MD) methods, which form the basis of many machine learning and pattern recognition methods. Investigation of similarity leads to a fruitful framework in which many classification methods are accommodated. The probability p(C|X;M) of assigning class C to vector X, given the classification model M, depends on the adaptive parameters of the model and on the procedures used in the calculation, such as: the number of reference vectors taken into account in the neighborhood of X; the maximum size of the neighborhood; the parameterization of the similarity measures; the weighting function estimating the contributions of neighboring reference vectors; the procedure used to create a set of reference vectors from the training data; the total cost function minimized at the training stage; and the kernel function scaling the influence of the error on the total cost function. SBM may include several models M and an interpolation procedure to combine the results of a committee of models. Neural networks, such as Radial Basis Function (RBF) networks and Multilayer Perceptrons (MLPs), are included in this framework as special cases. Many new versions of similarity-based methods are derived from this framework. A few of them have been tested on real-world data sets, and very good results have been obtained.
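As an illustration only, the sketch below shows how a few of the ingredients listed above (the number of reference vectors k, the maximum neighborhood size, a parameterized distance, and a weighting function) could combine into an estimate of p(C|X;M). It is not the authors' implementation. The function names, the Minkowski distance, the Gaussian weighting, and the normalization by the sum of weights are all assumptions chosen for the example.

```python
import math
from collections import defaultdict


def minkowski(x, y, alpha=2.0):
    """Minkowski distance; alpha is an adaptive parameter (assumed form)."""
    return sum(abs(a - b) ** alpha for a, b in zip(x, y)) ** (1.0 / alpha)


def gaussian_weight(d, sigma=1.0):
    """Weighting function: closer reference vectors contribute more (assumed form)."""
    return math.exp(-(d / sigma) ** 2)


def class_probabilities(x, references, k=3, max_radius=float("inf"),
                        distance=minkowski, weight=gaussian_weight):
    """Estimate p(C|X;M) from weighted votes of nearby reference vectors.

    references: list of (vector, class_label) pairs (hypothetical layout).
    k: number of reference vectors taken into account.
    max_radius: maximum size of the neighborhood.
    """
    # Rank all reference vectors by distance to x.
    scored = sorted(((distance(x, r), label) for r, label in references),
                    key=lambda pair: pair[0])
    # Keep at most k neighbors, and only those inside the neighborhood.
    neighbors = [(d, c) for d, c in scored[:k] if d <= max_radius]

    totals = defaultdict(float)
    for d, c in neighbors:
        totals[c] += weight(d)

    norm = sum(totals.values())
    if norm == 0.0:
        return {}  # no usable neighbors: the model gives no estimate
    return {c: w / norm for c, w in totals.items()}


if __name__ == "__main__":
    refs = [((0.0, 0.0), "A"), ((0.0, 1.0), "A"), ((5.0, 5.0), "B")]
    print(class_probabilities((0.0, 0.2), refs, k=3))
```

With k = 1, a constant weighting function, and an unbounded neighborhood, this sketch reduces to the ordinary nearest-neighbor rule, which is the sense in which minimal distance methods are a special case of such a scheme.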
