
Model-based classification with dissimilarities: a maximum likelihood approach

Pattern Analysis and Applications (2008)

Abstract
Most classification problems concern objects lying in a Euclidean space, but in some situations only dissimilarities between objects are known. We are concerned with supervised classification from an observed dissimilarity table, where the task is to classify new, unobserved or implicit objects (known only through their dissimilarities to previously classified objects forming the training data set) into predefined classes. This work focuses on developing model-based classifiers for dissimilarities that account for measurement error with respect to Euclidean distance. Basically, the unobserved objects are treated as unknown parameters to be estimated in a Euclidean space, and the observed dissimilarity table is modeled as a Gaussian-type random perturbation of their Euclidean distances. Allowing the distribution of these perturbations to vary across pairs of classes in the population leads to more flexible classification methods than the usual algorithms. Model parameters are estimated from the training data set via maximum likelihood (ML), and allocation is done by assigning a new implicit object to the group in the population, and the position in the Euclidean space, that maximize the conditional group likelihood under the estimated parameters. This point of view can be expected to be useful for classifying dissimilarity tables that are no longer Euclidean due to measurement error or instabilities of various types. Two possible structures are postulated for the error, resulting in two different model-based classifiers. First results on real and simulated data sets show interesting behavior of the two proposed algorithms, and the respective effects of the dissimilarity type and of the intrinsic data dimension are investigated. For these latter two aspects, one of the constructed classifiers appears very promising.
Interestingly, the intrinsic data dimension seems to have a much less adverse effect on our classifiers than initially feared, at least for small to moderate dimensions.
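The general scheme the abstract describes can be sketched in a deliberately simplified form. This is not the paper's estimator: the sketch assumes (hypothetically) a classical-MDS embedding of the training dissimilarities, a single global Gaussian noise level rather than the paper's class-pair-dependent perturbation structures, and a spherical Gaussian class model on positions in the embedding; all function names are illustrative.

```python
import numpy as np

def classical_mds(D, dim):
    """Classical MDS: embed a dissimilarity matrix D into R^dim."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J              # double-centered Gram matrix
    w, V = np.linalg.eigh(B)                 # eigh returns ascending order
    idx = np.argsort(w)[::-1][:dim]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

def fit_classes(X, labels):
    """Per-class spherical Gaussian (mean, variance) in the embedding."""
    return {g: (X[labels == g].mean(axis=0),
                max(X[labels == g].var(), 1e-6))
            for g in np.unique(labels)}

def fit_noise(X, D):
    """Std of observed dissimilarities around embedded Euclidean distances."""
    E = np.linalg.norm(X[:, None] - X[None], axis=-1)
    off = ~np.eye(len(X), dtype=bool)
    return max((D - E)[off].std(), 1e-3)

def _nll_and_grad(x, d_new, X, mu, var, sigma):
    """Negative log-likelihood of a candidate position x, and its gradient."""
    diff = x - X
    dist = np.maximum(np.linalg.norm(diff, axis=1), 1e-9)
    resid = dist - d_new                     # Gaussian perturbation model
    val = (np.sum(resid ** 2) / (2 * sigma ** 2)
           + np.sum((x - mu) ** 2) / (2 * var)
           + 0.5 * len(x) * np.log(var))
    grad = ((diff * (resid / dist)[:, None]).sum(axis=0) / sigma ** 2
            + (x - mu) / var)
    return val, grad

def classify(d_new, X, params, sigma, iters=300):
    """Position the implicit object by ML for each group; keep the best group."""
    best_g, best_val = None, np.inf
    for g, (mu, var) in params.items():
        x = mu.copy()
        fx, _ = _nll_and_grad(x, d_new, X, mu, var, sigma)
        for _ in range(iters):               # gradient descent + backtracking
            _, grad = _nll_and_grad(x, d_new, X, mu, var, sigma)
            step = 0.05
            while step > 1e-10:
                fn, _ = _nll_and_grad(x - step * grad, d_new, X, mu, var, sigma)
                if fn < fx:
                    x, fx = x - step * grad, fn
                    break
                step *= 0.5
        if fx < best_val:
            best_g, best_val = g, fx
    return best_g
```

A new object is allocated by running `classify` on its vector of dissimilarities to the training objects: each candidate group yields its own ML position and likelihood, and the group with the highest likelihood (lowest negative log-likelihood) wins, mirroring the joint position-and-group maximization the abstract outlines.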
Keywords
dissimilarity data, model-based classifier, maximum likelihood estimate, intrinsic data dimension, success classification rate, multidimensional scaling