Nearest Neighbor Voting in High Dimensional Data: Learning from Past Occurrences
Computer Science and Information Systems, Volume 9 (2012) no. 2.

See the article record from the source Computer Science and Information Systems website

Hubness is a recently described aspect of the curse of dimensionality that is inherent to nearest-neighbor methods. This paper describes a new approach to exploiting the hubness phenomenon in k-nearest neighbor classification. We argue that some neighbor occurrences carry more information than others by virtue of being less frequent events. This observation is related to the hubness phenomenon, and we explore how it affects high-dimensional k-nearest neighbor classification. We propose a new algorithm, Hubness Information k-Nearest Neighbor (HIKNN), which introduces k-occurrence informativeness into the hubness-aware k-nearest neighbor voting framework. An extensive evaluation on several types of high-dimensional data shows that the algorithm overcomes some of the issues with previous hubness-aware approaches.
Keywords: k-nearest neighbor, curse of dimensionality, hubness, neighbor occurrence models, self-information, fuzzy, voting
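The idea that rare neighbor occurrences are more informative can be illustrated with a small sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration rather than the authors' HIKNN implementation: the function names `k_occurrences` and `occurrence_self_information` are hypothetical, the occurrence probability is estimated from smoothed k-occurrence counts, and the data are synthetic Gaussian points.

```python
import math
import random


def k_occurrences(points, k):
    """Count how often each point appears in other points' k-NN lists."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Squared Euclidean distance from point i to every other point.
        dists = [
            (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
            for j in range(n)
            if j != i
        ]
        dists.sort()
        for _, j in dists[:k]:
            counts[j] += 1
    return counts


def occurrence_self_information(counts):
    """Self-information log2(1/p) of each point occurring as a neighbor.

    Assumption: p is estimated as (count + 1) / (n + 1). The +1 smoothing
    keeps the value finite for points that never occur as neighbors.
    """
    n = len(counts)
    return [math.log2((n + 1) / (c + 1)) for c in counts]


random.seed(0)
dim, n, k = 50, 200, 5
# Synthetic high-dimensional data, used only to make the sketch runnable.
points = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]

counts = k_occurrences(points, k)
info = occurrence_self_information(counts)

hub = max(range(n), key=lambda i: counts[i])
rare = min(range(n), key=lambda i: counts[i])
print("most frequent neighbor:", counts[hub], "occurrences,", round(info[hub], 3), "bits")
print("least frequent neighbor:", counts[rare], "occurrences,", round(info[rare], 3), "bits")
```

Under these assumptions, a point that occurs in many k-neighbor sets (a hub) receives a low self-information value, while a rarely occurring point receives a high one. A voting scheme could use such values to weight each neighbor's vote; how HIKNN actually combines them is described in the paper itself.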
@article{CSIS_2012_9_2_a10,
     author = {Nenad Tomasev and Dunja Mladenic},
     title = {Nearest {Neighbor} {Voting} in {High} {Dimensional} {Data:} {Learning} from {Past} {Occurrences}},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {9},
     number = {2},
     year = {2012},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2012_9_2_a10/}
}
Nenad Tomasev; Dunja Mladenic. Nearest Neighbor Voting in High Dimensional Data: Learning from Past Occurrences. Computer Science and Information Systems, Volume 9 (2012) no. 2. http://geodesic.mathdoc.fr/item/CSIS_2012_9_2_a10/