A Bimodality Test in High Dimensions
Serdica Journal of Computing, Volume 6 (2012) no. 4, pp. 437-450
This article was harvested from the Bulgarian Digital Mathematics Library.
We present a test for identifying clusters in high-dimensional
data based on the k-means algorithm when the null hypothesis is a spherical
normal distribution. We show that projection techniques used for evaluating
the validity of clusters may be misleading for such data. In particular, we
demonstrate that increasingly well-separated clusters are identified as the
dimensionality increases, even when no such clusters exist. Furthermore, in
the case of true bimodality, increasing the dimensionality makes identifying
the correct clusters more difficult. In addition to the original conservative
test, we propose a practical test with the same asymptotic behavior that
performs well for a moderate number of points and moderate dimensionality.
ACM Computing Classification System (1998): I.5.3.
Keywords: Clustering, Bimodality, Multidimensional Space, Asymptotic Test
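The abstract's claim that projections can suggest well-separated clusters where none exist can be illustrated with a small simulation. The sketch below is not the test proposed in the paper; it is a minimal illustration under stated assumptions: data drawn from a spherical standard normal distribution (the null hypothesis), a plain Lloyd-style 2-means, and a separation score defined here as the gap between the projected cluster means divided by the pooled within-cluster standard deviation along the line joining the two centroids. The function names, the sample size n = 100, and the seed are hypothetical choices made for illustration only.

```python
import numpy as np


def two_means(x, rng, n_iter=50):
    """Plain Lloyd iterations for k = 2 (illustrative, not the paper's code)."""
    # Start from two distinct data points chosen at random.
    centers = x[rng.choice(len(x), size=2, replace=False)].copy()
    labels = np.zeros(len(x), dtype=int)
    for it in range(n_iter):
        # Squared distance from every point to each of the two centroids.
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if it > 0 and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for k in range(2):
            if np.any(labels == k):  # keep the old centroid if a cluster empties
                centers[k] = x[labels == k].mean(axis=0)
    return labels, centers


def projected_separation(x, labels, centers):
    """Gap between projected cluster means in units of pooled within-cluster SD."""
    direction = centers[1] - centers[0]
    direction = direction / np.linalg.norm(direction)
    proj = x @ direction  # 1-D projection onto the line joining the centroids
    a, b = proj[labels == 0], proj[labels == 1]
    pooled_sd = np.sqrt((a.var() * len(a) + b.var() * len(b)) / len(proj))
    return abs(b.mean() - a.mean()) / pooled_sd


def separation_for_dimension(d, n=100, seed=0):
    """Cluster n points from N(0, I_d), which has no true clusters, and score them."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, d))
    labels, centers = two_means(x, rng)
    return projected_separation(x, labels, centers)


# Assumed sample size n = 100; the data are unimodal in every dimension.
for d in (2, 10, 100, 1000):
    print(d, round(separation_for_dimension(d), 2))
```

With the number of points held fixed, the printed score should tend to grow with the dimension even though every sample is drawn from a single spherical normal distribution, which is the kind of misleading projection the abstract warns about. Exact values depend on the seed and on the chosen separation score.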
@article{SJC_2012_6_4_a5,
author = {Palejev, Dean},
title = {A {Bimodality} {Test} in {High} {Dimensions}},
journal = {Serdica Journal of Computing},
pages = {437--450},
year = {2012},
volume = {6},
number = {4},
language = {en},
url = {http://geodesic.mathdoc.fr/item/SJC_2012_6_4_a5/}
}
Palejev, Dean. A Bimodality Test in High Dimensions. Serdica Journal of Computing, Volume 6 (2012) no. 4, pp. 437-450. http://geodesic.mathdoc.fr/item/SJC_2012_6_4_a5/