Quadratic Euclidean 1-Mean and 1-Median 2-Clustering Problem with Constraints on the Size of the Clusters: Complexity and Approximability
Trudy Instituta matematiki i mehaniki, Trudy Instituta Matematiki i Mekhaniki UrO RAN, Tome 25 (2019) no. 4, pp. 69-78 Cet article a éte moissonné depuis la source Math-Net.Ru

Voir la notice du chapitre de livre

We consider the problem of partitioning a set of $N$ points in $d$-dimensional Euclidean space into two clusters minimizing the sum of the squared distances between each element and the center of the cluster to which it belongs. The center of the first cluster is its centroid (the geometric center). The center of the second cluster should be chosen among the points of the input set. We analyze the variant of the problem with given sizes (cardinalities) of the clusters; the sum of the sizes equals the cardinality of the input set. We prove that the problem is strongly NP-hard and there is no fully polynomial-time approximation scheme for it.
Keywords: Euclidean space, clustering, quadratic variation, center, median, strong NP-hardness, nonexistence of FPTAS, approximation-preserving reduction.
Mots-clés : 2-partition, centroid
@article{TIMM_2019_25_4_a6,
     author = {A. V. Kel'manov and A. V. Pyatkin and V. I. Khandeev},
     title = {Quadratic {Euclidean} {1-Mean} and {1-Median} {2-Clustering} {Problem} with {Constraints} on the {Size} of the {Clusters:} {Complexity} and {Approximability}},
     journal = {Trudy Instituta matematiki i mehaniki},
     pages = {69--78},
     year = {2019},
     volume = {25},
     number = {4},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/TIMM_2019_25_4_a6/}
}
TY  - JOUR
AU  - A. V. Kel'manov
AU  - A. V. Pyatkin
AU  - V. I. Khandeev
TI  - Quadratic Euclidean 1-Mean and 1-Median 2-Clustering Problem with Constraints on the Size of the Clusters: Complexity and Approximability
JO  - Trudy Instituta matematiki i mehaniki
PY  - 2019
SP  - 69
EP  - 78
VL  - 25
IS  - 4
UR  - http://geodesic.mathdoc.fr/item/TIMM_2019_25_4_a6/
LA  - ru
ID  - TIMM_2019_25_4_a6
ER  - 
%0 Journal Article
%A A. V. Kel'manov
%A A. V. Pyatkin
%A V. I. Khandeev
%T Quadratic Euclidean 1-Mean and 1-Median 2-Clustering Problem with Constraints on the Size of the Clusters: Complexity and Approximability
%J Trudy Instituta matematiki i mehaniki
%D 2019
%P 69-78
%V 25
%N 4
%U http://geodesic.mathdoc.fr/item/TIMM_2019_25_4_a6/
%G ru
%F TIMM_2019_25_4_a6
A. V. Kel'manov; A. V. Pyatkin; V. I. Khandeev. Quadratic Euclidean 1-Mean and 1-Median 2-Clustering Problem with Constraints on the Size of the Clusters: Complexity and Approximability. Trudy Instituta matematiki i mehaniki, Trudy Instituta Matematiki i Mekhaniki UrO RAN, Tome 25 (2019) no. 4, pp. 69-78. http://geodesic.mathdoc.fr/item/TIMM_2019_25_4_a6/

[1] Aloise D., Deshpande A., Hansen P., Popat P., “NP-hardness of Euclidean sum-of-squares clustering”, Machine Learning, 75:2 (2009), 245–248 | DOI | Zbl

[2] Kariv O., Hakimi S.L., “An algorithmic approach to network location problems. Pt. II: The $p$-Medians”, SIAM J. Appl. Math., 37:3 (1979), 513–538 | DOI | MR | Zbl

[3] Gimadi E.Kh., Kelmanov A.V., Kelmanova M.A., Khamidullin S.A., “Aposteriornoe obnaruzhenie v chislovoi posledovatelnosti kvaziperiodicheskogo fragmenta pri zadannom chisle povtorov”, Sib. zhurn. industr. matematiki, 9:1(25) (2006), 55–74 | MR | Zbl

[4] Gimadi E.Kh., Kel'manov A.V., Kel'manova M.A., Khamidullin S.A., “A posteriori detecting a quasiperiodic fragment in a numerical sequence”, Pattern Recognition and Image Anal., 18:1 (2008), 30–42 | DOI

[5] Baburin A.E., Gimadi E.Kh., Glebov N.I., Pyatkin A.V., “Zadacha otyskaniya podmnozhestva vektorov s maksimalnym summarnym vesom”, Diskretnyi analiz i issledovanie operatsii. Ser. 2, 14:1 (2007), 32–42 | Zbl

[6] James G., Witten D., Hastie T., Tibshirani R., An introduction to statistical learning, Springer, Science+Business Media, LLC, N Y, 2013, 426 pp. | MR | Zbl

[7] Bishop C.M., Pattern recognition and machine learning, Springer, Science+Business Media, LLC, N Y, 2006, 738 pp. | MR | Zbl

[8] Shirkhorshidi A.S., Aghabozorgi S, Wah T.Y., Herawan T., “Big data clustering: A review”, Computational Science and Its Applications (ICCSA 2014), Proc., Lecture Notes in Computer Science, 8583, eds. B. Murgante et al., 2014, 707–720 | DOI

[9] Aggarwal C.C., Data mining: The Textbook, Springer, International Publishing, N Y etc., 2015, 734 pp. | MR | Zbl

[10] Edwards A.W.F., Cavalli-Sforza L.L., “A method for cluster analysis”, Biometrics, 21 (1965), 362–375 | DOI

[11] Garey M.R., Johnson D.S., Computers and intractability: A guide to the theory of NP-completeness, Freeman, San Francisco, 1979, 338 pp. | MR | Zbl

[12] Papadimitriou C.H., Computational complexity, Addison-Wesley, N Y, 1994, 523 pp. | MR | Zbl

[13] Vazirani V.V., Approximation algorithms, Springer-Verlag, Berlin; Heidelberg; N Y, 2001, 380 pp. | MR

[14] Dolgushev A.V., Kelmanov A.V., “Priblizhennyi algoritm resheniya odnoi zadachi klasternogo analiza”, Diskretnyi analiz i issledovanie operatsii, 18:2 (2011), 29–40 | MR | Zbl

[15] Dolgushev A.V., Kelmanov A.V., Shenmaier V.V., “Priblizhennaya polinomialnaya skhema dlya odnoi zadachi klasternogo analiza”, Intellektualizatsiya obrabotki informatsii, 9-ya mezhdunar. konf., cb. dokl. (Resp. Chernogoriya, g. Budva, 16-22 sentyabrya 2012 g.), Torus Press, M., 2012, 242–244

[16] Kelmanov A.V., Khandeev V.I., “Randomizirovannyi algoritm dlya odnoi zadachi dvukhklasternogo razbieniya mnozhestva vektorov”, Zhurn. vychisl. matematiki i mat. fiziki, 55:2 (2015), 335–344 | DOI | Zbl

[17] Kel'manov A.V., Motkova A.V., Shenmaier V.V., “An approximation scheme for a weighted two-cluster partition problem”, Analysis of Images, Social Networks and Texts - 6th Internat. Conf. (AIST 2017), Revised Selected Papers, Lecture Notes in Computer Science, 10716, 2018, 323–333 | DOI