Computational complexity of the problem of~choosing typical representatives in~a~2-clustering of a~finite set of~points in~a~metric~space
Diskretnyj analiz i issledovanie operacij, Tome 27 (2020) no. 2, pp. 5-16.

Voir la notice de l'article provenant de la source Math-Net.Ru

We consider the computational complexity of one extremal problem of choosing a subset of $p$ points from some given $2$-clustering of a finite set in a metric space. The chosen subset of points has to describe the given clusters in the best way from the viewpoint of some geometric criterion. This is a formalization of an applied problem of data mining which consists in finding a subset of typical representatives of a dataset composed of two classes based on the function of rival similarity. The problem is proved to be NP-hard. To this end, we polynomially reduce to the problem one of the well-known problems NP-hard in the strong sense, the $p$-median problem. Bibliogr. 15.
Keywords: NP-hard problem, typical representative, rival similarity, $p$-median problem, data mining.
@article{DA_2020_27_2_a0,
     author = {I. A. Borisova},
     title = {Computational complexity of the problem of~choosing typical representatives in~a~2-clustering of a~finite set of~points in~a~metric~space},
     journal = {Diskretnyj analiz i issledovanie operacij},
     pages = {5--16},
     publisher = {mathdoc},
     volume = {27},
     number = {2},
     year = {2020},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/DA_2020_27_2_a0/}
}
TY  - JOUR
AU  - I. A. Borisova
TI  - Computational complexity of the problem of~choosing typical representatives in~a~2-clustering of a~finite set of~points in~a~metric~space
JO  - Diskretnyj analiz i issledovanie operacij
PY  - 2020
SP  - 5
EP  - 16
VL  - 27
IS  - 2
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/DA_2020_27_2_a0/
LA  - ru
ID  - DA_2020_27_2_a0
ER  - 
%0 Journal Article
%A I. A. Borisova
%T Computational complexity of the problem of~choosing typical representatives in~a~2-clustering of a~finite set of~points in~a~metric~space
%J Diskretnyj analiz i issledovanie operacij
%D 2020
%P 5-16
%V 27
%N 2
%I mathdoc
%U http://geodesic.mathdoc.fr/item/DA_2020_27_2_a0/
%G ru
%F DA_2020_27_2_a0
I. A. Borisova. Computational complexity of the problem of~choosing typical representatives in~a~2-clustering of a~finite set of~points in~a~metric~space. Diskretnyj analiz i issledovanie operacij, Tome 27 (2020) no. 2, pp. 5-16. http://geodesic.mathdoc.fr/item/DA_2020_27_2_a0/

[1] Garey M. R., Johnson D. S., Computers and intractability: A guide to the theory of NP-completeness, Freeman, San Francisco, 1979 | MR | Zbl

[2] Aloise D., Deshpande A., Hansen P., Popat P., “NP-hardness of Euclidean sum-of-squares clustering”, Mach. Learn., 75:2 (2009), 245–248 | DOI | Zbl

[3] Dasgupta S., The hardness of $k$-means clustering, Tech. Rep. CS2007-0890, Univ. California, San Diego, 2008, 6 pp.

[4] Papadimitriou C. H., “Worst-case and probabilistic analysis of a geometric location problem”, SIAM J. Comput., 10:3 (1981), 542–557 | DOI | MR | Zbl

[5] Har-Peled S., Mazumdar S., “Coresets for $k$-means and $k$-median clustering and their applications”, Proc. 36th Annu. ACM Sympos. Theory Comput. (Chicago, IL, USA, June 13–15, 2004), ACM, New York, 2004, 291–300 | MR | Zbl

[6] Kaufman L., Rousseeuw P. J., “Clustering by means of medoids”, Statistical data analysis based on the $L_{1}$-norm and related methods, North Holland, Amsterdam, 1987, 405–416 | MR

[7] A. V. Kel'manov, A. V. Pyatkin, V. I. Khandeev, “On the complexity of some max-min clustering problems”, Tr. Inst. Mat. Mekh. UrO RAN, 24, no. 4, 2018, 189–198 (Russian) | MR

[8] A. V. Kel'manov, A. V. Pyatkin, “NP-hardness of some Euclidean problems of partition of a finite set of points”, Zh. Vychisl. Mat. Mat. Fiz., 58:5 (2018), 852–856 (Russian) | DOI | Zbl

[9] Aggarwal H., Imai N., Katoh N., Suri S., “Finding $k$ points with minimum diameter and related problems”, J. Algorithms, 12:1 (1991), 38–56 | DOI | MR | Zbl

[10] Zukhba A. V., “NP-completeness of the problem of prototype selection in the nearest neighbor method”, Pattern Recognit. Image Anal., 20:4 (2010), 484–494 | DOI

[11] Banerjee S., Bhore S., Chitnis R., “Algorithms and hardness results for nearest neighbor problems in bicolored point sets”, Proc. 13th Latin Amer. Theor. Inform. Symp. (Buenos Aires, Argentina, Apr. 16–19, 2018), Lect. Notes Comput. Sci., 10807, Springer, Cham, 2018, 80–93 | DOI | MR | Zbl

[12] V. N. Vapnik, The Restoration of Dependencies from Empirical Data, Nauka, M., 1974 (Russian) | MR

[13] Burges C. J. C., “A Tutorial on support vector machines for pattern recognition”, Data Mining Knowl. Discov., 2:2 (1998), 121–167 | DOI | MR

[14] Zagoruiko N. G., Borisova I. A., Dyubanov V. V., Kutnenko O. A., “Methods of recognition based on the function of rival similarity”, Pattern Recognit. Image Anal., 18:1 (2008), 1–6 | DOI

[15] Kariv O., Hakimi S., “An algorithmic approach to network location problems. II: The $p$-medians”, SIAM J. Appl. Math., 37 (1979), 539–560 | DOI | MR | Zbl