Attainable best guarantee for the accuracy of $k$-medians clustering in $[0,1]$
Trudy Instituta matematiki i mehaniki, Trudy Instituta Matematiki i Mekhaniki UrO RAN, Tome 23 (2017) no. 4, pp. 301-310
Cet article a éte moissonné depuis la source Math-Net.Ru

Voir la notice du chapitre de livre

The scalar $k$-medians clustering problem is considered in the context of a two-player zero-sum game. The set of strategies of the first player coincides with a family of fixed-length samples from the interval $[0,1]$. The strategies of the second player are all possible partitions of an arbitrary sample of a given length into a given number of clusters. The quality of the clustering is evaluated by the payoff function equal to the sum of deviations of the elements from the centers of clusters nearest to them. It is easy to see that the game has no value except for rare cases. For arbitrary positive integers $n$ and $k$, we establish an upper bound $0.5n/(2k-1)$ for the lower value of the game and prove its attainability for $k>1$ and sufficiently large $n=n(k)$. Thus, we show that a clustering of an arbitrary sample of length $n$ can be constructed by the $k$ medians method so that the payoff does not exceed the obtained bound, and the bound is attainable for an arbitrary number of clusters and for sufficiently long samples. These results are applicable in combinatorial optimization in the proof of polynomial solvability of subclasses of intractable extremal problems.
Keywords: clustering, $k$-medians problem, attainable accuracy guarantee.
@article{TIMM_2017_23_4_a28,
     author = {M. Yu. Khachai and D. M. Khachai and V. S. Pankratov},
     title = {Attainable best guarantee for the accuracy of $k$-medians clustering in~$[0,1]$},
     journal = {Trudy Instituta matematiki i mehaniki},
     pages = {301--310},
     year = {2017},
     volume = {23},
     number = {4},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/TIMM_2017_23_4_a28/}
}
TY  - JOUR
AU  - M. Yu. Khachai
AU  - D. M. Khachai
AU  - V. S. Pankratov
TI  - Attainable best guarantee for the accuracy of $k$-medians clustering in $[0,1]$
JO  - Trudy Instituta matematiki i mehaniki
PY  - 2017
SP  - 301
EP  - 310
VL  - 23
IS  - 4
UR  - http://geodesic.mathdoc.fr/item/TIMM_2017_23_4_a28/
LA  - ru
ID  - TIMM_2017_23_4_a28
ER  - 
%0 Journal Article
%A M. Yu. Khachai
%A D. M. Khachai
%A V. S. Pankratov
%T Attainable best guarantee for the accuracy of $k$-medians clustering in $[0,1]$
%J Trudy Instituta matematiki i mehaniki
%D 2017
%P 301-310
%V 23
%N 4
%U http://geodesic.mathdoc.fr/item/TIMM_2017_23_4_a28/
%G ru
%F TIMM_2017_23_4_a28
M. Yu. Khachai; D. M. Khachai; V. S. Pankratov. Attainable best guarantee for the accuracy of $k$-medians clustering in $[0,1]$. Trudy Instituta matematiki i mehaniki, Trudy Instituta Matematiki i Mekhaniki UrO RAN, Tome 23 (2017) no. 4, pp. 301-310. http://geodesic.mathdoc.fr/item/TIMM_2017_23_4_a28/

[1] Aggarwal C.C., Reddy C.K., Data clustering: algorithms and applications, Chapman Hall/CRC Data Mining and Knowledge Discovery Ser., Taylor Francis Inc., Bosa Roca, 2013, 652 pp. | MR

[2] Ben-David S., Computational feasibility of clustering under clusterability assumptions, [e-resource]. CoRR abs/1501.00437, 2015, arXiv: 1501.00437

[3] Duda R.O., Hart P.E., Stork D.G., Pattern classification, Wile, N. Y., 2001, 680 pp. | MR | Zbl

[4] A. Gronlund, K.G. Larsen, A. Mathiasen, J.S. Nielsen, Fast exact $k$-means, $k$-medians and bregman divergence clustering in 1d, [e-resource]. CoRR abs/1701.07204, 2017, arXiv: 1701.07204

[5] Guruswami V., Indyk P., “Embeddings and non-approximability of geometric problems”, Proc. of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '03), Society for industrial and applied mathematics, Philadelphia, 2003, 537–538 | MR | Zbl

[6] Har-Peled S., Mazumdar S., “On coresets for $k$-means and $k$-median clustering”, Proc. of the Thirty-Sixth Annual ACM Symposium on Theory of Computing (STOC '04), ACM, N. Y., 2004, 291–300 | DOI | MR | Zbl

[7] Khachay M., Neznakhina K., “Generalized pyramidal tours for the generalized traveling salesman problem”, Lecture Notes in Computer Science, 10627, 2017, 265–277 | DOI

[8] Khachay M., Pankratov V., Khachay D., “Attainable best guarantee for the accuracy of $k$-medians clustering in [0,1]”, 8th International Conf. Optimization and Applications (OPTIMA2017), eds. eds. Y.G. Evtushenko, et al., CEUR Workshop Proceedings, Aachen, 2017, 322–327

[9] Kumar A., Sabharwal Y., Sen S., “Linear-time approximation schemes for clustering problems in any dimensions”, J. ACM, 57(2), Feb (2010), 5:1-5:32 | DOI | MR