Statistical technique in clustering problems
Matematičeskoe modelirovanie, Vol. 34 (2022) no. 10, pp. 110-122.

See the article record from the source Math-Net.Ru

The problem of evaluating and improving the quality of clustering of multispectral data is considered. A method for calculating the distance between clusters is developed. The vectors of each cluster are treated as realizations of a random vector. Sampling distribution functions (SDFs) are found, and the errors of approximating the unknown exact distribution functions by the sampling ones are obtained. The distance between two clusters is defined as the distance between their SDFs. Criteria for indiscernible, overlapping, and disjoint clusters are defined. A technique for improving the clustering is suggested: indiscernible clusters, or indiscernible and overlapping ones, are merged successively. Results of experiments with simulated data are given; they show that the technique can decompose simulated data into the initial groups of vectors. Results of experiments with real data are also given. The real data are multispectral images from the HYPERION sensor, acquired over the ocean under clear sky and broken clouds. The suggested technique is shown to detect clouds and their shadows in the images.
Keywords: clustering, multispectral images, statistical techniques.
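The abstract defines the distance between two clusters as a distance between their sampling distribution functions and classifies cluster pairs as indiscernible, overlapping, or disjoint. The record does not give the actual metric, error estimate, or thresholds, so the following Python sketch is only an illustration under stated assumptions: it uses the sup-norm distance between per-channel empirical distribution functions, the Dvoretzky-Kiefer-Wolfowitz bound as a stand-in for the approximation error, and hypothetical decision rules. The names `ecdf_distance`, `dkw_error`, `cluster_distance`, and `relation` are invented for this example and are not taken from the paper.

```python
import numpy as np


def ecdf_distance(a, b):
    """Sup-norm distance between the empirical distribution functions
    of two 1-D samples (assumed metric, not the paper's)."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])
    # The ECDF at x is the fraction of sample points <= x.
    fa = np.searchsorted(a, grid, side="right") / a.size
    fb = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(fa - fb)))


def dkw_error(n, alpha=0.05):
    """Dvoretzky-Kiefer-Wolfowitz bound on sup|F_n - F| that holds with
    probability at least 1 - alpha (assumed error estimate)."""
    return float(np.sqrt(np.log(2.0 / alpha) / (2.0 * n)))


def cluster_distance(x, y):
    """Largest per-channel ECDF distance between two clusters.
    Rows are vectors, columns are spectral channels."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return max(ecdf_distance(x[:, j], y[:, j]) for j in range(x.shape[1]))


def relation(x, y, alpha=0.05):
    """Hypothetical criteria: a distance within the combined sampling
    error means 'indiscernible'; a distance of 1 in some channel means
    the samples do not overlap there, hence 'disjoint'."""
    d = cluster_distance(x, y)
    eps = dkw_error(len(x), alpha) + dkw_error(len(y), alpha)
    if d <= eps:
        return "indiscernible"
    if d < 1.0:
        return "overlapping"
    return "disjoint"
```

Under these assumptions, a merging step would repeatedly unite pairs for which `relation` returns "indiscernible" (or, in the second variant mentioned in the abstract, "indiscernible" or "overlapping") and recompute the distances afterwards. Note that for very small clusters the assumed error bound exceeds 1, so every pair would be reported as indiscernible.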
@article{MM_2022_34_10_a6,
     author = {O. V. Nikolaeva},
     title = {Statistical technique in clustering problems},
     journal = {Matemati\v{c}eskoe modelirovanie},
     pages = {110--122},
     publisher = {mathdoc},
     volume = {34},
     number = {10},
     year = {2022},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/MM_2022_34_10_a6/}
}
TY  - JOUR
AU  - O. V. Nikolaeva
TI  - Statistical technique in clustering problems
JO  - Matematičeskoe modelirovanie
PY  - 2022
SP  - 110
EP  - 122
VL  - 34
IS  - 10
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/MM_2022_34_10_a6/
LA  - ru
ID  - MM_2022_34_10_a6
ER  - 
%0 Journal Article
%A O. V. Nikolaeva
%T Statistical technique in clustering problems
%J Matematičeskoe modelirovanie
%D 2022
%P 110-122
%V 34
%N 10
%I mathdoc
%U http://geodesic.mathdoc.fr/item/MM_2022_34_10_a6/
%G ru
%F MM_2022_34_10_a6
O. V. Nikolaeva. Statistical technique in clustering problems. Matematičeskoe modelirovanie, Vol. 34 (2022) no. 10, pp. 110-122. http://geodesic.mathdoc.fr/item/MM_2022_34_10_a6/

[1] B. Bahmani, B. Moseley, A. Vattani, R. Kumar, S. Vassilvitskii, “Scalable K-means++”, Proc. of the VLDB Endowment, 5:7 (2012), 622–633

[2] S. V. Belim, P. E. Kutlunin, “Vydelenie konturov na izobrazheniiakh s pomoshchiu algoritma klasterizatsii”, Komputernaia Optika, 39:1 (2015), 119–125

[3] R. Agrawal, J. Gehrke, D. Gunopulos, P. Raghavan, “Automatic subspace clustering of high dimensional data for data mining applications”, Proc. ACM SIGMOD International Conference on Management of Data, ACM Press, Seattle, Washington, USA, 1998, 94–105

[4] H. Nagesh, S. Goi, A. Choudhary, “Adaptive grids for clustering massive data sets”, Proc. of 1st SIAM International Conference on Data Mining (Chicago, USA, 2001), 1–17

[5] N. P. Lin, C. I. Chang, N. Y. Jan, H. J. Chen, W. H. Hao, “A deflected grid-based algorithm for clustering analysis”, Int. J. of Math. Models Methods in Applied Sci., 1:1 (2007), 33–39

[6] I. A. Pestunov, V. B. Berikov, Yu. N. Sinyavskiy, “Segmentatsiia mnogospektralnykh izobrazhenii na osnove ansamblia neparametricheskikh algoritmov klasterizatsii”, Vestnik Sib. Gosud. aerokosmicheskogo univ. im. akad. M.F. Reshetneva, 2010, no. 5, 56–64

[7] N. P. Laverov, V. V. Popovich, L. A. Vedeshin, F. R. Galiano, “Metody analiza dannykh distantsionnogo zondirovaniia Zemli”, Sovr. problemy distantsionnogo zondirovaniia Zemli iz kosmosa, 12:6 (2015), 145–153

[8] L. M. Bruce, C. H. Koger, J. Li, “Dimensionality reduction of hyperspectral data using discrete wavelet transform feature extraction”, IEEE Transactions on Geoscience and Remote Sensing, 40:10 (2002), 2331–2338

[9] G. Sheikholeslami, S. Chatterjee, A. Zhang, “WaveCluster: A multi-resolution clustering approach for very large spatial databases”, Proc. of the 24th VLDB Conf. (NY, USA, 1998), 428–439

[10] F. Wang, Ch. H. Q. Ding, T. Li, “Integrated KL (K-means Laplacian) Clustering: A new clustering approach by combining attribute data and pairwise relations”, Proc. of SIAM Inter. Conf. on Data Mining (Sparks, Nevada, USA, 2009), 38–38

[11] A. A. Varlamova, A. Yu. Denisova, V. V. Sergeyev, “Informatsionnaia tekhnologia obrabotki dannykh DZZ dlia otsenki arealov rastenii”, Komputernaia Optika, 42:5 (2018), 864–876

[12] M. A. Kashnitskaya, “Issledovanie dinamiki ploshchadei vodnoi poverkhnosti ozer stepnoi zony Vostochnogo Zabaikalia na osnove dannykh distantsionnogo zondirovaniia Zemli”, Sovr. problemy distantsionnogo zondirovaniia Zemli iz kosmosa, 18:3 (2021), 242–253

[13] Y. Tarabalka, J. Chanussot, J. A. Benediktsson, “Segmentation and classification of hyperspectral images using watershed transformation”, Pattern Recognition, 43 (2010), 2367–2379

[14] E. V. Myasnikov, “Hyperspectral image segmentation using dimensionality reduction and classical segmentation approaches”, Computer Optics, 41:4 (2017), 564–572

[15] Yu. N. Orlov, S. L. Fedorov, Metody chislennogo modelirovaniia protsessov nestatsionarnogo sluchainogo bluzhdaniia, MFTI, M., 2016, 108 pp.

[16] G. G. Baula, M. N. Brychikhin, M. I. Istomina, A. Yu. Krotkov, E. Yu. Szhyonov, A. A. Rizvanov, V. N. Tret'yakov, “Formirovanie bazy dannykh giperspectralnykh opticheskikh kharakteristik selskokhoziaistvennykh kultur v ultrafioletovoi, vidimoi i blizhnei infrakrasnoi oblastiakh spektra”, Kosmonavtika i raketostroenie, 2013, no. 4, 178–184