Manifold learning based on kernel density estimation
Učënye zapiski Kazanskogo universiteta. Seriâ Fiziko-matematičeskie nauki, Uchenye Zapiski Kazanskogo Universiteta. Seriya Fiziko-Matematicheskie Nauki, Volume 160 (2018) no. 2, pp. 327-338. This article was harvested from the source Math-Net.Ru.

The paper considers the problem of estimating an unknown high-dimensional density whose measure is assumed to be supported on a low-dimensional data manifold. This problem arises in many data mining tasks. A new geometrically motivated solution is proposed within the manifold learning framework, including estimation of the unknown support of the density. First, the tangent bundle manifold learning problem is solved, which yields a transformation of the high-dimensional data into low-dimensional features together with an estimate of the Riemann tensor on the data manifold. Next, the unknown density of the constructed features is estimated with an appropriate kernel approach. Finally, the estimated Riemann tensor is used to construct the final estimator of the initial density.
Keywords: dimensionality reduction, manifold learning, manifold-valued data, density estimation on manifold.
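The last step of the abstract, turning a feature-space density into a density on the manifold, can be illustrated with a minimal sketch. It is not the authors' estimator. It assumes the tensor plays the role of a metric tensor G = JᵀJ of a reconstruction map g, so that the density with respect to the manifold volume is the feature density divided by sqrt(det G). The curve g(t) = (t, t²), the standard normal features, the Silverman bandwidth rule, and all function names are hypothetical choices made for illustration; in the paper both the features and the tensor are learned from data rather than known in closed form.

```python
import math
import random


def gaussian_kde(samples, bandwidth):
    """Build a 1-D Gaussian kernel density estimator from the samples."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def estimate(y):
        return norm * sum(
            math.exp(-0.5 * ((y - s) / bandwidth) ** 2) for s in samples
        )

    return estimate


def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth 1.06 * std * n^(-1/5) (an assumed choice)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * std * n ** (-0.2)


def metric_det(t):
    """det(J^T J) for the assumed curve g(t) = (t, t^2): J = (1, 2t)^T."""
    return 1.0 + 4.0 * t * t


def manifold_density(feature_density, t):
    """Density at g(t) w.r.t. arc length: feature density / sqrt(det G)."""
    return feature_density(t) / math.sqrt(metric_det(t))


# Hypothetical low-dimensional features: 5000 standard normal draws.
rng = random.Random(0)
features = [rng.gauss(0.0, 1.0) for _ in range(5000)]
feature_density = gaussian_kde(features, silverman_bandwidth(features))

for t in (0.0, 1.0):
    print(t, manifold_density(feature_density, t))
```

At t = 0 the curve is locally unit-speed, so the manifold density coincides with the feature density. Away from the origin the curve stretches, and the same probability mass is spread over more arc length, which lowers the density on the manifold.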
@article{UZKU_2018_160_2_a13,
     author = {A. P. Kuleshov and A. V. Bernstein and Yu. A. Yanovich},
     title = {Manifold learning based on~kernel density estimation},
     journal = {U\v{c}\"enye zapiski Kazanskogo universiteta. Seri\^a Fiziko-matemati\v{c}eskie nauki},
     pages = {327--338},
     year = {2018},
     volume = {160},
     number = {2},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/UZKU_2018_160_2_a13/}
}
A. P. Kuleshov; A. V. Bernstein; Yu. A. Yanovich. Manifold learning based on kernel density estimation. Učënye zapiski Kazanskogo universiteta. Seriâ Fiziko-matematičeskie nauki, Uchenye Zapiski Kazanskogo Universiteta. Seriya Fiziko-Matematicheskie Nauki, Volume 160 (2018) no. 2, pp. 327-338. http://geodesic.mathdoc.fr/item/UZKU_2018_160_2_a13/

[1] Seung H. S., “Cognition: The manifold ways of perception”, Science, 290:5500 (2000), 2268–2269 | DOI

[2] Huo X., Ni X. S., Smith A. K., “A survey of manifold-based learning methods”, Recent Advances in Data Mining of Enterprise Data: Algorithms and Applications, World Sci., Singapore, 2008, 691–745 | DOI

[3] Ma Y., Fu Y., Manifold Learning Theory and Applications, CRC Press, London, 2011, 314 pp. | MR

[4] Müller E., Assent I., Krieger R., Günnemann S., Seidl T., “DensEst: Density estimation for data mining in high dimensional spaces”, Proc. 2009 SIAM Int. Conf. on Data Mining, Soc. Ind. Appl. Math., Philadelphia, 2009, 175–186 | DOI

[5] Kriegel H. P., Kröger P., Renz M., Wurst S., “A generic framework for efficient subspace clustering of high-dimensional data”, Proc. 5th IEEE Int. Conf. on Data Mining (ICDM'05), IEEE, Houston, TX, 2005, 250–257 | DOI

[6] Zhu F., Yan X., Han J., Yu P. S., Cheng H., “Mining colossal frequent patterns by core pattern fusion”, Proc. 23rd IEEE Int. Conf. on Data Engineering, IEEE, Istanbul, 2007, 706–715 | DOI

[7] Bradley P., Fayyad U., Reina C., “Scaling clustering algorithms to large databases”, KDD-98 Proc. 4th Int. Conf. on Knowledge Discovery and Data Mining, Am. Assoc. Artif. Intell., New York, 1998, 9–15

[8] Weber R., Schek H. J., Blott S., “A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces”, Proc. 24th VLDB Conf. (New York, 1998), 194–205

[9] Domeniconi C., Gunopulos D., “An efficient density-based approach for data mining tasks”, Knowl. Inf. Syst., 6:6 (2004), 750–770 | DOI

[10] Bennett K. P., Fayyad U., Geiger D., “Density-based indexing for approximate nearest-neighbor queries”, KDD-99 Proc. 5th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (New York, 1999), 233–243 | DOI

[11] Scott D. W., “Multivariate density estimation and visualization”, Handbook of Computational Statistics, eds. Gentle J. E., Härdle W. K., Mori Yu., Springer, Berlin–Heidelberg, 2012, 549–569 | DOI | MR

[12] Niyogi P., Smale S., Weinberger S., “Finding the homology of submanifolds with high confidence from random samples”, Discrete Comput. Geom., 39:1–3 (2008), 419–441 | DOI | MR | Zbl

[13] Jost J., Riemannian Geometry and Geometric Analysis, Springer, Berlin–Heidelberg, 2005, xiii+566 pp. | DOI | MR | Zbl

[14] Lee J., Manifolds and Differential Geometry, Graduate Studies in Mathematics, 107, Am. Math. Soc., 2009, 671 pp. | DOI | MR | Zbl

[15] Pennec X., “Probabilities and statistics on Riemannian manifolds: Basic tools for geometric measurements”, Int. Workshop on Nonlinear Signal and Image Processing (NSIP-99) (Antalya, 1999), 194–198

[16] Pelletier B., “Kernel density estimation on Riemannian manifolds”, Stat. Probab. Lett., 73:3 (2005), 297–304 | DOI | MR | Zbl

[17] Henry G., Muñoz A., Rodriguez D., “Locally adaptive density estimation on Riemannian manifolds”, Sort: Stat. Oper. Res. Trans., 37:2 (2013), 111–130 | MR | Zbl

[18] Freedman D., “Efficient simplicial reconstructions of manifolds from their samples”, IEEE Trans. Pattern Anal. Mach. Intell., 24:10 (2002), 1349–1357 | DOI

[19] Kramer M. A., “Nonlinear principal component analysis using autoassociative neural networks”, AIChE J., 37:2 (1991), 233–243 | DOI

[20] Dinh L., Sohl-Dickstein J., Bengio S., Density estimation using Real NVP, 2016, 32 pp., arXiv: 1605.08803

[21] Zhang Z., Zha H., “Principal manifolds and nonlinear dimensionality reduction via tangent space alignment”, SIAM J. Sci. Comput., 26:1 (2004), 313–338 | DOI | MR | Zbl

[22] Bengio Y., Paiement J.-F., Vincent P., “Out-of-sample extensions for LLE, Isomap, MDS, eigenmaps, and spectral clustering”, Proc. 16th Int. Conf. on Neural Information Processing Systems, 2003, 177–184 | DOI

[23] Zhang P., Qiao H., Zhang B., “An improved local tangent space alignment method for manifold learning”, Pattern Recognit. Lett., 32:2 (2011), 181–189 | DOI

[24] Bernstein A. V., Kuleshov A. P., Tangent bundle manifold learning via Grassmann Stiefel eigenmaps, 2012, 25 pp., arXiv: 1212.6031

[25] Bernstein A., Kuleshov A. P., “Manifold Learning: Generalization ability and tangent proximity”, Int. J. Software Inf., 7:3 (2013), 359–390

[26] Genovese C. R., Perone-Pacifico M., Verdinelli I., Wasserman L., “Minimax manifold estimation”, J. Mach. Learn. Res., 13 (2012), 1263–1291 | MR | Zbl

[27] Yanovich Yu., “Asymptotic properties of local sampling on manifold”, J. Math. Stat., 12:3 (2016), 157–175 | DOI

[28] Yanovich Yu., “Asymptotic properties of nonparametric estimation on manifold”, Proc. 6th Workshop on Conformal and Probabilistic Prediction and Applications, Proceedings of Machine Learning Research, 60, 2017, 18–38

[29] Rozza A., Lombardi G., Rosa M., Casiraghi E., Campadelli P., “IDEA: Intrinsic dimension estimation algorithm”, Proc. Int. Conf. “Image Analysis and Processing (ICIAP 2011)”, Springer, Berlin–Heidelberg, 2011, 433–442 | DOI | MR

[30] Campadelli P., Casiraghi E., Ceruti C., Rozza A., “Intrinsic dimension estimation: Relevant techniques and a benchmark framework”, Math. Probl. Eng., 2015 (2015), 759567, 1–21 | DOI | MR

[31] Rosenblatt M., “Remarks on some nonparametric estimates of a density function”, Ann. Math. Stat., 27:3 (1956), 832–837 | DOI | MR | Zbl

[32] Parzen E., “On estimation of a probability density function and mode”, Ann. Math. Stat., 33:3 (1962), 1065–1076 | DOI | MR | Zbl

[33] Wagner T. J., “Nonparametric estimates of probability densities”, IEEE Trans. Inf. Theory, 21:4 (1975), 438–440 | DOI | MR | Zbl

[34] Henry G., Rodriguez D., “Kernel density estimation on Riemannian manifolds: Asymptotic results”, J. Math. Imaging Vis., 34:3 (2009), 235–239 | DOI | MR

[35] Hendriks H., “Nonparametric estimation of a probability density on a Riemannian manifold using Fourier expansions”, Ann. Stat., 18:2 (1990), 832–849 | DOI | MR | Zbl

[36] Ozakin A., Gray A., “Submanifold density estimation”, Proc. Conf. “Neural Information Processing Systems” (NIPS 2009), 2009, 1–8

[37] Kuleshov A., Bernstein A., Yanovich Yu., “High-dimensional density estimation for data mining tasks”, Proc. 2017 IEEE Int. Conf. on Data Mining Workshops (ICDMW), IEEE, New Orleans, LA, 2017, 523–530 | DOI

[38] Bernstein A., Kuleshov A., Yanovich Yu., “Asymptotically optimal method for manifold estimation problem”, Proc. XXIX Eur. Meet. of Statisticians (Budapest, 2013), 8–9

[39] Kuleshov A., Bernstein A., “Manifold learning in data mining tasks”, Proc. MLDM 2014: Machine Learning and Data Mining in Pattern Recognition, 2014, 119–133 | DOI

[40] Bernstein A., Kuleshov A., Yanovich Yu., “Information preserving and locally isometric conformal embedding via Tangent Manifold Learning”, Proc. 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA), IEEE, Paris, 2015, 1–9 | DOI

[41] Xiong Y., Chen W., Apley D., Ding X., “A non-stationary covariance-based Kriging method for metamodelling in engineering design”, Int. J. Numer. Methods Eng., 71:6 (2007), 733–756 | DOI | Zbl