Space semantic aware loss function for embedding creation in case of transaction data
Journal of the Belarusian State University. Mathematics and Informatics, Volume 1 (2022), pp. 97-102.

See the article record from the source Math-Net.Ru

Transaction data are the most common data type in the banking domain and are often represented as sparse vectors with a large number of features. Using sparse vectors in deep learning tasks is computationally inefficient and may lead to overfitting. Autoencoders are widely applied to extract useful new features in a lower-dimensional space. In this paper we propose a novel loss function based on a metric that estimates how well the semantic structure of the original tabular data is mapped to the embedded space. The proposed loss function preserves the item relation structure of the original space during dimensionality reduction. The results show that combining the new loss function with the traditional mean squared error loss improves the properties of the resulting embedding.
Keywords: data; embedding; vector; loss function; autoencoder.
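The abstract does not give the exact form of the proposed metric, so the following is only a minimal sketch of the general idea, under stated assumptions: the structure term is taken to be the mean squared gap between scale-normalised pairwise distance matrices in the original and embedded spaces, and it is mixed with the mean squared error through a hypothetical weight `alpha`. The names `structure_loss`, `combined_loss`, and `alpha` are illustrative and do not come from the paper.

```python
import numpy as np


def pairwise_distances(x):
    """Euclidean distance between every pair of rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def structure_loss(x, z, eps=1e-12):
    """Mean squared gap between scale-normalised distance matrices.

    Assumption: a stand-in for the paper's semantic-structure metric,
    which the abstract does not define.
    """
    dx = pairwise_distances(x)
    dz = pairwise_distances(z)
    dx = dx / (dx.max() + eps)  # remove the overall scale of each space
    dz = dz / (dz.max() + eps)
    return float(((dx - dz) ** 2).mean())


def combined_loss(x, x_rec, z, alpha=0.5):
    """Convex mix of reconstruction MSE and the structure term.

    alpha is a hypothetical mixing weight, not a value from the paper.
    """
    mse = float(((x - x_rec) ** 2).mean())
    return (1.0 - alpha) * mse + alpha * structure_loss(x, z)


# Toy usage: sparse rows, a random linear "encoder", a pseudo-inverse "decoder".
rng = np.random.default_rng(0)
x = rng.random((8, 20)) * (rng.random((8, 20)) < 0.2)
w = rng.normal(size=(20, 3))
z = x @ w                       # 3-dimensional embedding
x_rec = z @ np.linalg.pinv(w)   # reconstruction in the original space
print(combined_loss(x, x_rec, z))
```

In a real autoencoder the same two terms would be computed on each mini-batch inside an automatic-differentiation framework; NumPy is used here only to keep the sketch self-contained.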
@article{BGUMI_2022_1_a9,
     author = {M. E. Vatkin and D. A. Vorobey and M. V. Yakovlev and M. G. Krivova},
     title = {Space semantic aware loss function for embedding creation in case of transaction data},
     journal = {Journal of the Belarusian State University. Mathematics and Informatics},
     pages = {97--102},
     publisher = {mathdoc},
     volume = {1},
     year = {2022},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/BGUMI_2022_1_a9/}
}
M. E. Vatkin; D. A. Vorobey; M. V. Yakovlev; M. G. Krivova. Space semantic aware loss function for embedding creation in case of transaction data. Journal of the Belarusian State University. Mathematics and Informatics, Volume 1 (2022), pp. 97-102. http://geodesic.mathdoc.fr/item/BGUMI_2022_1_a9/
