Effective methods for Email Classification: Is it a Business or Personal Email?
Computer Science and Information Systems, Volume 19 (2022) no. 3.

See the article record from the source: Computer Science and Information Systems website

With the steady increase in the number of Internet users, email remains the most popular and extensively used means of communication. Therefore, email management is an important and growing problem for individuals and organizations. In this paper, we deal with the classification of emails into two main categories, Business and Personal. To find the best performing solution for this problem, a comprehensive set of experiments has been conducted with the deep learning algorithms Bidirectional Long Short-Term Memory (BiLSTM) and Attention-based BiLSTM (BiLSTM+Att), together with traditional Machine Learning (ML) algorithms: Stochastic Gradient Descent (SGD) optimization applied to a Support Vector Machine (SVM) and the Extremely Randomized Trees (ERT) ensemble method. Variations of individual email and conversational email thread arc representations have been explored to reach the best classification generalization on the selected task. A special contribution of this paper is the extraction of a large number of additional lexical, conversational, expressional, emotional, and moral features, which proved very useful for differentiating between personal and official written conversations. The experiments were performed on the publicly available Enron email benchmark corpora, on which we obtained state-of-the-art (SOA) results. As part of the submission, we have made our work publicly available to the scientific community for research purposes.
Keywords: Email classification, business, personal, deep learning, BiLSTM, SGD, BERT embeddings, Tf-Idf, lexicons, NLP
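As a rough illustration of one technique named in the abstract (a linear SVM fitted by stochastic gradient descent on lexical features), the following minimal sketch trains a hinge-loss classifier on a toy corpus. It is not the authors' implementation: the function names (featurize, train_sgd_svm, predict), the hyperparameters, the omission of a bias term, the use of raw term counts in place of Tf-Idf weights, and the four example emails are all assumptions made for this sketch.

```python
from collections import Counter, defaultdict


def featurize(text):
    """Bag-of-words term counts (a simple stand-in for Tf-Idf weighting)."""
    return Counter(text.lower().split())


def train_sgd_svm(docs, labels, epochs=20, lr=0.1, lam=0.01):
    """Fit a linear SVM (hinge loss + L2 penalty) with plain SGD, no bias term.

    labels use +1 for Business and -1 for Personal (an assumed encoding).
    """
    w = defaultdict(float)
    for _ in range(epochs):
        # Fixed visiting order keeps this toy run deterministic;
        # real SGD would normally shuffle the examples each epoch.
        for text, y in zip(docs, labels):
            x = featurize(text)
            margin = y * sum(w[t] * c for t, c in x.items())
            # L2 shrinkage is applied on every step.
            for t in list(w):
                w[t] *= 1.0 - lr * lam
            # Hinge-loss subgradient step only when the margin is violated.
            if margin < 1.0:
                for t, c in x.items():
                    w[t] += lr * y * c
    return w


def predict(w, text):
    """Return the class whose side of the hyperplane the email falls on."""
    score = sum(w.get(t, 0.0) * c for t, c in featurize(text).items())
    return "Business" if score > 0 else "Personal"


# Hypothetical toy emails, invented for this sketch (not Enron data).
DOCS = [
    "review attached contract before meeting",
    "quarterly budget report for board",
    "dinner with family this weekend",
    "happy birthday see you at party",
]
LABELS = [+1, +1, -1, -1]

weights = train_sgd_svm(DOCS, LABELS)
```

Calling predict(weights, "contract meeting") or predict(weights, "family dinner") then classifies a new message. Because the two toy classes share no vocabulary, each word's weight keeps the sign of its class, which makes the example easy to reason about but says nothing about performance on real email.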
@article{CSIS_2022_19_3_a13,
     author = {Milena \v{S}o\v{s}i\'c and Jelena Graovac},
     title = {Effective methods for {Email} {Classification:} {Is} it a {Business} or {Personal} {Email?}},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {19},
     number = {3},
     year = {2022},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2022_19_3_a13/}
}
TY  - JOUR
AU  - Milena Šošić
AU  - Jelena Graovac
TI  - Effective methods for Email Classification: Is it a Business or Personal Email?
JO  - Computer Science and Information Systems
PY  - 2022
VL  - 19
IS  - 3
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/CSIS_2022_19_3_a13/
ID  - CSIS_2022_19_3_a13
ER  - 
%0 Journal Article
%A Milena Šošić
%A Jelena Graovac
%T Effective methods for Email Classification: Is it a Business or Personal Email?
%J Computer Science and Information Systems
%D 2022
%V 19
%N 3
%I mathdoc
%U http://geodesic.mathdoc.fr/item/CSIS_2022_19_3_a13/
%F CSIS_2022_19_3_a13
Milena Šošić; Jelena Graovac. Effective methods for Email Classification: Is it a Business or Personal Email? Computer Science and Information Systems, Volume 19 (2022) no. 3. http://geodesic.mathdoc.fr/item/CSIS_2022_19_3_a13/