Applied Machine Learning in Recognition of DGA Domain Names
Computer Science and Information Systems, Volume 19 (2022) no. 1.

See the article record from the source Computer Science and Information Systems website

Recognition of domain names generated by domain generation algorithms (DGAs) is an essential part of malware detection based on network traffic inspection. Besides basic heuristics (HE) and limited blacklist-based detection, machine learning (ML) appears to be the most promising approach. Few studies extensively compare different ML models for DGA binary classification, including both conventional and deep learning (DL) representatives. Those that exist either focus on a small set of models, use a poor feature set, or fail to ensure independence between training and evaluation samples. To overcome these limitations, we engineered a robust feature set, and accordingly trained and evaluated 14 ML, 9 DL, and 2 comparative models on two independent datasets. Results show that if ML features are properly engineered, there is only a marginal difference in overall score between the top ML and DL representatives. This paper represents the first attempt to neutrally compare the performance of many different models for the recognition of DGA domain names, where the best models perform as well as the top representatives from the literature.
Keywords: domain generation algorithm, binary classification, supervised machine learning, deep learning, blind evaluation
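To make the notion of engineered lexical features concrete, the following is a minimal sketch of the kind of per-domain features a conventional ML classifier might consume. It is not the feature set used in the paper, which the abstract does not enumerate. The function names (`shannon_entropy`, `domain_features`) and the four chosen features are assumptions made for illustration, and taking the left-most label of the name is a deliberate simplification.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character; 0.0 for empty input."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def domain_features(domain: str) -> dict:
    """Hypothetical lexical features computed on the left-most label.

    Assumption: the left-most label is the part of interest. A real
    pipeline would likely separate the public suffix first.
    """
    label = domain.lower().strip(".").split(".")[0]
    length = len(label)
    letters = [c for c in label if c.isalpha()]
    return {
        "length": length,
        "entropy": shannon_entropy(label),
        "digit_ratio": sum(c.isdigit() for c in label) / length if length else 0.0,
        "vowel_ratio": sum(c in VOWELS for c in letters) / len(letters) if letters else 0.0,
    }


if __name__ == "__main__":
    for name in ("google.com", "x1y2.net"):
        print(name, domain_features(name))
```

Feature dictionaries like these could be stacked into a matrix and passed to any binary classifier. Keeping training and evaluation domains in separate, independently collected datasets, as the abstract stresses, matters more than the specific features shown here.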
@article{CSIS_2022_19_1_a11,
     author = {Miroslav \v{S}tampar and Kre\v{s}imir Fertalj},
     title = {Applied {Machine} {Learning} in {Recognition} of {DGA} {Domain} {Names}},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {19},
     number = {1},
     year = {2022},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2022_19_1_a11/}
}
TY  - JOUR
AU  - Miroslav Štampar
AU  - Krešimir Fertalj
TI  - Applied Machine Learning in Recognition of DGA Domain Names
JO  - Computer Science and Information Systems
PY  - 2022
VL  - 19
IS  - 1
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/CSIS_2022_19_1_a11/
ID  - CSIS_2022_19_1_a11
ER  - 
%0 Journal Article
%A Miroslav Štampar
%A Krešimir Fertalj
%T Applied Machine Learning in Recognition of DGA Domain Names
%J Computer Science and Information Systems
%D 2022
%V 19
%N 1
%I mathdoc
%U http://geodesic.mathdoc.fr/item/CSIS_2022_19_1_a11/
%F CSIS_2022_19_1_a11
Miroslav Štampar; Krešimir Fertalj. Applied Machine Learning in Recognition of DGA Domain Names. Computer Science and Information Systems, Volume 19 (2022) no. 1. http://geodesic.mathdoc.fr/item/CSIS_2022_19_1_a11/