A New Approximate Method For Mining Frequent Itemsets From Big Data
Computer Science and Information Systems, Volume 18 (2021) no. 3.

See the article record from the source Computer Science and Information Systems website

Mining frequent itemsets in transaction databases is an important task in many applications. It becomes more challenging with a large transaction database, because traditional algorithms do not scale beyond the available memory. In this paper, we propose a new approach for the approximate mining of frequent itemsets in a big transaction database. The approach suits big transaction databases because it produces approximate frequent itemsets from a subset of the entire database and can be implemented in a distributed environment. The algorithm efficiently produces highly accurate results; however, it misses some true frequent itemsets. To address this problem and reduce the number of false negatives, we introduce an additional parameter to the algorithm so that it discovers most of the frequent itemsets contained in the entire data set. We also present an empirical evaluation of the proposed approach.
Keywords: Approximate Method, Frequent Itemsets Mining, Random Sample Partition, Big Transaction Database
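The general idea in the abstract, mining a subset of the database and loosening the threshold to recover itemsets that the subset would otherwise miss, can be illustrated with a minimal sketch. This is not the authors' algorithm: the record above does not specify the additional parameter or the Random Sample Partition model, so the sketch substitutes plain simple random sampling, and the names `sample_fraction`, `relax`, and `max_size` are hypothetical choices made only for illustration.

```python
import random
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force miner: return {itemset: support} for itemsets of up to
    max_size items whose relative support is at least min_support."""
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))  # ignore duplicate items within a transaction
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def approximate_frequent_itemsets(transactions, min_support,
                                  sample_fraction=0.2, relax=0.0,
                                  seed=0, max_size=3):
    """Mine a simple random sample instead of the full database.

    `relax` (a hypothetical stand-in for the abstract's extra parameter)
    lowers the threshold applied to the sample, trading more false
    positives for fewer false negatives.
    """
    rng = random.Random(seed)
    sample_size = max(1, int(len(transactions) * sample_fraction))
    sample = rng.sample(transactions, sample_size)
    threshold = max(0.0, min_support - relax)
    return frequent_itemsets(sample, threshold, max_size)
```

With `sample_fraction=1.0` and `relax=0.0` the sketch reduces to exact mining, and for a fixed seed a larger `relax` can only add itemsets to the result, never remove them. How many true frequent itemsets a given sample recovers depends on the data and the sample size.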
@article{CSIS_2021_18_3_a2,
     author = {Timur Valiullin and Joshua Zhexue Huang and Chenghao Wei and Jianfei Yin and Dingming Wu and Iuliia Egorova},
     title = {A {New} {Approximate} {Method} {For} {Mining} {Frequent} {Itemsets} {From} {Big} {Data}},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {18},
     number = {3},
     year = {2021},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2021_18_3_a2/}
}
TY  - JOUR
AU  - Timur Valiullin
AU  - Joshua Zhexue Huang
AU  - Chenghao Wei
AU  - Jianfei Yin
AU  - Dingming Wu
AU  - Iuliia Egorova
TI  - A New Approximate Method For Mining Frequent Itemsets From Big Data
JO  - Computer Science and Information Systems
PY  - 2021
VL  - 18
IS  - 3
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/CSIS_2021_18_3_a2/
ID  - CSIS_2021_18_3_a2
ER  - 
%0 Journal Article
%A Timur Valiullin
%A Joshua Zhexue Huang
%A Chenghao Wei
%A Jianfei Yin
%A Dingming Wu
%A Iuliia Egorova
%T A New Approximate Method For Mining Frequent Itemsets From Big Data
%J Computer Science and Information Systems
%D 2021
%V 18
%N 3
%I mathdoc
%U http://geodesic.mathdoc.fr/item/CSIS_2021_18_3_a2/
%F CSIS_2021_18_3_a2
Timur Valiullin; Joshua Zhexue Huang; Chenghao Wei; Jianfei Yin; Dingming Wu; Iuliia Egorova. A New Approximate Method For Mining Frequent Itemsets From Big Data. Computer Science and Information Systems, Volume 18 (2021) no. 3. http://geodesic.mathdoc.fr/item/CSIS_2021_18_3_a2/