Instance-based classification using prototypes generated from large noisy and streaming datasets
Computer Science and Information Systems, Volume 17 (2020) no. 1.

See the article record from the source Computer Science and Information Systems website

Nowadays, large volumes of training data are available from various data sources and streaming environments. Instance-based classifiers perform adequately when they use only a small subset of such datasets. Larger data volumes introduce a high computational cost that prohibits the timely execution of the classification process. Conventional prototype selection and generation algorithms are also inappropriate for data streams and large datasets. In the past, we proposed prototype generation algorithms that maintain a dynamic set of prototypes and are appropriate for such types of data. The set is dynamic because existing prototypes may be updated, or new prototypes may be appended to it, in the course of processing. Still, the repeated generation of new prototypes may result in unpredictably large sets of prototypes. In this paper, we propose a new variation of our algorithm that keeps the set of prototypes manageable. This is achieved by removing the weakest prototype when a new prototype is generated. The new algorithm has been tested on several datasets. The experimental results reveal that it is as accurate as its predecessor, yet more efficient and more noise tolerant.
Keywords: k-NN classification, Data reduction, Prototype generation, Data streams, Large datasets, Noisy data
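The idea summarized in the abstract (update an existing prototype or generate a new one, and remove the weakest prototype when a new one is generated) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not define how prototypes are updated or what "weakest" means. The sketch assumes that a prototype is a running mean of the instances it has absorbed, that its strength is the number of absorbed instances, and that the set has a fixed capacity. The names `Prototype`, `BoundedPrototypeSet`, `capacity`, `learn`, and `predict` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Prototype:
    center: list      # running mean of the instances absorbed so far
    label: str
    weight: int = 1   # ASSUMPTION: strength = number of absorbed instances


class BoundedPrototypeSet:
    """Illustrative dynamic prototype set with a fixed maximum size."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.prototypes = []

    def nearest(self, x):
        # 1-NN search over the current prototypes (None if the set is empty).
        return min(self.prototypes,
                   key=lambda p: math.dist(p.center, x),
                   default=None)

    def learn(self, x, label):
        p = self.nearest(x)
        if p is not None and p.label == label:
            # Same class as the nearest prototype: update it in place
            # by moving its center to the new running mean.
            p.weight += 1
            p.center = [c + (xi - c) / p.weight
                        for c, xi in zip(p.center, x)]
            return
        # Otherwise a new prototype is generated. ASSUMPTION: when the set
        # is full, the prototype with the smallest weight is removed first.
        if len(self.prototypes) >= self.capacity:
            weakest = min(range(len(self.prototypes)),
                          key=lambda i: self.prototypes[i].weight)
            del self.prototypes[weakest]
        self.prototypes.append(Prototype(list(x), label))

    def predict(self, x):
        p = self.nearest(x)
        return None if p is None else p.label
```

Feeding a stream of `(x, label)` pairs to `learn` keeps at most `capacity` prototypes in memory, and `predict` classifies a new instance by the label of its nearest prototype. Other definitions of prototype strength (for example, recency or classification success) would fit the same structure.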
@article{CSIS_2020_17_1_a5,
     author = {Stefanos Ougiaroglou and Dimitris A. Dervos and Georgios Evangelidis},
     title = {Instance-based classification using prototypes generated from large noisy and streaming datasets},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {17},
     number = {1},
     year = {2020},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2020_17_1_a5/}
}
TY  - JOUR
AU  - Stefanos Ougiaroglou
AU  - Dimitris A. Dervos
AU  - Georgios Evangelidis
TI  - Instance-based classification using prototypes generated from large noisy and streaming datasets
JO  - Computer Science and Information Systems
PY  - 2020
VL  - 17
IS  - 1
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/CSIS_2020_17_1_a5/
ID  - CSIS_2020_17_1_a5
ER  - 
%0 Journal Article
%A Stefanos Ougiaroglou
%A Dimitris A. Dervos
%A Georgios Evangelidis
%T Instance-based classification using prototypes generated from large noisy and streaming datasets
%J Computer Science and Information Systems
%D 2020
%V 17
%N 1
%I mathdoc
%U http://geodesic.mathdoc.fr/item/CSIS_2020_17_1_a5/
%F CSIS_2020_17_1_a5
Stefanos Ougiaroglou; Dimitris A. Dervos; Georgios Evangelidis. Instance-based classification using prototypes generated from large noisy and streaming datasets. Computer Science and Information Systems, Volume 17 (2020) no. 1. http://geodesic.mathdoc.fr/item/CSIS_2020_17_1_a5/