Improving the Performance of Process Discovery Algorithms by Instance Selection
Computer Science and Information Systems, Volume 17 (2020) no. 3.

See the article record from the source Computer Science and Information Systems website

Process discovery algorithms automatically discover process models from event data captured during the execution of business processes. These algorithms tend to use all of the event data to discover a process model. For large event logs, this is no longer feasible on standard hardware within limited time. A straightforward way to overcome this problem is to down-size the event data by means of sampling. However, little research has been conducted on selecting the right sample, given the available time and the characteristics of the event data. This paper presents various subset selection methods and evaluates their performance on real event data. The proposed methods have been implemented in both the ProM and the RapidProM platforms. Our experiments show that instance selection strategies can considerably speed up discovery. Furthermore, the results show that biased selection of process instances, compared to random sampling, yields simpler process models of higher quality.
Keywords: Process Mining, Process Discovery, Subset Selection, Event Log Preprocessing, Performance Enhancement
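
The contrast the abstract draws between random sampling and biased instance selection can be illustrated with a minimal sketch. It assumes an event log is simply a list of traces, each a tuple of activity names. The function names `random_sample` and `frequent_variant_sample` and the toy log are hypothetical; they are not taken from the paper or from ProM/RapidProM. Frequency-based variant selection is used here only as one plausible example of a biased strategy, not necessarily one of those evaluated in the paper.

```python
import random
from collections import Counter

# Assumption: a trace is a tuple of activity names, a log is a list of traces.

def random_sample(log, k, seed=0):
    """Pick k traces uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(log, k)

def frequent_variant_sample(log, k):
    """Biased selection: keep one representative of each of the k most
    frequent trace variants (a variant is a distinct activity sequence)."""
    counts = Counter(log)
    return [variant for variant, _ in counts.most_common(k)]

# Toy log: two dominant variants and two rare ones.
log = (
    [("a", "b", "c")] * 5
    + [("a", "c", "b")] * 3
    + [("a", "b", "b", "c")]
    + [("a", "d")]
)

random_subset = random_sample(log, 4)
biased_subset = frequent_variant_sample(log, 2)
```

In this sketch the biased subset contains only the two dominant variants, so a discovery algorithm run on it never sees the rare behavior, whereas a random subset of traces may or may not include it. Either subset would then be passed to a discovery algorithm in place of the full log.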
@article{CSIS_2020_17_3_a15,
     author = {Mohammadreza Fani Sani and Sebastiaan J. van Zelst and Wil van der Aalst},
     title = {Improving the {Performance} of {Process} {Discovery} {Algorithms} by {Instance} {Selection}},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {17},
     number = {3},
     year = {2020},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2020_17_3_a15/}
}
TY  - JOUR
AU  - Mohammadreza Fani Sani
AU  - Sebastiaan J. van Zelst
AU  - Wil van der Aalst
TI  - Improving the Performance of Process Discovery Algorithms by Instance Selection
JO  - Computer Science and Information Systems
PY  - 2020
VL  - 17
IS  - 3
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/CSIS_2020_17_3_a15/
ID  - CSIS_2020_17_3_a15
ER  - 
%0 Journal Article
%A Mohammadreza Fani Sani
%A Sebastiaan J. van Zelst
%A Wil van der Aalst
%T Improving the Performance of Process Discovery Algorithms by Instance Selection
%J Computer Science and Information Systems
%D 2020
%V 17
%N 3
%I mathdoc
%U http://geodesic.mathdoc.fr/item/CSIS_2020_17_3_a15/
%F CSIS_2020_17_3_a15
Mohammadreza Fani Sani; Sebastiaan J. van Zelst; Wil van der Aalst. Improving the Performance of Process Discovery Algorithms by Instance Selection. Computer Science and Information Systems, Volume 17 (2020) no. 3. http://geodesic.mathdoc.fr/item/CSIS_2020_17_3_a15/