Option predictive clustering trees for multi-target regression
Computer Science and Information Systems, Volume 17 (2020) no. 2.

See the article record provided by the source Computer Science and Information Systems website

Decision trees are among the most widely used predictive modelling methods, primarily because they are readily interpretable and fast to learn. These properties come at the price of predictive performance. Moreover, the standard induction of decision trees suffers from myopia: in each internal node, a single split is selected greedily, so the resulting tree may be sub-optimal. To address these issues, option trees have been proposed, which can include several alternative splits in a new type of internal node called an option node. An option tree can therefore also be regarded as a condensed representation of an ensemble. In this work, we propose to learn option trees for multi-target regression (MTR), the task of learning predictive models for multiple numeric target variables, based on the predictive clustering framework. The resulting models are called option predictive clustering trees (OPCTs). We evaluate the proposed OPCTs on 11 benchmark MTR data sets. The results reveal that OPCTs achieve statistically significantly better predictive performance than a single predictive clustering tree (PCT) and are competitive with bagging and random forests of PCTs. By limiting the number of option nodes, we can achieve a good trade-off between predictive power and efficiency (model size and learning time). We also perform a parameter sensitivity analysis and a bias-variance decomposition of the mean squared error. Our analysis shows that OPCTs can reduce the variance of PCTs nearly as much as ensemble methods do. In terms of bias, OPCTs occasionally outperform other methods. Finally, we demonstrate the potential of OPCTs for multifaceted interpretability and illustrate the potential for including domain knowledge in the tree learning process.
Keywords: multi-target regression, option trees, interpretable models, predictive clustering trees, bias-variance decomposition of error
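The idea in the abstract that an option tree acts as a condensed ensemble can be illustrated with a minimal sketch. It is not the authors' implementation: the class names `Leaf`, `Split`, and `Option`, the function `predict`, and the choice to average the alternatives' predictions per target at an option node are all assumptions made here for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    # One predicted value per numeric target (multi-target regression).
    prediction: List[float]


@dataclass
class Split:
    # Ordinary internal node: a single test "x[feature] <= threshold".
    feature: int
    threshold: float
    left: "Node"
    right: "Node"


@dataclass
class Option:
    # Option node: several alternative subtrees are kept side by side.
    alternatives: List["Node"]


Node = Union[Leaf, Split, Option]


def predict(node: Node, x: List[float]) -> List[float]:
    """Predict a target vector for example x (hypothetical helper)."""
    if isinstance(node, Leaf):
        return list(node.prediction)
    if isinstance(node, Split):
        child = node.left if x[node.feature] <= node.threshold else node.right
        return predict(child, x)
    # Option node: follow every alternative and average per target.
    # Averaging is an assumption of this sketch.
    preds = [predict(alt, x) for alt in node.alternatives]
    return [sum(values) / len(preds) for values in zip(*preds)]


# A toy option tree with two alternative root splits and two targets.
tree = Option([
    Split(0, 0.5, Leaf([1.0, 10.0]), Leaf([3.0, 30.0])),
    Split(1, 2.0, Leaf([2.0, 20.0]), Leaf([4.0, 40.0])),
])
```

With a single alternative at every option node, the structure reduces to an ordinary tree. With many alternatives it behaves like a small ensemble that shares the part of the tree above each option node, which is one way to read the trade-off between predictive power and model size mentioned in the abstract.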
@article{CSIS_2020_17_2_a6,
     author = {Toma\v{z} Stepi\v{s}nik and Alja\v{z} Osojnik and Sa\v{s}o D\v{z}eroski and Dragi Kocev},
     title = {Option predictive clustering trees for multi-target regression},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {17},
     number = {2},
     year = {2020},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2020_17_2_a6/}
}
TY  - JOUR
AU  - Tomaž Stepišnik
AU  - Aljaž Osojnik
AU  - Sašo Džeroski
AU  - Dragi Kocev
TI  - Option predictive clustering trees for multi-target regression
JO  - Computer Science and Information Systems
PY  - 2020
VL  - 17
IS  - 2
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/CSIS_2020_17_2_a6/
ID  - CSIS_2020_17_2_a6
ER  - 
%0 Journal Article
%A Tomaž Stepišnik
%A Aljaž Osojnik
%A Sašo Džeroski
%A Dragi Kocev
%T Option predictive clustering trees for multi-target regression
%J Computer Science and Information Systems
%D 2020
%V 17
%N 2
%I mathdoc
%U http://geodesic.mathdoc.fr/item/CSIS_2020_17_2_a6/
%F CSIS_2020_17_2_a6
Tomaž Stepišnik; Aljaž Osojnik; Sašo Džeroski; Dragi Kocev. Option predictive clustering trees for multi-target regression. Computer Science and Information Systems, Volume 17 (2020) no. 2. http://geodesic.mathdoc.fr/item/CSIS_2020_17_2_a6/