Stability properties of feature selection measures
Teoriâ veroâtnostej i ee primeneniâ, Volume 69 (2024) no. 1, pp. 33-45. This article was harvested from the Math-Net.Ru source.

In this paper, we prove that the monotonicity property of the stability measure for feature (factor) selection introduced by Nogueira, Sechidis, and Brown [J. Mach. Learn. Res., 18 (2018), pp. 1–54] may fail to hold. We establish a different monotonicity property that does hold. We also identify cases in which the matrices describing the operation of algorithms for identifying relevant features can be compared with respect to certain parameters.
Keywords: selection of relevant features, stability measures of the feature selection algorithm, properties of stability measures.
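
For orientation, the stability measure discussed in the abstract can be sketched in a few lines. This is a minimal illustration, assuming the commonly cited form of the Nogueira, Sechidis, and Brown estimator: for a binary M x d matrix Z whose row i marks the features selected on the i-th run, the measure is 1 minus the average unbiased sample variance of the columns, divided by (k/d)(1 - k/d), where k is the mean number of selected features. The function name `stability` and the input layout are assumptions made for this sketch; they are not taken from the paper.

```python
def stability(Z):
    """Stability estimate for a binary selection matrix Z (assumed layout).

    Z is a list of M rows, one per run of the selection algorithm;
    each row is a list of d zeros/ones, where 1 means "feature selected".
    """
    M = len(Z)
    if M < 2:
        raise ValueError("at least two runs are needed")
    d = len(Z[0])
    if any(len(row) != d for row in Z):
        raise ValueError("all rows must have the same length")

    # Selection frequency of each feature across the M runs.
    p = [sum(row[f] for row in Z) / M for f in range(d)]
    # Unbiased sample variance of each (Bernoulli) column.
    s2 = [M / (M - 1) * pf * (1 - pf) for pf in p]
    # Mean number of features selected per run.
    k_bar = sum(sum(row) for row in Z) / M

    denom = (k_bar / d) * (1 - k_bar / d)
    if denom == 0:
        raise ValueError("undefined when no feature or every feature is always selected")
    return 1 - (sum(s2) / d) / denom


# Identical selections on every run give the maximal value.
print(stability([[1, 1, 0, 0]] * 3))   # 1.0
# Two runs choosing disjoint features out of two give the minimal value.
print(stability([[1, 0], [0, 1]]))     # -1.0
```

The two calls show the extreme cases: perfectly repeatable selection and completely disjoint selection over two runs. The matrices Z used here are of the same kind as the matrices "describing the operation of algorithms" mentioned in the abstract, under the assumption that each row records one run.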
@article{TVP_2024_69_1_a1,
     author = {A. V. Bulinski},
     title = {Stability properties of feature selection measures},
     journal = {Teori\^a vero\^atnostej i ee primeneni\^a},
     pages = {33--45},
     year = {2024},
     volume = {69},
     number = {1},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/TVP_2024_69_1_a1/}
}
A. V. Bulinski. Stability properties of feature selection measures. Teoriâ veroâtnostej i ee primeneniâ, Volume 69 (2024) no. 1, pp. 33-45. http://geodesic.mathdoc.fr/item/TVP_2024_69_1_a1/

[1] V. Bolón-Canedo, A. Alonso-Betanzos, Recent advances in ensembles for feature selection, Intell. Syst. Ref. Libr., 147, Springer, Cham, 2018, xiv+205 pp. | DOI

[2] L. D. D. Desboulets, “A review on variable selection in regression analysis”, Econometrics, 6:4 (2018), 45, 27 pp. | DOI

[3] B. Remeseiro, V. Bolón-Canedo, “A review of feature selection methods in medical applications”, Comput. Biol. Med., 112 (2019), 103375, 9 pp. | DOI

[4] V. Tam, N. Patel, M. Turcotte, Y. Bossé, G. Paré, D. Meyre, “Benefits and limitations of genome-wide association studies”, Nat. Rev. Genet., 20:8 (2019), 467–484 | DOI

[5] M. Kuhn, K. Johnson, Feature engineering and selection. A practical approach for predictive models, Chapman & Hall/CRC, Boca Raton, FL, 2019, xv+297 pp. | DOI

[6] Xiaojun Mao, Liuhua Peng, Zhonglei Wang, Nonparametric feature selection by random forests and deep neural networks, 2022, 24 pp., arXiv: 2201.06821v1

[7] T. Hastie, R. Tibshirani, M. Wainwright, Statistical learning with sparsity. The lasso and generalizations, Monogr. Statist. Appl. Probab., 143, CRC Press, Boca Raton, FL, 2015, xv+351 pp. | DOI | MR | Zbl

[8] D. G. Kleinbaum, M. Klein, Logistic regression. A self-learning text, Stat. Biol. Health, 3rd ed., Springer, New York, 2010, xvii+701 pp. | DOI | Zbl

[9] A. Bulinski, A. Kozhevin, “Statistical estimation of mutual information for mixed model”, Methodol. Comput. Appl. Probab., 23:1 (2021), 123–142 | DOI | MR | Zbl

[10] D. Gola, J. M. M. John, K. van Steen, I. R. König, “A roadmap to multifactor dimensionality reduction methods”, Brief. Bioinform., 17:2 (2016), 293–308 | DOI

[11] M. D. Ritchie, L. W. Hahn, N. Roodi, L. Renee Bailey, W. D. Dupont, F. F. Parl, J. H. Moore, “Multifactor-dimensionality reduction reveals high-order interactions among estrogen-metabolism genes in sporadic breast cancer”, Am. J. Hum. Genet., 69:1 (2001), 138–147 | DOI

[12] K. Dunne, P. Cunningham, F. Azuaje, Solutions to instability problems with sequential wrapper-based approaches to feature selection, Tech. Rep. TCDCS-2002-28, Trinity College, School of Computer Science, Dublin, 2002, 22 pp.

[13] A. Kalousis, J. Prados, M. Hilario, “Stability of feature selection algorithms”, 5th IEEE international conference on data mining (ICDM'05) (Houston, TX, 2005), IEEE, 2005, 1–8 | DOI

[14] N. Meinshausen, P. Bühlmann, “Stability selection”, J. R. Stat. Soc. Ser. B Stat. Methodol., 72:4 (2010), 417–473 | DOI | MR | Zbl

[15] R. D. Shah, R. J. Samworth, “Variable selection with error control: another look at stability selection”, J. R. Stat. Soc. Ser. B Stat. Methodol., 75:1 (2013), 55–80 | DOI | MR | Zbl

[16] S. Nogueira, K. Sechidis, G. Brown, “On the stability of feature selection algorithms”, J. Mach. Learn. Res., 18 (2017), 174, 54 pp. | MR | Zbl

[17] U. M. Khaire, R. Dhanalakshmi, “Stability of feature selection algorithm: a review”, J. King Saud Univ. — Comput. Inf. Sci., 34:4 (2022), 1060–1073 | DOI

[18] J. L. Lustgarten, V. Gopalakrishnan, S. Visweswaran, “Measuring stability of feature selection in biomedical datasets”, AMIA Annu. Symp. Proc., 2009 (2009), 406–410

[19] L. I. Kuncheva, “A stability index for feature selection”, AIAP'07: Proceedings of the 25th IASTED International Multi-Conference: artificial intelligence and applications (Innsbruck, 2007), ACTA Press, Anaheim, CA, 2007, 390–395

[20] J. L. Fleiss, “Measuring nominal scale agreement among many raters”, Psychol. Bull., 76:5 (1971), 378–382 | DOI

[21] A. A. Kozhevin, “Feature selection based on statistical estimation of mutual information”, Sib. elektron. matem. izv., 18:1 (2021), 720–728 | DOI | MR | Zbl