To the foundation of the dimensionality reduction method for explanatory variables
Zapiski Nauchnykh Seminarov POMI, Probability and statistics. Part 18, Volume 408 (2012), pp. 84-101. This article was harvested from Math-Net.Ru.

The investigation of many complex phenomena involves high-dimensional data sets. This is typical of many medical and biological studies, especially in genetics and pharmacology. We consider a binary response variable (describing, e.g., the state of a patient's health) that depends on $n$ discrete factors (explanatory variables). An important problem is to identify the most significant of these factors. The aim of the paper is to establish necessary and sufficient conditions for the strong consistency of specified estimates, constructed by means of cross-validation, of the error of a prediction algorithm for the response variable. The impact of the choice of penalty function is also discussed. The results obtained provide a foundation for the well-known MDR method, which is widely used in genetic data analysis.
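The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea: a K-fold cross-validated estimate of the prediction error for a binary response on discrete factors, using an MDR-style cell rule. The penalty is assumed here to weight each error by the inverse frequency of its class (a balanced-error choice). That penalty, the cell rule, and all function names are illustrative assumptions, not the paper's definitions.

```python
import random
from collections import defaultdict
from itertools import product


def fit_cell_rule(X, y, factors):
    """Label a cell 1 when cases are over-represented relative to controls."""
    n1 = sum(y)
    n0 = len(y) - n1
    counts = defaultdict(lambda: [0, 0])  # cell -> [controls, cases]
    for row, label in zip(X, y):
        counts[tuple(row[j] for j in factors)][label] += 1
    # c1 / n1 > c0 / n0, written without division to avoid empty classes.
    return {cell: int(c1 * n0 > c0 * n1) for cell, (c0, c1) in counts.items()}


def predict(rule, row, factors, default=0):
    """Predict from the cell rule; unseen cells fall back to `default`."""
    return rule.get(tuple(row[j] for j in factors), default)


def penalized_error(rule, X, y, factors):
    """Average of per-class error rates (assumed inverse-frequency penalty)."""
    wrong, total = [0, 0], [0, 0]
    for row, label in zip(X, y):
        total[label] += 1
        wrong[label] += predict(rule, row, factors) != label
    rates = [w / t for w, t in zip(wrong, total) if t > 0]
    return sum(rates) / len(rates)  # assumes a non-empty sample


def cv_error(X, y, factors, k=5, seed=0):
    """K-fold cross-validated estimate of the penalized prediction error."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        rule = fit_cell_rule([X[i] for i in train], [y[i] for i in train], factors)
        errors.append(
            penalized_error(rule, [X[i] for i in test], [y[i] for i in test], factors)
        )
    return sum(errors) / k


# Toy data: the response is the XOR of factors 0 and 1; factor 2 is irrelevant.
X_demo = [list(cell) for cell in product([0, 1], repeat=3)] * 10
y_demo = [row[0] ^ row[1] for row in X_demo]
for subset in [(0, 1), (2,)]:
    print(subset, cv_error(X_demo, y_demo, subset))
```

In this sketch, significant factors are selected by comparing `cv_error` across candidate subsets and keeping the subset with the smallest estimate. Whether such an estimate converges almost surely to the true error is the consistency question the paper addresses.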
@article{ZNSL_2012_408_a5,
     author = {A. V. Bulinski},
     title = {To the foundation of the dimensionality reduction method for explanatory variables},
     journal = {Zapiski Nauchnykh Seminarov POMI},
     pages = {84--101},
     year = {2012},
     volume = {408},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/ZNSL_2012_408_a5/}
}
