Algebraic Methods for Studying Interactions Between Epidemiological Variables
Mathematical modelling of natural phenomena, Volume 7 (2012) no. 3, pp. 227-252.

See the article record from the source EDP Sciences

Background: Independence models among variables are one of the most relevant topics in epidemiology, particularly in molecular epidemiology for the study of gene-gene and gene-environment interactions. They have been studied using three main kinds of analysis: regression analysis, data mining approaches and Bayesian model selection. Recently, methods of algebraic statistics have been extensively used for applications to biology. In this paper we present a concise but complete description of independence models in algebraic statistics and a new method for analyzing interactions that is equivalent to a Markov-basis correction of Fisher's exact test.

Methods: We identified the suitable algebraic independence model for describing the dependence of two genetic variables on the occurrence of cancer, and exploited the theory of toric varieties and Gröbner bases to develop an exact independence test based on the Diaconis-Sturmfels algorithm. We implemented it in a Maple routine and applied it to the study of gene-gene interaction in Gen-Air, a European case-control study. We computed the p-value for each pair of genetic variables interacting with disease status and compared our results with the standard asymptotic chi-square test.

Results: We found an association among COMT Val158Met, APE1 Asp148Glu and bladder cancer (p-value: 0.009). We also found an interaction among TP53 Arg72Pro, GSTP1 Ile105Val and lung cancer (p-value: 0.00035). Leukaemia was observed to interact significantly with the pairs ERCC2 Lys751Gln and RAD51 172 G>T (p-value: 0.0072), ERCC2 Lys751Gln and LIG4 Thr9Ile (p-value: 0.0095), and APE1 Asp148Glu and GSTP1 Ala114Val (p-value: 0.0036).

Conclusion: Taking advantage of results from theoretical and computational algebra, the method we propose was more selective than other methods in detecting new interactions, and its results were nevertheless consistent with previous epidemiological and functional findings. It also helped us control the multiple-comparison problem. In the light of our results, we believe that the epidemiologic study of interactions can benefit from algebraic methods based on properties of toric varieties and Gröbner bases.
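The exact test mentioned in the Methods can be illustrated with a minimal sketch of a Diaconis-Sturmfels-style random walk. This is not the authors' Maple routine, and it does not use their three-variable model: it assumes the simpler case of independence in a two-way contingency table, where the Markov basis consists of the basic moves that add +1 to two diagonal cells of a 2x2 sub-table and -1 to the other two. The function names `chi_square` and `markov_basis_p_value`, and the default numbers of steps, are hypothetical choices made for illustration.

```python
import random


def chi_square(table, expected):
    """Pearson chi-square statistic of a table against fixed expected counts."""
    stat = 0.0
    for row, exp_row in zip(table, expected):
        for count, exp in zip(row, exp_row):
            if exp > 0:
                stat += (count - exp) ** 2 / exp
    return stat


def markov_basis_p_value(observed, steps=20000, burn_in=2000, seed=0):
    """Estimate an exact-test p-value by a Metropolis walk over all tables
    that share the row and column sums of `observed` (two-way case only)."""
    rng = random.Random(seed)
    table = [list(row) for row in observed]
    n_rows, n_cols = len(table), len(table[0])

    # Expected counts under independence depend only on the fixed margins.
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    expected = [[r * c / total for c in col_sums] for r in row_sums]
    observed_stat = chi_square(table, expected)

    hits = 0
    for step in range(burn_in + steps):
        # Ordered pairs, so both signs of each basic move can be proposed.
        i, k = rng.sample(range(n_rows), 2)
        j, l = rng.sample(range(n_cols), 2)
        # Proposed move: +1 at (i, j) and (k, l), -1 at (i, l) and (k, j).
        if table[i][l] > 0 and table[k][j] > 0:
            # Ratio of hypergeometric weights (proportional to 1 / prod n_ij!).
            ratio = (table[i][l] * table[k][j]) / (
                (table[i][j] + 1) * (table[k][l] + 1)
            )
            if rng.random() < min(1.0, ratio):
                table[i][j] += 1
                table[k][l] += 1
                table[i][l] -= 1
                table[k][j] -= 1
        # Count visited tables at least as extreme as the observed one.
        if step >= burn_in and chi_square(table, expected) >= observed_stat - 1e-9:
            hits += 1
    return hits / steps


if __name__ == "__main__":
    # Made-up counts, purely illustrative (not data from Gen-Air).
    print(markov_basis_p_value([[12, 5, 3], [4, 9, 7]]))
```

Because every move leaves the row and column sums unchanged, the walk stays inside the set of tables over which the exact test is defined. The estimated p-value is the fraction of visited tables whose statistic is at least as large as the observed one, which can then be compared with the asymptotic chi-square p-value.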
DOI: 10.1051/mmnp/20127314

F. Ricceri 1, 2 ; C. Fassino 3 ; G. Matullo 1, 2 ; M. Roggero 3 ; M.-L. Torrente 3 ; P. Vineis 1, 4 ; L. Terracini 3

1 Human Genetics Foundation, Turin, Italy
2 Department of Genetics, Biology and Biochemistry, University of Turin, Italy
3
4 Imperial College, London, UK
@article{MMNP_2012_7_3_a13,
     author = {F. Ricceri and C. Fassino and G. Matullo and M. Roggero and M.-L. Torrente and P. Vineis and L. Terracini},
     title = {Algebraic {Methods} for {Studying} {Interactions} {Between} {Epidemiological} {Variables}},
     journal = {Mathematical modelling of natural phenomena},
     pages = {227--252},
     publisher = {mathdoc},
     volume = {7},
     number = {3},
     year = {2012},
     doi = {10.1051/mmnp/20127314},
     language = {en},
     url = {http://geodesic.mathdoc.fr/articles/10.1051/mmnp/20127314/}
}
TY  - JOUR
AU  - F. Ricceri
AU  - C. Fassino
AU  - G. Matullo
AU  - M. Roggero
AU  - M.-L. Torrente
AU  - P. Vineis
AU  - L. Terracini
TI  - Algebraic Methods for Studying Interactions Between Epidemiological Variables
JO  - Mathematical modelling of natural phenomena
PY  - 2012
SP  - 227
EP  - 252
VL  - 7
IS  - 3
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/articles/10.1051/mmnp/20127314/
DO  - 10.1051/mmnp/20127314
LA  - en
ID  - MMNP_2012_7_3_a13
ER  - 
%0 Journal Article
%A F. Ricceri
%A C. Fassino
%A G. Matullo
%A M. Roggero
%A M.-L. Torrente
%A P. Vineis
%A L. Terracini
%T Algebraic Methods for Studying Interactions Between Epidemiological Variables
%J Mathematical modelling of natural phenomena
%D 2012
%P 227-252
%V 7
%N 3
%I mathdoc
%U http://geodesic.mathdoc.fr/articles/10.1051/mmnp/20127314/
%R 10.1051/mmnp/20127314
%G en
%F MMNP_2012_7_3_a13
F. Ricceri; C. Fassino; G. Matullo; M. Roggero; M.-L. Torrente; P. Vineis; L. Terracini. Algebraic Methods for Studying Interactions Between Epidemiological Variables. Mathematical modelling of natural phenomena, Volume 7 (2012) no. 3, pp. 227-252. doi: 10.1051/mmnp/20127314. http://geodesic.mathdoc.fr/articles/10.1051/mmnp/20127314/

