The method for analysis of expression data homogeneity based on the Student test
Matematičeskaâ biologiâ i bioinformatika, Volume 13 (2018) no. 1, pp. 50-67.

See the article record from the source Math-Net.Ru

The need for a public repository of gene expression profiling results was voiced as early as 2002. Since then, several repositories for gene expression profiling data have been created to enable the analysis and comparison of profiles. Gene expression profiling is usually performed using either mRNA microarray hybridization or next-generation sequencing. However, these large datasets may be heterogeneous even when they were obtained for the same type of normal or pathologically altered organs and tissues and on the same experimental platform. In this work, we propose a new method for analyzing the homogeneity of expression data based on the Student test. Using computational experiments, we show the advantage of our method in computational speed on large datasets and develop an approach to interpreting the results of applying the Student test. Using the new method of data analysis, we suggest a scheme for visualizing the overall picture of gene expression and for comparing expression profiles across different diseases and/or different stages of the same disease.
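The abstract does not spell out how the Student test is applied, so the following is only a minimal sketch under stated assumptions, not the authors' procedure. It assumes two batches of samples measured for the same genes, computes a pooled-variance two-sample Student t statistic per gene, and reports the share of genes whose statistic exceeds a chosen critical value as a rough heterogeneity indicator. The function names, gene labels, expression values, and the threshold are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student t statistic with pooled variance (equal variances assumed).

    Raises ZeroDivisionError if both samples have zero variance.
    """
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def heterogeneous_fraction(batch_a, batch_b, t_crit):
    """Share of genes whose |t| between the two batches exceeds t_crit."""
    flagged = sum(
        1 for gene in batch_a
        if abs(student_t(batch_a[gene], batch_b[gene])) > t_crit
    )
    return flagged / len(batch_a)


# Hypothetical log-expression values: gene -> one value per sample.
batch_a = {"GENE1": [5.1, 5.3, 4.9, 5.0], "GENE2": [7.2, 7.0, 7.1, 7.3]}
batch_b = {"GENE1": [5.0, 5.2, 5.1, 4.8], "GENE2": [9.0, 9.2, 8.9, 9.1]}

# 2.447 is the two-sided 5% critical value of Student's t for 4 + 4 - 2 = 6
# degrees of freedom; a real analysis would also correct for multiple testing.
fraction = heterogeneous_fraction(batch_a, batch_b, t_crit=2.447)
print(fraction)
```

In this toy input only GENE2 differs strongly between the batches, so a low or high fraction here says nothing about real data; it merely illustrates the per-gene comparison.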
@article{MBB_2018_13_1_a11,
     author = {R. O. Aliev and N. M. Borisov},
     title = {The method for analysis of expression data homogeneity based on the {Student} test},
     journal = {Matemati\v{c}eska\^a biologi\^a i bioinformatika},
     pages = {50--67},
     publisher = {mathdoc},
     volume = {13},
     number = {1},
     year = {2018},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/MBB_2018_13_1_a11/}
}
TY  - JOUR
AU  - R. O. Aliev
AU  - N. M. Borisov
TI  - The method for analysis of expression data homogeneity based on the Student test
JO  - Matematičeskaâ biologiâ i bioinformatika
PY  - 2018
SP  - 50
EP  - 67
VL  - 13
IS  - 1
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/MBB_2018_13_1_a11/
LA  - ru
ID  - MBB_2018_13_1_a11
ER  - 
%0 Journal Article
%A R. O. Aliev
%A N. M. Borisov
%T The method for analysis of expression data homogeneity based on the Student test
%J Matematičeskaâ biologiâ i bioinformatika
%D 2018
%P 50-67
%V 13
%N 1
%I mathdoc
%U http://geodesic.mathdoc.fr/item/MBB_2018_13_1_a11/
%G ru
%F MBB_2018_13_1_a11
R. O. Aliev; N. M. Borisov. The method for analysis of expression data homogeneity based on the Student test. Matematičeskaâ biologiâ i bioinformatika, Volume 13 (2018) no. 1, pp. 50-67. http://geodesic.mathdoc.fr/item/MBB_2018_13_1_a11/

[1] Schena M., Shalon D., Davis R. W., Brown P. O., “Quantitative monitoring of gene expression patterns with a complementary DNA microarray”, Science, 270:5235 (1995), 467–470 | DOI

[2] Zhang W., Yu Y., Hertwig F., Thierry-Mieg J., Zhang W., Thierry-Mieg D., Wang J., Furlanello C., Devanarayan V., Cheng J., Deng Y. et al., “Comparison of RNA-seq and microarray-based models for clinical endpoint prediction”, Genome Biology, 16:1 (2015), 133 | DOI

[3] Kooken J., Fox K., Fox A., Altomare D., Creek K., Wunschel D., Pajares-Merino S., Martinez-Ballesteros I., Garaizar J., Oyarzabal O., Samadpour M., “Identification of staphylococcal species based on variations in protein sequences (mass spectrometry) and DNA sequence (sodA microarray)”, Molecular and Cellular Probes, 28:1 (2014), 41–50 | DOI

[4] Kellam P., “Microarray gene expression database: progress towards an international repository of gene expression data”, Genome Biology, 2:5 (2001), reports4011.1, 3 pp. | DOI

[5] Edgar R., Domrachev M., Lash A. E., “Gene Expression Omnibus: NCBI gene expression and hybridization array data repository”, Nucleic Acids Research, 30:1 (2002), 207–210 | DOI

[6] Brazma A., Parkinson H., Sarkans U., Shojatalab M., Vilo J., Abeygunawardena N., Holloway E., Kapushesky M., Kemmeren P., Garcia Lara G., Oezcimen A. et al., “ArrayExpress-a public repository for microarray gene expression data at the EBI”, Nucleic Acids Research, 31:1 (2003), 68–71 | DOI | MR

[7] Jones P., Cote R. G., Cho S. Y., Klie S., Martens L., Quinn A. F., Thorneycroft D., Hermjakob H., “PRIDE: new developments and new datasets”, Nucleic Acids Research, 36 (2008), D878–D883 | DOI

[8] McLendon R., Bigner D., Friedman A., Van Meir E. G., Mastrogianakis G. M., Olson J. J., Brat D. J., Mikkelsen T., Lehman N., Aldape K. et al., “Comprehensive genomic characterization defines human glioblastoma genes and core pathways”, Nature, 455:7216 (2008), 1061–1068 | DOI

[9] Demetrashvili N., Kron K., Pethe V., Bapat B., Briollais L., “How to deal with batch effect in sequential microarray experiments?”, Molecular Informatics, 29:5 (2010), 387–393 | DOI

[10] Guo L., Lobenhofer E. K., Wang C., Shippy R., Harris S. C., Zhang L., Meil N., Chen T., Herman D., Goodsaid F. M., et al., “Rat toxicogenomic study reveals analytical consistency across microarray platforms”, Nature Biotechnology, 24 (2006)

[11] Borisov N., Suntsova M., Garazha A., Lezhnina K., Kovalchuk O., Aliper A., Ilnitskaya E., Sorokin M., Korzinkin M., Saenko V. et al., “Data aggregation at the level of molecular pathways improves stability of experimental transcriptomic and proteomic data”, Cell Cycle, 16:19 (2017), 1810–1823 | DOI

[12] Welle S., Brooks A. I., Delehanty J. M., Needler N., Thornton C. A., “Gene expression profile of aging in human muscle”, Physiological Genomics, 14:2 (2003), 149–159 | DOI

[13] Blalock E. M., Geddes J. W., Chen K. C., Porter N. M., Markesbery W. R., Landfield P. W., “Incipient Alzheimer's disease: microarray correlation analyses reveal major transcriptional and tumor suppressor responses”, PNAS, 101:7 (2004), 2173–2178 | DOI

[14] Borovecki F., Lovrecic L., Zhou J., Jeong H., Then F., Rosas H. D., Hersch S. M., Hogarth P., Bouzou B., Jensen R. V., Krainc D., “Genome-wide expression profiling of human blood reveals biomarkers for Huntington's disease”, PNAS, 102:31 (2005), 11023–11028 | DOI

[15] Sternberg A., Killick S., Littlewood T., Hatton C., Peniket A., Seidl T., Soneji S., Leach J., Bowen D., Chapman C. et al., “Evidence for reduced B-cell progenitors in early (low-risk) myelodysplastic syndrome”, Blood, 106:9 (2005), 2982–2991 | DOI

[16] Scherzer C. R., Eklund A. C., Morse L. J., Liao Z., Locascio J. J., Fefer D., Schwarzschild M. A., Schlossmacher M. G., Hauser M. A., Vance J. M., Sudarsky L. R. et al., “Molecular markers of early Parkinson's disease based on gene expression in blood”, PNAS, 104:3 (2007), 955–960 | DOI

[17] Yusenko M. V., Zubakov D., Kovacs G., “Gene expression profiling of chromophobe renal cell carcinomas and renal oncocytomas by Affymetrix GeneChip using pooled and individual tumours”, International Journal of Biological Sciences, 5:6 (2009), 517 | DOI

[18] Duke D. C., Moran L. B., Pearce R. K. B., Graeber M. B., “The medial and lateral substantia nigra in Parkinson's disease: mRNA profiles associated with higher brain tissue vulnerability”, Neurogenetics, 8:2 (2007), 83–94 | DOI

[19] Kaizer E. C., Glaser C. L., Chaussabel D., Banchereau J., Pascual V., White P. C., “Gene expression in peripheral blood mononuclear cells from children with diabetes”, The Journal of Clinical Endocrinology & Metabolism, 92:9 (2007), 3705–3711 | DOI

[20] Hokama M., Oka S., Leon J., Ninomiya T., Honda H., Sasaki K., Iwaki T., Ohara T., Sasaki T., LaFerla F. M. et al., “Altered expression of diabetes-related genes in Alzheimer's disease brains: the Hisayama study”, Cerebral Cortex, 24:9 (2013), 2476–2488 | DOI

[21] Lewis D. A., Stashenko G. J., Akay O. M., Price L. I., Owzar K., Ginsburg G. S., Chi J., Ortel T. L., “Whole blood gene expression analyses in patients with single versus recurrent venous thromboembolism”, Thrombosis Research, 128:6 (2011), 536–540 | DOI

[22] Lewis D. A., Suchindran S., Beckman M. G., Hooper W. C., Grant A. M., Heit J. A., Manco-Johnson M., Moll S., Philipp C. S., Kenney K. et al., “Whole blood gene expression profiles distinguish clinical phenotypes of venous thromboembolism”, Thrombosis Research, 135:4 (2015), 659–665 | DOI

[23] Tso C. L., Shintaku P., Chen J., Liu Q., Liu J., Chen Z., Yoshimoto K., Mischel P. S., Cloughesy T. F., Liau L. M., Nelson S. F., “Primary glioblastomas express mesenchymal stem-like properties”, Molecular Cancer Research, 4:9 (2006), 607–619 | DOI

[24] Asgharzadeh S., Pique-Regi R., Sposto R., Wang H., Yang Y., Shimada H., Matthay K., Buckley J., Ortega A., Seeger R. C., “Prognostic significance of gene expression profiles of metastatic neuroblastomas lacking MYCN gene amplification”, Journal of the National Cancer Institute, 98:17 (2006), 1193–1203 | DOI

[25] Rock R. B., Hu S., Deshpande A., Munir S., May B. J., Baker C. A., Peterson P. K., Kapur V., “Transcriptional response of human microglial cells to interferon-γ”, Genes and Immunity, 6:8 (2005), 712 | DOI

[26] Bolstad B. M., Irizarry R. A., Astrand M., Speed T. P., “A comparison of normalization methods for high density oligonucleotide array data based on variance and bias”, Bioinformatics, 19:2 (2003), 185–193 | DOI

[27] Suzuki R., Shimodaira H., “Pvclust: an R package for assessing the uncertainty in hierarchical clustering”, Bioinformatics, 22:12 (2006), 1540–1542 | DOI

[28] Bache S. M., Wickham H., magrittr: A forward-pipe operator for R, R package version 1.5, 2014 (accessed: 15.04.2018) https://CRAN.R-project.org/package=magrittr

[29] Gentleman R., Carey V., Morgan M., Falcon S., Biobase: base functions for Bioconductor, R package version 2.34.0, 2016 (accessed: 27.03.2018) https://www.bioconductor.org/packages/3.4/bioc/html/Biobase.html

[30] Wu Z., Irizarry R. A., Gentleman R., Martinez-Murillo F., Spencer F., “A model-based background adjustment for oligonucleotide expression arrays”, Journal of the American Statistical Association, 99:468 (2004), 909–917 | DOI | MR | Zbl

[31] Irizarry R. A., Gautier L., Bolstad B. M., Miller C., Methods for Affymetrix oligonucleotide arrays, R package version 1.52, 2016 (accessed: 27.03.2018) https://www.bioconductor.org/packages/3.4/bioc/html/affy.html

[32] Bolstad B. M., preprocessCore: A collection of pre-processing functions, R package version 1.36.0, 2016 (accessed: 27.03.2018) https://www.bioconductor.org/packages/3.4/bioc/html/preprocessCore.html

[33] Pollard K. S., Gilbert H. N., Ge Y., Taylor S., Dudoit S., Resampling-based multiple hypothesis testing, R package version 2.30.0, 2016 (accessed: 27.03.2018) https://www.bioconductor.org/packages/3.4/bioc/html/multtest.html

[34] R Core Team and contributors worldwide, R Foundation for Statistical Computing, R package version 3.6.0, 2017 (accessed: 07.01.2018) https://stat.ethz.ch/R-manual/R-devel/library/stats/html/00Index.html

[35] Pages H., Carlson M., Falcon S., Li N. A., AnnotationDbi: Annotation Database Interface, R package version 1.36.2, 2016 (accessed: 27.03.2017) https://bioconductor.org/packages/3.4/bioc/html/AnnotationDbi.html

[36] Gene Expression Omnibus (accessed: 29.03.2018) https://www.ncbi.nlm.nih.gov/geo/