See the record for this article from the source Math-Net.Ru
@article{MBB_2017_12_1_a0,
     author = {N. N. Nazipova and E. A. Isaev and V. V. Kornilov and D. V. Pervukhin and A. A. Morozova and A. A. Gorbunov and M. N. Ustinin},
     title = {Big {Data} in bioinformatics},
     journal = {Matemati\v{c}eska\^a biologi\^a i bioinformatika},
     pages = {102--119},
     publisher = {mathdoc},
     volume = {12},
     number = {1},
     year = {2017},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/MBB_2017_12_1_a0/}
}
TY - JOUR
AU - N. N. Nazipova
AU - E. A. Isaev
AU - V. V. Kornilov
AU - D. V. Pervukhin
AU - A. A. Morozova
AU - A. A. Gorbunov
AU - M. N. Ustinin
TI - Big Data in bioinformatics
JO - Matematičeskaâ biologiâ i bioinformatika
PY - 2017
SP - 102
EP - 119
VL - 12
IS - 1
PB - mathdoc
UR - http://geodesic.mathdoc.fr/item/MBB_2017_12_1_a0/
LA - ru
ID - MBB_2017_12_1_a0
ER -
%0 Journal Article
%A N. N. Nazipova
%A E. A. Isaev
%A V. V. Kornilov
%A D. V. Pervukhin
%A A. A. Morozova
%A A. A. Gorbunov
%A M. N. Ustinin
%T Big Data in bioinformatics
%J Matematičeskaâ biologiâ i bioinformatika
%D 2017
%P 102-119
%V 12
%N 1
%I mathdoc
%U http://geodesic.mathdoc.fr/item/MBB_2017_12_1_a0/
%G ru
%F MBB_2017_12_1_a0
N. N. Nazipova; E. A. Isaev; V. V. Kornilov; D. V. Pervukhin; A. A. Morozova; A. A. Gorbunov; M. N. Ustinin. Big Data in bioinformatics. Matematičeskaâ biologiâ i bioinformatika, Vol. 12 (2017) no. 1, pp. 102-119. http://geodesic.mathdoc.fr/item/MBB_2017_12_1_a0/
[1] Manyika J., Chui M., Brown B., Bughin J., Dobbs R., Roxburgh C., Byers A. H., Big Data: The Next Frontier for Innovation, Competition, and Productivity, McKinsey Global Institute, San Francisco, 2011 (accessed: 17.02.2017) http://www.mckinsey.com/business-functions/digital-mckinsey/our-insights/big-data-the-next-frontier-for-innovation
[2] Jacobs A., “The Pathologies of Big Data”, Communications of the ACM, 52:8 (2009) | DOI
[3] What's New in Gartner's Hype Cycle for Emerging Technologies, Gartner, 2015 (accessed: 17.02.2017) http://www.gartner.com/smarterwithgartner/whats-new-in-gartners-hype-cycle-for-emerging-technologies-2015/
[4] Chui M., Loffler M., Roberts R., The Internet of Things, McKinsey Quarterly, 2010 (accessed: 17.02.2017) http://www.mckinsey.com/industries/high-tech/our-insights/the-internet-of-things
[5] Hogeweg P., “The Roots of Bioinformatics in Theoretical Biology”, PLOS Computational Biology, 7:3 (2011), e1002021 | DOI
[6] Winkler H., Verbreitung und Ursache der Parthenogenesis im Pflanzen- und Tierreiche, Verlag Fischer, Jena, 1920
[7] Baker M., “The 'Omes Puzzle”, Nature, 494 (2013), 416–419 | DOI
[8] Ohashi H., Hasegawa M., Wakimoto K., Miyamoto-Sato E., “Next-generation technologies for multiomics approaches including interactome sequencing”, BioMed Research International, 2015 (2015), 104209 | DOI
[9] International Human Genome Sequencing Consortium, “Initial sequencing and analysis of the human genome”, Nature, 409 (2001), 860–921 | DOI
[10] Venter J. C., Adams M. D., Myers E. W., Li P. W., Mural R. J., Sutton G. G., Smith H. O., Yandell M., Evans C. A., Holt R. A., et al., “The sequence of the human genome”, Science, 291:5507 (2001), 1304–1351 | DOI
[11] Buermans H. P. J., den Dunnen J. T., “Next generation sequencing technology. Advances and applications”, BBA — Molecular Basis of Disease, 1842:10 (2014), 1932–1941 | DOI
[12] Bioinforx Inc. Next Generation Sequencing Software (accessed: 17.02.2017) http://bioinfo.wisc.edu/knowledge_base/next-gen-seq_software.php
[13] BaseSpace Sequence Hub (accessed: 17.02.2017) https://www.illumina.com/content/dam/illumina-marketing/documents/products/datasheets/datasheet_basespace.pdf
[14] CLCBio (accessed: 17.02.2017) http://www.clcbio.com
[15] DNASTAR Lasergene (accessed: 17.02.2017) https://www.dnastar.com/t-allproducts.aspx
[16] Kearse M., Moir R., Wilson A., Stones-Havas S., Cheung M., Sturrock S., Buxton S., Cooper A., Markowitz S., Duran C., et al., “Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data”, Bioinformatics, 28:12 (2012), 1647–1649 | DOI
[17] Giardine B., Riemer C., Hardison R. C., Burhans R., Elnitski L., Shah P., Zhang Y., Blankenberg D., Albert I., Taylor J., et al., “Galaxy: a platform for interactive large-scale genome analysis”, Genome Res., 15:10 (2005), 1451–1455 | DOI
[18] Goecks J., Nekrutenko A., Taylor J., Galaxy Team, “Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences”, Genome Biol., 11:8 (2010), R86 | DOI
[19] Madduri R. K., Sulakhe D., Lacinski L., Liu B., Rodriguez A., Chard K., Dave U. J., Foster I. T., “Experiences Building Globus Genomics: A Next-Generation Sequencing Analysis Service using Galaxy, Globus, and Amazon Web Services”, Concurr. Comput., 26:13 (2014), 2266–2279 | DOI
[20] Wattam A. R., Abraham D., Dalay O., Disz T. L., Driscoll T., Gabbard J. L., Gillespie J. J., Gough R., Hix D., Kenyon R., et al., “PATRIC, the bacterial bioinformatics database and analysis resource”, Nucleic Acids Res., 42 (2014), D581–D591 | DOI
[21] Golosova O., Henderson R., Vaskin Y., Gabrielian A., Grekhov G., Nagarajan V., Oler A. J., Quinones M., Hurt D., Fursov M., Huyen Y., “Unipro UGENE NGS pipelines and components for variant calling, RNA-seq and ChIP-seq data analyses”, PeerJ, 2 (2014), e644 | DOI
[22] Okonechnikov K., Golosova O., Fursov M., UGENE Team, “Unipro UGENE: a unified bioinformatics toolkit”, Bioinformatics, 28:8 (2012), 1166–1167 | DOI
[23] Jagla B., Wiswedel B., Coppée J.-Y., “Extending KNIME for next-generation sequencing data analysis”, Bioinformatics, 27:20 (2011), 2907–2909 | DOI
[24] Warr W. A., “Scientific workflow systems: Pipeline Pilot and KNIME”, Journal of Computer-Aided Molecular Design, 26:7 (2012), 801–804 | DOI
[25] Oinn T., Addis M., Ferris J., Marvin D., Senger M., Greenwood M., Carver T., Glover K., Pocock M. R., Wipat A., Li P., “Taverna: a tool for the composition and enactment of bioinformatics workflows”, Bioinformatics, 20:17 (2004), 3045–3054 | DOI
[26] Barnett D. W., Garrison E. K., Quinlan A. R., Stromberg M. P., Marth G. T., “BamTools: a C++ API and toolkit for analyzing and managing BAM files”, Bioinformatics, 27:12 (2011), 1691–1692 | DOI
[27] Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R., 1000 Genome Project Data Processing Subgroup, “The Sequence Alignment/Map format and SAMtools”, Bioinformatics, 25:16 (2009), 2078–2079 | DOI
[28] Nordell Markovits A., Joly Beauparlant C., Toupin D., Wang S., Droit A., Gevry N., “NGS++: a library for rapid prototyping of epigenomics software tools”, Bioinformatics, 29:15 (2013), 1893–1894 | DOI
[29] Plieskatt J., Rinaldi G., Brindley P. J., Jia X., Potriquet J., Bethony J., Mulvenna J., “Bioclojure: a functional library for the manipulation of biological sequences”, Bioinformatics, 30:17 (2014), 2537–2539 | DOI
[30] libStatGen (accessed: 17.02.2017) https://github.com/statgen/libStatGen/
[31] Pitt W. R., Williams M. A., Steven M., Sweeney B., Bleasby A. J., Moss D. S., “The Bioinformatics Template Library — generic components for biocomputing”, Bioinformatics, 17:8 (2001), 729–737 | DOI
[32] Stajich J. E., Block D., Boulez K., Brenner S. E., Chervitz S. A., Dagdigian C., Fuellen G., Gilbert J. G., Korf I., Lapp H., et al., “The Bioperl toolkit: Perl modules for the life sciences”, Genome Res., 12:10 (2002), 1611–1618 | DOI
[33] Goto N., Prins P., Nakao M., Bonnal R., Aerts J., Katayama T., “BioRuby: bioinformatics software for the Ruby programming language”, Bioinformatics, 26:20 (2010), 2617–2619 | DOI
[34] Holland R. C., Down T. A., Pocock M., Prlic A., Huen D., James K., Foisy S., Drager A., Yates A., Heuer M., et al., “BioJava: an open-source framework for bioinformatics”, Bioinformatics, 24:18 (2008), 2096–2097 | DOI
[35] Cock P. J., Antao T., Chang J. T., Chapman B. A., Cox C. J., Dalke A., Friedberg I., Hamelryck T., Kauff F., Wilczynski B., et al., “Biopython: freely available Python tools for computational molecular biology and bioinformatics”, Bioinformatics, 25:11 (2009), 1422–1423 | DOI
[36] Open Bioinformatics Foundation (accessed: 17.02.2017) https://www.open-bio.org/wiki/Main_Page
[37] Huber W., Carey V. J., Gentleman R., Anders S., Carlson M., Carvalho B. S., Bravo H. C., Davis S., Gatto L., Girke T., et al., “Orchestrating high-throughput genomic analysis with Bioconductor”, Nat. Methods, 12:2 (2015), 115–121 | DOI
[38] Gentleman R. C., Carey V. J., Bates D. M., Bolstad B., Dettling M., Dudoit S., Ellis B., Gautier L., Ge Y., Gentry J., et al., “Bioconductor: open software development for computational biology and bioinformatics”, Genome Biol., 5:10 (2004), R80 | DOI
[39] Milicchio F., Rose R., Bian J., Min J., Prosperi M., “Visual programming for next-generation data analytics”, BioData Mining, 9 (2016), 16 | DOI
[40] Bernstein F. C., Koetzle T. F., Williams G. J., Meyer E. F. Jr., Brice M. D., Rodgers J. R., Kennard O., Shimanouchi T., Tasumi M., “The Protein Data Bank: a computer-based archival file for macromolecular structures”, J. Mol. Biol., 112:3 (1977), 535–542 | DOI
[41] Bourne P. E., Berman H. M., McMahon B., Watenpaugh K. D., Westbrook J. D., Fitzgerald P. M. D., “Macromolecular crystallographic information file”, Methods in Enzymology, 277 (1997), 571–590 | DOI
[42] Galperin M. Y., Fernandez-Suarez X. M., Rigden D. J., “The 24th annual Nucleic Acids Research database issue: a look back and upcoming changes”, Nucleic Acids Res., 45 (2017), D1–D11 | DOI
[43] Benson D., Lipman D. J., Ostell J., “GenBank”, Nucleic Acids Res., 22 (1994), 3441–3444 | DOI
[44] Rice C. M., Fuchs R., Higgins D. G., Stoehr P. J., Cameron G. N., “The EMBL Data Library”, Nucleic Acids Res., 21 (1993), 2967–2971 | DOI
[45] Tateno Y., Gojobori T., “DNA Data Bank of Japan in the age of information biology”, Nucleic Acids Res., 25:1 (1997), 14–17 | DOI
[46] de Brevern A. G., Meyniel J.-P., Fairhead C., Neuveglise C., Malpertuy A., “Trends in IT Innovation to Build a Next Generation Bioinformatics Solution to Manage and Analyse Biological Big Data Produced by NGS Technologies”, BioMed Research International, 2015 (2015), 904541
[47] Lith A., Mattsson J., Investigating Storage Solutions for Large Data. A comparison of well performing and scalable data storage solutions for real time extraction and batch insertion of data, Master of Science Thesis, 2010 (accessed: 17.02.2017) http://publications.lib.chalmers.se/records/fulltext/123839.pdf
[48] Svensson J., Relational vs. graph databases: Which to use and when?, SD Times, 2016 (accessed: 17.02.2017) http://sdtimes.com/guest-view-relational-vs-graph-databases-use/#sthash.yHI6aoDv.dpuf
[49] Have C. T., Jensen L. J., “Are graph databases ready for bioinformatics?”, Bioinformatics, 29:24 (2013), 3107–3108 | DOI
[50] Taylor R. C., “An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics”, BMC Bioinformatics, 11 (2010), S1 | DOI
[51] Chang F., Dean J., Ghemawat S., Hsieh W. C., Wallach D. A., Burrows M., Chandra T., Fikes A., Gruber R. E., “Bigtable: A Distributed Storage System For Structured Data”, The 7th Symposium on Operating System Design and Implementation, Usenix Association, Seattle, WA, 2006, 14 pp. (accessed: 17.02.2017) https://static.googleusercontent.com/media/research.google.com/ru//archive/bigtable-osdi06.pdf
[52] Shen L., Shao N., Liu X., Nestler E., “Ngs.plot: quick mining and visualization of next-generation sequencing data by integrating genomic databases”, BMC Genomics, 15:1 (2014), 284 | DOI
[53] Robinson J. T., Thorvaldsdóttir H., Winckler W., Guttman M., Lander E. S., Getz G., Mesirov J. P., “Integrative genomics viewer”, Nature Biotechnology, 29:1 (2011), 24–26 | DOI
[54] Toedling J., Ciaudo C., Voinnet O., Heard E., Barillot E., “Girafe — an R/Bioconductor package for functional exploration of aligned next-generation sequencing reads”, Bioinformatics, 26:22 (2010), 2902–2903 | DOI
[55] Nolan D., Lang D. T., “Interactive and animated scalable vector graphics and R data displays”, Journal of Statistical Software, 46:1 (2012), 1–88 | DOI
[56] TIBCO Spotfire Homepage (accessed: 17.02.2017) http://spotfire.tibco.com/
[57] Wexler J., Thompson W., Aponte K., “Time Is Precious, So Are Your Models. SAS provides solutions to streamline deployment”, SAS Global Forum 2013, 086-2013 (accessed: 17.02.2017) https://support.sas.com/resources/papers/proceedings13/086-2013.pdf
[58] Tanenbaum A., van Steen M., Distributed Systems: Principles and Paradigms (Russian edition), Piter, St. Petersburg, 2003, 877 pp.
[59] Dean J., Ghemawat S., “MapReduce: simplified data processing on large clusters”, Commun. ACM, 51:1 (2008), 107–113 | DOI
[60] White T., Hadoop: The Definitive Guide, O'Reilly Media, Inc., 2015, 756 pp.
[61] The Apache Software Foundation Home page (accessed: 17.02.2017) http://www.apache.org/
[62] IBM z Systems - z13s (accessed: 17.02.2017) http://www-03.ibm.com/systems/z/hardware/z13s.html/
[63] Rustici G., Kolesnikov N., Brandizi M., Burdett T., Dylag M., Emam I., Farne A., Hastings E., Ison J., Keays M., et al., “ArrayExpress update — trends in database growth and links to data analysis tools”, Nucleic Acids Res., 41 (2013), D987–D990 | DOI
[64] Greene A. C., Giffin K. A., Greene C. S., Moore J. H., “Adapting bioinformatics curricula for big data”, Briefings in Bioinformatics, 17:1 (2016), 43–50 | DOI
[65] Margolis R., Derr L., Dunn M., Huerta M., Larkin J., Sheehan J., Guyer M., Green E. D., “The National Institutes of Health's Big Data to Knowledge (BD2K) initiative: capitalizing on biomedical big data”, J. Am. Med. Inform. Assoc., 21 (2014), 957–958 | DOI
[66] Luo J., Wu M., Gopukumar D., Zhao Y., “Big Data Application in Biomedical Research and Health Care: A Literature Review”, Biomed. Inform. Insights, 8 (2016), 1–10