The Problem of Processing and Storage of Large Amounts of Scientific Data and Approaches to Its Solution
Matematičeskaâ biologiâ i bioinformatika, Vol. 8 (2013) no. 1, pp. 49-65.

See the article record from the source Math-Net.Ru

Science today faces a big data problem. The volume of information collected in scientific experiments, especially in bioinformatics and astrophysics, is growing at a remarkable rate. In this paper we consider the specialized software techniques and computer technologies used to work with very large volumes of data. We also discuss the state of big data handling at the Institute of Mathematical Problems of Biology RAS and at the Pushchino Radio Astronomy Observatory (Astro Space Center of the Lebedev Physical Institute RAS).
@article{MBB_2013_8_1_a0,
     author = {E. A. Isaev and V. V. Kornilov},
     title = {The {Problem} of {Processing} and {Storage} of {Large} {Amounts} of {Scientific} {Data} and {Approaches} to {Its} {Solution}},
     journal = {Matemati\v{c}eska\^a biologi\^a i bioinformatika},
     pages = {49--65},
     publisher = {mathdoc},
     volume = {8},
     number = {1},
     year = {2013},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/MBB_2013_8_1_a0/}
}
E. A. Isaev; V. V. Kornilov. The Problem of Processing and Storage of Large Amounts of Scientific Data and Approaches to Its Solution. Matematičeskaâ biologiâ i bioinformatika, Vol. 8 (2013) no. 1, pp. 49-65. http://geodesic.mathdoc.fr/item/MBB_2013_8_1_a0/

[1] Howe D., Costanzo M., Fey P., Gojobori T., Hannick L., Hide W., Hill D. P., Kania R., Schaeffer M., St Pierre S., et al., “Big data: the future of biocuration”, Nature, 455 (2008), 47–50

[2] PMC: a free full-text archive of biomedical and life sciences journal literature at the U.S. National Institutes of Health's National Library of Medicine (NIH/NLM) (accessed: 10.02.2013) http://www.ncbi.nlm.nih.gov/pmc/

[3] MIKE2.0, The open source standard for Information Management. Big Data Definition (accessed: 10.02.2013) http://mike2.openmethodology.org/wiki/Big_Data_Definition

[4] Manyika J., Chui M., Brown B., Bughin J., Dobbs R., Roxburgh C., Byers A. H., Big data: The next frontier for innovation, competition, and productivity, McKinsey Global Institute Report, 2011 (accessed: 10.02.2013) http://www.mckinsey.com/insights/mgi/research/technology_and_innovation/big_data_the_next_frontier_for_innovation

[5] Kanarakus K., “Mashina Bolshikh Dannykh”, Seti (Network World), 2011, no. 04 (accessed: 10.02.2013) http://www.osp.ru/nets/2011/04/13010802/

[6] Lynch C., “How do your data grow?”, Nature, 455:7209 (2008), 28–29

[7] Human Genome Project Information Website (accessed: 10.02.2013) http://www.ornl.gov/sci/techresources/Human_Genome/home.shtml

[8] Drmanac R., Sparks A. B., Callow M. J., Halpern A. L., Burns N. L., Kermani B. G., Carnevali P., Nazarenko I., Nilsen G. B., Yeung G., et al., “Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays”, Science, 327 (2010), 78–81

[9] Pell J., Hintze A., Canino-Koning R., Howe A., Tiedje J. M., Brown C. T., “Scaling metagenome sequence assembly with probabilistic de Bruijn graphs”, PNAS, 109 (2012), 13272–13277

[10] Schadt E. E., Linderman M. D., Sorenson J., Lee L., Nolan G. P., “Computational solutions to large-scale data management and analysis”, Nat. Rev. Genet., 11 (2010), 647–657

[11] Loh P., Baym M., Berger B., “Compressive genomics”, Nature Biotechnology, 30 (2012), 627–630

[12] E-health Standards and Interoperability, ITU-T Technology Watch Report, April 2012 (accessed: 10.02.2013) http://www.itu.int/dms_pub/itu-t/oth/23/01/T23010000170001PDFE.pdf

[13] Castro D., “The Role of Information Technology in Medical Research”, IEEE 2009, Atlanta Conference on Science, Technology and Innovation Policy (October 2009), 2009

[14] Ob'edinennyi tsentr vychislitelnoi biologii i bioinformatiki na baze Instituta matematicheskikh problem biologii Puschinskogo nauchnogo tsentra RAN (accessed: 10.02.2013) http://www.jcbi.ru/index.html

[15] Brumfiel G., “High-energy physics: Down the petabyte highway”, Nature, 469:7330 (2011), 282–283

[16] Essers L., “Filtr sekretov mirozdaniya”, Computerworld Rossiya, 2011, no. 18

[17] The Sloan Digital Sky Survey (accessed: 10.02.2013) http://www.sdss.org/

[18] “Data, data everywhere. A special report on managing information”, The Economist, 2010

[19] Stephens M., Petabyte-chomping big sky telescope sucks down baby code, The Register (accessed: 10.02.2013) http://www.theregister.co.uk/2010/11/26/lsst_big_data_and_agile

[20] Boon M., Astronomical Computing, Symmetry Breaking (accessed: 10.02.2013) http://www.symmetrymagazine.org/breaking/2010/10/18/astronomical-computing

[21] LOFAR website (accessed: 10.02.2013) http://www.lofar.org/

[22] SKA Project website (accessed: 10.02.2013) http://www.skatelescope.org/

[23] Pugachev V. D., Isaev E. A., Amzarakov M. B., Samodurov V. A., Sukhov R. R., Kobylka N. A., “Razvitie tsentrov obrabotki nauchnykh dannykh”, Vserossiiskaya radioastronomicheskaya konferentsiya, tez. dokl., IPA, S.-P., 2011, 144

[24] Proekt «RadioAstron» (accessed: 10.02.2013) http://www.asc.rssi.ru/radioastron/rus/index.html

[25] Indiana launches new ultra-high-speed network. University Information Technology Services (accessed: 10.02.2013) http://uitsnews.iu.edu/2012/01/31/indiana-launches-new-ultra-high-speed-network

[26] Dubova N., “V avangarde Bolshikh Dannykh”, Otkrytye sistemy, 2012, no. 03

[27] The Apache Software Foundation Project (accessed: 10.02.2013) http://www.apache.org/foundation

[28] Apache Hadoop project website (accessed: 10.02.2013) http://hadoop.apache.org

[29] White T., Hadoop: The Definitive Guide. Storage and Analysis at Internet Scale, 3rd Edition, O'Reilly Media; Yahoo Press, 2012, 688 pp.

[30] Dean J., Ghemawat S., “MapReduce: Simplified data processing on large clusters”, Proceedings of the Sixth Conference on Operating System Design and Implementation (Berkeley, 2004)

[31] Sadalage P., Fowler M., NoSQL Distilled, Pearson Education, 2012, 192 pp.

[32] Stonebraker M., Abadi D., DeWitt D. J., Madden S., Paulson E., Pavlo A., Rasin A., “MapReduce and Parallel DBMSs: Friends or Foes?”, Communications of the ACM, 53:1 (2010)

[33] Pavlo A., Paulson E., Rasin A., Abadi D. J., DeWitt D. J., Madden S. R., Stonebraker M., “A comparison of approaches to large-scale data analysis”, Proceedings of the 35th SIGMOD International Conference on Management of Data, ACM Press, New York, 2009, 165–178

[34] Chernyak L., “Platformy dlya Bolshikh Dannykh”, Otkrytye sistemy, 2012, no. 07

[35] Artemov S., Big Data: novye vozmozhnosti dlya rastuschego biznesa, «Infosistemy Dzhet» (accessed: 10.02.2013) http://www.jet.su

[36] Announcing the New SGI UV: The Big Brain Computer, Business Wire (accessed: 10.02.2013) http://www.businesswire.com/news/home/20120618005340/en

[37] Vykhodtsev A., “Platforma dlya Bolshikh Dannykh”, Otkrytye sistemy, 2012, no. 06

[38] Yakhina I., “Khranilische dlya Bolshikh Dannykh”, Otkrytye sistemy, 2012, no. 07

[39] Serov D., “Mashiny dlya analitikov”, Otkrytye sistemy, 2011, no. 04

[40] Chernyak L., “Bolshie dannye vozrozhdayut DAS”, Computerworld Rossiya, 2011, no. 14

[41] Proekt EGEE-RDIG (accessed: 10.02.2013) http://www.egee-rdig.ru

[42] TOP500 List of the world's top supercomputers, November 2012 (accessed: 10.02.2013) http://www.top500.org/lists/2012/11/

[43] (accessed: 10.02.2013) http://www.olcf.ornl.gov/titan/

[44] (accessed: 10.02.2013) http://parallel.ru/cluster/lomonosov.html

[45] The OpenNet Project (accessed: 10.02.2013) http://www.opennet.ru/opennews/art.shtml?num=35358

[46] The Graph 500 List (accessed: 10.02.2013) http://www.graph500.org/

[47] Vychislitelnyi klaster PNTs RAN (accessed: 10.02.2013) http://www.jcbi.ru/klaster/index.shtml

[48] Lakhno V. D., Isaev E. A., Pugachev V. D., Zaitsev A. Yu., Fialko N. S., Rykunov S. D., Ustinin M. N., “Razvitie informatsionno-kommunikatsionnykh tekhnologii v Puschinskom nauchnom tsentre RAN”, Matematicheskaya biologiya i bioinformatika, 7:2 (2012), 529–544

[49] Shatskaya M. V., Girin I. A., Isaev E. A., Likhachev S. F., Pimakov A. S., Seliverstov S. I., Fedorov N. A., “Organizatsiya tsentra obrabotki nauchnoi informatsii dlya radiointerferometricheskikh proektov”, Kosmicheskie issledovaniya, 50:4 (2012), 346–350