Heuristic search of exact biclusters in binary data
International Journal of Applied Mathematics and Computer Science, Tome 30 (2020) no. 1, pp. 161-171.

Voir la notice de l'article provenant de la source Library of Science

The biclustering of two-dimensional homogeneous data consists in finding a subset of rows and a subset of columns whose intersection provides a set of cells whose values fulfil a specified condition. Usually it is defined as equality or comparability. One of the presented approaches is based on the model of Boolean reasoning, in which finding biclusters in binary or discrete data comes down to the problem of finding prime implicants of some Boolean function. Due to the high computational complexity of this task, the application of some heuristics should be considered. In the paper, a modification of the well-known Johnson strategy for prime implicant approximation induction is presented, which is necessary for the biclustering problem. The new method is applied to artificial and biomedical datasets.
Keywords: biclustering, Boolean reasoning, prime implicant approximation, biomedical data analysis, Johnson heuristic
@article{IJAMCS_2020_30_1_a12,
     author = {Michalak, Marcin and Jaksik, Roman and \'Sl\k{e}zak, Dominik},
     title = {Heuristic search of exact biclusters in binary data},
     journal = {International Journal of Applied Mathematics and Computer Science},
     pages = {161--171},
     publisher = {mathdoc},
     volume = {30},
     number = {1},
     year = {2020},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/IJAMCS_2020_30_1_a12/}
}
TY  - JOUR
AU  - Michalak, Marcin
AU  - Jaksik, Roman
AU  - Ślęzak, Dominik
TI  - Heuristic search of exact biclusters in binary data
JO  - International Journal of Applied Mathematics and Computer Science
PY  - 2020
SP  - 161
EP  - 171
VL  - 30
IS  - 1
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/IJAMCS_2020_30_1_a12/
LA  - en
ID  - IJAMCS_2020_30_1_a12
ER  - 
%0 Journal Article
%A Michalak, Marcin
%A Jaksik, Roman
%A Ślęzak, Dominik
%T Heuristic search of exact biclusters in binary data
%J International Journal of Applied Mathematics and Computer Science
%D 2020
%P 161-171
%V 30
%N 1
%I mathdoc
%U http://geodesic.mathdoc.fr/item/IJAMCS_2020_30_1_a12/
%G en
%F IJAMCS_2020_30_1_a12
Michalak, Marcin; Jaksik, Roman; Ślęzak, Dominik. Heuristic search of exact biclusters in binary data. International Journal of Applied Mathematics and Computer Science, Tome 30 (2020) no. 1, pp. 161-171. http://geodesic.mathdoc.fr/item/IJAMCS_2020_30_1_a12/

[1] Busygin, S., Prokopyev, O. and Pardalos, P.M. (2008). Biclustering in data mining, Computers Operations Research 35(9): 2964–2987.

[2] Chagoyen, M., Carmona-Saez, P., Shatkay, H., Carazo, J.M. and Pascual-Montano, A. (2006). Discovering semantic features in the literature: A foundation for building functional associations, BMC Bioinformatics 7, Article no. 41.

[3] Cheng, Y. and Church, G. (2000). Biclustering of expression data, Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology, San Diego, CA, USA, Vol. 8, pp. 93–103.

[4] Gusenleitner, D., Howe, E.A., Bentink, S., Quackenbush, J. and Culhane, A.C. (2012). iBBiG: Iterative binary bi-clustering of gene sets, Bioinformatics 28(19): 2484–2492.

[5] Hartigan, J.A. (1972). Direct clustering of a data matrix, Journal of the American Statistical Association 67(337): 123–129.

[6] Hochreiter, S., Bodenhofer, U., Heusel, M., Mayr, A., Mitterecker, A., Kasim, A., Khamiakova, T., Van Sanden, S., Lin, D., Talloen, W., Bijnens, L., Goehlmann, H.W.H., Shkedy, Z. and Clevert, D.-A. (2010). FABIA: Factor analysis for bicluster acquisition, Bioinformatics 26(12): 1520–1527.

[7] Ignatov, D.I. and Watson, B.W. (2016). Towards a unified taxonomy of biclustering methods, Russian and South African Workshop on Knowledge Discovery Techniques Based on Formal Concept Analysis, Stellenbosch, South Africa, Vol. 1522, pp. 23–39.

[8] Johnson, D. (1974). Approximation algorithms for combinational problems, Journal of Computer and System Sciences 9(3): 256–278.

[9] Kasim, A., Shkedy, Z., Kaiser, S., Hochreiter, S. and Talloen, W. (2016). Applied Biclustering Methods for Big and High Dimensional Data Using R, CRC Press, Taylor Francis Group, New York, NY.

[10] Latkowski, R. (2003). On decomposition for incomplete data, Fundamenta Informaticae 54(1): 1–16.

[11] Michalak, M. and Ślęzak, D. (2018). Boolean representation for exact biclustering, Fundamenta Informaticae 161(3): 275–297.

[12] Michalak, M. and Ślęzak, D. (2019). On Boolean representation of continuous data biclustering, Fundamenta Informaticae 167(3): 193–217.

[13] Murali, T.M. and Kasif, S. (2003). Extracting conserved gene expression motifs from gene expression data, Proceedings of the Pacific Symposium on Biocomputing, Kauai, HI, USA, pp. 77–88.

[14] Orzechowski, P. and Boryczko, K. (2016). Text mining with hybrid biclustering algorithms, Proceedings of the 15th International Conference on Artificial Intelligence and Soft Computing, Zakopane, Poland, pp. 102–113.

[15] Prelić, A., Bleuler, S., Zimmermann, P., Wille, A., Bühlmann, P., Gruissem, W., Hennig, L., Thiele, L. and Zitzler, E. (2006). A systematic comparison and evaluation of biclustering methods for gene expression data, Bioinformatics 22(9): 1122–1129.

[16] Rodriguez-Baena, D.S., Perez-Pulido, A.J. and Aguilar-Ruiz, J.S. (2011). A biclustering algorithm for extracting bit-patterns from binary datasets, Bioinformatics 27(19): 2738–2745.

[17] Serin, A. and Vingron, M. (2011). DeBi: Discovering differentially expressed biclusters using a frequent itemset approach, Algorithms for Molecular Biology 6, Article no. 1.

[18] Tanay, A., Sharan, R. and Shamir, R. (2005). Handbook of Computational Molecular Biology, Chapman Hall/CRC Press, New York, NY.

[19] Yang, J., Wang, H., Wang, W. and Yu, P. (2003). Enhanced biclustering on expression data, Proceedings of the 3rd IEEE Symposium on Bioinformatics and Bioengineering, Bethesda, MD, USA, pp. 321–327.