On characteristics of symbolic execution in the problem of assessing the quality of obfuscating transformations
Modelirovanie i analiz informacionnyh sistem, Volume 28 (2021) no. 1, pp. 38-51.

See the article record from the source Math-Net.Ru

Obfuscation is used to protect programs from analysis and reverse engineering. Theoretically effective and resistant obfuscation methods exist, but most of them have not yet been implemented in practice. The main reasons are the large execution overhead of the obfuscated code and applicability restricted to a specific class of programs. On the other hand, many obfuscation methods have been developed that are applied in practice. Existing approaches to assessing such methods rely mainly on static characteristics of programs. A comprehensive justification of their effectiveness and resistance, one that also takes the dynamic characteristics of programs into account, is therefore a relevant task. Such a justification can plausibly be obtained with machine learning methods, using feature vectors that describe both static and dynamic characteristics of programs. This paper proposes building such a vector from the characteristics of a pair of compared programs: original and obfuscated, original and deobfuscated, or obfuscated and deobfuscated. To obtain the dynamic characteristics of a program, a scheme based on symbolic execution is constructed and presented. Symbolic execution is chosen because its characteristics can describe how difficult the program is to comprehend during dynamic analysis. Two implementations of the scheme are proposed: extended and simplified. The extended scheme is closer to how an analyst examines a program, since it includes disassembly and translation into intermediate code, whereas the simplified scheme omits these steps. Experiments with the developed schemes were carried out to identify the characteristics of symbolic execution that are suitable for assessing the effectiveness and resistance of obfuscation with machine learning methods. Based on the obtained results, a set of suitable characteristics is determined.
Keywords: obfuscation, symbolic execution, program similarity, program comprehension.
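
The idea of describing a pair of compared programs by one feature vector can be illustrated with a minimal sketch. It assumes each program has already been summarized by a few symbolic-execution characteristics. The names `explored_paths`, `solver_queries` and `executed_instructions` are hypothetical placeholders, not the characteristics selected in the paper, and the relative-change formula is only one possible way to combine two programs into a vector.

```python
from typing import Dict, List

# Hypothetical characteristic names; the abstract does not list the actual ones.
FEATURES = ["explored_paths", "solver_queries", "executed_instructions"]


def pair_feature_vector(a: Dict[str, float], b: Dict[str, float]) -> List[float]:
    """Relative change of each characteristic from program `a` to program `b`."""
    vector = []
    for name in FEATURES:
        base = a[name]
        # Guard against division by zero when the baseline value is 0.
        denom = base if base != 0 else 1.0
        vector.append((b[name] - base) / denom)
    return vector


# Illustrative numbers only, not measurements from the paper.
original = {"explored_paths": 10, "solver_queries": 40, "executed_instructions": 2000}
obfuscated = {"explored_paths": 25, "solver_queries": 100, "executed_instructions": 9000}

print(pair_feature_vector(original, obfuscated))  # [1.5, 1.5, 3.5]
```

The same function could be applied to the other two pairings named in the abstract (original and deobfuscated, obfuscated and deobfuscated), giving one vector per pair as input to a classifier or regressor.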
@article{MAIS_2021_28_1_a2,
     author = {P. D. Borisov and Yu. V. Kosolapov},
     title = {On characteristics of symbolic execution in the problem of assessing the quality of obfuscating transformations},
     journal = {Modelirovanie i analiz informacionnyh sistem},
     pages = {38--51},
     publisher = {mathdoc},
     volume = {28},
     number = {1},
     year = {2021},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/MAIS_2021_28_1_a2/}
}
P. D. Borisov; Yu. V. Kosolapov. On characteristics of symbolic execution in the problem of assessing the quality of obfuscating transformations. Modelirovanie i analiz informacionnyh sistem, Volume 28 (2021) no. 1, pp. 38-51. http://geodesic.mathdoc.fr/item/MAIS_2021_28_1_a2/
