Automatic generation of hash functions for~program~code~obfuscation
Prikladnaâ diskretnaâ matematika, no. 4 (2020), pp. 102-117.

Voir la notice de l'article provenant de la source Math-Net.Ru

The specifics of hash function applications in code obfuscation are considered, as well as the disadvantages of currently existing hash functions for this purpose. Considering these specifics and disadvantages, an automatic hash function generation method is proposed, that is based on a genetic programming approach. Methods of measuring the hash function resilience to automated preimage attacks based on SMT solvers usage and random collision resistance are also proposed. Generated functions were evaluated and a method of weak function detection is proposed, that allows to increase the resilience of generated functions to attacks notably.
Keywords: obfuscation, hash function, genetic programming, SMT solver.
Mots-clés : avalanche effect
@article{PDM_2020_4_a7,
     author = {R. K. Lebedev},
     title = {Automatic generation of hash functions for~program~code~obfuscation},
     journal = {Prikladna\^a diskretna\^a matematika},
     pages = {102--117},
     publisher = {mathdoc},
     number = {4},
     year = {2020},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/PDM_2020_4_a7/}
}
TY  - JOUR
AU  - R. K. Lebedev
TI  - Automatic generation of hash functions for~program~code~obfuscation
JO  - Prikladnaâ diskretnaâ matematika
PY  - 2020
SP  - 102
EP  - 117
IS  - 4
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/PDM_2020_4_a7/
LA  - ru
ID  - PDM_2020_4_a7
ER  - 
%0 Journal Article
%A R. K. Lebedev
%T Automatic generation of hash functions for~program~code~obfuscation
%J Prikladnaâ diskretnaâ matematika
%D 2020
%P 102-117
%N 4
%I mathdoc
%U http://geodesic.mathdoc.fr/item/PDM_2020_4_a7/
%G ru
%F PDM_2020_4_a7
R. K. Lebedev. Automatic generation of hash functions for~program~code~obfuscation. Prikladnaâ diskretnaâ matematika, no. 4 (2020), pp. 102-117. http://geodesic.mathdoc.fr/item/PDM_2020_4_a7/

[1] S. Banescu, C. S. Collberg, V. Ganesh et al., “Code obfuscation against symbolic execution attacks”, Proc. 32nd Ann. Conf. Computer Security Appl., 2016, 189–200

[2] S. Udupa, S. Debray, M. Madou, “Deobfuscation: Reverse Engineering Obfuscated Code”, 12th Working Conf. WCRE'05, 2005, 10 pp.

[3] P. Junod, J. Rinaldini, J. Wehrli, J. Michielin, “Obfuscator-LLVM — software protection for the masses”, IEEE/ACM 1st Intern. Workshop Software Protection, 2015, 3–9

[4] M. Sharif, A. Lanzi, J. Giffin, W. Lee, “Impeding malware analysis using conditional code obfuscation”, Proc. NDSS (San Diego, California, USA, 2008)

[5] IDA: Lumina server, , 2020 https://www.hex-rays.com/products/ida/lumina/

[6] D. Xu, J. Ming, D. Wu, “Cryptographic function detection in obfuscated binaries via bitprecise symbolic loop mapping”, IEEE Symp. Security Privacy (SP) (Los Alamitos, CA, USA, 2017), 921–937

[7] W. Qiu, Z. Gong, Y. Guo et al, “GPU-based high performance password recovery technique for hash functions”, J. Inform. Sci. Eng., 32:1 (2016), 97–112

[8] C. Estebanez, J. Hernandez-Castro, A. Ribagorda, P. Isasi, “Finding state-of-the-art noncryptographic hashes with genetic programming”, LNCS, 4193, 2006, 818–827

[9] J. Tapiador, J. Hernandez-Castro, P. Peris-Lopez, A. Ribagorda, “Automated design of cryptographic hash schemes by evolving highly-nonlinear functions”, J. Inform. Sci. Eng., 24:5 (2008), 1485–1504

[10] A. Menezes, P. van Oorschot, S. Vanstone, Handbook of Applied Cryptography, CRC Press, 1996, 661 pp.

[11] C. S. Collberg, C. Thomborson, “Watermarking, tamper-proofing, and obfuscation tools for software protection”, IEEE Trans. Software Eng., 28:8 (2002), 735–746

[12] A. F. Webster, S. E. Tavares, “On the design of S-boxes”, LNCS, 218, 1985, 523–534

[13] R. Cilibrasi, P. M. B. Vitanyi, “Clustering by compression”, IEEE Trans. Inform. Theory, 51:4 (2005), 1523–1545

[14] J. R. Koza, “Genetic programming as a means for programming computers by natural selection”, Stat. Comput., 4:2 (1994), 87–112

[15] F. A. Fortin, F. M. De Rainville, M. A. G. Gardner et al., “DEAP: Evolutionary algorithms made easy”, J. Machine Learning Res., 13:1 (2012), 2171–2175

[16] M. J. Dworkin, SHA-3 Standard: Permutation-Based Hash and Extendable-Output Functions, National Institute of Standards and Technology, 2015

[17] Y. Shoshitaishvili, R. Wang, C. Salls et al, “SOK: (State of) The art of war: Offensive techniques in binary analysis”, IEEE Symp. Security Privacy (SP), 2016, 138–157

[18] V. Chipounov, V. Kuznetsov, G. Candea, “S2E: A platform for in-vivo multi-path analysis of software systems”, ACM SIGPLAN Notices, 46:3 (2011), 265–278

[19] L. De Moura, N. Bjorner, “Z3: An efficient SMT solver”, LNCS, 4963, 2008, 337–340

[20] C. Barrett, C. L. Conway, M. Deters et al., “CVC4”, Intern. Conf. Comput. Aided Verification, 2011, 171–177

[21] B. Dutertre, “Yices 2.2”, Intern. Conf. Comput. Aided Verification, 2014, 737–744

[22] R. Brummayer, A. Biere, “Boolector: An efficient SMT solver for bit-vectors and arrays”, LNCS, 5505, 2009, 174–177

[23] D. E. Knuth, The Art of Computer Programming, v. 3, Sorting and Searching, 2nd Ed., Addison-Wesley Professional, 1998, 800 pp.

[24] P. Flajolet, P. J. Grabner, P. Kirschenhofer, H. Prodinger, “On Ramanujan's Q-function”, J. Computat. Appl. Math., 58:1 (1995), 103–116

[25] FNV Hash, , 1991 http://www.isthe.com/chongo/tech/comp/fnv/