Enhancing Network Engineering Capabilities through LLM Fine-Tuning with Automatically Generated Datasets
Computer Science and Information Systems, Volume 23 (2026) no. 1
This paper presents a method for automatically generating domain-specific datasets to fine-tune open-source LLMs for network engineering. Our objective is to address the growing complexity of network configuration and management tasks by supplying LLMs with high-quality training data. We evaluated datasets generated with open-source LLMs, including DeepSeek-R1 671B, LLaMA 3.1 70B, Qwen 2.5 72B, and Mixtral 8x7B, analyzing both the quality of the unprocessed knowledge data and the efficacy of cleaning and deduplication methods. The resulting dataset covers a range of subjects related to routing, security, and network services. Afterward, we fine-tuned the smaller LLaMA 3.2 1B, LLaMA 3.2 3B, and Qwen 2.5 1.5B models using Low-Rank Adaptation, thereby minimizing computational demands while maintaining the quality of domain knowledge.
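The abstract does not describe how the cleaning and deduplication steps are carried out. As an illustration only, the following minimal Python sketch shows one common approach: dropping incomplete records and removing questions that are identical after text normalization. The record fields (`question`, `answer`), the function names, and the `min_answer_chars` threshold are assumptions made for this example and are not taken from the paper.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, Unicode-normalize, and collapse punctuation and whitespace."""
    text = unicodedata.normalize("NFKC", text).lower()
    text = re.sub(r"[^\w\s]", " ", text)  # replace punctuation with spaces
    return re.sub(r"\s+", " ", text).strip()


def clean_and_deduplicate(records, min_answer_chars=20):
    """Drop incomplete records and exact duplicates after normalization.

    `records` is a list of dicts with hypothetical keys "question" and
    "answer"; `min_answer_chars` is an illustrative threshold.
    """
    seen = set()
    kept = []
    for rec in records:
        question = rec.get("question", "").strip()
        answer = rec.get("answer", "").strip()
        # Cleaning: skip records with no question or a very short answer.
        if not question or len(answer) < min_answer_chars:
            continue
        # Deduplication: keep only the first record per normalized question.
        key = normalize(question)
        if key in seen:
            continue
        seen.add(key)
        kept.append({"question": question, "answer": answer})
    return kept
```

Exact matching on normalized text catches only trivially repeated questions. Paraphrased duplicates would need a fuzzier similarity measure, which this sketch deliberately leaves out.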