Secure Code Generation Using Retrieval-Augmented Generation for Large Language Models

Vol. 35, No. 3, pp. 535-544, Jun. 2025
DOI: 10.13089/JKIISC.2025.35.3.535
Keywords: Large Language Model (LLM), Retrieval-Augmented Generation (RAG), Secure Code Generation
Abstract

Large language models (LLMs) such as GPT-4 are being widely adopted across various domains, with significant impact on code generation. However, because these models are trained on public repositories such as GitHub, they often generate code containing security vulnerabilities. Previous attempts to address this issue by training models directly on vulnerability information face several limitations: the difficulty of building accurately labeled datasets that distinguish vulnerable from secure code, the substantial computational resources and time required, and the challenge of keeping the model's vulnerability knowledge up to date. In this study, we propose an alternative approach that leverages retrieval-augmented generation (RAG) to incorporate external security vulnerability knowledge without modifying the base model. Our experiments demonstrate that this RAG-based approach effectively generates more secure code without the need for model retraining.
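The mechanism the abstract describes can be pictured with a short sketch: before generation, the system retrieves relevant vulnerability guidance from an external knowledge base and prepends it to the coding prompt, so security knowledge reaches the model without any weight updates. The code below is a minimal illustration under assumed names, not the paper's pipeline: KNOWLEDGE_BASE, the keyword-overlap retriever, and the llm_complete placeholder are all hypothetical stand-ins (a real system would index CWE/CVE guidance and retrieve with dense embeddings).

```python
# Minimal sketch of RAG-based secure code generation (illustrative only,
# not the authors' implementation). The knowledge base, retriever, and
# llm_complete are hypothetical stand-ins.

import re
from dataclasses import dataclass


@dataclass
class SecurityDoc:
    cwe_id: str
    guidance: str


# Toy external knowledge base (assumption: the real store would hold
# curated secure-coding guidance keyed to vulnerability classes).
KNOWLEDGE_BASE = [
    SecurityDoc("CWE-89", "Use parameterized queries; never build SQL statements by string concatenation."),
    SecurityDoc("CWE-78", "Do not pass user input to a shell; call subprocess with an argument list."),
    SecurityDoc("CWE-798", "Never hard-code credentials; load secrets from the environment or a vault."),
]


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens; punctuation is dropped."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(task: str, k: int = 2) -> list[SecurityDoc]:
    """Rank guidance docs by naive keyword overlap with the coding task
    (a real system would use dense-vector similarity instead)."""
    task_tokens = tokenize(task)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(task_tokens & tokenize(doc.guidance)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(task: str) -> str:
    """Augment the coding task with retrieved security guidance."""
    context = "\n".join(f"- [{d.cwe_id}] {d.guidance}" for d in retrieve(task))
    return (
        "Apply the following security guidance when writing the code:\n"
        f"{context}\n\n"
        f"Task: {task}\n"
    )


def llm_complete(prompt: str) -> str:
    """Hypothetical placeholder: substitute any chat-completion API call."""
    raise NotImplementedError


if __name__ == "__main__":
    print(build_prompt("Write a Python function that queries a users table by user name."))
```

Swapping the toy retriever for an embedding index and wiring llm_complete to an actual model API turns the sketch into a working prompt-augmentation layer; the key property, as the abstract notes, is that the base model itself is never retrained.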

Cite this article
[IEEE Style]
김명혁 and 이상진, "Secure Code Generation Using Retrieval-Augmented Generation for Large Language Models," Journal of The Korea Institute of Information Security and Cryptology, vol. 35, no. 3, pp. 535-544, 2025. DOI: 10.13089/JKIISC.2025.35.3.535.

[ACM Style]
김명혁 and 이상진. 2025. Secure Code Generation Using Retrieval-Augmented Generation for Large Language Models. Journal of The Korea Institute of Information Security and Cryptology, 35, 3 (2025), 535-544. DOI: 10.13089/JKIISC.2025.35.3.535.