Secure Code Generation Using Retrieval-Augmented Generation for Large Language Models

Vol. 35, No. 3, pp. 535-544, Jun. 2025
DOI: 10.13089/JKIISC.2025.35.3.535
Keywords: Large Language Model (LLM), Retrieval-Augmented Generation (RAG), Secure Code Generation
Abstract

Large language models (LLMs) such as GPT-4 are being widely adopted across various domains, with significant impact on code generation. However, because these models are trained on public repositories such as GitHub, they often generate code containing security vulnerabilities. Previous attempts to address this issue by training models directly on vulnerability information face several limitations: the difficulty of building accurately labeled datasets that distinguish vulnerable from secure code, the substantial computational resources and time required, and the challenge of keeping the model's vulnerability knowledge up to date. In this study, we propose an alternative approach that leverages retrieval-augmented generation (RAG) to incorporate external security vulnerability knowledge without modifying the base model. Our experiments demonstrate that this RAG-based approach effectively generates more secure code without the need for model retraining.
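The mechanism the abstract describes can be pictured with a short sketch: before generation, the system retrieves relevant vulnerability guidance from an external knowledge base and prepends it to the coding prompt, so security knowledge reaches the model without any weight updates. The code below is a minimal illustration under assumed names, not the paper's pipeline: KNOWLEDGE_BASE, the keyword-overlap retriever, and the llm_complete placeholder are all hypothetical stand-ins (a real system would index CWE/CVE guidance and retrieve with dense embeddings).

```python
# Minimal sketch of RAG-based secure code generation (illustrative only,
# not the authors' implementation). The knowledge base, retriever, and
# llm_complete are hypothetical stand-ins.

import re
from dataclasses import dataclass


@dataclass
class SecurityDoc:
    cwe_id: str
    guidance: str


# Toy external knowledge base (assumption: the real store would hold
# curated secure-coding guidance keyed to vulnerability classes).
KNOWLEDGE_BASE = [
    SecurityDoc("CWE-89", "Use parameterized queries; never build SQL statements by string concatenation."),
    SecurityDoc("CWE-78", "Do not pass user input to a shell; call subprocess with an argument list."),
    SecurityDoc("CWE-798", "Never hard-code credentials; load secrets from the environment or a vault."),
]


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens; punctuation is dropped."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(task: str, k: int = 2) -> list[SecurityDoc]:
    """Rank guidance docs by naive keyword overlap with the coding task
    (a real system would use dense-vector similarity instead)."""
    task_tokens = tokenize(task)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(task_tokens & tokenize(doc.guidance)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(task: str) -> str:
    """Augment the coding task with retrieved security guidance."""
    context = "\n".join(f"- [{d.cwe_id}] {d.guidance}" for d in retrieve(task))
    return (
        "Apply the following security guidance when writing the code:\n"
        f"{context}\n\n"
        f"Task: {task}\n"
    )


def llm_complete(prompt: str) -> str:
    """Hypothetical placeholder: substitute any chat-completion API call."""
    raise NotImplementedError


if __name__ == "__main__":
    print(build_prompt("Write a Python function that queries a users table by user name."))
```

Swapping the toy retriever for an embedding index and wiring llm_complete to an actual model API turns the sketch into a working prompt-augmentation layer; the key property, as the abstract notes, is that the base model itself is never retrained.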

Cite this article
[IEEE Style]
김명혁 and 이상진, "Secure Code Generation Using Retrieval-Augmented Generation for Large Language Models," Journal of The Korea Institute of Information Security and Cryptology, vol. 35, no. 3, pp. 535-544, 2025. DOI: 10.13089/JKIISC.2025.35.3.535.

[ACM Style]
김명혁 and 이상진. 2025. Secure Code Generation Using Retrieval-Augmented Generation for Large Language Models. Journal of The Korea Institute of Information Security and Cryptology, 35, 3 (2025), 535-544. DOI: 10.13089/JKIISC.2025.35.3.535.