Efficient Hangul Word Processor (HWP) Malware Detection Using Semi-Supervised Learning with Augmented Data Utility Valuation

Vol. 34, No. 1, pp. 71-82, Feb. 2024
https://doi.org/10.13089/JKIISC.2024.34.1.71
Keywords: Malware detection, Semi-supervised learning, Data utility, Artificial intelligence, Cybersecurity
Abstract

With the advancement of information and communication technology (ICT), the use of electronic document formats such as PDF, MS Office, and HWP has increased. This trend has led cyber attackers to increasingly spread malicious documents through e-mail and messenger services. To counter such attacks, AI-based methods have been actively employed to detect malicious document files. The main challenge in detecting malicious HWP (Hangul Word Processor) files is the lack of quality datasets: HWP is used mainly in Korea, whereas PDF and MS Office files are used worldwide. To address this limitation, data augmentation has been proposed to diversify training data by transforming existing datasets, but because the usefulness of the augmented data is not evaluated, the augmented data can end up harming the model's performance. In this paper, we propose an effective semi-supervised learning technique for detecting malicious HWP document files, which improves overall AI model performance by quantifying the utility of augmented data and filtering out training data that is not useful.
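The idea of scoring augmented samples and discarding harmful ones can be illustrated with a minimal sketch. The abstract does not state how utility is computed, so everything here is an assumption made for illustration: utility is taken to be the change in validation accuracy when one augmented sample is added to the training set, the classifier is a toy nearest-centroid model, and the two-dimensional feature vectors stand in for features extracted from HWP files. The function names (`fit_centroids`, `utility`, `filter_augmented`) are hypothetical and do not come from the paper.

```python
from statistics import mean

def fit_centroids(X, y):
    """Toy nearest-centroid model: one mean vector per class (assumed stand-in classifier)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

def accuracy(centroids, X, y):
    """Fraction of samples in (X, y) that the model labels correctly."""
    return mean([1.0 if predict(centroids, x) == t else 0.0 for x, t in zip(X, y)])

def utility(sample, label, X_train, y_train, X_val, y_val):
    """Assumed utility: validation accuracy with the sample added, minus the baseline."""
    base = accuracy(fit_centroids(X_train, y_train), X_val, y_val)
    augmented = accuracy(fit_centroids(X_train + [sample], y_train + [label]),
                         X_val, y_val)
    return augmented - base

def filter_augmented(candidates, X_train, y_train, X_val, y_val, threshold=0.0):
    """Keep only augmented (sample, label) pairs whose utility reaches the threshold."""
    return [(x, t) for x, t in candidates
            if utility(x, t, X_train, y_train, X_val, y_val) >= threshold]

# Hypothetical data: label 0 = benign, label 1 = malicious.
X_train = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [7.0, 7.0]]
y_train = [0, 0, 1, 1]
X_val = [[1.5, 1.5], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
y_val = [0, 0, 1, 1]

helpful = ([3.5, 3.5], 1)    # pulls the malicious centroid toward the boundary
harmful = ([-6.0, -6.0], 1)  # distorted sample that drags the centroid into benign territory

kept = filter_augmented([helpful, harmful], X_train, y_train, X_val, y_val)
```

In this toy setup the first candidate raises validation accuracy and is kept, while the second lowers it and is filtered out. Scoring each sample by retraining is expensive on real data; it is used here only because it makes the notion of "utility" concrete.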


Cite this article
[IEEE Style]
손진혁, 김영국, 고기혁, 조호묵, "Efficient Hangul Word Processor (HWP) Malware Detection Using Semi-Supervised Learning with Augmented Data Utility Valuation," Journal of The Korea Institute of Information Security and Cryptology, vol. 34, no. 1, pp. 71-82, 2024. DOI: https://doi.org/10.13089/JKIISC.2024.34.1.71.

[ACM Style]
손진혁, 김영국, 고기혁, and 조호묵. 2024. Efficient Hangul Word Processor (HWP) Malware Detection Using Semi-Supervised Learning with Augmented Data Utility Valuation. Journal of The Korea Institute of Information Security and Cryptology, 34, 1, (2024), 71-82. DOI: https://doi.org/10.13089/JKIISC.2024.34.1.71.