CTINEXUS: Automatic Cyber Threat Intelligence Knowledge Graph Construction Using Large Language Models

Abstract

Textual descriptions in cyber threat intelligence (CTI) reports, such as security articles and news, are rich sources of knowledge about cyber threats, crucial for organizations to stay informed about the rapidly evolving threat landscape. However, current CTI knowledge extraction methods lack flexibility and generalizability, often resulting in inaccurate and incomplete knowledge extraction.

Syntax parsing relies on fixed rules and dictionaries, while model fine-tuning requires large annotated datasets, making both paradigms challenging to adapt to new threats and ontologies. To bridge the gap, we propose CTINexus, a novel framework leveraging optimized in-context learning (ICL) of large language models (LLMs) for data-efficient CTI knowledge extraction and high-quality cybersecurity knowledge graph (CSKG) construction.

Unlike existing methods, CTINexus requires neither extensive data nor parameter tuning and can adapt to various ontologies with minimal annotated examples. This is achieved through: (1) a carefully designed automatic prompt construction strategy with optimal demonstration retrieval for extracting a wide range of cybersecurity entities and relations; (2) a hierarchical entity alignment technique that canonicalizes the extracted knowledge and removes redundancy; and (3) a long-distance relation prediction technique that further completes the CSKG by adding missing links.
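As a rough illustration of step (1), the sketch below retrieves the annotated examples most similar to a new report and assembles them into an in-context prompt. It is a minimal sketch under stated assumptions, not the CTINexus implementation: a simple bag-of-words cosine similarity stands in for whatever retrieval model the framework actually uses, and every function name, the prompt wording, and the toy demonstration pool are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; a stand-in for a real sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_demonstrations(query, pool, k=2):
    """Return the k annotated examples most similar to the query text."""
    q = bag_of_words(query)
    ranked = sorted(pool, key=lambda d: cosine(q, bag_of_words(d["text"])), reverse=True)
    return ranked[:k]


def build_prompt(query, demos):
    """Assemble a prompt: instruction, demonstrations, then the query."""
    parts = ["Extract (subject, relation, object) triples from the CTI text."]
    for d in demos:
        triples = "; ".join(f"({s}, {r}, {o})" for s, r, o in d["triples"])
        parts.append(f"Text: {d['text']}\nTriples: {triples}")
    parts.append(f"Text: {query}\nTriples:")
    return "\n\n".join(parts)


# Hypothetical annotated pool; a real pool would follow the target ontology.
pool = [
    {"text": "The ransomware encrypts files and demands payment in Bitcoin.",
     "triples": [("ransomware", "encrypts", "files")]},
    {"text": "Attackers sent phishing emails with a malicious attachment.",
     "triples": [("attackers", "sent", "phishing emails")]},
    {"text": "The botnet launched a DDoS attack against the bank.",
     "triples": [("botnet", "launched", "DDoS attack")]},
]

query = "A phishing campaign delivers a malicious attachment to employees."
demos = retrieve_demonstrations(query, pool, k=2)
prompt = build_prompt(query, demos)
print(prompt)
```

The resulting prompt would then be sent to an LLM, which completes the final "Triples:" line. Because only the demonstration pool changes, adapting to a different ontology in this sketch means swapping the annotated examples rather than retraining a model.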

Our extensive evaluations using 150 real-world CTI reports collected from 10 platforms demonstrate that CTINexus significantly outperforms existing methods in constructing accurate and complete CSKGs, highlighting its potential to transform CTI analysis with an efficient and adaptable solution for the dynamic threat landscape.

Authors

Yutong Cheng, Virginia Tech
Osama Bajaber, Virginia Tech
Saimon Amanuel Tsegai, Virginia Tech
Dawn Song, University of California, Berkeley
Peng Gao, Virginia Tech

Publication

Venue: In Proceedings of the 10th IEEE European Symposium on Security and Privacy, Euro S&P, 2025
Date: 2/12/2025
