AI-Powered Vulnerability Detection: Multi-LLM Orchestration Secures Rust's Unsafe Code

Discover how multi-LLM orchestration and symbolic execution are revolutionizing memory vulnerability detection in incomplete Rust CVEs, enhancing software security and compliance.

      Rust has rapidly gained traction as a systems programming language, largely due to its innovative ownership system that virtually eliminates common memory safety errors like buffer overflows and use-after-free bugs. This intrinsic safety is a significant advantage over languages like C and C++, making Rust a preferred choice for building robust and secure applications. However, this safety net has one crucial exception: `unsafe` code blocks. These blocks are essential for low-level operations, interacting with hardware, or integrating with code written in other languages via Foreign Function Interfaces (FFI). While necessary, they reintroduce the very memory vulnerabilities Rust typically prevents, posing a growing challenge for software security.

      The problem is compounded by how vulnerabilities are documented. Common Vulnerabilities and Exposures (CVE) databases often provide only isolated code snippets. These fragments lack critical context such as struct definitions, trait implementations, import statements, and project configuration files (Cargo manifests). Formal verification tools and symbolic analyzers such as Kani, Prusti, Creusot, and Haybale require a complete, compilable project, so on these fragments they fail at compilation and produce no output. That leaves them ineffective against the very real-world vulnerability examples they are meant to address, and deep, execution-level analysis of incomplete Rust code remains largely out of reach for automated solutions.

The "Incomplete Code" Challenge and Limitations of Current Tools

      Traditional methods fall short in analyzing real-world Rust CVE snippets. Clippy offers lint-style pattern matching, which catches some obvious issues but misses path-dependent bugs: vulnerabilities that only emerge under specific execution flows. Miri, another Rust tool, checks for undefined behavior at run time, but its reports often carry generic labels that require further manual investigation. Symbolic execution tools such as KLEE are powerful for uncovering deep bugs because they systematically explore feasible execution paths. However, KLEE operates on LLVM bitcode, typically generated from fully compiled C/C++ code, and requires a complete FFI harness, which raw Rust snippets cannot provide. Despite the existence of sophisticated analysis techniques, the practical reality of incomplete CVE data therefore leaves a significant blind spot in automated Rust security.
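
      To make "path-dependent" concrete, here is a toy sketch in Python. The function, buffer size, and inputs are all hypothetical, and exhaustive enumeration over a tiny input space merely stands in for symbolic exploration; it is not how KLEE works internally. The point is only that the faulty access appears on one branch and for some inputs, so a check that looks at the shape of the code without following its paths would likely miss it.

```python
from itertools import product

BUF_LEN = 8  # hypothetical buffer size for this toy example


def read_slot(buf, idx, use_offset):
    """Toy stand-in for an unchecked read: only one branch can go out of bounds."""
    if use_offset and idx > 2:
        idx += 4  # the bug: idx is not re-validated after this adjustment
    if idx >= len(buf):
        # In unsafe Rust or C this would be a silent out-of-bounds read;
        # here we raise so the failing path is observable.
        raise IndexError(f"out-of-bounds read at index {idx}")
    return buf[idx]


def find_failing_inputs():
    """Try every (idx, use_offset) pair and collect the ones that fail."""
    buf = list(range(BUF_LEN))
    failures = []
    for idx, use_offset in product(range(BUF_LEN), (False, True)):
        try:
            read_slot(buf, idx, use_offset)
        except IndexError:
            failures.append((idx, use_offset))
    return failures


failing = find_failing_inputs()
print(failing)
```

      Only the inputs that take the offset branch with a large enough index fail; every call with `use_offset=False` is safe, which is what makes the bug depend on the path taken rather than on any single line.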

      The core issue is that incomplete code snippets from CVE databases cannot be compiled or properly understood by existing automated analysis tools. Without the full context (the definitions of data structures, necessary external libraries, and build configurations) these tools cannot even begin their analysis. This isn't just an inconvenience; it is a real gap in security coverage. If automated tools cannot analyze these snippets, developers and security researchers are forced to reconstruct the missing context by hand, a time-consuming, error-prone process that does not scale. The result is a severe mismatch between the information available in vulnerability databases and the inputs that state-of-the-art security analysis tools require, especially for the complex `unsafe` blocks in Rust.

A Novel Multi-Agent AI Approach for Rust Security

      To bridge this "incomplete code" gap, a recent academic paper describes a system that combines symbolic execution with a 4-agent multi-LLM (Large Language Model) architecture (Source: Symbolic Execution Meets Multi-LLM Orchestration: Detecting Memory Vulnerabilities in Incomplete Rust CVE Snippets). The pipeline transforms incomplete Rust CVE snippets into KLEE-compatible FFI harnesses, enabling deep vulnerability analysis. At the heart of this system is a collaborative network of specialized AI agents, each designed for a specific task:

  • Oracle/Validator (GPT-4 Turbo): This agent acts as the strategic planner, receiving the raw Rust CVE snippet, identifying likely vulnerability types, and generating a structured analysis plan. This plan guides subsequent agents, ensuring they focus on relevant vulnerability classes.
  • Safety Checker (Claude Opus): With the analysis plan in hand, this agent performs a deep security assessment, assigns a risk score (0-10), and annotates critical lines of code where vulnerabilities might reside.
  • Code Specialist (Claude Sonnet): This agent leverages the outputs from the previous two agents to generate a complete, compilable Rust FFI wrapper. This wrapper replicates the vulnerability class within KLEE-compatible types, allowing the system to then synthesize the corresponding C harness. For organizations requiring such advanced code generation capabilities, ARSA Technology offers custom AI solutions tailored to specific development and security needs.
  • Fast Filter (GPT-4o-mini): Based on the assigned risk score, this agent intelligently selects optimal KLEE parameters, such as search strategy and time/memory limits, to optimize the symbolic execution process.
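
      As a rough illustration of the Fast Filter's role, the sketch below maps a 0-10 risk score to a KLEE command line. The thresholds, time budgets, and memory limits are assumptions made for this example, not values from the paper, and `harness.bc` is a hypothetical bitcode file name. The options `--search`, `--max-time`, and `--max-memory` are standard KLEE flags; the duration-string form of `--max-time` assumes a reasonably recent KLEE release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KleeConfig:
    search: str         # value for KLEE's --search option
    max_time: str       # value for --max-time (duration string)
    max_memory_mb: int  # value for --max-memory (megabytes)

    def to_args(self, bitcode_path):
        """Build the argument list for a KLEE invocation."""
        return [
            "klee",
            f"--search={self.search}",
            f"--max-time={self.max_time}",
            f"--max-memory={self.max_memory_mb}",
            bitcode_path,
        ]


def select_klee_config(risk_score):
    """Pick KLEE parameters from a 0-10 risk score (thresholds are illustrative)."""
    if not 0 <= risk_score <= 10:
        raise ValueError("risk_score must be between 0 and 10")
    if risk_score >= 8:
        # High risk: coverage-guided search with a generous budget.
        return KleeConfig("nurs:covnew", "300s", 4000)
    if risk_score >= 4:
        # Medium risk: random-path search with a moderate budget.
        return KleeConfig("random-path", "120s", 2000)
    # Low risk: quick depth-first pass.
    return KleeConfig("dfs", "30s", 1000)


args = select_klee_config(9).to_args("harness.bc")
print(" ".join(args))
```

      Keeping the choice in a small pure function makes it easy to test and to swap for a model-driven decision later, which is closer to what the article describes.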


      This orchestrated approach allows the system to synthesize the necessary context (FFI wrappers) that traditional tools require, effectively making previously unanalyzable code segments accessible for deep security analysis.
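
      The hand-off between the four roles can be sketched as a simple sequential pipeline that passes one shared context object from agent to agent. Everything below is a stand-in: the agent functions are deterministic stubs rather than LLM calls, the field names are hypothetical, and the scoring rule is invented for the example. It is meant to show the shape of structured context passing, not the paper's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisContext:
    """Shared state handed from one agent to the next (field names are hypothetical)."""
    snippet: str
    plan: dict = field(default_factory=dict)
    risk_score: int = 0
    flagged_lines: list = field(default_factory=list)
    wrapper_source: str = ""
    klee_params: dict = field(default_factory=dict)


def oracle(ctx):
    # Stub planner: guess vulnerability classes from simple textual cues.
    suspected = []
    if "unsafe" in ctx.snippet:
        suspected.append("memory-safety")
    if "*mut" in ctx.snippet or "*const" in ctx.snippet:
        suspected.append("raw-pointer-misuse")
    ctx.plan = {"suspected_classes": suspected}
    return ctx


def safety_checker(ctx):
    # Stub assessor: flag lines containing `unsafe` and derive a 0-10 score.
    ctx.flagged_lines = [
        number
        for number, line in enumerate(ctx.snippet.splitlines(), start=1)
        if "unsafe" in line
    ]
    raw = 3 * len(ctx.plan["suspected_classes"]) + 2 * len(ctx.flagged_lines)
    ctx.risk_score = min(10, raw)
    return ctx


def code_specialist(ctx):
    # Stub generator: a real agent would emit a compilable Rust FFI wrapper.
    classes = ", ".join(ctx.plan["suspected_classes"]) or "none"
    ctx.wrapper_source = f"// placeholder FFI wrapper targeting: {classes}"
    return ctx


def fast_filter(ctx):
    # Stub tuner: a larger time budget for riskier snippets.
    ctx.klee_params = {"max_time_s": 300 if ctx.risk_score >= 7 else 60}
    return ctx


PIPELINE = (oracle, safety_checker, code_specialist, fast_filter)


def run_pipeline(snippet):
    ctx = AnalysisContext(snippet=snippet)
    for agent in PIPELINE:
        ctx = agent(ctx)
    return ctx


snippet = "fn get(p: *const u8, i: usize) -> u8 {\n    unsafe { *p.add(i) }\n}"
result = run_pipeline(snippet)
print(result.risk_score, result.flagged_lines, result.klee_params)
```

      Because each stage reads only what earlier stages wrote into the context, any stub can be replaced by a real model call without changing the others.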

Unveiling Deeper Vulnerabilities

      The effectiveness of this multi-agent system is evident in its empirical evaluation against 31 real-world Rust CVEs, covering 11 Common Weakness Enumeration (CWE) categories. The results are compelling: the system achieved a remarkable 90.3% wrapper compilation success rate. This contrasts sharply with existing state-of-the-art formal verification tools, which consistently reported a 0% success rate on these incomplete snippets, due to their inability to compile the fragmented code.

      Beyond compilation, the system also detected more vulnerabilities. It identified 1,206 critical errors across 26 files, an 83.9% detection rate over the 31 CVEs. In comparison, Clippy, a widely used Rust linter, found only 14 warnings across 11 files (35.5% detection), while Miri provided generic labels that lacked actionable detail. The paper reports a 26.8x increase in detected issues, which highlights the system's capacity to uncover deeply embedded, path-dependent vulnerabilities that evade simpler static analysis. This ability to transform raw, isolated data into actionable security intelligence is a cornerstone of modern cybersecurity, reminiscent of how ARSA's AI Video Analytics convert passive CCTV streams into real-time operational insights.
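
      The percentages quoted above follow from file counts over the 31 evaluated CVEs, as the short check below shows. The counts of 26 and 11 files come from the text; the figure of 28 compiled wrappers is inferred here from the 90.3% success rate and is not stated in the article.

```python
TOTAL_CVES = 31  # number of CVEs in the evaluation


def rate(count, total=TOTAL_CVES):
    """Percentage of the evaluated CVEs, rounded to one decimal place."""
    return round(100 * count / total, 1)


files_with_klee_errors = 26  # from the text
files_with_clippy_warnings = 11  # from the text
compiled_wrappers = 28  # inferred from the 90.3% figure, not stated directly

detection_rate = rate(files_with_klee_errors)
clippy_rate = rate(files_with_clippy_warnings)
compile_rate = rate(compiled_wrappers)

print(detection_rate, clippy_rate, compile_rate)
```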

Beyond Detection: Structured Insights with Graph Databases

      The system does not stop at reporting errors after symbolic execution. A post-processing component, `graph_klee.py`, ingests the structured JSON vulnerability report from the KLEE stage and constructs a graph database, transforming raw output into an interconnected knowledge base. In this database, nodes represent entities such as CVE files, CWE categories, specific error types, and symbolic execution paths. Edges define the relationships between these nodes, providing a framework for structured cross-CVE vulnerability queries.

      This graph database approach offers significant advantages:

  • Cross-CVE Analysis: Security researchers can easily query relationships between different vulnerabilities, understanding common patterns or dependencies across multiple CVEs.
  • Pattern Clustering: The database facilitates the identification of recurring vulnerability patterns, which can inform proactive security measures and more robust coding practices.
  • Structured Export: The organized data can be readily exported for further downstream analysis, integration with other security tools, or compliance reporting.


      This structured approach to vulnerability data management is vital for ongoing security improvement and for understanding the broader landscape of software weaknesses.
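
      A minimal sketch of the idea, using only the Python standard library, might look like the following. The JSON layout, the field names, and the file names `cve_a.rs` and `cve_b.rs` are assumptions made for illustration; the article does not describe the actual report schema or the storage backend used by `graph_klee.py`, and an in-memory adjacency map stands in for a real graph database.

```python
import json
from collections import defaultdict

# Hypothetical report layout; the real schema is not given in the article.
REPORT_JSON = """
[
  {"cve_file": "cve_a.rs", "cwe": "CWE-125",
   "errors": [{"type": "out_of_bounds_pointer", "path": "path_1"}]},
  {"cve_file": "cve_b.rs", "cwe": "CWE-416",
   "errors": [{"type": "out_of_bounds_pointer", "path": "path_7"},
              {"type": "use_after_free", "path": "path_9"}]}
]
"""


def build_graph(report):
    """Build an undirected graph as an adjacency map of (kind, name) nodes."""
    graph = defaultdict(set)

    def link(a, b):
        graph[a].add(b)
        graph[b].add(a)

    for entry in report:
        cve = ("cve", entry["cve_file"])
        link(cve, ("cwe", entry["cwe"]))
        for err in entry["errors"]:
            error = ("error", err["type"])
            path = ("path", err["path"])
            link(cve, error)
            link(cve, path)
            link(error, path)
    return graph


def cves_sharing_error(graph, error_type):
    """Cross-CVE query: which CVE files exhibit a given error type?"""
    neighbours = graph[("error", error_type)]
    return sorted(name for kind, name in neighbours if kind == "cve")


graph = build_graph(json.loads(REPORT_JSON))
shared = cves_sharing_error(graph, "out_of_bounds_pointer")
print(shared)
```

      The same adjacency map supports the other uses listed above: grouping CVE nodes by their error neighbours gives a crude form of pattern clustering, and the map can be serialized for export.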

The Power of Orchestrated AI

      The research also rigorously evaluated the efficacy of the multi-agent architecture itself. A comparison between the specialized 4-agent system and a single general-purpose LLM baseline revealed significant performance differences. The 4-agent architecture dramatically reduced wrapper compilation failures from 42% (with a single agent) to just 9.7%. Moreover, the multi-agent system increased the number of detected errors from 487 to 1,206.
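
      Expressed in relative terms, the figures in this comparison work out as follows. This is plain arithmetic on the numbers quoted above, with no additional data assumed.

```python
single_agent_failure_pct = 42.0  # wrapper compilation failures, single LLM
multi_agent_failure_pct = 9.7    # wrapper compilation failures, 4-agent system
single_agent_errors = 487        # errors detected, single LLM
multi_agent_errors = 1206        # errors detected, 4-agent system

# Relative drop in the compilation failure rate, in percent.
relative_failure_reduction = round(
    100 * (single_agent_failure_pct - multi_agent_failure_pct) / single_agent_failure_pct, 1
)

# How many times more errors the 4-agent system detected.
error_ratio = round(multi_agent_errors / single_agent_errors, 2)

print(relative_failure_reduction, error_ratio)
```

      In other words, the failure rate falls by roughly three quarters and the number of detected errors grows by about two and a half times.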

      These results indicate that role specialization and structured context passing among multiple AI agents produce measurably better outcomes than relying on a single, general-purpose model. This finding underscores the growing trend towards multi-agent systems for tackling complex, multi-stage problems in AI, where breaking down tasks and assigning them to specialized models leads to better precision, efficiency, and reliability. This type of advanced AI deployment is a core focus for ARSA Technology, which has been developing and deploying complex AI solutions for various industries since 2018.

      This innovative blend of symbolic execution and multi-LLM orchestration marks a significant leap forward in automated memory vulnerability detection for Rust's `unsafe` code. By overcoming the "incomplete code" challenge, it enables unprecedented depth of analysis for real-world CVEs, significantly enhancing the security posture of critical software systems.

      **Source:** Abdelrazek, Z., & Lee, Y. (2026). Symbolic Execution Meets Multi-LLM Orchestration: Detecting Memory Vulnerabilities in Incomplete Rust CVE Snippets. In Proceedings of the 6th International Workshop on Software Security Engineering (SSE ’26), Glasgow, Scotland, United Kingdom. https://arxiv.org/abs/2605.00034