Securing AI Agents: Resolving the Granularity Mismatch in Trust and Authority
Explore how argument-level provenance and capability contracts enhance AI agent security, preventing indirect prompt injection attacks by distinguishing trusted data from authority-bearing arguments.
The Unseen Threat to AI Agents
As artificial intelligence continues to integrate into enterprise operations, Large Language Model (LLM) agents are becoming indispensable. These agents are tasked with increasingly complex workflows, such as processing external data from webpages or emails, and then performing actions through privileged Application Programming Interfaces (APIs). This powerful combination, however, introduces a critical security vulnerability: indirect prompt injection. Unlike traditional cyberattacks that target system infrastructure, indirect prompt injection involves embedding malicious instructions within seemingly benign content that the agent is designed to process. When the agent later acts on this untrusted content, it can be manipulated into executing unintended or harmful commands, thereby compromising security.
The core of this risk lies not in the mere presence of untrusted text, but in its potential to steer the agent's authority. For instance, an agent might be asked to summarize a webpage and email it to a specific recipient. The webpage content is legitimately supposed to influence the email body. However, if the malicious instructions within that webpage can alter the email recipient, it becomes a severe security breach. This scenario highlights a fundamental challenge in current AI agent security.
The Granularity Mismatch: Why Current Defenses Fall Short
Many existing security measures for AI agents attempt to mediate trust at a broad level, often treating an entire tool invocation (like sending an email or executing a command) as a single trust boundary. This "whole-call policy" creates a significant problem, as outlined in a recent academic paper (Linfeng Fan et al., "The Granularity Mismatch in Agent Security," arXiv:2605.11039v1, 2026). In mixed-trust workflows, where some arguments of a call legitimately depend on external content while others must not, this coarse granularity forces a difficult choice.
If a defense blocks any tool call where any argument is influenced by external, untrusted content, it disrupts legitimate "retrieval-then-act" behaviors, severely limiting the agent's utility. Conversely, if it permits the call, it leaves "authority-bearing arguments"—those that direct the agent's power, like a recipient, a target URL, or a file path—vulnerable to hijacking. This creates a "granularity mismatch," where the security boundary needs to exist within the tool call, differentiating between arguments with varying levels of trust requirements.
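The dilemma can be made concrete with a small sketch. The code below is an illustration only, not taken from the paper: the `Arg` type, the `untrusted` flag, and both policy functions are hypothetical names chosen for this example. It shows that a policy deciding on the whole call cannot both allow the benign email and block the hijacked one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arg:
    value: str
    untrusted: bool  # hypothetical flag: True if derived from external content


def strict_whole_call(args: dict[str, Arg]) -> bool:
    """Allow the call only if no argument is influenced by untrusted content."""
    return not any(a.untrusted for a in args.values())


def permissive_whole_call(args: dict[str, Arg]) -> bool:
    """Allow every call, regardless of where its arguments came from."""
    return True


# Benign: the user chose the recipient; the body summarizes a webpage.
benign = {
    "recipient": Arg("alice@example.com", untrusted=False),
    "body": Arg("Summary of the page...", untrusted=True),
}

# Attack: injected text on the page has replaced the recipient.
attack = {
    "recipient": Arg("attacker@example.net", untrusted=True),
    "body": Arg("Summary of the page...", untrusted=True),
}

# Each tuple is (benign allowed?, attack allowed?). The desired outcome
# is (True, False), which neither whole-call policy produces.
results = {
    "strict": (strict_whole_call(benign), strict_whole_call(attack)),
    "permissive": (permissive_whole_call(benign), permissive_whole_call(attack)),
}
print(results)  # {'strict': (False, False), 'permissive': (True, True)}
```

The strict policy loses utility on the benign call, and the permissive policy loses security on the attack. Separating the two cases requires looking at which argument the untrusted data reached.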
PACT: A New Paradigm for Agent Security
To address this critical challenge, researchers have proposed a novel approach called PACT (Provenance-Aware Capability Contracts). PACT introduces an argument-level runtime monitor that reframes AI agent security as an issue of "authority binding." Instead of treating an entire tool invocation as uniformly trusted or untrusted, PACT recognizes that trust is a property specific to what each argument does. Arguments like email recipients, shell commands, or database queries are "authority-bearing" because they direct the agent's power. Other arguments, such as a summary or a report body, are merely "content" and may legitimately incorporate external information.
PACT's design permits external information where the task semantics require it, while rigorously preventing the same information from binding privileged destinations, commands, or sensitive data. This allows AI agents to operate effectively in complex environments without sacrificing critical security controls. Implementing such nuanced security frameworks is an area where advanced AI solution providers, like ARSA Technology, can help enterprises design and deploy robust systems. Our custom AI solutions are built with these intricate security considerations in mind, ensuring practical and secure AI deployments.
How PACT Works: Argument-Level Control
The PACT system operates by equipping each tool with an "argument-level contract." These contracts assign "semantic roles" to each argument, such as target, command, credential, or content. During an agent's execution, PACT diligently tracks the "provenance" (origin) of values as they move through various "replanning steps" and tool-call chains. Before any tool invocation, PACT checks each argument against the trust requirements stipulated by its assigned role.
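One way to picture cross-step provenance tracking is as a set of origin labels that travels with every value and is merged whenever a new value is derived from old ones. The following is a minimal sketch under that assumption. The `Source`, `Tracked`, and `derive` names are hypothetical, and the paper's actual tracking mechanism may differ.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    USER = "user"          # the user's original request
    EXTERNAL = "external"  # webpages, emails, other retrieved content


@dataclass(frozen=True)
class Tracked:
    value: str
    sources: frozenset  # every origin that influenced this value


def from_user(text: str) -> Tracked:
    return Tracked(text, frozenset({Source.USER}))


def from_external(text: str) -> Tracked:
    return Tracked(text, frozenset({Source.EXTERNAL}))


def derive(new_value: str, *inputs: Tracked) -> Tracked:
    """A derived value inherits the union of its inputs' sources."""
    sources = frozenset().union(*(i.sources for i in inputs))
    return Tracked(new_value, sources)


# Step 1: the user names the recipient and a greeting; the agent fetches a page.
recipient = from_user("alice@example.com")
greeting = from_user("Hi Alice,")
page = from_external("<html>...page text...</html>")

# Step 2: the agent summarizes the page; the summary stays externally sourced.
summary = derive("The page describes...", page)

# Step 3: the email body combines user text with the external summary.
body = derive(greeting.value + "\n" + summary.value, greeting, summary)
```

After three steps, `body` still records that external content influenced it, while `recipient` records only the user as its origin. A monitor can then check those labels at the moment of the tool call, however many replanning steps came before.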
For instance, an argument designated as a 'target' (e.g., an email recipient) might require its value to originate directly from the user's initial request, preventing untrusted external sources from altering it. Conversely, an argument designated as 'content' (e.g., the body of an email) could be allowed to inherit its value from an untrusted webpage, as long as that value is never used to bind an authority-bearing argument. The framework is not a detector for malicious strings but a structural constraint on what untrusted data is allowed to control, which supports the security-by-design approach that enterprise AI requires. For example, in deployments utilizing ARSA AI Box Series for edge analytics, such fine-grained control helps preserve data integrity and operational reliability even in sensitive environments.
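The check described above can be sketched as a lookup from argument name to semantic role, followed by a provenance test for authority-bearing roles. This is a hedged illustration rather than PACT's implementation. The four role names come from the text, but the `check_call` function, the contract dictionary, the string source labels, and the rule that authority-bearing arguments must be sourced only from the user are assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    TARGET = "target"
    COMMAND = "command"
    CREDENTIAL = "credential"
    CONTENT = "content"


# Assumption: these roles direct the agent's power and need trusted origins.
AUTHORITY_ROLES = {Role.TARGET, Role.COMMAND, Role.CREDENTIAL}
TRUSTED_SOURCES = frozenset({"user"})


@dataclass(frozen=True)
class Arg:
    value: str
    sources: frozenset  # origin labels, e.g. {"user"} or {"user", "external"}


# Hypothetical contract for a send_email tool.
SEND_EMAIL_CONTRACT = {"recipient": Role.TARGET, "body": Role.CONTENT}


def check_call(contract: dict[str, Role], args: dict[str, Arg]) -> list[str]:
    """Return the names of arguments that violate the contract (empty = allow)."""
    violations = []
    for name, arg in args.items():
        role = contract.get(name)
        if role is None:
            violations.append(name)  # argument not in the contract: fail closed
        elif role in AUTHORITY_ROLES:
            # Authority-bearing values need a known, fully trusted origin.
            if not arg.sources or not arg.sources <= TRUSTED_SOURCES:
                violations.append(name)
        # CONTENT arguments may carry external provenance.
    return violations


benign = {
    "recipient": Arg("alice@example.com", frozenset({"user"})),
    "body": Arg("Summary of the page...", frozenset({"user", "external"})),
}
attack = {
    "recipient": Arg("attacker@example.net", frozenset({"external"})),
    "body": Arg("Summary of the page...", frozenset({"user", "external"})),
}

print(check_call(SEND_EMAIL_CONTRACT, benign))  # []
print(check_call(SEND_EMAIL_CONTRACT, attack))  # ['recipient']
```

The check never inspects the text of the value for suspicious instructions. It only asks which role the argument plays and where its value came from, which is what makes the constraint structural rather than a string detector.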
Tangible Benefits and Real-World Impact
The effectiveness of PACT has been demonstrated through rigorous evaluations. In controlled diagnostic suites using "oracle provenance" (an idealized scenario with perfect knowledge of data origins), PACT achieved 100% utility and 100% security. This highlights its potential to eliminate the security–utility tradeoff that often plagues traditional, coarser-grained monitors, which frequently suffer from false positives (blocking benign actions) or false negatives (allowing attacks).
In more realistic "AgentDojo deployments" across various LLMs, PACT continued to show strong performance. For the three most capable models, it achieved 100% security while recovering 38.1–46.4% utility, outperforming other leading defenses like CaMeL by 8–16 percentage points at the same security level. Ablation studies further confirmed that both "semantic roles" and "cross-step provenance" are indispensable for PACT's success. These results translate into tangible business benefits, including reduced operational risk, enhanced compliance, and the ability to deploy powerful AI agents without compromising data integrity or system security. Organizations can confidently use real-time systems like AI Video Analytics, knowing that advanced security protocols are in place.
Beyond the Runtime: Future Directions in Agent Security
By isolating the core problem, PACT also points to the next set of challenges in AI agent security. The remaining errors in deployment environments concentrate in two areas: "provenance inference" (automatically determining the origin of data) and "contract synthesis" (automatically generating appropriate security contracts for tool arguments). Both remain open research and development bottlenecks, and further work is needed before argument-level security can be fully automated.
This research, "The Granularity Mismatch in Agent Security: Argument-Level Provenance Solves Enforcement and Isolates the LLM Reasoning Bottleneck" by Linfeng Fan et al., provides a foundational shift in how we approach securing sophisticated AI agents, moving beyond simple content filtering to a more nuanced, structural enforcement of authority. Such advancements are crucial for the continued safe and effective deployment of AI across various industries.
To learn more about implementing secure, high-performance AI and IoT solutions for your enterprise, explore ARSA Technology's offerings and contact ARSA for a free consultation.