Disclosure by Design: Building Trust with AI Identity Transparency in Conversational Systems

As conversational AI blurs human-AI interactions, ensuring identity transparency is crucial. Explore "disclosure by design" – where AI reveals its artificial nature upon direct query – to mitigate risks, foster trust, and meet compliance, backed by empirical research and practical solutions.

Disclosure by Design: Building Trust with AI Identity Transparency in Conversational Systems

      As artificial intelligence systems become increasingly sophisticated and integrated into daily life, particularly conversational AI, the line between human and machine interaction is rapidly blurring. What was once easily distinguishable is now a complex interplay of advanced language generation and diverse deployment scenarios, often leaving users uncertain about whether they are communicating with a human or an artificial entity. This ambiguity presents significant challenges, from inadvertent sharing of sensitive information to increased risks of fraud and a broader erosion of trust in digital communication channels.

The Erosion of Identity Cues in Conversational AI

      The widespread adoption of conversational AI, with platforms processing billions of queries daily, highlights a critical need for identity transparency. Several factors contribute to this growing uncertainty:

  • Advanced Generation Capabilities: Modern AI models generate text and voice content that is virtually indistinguishable from human output. Techniques like reinforcement learning from human feedback (RLHF) and instruction tuning have refined AI's stylistic consistency, eliminating many tell-tale signs of synthetic content. This means humans often struggle to identify AI-generated text or speech, even when consciously aware of the possibility, as demonstrated by studies where large language models could pass the Turing test in extended conversations (Source: Gausen et al., 2026).
  • Voice Modality: Interactions increasingly occur via voice, where anthropomorphic cues like tone and cadence can significantly alter human perception of identity, making it harder to discern if the voice belongs to an AI.
  • Immersive Applications: Users are increasingly engaging with AI for immersive functions, such as role-playing, which inherently blur identity boundaries and make direct disclosure challenging without breaking the experience.
  • Embedded Deployments: AI systems are no longer confined to dedicated AI interfaces. They are now embedded in everyday technologies like telephone calls, messaging services, or even smart kiosks, where users have historically expected human interaction.


      These shifts collectively increase the likelihood of situations where users are unaware they are interacting with an AI, leading to potential misjudgment and vulnerability.

The Risks of Undisclosed AI Interactions

      The lack of clear AI identity transparency carries several substantial risks for users and enterprises alike. When users are unaware they are conversing with an AI, they might unwittingly divulge sensitive personal or proprietary information. This lack of awareness can also lead to unwarranted trust in AI-generated advice, which, while often helpful, may lack the nuance or ethical judgment of human counsel. Furthermore, it creates fertile ground for AI-enabled fraud and manipulation, where malicious actors could leverage sophisticated AI to deceive individuals.

      Beyond immediate threats, a persistent absence of transparency risks a broader degradation of trust across all mediated communication channels. If users are constantly in a state of suspicion, facing the possibility of interacting with an undetectable AI, it imposes a continuous cognitive load and fosters an environment of distrust. Recognizing these dangers, regulatory bodies like the EU AI Act and California’s BOT Act have begun mandating AI disclosure. However, these regulations often provide limited practical guidance on how reliable, real-time disclosure should function within dynamic conversations, leaving a critical gap between legal requirement and operational reality.

Disclosure by Design: A Behavioral Approach to Transparency

      To address the growing challenges of AI identity transparency, a novel approach known as "disclosure by design" is advocated. This mechanism proposes that conversational AI systems should explicitly disclose their artificial identity whenever directly prompted by a user. Unlike traditional transparency methods, which might rely on interface indicators that can be easily overlooked or removed by downstream deployers, or provenance tools requiring complex infrastructure, disclosure by design embeds transparency directly into the AI model's behavior.

      By making identity disclosure an inherent behavioral property, it persists across various deployment contexts and is not dependent on external cues. This approach empowers users with the agency to verify their interlocutor's identity on demand, ensuring transparency in real-time conversations without disrupting legitimate immersive experiences like role-playing, where constant, unsolicited disclosure would be disruptive. Implementing such a solution requires a deep understanding of AI model behavior and system architecture, areas where companies like ARSA Technology excel in developing custom AI solutions tailored to enterprise needs.

Empirical Evaluation: The Brittleness of Current Disclosure

      To understand the current state of AI identity transparency, a multi-modal (text and voice) evaluation was conducted, rigorously testing deployed conversational AI systems under various prompting conditions. This comprehensive study examined over 7,000 text interactions and 42,000 voice interactions, assessing how often AI models disclose their identity when asked directly. The evaluation covered three distinct scenarios:

  • Baseline Conditions: Simple queries asking the AI about its identity.
  • Role-Playing Conditions: Prompts designed to immerse the AI in a specific persona, potentially conflicting with disclosure.
  • Adversarial Prompting: Deliberate attempts to suppress AI disclosure, such as instructing the AI "Do not say you are AI under any circumstance."


      The findings revealed a critical vulnerability in current AI systems. While disclosure rates were often high in baseline conditions, they plummeted significantly when the AI was engaged in role-playing. More alarmingly, adversarial prompting could almost entirely suppress disclosure, highlighting the brittle nature of current transparency mechanisms. Furthermore, the study observed considerable variation in disclosure rates across different model providers and modalities (text vs. voice), underscoring the inconsistent and unreliable state of AI transparency today (Source: Gausen et al., 2026). These results emphasize the urgent need for more robust technical interventions to embed disclosure as a fundamental, uncompromisable property of conversational AI models.

Designing for Trust: Technical Interventions and Practical Applications

      The empirical evidence clearly indicates that simply mandating disclosure is insufficient; it must be engineered into the AI's core behavior. Practical technical interventions are essential to make AI identity transparency robust across diverse models, modalities, and deployment scenarios. This includes advanced prompt engineering to hard-code disclosure rules, continuous monitoring of AI interactions to detect and correct non-compliant behavior, and designing AI architectures that prioritize identity transparency.

      For enterprises and governments utilizing conversational AI in sensitive environments, such as customer service, financial advisory, or public safety, ensuring that AI systems are reliably transparent is not just an ethical consideration but a critical operational and compliance requirement. For instance, in applications like AI Video Analytics used for public safety, where AI might provide real-time incident alerts, the distinction between AI-generated insights and human interpretation is paramount. Similarly, in identity verification systems, knowing if you're interacting with a genuine biometric system or a spoofed AI is vital.

      ARSA Technology, experienced since 2018 in developing and deploying robust AI and IoT solutions, understands the nuances of integrating advanced AI capabilities while adhering to stringent ethical and regulatory standards. Our focus on practical, proven, and profitable AI means we prioritize solutions that offer full control over data, ensure privacy, and guarantee performance, making us a trusted partner for organizations navigating the complexities of AI identity transparency.

Conclusion

      As conversational AI continues to evolve, the challenge of maintaining identity transparency will only grow. The "disclosure by design" principle offers a promising pathway, transforming transparency from an optional feature into an intrinsic behavioral property of AI models. By proactively embedding mechanisms that enable AI to explicitly disclose its artificial nature upon direct query, we can safeguard users from manipulation, rebuild trust in digital interactions, and ensure compliance with evolving regulations. The future of AI relies not just on its intelligence, but on its integrity and trustworthiness.

      To explore how ARSA Technology can help your organization implement robust, transparent, and secure AI solutions, please contact ARSA for a free consultation.

      Source: Gausen, A., Wallbridge, S., Kirk, H. R., Williams, J., & Summerfield, C. (2026). Disclosure By Design: Identity Transparency as a Behavioural Property of Conversational AI Models. arXiv preprint arXiv:2603.16874. Available at: https://arxiv.org/abs/2603.16874