Safeguarding Adaptive AI in Healthcare: An Overview of the AEGIS Governance Framework

Explore AEGIS, an operational infrastructure for adaptive medical AI governance under US FDA and EU regulations. Learn how it ensures safety, enables continuous improvement, and addresses regulatory challenges.

      Artificial intelligence and machine learning (AI/ML) systems are rapidly transforming healthcare, offering unprecedented capabilities for diagnostics, treatment planning, and predictive analytics. Unlike traditional medical devices, these AI models are often designed to learn and adapt over time, continuously improving their performance with new data. While this adaptability is a significant advantage, it presents unique challenges for regulatory oversight. Ensuring the safety and effectiveness of continuously evolving AI models requires a sophisticated governance framework that balances innovation with patient safety.

The Evolving Regulatory Landscape for Medical AI

      Historically, medical device regulations were designed for static products, not dynamic software that changes its behavior post-deployment. This mismatch creates a regulatory gap. For instance, a gradual degradation in an AI algorithm's performance might not constitute a single, reportable "adverse event" in databases like the FDA's MAUDE system, making it difficult to detect systemic issues. With more than 1,400 AI-enabled medical devices cleared since 1995, the need for adaptive regulatory frameworks has become critical.

      Recognizing this, regulatory bodies like the U.S. Food and Drug Administration (FDA) and the European Union have introduced new mechanisms. The FDA's Predetermined Change Control Plan (PCCP), solidified in its August 2025 guidance, allows manufacturers to pre-specify acceptable modifications and the criteria for implementing them. Once approved, changes within these defined bounds do not require repeated, lengthy submissions. Similarly, the EU AI Act, under Article 43(4), provides for "predetermined changes" that avoid triggering new Conformity Assessments when properly documented. These frameworks aim to facilitate continuous improvement while maintaining safety.

Bridging the Operational Gap with AEGIS

      Despite these advancements, a significant implementation gap persists. Regulatory guidelines specify what manufacturers must do but often lack concrete details on how to operationalize these requirements. Key questions remain unanswered: What specific metrics should trigger an update or rollback? What quantitative thresholds define acceptable performance? How should data drift be rigorously detected? And what audit trail is needed for regulatory defensibility?

      This is where the AI/ML Evaluation and Governance Infrastructure for Safety (AEGIS) comes in. Presented in a recent paper (Source: arxiv.org/abs/2603.22322), AEGIS provides a practical operational layer that translates high-level regulatory change-control concepts into executable governance procedures. It offers a technical infrastructure to manage adaptive medical AI, comprising three core components:

  • Dataset Assimilation and Retraining Module (DARM): This module manages the intake of new data and oversees the retraining processes for AI models, ensuring data quality and integrity.
  • Model Monitoring Module (MMM): The MMM provides continuous surveillance of the deployed AI model's performance in real-world settings, actively looking for any signs of performance degradation or data drift.
  • Conditional Decision Module (CDM): This module is the brain of the governance system, automating deployment decisions based on predefined rules, thresholds, and performance comparisons.


      Together, these modules operationalize the FDA's PCCP requirements and the EU AI Act's provisions for predetermined changes, specifying what must be monitored and how decisions should be made. The specific clinical context then determines the precise thresholds, metrics, and underlying AI architectures.
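      To make the MMM's drift surveillance concrete, below is a minimal sketch of one standard drift statistic such a monitor could compute, the Population Stability Index (PSI). The paper does not prescribe this particular statistic; the function name, bin count, and the conventional 0.1/0.25 interpretation bands are illustrative assumptions.

```python
import numpy as np

def population_stability_index(reference, observed, bins=10):
    """Population Stability Index (PSI) between a reference sample and
    recent field data. Common rule-of-thumb bands (conventions, not
    regulatory values): < 0.1 stable, 0.1-0.25 moderate shift,
    > 0.25 significant shift."""
    # Bin edges come from the reference distribution only.
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_pct = np.histogram(reference, bins=edges)[0] / len(reference)
    obs_pct = np.histogram(observed, bins=edges)[0] / len(observed)
    # Floor the proportions so empty bins do not produce log(0).
    ref_pct = np.clip(ref_pct, 1e-6, None)
    obs_pct = np.clip(obs_pct, 1e-6, None)
    return float(np.sum((obs_pct - ref_pct) * np.log(obs_pct / ref_pct)))
```

      An MMM-style monitor would compute a statistic like this per feature, or on the model's output scores, at each surveillance interval and compare it against the drift thresholds fixed in the change-control plan.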

Operationalizing Safety and Compliance

      A core innovation of AEGIS is its comprehensive deployment decision taxonomy and a novel fixed performance reference comparison mechanism. The system outputs one of four deployment decisions for a new model iteration:

  • APPROVE: The new model meets all performance and safety criteria and can be deployed.
  • CONDITIONAL APPROVAL: The new model is deployable but requires additional monitoring or specific conditions due to minor issues, such as slight cross-site data drift.
  • CLINICAL REVIEW: The new model shows concerning performance, such as a regression relative to the fixed performance reference, and requires immediate expert human assessment.
  • REJECT: The new model fails to meet critical safety or performance thresholds and cannot be deployed.


      In parallel to these deployment decisions, AEGIS generates an independent Post-Market Surveillance (PMS) ALARM signal. This signal is triggered if the currently released model is assessed as being at risk in the field, for instance, due to significant distributional shift. This composite output can flag a critical governance state: for example, a "REJECT + ALARM" scenario means no new model is deployable, and the existing model in use is simultaneously failing. Such a critical safety escalation state was not easily representable in prior frameworks.
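      As an illustration of how the four-way decision taxonomy and the independent ALARM signal can compose into a single governance state, consider the following sketch. All thresholds, metric choices, and function names here are hypothetical placeholders, not values from AEGIS or any approved PCCP.

```python
from enum import Enum

class Decision(Enum):
    APPROVE = "APPROVE"
    CONDITIONAL_APPROVAL = "CONDITIONAL_APPROVAL"
    CLINICAL_REVIEW = "CLINICAL_REVIEW"
    REJECT = "REJECT"

# Hypothetical PCCP-style thresholds, for illustration only.
SAFETY_FLOOR = 0.80       # hard minimum AUROC for any deployment
REFERENCE_AUROC = 0.88    # fixed performance reference
REGRESSION_MARGIN = 0.02  # tolerated shortfall vs. the fixed reference
DRIFT_LIMIT = 0.25        # cross-site drift score triggering conditions

def decide(candidate_auroc: float, drift_score: float) -> Decision:
    """CDM-style deployment decision for a candidate model iteration."""
    if candidate_auroc < SAFETY_FLOOR:
        return Decision.REJECT
    if candidate_auroc < REFERENCE_AUROC - REGRESSION_MARGIN:
        return Decision.CLINICAL_REVIEW   # regression vs. fixed reference
    if drift_score > DRIFT_LIMIT:
        return Decision.CONDITIONAL_APPROVAL  # deployable, extra monitoring
    return Decision.APPROVE

def pms_alarm(released_auroc: float, released_drift: float) -> bool:
    """Independent PMS signal on the *currently released* model."""
    return released_auroc < SAFETY_FLOOR or released_drift > DRIFT_LIMIT

def governance_state(candidate_auroc, candidate_drift,
                     released_auroc, released_drift):
    """Composite output; (Decision.REJECT, True) would be the critical
    'REJECT + ALARM' state described above."""
    return (decide(candidate_auroc, candidate_drift),
            pms_alarm(released_auroc, released_drift))
```

      Because the ALARM is computed from the released model's field behavior rather than from the candidate's evaluation, the two signals can disagree, which is exactly what lets the composite state express situations such as "no deployable replacement and a failing model in production."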

      The "fixed performance reference comparison mechanism" is analogous to the FDA's concept of Substantial Equivalence (SE), where a new medical device is deemed as safe and effective as a predicate device. For AI models, AEGIS mandates that new iterations must perform within defined bounds of these fixed reference metrics or demonstrate improvement. This mechanism supports both deterministic thresholds and statistically rigorous Confidence Interval (CI)-based equivalence testing. Furthermore, AEGIS integrates the ML Cumulative Performance Score (MLcps), a composite metric that allows domain-specific weighting for a unified performance assessment.
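      The two comparison modes, deterministic thresholds and CI-based equivalence testing, together with an MLcps-style weighted composite, might be sketched as follows. The percentile bootstrap, the equivalence margin, and the weights are illustrative assumptions, not the paper's specification.

```python
import numpy as np

def bootstrap_ci(metric_fn, y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a performance metric."""
    rng = np.random.default_rng(seed)
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))  # resample with replacement
        stats.append(metric_fn(y_true[idx], y_pred[idx]))
    return float(np.quantile(stats, alpha / 2)), float(np.quantile(stats, 1 - alpha / 2))

def equivalent_to_reference(ci_lower, reference_value, margin=0.02):
    """CI-based equivalence check: the new model is acceptable if even the
    lower confidence bound stays within `margin` of the fixed reference."""
    return ci_lower >= reference_value - margin

def weighted_composite(metrics, weights):
    """A weighted composite in the spirit of MLcps: domain-specific weights
    (summing to 1) combine several metrics into one unified score."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    return sum(weights[k] * metrics[k] for k in metrics)
```

      A deterministic check would compare the point estimate against the reference directly; the CI variant is stricter because the entire lower bound, not just the estimate, must clear the equivalence margin.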

Real-World Application and Impact

      To demonstrate its broad applicability, AEGIS was successfully applied to two very different clinical scenarios: predicting sepsis from electronic health records and segmenting brain tumors from medical imaging. Crucially, the identical governance architecture accommodated these diverse data modalities through configuration alone, demonstrating its generalizability.
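      Configuration-driven reuse of this kind might look like the sketch below, where only the per-task configuration changes between the two scenarios while the governance logic stays fixed. The keys, values, and thresholds shown are hypothetical, not taken from the paper.

```python
# Hypothetical per-task configurations; the governance pipeline is
# identical and only these values differ between clinical scenarios.
SEPSIS_CONFIG = {
    "task": "sepsis_prediction",         # tabular EHR data
    "primary_metric": "auroc",
    "safety_floor": 0.80,
    "drift_test": "psi",
}
BRAIN_TUMOR_CONFIG = {
    "task": "brain_tumor_segmentation",  # MRI imaging data
    "primary_metric": "dice",
    "safety_floor": 0.85,
    "drift_test": "intensity_histogram_shift",
}

def load_governance(config):
    """Instantiate the same governance pipeline from a task configuration."""
    return {"metric": config["primary_metric"],
            "floor": config["safety_floor"],
            "drift_test": config["drift_test"]}
```

      Keeping the architecture fixed and moving all clinical specifics into configuration is also what makes the resulting audit trail uniform across tasks.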

      In iterative deployment simulations, AEGIS consistently detected performance degradation and distributional drift before they would have become apparent to a human observer. For instance, across 11 iterations of a sepsis prediction model, AEGIS exercised all four deployment decision categories. These included an APPROVE decision with a co-issued ALARM, where the new model was deployable but the released model was at risk, and a REJECT + ALARM, where both the new and released models failed to meet thresholds, representing a critical safety event. This proactive detection and enforcement of safety-critical thresholds are vital for maintaining the integrity of adaptive AI systems in dynamic healthcare environments.

The Future of Medical AI Governance

      The AEGIS framework provides a crucial operational layer for managing the complexities of adaptive medical AI under stringent regulatory environments. By providing executable specifications for monitoring, decision-making, and compliance, it ensures that continuous learning does not compromise patient safety. This capability allows healthcare providers to harness the full potential of AI for continuous improvement, leading to better patient outcomes and more efficient healthcare systems.

      For enterprises leveraging advanced AI solutions across sectors, such as those provided by ARSA Technology, robust governance and monitoring frameworks are paramount. Whether it's AI Video Analytics for public safety or edge AI systems for industrial operations, ensuring model performance, detecting drift, and maintaining data privacy are critical for trusted and effective deployments. ARSA Technology, with a team experienced in deploying AI since 2018 and a focus on practical, enterprise-grade AI solutions, understands the importance of building systems that work reliably and responsibly in the real world.

      To explore how ARSA Technology can help your organization implement robust, AI-powered solutions, contact ARSA for a free consultation.

      Source: Afdideh, F., Astaraki, M., Seoane, F., & Abtahi, F. (2026). AEGIS: An Operational Infrastructure for Post-Market Governance of Adaptive Medical AI Under US and EU Regulatory Frameworks. arXiv preprint arXiv:2603.22322.