May 4, 2026

What Rigorous AI Risk Management Actually Requires: A Five-Stage Breakdown

A research paper published in early 2026 by more than 30 academics from Oxford, MIT, Stanford, UC Berkeley, and Purdue does something unusual for academic work: it admits what the field does not yet know how to do.

The paper, Open Problems in Frontier AI Risk Management, catalogues unresolved challenges across the full AI risk management lifecycle. For enterprise AI teams, this kind of honest accounting is more useful than another framework promising to solve everything. Understanding where the research community itself acknowledges uncertainty is directly relevant to how organizations build governance programs that hold up over time.

This post walks through each of the five stages, what the research identifies as genuinely unresolved, and where platform infrastructure can address the parts that are already solvable today.

Stage 1: Risk Planning

Risk planning establishes the foundation for everything that follows: what system is being governed, who it affects, what boundaries apply, and what thresholds define acceptable risk.

For traditional narrow AI systems (purpose-built tools operating within a defined scope), these questions are relatively tractable. For frontier AI, they are not.

The paper identifies four open problems at this stage. Two are particularly relevant for enterprise teams:

Boundary determination is genuinely hard. Frontier AI systems are general-purpose, modular, and frequently reused across contexts. Base models get fine-tuned. APIs get embedded into applications built by teams with no connection to the original developer. Vendor-integrated and composite systems create supply chains where accountability is unclear at integration points. The paper notes there is no standardized methodology for enumerating dependencies or assigning interface responsibilities when scoping these systems.
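
To see what dependency enumeration might look like in practice, here is a minimal sketch. The schema, field names, and example system are illustrative assumptions, not a standard the paper proposes:

```python
from dataclasses import dataclass, field

@dataclass
class AIComponent:
    """One node in a composite AI system: a base model, a fine-tune, or a vendor API."""
    name: str
    owner: str  # the team accountable for this component
    depends_on: list[str] = field(default_factory=list)

def unowned_interfaces(inventory: dict[str, AIComponent]) -> list[tuple[str, str]]:
    """Return (consumer, dependency) edges where the dependency falls outside
    the inventory -- the integration points where accountability is unclear."""
    return [
        (component.name, dep)
        for component in inventory.values()
        for dep in component.depends_on
        if dep not in inventory
    ]

# Illustrative composite system: a support bot built on a vendor's base model
# that no internal team has claimed ownership of.
inventory = {
    "support-bot": AIComponent("support-bot", "cx-team", depends_on=["vendor-base-model"]),
}
print(unowned_interfaces(inventory))  # [('support-bot', 'vendor-base-model')]
```

Even a structure this simple forces the two questions the paper says lack standardized answers: what does this system depend on, and who owns each interface?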

Risk acceptance criteria are built on proxies. Most frontier AI governance frameworks use capability thresholds as their primary risk measure: if the model can do X, apply mitigation Y. But capability is a proxy for risk, not a direct measure of it. Traditional safety-critical industries like aviation and nuclear express acceptable risk in terms of harm probability and severity. Frontier AI governance has not yet established equivalent standards.
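
The difference between the two approaches fits in a few lines. A minimal sketch with invented numbers (neither the thresholds nor the values come from any published framework):

```python
# Capability-threshold rule, as most frontier AI frameworks work today:
# a binary gate on what the model can do.
def capability_gate(can_do_x: bool) -> str:
    return "apply mitigation Y" if can_do_x else "no action required"

# Harm-based criterion, as aviation and nuclear express it: accept only if
# expected harm (probability x severity) stays under a pre-agreed ceiling.
def harm_gate(p_harm: float, severity: float, ceiling: float = 1e-3) -> str:
    return "acceptable" if p_harm * severity < ceiling else "unacceptable"

print(capability_gate(can_do_x=True))        # apply mitigation Y
print(harm_gate(p_harm=1e-4, severity=5.0))  # acceptable: 5e-4 < 1e-3
```

The first gate says nothing about how much risk remains after mitigation Y is applied; the second makes the acceptable level explicit. Frontier AI governance largely still runs on the first kind.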

Where platform infrastructure helps: Before you can scope a risk management program, you need an accurate picture of what AI you actually have running. Airia’s AI Inventory Management and Risk Classifications give enterprise teams a centralized, continuously updated view of AI assets across the organization, including models embedded in vendor tools and third-party integrations. That inventory becomes the foundation for scope-setting and risk tiering that the paper identifies as essential but often absent.

Stage 2: Risk Identification

Risk identification is the process of systematically finding potential risk sources before they materialize into incidents. The paper identifies two open problems here, and the second is particularly significant for enterprise practitioners.

The techniques commonly used for risk identification were designed for deterministic systems. Methods like HAZOP, FMEA, and fishbone analysis are built for systems that behave predictably within defined parameters. Frontier AI systems do not behave that way. Risks emerge from non-linear interactions, from how humans use the system over time, from adversarial inputs, and from multi-agent dynamics that are not well understood. The paper is direct: we do not yet have good methods for identifying risks that arise from this kind of complexity.

Shadow AI compounds the problem. The paper emphasizes deployment context and human-AI interaction as critical but underexplored risk sources. What makes this harder in practice is that organizations frequently do not have visibility into all the AI running across their environment. AI embedded in SaaS tools, AI accessed through personal accounts, AI called through third-party APIs — each represents a risk source that never makes it into a formal identification process because it is never formally inventoried.

Where platform infrastructure helps: You cannot identify risks from systems you do not know exist. Airia’s AI Discovery surfaces AI wherever it is operating across your organization (in browser-based tools, embedded applications, developer environments, and API integrations) and brings it into a governed inventory. This directly addresses the scoping gap the paper identifies at Stage 1 and the visibility gap that makes Stage 2 risk identification incomplete. Shadow AI cannot be a risk source in your governance program if it remains invisible to your governance program.
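
A toy version of the discovery problem makes the point. Suppose all you could see were egress logs; even a crude scan surfaces AI traffic the inventory never captured. The host list and log format here are assumptions for illustration, and real discovery tooling draws on far richer signals than this:

```python
# Hosts known to serve AI APIs (illustrative, not exhaustive).
KNOWN_AI_HOSTS = {"api.openai.com", "api.anthropic.com", "generativelanguage.googleapis.com"}

# The only AI endpoint the governance program formally scoped.
GOVERNED = {"api.openai.com"}

egress_log = [
    {"source": "laptop-042",  "dest": "api.anthropic.com"},  # personal-account usage
    {"source": "ci-runner-7", "dest": "api.openai.com"},     # governed
]

# Flag outbound AI traffic that never entered the governed inventory.
shadow = [e for e in egress_log if e["dest"] in KNOWN_AI_HOSTS and e["dest"] not in GOVERNED]
for e in shadow:
    print(f"ungoverned AI traffic: {e['source']} -> {e['dest']}")
```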

Stage 3: Risk Analysis

Risk analysis is where organizations gather information to understand the nature and level of identified risks. The paper identifies eight open problems at this stage, which reflects both how much of current governance practice concentrates here and how many of the assumptions underlying that practice are contested.

Three findings stand out for enterprise teams:

Pre-deployment evaluations are not reliable predictors of production behavior. Capability assessments measure what a model can do under controlled conditions. They do not capture how the model behaves under sustained adversarial pressure, at full production resource levels, or in the specific deployment context your organization has configured. Organizations making deployment decisions based primarily on vendor-provided evaluation scores are working with data that systematically understates production-level risk.

External assessments frequently have structural independence problems. Many assessments described as independent are hybrid arrangements: the developer selects the assessor, funds the engagement, and in some cases can shape the scope or limit publication of negative findings. The paper notes this mirrors well-documented conflict-of-interest dynamics in financial auditing. For enterprise buyers, this is a reason to ask detailed questions about how any third-party assessment of a vendor’s AI systems was conducted.

Post-deployment monitoring data is fragmented. The data needed to assess real-world risk (model usage patterns, application-level behavior, incident reports) is currently collected separately, stored separately, and rarely connected in ways that support coherent risk analysis. NIST’s AI 800-4 report documented the same gap specifically for monitoring. The upstream cause, as this paper makes clear, is that risk analysis frameworks have not established a methodology for combining these data streams.
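
The fragmentation is easy to picture with a toy example. The join itself is trivial; what is missing, per the paper, is the methodology and the organizational plumbing that puts these records in one place to begin with. Field names below are assumptions:

```python
# Three streams that describe the same system but are typically collected,
# stored, and reviewed separately.
usage     = [{"system_id": "sb-1", "daily_calls": 12_000}]
behavior  = [{"system_id": "sb-1", "guardrail_triggers": 37}]
incidents = [{"system_id": "sb-1", "open_incidents": 2}]

def join_on_system(*streams: list[dict]) -> dict[str, dict]:
    """Merge per-system records into one view suitable for risk analysis."""
    merged: dict[str, dict] = {}
    for stream in streams:
        for record in stream:
            merged.setdefault(record["system_id"], {}).update(record)
    return merged

print(join_on_system(usage, behavior, incidents))
# {'sb-1': {'system_id': 'sb-1', 'daily_calls': 12000,
#           'guardrail_triggers': 37, 'open_incidents': 2}}
```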

Where platform infrastructure helps: Airia’s Governance Dashboard provides real-time, centralized visibility into AI activity across integrated systems, tracking agent decisions, monitoring interactions, and maintaining audit trails that connect production behavior to governance records. Agent Red Teaming allows organizations to test AI systems against adversarial conditions before and after deployment, supplementing vendor-provided evaluations with assessments your team controls. Together these capabilities move risk analysis from a pre-deployment checkpoint toward the continuous practice the paper identifies as necessary.

Stage 4: Risk Evaluation

Risk evaluation takes the output of analysis and applies it against pre-established criteria to determine whether risk is acceptable, and if not, what should happen next. The paper identifies four open problems here, two of which are directly actionable for enterprise governance teams.

Risk acceptance criteria are applied inconsistently across the industry. Each major AI developer uses different thresholds, different methods, and different standards for declaring a system acceptable for deployment. Compare Anthropic’s Responsible Scaling Policy, OpenAI’s Preparedness Framework, and Google DeepMind’s Frontier Safety Framework side by side. Each is substantive. None uses the same criteria. This makes cross-vendor risk comparison difficult for enterprise buyers and creates a situation where “we evaluated this model and found it acceptable” provides limited external signal.

Aggregate risk evaluation has no agreed methodology. Even when individual risk assessments are conducted carefully, combining them into an overall judgment about a system’s deployment readiness involves assumptions that are rarely made explicit. The paper notes that it remains unclear whether a single unacceptable risk should disqualify an entire system, or how to weigh low-probability catastrophic outcomes against high-probability moderate ones. Most organizations are making these calls without a formalized basis for doing so.
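
The unstated assumptions the paper points to can be written down directly. A sketch with made-up scores: a "veto" rule under which any single unacceptable risk disqualifies the system, versus an "expected harm" rule that trades low-probability catastrophic outcomes against high-probability moderate ones:

```python
risks = [
    {"name": "data leak",           "probability": 0.20,  "severity": 3},     # likely, moderate
    {"name": "catastrophic misuse", "probability": 0.001, "severity": 1000},  # rare, severe
]

def veto_rule(risks, severity_cap=100):
    """Disqualify the system if any single risk exceeds a severity cap."""
    return all(r["severity"] <= severity_cap for r in risks)

def expected_harm_rule(risks, budget=2.0):
    """Accept if total expected harm (sum of p x severity) fits a budget."""
    return sum(r["probability"] * r["severity"] for r in risks) <= budget

print(veto_rule(risks))           # False: the catastrophic risk alone disqualifies
print(expected_harm_rule(risks))  # True: 0.6 + 1.0 = 1.6 <= 2.0
```

The two rules disagree on identical inputs. Which one an organization actually runs is a policy choice that deserves to be written down, not left implicit.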

Where platform infrastructure helps: Airia’s System Controls and Compliance Reporting provide the infrastructure for applying consistent risk criteria across all AI systems in your environment, and for documenting those evaluations against regulatory standards like the EU AI Act, NIST AI RMF, and ISO 42001. Human-in-the-loop approval workflows ensure that deployment decisions are reviewed, recorded, and traceable, which addresses both the consistency gap and the absence of documented rationale the paper identifies. When the criteria for your risk evaluation decisions are embedded in a platform rather than informal practice, they become auditable.
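
In miniature, "embedded and auditable" looks like a decision record that cannot exist without the criterion applied, a named approver, and a rationale. This is a sketch of the pattern, not any platform's actual workflow:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class DeploymentDecision:
    system_id: str
    criterion: str    # the pre-established acceptance criterion applied
    approved_by: str  # the human reviewer -- required, not optional
    rationale: str    # the documented basis for the call
    timestamp: str

def record_decision(system_id, criterion, approved_by, rationale):
    if not (criterion and approved_by and rationale):
        raise ValueError("not auditable without criterion, approver, and rationale")
    return DeploymentDecision(system_id, criterion, approved_by, rationale,
                              datetime.now(timezone.utc).isoformat())

d = record_decision("sb-1", "expected harm <= 2.0", "jane.doe",
                    "all risk scores within budget")
print(d)
```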

Stage 5: Risk Mitigation

Risk mitigation encompasses the controls, interventions, and protective measures organizations put in place to bring risk to an acceptable level. The paper organizes these across four levels: data, model, system, and ecosystem. The open problems at each level share a common finding: the durability of current controls under adversarial conditions is not well established.

Data-level controls have unclear effectiveness. Filtering training data to prevent models from developing harmful capabilities is appealing because it acts early in the model lifecycle. But the relationship between training data contents and emergent model behavior is not well understood, and emerging research suggests this approach is more reliable for complex, specialized capabilities than for simpler behavioral tendencies like toxicity or sycophancy.

Model-level controls are reversible under adversarial conditions. Fine-tuning, RLHF, and machine unlearning techniques can suppress harmful capabilities in standard operating conditions. The paper documents that adversarial users can reliably surface suppressed capabilities through targeted prompting, fine-tuning attacks, or other interventions. The mitigations work. Their persistence under sustained pressure is a different question, and one that the research community has not resolved.

System-level controls degrade as deployment contexts evolve. Guardrails and runtime enforcement mechanisms are calibrated to the deployment context they were designed for. As usage patterns shift, as user behavior adapts, and as models are updated, the alignment between guardrail logic and real-world risk changes. Organizations frequently lack reliable methods for detecting this degradation before it creates exposure.
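
One cheap signal for this kind of degradation: watch the guardrail's trigger rate per window of traffic, and flag when it drifts away from the rate observed at calibration. The numbers below are invented, and a real implementation would use a proper statistical test rather than a fixed percentage band:

```python
def trigger_rate(triggers: int, requests: int) -> float:
    return triggers / requests if requests else 0.0

# Rate observed when the guardrail was calibrated, vs. rates in later weeks.
baseline = trigger_rate(triggers=40, requests=10_000)  # 0.4%
weekly = [
    trigger_rate(35, 10_000),  # steady
    trigger_rate(38, 10_000),  # steady
    trigger_rate(4, 10_000),   # suspicious drop: usage may be routing around it
]

for week, rate in enumerate(weekly, start=1):
    # Flag when the rate moves more than 50% away from baseline in either direction.
    if abs(rate - baseline) > 0.5 * baseline:
        print(f"week {week}: trigger rate {rate:.4f} drifted from baseline {baseline:.4f}")
```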

Where platform infrastructure helps: Airia’s Responsible AI Guardrails and Agent Constraints operate at the system level. This is significant in the context of the paper’s findings: system-level controls applied through infrastructure rather than model training are not subject to the same adversarial reversibility risks as model-level mitigations. An agent cannot be prompted or fine-tuned around constraints enforced at the infrastructure layer. Airia’s Routing Engine adds an additional control layer, directing traffic according to policy rules that sit outside any individual model’s behavior. Together, these capabilities represent the most durable tier of mitigation the paper describes.
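
To make the "outside any individual model's behavior" point concrete, here is the shape of a routing rule that inspects a request's data classification before any model is invoked. The tags and endpoint names are hypothetical, and this is a sketch of the general pattern rather than Airia's Routing Engine:

```python
# Hypothetical routing table: data-sensitivity tag -> permitted model endpoint.
ROUTES = {
    "public":       "shared-hosted-model",
    "internal":     "vpc-hosted-model",
    "confidential": "on-prem-model",
}

def route(request: dict) -> str:
    """Pick an endpoint from the request's data tag. A model cannot be
    prompted around this check because it runs before any model is called."""
    tag = request.get("data_tag")
    if tag not in ROUTES:
        raise PermissionError(f"no route for data tag {tag!r}; request blocked")
    return ROUTES[tag]

print(route({"data_tag": "confidential", "prompt": "summarize the board memo"}))
# on-prem-model
```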

What This Means for Enterprise Teams

Know what you have before you decide what to do about it. Several of the open problems identified across all five stages trace back to incomplete visibility into AI systems and their deployment contexts. A governance program built on an inaccurate inventory of AI assets is working from a flawed foundation, regardless of how rigorous the downstream processes are.

Treat pre-deployment evaluations as inputs, not conclusions. The paper is clear that capability assessments and vendor evaluations do not predict production behavior with the reliability governance decisions require. Organizations that treat a clean pre-deployment eval as a risk management conclusion are systematically underweighting post-deployment risk. Build monitoring and feedback mechanisms that run continuously, not as a one-time gate.

Choose an infrastructure that can adapt to a moving target. The risk management landscape for frontier AI is changing faster than standards bodies can keep up. The most resilient governance programs are not the ones hardcoded to a single framework — they are the ones whose infrastructure can update as the field evolves. When your governance controls are embedded in adaptable platform infrastructure rather than static documentation, they can respond to new regulatory expectations, new risk research, and new deployment contexts without requiring a rebuild.

The paper maps where the research community is still working. For the parts that are already solved, there is no reason to wait.