May 21, 2026

EU AI Act Part 5 – Building the Paper Trail: What Your Records Actually Need to Contain


The first four posts in this series covered why documentation determines classification, how to work through the Article 6 test, how the filter mechanism functions and where it breaks, and how these issues play out in specific sectors. This post is where the theory becomes build work.

 

The classification framework established by the Commission’s draft guidelines does not execute itself. It requires organizations to construct and maintain a specific set of records before placing AI systems on the market. This post covers what those records need to contain, how to build them, and the common gaps that leave organizations exposed.

Record 1: The Documentation Audit

Before building new records, take stock of what exists. The documentation audit is the foundation because it establishes the current state of your intended purpose documentation and identifies the inconsistencies that create classification exposure.

 

What to pull together: Technical documentation, instructions for use, API documentation, marketing website content, sales decks, case study materials, promotional videos or demos, terms of service, usage policies, contractual templates, and any public statements about the system’s capabilities or use cases.

 

What to look for:

 

  • Does the documentation describe a specific, limited intended purpose, or a general capability?
  • Is that description consistent across every surface, or do different documents characterize the system differently?
  • Do any materials present examples, use cases, or demonstrations that fall within Annex III categories—even if the main description does not?
  • Does any contractual language expand the system’s permitted uses beyond what the technical documentation describes?

What to do with the gaps: Where you find inconsistencies, you face a choice. Either narrow the marketing language to match the technical documentation, or update the technical documentation to acknowledge and manage the broader use cases, including their classification implications. Leaving inconsistencies in place is not a neutral decision; it means accepting that the broader framing governs under the intended purpose doctrine.
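
The consistency check described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed format: the `Surface` structure, the use-case labels, and the reviewer-maintained `annex_iii_candidates` set are all assumptions made for the example, and deciding whether a use case actually falls within Annex III remains a legal judgment.

```python
from dataclasses import dataclass


@dataclass
class Surface:
    """One place where the system is described (hypothetical structure)."""
    name: str              # e.g. "technical documentation", "sales deck"
    stated_uses: set[str]  # use cases the surface describes or demonstrates


def audit_surfaces(surfaces, baseline_name, annex_iii_candidates):
    """Compare every surface against the baseline document.

    annex_iii_candidates is a reviewer-maintained set of use-case labels
    that may fall within Annex III. It is an assumption of this sketch,
    not an official taxonomy.
    """
    baseline = next(s for s in surfaces if s.name == baseline_name)
    findings = {}
    for surface in surfaces:
        beyond = surface.stated_uses - baseline.stated_uses
        flagged = surface.stated_uses & annex_iii_candidates
        if beyond or flagged:
            findings[surface.name] = {
                "beyond_baseline": sorted(beyond),
                "annex_iii_candidates": sorted(flagged),
            }
    return findings


# Illustrative data only: a sales deck that demonstrates a use case
# the technical documentation never mentions.
surfaces = [
    Surface("technical documentation", {"document formatting"}),
    Surface("sales deck", {"document formatting", "cv screening"}),
]
findings = audit_surfaces(surfaces, "technical documentation", {"cv screening"})
print(findings)
```

Each entry in `findings` is a gap that needs one of the two responses above: narrow the surface, or update the baseline and its classification analysis.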

Record 2: The Classification Self-Assessment

The classification self-assessment is the core record: the document in which the provider works through the Article 6 analysis and records its conclusions. Under Article 6(4) of the AI Act, providers who rely on the Article 6(3) filter to conclude that their system is not high-risk must document that assessment before placing the system on the market.

 

Structure the assessment around the two pathways:

 

For Article 6(1), document: whether the system is a regulated product or a safety component; the analysis under each prong of Article 3(14) (safety function and failure/malfunction); and whether the applicable harmonization legislation requires third-party conformity assessment or equivalent scrutiny.

 

For Article 6(2), document which Annex III areas and use cases the intended purpose was assessed against and the conclusion for each. Where the intended purpose falls within a listed use case, also document whether the Article 6(3) filter applies and why it applies under a narrow reading of the relevant condition.

 

The filter self-assessment needs to go further than a conclusion. It needs to explain why the specific condition applies under a narrow interpretation, why the system is not part of a complex architecture that blocks the filter, and why the system does not perform profiling of natural persons. A bare assertion that “this system performs a narrow procedural task” will not survive scrutiny. The evidence needs to be there.

 

Maintain this record as a living document. Any significant change to the system’s functionality, documentation, or deployment context requires a fresh assessment. The record should note its own version and the date of last review.
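
One way to keep the filter reasoning from collapsing into a bare assertion is to give each required explanation its own field and refuse to treat the record as complete while any is empty. The sketch below assumes a hypothetical record layout; the field names, the example system, and the example date are illustrative and are not drawn from the guidelines.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import Optional


@dataclass
class FilterClaim:
    """Reasoning behind reliance on the Article 6(3) filter (hypothetical layout)."""
    condition: str                       # which filter condition is relied on
    narrow_reading_rationale: str        # why it applies under a narrow reading
    complex_architecture_rationale: str  # why no complex architecture blocks it
    profiling_rationale: str             # why there is no profiling of natural persons

    def missing_evidence(self):
        """Return the names of fields that were left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


@dataclass
class SelfAssessment:
    system: str
    version: int         # the record notes its own version...
    last_reviewed: date  # ...and the date of last review
    filter_claim: Optional[FilterClaim] = None

    def ready_to_rely_on_filter(self):
        return self.filter_claim is not None and not self.filter_claim.missing_evidence()


# Illustrative record with one explanation still missing.
assessment = SelfAssessment(
    system="invoice-formatter",
    version=3,
    last_reviewed=date(2026, 5, 21),
    filter_claim=FilterClaim(
        condition="narrow procedural task",
        narrow_reading_rationale="Only converts file formats; no scoring or ranking.",
        complex_architecture_rationale="",
        profiling_rationale="Processes no data about natural persons.",
    ),
)
print(assessment.filter_claim.missing_evidence())
print(assessment.ready_to_rely_on_filter())
```

A non-empty field is only a completeness check. Whether the reasoning inside it holds up is still a matter for review.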

Record 3: The Failure Mode Analysis for Article 6(1) Systems

For organizations deploying AI in physical products, the classification self-assessment needs to be underpinned by a dedicated failure mode analysis. The Article 3(14) second prong, the consequence-based safety component test, requires evidence about what happens when the system fails, not just what it is designed to do.

 

A standard Failure Mode and Effects Analysis (FMEA) provides the right structure, but it needs to be run specifically against the Article 3(14) question: for each identified failure mode, does that failure cause the product to enter or remain in an unsafe state, prevent a protective response, or enable a hazardous operation?

 

Document the analysis, the failure modes considered, and the conclusions for each. Where the analysis concludes that failure modes do not create safety hazards, explain why—the reasoning matters, not just the result. This record is the evidence for a negative conclusion on the safety component test, and it needs to be robust enough to survive technical scrutiny from a market surveillance authority.
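
As a rough sketch, each FMEA row can carry the three consequence questions as explicit fields, so that a negative conclusion without written reasoning is easy to spot. The structure, field names, and example failure modes below are assumptions for illustration and do not replace an established FMEA method.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One FMEA row, extended with the three consequence questions."""
    description: str
    unsafe_state: bool       # product enters or remains in an unsafe state
    blocks_protection: bool  # a protective response is prevented
    enables_hazard: bool     # a hazardous operation is enabled
    reasoning: str           # why the answers above are what they are

    def safety_relevant(self):
        return self.unsafe_state or self.blocks_protection or self.enables_hazard


def review_failure_modes(modes):
    """Split rows into safety-relevant ones and negatives lacking reasoning."""
    return {
        "safety_relevant": [m.description for m in modes if m.safety_relevant()],
        "negative_without_reasoning": [
            m.description
            for m in modes
            if not m.safety_relevant() and not m.reasoning.strip()
        ],
    }


# Illustrative rows for a hypothetical mobile robot.
modes = [
    FailureMode("stale obstacle map", True, False, False,
                "Robot may continue moving toward an undetected obstacle."),
    FailureMode("log upload fails", False, False, False, ""),
    FailureMode("ui label misrendered", False, False, False,
                "Label is cosmetic; controls and interlocks are unaffected."),
]
result = review_failure_modes(modes)
print(result)
```

Rows listed under `negative_without_reasoning` are the ones least likely to survive technical scrutiny, because the record shows a result and no explanation for it.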

Record 4: The Pipeline Map for Complex Systems

For organizations deploying multiple AI components that interact—including agentic AI architectures—the classification self-assessment needs to be supported by a pipeline map: a documented description of how the components fit together, what each contributes to the combined output, and what individual decisions or decision-influencing outputs the combined system produces.

 

This record serves two purposes. First, it enables the anti-fragmentation analysis: determining whether the combined system’s outputs materially influence individual decisions within a high-risk use case, and therefore whether the complex-system rule applies. Second, it documents the separability analysis for components that claim the filter: demonstrating that a specific component is “genuinely separable” from the high-risk purpose and does not feed into the outputs that trigger classification.

 

The pipeline map should be maintained at the system level, not the model level. As components are added, modified, or removed, the map needs to be updated and the classification assessment revisited.
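
A pipeline map can be kept as a simple directed graph, which makes one part of the separability question checkable: does a component's output flow, directly or indirectly, into an output that triggers classification? The sketch below assumes hypothetical component names and treats data-flow reachability as a first screen only. A component that passes this check may still fail the separability analysis on other grounds.

```python
from collections import deque


def feeds_into(pipeline, component, triggering_outputs):
    """Breadth-first search: can data from `component` reach a triggering output?

    pipeline maps each node to the nodes that consume its output.
    """
    seen = {component}
    queue = deque([component])
    while queue:
        node = queue.popleft()
        for consumer in pipeline.get(node, []):
            if consumer in triggering_outputs:
                return True
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return False


# Illustrative map: an OCR step feeds a ranker that drives a shortlist
# decision, while a spellchecker only affects what the UI displays.
pipeline = {
    "ocr": ["ranker"],
    "ranker": ["shortlist_decision"],
    "spellcheck": ["ui_display"],
}
triggering_outputs = {"shortlist_decision"}

print(feeds_into(pipeline, "ocr", triggering_outputs))
print(feeds_into(pipeline, "spellcheck", triggering_outputs))
```

Because the map lives at the system level, adding a single edge (for example, routing spellchecker output into the ranker) changes the answer for that component and should prompt a fresh look at the classification assessment.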

Keeping the Records Current

Classification is not a one-time determination. The Commission’s draft guidelines make clear that it is based on the intended purpose at the time of market placement, and that substantial changes to the system may require reassessment. Build a review trigger into your governance process: any change to the system’s functionality, documentation, or deployment configuration that could plausibly affect the classification analysis should trigger a review of the self-assessment record.

 

The review should be documented. The record of classification reasoning is only as useful as its currency.
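
One lightweight way to implement a review trigger is to store a fingerprint of the inputs the last review was based on and compare it with the current state. The sketch below is an assumption about how that might look: the snapshot keys are hypothetical, and a changed fingerprint means only that someone should look, not that the classification has changed.

```python
import hashlib
import json


def fingerprint(snapshot):
    """Hash a snapshot of classification-relevant inputs in a stable way."""
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def review_needed(current_snapshot, reviewed_fingerprint):
    """True when the current state differs from what was last reviewed."""
    return fingerprint(current_snapshot) != reviewed_fingerprint


# Illustrative snapshot; the keys are hypothetical.
reviewed = {
    "functionality": "reformats uploaded invoices",
    "documentation_version": "2.1",
    "deployment": {"region": "EU", "integrations": ["erp"]},
}
stored = fingerprint(reviewed)

changed = dict(reviewed, documentation_version="2.2")
print(review_needed(reviewed, stored))
print(review_needed(changed, stored))
```

Storing the fingerprint alongside the review date in the self-assessment record gives each documented review a verifiable link to the state of the system it covered.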

The Build in Summary

The build comes down to four records: the documentation audit, the classification self-assessment (including the filter self-assessment where applicable), the failure mode analysis for Article 6(1) systems, and the pipeline map for complex architectures. Together, they constitute the paper trail that the intended purpose doctrine requires. Without them, classification conclusions in either direction are assertions, not positions.

 

Part 6 closes the series with what to do right now, during the consultation window, while these guidelines are still in draft.

Next in the series: Part 6 — The Window Is Open: How to Act Before These Guidelines Become Final

 

The Paper Trail series is based on the European Commission’s draft guidelines on the classification of high-risk AI systems under Article 6 of Regulation (EU) 2024/1689, published May 19, 2026 for stakeholder consultation. The guidelines are not yet final. Nothing in this series constitutes legal advice.