We continue our analysis of the second set of Specialised Technical Guides (Guides 3 to 15) issued by the Spanish Artificial Intelligence Supervisory Agency (AESIA) to support compliance with the European Artificial Intelligence Act (AI Act).
On this occasion we take a look at the key aspects of Guides 9 and 10, which address accuracy and robustness in AI systems:
Guide 9: "Accuracy"
Guide 9, entitled "Accuracy", offers guidance on the accuracy required of high-risk AI systems under Article 15 of the AI Act. Accuracy plays a crucial role in mitigating, as far as possible, the potential risks to health, safety and fundamental rights that may arise from the use of high-risk AI systems.
We highlight below some key points set out by AESIA in Guide 9 for ensuring compliance with the accuracy requirement:
- Lifecycle-based approach to accuracy. Assessing aspects of an AI system's lifecycle is essential because they can influence its overall accuracy – the primary aim is to ensure that an AI system's accuracy remains stable and constant over time. To achieve this, data pre-processing should be performed (among other things, checking that model training data is free of sampling bias and using an identical data processing methodology when comparing accuracy across models); and measures commensurate with the type of model and intended purpose should be taken to avoid overfitting (eg, reporting hyperparameters and their values for each model during the training, testing and validation processes, and ensuring that no information from the test dataset is used when fitting hyperparameters).
- Suitable metric selection to measure accuracy. Annex 7.1 of the Guide provides a non-exhaustive list of metrics that can be used to measure accuracy, as well as the types of models to which they relate. Two key elements should be considered when selecting an appropriate accuracy metric for a system: (i) the system's intended purpose; and (ii) the risks identified in the risk management system, together with the selection of a target function capable of achieving the intended purpose. In this section, the Guide highlights the importance of having a centralised repository where all metrics information associated with a model at any point in its lifecycle is managed.
- Measures necessary for the provider to ensure consistent accuracy throughout the lifecycle. Providers should, among other elements, implement technical measures related to inventories of accuracy metrics and target functions (eg, system output is accompanied by a measure of uncertainty associated with the accuracy of that output); and use specific metrics accompanied by statistical evaluations that are suitable for determining data distribution, data dependence and other assumptions so as to achieve a meaningful evaluation.
- Suitable documentation. The accuracy assurance and metric selection process should be properly documented, in line with the rest of the technical documentation covered in Guide 15, entitled "Technical Documentation". This includes documentation such as model cards and dataset cards, which give a clearer overview of the accuracy metrics reported and the source of the data used to train the model.
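To illustrate the points above, the following minimal Python sketch shows one way a provider might keep the test dataset out of hyperparameter fitting and accompany the final accuracy figure with a measure of uncertainty. The toy threshold "model", the candidate values and the 95% normal-approximation interval are illustrative assumptions, not anything prescribed by Guide 9:

```python
import math
import random

# Hypothetical sketch: tune a hyperparameter on a validation set only,
# record the chosen value for the model, and report test accuracy with
# an uncertainty estimate (95% normal-approximation binomial interval).

def split(data, train=0.6, val=0.2, seed=0):
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    i, j = int(n * train), int(n * (train + val))
    return shuffled[:i], shuffled[i:j], shuffled[j:]

def accuracy(threshold, samples):
    # toy "model": predict positive when the feature exceeds a threshold
    return sum((x > threshold) == y for x, y in samples) / len(samples)

def interval(p, n, z=1.96):
    half = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)

# toy dataset of (feature, label) pairs
data = [(x / 100, x / 100 > 0.5) for x in range(100)]
train_set, val_set, test_set = split(data)

# fit the hyperparameter on the validation set only; the test set is
# never consulted while tuning
candidates = [0.3, 0.4, 0.5, 0.6]
best = max(candidates, key=lambda t: accuracy(t, val_set))

# the test set is used exactly once, for the final reported metric
acc = accuracy(best, test_set)
lo, hi = interval(acc, len(test_set))
print({"threshold": best, "test_accuracy": acc, "ci95": (lo, hi)})
```

The printed record (chosen hyperparameter, metric and interval) is the kind of information that could then feed a centralised metrics repository or a model card.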
Guide 10: "System robustness"
Guide 10, entitled "System robustness", elaborates on Article 15 of the AI Act, focusing on the robustness of high-risk AI systems. This Guide identifies cybersecurity as a cornerstone of robustness because it safeguards against attacks that could manipulate an AI system's response and, therefore, undermine its accuracy. It also points out that robustness mechanisms must be designed to ensure that cybersecurity measures do not deteriorate over time.
In order to properly assess an AI system's robustness, AESIA highlights several key considerations, including:
- Responsibility lies with the AI system provider to establish appropriate robustness metrics. To properly assess a model's robustness, the provider should take the following steps:
- set robustness requirements or objectives and associated metrics.
- design experiments to test and demonstrate robustness.
- conduct experiments according to the established plan – the results, the data used and all output values are then recorded so that metrics can be calculated in aggregate.
- interpret the results to inform decision-making.
- determine whether the system meets robustness requirements based on the criteria and interpretation identified above.
- To ensure adequate robustness, providers must implement verification procedures to confirm that the design requirements have been met, as well as a validation process to confirm that the AI system fulfils its intended purpose when tested with real-world data sets and executed code.
- The metrics chosen to validate robustness features should be tested and verified in hardware environments that mirror the computational resources available to the deployed system (memory, CPU, processing speed, etc.). Hardware-related robustness metrics should be described in the AI system documentation before accuracy and performance tests are designed for the system's operational stage; similarly, robustness should be monitored through statistics, data distribution and changes in business usage.
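The experiment-based steps above can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than a method prescribed by Guide 10: a toy classifier stands in for the AI system, Gaussian input noise for the perturbations, and an assumed accuracy floor for the robustness requirement:

```python
import random

# Hypothetical sketch of a robustness experiment: set an objective,
# run a perturbation plan, record every result, then decide whether
# the requirement is met. All values are illustrative assumptions.

ROBUSTNESS_REQUIREMENT = 0.80  # assumed minimum accuracy under noise

def toy_model(x):
    return x > 0.5  # stand-in classifier

def perturbed_accuracy(samples, noise, seed=0):
    rng = random.Random(seed)
    correct = sum(toy_model(x + rng.gauss(0, noise)) == y
                  for x, y in samples)
    return correct / len(samples)

samples = [(x / 100, x / 100 > 0.5) for x in range(100)]

# run the experiment plan and record the metric for each noise level
results = {noise: perturbed_accuracy(samples, noise)
           for noise in (0.0, 0.05, 0.1, 0.2)}

# interpret the results and decide against the stated requirement
meets_requirement = all(acc >= ROBUSTNESS_REQUIREMENT
                        for acc in results.values())
print(results, "PASS" if meets_requirement else "FAIL")
```

Keeping the per-level results, rather than only the final verdict, matches the Guide's emphasis on recording the data and output values from each experiment.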
The Guide further underlines the importance of providers and, where appropriate, deployers ensuring sufficient technical robustness so that failures, errors or inconsistencies do not seriously compromise system security or adversely affect fundamental rights. To this end, the Guide emphasises the need for providers to:
- establish strategies and measures to predict potential failures and unintended consequences that could adversely affect individuals' security or fundamental rights.
- address shortcomings in AI system robustness against errors, failures or inconsistencies – to do so, providers should:
- From an organisational standpoint, among other measures: implement techniques for alerting the responsible parties and, if no timely response is received from them, a function that allows automatic system shutdown; and introduce failsafe protocols to allow humans to anticipate catastrophic events.
- From a technical standpoint, recommended measures include promoting multi-stakeholder engagement to maximise diversity and the inclusion of different domain profiles during system design, development, maintenance, implementation, monitoring and use; and setting up model design evaluation committees to predict inconsistencies in system design or implementation that may lead to unintended outcomes.
- Redundancy mechanisms should be in place to ensure system robustness, including back-up systems or failover plans.
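A redundancy mechanism of the kind mentioned above might look like the following hedged sketch, where a backup model takes over if the primary model fails or produces an out-of-range output. The function names and the fallback value are hypothetical, introduced only for illustration:

```python
# Hypothetical failover sketch: route around a failing primary model
# rather than letting the error propagate. All names are illustrative.

def primary_model(x):
    raise RuntimeError("simulated failure")  # stand-in for a real fault

def backup_model(x):
    return 0.5  # conservative fallback prediction

def predict_with_failover(x):
    try:
        result = primary_model(x)
        if not 0.0 <= result <= 1.0:  # sanity check on the output range
            raise ValueError("out-of-range output")
        return result, "primary"
    except Exception:
        return backup_model(x), "backup"

print(predict_with_failover(0.7))  # → (0.5, 'backup')
```

Logging which path served each prediction (the second element of the tuple) would also support the alerting techniques described in the organisational measures above.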
Finally, the Guide stresses the importance of AI systems that continue to learn after deployment, as learning can increase the risk of new biases emerging over time. Providers and deployers should therefore ensure that continued training of an AI system over time does not erode the robustness achieved prior to deployment, and should adopt mitigation strategies to address any changes that negatively affect a system's accuracy and robustness and/or its underlying data.
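One simple way to watch for the post-deployment changes described above is to compare summary statistics of live input data against the training distribution. The sketch below is an illustrative assumption (a mean-shift check with a hypothetical tolerance), not a monitoring method mandated by the Guide:

```python
import statistics

# Hypothetical drift check: flag when the live input mean moves more
# than `tolerance` training standard deviations from the training mean,
# which might warrant re-validating accuracy and robustness metrics.

def drifted(train_values, live_values, tolerance=0.5):
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    shift = abs(statistics.mean(live_values) - mu) / sigma
    return shift > tolerance

train = [x / 100 for x in range(100)]          # training distribution
stable = [x / 100 + 0.01 for x in range(100)]  # small, tolerable shift
shifted = [x / 100 + 0.5 for x in range(100)]  # large shift

print(drifted(train, stable), drifted(train, shifted))  # → False True
```

A real deployment would track several statistics (and, per the Guide, changes in business usage), but the trigger-and-review pattern is the same.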
Related links
- 'New AESIA Guidelines to support compliance with the AI Act' (02 February 2026)
- AESIA introductory guides: Guides 1 & 2
- AESIA's Specialised Technical Guides: Guides 3 & 4 | Guides 5 & 6 | Guides 7 & 8
The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.