1. Introduction: Why the NIST AI RMF Matters
The NIST Artificial Intelligence Risk Management Framework (AI RMF 1.0) was released by the United States National Institute of Standards and Technology on 26 January 2023, in response to a Congressional direction in the National Artificial Intelligence Initiative Act of 2020. It is a voluntary, rights-preserving, non-sector-specific and use-case-agnostic framework designed to help organisations that design, develop, deploy, evaluate or acquire AI systems to better manage the individual, organisational and societal risks associated with those systems. Unlike a compliance standard with pass or fail conformity requirements, the AI RMF is a flexible, outcomes-based resource intended to be adapted to an organisation's context, risk appetite and legal obligations.
For CyberSigma clients, the AI RMF has rapidly become the de facto lingua franca of AI governance. Procurement teams, regulators, insurers and enterprise customers increasingly expect suppliers of AI-enabled products to demonstrate that they have mapped, measured and managed AI risk in a structured way. The framework also underpins the US Executive Order landscape and the accompanying NIST Generative AI Profile (NIST AI 600-1, July 2024), and it forms a practical bridge to the EU AI Act, ISO/IEC 42001:2023 and ISO/IEC 23894:2023. This guide provides an auditor-grade, implementer-ready deep-dive that translates the framework's four functions and their categories and subcategories into a concrete assessment programme.
Copyright and usage note
NIST publications, including the AI RMF 1.0, the Playbook and the Generative AI Profile, are US Government works and are generally not subject to copyright protection in the United States; they may be reproduced freely. Nevertheless, this CyberSigma guide is original commentary and interpretation. It paraphrases and restructures NIST material for assessment purposes and does not reproduce NIST text verbatim. Always refer to the authoritative source documents (NIST AI 100-1, NIST AI 600-1 and the NIST AI RMF Playbook on the NIST Trustworthy and Responsible AI Resource Center) for the canonical wording of each subcategory.
2. What is the NIST AI RMF
The AI RMF is a two-part resource. Part 1 (Foundational Information) frames AI risk, describes the challenges of measuring and managing it, and defines the seven characteristics of trustworthy AI. Part 2 (Core and Profiles) sets out the operational heart of the framework: four high-level functions, subdivided into categories and subcategories that describe specific outcomes an organisation should achieve. The functions are Govern, Map, Measure and Manage.
Risk under the AI RMF is defined as a composite measure of the probability of an event occurring and the magnitude or degree of the consequences of that event. Consequences can be positive (opportunities) or negative (harms). The framework recognises three broad categories of harm: harm to people (individual civil liberties, rights, physical or psychological safety, and economic opportunity; group and societal harm to democratic participation or educational access), harm to organisations (business operations, security breaches, monetary and reputational harm), and harm to ecosystems (the global financial system, supply chain, interconnected systems and the natural environment).
The seven characteristics of trustworthy AI
The AI RMF is oriented around cultivating trustworthiness. It defines seven interrelated characteristics that trustworthy AI systems should exhibit. These are not standalone controls but properties that the four functions collectively help achieve. Trade-offs between them are expected and must be managed transparently.
| Trustworthiness characteristic | What it means in practice |
|---|---|
| Valid and Reliable | The system performs as intended, is accurate and robust, and behaves consistently under expected and unexpected conditions. Validity and reliability are the necessary foundation on which the other characteristics rest. |
| Safe | The system does not, under defined conditions, lead to a state that endangers human life, health, property or the environment. Safety requires responsible design, deployment, decision-making and clear mechanisms to shut down or intervene. |
| Secure and Resilient | The system can withstand adversarial attacks (evasion, poisoning, model extraction, prompt injection), maintain confidentiality, integrity and availability, and recover gracefully from disruption. |
| Accountable and Transparent | Information about the system, its design decisions, data provenance and limitations is available to relevant actors; clear lines of responsibility exist across the AI lifecycle. |
| Explainable and Interpretable | The mechanisms underlying system outputs (explainability) and the meaning of those outputs in context (interpretability) can be communicated appropriately to the relevant audience. |
| Privacy-Enhanced | The system safeguards human autonomy, identity and dignity through practices such as data minimisation, anonymisation, de-identification and privacy-preserving techniques. |
| Fair with Harmful Bias Managed | The system addresses equality and equity, and manages systemic, computational/statistical and human-cognitive biases across the lifecycle. |
3. Who Must Comply and Scope of Applicability
The AI RMF is voluntary and imposes no legal mandate of its own. However, its adoption is increasingly expected or effectively required through contract, regulation and procurement. It applies across the full AI lifecycle and to all AI actors, defined broadly as those who play an active role in the AI system lifecycle. The framework explicitly scopes in generative and dual-use foundation models through the companion Generative AI Profile.
| Actor / role | How the AI RMF applies |
|---|---|
| AI developers and builders | Organisations that design, code, train, fine-tune or integrate models. Primary responsibility for Map and Measure outcomes and for documenting model behaviour and limitations. |
| AI deployers / operators | Organisations that put AI systems into production, including via third-party APIs. Responsible for context-specific Govern and Manage outcomes, monitoring and human oversight. |
| AI acquirers / procurers | Enterprises buying AI-enabled products. Use the RMF to set supplier requirements, evaluate vendor claims and allocate responsibility contractually. |
| Federal agencies and government suppliers | US agencies are directed toward RMF-aligned practices via Executive Orders and OMB guidance; suppliers to government inherit these expectations. |
| Regulated sectors (finance, healthcare, critical infrastructure) | Sector regulators increasingly cite the RMF as the reasonable baseline for AI governance, mapping it to existing model-risk and safety regimes (e.g. SR 11-7, FDA guidance). |
| Third-party evaluators, auditors and TEVV teams | Use the RMF Core as the assessment criteria for independent test, evaluation, verification and validation (TEVV). |
- Non-sector-specific: applies to any organisation regardless of size, sector or geography that develops or uses AI.
- Use-case-agnostic: covers predictive ML, decision-support, computer vision, NLP, and generative / foundation models (via NIST AI 600-1).
- Lifecycle-wide: all five lifecycle dimensions are in scope (application context; data and input; AI model; task and output; people and planet).
- Rights-preserving: designed to protect civil rights, civil liberties and privacy throughout.
4. Structure of the NIST AI RMF
The RMF Core comprises four functions, which are each broken down into categories and further into subcategories that state specific, testable outcomes. In AI RMF 1.0 there are 19 categories and 72 subcategories in total, distributed across the four functions as shown below. Govern is a cross-cutting function that infuses the other three; Map, Measure and Manage are broadly sequential but iterative.
| Function | Purpose | Categories | Subcategories |
|---|---|---|---|
| GOVERN | Cultivates a culture of risk management; establishes policies, structures, accountability, roles, and processes across the lifecycle. Cross-cutting. | 6 (Govern 1-6) | 19 |
| MAP | Establishes the context to frame AI risks; identifies context, categorises the system, maps benefits, costs, impacts and interdependencies. | 5 (Map 1-5) | 18 |
| MEASURE | Employs quantitative, qualitative and mixed methods to analyse, assess, benchmark and monitor AI risk and trustworthiness. | 4 (Measure 1-4) | 22 |
| MANAGE | Allocates resources to mapped and measured risks; prioritises, responds to, recovers from and communicates about incidents. | 4 (Manage 1-4) | 13 |
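
As a cross-check, the category and subcategory totals can be recomputed from the subcategory identifiers enumerated in the Section 5 checklist. The sketch below is illustrative only: the `SUBCATEGORIES` dictionary and the `summarise` helper are hypothetical names, and the per-category counts are transcribed by hand from this guide's checklist rather than fetched from any NIST source.

```python
# Subcategory counts per category, read off the identifiers in the
# Section 5 checklist (e.g. GOVERN 1.1-1.7 gives 7 for GOVERN 1).
SUBCATEGORIES = {
    "GOVERN": [7, 3, 2, 3, 2, 2],
    "MAP": [6, 3, 5, 2, 2],
    "MEASURE": [3, 13, 3, 3],
    "MANAGE": [4, 4, 2, 3],
}


def summarise(core):
    """Return {function: (category_count, subcategory_count)}."""
    return {fn: (len(cats), sum(cats)) for fn, cats in core.items()}


summary = summarise(SUBCATEGORIES)
total_categories = sum(c for c, _ in summary.values())
total_subcategories = sum(s for _, s in summary.values())

for fn, (cats, subs) in summary.items():
    print(f"{fn:8} {cats} categories, {subs} subcategories")
print(f"TOTAL    {total_categories} categories, {total_subcategories} subcategories")
```

Keeping the counts as data makes it easy to re-run the check if a later revision of the framework adds or retires subcategories.
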
Around the Core sit Profiles (use-case or sector-specific implementations of the Core, such as the Generative AI Profile), and the AI RMF Playbook, which offers suggested actions, references and documentation for each subcategory. Part 1 of the framework also identifies four challenges that any AI risk management programme must address regardless of maturity: risk measurement, risk tolerance, risk prioritisation, and organisational integration and management of risk.
5. Master Assessment Checklist
This is the operational core of the CyberSigma assessment. Each function is enumerated below with its categories and every subcategory represented, together with what an auditor should verify and the typical evidence expected. Subcategory identifiers follow the NIST numbering convention (e.g. GOVERN 1.1). No control area is skipped.
GOVERN 1 — Policies, processes, procedures and practices
| What to verify | Typical evidence |
|---|---|
| GOVERN 1.1: Legal and regulatory requirements involving AI are understood, managed and documented. | Legal register mapping AI to GDPR/DPDP/EU AI Act/sector rules; counsel sign-off; regulatory watch log. |
| GOVERN 1.2: The characteristics of trustworthy AI are integrated into organisational policies, processes and practices. | AI governance policy referencing the seven trustworthiness characteristics; policy version history. |
| GOVERN 1.3: Processes, procedures and practices are in place to determine the needed level of risk management activities based on the organisation's risk tolerance. | Documented risk tolerance statement; tiering criteria; risk acceptance matrix. |
| GOVERN 1.4: The risk management process and its outcomes are established through transparent policies, procedures and other controls based on organisational risk priorities. | RACI-linked risk process; control catalogue; transparency register. |
| GOVERN 1.5: Ongoing monitoring and periodic review of the risk management process and its outcomes are planned, with responsibilities clearly defined. | Review calendar; monitoring dashboards; review meeting minutes and actions. |
| GOVERN 1.6: Mechanisms are in place to inventory AI systems and are resourced according to organisational risk priorities. | AI system inventory / model register with risk rating, owner, status. |
| GOVERN 1.7: Processes and procedures are in place for decommissioning and phasing out AI systems safely and in a manner that does not increase risk. | Decommissioning SOP; retired-model log; data disposal records. |
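
The inventory evidence for GOVERN 1.6, together with the review cadence in GOVERN 1.5 and the retired-model log in GOVERN 1.7, can be held in a simple structured register. The following is a minimal sketch under stated assumptions: the field names, the `Status` values, the 365-day review interval and the sample systems are all hypothetical and are not prescribed by NIST.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Status(Enum):
    PROPOSED = "proposed"
    IN_PRODUCTION = "in_production"
    RETIRED = "retired"


@dataclass
class InventoryEntry:
    system_id: str
    name: str
    owner: str            # accountable individual or team
    risk_rating: str      # e.g. "low", "moderate", "high"
    status: Status
    last_reviewed: date


def overdue_reviews(entries, today, max_age_days=365):
    """Return non-retired systems whose last review is older than max_age_days."""
    return [
        e for e in entries
        if e.status is not Status.RETIRED
        and (today - e.last_reviewed).days > max_age_days
    ]


# Illustrative register entries (invented for this example).
register = [
    InventoryEntry("AI-001", "Credit scoring model", "Risk Analytics", "high",
                   Status.IN_PRODUCTION, date(2023, 1, 15)),
    InventoryEntry("AI-002", "Meeting summariser", "IT Productivity", "low",
                   Status.IN_PRODUCTION, date(2024, 5, 1)),
    InventoryEntry("AI-003", "Legacy churn model", "Marketing", "moderate",
                   Status.RETIRED, date(2021, 3, 10)),
]

stale = overdue_reviews(register, today=date(2024, 6, 30))
print([e.system_id for e in stale])  # ['AI-001']
```

An auditor can ask for the equivalent query against the real register: any live system with no owner, no risk rating or an overdue review is a finding.
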
GOVERN 2 — Accountability structures
| What to verify | Typical evidence |
|---|---|
| GOVERN 2.1: Roles, responsibilities and lines of communication related to mapping, measuring and managing AI risks are documented and clear to individuals and teams. | RACI matrix; job descriptions; AI governance board charter. |
| GOVERN 2.2: The organisation's personnel and partners receive AI risk management training to enable them to perform their duties consistent with related policies. | Training curriculum; completion records; competency assessments. |
| GOVERN 2.3: Executive leadership takes responsibility for decisions about risks associated with AI system development and deployment. | Board/executive minutes; risk sign-off records; accountable executive designation. |
GOVERN 3 — Workforce diversity, equity, inclusion and accessibility
| What to verify | Typical evidence |
|---|---|
| GOVERN 3.1: Decision-making related to mapping, measuring and managing AI risks throughout the lifecycle is informed by a diverse team (demographic, disciplinary, experiential, expertise, background). | Team composition records; interdisciplinary review panel membership. |
| GOVERN 3.2: Policies and procedures are in place to define and differentiate roles and responsibilities for human-AI configurations and oversight of AI systems. | Human oversight policy; human-in-the-loop / on-the-loop definitions; override procedures. |
GOVERN 4 — Organisational culture of risk awareness
| What to verify | Typical evidence |
|---|---|
| GOVERN 4.1: Organisational policies and practices are in place to foster a critical thinking and safety-first mindset in the design, development, deployment and uses of AI systems to minimise potential negative impacts. | Safety-first policy; pre-deployment risk gate; culture survey results. |
| GOVERN 4.2: Organisational teams document the risks and potential impacts of the AI technology they design, develop, deploy, evaluate and use, and communicate about them more broadly. | Impact documentation; model cards; internal risk communications. |
| GOVERN 4.3: Organisational practices are in place to enable AI testing, identification of incidents and information sharing. | Incident reporting channel; red-team programme; information-sharing agreements. |
GOVERN 5 — Stakeholder engagement and feedback
| What to verify | Typical evidence |
|---|---|
| GOVERN 5.1: Organisational policies and practices are in place to collect, consider, prioritise and integrate feedback from those external to the team that developed or deployed the AI system. | Stakeholder engagement plan; feedback log; consultation records. |
| GOVERN 5.2: Mechanisms are established to enable the team that developed or deployed AI systems to regularly incorporate adjudicated feedback from relevant AI actors into system design and implementation. | Feedback triage / adjudication process; change tickets traced to feedback. |
GOVERN 6 — Third-party and supply-chain risk
| What to verify | Typical evidence |
|---|---|
| GOVERN 6.1: Policies and procedures are in place to address AI risks associated with third-party entities, including risks of infringement of third-party intellectual property or other rights. | Vendor AI risk policy; third-party assessment questionnaires; IP due-diligence. |
| GOVERN 6.2: Contingency processes are in place to handle failures or incidents in third-party data or AI systems deemed to be high-risk. | Fallback / contingency plan; third-party incident SLAs; exit strategy. |
MAP 1 — Context establishment
| What to verify | Typical evidence |
|---|---|
| MAP 1.1: Intended purposes, potentially beneficial uses, context-specific laws, norms and expectations, and prospective settings in which the AI system will be deployed are understood and documented. | Use-case charter; context of use document; deployment-setting analysis. |
| MAP 1.2: Interdisciplinary AI actors, competencies, skills and capacities for establishing context are identified, and knowledge is documented. | Skills matrix; RACI; documented domain-expert involvement. |
| MAP 1.3: The organisation's mission and relevant goals for the AI capability are understood and documented. | Business case; objectives alignment record. |
| MAP 1.4: The business value or context of business use has been clearly defined or, in the case of assessing existing AI systems, re-evaluated. | Value proposition; benefit-cost analysis; re-evaluation notes. |
| MAP 1.5: Organisational risk tolerances are determined and documented. | Risk tolerance statement per use case; acceptance thresholds. |
| MAP 1.6: System requirements (e.g. required functionality, freedom from harmful bias) are elicited from and understood by relevant AI actors; design decisions take socio-technical implications into account. | Requirements specification; socio-technical review; sign-off. |
MAP 2 — System categorisation
| What to verify | Typical evidence |
|---|---|
| MAP 2.1: The specific tasks and methods used to implement the tasks that the AI system will support are defined. | Task definition; ML method documentation; architecture diagram. |
| MAP 2.2: Information about the AI system's knowledge limits and how output will be utilised and overseen by humans is documented. | Model limitations statement; human-oversight design; usage guidance. |
| MAP 2.3: Scientific integrity and TEVV considerations are identified and documented, including with respect to experimental design, data collection and use, and the suitability of the methods used. | TEVV plan; experimental design records; data suitability assessment. |
MAP 3 — AI capabilities, benefits, costs and risks
| What to verify | Typical evidence |
|---|---|
| MAP 3.1: Potential benefits of intended AI system functionality and performance are examined and documented. | Benefits analysis; expected-value modelling. |
| MAP 3.2: Potential costs, including non-monetary costs, which result from expected or realised AI errors or system functionality and trustworthiness, are examined and documented. | Cost/harm analysis; failure-mode consequence catalogue. |
| MAP 3.3: Targeted application scope is specified and documented based on the system's capability, established context and appropriateness of the AI system. | Scope-of-use statement; appropriateness assessment; out-of-scope list. |
| MAP 3.4: Processes for operator and practitioner proficiency with AI system performance and trustworthiness, and relevant technical standards and certifications, are defined, assessed and documented. | Operator competency framework; certification records. |
| MAP 3.5: Processes for human oversight are defined, assessed and documented in accordance with organisational policies from the GOVERN function. | Human oversight procedures; escalation and override design. |
MAP 4 — Risks and benefits mapped for all components including third-party software and data
| What to verify | Typical evidence |
|---|---|
| MAP 4.1: Approaches for mapping AI technology and legal risks of its components, including the use of third-party data or software, are in place, followed and documented, as are risks of infringement of third-party intellectual property or other rights. | Component / SBOM-for-AI inventory; licence and IP review; data provenance. |
| MAP 4.2: Internal risk controls for the components of the AI system, including third-party AI technologies, are identified and documented. | Control mapping to components; third-party control attestations. |
MAP 5 — Impacts characterised
| What to verify | Typical evidence |
|---|---|
| MAP 5.1: Likelihood and magnitude of each identified impact (both potentially beneficial and harmful) based on expected use, past uses of AI systems in similar contexts, public incident reports, feedback or other data are identified and documented. | Impact assessment with likelihood x magnitude; reference to incident databases (e.g. AIID). |
| MAP 5.2: Practices and personnel for supporting regular engagement with relevant AI actors and integrating feedback about positive, negative and unanticipated impacts are in place and documented. | Ongoing engagement plan; feedback integration records. |
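
The likelihood x magnitude assessment expected under MAP 5.1 can be illustrated with a small scoring routine. This is a sketch, not a NIST-defined method: the 1-5 ordinal scales, the `Impact` class and the sample impacts are assumptions chosen for illustration, and a simple product is only one of several ways to combine the two dimensions.

```python
from dataclasses import dataclass


@dataclass
class Impact:
    description: str
    likelihood: int   # assumed ordinal scale: 1 (rare) .. 5 (almost certain)
    magnitude: int    # assumed ordinal scale: 1 (negligible) .. 5 (severe)
    beneficial: bool = False

    def score(self):
        """Composite score as likelihood multiplied by magnitude."""
        for value in (self.likelihood, self.magnitude):
            if not 1 <= value <= 5:
                raise ValueError("likelihood and magnitude must be in 1..5")
        return self.likelihood * self.magnitude


def rank_harms(impacts):
    """Harmful impacts ordered from highest to lowest score."""
    harms = [i for i in impacts if not i.beneficial]
    return sorted(harms, key=lambda i: i.score(), reverse=True)


# Invented examples covering both harmful and beneficial impacts.
impacts = [
    Impact("Slower manual fallback during outage", 4, 2),
    Impact("Wrongful loan denial for a protected group", 3, 5),
    Impact("Faster decisions for applicants", 5, 3, beneficial=True),
]

for impact in rank_harms(impacts):
    print(impact.score(), impact.description)
```

Beneficial impacts are recorded in the same structure because MAP 5.1 asks for both, but they are excluded from the harm ranking that feeds risk treatment.
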
MEASURE 1 — Appropriate methods and metrics identified and applied
| What to verify | Typical evidence |
|---|---|
| MEASURE 1.1: Approaches and metrics for measurement of AI risks enumerated during the MAP function are selected for implementation, starting with the most significant AI risks. The risks or trustworthiness characteristics that will not (or cannot) be measured are properly documented. | Metrics selection rationale; documented non-measurable risks; gap register. |
| MEASURE 1.2: Appropriateness of AI metrics and effectiveness of existing controls are regularly assessed and updated, including reports of errors and potential impacts on affected communities. | Metric review log; control effectiveness testing; error reports. |
| MEASURE 1.3: Internal experts who did not serve as front-line developers for the system and/or independent assessors are involved in regular assessments and updates. Domain experts, users, AI actors external to the team and affected communities are consulted in support of assessments as necessary per organisational risk tolerance. | Independent assessment records; external consultation evidence. |
MEASURE 2 — AI systems evaluated for trustworthiness characteristics
| What to verify | Typical evidence |
|---|---|
| MEASURE 2.1: Test sets, metrics and details about the tools used during TEVV are documented. | TEVV documentation; dataset datasheets; tooling logs. |
| MEASURE 2.2: Evaluations involving human subjects meet applicable requirements (including human subject protection) and are representative of the relevant population. | Human-subjects protocol; ethics/IRB approval; representativeness analysis. |
| MEASURE 2.3: AI system performance or assurance criteria are measured qualitatively or quantitatively and demonstrated for conditions similar to deployment settings. Measures are documented. | Performance test reports; assurance criteria evidence. |
| MEASURE 2.4: The functionality and behaviour of the AI system and its components are monitored when in production. | Production monitoring dashboards; drift/anomaly alerts. |
| MEASURE 2.5: The AI system to be deployed is demonstrated to be valid and reliable. Limitations of the generalisability beyond the conditions under which the technology was developed are documented. | Validity/reliability report; generalisability limitations statement. |
| MEASURE 2.6: The AI system is evaluated regularly for safety risks as identified in the MAP function. The AI system to be deployed is demonstrated to be safe, its residual negative risk does not exceed the risk tolerance, and it can fail safely, particularly if made to operate beyond its knowledge limits. | Safety test evidence; fail-safe demonstration; residual-risk sign-off. |
| MEASURE 2.7: AI system security and resilience are evaluated and documented. | Adversarial / red-team results; penetration test; resilience testing. |
| MEASURE 2.8: Risks associated with transparency and accountability are examined and documented. | Transparency artefacts; audit trail evaluation; model card review. |
| MEASURE 2.9: The AI model is explained, validated and documented, and AI system output is interpreted within its context to inform responsible use and governance. | Explainability report (e.g. SHAP/LIME); interpretation guidance. |
| MEASURE 2.10: Privacy risk of the AI system is examined and documented. | Privacy impact assessment (DPIA); data-minimisation evidence. |
| MEASURE 2.11: Fairness and bias, as identified in the MAP function, are evaluated and results are documented. | Bias/fairness test report; subgroup performance analysis. |
| MEASURE 2.12: Environmental impact and sustainability of AI model training and management activities are assessed and documented. | Compute/energy accounting; carbon footprint estimate. |
| MEASURE 2.13: Effectiveness of the employed TEVV metrics and processes in the MEASURE function are evaluated and documented. | TEVV effectiveness review; metric-of-metrics analysis. |
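
The drift alerts listed as evidence for MEASURE 2.4 need a concrete statistic behind them. One common choice is the Population Stability Index (PSI), sketched below in plain Python. The choice of PSI, the bin edges, the sample data and the 0.25 alert threshold are assumptions for illustration; the threshold is a widely used rule of thumb, not a NIST requirement.

```python
import math
from bisect import bisect_right


def bin_proportions(values, edges, eps):
    """Share of values in each bin; values outside the edges are ignored."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if edges[0] <= v <= edges[-1]:
            # The last bin is closed on the right so v == edges[-1] is counted.
            idx = min(bisect_right(edges, v) - 1, len(counts) - 1)
            counts[idx] += 1
    total = sum(counts)
    # eps keeps empty bins from producing log(0) or division by zero.
    return [max(c / total, eps) for c in counts]


def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index between a baseline and a live sample."""
    p = bin_proportions(expected, edges, eps)
    q = bin_proportions(actual, edges, eps)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


edges = [0.0, 0.25, 0.5, 0.75, 1.0]
baseline = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
live = [0.3, 0.6, 0.7, 0.8, 0.85, 0.9, 0.9, 0.95, 0.95, 0.99]

ALERT_THRESHOLD = 0.25  # assumed rule-of-thumb value
drift = psi(baseline, live, edges)
print(f"PSI = {drift:.2f}, alert = {drift > ALERT_THRESHOLD}")
```

Whatever statistic is used, the auditor should see the baseline it is computed against, the threshold, and a record of what happened when the threshold was crossed.
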
MEASURE 3 — Mechanisms for tracking identified AI risks over time
| What to verify | Typical evidence |
|---|---|
| MEASURE 3.1: Approaches, personnel and documentation are in place to regularly identify and track existing, unanticipated and emergent AI risks based on factors such as intended and actual performance in deployed contexts. | Risk-tracking register; emergent-risk review cadence. |
| MEASURE 3.2: Risk tracking approaches are considered for settings where AI risks are difficult to assess using currently available measurement techniques or where metrics are not yet available. | Documentation of hard-to-measure risks; qualitative tracking methods. |
| MEASURE 3.3: Feedback processes for end users and impacted communities to report problems and appeal system outcomes are established and integrated into AI system evaluation metrics. | User feedback / appeal mechanism; incident intake linked to metrics. |
MEASURE 4 — Feedback about measurement efficacy gathered and assessed
| What to verify | Typical evidence |
|---|---|
| MEASURE 4.1: Measurement approaches for identifying AI risks are connected to deployment context(s) and informed through consultation with domain experts and other end users. Approaches are documented. | Context-linked measurement plan; domain-expert consultation notes. |
| MEASURE 4.2: Measurement results regarding AI system trustworthiness in deployment context(s) and across the AI lifecycle are informed by input from domain experts and relevant AI actors to validate whether the system is performing consistently as intended. Results are documented. | Validation review with experts; consistency-of-performance evidence. |
| MEASURE 4.3: Measurable performance improvements or declines based on consultations with relevant AI actors, including affected communities, and field data about context-relevant risks and trustworthiness characteristics are identified and documented. | Trend analysis; before/after improvement records; field-data reports. |
MANAGE 1 — AI risks prioritised, responded to and managed
| What to verify | Typical evidence |
|---|---|
| MANAGE 1.1: A determination is made as to whether the AI system achieves its intended purposes and stated objectives and whether its development or deployment should proceed. | Go / no-go decision record; objective-achievement evidence. |
| MANAGE 1.2: Treatment of documented AI risks is prioritised based on impact, likelihood and available resources or methods. | Prioritised risk treatment plan; resourcing decisions. |
| MANAGE 1.3: Responses to the AI risks deemed high priority, as identified by the MAP function, are developed, planned and documented. Risk response options can include mitigating, transferring, avoiding or accepting. | Risk response plans; risk acceptance sign-offs; transfer (insurance) records. |
| MANAGE 1.4: Negative residual risks (defined as the sum of all unmitigated risks) to both downstream acquirers of AI systems and end users are documented. | Residual-risk statement; downstream disclosure documentation. |
MANAGE 2 — Strategies to maximise benefits and minimise negative impacts
| What to verify | Typical evidence |
|---|---|
| MANAGE 2.1: Resources required to manage AI risks are taken into account, along with viable non-AI alternative systems, approaches or methods, to reduce the magnitude or likelihood of potential impacts. | Resource plan; non-AI alternative analysis. |
| MANAGE 2.2: Mechanisms are in place and applied to sustain the value of deployed AI systems. | Value-sustainment plan; maintenance / retraining schedule. |
| MANAGE 2.3: Procedures are followed to respond to and recover from a previously unknown risk when it is identified. | Novel-risk response SOP; recovery playbook; after-action reviews. |
| MANAGE 2.4: Mechanisms are in place and applied, and responsibilities are assigned and understood, to supersede, disengage or deactivate AI systems that demonstrate performance or outcomes inconsistent with intended use. | Kill-switch / deactivation procedure; assigned responsibilities; tested rollback. |
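
The deactivation mechanism in MANAGE 2.4 is often implemented as a circuit breaker around the model. The sketch below shows one possible shape under stated assumptions: the `ModelCircuitBreaker` class, the three-breach window, the 0.90 quality floor and the "refer-to-human" fallback are all hypothetical, and a production version would persist its state and audit log.

```python
class ModelCircuitBreaker:
    """Routes traffic to a fallback once monitored quality drops too low."""

    def __init__(self, model, fallback, min_score, window=3):
        self.model = model
        self.fallback = fallback
        self.min_score = min_score
        self.window = window      # consecutive breaches before tripping
        self.breaches = 0
        self.active = True
        self.audit_log = []

    def record_score(self, score):
        """Feed in a monitored metric, e.g. rolling accuracy."""
        self.breaches = self.breaches + 1 if score < self.min_score else 0
        if self.active and self.breaches >= self.window:
            self.deactivate(
                f"{self.breaches} consecutive scores below {self.min_score}"
            )

    def deactivate(self, reason, actor="automated-monitor"):
        self.active = False
        self.audit_log.append(
            {"action": "deactivate", "actor": actor, "reason": reason}
        )

    def reactivate(self, actor):
        """Re-enabling is a manual, attributed decision."""
        self.active = True
        self.breaches = 0
        self.audit_log.append(
            {"action": "reactivate", "actor": actor, "reason": "manual"}
        )

    def predict(self, x):
        return self.model(x) if self.active else self.fallback(x)


breaker = ModelCircuitBreaker(
    model=lambda x: "model-decision",
    fallback=lambda x: "refer-to-human",
    min_score=0.90,
)
for score in (0.95, 0.88, 0.87, 0.85):
    breaker.record_score(score)
print(breaker.active, breaker.predict({"applicant": 1}))
```

The design deliberately makes deactivation automatic but reactivation manual and attributed, which gives the auditor both the tested mechanism and the assigned responsibility that MANAGE 2.4 asks for.
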
MANAGE 3 — AI risks from third parties managed
| What to verify | Typical evidence |
|---|---|
| MANAGE 3.1: AI risks and benefits from third-party resources are regularly monitored, and risk controls are applied and documented. | Third-party monitoring reports; applied control evidence. |
| MANAGE 3.2: Pre-trained models which are used for development are monitored as part of AI system regular monitoring and maintenance. | Foundation/pre-trained model monitoring; model-update change log. |
MANAGE 4 — Risk treatments documented, monitored and communicated
| What to verify | Typical evidence |
|---|---|
| MANAGE 4.1: Post-deployment AI system monitoring plans are implemented, including mechanisms for capturing and evaluating input from users and other relevant AI actors, appeal and override, decommissioning, incident response, recovery and change management. | Post-deployment monitoring plan covering appeal, override, decommission, incident, recovery, change. |
| MANAGE 4.2: Measurable activities for continual improvements are integrated into AI system updates and include regular engagement with interested parties, including relevant AI actors. | Continual improvement backlog; stakeholder engagement records tied to updates. |
| MANAGE 4.3: Incidents and errors are communicated to relevant AI actors, including affected communities. Processes for tracking, responding to and recovering from incidents and errors are followed and documented. | Incident communication log; incident register; recovery evidence. |
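
During fieldwork, the checklist tables above translate naturally into a status record per subcategory, from which an evidence-gap summary can be produced. The snippet below is a minimal sketch: the status labels ("evidenced", "partial", "missing"), the `gaps_by_function` helper and the sample results are hypothetical and shown only to illustrate the roll-up.

```python
from collections import Counter

# Hypothetical assessment results keyed by NIST subcategory identifier.
results = {
    "GOVERN 1.1": "evidenced",
    "GOVERN 1.6": "partial",
    "MAP 5.1": "evidenced",
    "MEASURE 2.7": "missing",
    "MANAGE 2.4": "missing",
}


def gaps_by_function(results):
    """Count subcategories per function that lack full evidence."""
    gaps = Counter()
    for subcategory, status in results.items():
        function = subcategory.split()[0]  # "GOVERN 1.6" -> "GOVERN"
        if status != "evidenced":
            gaps[function] += 1
    return dict(gaps)


print(gaps_by_function(results))
```
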
Generative AI Profile (NIST AI 600-1) additional risk areas
For clients deploying generative or dual-use foundation models, the Generative AI Profile (July 2024) extends the Core with 12 categories of GAI-specific risk. These should be assessed alongside the four functions above.
| What to verify | Typical evidence |
|---|---|
| CBRN information or capabilities; dangerous, violent or hateful content risks are assessed and mitigated. | Content-safety evaluation; misuse red-team; guardrail configuration. |
| Confabulation (hallucination), data privacy, information integrity and information security risks are measured. | Groundedness/factuality tests; PII leakage tests; provenance controls (e.g. content credentials); prompt-injection and data-poisoning tests. |
| Harmful bias, environmental and human-AI configuration risks are managed for GAI. | Bias evaluation on generations; compute footprint; over-reliance / automation-bias controls. |
| Intellectual property, obscene/abusive content, and value-chain / component-integration risks are addressed. | IP/training-data provenance; CSAM/NCII filtering; supply-chain attestations. |
6. Scoping, Materiality and Tiering
Because the AI RMF is outcomes-based rather than prescriptive, scoping is central. The organisation must determine, for each AI use case, the level of rigour applied to each function proportionate to context and risk tolerance. NIST does not define fixed risk tiers, but effective programmes adopt a tiering scheme aligned to potential impact.
| Risk tier | Illustrative criteria | RMF rigour applied |
|---|---|---|
| Minimal / low | Internal productivity tools; no impact on rights, safety or material decisions; easily reversible outputs. | Lightweight Govern + inventory; self-attested Map; periodic review. |
| Limited / moderate | Assists human decisions; limited personal-data use; contained blast radius. | Full Map and Measure on key trustworthiness characteristics; documented human oversight. |
| High | Materially affects individuals' rights, access, safety, finances or employment; regulated context. | Full four-function treatment; independent TEVV; DPIA; residual-risk sign-off by accountable executive. |
| Unacceptable / prohibited | Uses barred by law or policy (e.g. certain EU AI Act prohibited practices, social scoring). | Do not deploy; document decision and rationale. |
Materiality should draw on the three harm categories (people, organisations, ecosystems) and on likelihood x magnitude analysis from MAP 5.1. Where measurement is not feasible (MEASURE 1.1, 3.2), the residual uncertainty itself becomes a documented risk that raises the effective tier.
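
The tiering logic, including the rule that unmeasurable risk raises the effective tier, can be expressed as a short screening function. This is a sketch of one possible implementation: the three yes/no screening questions, the one-step uplift and the cap at "high" are assumptions of this example, since NIST does not define tiers.

```python
TIERS = ["minimal", "limited", "high", "unacceptable"]


def assign_tier(prohibited, affects_rights_or_safety, assists_decisions,
                measurable=True):
    """Map screening answers to a tier; unmeasurable risk raises it one step."""
    if prohibited:
        return "unacceptable"
    if affects_rights_or_safety:
        tier = "high"
    elif assists_decisions:
        tier = "limited"
    else:
        tier = "minimal"
    if not measurable:
        # Capped at "high": only a legal or policy bar makes a use prohibited.
        bumped = min(TIERS.index(tier) + 1, TIERS.index("high"))
        tier = TIERS[bumped]
    return tier


# An internal productivity tool whose risks can be measured.
print(assign_tier(False, False, False))
# A decision-support tool where key risks cannot yet be measured.
print(assign_tier(False, False, True, measurable=False))
```

Recording the screening answers alongside the resulting tier gives the auditor a traceable rationale for the level of rigour applied to each system.
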
7. Implementation Approach (Phased)
Phase 1 — Establish governance (weeks 0-6)
- Activities: appoint an accountable executive and AI governance body; draft AI risk policy referencing the seven trustworthiness characteristics; define risk tolerance; stand up the AI system inventory.
- Deliverables: AI governance charter; AI risk management policy; risk-tolerance statement; initial model register (satisfies GOVERN 1.1-1.6, 2.1-2.3).
Phase 2 — Map context and risks (weeks 4-12)
- Activities: run use-case intake workshops; document context, purpose, legal constraints and business value; build component/data inventory including third parties; conduct impact assessments (likelihood x magnitude).
- Deliverables: per-system context charter; AI-SBOM and data provenance record; impact assessment; risk register (satisfies MAP 1-5).
Phase 3 — Measure trustworthiness (weeks 8-20)
- Activities: select metrics; run TEVV for validity, safety, security, explainability, privacy, fairness and environmental impact; involve independent assessors; establish production monitoring.
- Deliverables: TEVV report; bias/fairness and adversarial test results; model cards; monitoring dashboards (satisfies MEASURE 1-4).
Phase 4 — Manage and respond (weeks 16-24)
- Activities: prioritise and treat risks; document residual risk; implement deactivation/rollback and incident response; establish appeal and override; set third-party and pre-trained-model monitoring.
- Deliverables: risk treatment and response plans; residual-risk sign-off; incident-response runbook; post-deployment monitoring plan (satisfies MANAGE 1-4).
Phase 5 — Operate, improve and assure (ongoing)
- Activities: continuous monitoring, periodic re-assessment, feedback integration, metric review, and independent internal audit; maintain profiles for new use cases.
- Deliverables: continual improvement backlog; periodic assurance reports; updated Profiles (satisfies GOVERN 1.5, MEASURE 4, MANAGE 4).
8. Maturity / Capability Model
NIST does not publish an official maturity model for the AI RMF, but CyberSigma applies a five-level capability model to benchmark programmes and set improvement targets. Levels are assessed per function and rolled up to an overall rating.
| Level | Name | Characteristics |
|---|---|---|
| 1 | Initial / ad hoc | AI use is largely undocumented; no inventory; risk handled reactively; trustworthiness characteristics not defined. |
| 2 | Developing | Policy drafted; partial inventory; some MAP activity on flagship systems; measurement inconsistent. |
| 3 | Defined | All four functions operational and documented; risk tiering applied; TEVV run on high-risk systems; roles clear. |
| 4 | Managed / measured | Quantitative metrics tracked; independent assessment routine; production monitoring with drift/incident alerting; third-party risk governed. |
| 5 | Optimising | Continuous improvement embedded; profiles maintained per use case; feedback loops from affected communities; programme benchmarked and externally assured. |
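As an illustration of the per-function roll-up, the sketch below takes one level per function and derives an overall rating. The guide does not specify the roll-up rule, so the weakest-link rule used here (overall rating equals the lowest function level) is an assumption, and `roll_up` and `gaps_to_target` are hypothetical helper names.

```python
FUNCTIONS = ("GOVERN", "MAP", "MEASURE", "MANAGE")

LEVEL_NAMES = {
    1: "Initial / ad hoc",
    2: "Developing",
    3: "Defined",
    4: "Managed / measured",
    5: "Optimising",
}


def _validate(levels: dict) -> None:
    """Require one valid level (1-5) for each of the four functions."""
    for fn in FUNCTIONS:
        if fn not in levels:
            raise ValueError(f"missing level for {fn}")
        if levels[fn] not in LEVEL_NAMES:
            raise ValueError(f"invalid level for {fn}: {levels[fn]}")


def roll_up(levels: dict) -> tuple:
    """Overall rating under an assumed weakest-link rule (the minimum level)."""
    _validate(levels)
    overall = min(levels[fn] for fn in FUNCTIONS)
    return overall, LEVEL_NAMES[overall]


def gaps_to_target(levels: dict, target: int) -> dict:
    """Number of levels each function must gain to reach the target."""
    _validate(levels)
    return {fn: target - levels[fn] for fn in FUNCTIONS if levels[fn] < target}
```

A programme rated GOVERN 3, MAP 3, MEASURE 2, MANAGE 4 would roll up to level 2 under this rule, with MEASURE as the single gap against a target of 3. An averaging rule would be a reasonable alternative where the programme prefers to reward partial progress.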
9. Assessment and Audit Approach
- Scope and plan: agree the AI systems in scope, the applicable Profiles (Core plus Generative AI Profile where relevant), and the assessment criteria (the 72 subcategories plus GAI extensions).
- Gather documentation: request policies, inventory, context charters, TEVV reports, model cards, risk registers and monitoring evidence (see evidence list below).
- Assess GOVERN: evaluate governance structures, accountability, training, culture, stakeholder engagement and third-party governance against subcategories.
- Assess MAP: verify context establishment, categorisation, benefit/cost/risk mapping and impact characterisation for each in-scope system.
- Assess MEASURE: examine metric selection, TEVV coverage across all seven trustworthiness characteristics, independence of assessors, and production monitoring.
- Assess MANAGE: test risk prioritisation and treatment, residual-risk documentation, deactivation/rollback, incident response and third-party monitoring.
- Test evidence: sample artefacts, re-perform selected tests (e.g. re-run a fairness metric), and interview AI actors to confirm operating effectiveness, not just design.
- Rate maturity: score each function against the capability model and identify gaps.
- Report: document findings, risk-rated gaps, root causes and a prioritised remediation roadmap; where appropriate, produce a Target Profile.
- Re-assess: agree a re-test cadence tied to system risk tier and change management.
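The final step, a re-test cadence tied to risk tier and change management, could be expressed as a simple scheduling rule. The sketch below is illustrative only: the intervals per tier and the 30-day window after a material change are hypothetical values, not figures from NIST or from this guide, and `next_reassessment` is an assumed helper name.

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical re-test intervals per risk tier, in days.
RETEST_INTERVAL_DAYS = {"minimal": 730, "limited": 365, "high": 180}

# Hypothetical window for re-testing after a material change.
CHANGE_WINDOW_DAYS = 30


def next_reassessment(
    tier: str,
    last_assessed: date,
    material_change: Optional[date] = None,
) -> date:
    """Return the date by which the system should next be re-assessed."""
    if tier not in RETEST_INTERVAL_DAYS:
        raise ValueError(f"no cadence defined for tier: {tier}")
    scheduled = last_assessed + timedelta(days=RETEST_INTERVAL_DAYS[tier])
    # A material change after the last assessment pulls the date forward
    # if the change-triggered deadline falls before the scheduled one.
    if material_change is not None and material_change > last_assessed:
        triggered = material_change + timedelta(days=CHANGE_WINDOW_DAYS)
        return min(scheduled, triggered)
    return scheduled
```

Calling `next_reassessment("high", date(2025, 1, 1))` returns the scheduled date 180 days later; passing `material_change=date(2025, 2, 1)` brings it forward to 30 days after the change. Prohibited uses are deliberately absent from the table because they are not deployed.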
10. Evidence Request List
- Governance: AI risk management policy; governance charter; risk-tolerance statement; RACI; training records; decommissioning SOP.
- Inventory and context: AI system / model register; per-system context charters; use-case intake forms; legal and regulatory register.
- Data and supply chain: data provenance and datasheets; AI-SBOM / component inventory; third-party contracts, attestations and IP due-diligence.
- Measurement / TEVV: test plans, datasets and tooling logs; validity, safety, security (adversarial/red-team), explainability, privacy (DPIA), fairness/bias and environmental-impact reports; model cards.
- Monitoring: production monitoring dashboards; drift/anomaly and incident alerts; performance trend reports.
- Risk management: risk register; treatment and response plans; residual-risk sign-offs; risk-acceptance records.
- Incident and feedback: incident register and communications; user appeal/override mechanism; stakeholder feedback and adjudication log.
- Assurance: independent assessment reports; internal audit findings; prior remediation evidence; Current and Target Profiles.
11. Roles and Responsibilities
| Role | Key responsibilities under the AI RMF |
|---|---|
| Board / accountable executive | Owns AI risk appetite; approves policy; takes responsibility for deploy decisions and residual risk (GOVERN 2.3, MANAGE 1.3). |
| AI governance body / committee | Sets policy, tiering and standards; reviews high-risk systems; oversees culture and stakeholder engagement (GOVERN 1-5). |
| Chief AI Officer / AI risk lead | Runs the RMF programme; maintains inventory and Profiles; coordinates Map, Measure and Manage across teams. |
| Data scientists / ML engineers | Deliver Map and Measure outcomes; document models, limitations and TEVV; implement monitoring and fail-safe design. |
| Independent assessors / TEVV team | Perform independent evaluation and validation not conflicted by front-line development (MEASURE 1.3). |
| Legal, privacy and compliance | Maintain legal register; run DPIAs; manage IP and regulatory alignment (GOVERN 1.1, MAP 4.1, MEASURE 2.10). |
| Security / red team | Evaluate security and resilience; adversarial testing; incident response (MEASURE 2.7, MANAGE 4.3). |
| Product / system owners | Own deployment decisions, oversight configuration, appeal/override and decommissioning for their systems. |
| Procurement / vendor management | Govern third-party and pre-trained-model risk contractually and operationally (GOVERN 6, MANAGE 3). |
12. KPIs and Metrics to Track
- Percentage of AI systems in the inventory with a completed context charter and risk tier.
- Percentage of high-risk systems with completed TEVV across all seven trustworthiness characteristics.
- Number and ageing of open high-priority AI risks; residual-risk sign-off coverage.
- Model performance and drift metrics against thresholds; number of drift/anomaly alerts and mean time to respond.
- Fairness / bias metrics (e.g. subgroup performance disparity, equal-opportunity difference) versus targets.
- Security posture: number of adversarial findings, red-team coverage, mean time to remediate.
- Incident metrics: number of AI incidents, time to detect, time to communicate to affected actors, time to recover.
- Third-party coverage: percentage of third-party / pre-trained models under active monitoring with valid attestations.
- Training coverage: percentage of relevant personnel completing AI risk training.
- Feedback and appeals: volume, resolution time, and percentage of feedback integrated into system updates.
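Two of the metrics above lend themselves to a short worked example: a coverage percentage over the inventory, and the equal-opportunity difference (the gap in true-positive rate between two subgroups). The sketch below uses plain Python and assumes binary labels and predictions (1 = positive); the inventory field names and the helper names are hypothetical.

```python
def coverage_pct(items, predicate) -> float:
    """Percentage of items satisfying the predicate (0.0 for an empty list)."""
    items = list(items)
    if not items:
        return 0.0
    return 100.0 * sum(1 for item in items if predicate(item)) / len(items)


def true_positive_rate(y_true, y_pred) -> float:
    """Share of actual positives (label 1) that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not preds_for_positives:
        raise ValueError("no positive labels in this group")
    return sum(preds_for_positives) / len(preds_for_positives)


def equal_opportunity_difference(y_true, y_pred, groups, group_a, group_b) -> float:
    """TPR(group_a) minus TPR(group_b); 0.0 means equal opportunity."""
    def subset(name):
        rows = [(t, p) for t, p, g in zip(y_true, y_pred, groups) if g == name]
        return [t for t, _ in rows], [p for _, p in rows]

    return true_positive_rate(*subset(group_a)) - true_positive_rate(*subset(group_b))


# Example: share of inventory entries with both a context charter and a tier.
inventory = [
    {"name": "chatbot", "charter": True, "tier": "limited"},
    {"name": "cv-screener", "charter": True, "tier": None},
    {"name": "forecasting", "charter": False, "tier": "minimal"},
    {"name": "fraud-model", "charter": True, "tier": "high"},
]
charter_and_tier = coverage_pct(
    inventory, lambda s: s["charter"] and s["tier"] is not None
)
```

In practice the fairness figure would be compared against a target agreed in the risk-tolerance statement, and an established fairness library would normally replace hand-rolled helpers for production measurement.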
13. Readiness Checklist
- Accountable executive and AI governance body appointed and chartered.
- AI risk management policy references the seven trustworthiness characteristics and is approved.
- Documented risk-tolerance statement and risk-tiering scheme in place.
- Complete AI system / model inventory with owner, tier and status.
- Per-system context charters completed (purpose, legal constraints, business value).
- Component / AI-SBOM inventory and data provenance documented, including third parties.
- Impact assessments (likelihood x magnitude) completed for in-scope systems.
- TEVV performed across validity, safety, security, explainability, privacy, fairness and environmental impact for high-risk systems.
- Independent (non-front-line) assessment involved for high-risk systems.
- Production monitoring with drift, anomaly and incident alerting operational.
- Human oversight, appeal, override and deactivation/rollback mechanisms defined and tested.
- Residual risk documented and signed off; downstream disclosure prepared.
- Incident response and communication process to affected actors established.
- Third-party and pre-trained-model monitoring and contingency plans in place.
- Generative AI Profile risks assessed where GAI/foundation models are used.
- Periodic re-assessment cadence and continual improvement backlog established.
14. Common Gaps and Findings
- No authoritative AI inventory, so shadow AI and third-party model use are invisible (GOVERN 1.6 fails).
- Policy exists on paper but is not operationalised; trustworthiness characteristics never translated into testable criteria.
- MAP treated as a one-off form-filling exercise; impact assessments lack likelihood x magnitude rigour and ignore societal harm.
- MEASURE limited to accuracy; safety, security (adversarial), privacy, fairness and environmental impact untested.
- Assessments run only by front-line developers, breaching the independence expectation of MEASURE 1.3.
- Residual risk not documented, and downstream acquirers/end users not informed (MANAGE 1.4).
- No tested deactivation/rollback or kill-switch, and no defined human override (MANAGE 2.4, MAP 3.5).
- Third-party and pre-trained (foundation) models unmonitored after integration (MANAGE 3.1-3.2).
- Generative-AI-specific risks (confabulation, data leakage, IP, harmful content) not addressed via the GAI Profile.
- No feedback/appeal channel for affected communities, and feedback not integrated into updates (MEASURE 3.3, MANAGE 4.2).
- Decommissioning done ad hoc, leaving residual data and access risk (GOVERN 1.7).
- No periodic re-assessment tied to change management, so drift and emergent risk go unmanaged.
15. NIST AI RMF Mapped to Other Frameworks
| NIST AI RMF element | EU AI Act | ISO/IEC 42001:2023 | ISO/IEC 23894 / other |
|---|---|---|---|
| GOVERN (policy, accountability, culture) | Art. 9 risk management system; Art. 17 quality management system; governance and human oversight duties | Clauses 5-6 (leadership, AI policy, objectives); Annex A controls A.2-A.3 | ISO 23894 governance of AI risk; ISO 31000 principles |
| MAP (context, categorisation, impact) | Art. 6-7 risk classification; Annex III high-risk uses; fundamental-rights impact assessment (Art. 27) | Clause 6.1 AI risk assessment; AI system impact assessment (A.5) | ISO 23894 risk identification; ISO 24028 trustworthiness concepts |
| MEASURE (TEVV, trustworthiness testing) | Art. 8-15 requirements: accuracy, robustness, cybersecurity, data governance, transparency; conformity assessment (Art. 43) | Clause 9 performance evaluation; A.7 data for AI systems and A.8 information for interested parties | ISO/IEC TR 24029 robustness; ISO/IEC TR 24027 bias; ISO/IEC 42001 A.6 |
| MANAGE (treatment, incidents, monitoring) | Art. 72 post-market monitoring; Art. 73 serious-incident reporting; corrective actions (Art. 20) | Clause 10 improvement; A.9 use of AI systems and A.6 life-cycle controls | NIST SP 800-53 / CSF for security controls; ISO/IEC 27001 for ISMS |
| Trustworthiness: secure and resilient | Art. 15 cybersecurity requirements | Annex A security-related controls | NIST CSF 2.0; NIST 800-53; NIST AML taxonomy (AI 100-2) |
| Trustworthiness: privacy-enhanced | Interplay with GDPR / DPDP obligations | A.5 impact assessment plus privacy-related controls | NIST Privacy Framework; GDPR Art. 35 DPIA; DPDP Act 2023 |
| Generative AI Profile (NIST-AI-600-1) | GPAI / foundation-model duties (Art. 51-55) and systemic-risk models | 42001 controls applied to generative use cases | ISO/IEC 42001 Annex A; content-provenance standards (C2PA) |
16. How CyberSigma Helps
Partner with CyberSigma on NIST AI RMF
CyberSigma's CERT-In empanelled and PCI QSA-led advisory team operationalises the NIST AI RMF end-to-end: building your AI inventory and governance structures, running MAP context and impact assessments, executing independent TEVV across all seven trustworthiness characteristics (including adversarial, privacy, fairness and Generative AI Profile testing), and standing up production monitoring, incident response and residual-risk governance. We benchmark your programme against our five-level capability model, produce Current and Target Profiles, and map the RMF to the EU AI Act, ISO/IEC 42001, the DPDP Act and NIST CSF so a single control effort satisfies multiple obligations. Engage CyberSigma to turn AI risk from an audit finding into a demonstrable, assured competitive advantage.