Vaikora › Blog › Compliance & Audit

AI Governance for Financial Services: Compliance Guide

Compliance & Audit · June 30, 2026 · 12 min read

AI governance in financial services means establishing documented policies, ongoing monitoring, and decision logs for every AI system an institution deploys. Banks, insurers, and investment firms must treat large language models and AI agents the same way they manage quantitative models under SR 11-7: validate them before use, document their limitations, monitor their outputs in production, and retain evidence of risk mitigation. The Federal Reserve and OCC issued SR 11-7 to prevent model risk from hidden assumptions and data quality failures; that framework extends directly to generative AI, where hallucinations, prompt injection, and model drift pose similar governance gaps. Runtime controls that log, audit, and constrain AI decisions at decision time are becoming operational requirements, not optional enhancements.

What is AI Governance in Financial Services?

AI governance is the set of policies, controls, and processes that ensure an institution's AI systems operate within regulatory expectations and organizational risk tolerance. It spans model validation, ongoing monitoring, documentation of training data and limitations, and the audit trail that regulators expect to see when they examine AI deployments.

Financial services have always governed quantitative models. A trading desk's value-at-risk model, a credit-risk algorithm, or an anti-money-laundering detection system all require pre-deployment testing, documented assumptions, performance benchmarks, and post-deployment surveillance. The OCC and Federal Reserve codified this in SR 11-7 (Guidance on Model Risk Management), which requires banks to validate models before deployment, maintain model risk inventories, and monitor performance over time.

Generative AI models and large language models (LLMs) are now embedded in financial institutions: client-facing chatbots, internal research synthesis, regulatory filing automation, and even advisory workflows. Each introduces new governance questions: Who trained this model? What data was used? Can it be manipulated via prompt injection? Does it expose customer information? How do I prove to a regulator that I tested it? SR 11-7 applies to AI models just as it applies to traditional quantitative models, yet many institutions treat AI as a separate domain with weaker guardrails.

The SEC has signaled increasing scrutiny of AI systems in financial services. The Commission's May 2023 Investor Alert and public statements from SEC leadership in 2024-2025 emphasize disclosure obligations for AI systems used in investment advisory and trading. The SEC has also flagged "AI-washing", the practice of labeling a system as AI-driven without genuine algorithmic decision-making or documented risk controls. The expectation is clear: if AI is material to a financial product or service, disclose it; if you claim it drives decisions, prove you've validated and monitored it.

SR 11-7 and the Extension to Generative AI

SR 11-7, issued in 2011 and reaffirmed in recent supervisory letters, defines model risk as "the potential for loss arising from reliance upon a model that is inaccurate, misused, or misinterpreted." The guidance establishes three lines of defense:

First line: Model development teams must document model specifications, validate assumptions, test model outputs, and assess back-testing performance before deployment.

Second line: Model risk management functions (independent of development) must oversee the model inventory, validate pre-deployment testing, and design ongoing monitoring programs.

Third line: Internal audit must review the governance framework and confirm compliance.

Generative AI models present novel challenges under this framework. Unlike a traditional risk model with static parameters, an LLM or large language model responds to unstructured inputs and can produce inconsistent outputs across semantically identical prompts. Validation therefore requires adversarial testing: can you trick the model into ignoring guardrails? Does it leak customer data in responses? Can it be jailbroken via prompt injection?

Pre-deployment governance for AI in financial services now includes:

Data provenance audit: Identify the training data, including any third-party datasets; confirm no customer or confidential information was used unless explicitly authorized.
Adversarial testing: Run prompt injection, jailbreak, and extraction attempts to probe the model's boundaries.
Bias and fairness testing: Evaluate whether the model's outputs differ systematically by race, gender, age, or other protected attributes, particularly for customer-facing or lending use cases.
Output constraints: Define the domain the model should operate within and validate it refuses out-of-scope requests.

Post-deployment, SR 11-7 requires ongoing monitoring. For AI, this means logging every prompt and response, measuring output distribution drift, detecting anomalies, and retaining an immutable audit trail. If a customer complains an AI system gave them bad advice, or if a regulator asks how the system behaved on a particular date, you must be able to produce the exact input, the model's response, and evidence of what guardrails were applied.

SEC AI Guidance and Disclosure Obligations

The SEC has not yet issued a comprehensive AI rule for all financial institutions, but it has clarified expectations through public statements and enforcement guidance. The core principle is transparency and validation.

In 2024, the Commission emphasized that firms using AI in investment advisory, trading, or risk management must disclose material information about the AI system to clients and regulators. If your AI system makes or influences investment recommendations, clients deserve to know it; if it makes trading decisions, your prospectus should describe the system's constraints and monitoring. Withholding that information, or making inflated claims about an AI system's capabilities, falls under AI-washing and risks SEC enforcement.

The SEC also expects financial institutions to validate AI systems before deployment and have a documented process for monitoring them. In enforcement actions related to algorithmic trading and robo-advisory systems (even before generative AI became common), the Commission has fined firms for failing to maintain adequate testing records or for deploying systems without sufficient pre-release validation.

For fintech firms and digital asset platforms, the expectation is similar: if an AI system influences customer transactions, compliance decisions, or pricing, document how it works, validate its accuracy, and have evidence of ongoing monitoring. The bar for "documentation" includes not just high-level descriptions but technical specifications, test results, and performance metrics.

Model Risk Management for AI Models in Production

Applying model risk management to AI in financial services requires extending traditional governance frameworks to handle the speed, scale, and opacity of LLMs.

Inventory and classification: Maintain a registry of every AI system, its purpose, its users, and its risk category. A customer-facing chatbot that answers FAQ questions poses lower risk than an AI system that influences credit decisions. Classify accordingly.

Validation and testing: Before an AI system goes live, validate it against test sets that include adversarial examples, edge cases, and domain-specific scenarios. For a lending AI, test how it behaves when inputs contain missing data, contradictory information, or attempts to exploit the model. For a trading AI, test it on historical market stress scenarios.

Documentation: Create and maintain detailed documentation of the model's training data, architecture, known limitations, and the governance process used to validate and deploy it. This documentation is what regulators will ask to see.

Monitoring and alerting: In production, log every input and output. Measure key performance indicators: accuracy on a holdout test set, response latency, rate of refusals or errors. Set up alerts for anomalies, such as a sudden increase in refusals or a shift in the distribution of outputs. If the model's performance degrades, have a process to investigate and remediate.

Audit trail: Retain an immutable record of all decisions and the reasoning behind them. This is particularly important for financial services, where regulators and customers may later ask how a decision was made. The audit trail should include the model version, the input, the output, any constraints or guardrails applied, and a timestamp.

Regulatory Environment in 2026 and Beyond

As of mid-2026, the regulatory environment remains fluid but the direction is clear: AI systems in financial services will face governance expectations similar to traditional models, with additional scrutiny around transparency, bias, and model drift.

Federal banking agencies: The Federal Reserve, OCC, and FDIC have all issued or are preparing guidance on AI governance. SR 11-7 continues to apply. The agencies have also signaled that banks using AI should treat it as an extension of their model risk management framework, not as a separate, less-regulated category.

SEC: The Commission continues to pursue enforcement actions against firms that misrepresent AI capabilities or fail to disclose material information about AI systems. The focus is on investor protection and fair disclosure; expect more guidance on AI disclosure in investment advisory and trading contexts.

NIST AI Risk Management Framework (AI RMF): The National Institute of Standards and Technology released the AI RMF in early 2024, providing a structured approach to identifying and managing AI risks across multiple domains. Financial institutions increasingly reference it as a standard framework for governance.

International standards: ISO 42001 (AI Management System) and ISO 27001 (Information Security) are increasingly referenced in financial services audits. The EU AI Act (effective May 2024) imposes classification and governance requirements on high-risk AI applications, including those used in financial services. U.S. institutions should anticipate similar requirements may apply if they operate internationally.

OWASP LLM Top 10: The Open Web Application Security Project published the OWASP Top 10 for Large Language Models in 2023, covering injection attacks, data leakage, model poisoning, and other LLM-specific threats. It is widely adopted as a reference framework for AI security in regulated industries.

Addressing Key Compliance Gaps with Runtime Controls

Many financial institutions have validated their AI systems once at deployment but lack ongoing mechanisms to detect and prevent policy violations at runtime. This is where governance often breaks down: a model that passed pre-deployment testing may produce harmful outputs weeks later due to model drift, adversarial inputs, or changing data.

Runtime controls address this gap by monitoring every prompt and response in real time and enforcing policies before a response reaches a user. A runtime control can:

Detect prompt injection attempts and block them before they manipulate the model.
Constrain outputs to a defined set of topics or tones, refusing out-of-scope requests.
Log every decision into an immutable audit trail, providing regulators with a complete record of what the AI system did and why.
Apply compliance rules automatically, such as preventing the model from disclosing non-public information or making regulatory statements without approval.
Flag anomalies for human review, such as a sudden spike in requests or unusual patterns that may indicate an attack.

From a regulatory perspective, runtime controls also address a critical governance requirement: the audit trail. When a regulator asks how an AI system behaved on a particular date or for a particular customer, you must be able to provide:

The input (the user's prompt or query)
The model's response
Any policies or guardrails that were applied
The decision (allow, constrain, log, block)
A timestamp and version identifier

This is the audit evidence that completes the model risk management picture. Pre-deployment validation proves the model was tested; ongoing monitoring and audit trails prove it remained under control.

How Vaikora Helps

Vaikora provides runtime decision logging and policy enforcement for AI systems in regulated industries. The Vaikora gateway sits between an application and an LLM, evaluating every prompt and response against compliance and safety policies. Each decision is logged with a timestamp and version identifier, creating the kind of audit evidence that financial services regulators expect to see. The gateway also detects prompt injection, data exfiltration, and other LLM-specific threats at decision time, allowing institutions to enforce model risk policies automatically rather than relying on post-hoc review.

The open-core Vaikora gateway and guard MCP server are freely available for self-hosting; the commercial Control Plane adds hosted policy enforcement, compliance presets (SOC 2, HIPAA, GDPR, PCI DSS, ISO 27001), and an approvals workflow for high-risk decisions. This allows financial institutions to implement the audit trail and monitoring component of SR 11-7-compliant AI governance without building custom infrastructure.

Practical Steps to Implement AI Governance Now

1. Conduct an AI inventory. Identify every AI system your institution uses or plans to deploy: chatbots, research tools, decision-support systems, and content generation. Classify each by risk category based on customer impact and data sensitivity.

2. Define governance policies. For each AI system, write a policy that specifies: the model's intended use, the data it can access, the outputs it should produce, and the guardrails that prevent misuse. For a customer-facing chatbot, the policy might say: "Do not make recommendations about specific securities" or "Do not disclose customer account balances." For an internal research tool, it might say: "Do not generate text that cites confidential client information by name."

3. Plan pre-deployment validation. Design a testing program that includes functional testing (does the model work as intended?), adversarial testing (can it be tricked or jailbroken?), and bias testing (do outputs differ by protected attributes?). Document the results.

4. Implement runtime monitoring. Deploy logging and audit infrastructure that captures every decision: the input, the output, any constraints applied, and the timestamp. This is your audit trail.

5. Establish a review cadence. Plan for monthly or quarterly reviews of the AI system's performance and incident logs. If you detect anomalies or policy violations, investigate and remediate.

6. Prepare for regulator requests. Organize your documentation so you can quickly produce evidence to a regulator: the model validation report, the governance policy, the audit logs, and a summary of incidents and resolutions. The goal is to show that you validated the model, documented its limitations, monitored its behavior, and responded to problems.

Frequently asked questions

What are the AI governance requirements for banks?

Banks must treat AI systems as they do traditional models under SR 11-7: validate them before deployment, document assumptions and limitations, monitor performance in production, and maintain an audit trail. This includes adversarial testing, bias assessment, and ongoing surveillance for model drift or anomalies.

How does SR 11-7 apply to AI models?

SR 11-7 requires model validation, risk inventory management, and ongoing monitoring. For AI models, validation includes testing for prompt injection and jailbreaks; ongoing monitoring requires logging every decision and measuring output distribution for drift. The audit trail is the key compliance artifact.

What does the SEC require for AI systems in financial services?

The SEC requires disclosure of material information about AI systems and validation that claims about AI capabilities are accurate. If an AI system influences investment advice or trading decisions, firms must document how it works and have evidence of pre-deployment testing and ongoing monitoring.

How do you manage AI model risk in financial services?

AI model risk management combines pre-deployment validation (testing for accuracy, bias, and adversarial robustness), runtime monitoring (logging decisions and detecting anomalies), and documentation (recording assumptions, limitations, and test results). An immutable audit trail is essential for demonstrating control to regulators.

What is an audit trail for AI systems?

An audit trail is a log of every decision an AI system makes, including the input, the output, any policies applied, and a timestamp. For financial services, the audit trail is critical evidence that the AI system remained under control. Regulators expect to see it during examinations.

What are common AI governance failures in financial services?

Common failures include: validating a model once and then assuming it remains accurate, lacking guardrails to prevent out-of-scope outputs, failing to log decisions, not monitoring for model drift, and lacking documentation that a regulator can understand. These gaps allow AI systems to cause harm before anyone detects it.

Is generative AI regulated the same way as traditional models?

Generative AI is subject to the same governance frameworks as traditional models (SR 11-7, SEC disclosure rules, etc.) but requires additional safeguards for LLM-specific risks, such as prompt injection and hallucination. Pre-deployment testing must include adversarial scenarios; ongoing monitoring must detect unusual patterns or distribution shifts.

What tools or frameworks help with AI governance?

The NIST AI Risk Management Framework provides a structured approach to identifying and managing AI risks. OWASP LLM Top 10 covers security best practices. ISO 42001 and ISO 27001 provide standards for AI and information security management. Runtime policy enforcement tools help automate compliance monitoring and audit logging.

See Vaikora enforce policy on your AI

Open-core AI runtime control. Self-host the MIT gateway free, or run the hosted Control Plane.

Get a demo Self-host the gateway