Vaikora Python SDK: AI Runtime Control Setup Guide

Developer Guides · June 30, 2026 · 10 min read

An AI runtime control SDK for Python is a toolkit that intercepts language model calls and enforces security policies on them in real time, so you can ALLOW, LOG, CONSTRAIN, or BLOCK AI actions before they execute. Vaikora does this by routing LLM calls through a centralized gateway that validates each action against your security policies and records the decision in an audit log. Because the gateway works with OpenAI-compatible clients, you get deterministic controls and a record of every decision (an immutable audit chain when you use the Control Plane) without rewriting your core application logic.

Why Runtime Control Matters for Python AI Applications

Python has become the lingua franca for AI development. Teams use Python to prototype agents, build agentic workflows, connect language models to databases and APIs, and manage complex multi-step tasks. But integrating LLMs into production Python applications creates a new attack surface: agents can hallucinate, propose unauthorized actions, expose sensitive data, or fall victim to prompt injection and jailbreaks.

Traditional security approaches catch vulnerabilities in your code layer. Runtime control catches problems at the AI layer, applying policy at the moment of decision. A Python agent might propose to query a database or call an external API. Your policy evaluates that proposal in milliseconds and either lets it through, logs it, constrains the output, or blocks it entirely. This happens transparently, without forcing you to rebuild your application around a proprietary SDK.

Vaikora's architecture is open-core: only the gateway (vaikora-llm-gateway) and MCP server (vaikora-guard-mcp) are MIT-licensed and self-hostable. The Vaikora Control Plane (hosted infrastructure, audit chain, compliance presets, approvals workflows) is commercial and closed-source. This article covers the self-hosted gateway path, which is what most Python developers start with.

How the Vaikora Gateway Works with Python

The Vaikora gateway sits between your Python application and the LLM (OpenAI, Anthropic, or other OpenAI-compatible endpoints). Instead of calling OpenAI directly, your code calls the Vaikora gateway. The gateway evaluates the request against your policies, forwards it to your LLM, evaluates the response the same way, and applies one of four decisions: ALLOW (pass it through), LOG (allow it but record it), CONSTRAIN (modify the output), or BLOCK (reject it).
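The four outcomes can be pictured as a small dispatch step. The sketch below is purely illustrative: `Decision` and `apply_decision` are hypothetical names invented for this example, not part of Vaikora's API, and the "constrain" step is whatever callable you pass in.

```python
from enum import Enum
from typing import Callable, List, Optional, Tuple


class Decision(Enum):
    ALLOW = "ALLOW"
    LOG = "LOG"
    CONSTRAIN = "CONSTRAIN"
    BLOCK = "BLOCK"


def apply_decision(
    decision: Decision,
    text: str,
    audit: List[Tuple[str, str]],
    constrain: Callable[[str], str] = lambda s: s,
) -> Optional[str]:
    """Return what the caller would receive, or None when the action is blocked.

    Every decision is appended to `audit`, mirroring the idea that all
    decisions are recorded. ALLOW and LOG both pass the text through; this
    sketch distinguishes them only by the recorded label.
    """
    if decision is Decision.BLOCK:
        audit.append((decision.value, text))
        return None
    if decision is Decision.CONSTRAIN:
        text = constrain(text)  # e.g. a redaction function
    audit.append((decision.value, text))
    return text
```

A caller that receives `None` would typically show the user a refusal message instead of model output.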

From your Python code's perspective, the gateway looks like any other OpenAI-compatible server. You change one line in your OpenAI client configuration: the base_url. That single change enables policy enforcement, threat detection (prompt injection, jailbreak attempts, PII exposure, data exfiltration, toxicity), and full audit logging without touching your application logic.

The gateway is stateless and designed for low latency; in typical deployments, policy decisions complete in under a second. Because the gateway is OpenAI-compatible, it works with the OpenAI Python client library (which most Python teams already use) and with frameworks such as LangChain and LlamaIndex. Anthropic's models can be reached through OpenAI-compatible wrapper layers.

Setting Up the Vaikora Gateway Locally

The fastest way to get started is to run the Vaikora gateway in Docker. The gateway requires an LLM endpoint (OpenAI, Anthropic, or a local model server) and a policy file.

First, start the gateway:

docker run -d \
  -p 8000:8000 \
  -e VAIKORA_LLM_BASE_URL=https://api.openai.com/v1 \
  -e OPENAI_API_KEY=sk-... \
  -v "$(pwd)/policy.yaml:/etc/vaikora/policy.yaml" \
  vaikora/llm-gateway:latest

The gateway now listens on http://localhost:8000. Your Python client will call this address instead of the LLM directly.
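Containers can take a moment to start, so scripts that launch the gateway and then immediately call it may fail intermittently. One way to guard against that is a small wait loop, sketched here with only the standard library. `wait_for_port` is a hypothetical helper, and a successful TCP connection only shows that the port is open, not that the gateway has loaded your policy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 10.0) -> bool:
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection could be made before timeout_s elapsed.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # Open and immediately close a connection as a liveness probe.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # not up yet; retry shortly
    return False
```

In a startup script you might call `wait_for_port("localhost", 8000)` and abort if it returns `False`.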

Configuring Python to Use the Gateway

The Python integration is straightforward because the gateway speaks OpenAI's API. Install the OpenAI Python client if you haven't already:

pip install openai

Then configure your client to point at the gateway:

from openai import OpenAI

client = OpenAI(
    api_key="your-openai-key",  # still needed for LLM auth
    base_url="http://localhost:8000/v1"  # point at Vaikora gateway
)

# Now every call is inspected by Vaikora before reaching OpenAI
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "What's the weather?"}
    ]
)

print(response.choices[0].message.content)

That's it. Every request now flows through the Vaikora gateway, which runs your policies, logs the decision, and either allows the response or blocks it. No other changes to your code are needed.
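In practice you may not want the gateway address or the API key hard-coded. A minimal sketch, assuming you keep both in environment variables: `VAIKORA_GATEWAY_URL` is a hypothetical variable name chosen for this example (it is not a documented Vaikora setting), and `client_kwargs` is an illustrative helper.

```python
import os
from typing import Mapping

DEFAULT_GATEWAY_URL = "http://localhost:8000/v1"  # the local gateway address used in this guide


def client_kwargs(env: Mapping[str, str] = os.environ) -> dict:
    """Build keyword arguments for the OpenAI client from environment variables.

    VAIKORA_GATEWAY_URL is an assumed name for this sketch; rename it to
    whatever your deployment uses.
    """
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    # Strip a trailing slash so the URL is in a predictable form.
    base_url = env.get("VAIKORA_GATEWAY_URL", DEFAULT_GATEWAY_URL).rstrip("/")
    return {"api_key": api_key, "base_url": base_url}
```

The client is then created with `OpenAI(**client_kwargs())`, and switching between a local and a shared gateway becomes a deployment setting rather than a code change.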

Writing Policies for AI Security

Policies are the rules that Vaikora enforces. They define what kinds of actions your AI agents are permitted to take. A policy typically specifies:

What models are allowed
What data the model can read or write
What external systems the model can call
What kinds of outputs are acceptable (no PII, no jailbreak attempts, no toxic content)
What should be logged versus blocked

Here's a simple policy file in YAML format:

version: "1.0"
policies:
  - name: "default_policy"
    description: "Base policy for all agent calls"
    rules:
      - action: "llm_call"
        condition:
          model: ["gpt-4", "gpt-3.5-turbo"]
        decision: "ALLOW"

      - action: "function_call"
        condition:
          function: ["get_weather", "search_docs"]
        decision: "ALLOW"

      - action: "function_call"
        condition:
          function: ["delete_user", "drop_table"]
        decision: "BLOCK"
        reason: "Destructive operations not permitted in this context"

      - action: "output_contains"
        condition:
          threat_type: "pii_exposure"
        decision: "CONSTRAIN"
        constraint_type: "redact_pii"

      - action: "output_contains"
        condition:
          threat_type: "prompt_injection"
        decision: "BLOCK"
        reason: "Potential prompt injection detected"

Mount this policy file into your gateway container, and every decision will be evaluated against it. Policies are composable; you can write narrow policies for sensitive operations and broader ones for routine tasks.
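To make the rule structure concrete, here is a rough Python model of how rules like these could be evaluated. It assumes first-match-wins ordering and a default of BLOCK when nothing matches; both are assumptions made for this sketch, not documented gateway behavior, and `evaluate` is a hypothetical function.

```python
# The example policy's rules, expressed as Python dictionaries.
RULES = [
    {"action": "llm_call", "condition": {"model": ["gpt-4", "gpt-3.5-turbo"]}, "decision": "ALLOW"},
    {"action": "function_call", "condition": {"function": ["get_weather", "search_docs"]}, "decision": "ALLOW"},
    {"action": "function_call", "condition": {"function": ["delete_user", "drop_table"]}, "decision": "BLOCK"},
    {"action": "output_contains", "condition": {"threat_type": "pii_exposure"}, "decision": "CONSTRAIN"},
]


def evaluate(rules, action, attributes, default="BLOCK"):
    """Return the decision of the first rule whose action and conditions match.

    `attributes` describes the event, e.g. {"model": "gpt-4"}.
    Falls back to `default` when no rule matches (an assumed behavior).
    """
    for rule in rules:
        if rule["action"] != action:
            continue
        matched = True
        for key, allowed in rule.get("condition", {}).items():
            # A condition value may be a single string or a list of strings.
            allowed_values = allowed if isinstance(allowed, list) else [allowed]
            if attributes.get(key) not in allowed_values:
                matched = False
                break
        if matched:
            return rule["decision"]
    return default
```

Under these assumptions, a call to `drop_table` is blocked by the third rule, while a function that no rule mentions falls through to the default.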

Reading the Audit Log

Every decision the gateway makes is recorded. You can query the audit log to understand what your agents are doing and whether policies are being respected.

If you're running the Control Plane, the audit chain is cryptographically signed (using SHA-256 hashing) and immutable. If you're running the self-hosted gateway, audit logs are written to a local database or log file that you configure.

Here's how to retrieve audit events programmatically:

import requests

# Query the audit log for the last 100 decisions
response = requests.get(
    "http://localhost:8000/audit/events",
    params={"limit": 100, "order": "desc"},
    timeout=10,  # don't hang forever if the gateway is unreachable
)
response.raise_for_status()  # surface HTTP errors instead of parsing an error body

events = response.json()

for event in events:
    print(f"Decision: {event['decision']}")  # ALLOW, LOG, CONSTRAIN, BLOCK
    print(f"Reason: {event.get('reason')}")  # .get(): not every rule defines a reason
    print(f"Model: {event['model']}")
    print(f"Timestamp: {event['timestamp']}")
    print("---")
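Once the events are in memory, a quick tally per decision type is often more useful than reading them one by one. The following sketch assumes each event is a dictionary with a `decision` key, as in the loop above; `summarize_decisions` and the sample data are made up for illustration.

```python
from collections import Counter


def summarize_decisions(events):
    """Count audit events per decision label (ALLOW, LOG, CONSTRAIN, BLOCK)."""
    return Counter(event.get("decision", "UNKNOWN") for event in events)


# Hand-written sample events standing in for a real API response.
sample_events = [
    {"decision": "ALLOW", "model": "gpt-4"},
    {"decision": "BLOCK", "model": "gpt-4"},
    {"decision": "ALLOW", "model": "gpt-4"},
]
summary = summarize_decisions(sample_events)
```

A sudden rise in the BLOCK count is a reasonable signal to alert on, since it suggests either an attack or a policy that is too strict.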

The audit log is essential for compliance. It shows auditors exactly when and why the system allowed or blocked an action, and it's tamper-evident if you're using the Control Plane.
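For readers unfamiliar with how a log can reveal tampering, the general idea is a hash chain: each entry's hash covers both the entry and the previous entry's hash, so changing any record breaks every link after it. The sketch below illustrates that general technique with SHA-256. It is not Vaikora's actual audit format, and `chain_events` and `verify_chain` are hypothetical names.

```python
import hashlib
import json

GENESIS = "0" * 64  # fixed starting value used as the first entry's previous hash


def _entry_hash(event: dict, prev_hash: str) -> str:
    """Hash the canonical JSON of the event together with the previous hash."""
    payload = json.dumps(event, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def chain_events(events):
    """Wrap each event with the previous entry's hash and its own hash."""
    prev_hash = GENESIS
    chained = []
    for event in events:
        current = _entry_hash(event, prev_hash)
        chained.append({"event": event, "prev_hash": prev_hash, "hash": current})
        prev_hash = current
    return chained


def verify_chain(chained) -> bool:
    """Recompute every hash; return False if any entry or link was altered."""
    prev_hash = GENESIS
    for entry in chained:
        expected = _entry_hash(entry["event"], prev_hash)
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this only detects edits if the latest hash is stored or signed somewhere the editor cannot reach, which is one reason a hosted, signed audit chain is attractive for compliance work.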

Detecting Threats in Real Time

The Vaikora gateway includes built-in threat detection based on the OWASP Top 10 for LLM Applications and the MITRE ATLAS framework. It automatically detects:

Prompt injection: Inputs designed to override your original prompt
Jailbreaks: Attempts to trick the model into violating policy
PII exposure: The model trying to output personally identifiable information
Data exfiltration: Attempts to extract sensitive data from your application's context
Toxicity: Harmful, abusive, or illegal content

These threats are detected without any explicit policy rule. The gateway applies threat-detection models continuously, and when a threat is detected, your policy decides the outcome (ALLOW for a benign false positive, CONSTRAIN to redact sensitive data, BLOCK to stop the interaction).

For example, if a user tries to prompt-inject your agent and the gateway detects it, your policy might BLOCK the request entirely. If the model generates a response that contains an unintentional PII leak, your policy might CONSTRAIN it to redact the sensitive information before returning it to the user.
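As a rough picture of what a CONSTRAIN outcome does to a response, consider a toy redaction function. This is a deliberately simple regex sketch: it only recognizes email addresses and US Social Security number shapes, it is not how the gateway's detectors work, and `redact_pii` is a name made up for this example.

```python
import re

# Simplistic patterns for illustration; real PII detection needs far more than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_pii(text: str) -> str:
    """Replace email addresses and SSN-shaped numbers with placeholders."""
    text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    return SSN_RE.sub("[REDACTED_SSN]", text)
```

The user still receives an answer, just with the sensitive spans replaced, which is the practical difference between CONSTRAIN and BLOCK.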

How to Add Security to a Python AI Application

Adding Vaikora security to an existing Python application takes minutes:

1. Spin up the gateway using the Docker command above.
2. Write a policy file that defines what your agents are allowed to do.
3. Change one line in your Python code: set base_url="http://localhost:8000/v1" on your OpenAI client.
4. Test by running your application and checking the audit log.

If you're using LangChain or another framework, the same pattern applies: point the underlying LLM client at the Vaikora gateway, and policy enforcement is automatic.

Runtime control doesn't require you to rewrite your application. It sits transparently between your code and the LLM, intercepting decisions and applying policy at the moment of truth.

Enforcing AI Policies Programmatically

Policies can be static (written to a YAML file) or dynamic (managed through an API). If you need to change policies at runtime without restarting the gateway, the API approach is more flexible.

The Vaikora gateway exposes a policy management endpoint:

import requests

# Update a policy at runtime
policy_update = {
    "name": "default_policy",
    "rules": [
        {
            "action": "llm_call",
            "condition": {"model": ["gpt-4"]},
            "decision": "ALLOW"
        },
        {
            "action": "function_call",
            "condition": {"function": ["query_user_db"]},
            "decision": "LOG"  # Log all database queries
        }
    ]
}

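response = requests.post(
    "http://localhost:8000/policies",
    json=policy_update,
    timeout=10,  # fail fast if the gateway is unreachable
)
response.raise_for_status()  # raise on a 4xx/5xx instead of reporting it as an update

print(f"Policy updated: {response.status_code}")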

This lets you build control panels, dashboards, or automated policy-management systems that adapt policies based on real-time telemetry, user roles, or threat intelligence feeds.
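When policies are generated by code rather than written by hand, it is worth checking the payload before sending it. The sketch below validates only what this guide shows: a name, a non-empty rule list, an action per rule, and one of the four decisions. `validate_policy` is a hypothetical helper, and the gateway may enforce a stricter or different schema.

```python
VALID_DECISIONS = {"ALLOW", "LOG", "CONSTRAIN", "BLOCK"}


def validate_policy(policy: dict) -> list:
    """Return a list of problems; an empty list means the payload looks sane."""
    problems = []
    if not policy.get("name"):
        problems.append("policy has no name")
    rules = policy.get("rules")
    if not isinstance(rules, list) or not rules:
        problems.append("policy has no rules")
        return problems
    for index, rule in enumerate(rules):
        if "action" not in rule:
            problems.append(f"rule {index}: missing action")
        if rule.get("decision") not in VALID_DECISIONS:
            problems.append(f"rule {index}: invalid decision {rule.get('decision')!r}")
    return problems
```

Running this before the POST turns a typo such as "DENY" into a local error message instead of a rejected or silently misapplied policy.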

What Python Libraries Exist for AI Security?

Popular Python libraries and frameworks for AI security include:

Guardrails AI: An open-source runtime enforcement framework that validates LLM outputs against structured schemas and rules.
LiteLLM: A proxy layer that standardizes LLM API calls and supports routing through security middleware.
OWASP Top 10 for LLM Applications: A guidance framework (not a library) documenting the most critical vulnerabilities in LLM applications (prompt injection, data leakage, supply chain attacks, etc.).
Prompt-injection detectors: Libraries like Vigil and similar classifiers detect adversarial inputs.
Vaikora: Combines runtime policy enforcement with threat detection and audit logging (cryptographically signed when you use the Control Plane). It blocks or constrains actions in real time rather than only logging them after the fact.

Most existing libraries focus on input validation or post-hoc monitoring. Vaikora is distinguished by real-time policy decisions (ALLOW, LOG, CONSTRAIN, BLOCK) paired with cryptographically signed audit trails, so the evidence auditors ask for is produced as a by-product of enforcement.

Scaling Runtime Control in Production

For production deployments, you'll want to run the Vaikora gateway on a managed infrastructure layer, configure it to connect to your LLM (OpenAI, Anthropic, or local), and wire it into your CI/CD pipeline.

The Control Plane provides hosted gateway infrastructure, automatic scaling, geographic redundancy, and the audit chain. The self-hosted gateway can run on Kubernetes, ECS, or any containerized platform you prefer.

Key production considerations:

Latency: The gateway is designed for low-latency policy decisions. At scale, monitor gateway latency to ensure it doesn't bottleneck your application.
Fallback behavior: If the gateway is unavailable, decide whether to block all requests (fail-closed) or pass them through (fail-open). Your security policy should define this.
Policy versioning: Track policy changes over time. The audit log should include which policy version was active when each decision was made.
Compliance reporting: Automate compliance reports (audit log exports, policy change logs, threat-detection summaries) for auditors and regulators.
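The latency and fallback points can be combined in one small wrapper. This is a sketch under stated assumptions: `via_gateway` and `direct` are callables you supply, `GatewayUnavailable` is a hypothetical exception your own transport code would raise, and the default shown is fail-closed.

```python
import time


class GatewayUnavailable(Exception):
    """Raised by the caller's transport code when the gateway cannot be reached."""


def call_with_fallback(via_gateway, direct, fail_open=False):
    """Try the gateway first and measure how long the call takes.

    On GatewayUnavailable, either bypass the gateway (fail-open) or re-raise
    (fail-closed). Returns (result, elapsed_seconds, path), where path is
    "gateway" or "direct".
    """
    start = time.perf_counter()
    try:
        result = via_gateway()
        return result, time.perf_counter() - start, "gateway"
    except GatewayUnavailable:
        if not fail_open:
            raise  # fail-closed: no policy check means no LLM call
        result = direct()
        return result, time.perf_counter() - start, "direct"
```

Recording the returned path and elapsed time gives you the data needed to alert on both gateway slowdowns and any request that bypassed policy enforcement.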

Integrating with LangChain and Other Frameworks

If you're using LangChain, the integration is the same: point the LLM client at the Vaikora gateway.

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4",
    api_key="your-openai-key",
    base_url="http://localhost:8000/v1"  # Vaikora gateway
)

# All LangChain calls now go through Vaikora
response = llm.invoke("What are the top 3 risks in my supply chain?")
print(response.content)

LangChain is a black box from Vaikora's perspective. The gateway sits below it, inspecting every LLM call and response. This works because the langchain_openai integration uses the OpenAI Python client under the hood.

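If you're using LlamaIndex or any other framework that can target an OpenAI-compatible endpoint, the same pattern applies. Anthropic's models can be used the same way when they are reached through an OpenAI-compatible wrapper layer.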

Compliance and Regulatory Context

Runtime control addresses requirements from multiple compliance frameworks:

NIST AI RMF: Includes continuous monitoring and human review as controls in its MANAGE function, supporting risk-mitigation practices for AI systems.
ISO 42001 (AI management): Requires documented controls on AI systems. Policies and audit logs are your evidence.
EU AI Act: Requires high-risk AI systems to maintain documented human oversight and provide transparency to affected parties.
HIPAA (healthcare): The Security Rule requires audit controls that record and examine activity in systems handling protected health information. Vaikora's audit chain can supply that record for AI interactions.
PCI DSS (payment): Requires strong access controls. AI-native policies can enforce payment-specific rules.
GDPR (privacy): Requires controllers to keep records of processing and to be able to demonstrate compliance. The audit chain provides evidence of how AI calls handled personal data.

The Vaikora Control Plane includes pre-built compliance presets for SOC 2, HIPAA, GDPR, PCI DSS, and ISO 27001, which accelerate compliance certification by providing policy templates and pre-configured audit reports.

Frequently asked questions

How do you add security to a Python AI application?

Integrate Vaikora by pointing your OpenAI client to the Vaikora gateway (base_url="http://localhost:8000/v1"). Write a policy file defining allowed actions. The gateway then enforces policies in real time, logging every decision in an audit trail. This adds security without rewriting your application.

What Python libraries exist for AI security?

Popular options include Guardrails AI (open-source rule engine), LiteLLM (proxy layer), the OWASP Top 10 for LLM Applications guidance, and prompt-injection detection libraries. Vaikora is distinguished by runtime policy enforcement combined with cryptographically signed audit trails. Most alternatives focus on input validation or post-hoc logging, not real-time blocking and policy decisions.

How do you enforce AI policies programmatically?

Define policies in YAML (static) or via the Vaikora API (dynamic). Policies specify allowed models, functions, data access, and threat responses. The gateway evaluates every LLM call and response against these rules, returning ALLOW, LOG, CONSTRAIN, or BLOCK. You can update policies at runtime without restarting.

How do you integrate AI audit logging with Python?

Vaikora automatically logs every decision to the audit trail. Query the audit log via the REST API (/audit/events), filter by timestamp or decision type, and export it for compliance reports. The audit trail is cryptographically signed if you're using the Control Plane, which makes the records tamper-evident.

Is the Vaikora gateway suitable for production use?

Yes. The open-core gateway is MIT-licensed and production-ready. For hosted infrastructure, automatic scaling, geographic redundancy, and the cryptographically-signed audit chain, the Vaikora Control Plane provides enterprise-grade SLAs.

What's the performance impact of runtime control?

The Vaikora gateway is designed for low-latency policy decisions. At scale, monitor gateway latency and configure fallback behavior (fail-closed or fail-open) in case the gateway becomes unavailable.

Can I use Vaikora with local LLMs?

Yes. Configure VAIKORA_LLM_BASE_URL to point at any OpenAI-compatible server, including local LLM inference engines (Ollama, vLLM, Text Generation WebUI). The policy enforcement and audit logging work the same way.
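One practical wrinkle when the gateway runs in Docker and the model server runs on the host: inside the container, `localhost` refers to the container itself. The sketch below rewrites a local URL to use `host.docker.internal`, the host alias Docker Desktop provides (on Linux it usually has to be added with `--add-host=host.docker.internal:host-gateway`). The listed ports are the defaults these servers commonly use and should be verified against your own setup; `upstream_url_for_container` is a hypothetical helper.

```python
from urllib.parse import urlsplit, urlunsplit

# Commonly used default OpenAI-compatible endpoints; confirm for your installation.
LOCAL_DEFAULTS = {
    "ollama": "http://localhost:11434/v1",
    "vllm": "http://localhost:8000/v1",
    "text-generation-webui": "http://localhost:5000/v1",
}


def upstream_url_for_container(url: str, host_alias: str = "host.docker.internal") -> str:
    """Rewrite a localhost URL so a containerized gateway can reach a server on the host.

    URLs that already point at another host are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.hostname not in ("localhost", "127.0.0.1"):
        return url
    netloc = host_alias if parts.port is None else f"{host_alias}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

The result is the value you would pass as VAIKORA_LLM_BASE_URL. Note that vLLM's usual port, 8000, is the same port this guide publishes for the gateway, so one of the two needs to move if both run on the same machine.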
