AI Security Data Sources: Logs and Telemetry for SOC
AI security telemetry is the structured record of every AI agent action, model invocation, and security decision a system makes. SOC teams need five core log types to monitor AI: gateway decision records (allow/block/constrain/log verdicts), prompt and response metadata (input content, output tokens, refusals), tool-call events (what functions the agent invoked and with what parameters), policy-violation signals (attempted jailbreaks, prompt injection, data exfiltration), and identity and authentication context (who triggered the AI, what permissions they hold, what tenant or workspace they belong to). Without this telemetry, AI incidents become invisible to security operations.
Why SOC Teams Must Instrument AI
Traditional security operations built their detection and response playbooks around network logs, endpoint signals, and application events. An AI system introduces a new attack surface: the interaction between a user, an LLM, external tools, and a retrieval database, all within a single automated execution loop that completes in seconds or less.
Existing log sources miss the AI-specific threat vectors. Network logs show API calls to the LLM provider, but not what prompt was sent or what policy violation the gateway detected. Endpoint logs show a process spawned a subprocess, but not that an AI agent attempted to run an unapproved system command. Application logs show a database query executed, but not that a retriever pulled sensitive data in response to an injected user query. A SOC team blind to AI telemetry cannot detect prompt injection, unauthorized tool use, data exfiltration through model outputs, or attempts to override security policies.
This gap exists because most organizations deployed AI without native observability. The LLM providers (OpenAI, Anthropic, Google) emit API-level logs via their dashboards, but these logs capture only the completion request and token usage, not the security context. The applications that call LLMs often log the function name ("call_llm") and the response status, but strip the actual prompt and output for privacy or performance reasons. And the security policies and controls that sit between the application and the LLM are often internal custom code, with no standardized telemetry format.
SOC teams need five core log categories to close this visibility gap.
Log Type 1: Gateway Decision Records
The first telemetry SOC teams must collect is the output of every security policy evaluation. When an AI runtime control system evaluates a prompt, tool call, or model output against a policy ruleset, it returns a verdict: ALLOW, LOG, CONSTRAIN, or BLOCK. Each verdict is a security decision that must be logged, timestamped, and correlated with the input that triggered it.
A gateway decision record typically contains:
- Timestamp and request ID: When did the evaluation occur, and what is the unique ID for this entire agent execution?
- Decision verdict: ALLOW, LOG, CONSTRAIN, or BLOCK.
- Evaluation phase: Was this a pre-prompt evaluation, mid-execution tool-call gate, or post-response output filter?
- Policy rule that fired: Which specific security policy (e.g., "block SQL injection patterns") triggered the decision?
- Confidence score: How certain is the system that this is a true threat, on a 0 to 1 scale?
- Remediation action: If the verdict was CONSTRAIN or BLOCK, what was the system action? (e.g., "regenerated response without sensitive data", "logged to audit chain", "blocked tool invocation").
Gateway decision records are the ground truth for AI security alerting. They are generated in real-time, before or at the moment an action is executed, so they have immediate actionable value. A spike in BLOCK verdicts on a particular policy rule may indicate an active attack or a misconfigured agent. A CONSTRAIN event indicates the system intervened to prevent harm without blocking the user entirely, a useful middle-ground signal that SOC teams should correlate with application behavior.
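The fields listed above can be sketched as a small record type. This is a minimal illustration, not Vaikora's actual schema: the class name `GatewayDecision`, the phase strings, and the auto-generated `event_id` are assumptions made for the example.

```python
import json
import uuid
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Verdict(str, Enum):
    ALLOW = "ALLOW"
    LOG = "LOG"
    CONSTRAIN = "CONSTRAIN"
    BLOCK = "BLOCK"


@dataclass
class GatewayDecision:
    """Hypothetical gateway decision record (field names are illustrative)."""
    request_id: str
    decision: Verdict
    evaluation_phase: str          # e.g. "pre_prompt", "tool_call", "post_response"
    policy_rule_id: str
    confidence: float              # 0.0 to 1.0
    remediation_action: Optional[str] = None
    event_id: str = ""
    timestamp: str = ""

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        # Fill in an ID and a UTC timestamp when the caller does not supply them.
        self.event_id = self.event_id or str(uuid.uuid4())
        self.timestamp = self.timestamp or datetime.now(timezone.utc).isoformat()

    def to_json(self) -> str:
        # Verdict subclasses str, so json.dumps emits the plain verdict string.
        return json.dumps(asdict(self))


record = GatewayDecision(
    request_id="req_xyz789",
    decision=Verdict.BLOCK,
    evaluation_phase="pre_prompt",
    policy_rule_id="rule_sql_injection_v3",
    confidence=0.94,
    remediation_action="request_rejected",
)
print(record.to_json())
```

Validating the confidence range at construction time keeps malformed records out of the log stream, which matters when SIEM rules later filter on that field.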
Log Type 2: Prompt and Response Metadata
The second critical log category is the full or redacted metadata of every prompt sent to an LLM and every response returned. This includes the textual content of the user input (or a hash/truncation if sensitive), the system instructions in effect, any chat history context, the exact model name and version, the number of tokens consumed (prompt and completion tokens separately), and the actual generated output (or a summary if the output is sensitive).
Prompt and response metadata serves multiple SOC functions:
- Incident reconstruction: If a policy violation is detected, the SOC investigator needs the exact prompt and output to understand what happened and why the policy fired.
- Pattern detection: Repeated variations of a prompt trying to bypass a filter may indicate a sustained jailbreak or prompt injection campaign.
- Compliance and auditing: Regulated industries (healthcare, finance, government) require proof that sensitive data did not flow into model inputs or outputs. Metadata logs provide that proof.
- Model drift and hallucination tracking: A spike in token usage or a change in model response patterns may signal a model update, a data poisoning attack, or a configuration error.
Metadata logs must be treated as sensitive themselves. A SOC team should implement tiered retention and access controls: full content retained for 7 to 30 days for incident response, and redacted or hashed summaries retained for 90 to 365 days for trend analysis and compliance audits.
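One way to support both tiers is to compute the redacted form at write time. The sketch below assumes a hypothetical helper `redact_prompt` and field names borrowed from common logging practice; it stores the full text only when the caller explicitly asks for it.

```python
import hashlib


def redact_prompt(prompt: str, keep_full: bool = False) -> dict:
    """Return log-safe prompt fields; the full text is kept only on request."""
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return {
        "prompt": prompt if keep_full else f"[REDACTED: {len(prompt)} chars]",
        "prompt_hash": f"sha256:{digest}",
        "prompt_length_chars": len(prompt),
    }


short_term = redact_prompt("Ignore previous instructions", keep_full=True)
long_term = redact_prompt("Ignore previous instructions")
print(long_term["prompt"])
```

The hash lets analysts spot repeated prompts across the long-term tier without seeing their content. A plain SHA-256 of a short prompt can be guessed by brute force, so a keyed hash (HMAC) may be a better fit where that risk matters.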
Log Type 3: Tool-Call Events
When an AI agent invokes an external tool (a database query, an API call, a shell command, a file system operation), the SOC must see that invocation logged in real-time with full parameter context. A tool-call event includes:
- Tool identity: The registered name or ID of the tool being invoked.
- Tool parameters: The exact arguments passed to the tool. If a tool is called with a query parameter, the SOC needs to see the query. If a tool is called with credentials or secrets, the log should redact the secrets but include the key names.
- Tool outcome: Did the tool succeed, and if so, what did it return? Did it fail, and if so, with what error?
- Authorization context: Did the AI agent have permission to invoke this tool? Some tools should only be callable by agents acting on behalf of specific roles or tenants.
Tool-call events are where many AI security breaches occur. A jailbroken or prompt-injected agent may attempt to invoke a tool that should not be accessible (e.g., a delete operation, a credential fetcher, or a deployment trigger). An unauthorized agent may attempt to query a database across multiple tenants. An agent suffering from model drift may make the same API call repeatedly with slightly different parameters, exhausting rate limits or incurring unplanned costs. Tool-call telemetry makes all these incidents visible to SOC teams.
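The parameter-redaction rule described above (hide secret values, keep key names) can be sketched as a recursive walk over the parameter map. The helper name `redact_params` and the list of key hints are assumptions for illustration; a real deployment would tune the hints to its own tools.

```python
# Substrings that mark a parameter name as secret-bearing (illustrative list).
SECRET_KEY_HINTS = ("password", "secret", "token", "api_key", "credential")


def redact_params(value, key=""):
    """Replace values under secret-looking keys, keeping the key names visible."""
    if key and any(hint in key.lower() for hint in SECRET_KEY_HINTS):
        return "[REDACTED]"
    if isinstance(value, dict):
        return {k: redact_params(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [redact_params(item) for item in value]
    return value


params = {
    "query": "SELECT name FROM customers",
    "auth": {"api_key": "sk-example", "user": "svc_reporting"},
}
print(redact_params(params))
```

The query stays visible for investigation while the credential value never reaches the log stream.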
Log Type 4: Policy-Violation and Threat-Detection Events
Beyond the gateway decision log, the SOC needs dedicated events for security-relevant findings. When a runtime control system detects an attempted prompt injection, a jailbreak pattern, a PII exposure, or an unauthorized tool invocation, it should emit a structured threat-detection event independent of whether the action was allowed or blocked.
These events typically include:
- Threat class: Prompt injection, jailbreak attempt, data exfiltration, hallucination, unauthorized action, policy override attempt, model spoofing, etc.
- Threat confidence and severity: How confident is the system that this is a real threat, and how severe would the impact be if it succeeded?
- Detected artifacts: The specific text or pattern that triggered the threat signal. For prompt injection, the actual injection payload. For PII detection, the data type and count of instances. For unauthorized actions, the tool name and the permission that was lacking.
- Mitigation applied: What did the system do in response? Allow with logging, constrain the output, block the action, or escalate for manual review?
Threat-detection events are the primary driver of SOC alerts and incident creation. They should flow directly into SIEM correlation rules and alert workflows.
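How severity and confidence translate into SOC workflow is a local policy choice. As a rough sketch only, the function below routes an event using a hypothetical severity ranking and a 0.8 confidence cutoff; both the thresholds and the outcome names are assumptions.

```python
# Illustrative ranking; unknown severities are treated as "low".
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}


def triage(event: dict, min_confidence: float = 0.8) -> str:
    """Map a threat-detection event to 'incident', 'alert', or 'log_only'."""
    rank = SEVERITY_RANK.get(event.get("threat_severity", "low"), 1)
    confident = event.get("threat_confidence", 0.0) >= min_confidence
    if rank >= 3 and confident:
        return "incident"
    if rank >= 2 or confident:
        return "alert"
    return "log_only"


print(triage({"threat_class": "prompt_injection",
              "threat_severity": "high",
              "threat_confidence": 0.91}))
```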
Log Type 5: Identity, Tenant, and Authorization Context
Every AI security log must include the human or service identity that triggered the action, the tenant or workspace boundary that applies, and the authorization context at the time of execution. This context enables SOC teams to correlate AI incidents with identity-driven threats and to enforce tenant isolation.
Context logs should include:
- User or service identity: Who triggered this AI agent invocation? Username, service principal ID, API key ID, or session token ID.
- Tenant or organization ID: Which customer or organizational boundary does this action belong to?
- Role or permission set: What was the authenticated subject's effective permission set at the time of the action?
- Originating IP address or network: Where did the request originate?
- Session or transaction ID: Can this action be correlated with upstream application logs or infrastructure events?
This context enables critical SOC investigations. If an AI agent exposed sensitive data, the SOC can immediately determine which user caused it and which tenants were exposed. If an agent attempted an unauthorized action, the SOC can correlate it with identity risk signals (e.g., a newly onboarded contractor, a user from an unusual geographic location, a service account with unusual activity). If multiple agents belong to the same tenant, the SOC can detect cross-agent attacks or lateral movement.
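With identity and tenant fields on every event, simple isolation checks become possible. The sketch below, using a hypothetical function name, lists identities whose events span more than one tenant; whether that is suspicious depends on the environment, since some service accounts legitimately serve several tenants.

```python
from collections import defaultdict


def identities_spanning_tenants(events):
    """Return {user_id: [tenant_ids]} for identities seen in more than one tenant."""
    tenants_by_user = defaultdict(set)
    for event in events:
        tenants_by_user[event["user_id"]].add(event["tenant_id"])
    return {
        user: sorted(tenants)
        for user, tenants in tenants_by_user.items()
        if len(tenants) > 1
    }


events = [
    {"user_id": "user_12345", "tenant_id": "tenant_acme"},
    {"user_id": "user_12345", "tenant_id": "tenant_globex"},
    {"user_id": "user_67890", "tenant_id": "tenant_acme"},
]
print(identities_spanning_tenants(events))
```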
Log Format and Schema Design
For SOC teams to act on AI telemetry, the logs must be structured, machine-readable, and schema-aligned with SIEM ingestion tools. The de facto standard for security event logging is either JSON or CEF (Common Event Format). JSON is preferred for modern cloud-native stacks because it nests complex objects (like an array of policy violations or a tool-call parameter map) naturally. CEF is preferred for on-premises SIEM platforms that require flat key-value pairs.
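Because flat formats cannot carry nested objects, a nested JSON event has to be flattened before it can be written as key-value pairs. The sketch below shows one possible flattening scheme using dotted paths and list indices. The naming scheme is an assumption for illustration and does not follow CEF's predefined extension dictionary, so a real CEF mapping would still need to assign these values to valid CEF keys.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into a single-level dict with dotted keys."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}{index}."))
    else:
        # Leaf value: drop the trailing dot from the accumulated path.
        flat[prefix[:-1]] = obj
    return flat


event = {
    "decision": "BLOCK",
    "tool_calls": [{"tool_id": "db_query", "parameters": {"database": "customers"}}],
}
print(flatten(event))
```

Note that empty dicts and lists disappear in this scheme, which may or may not be acceptable depending on how downstream rules treat missing fields.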
A minimal JSON schema for an AI security event might look like:
{
  "event_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "timestamp": "2026-06-30T14:23:45.123Z",
  "event_type": "policy_evaluation",
  "request_id": "req_xyz789",
  "user_id": "user_12345",
  "tenant_id": "tenant_acme",
  "user_role": "analyst",
  "source_ip": "203.0.113.42",
  "agent_name": "research_assistant",
  "agent_version": "v2.1.0",
  "model": "gpt-4-turbo",
  "model_version": "gpt-4-turbo-2024-04-09",
  "evaluation_phase": "pre_prompt",
  "policy_rule_id": "rule_sql_injection_v3",
  "policy_rule_name": "Block SQL injection patterns",
  "decision": "BLOCK",
  "confidence": 0.94,
  "prompt": "[REDACTED: 1247 chars]",
  "prompt_hash": "sha256:abc123def456...",
  "prompt_length_chars": 1247,
  "remediation_action": "request_rejected",
  "threat_class": "prompt_injection",
  "threat_severity": "high",
  "threat_confidence": 0.91,
  "tool_calls": [
    {
      "tool_id": "db_query",
      "tool_name": "execute_sql",
      "parameters": {
        "query": "[REDACTED]",
        "database": "customers"
      },
      "outcome": "blocked_by_policy",
      "error": null
    }
  ],
  "response_summary": null,
  "tokens_prompt": 847,
  "tokens_completion": 0,
  "total_tokens": 847,
  "latency_ms": 234,
  "audit_hash": "sha256:xyz789abc123...",
  "audit_chain_sequence": 987654
}
This schema is comprehensive but not every organization will need every field. SOC teams should work with their AI runtime provider to agree on a core schema (perhaps 15 to 20 fields) and extend it as needed.
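Once a core schema is agreed, it helps to validate events before they reach the SIEM so that malformed records are caught at the source. The sketch below checks a hypothetical eight-field core; which fields belong in the core is an assumption here and should come from the agreement with the runtime provider.

```python
# Hypothetical core schema: field name -> accepted Python type(s).
CORE_FIELDS = {
    "event_id": str,
    "timestamp": str,
    "event_type": str,
    "request_id": str,
    "user_id": str,
    "tenant_id": str,
    "decision": str,
    "confidence": (int, float),
}
VALID_DECISIONS = {"ALLOW", "LOG", "CONSTRAIN", "BLOCK"}


def validate_event(event: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for name, expected in CORE_FIELDS.items():
        if name not in event:
            errors.append(f"missing field: {name}")
        elif not isinstance(event[name], expected):
            errors.append(f"wrong type for field: {name}")
    decision = event.get("decision")
    if isinstance(decision, str) and decision not in VALID_DECISIONS:
        errors.append(f"unknown decision: {decision}")
    confidence = event.get("confidence")
    if isinstance(confidence, (int, float)) and not 0 <= confidence <= 1:
        errors.append("confidence out of range 0..1")
    return errors


print(validate_event({"event_id": "e1", "decision": "DENY", "confidence": 1.4}))
```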
SIEM Ingestion and Correlation
Once AI telemetry is being generated, the next step is ingestion into the SIEM platform. Most modern SIEMs (Microsoft Sentinel, Splunk, Elastic, CrowdStrike LogScale) support JSON ingestion via Syslog, direct API, or cloud object storage (AWS S3, Azure Blob).
A SOC team deploying this ingestion should:
- Define retention: How long should raw event data be retained (typically 7 to 90 days), and how long should summarized or indexed data be retained (typically 1 to 3 years)?
- Set up log routing: Parse incoming logs, extract key fields, and route them into appropriate SIEM indexes or datasets. Use SIEM field naming conventions so correlation rules can reference consistent field names across data sources.
- Create detection rules: Build SIEM detection rules (KQL for Sentinel, SPL for Splunk) that correlate AI security events with application events, identity events, and infrastructure events.
A sample detection rule for Microsoft Sentinel might look like:
// Detection: repeated policy BLOCK verdicts for the same user and agent within 5 minutes
let timeWindow = 5m;
let authLookback = 1d;   // how far back to look for the user's latest authentication
let blockThreshold = 3;
AISecurityEvents
| where timestamp > ago(timeWindow)
| where event_type == "policy_evaluation"
| where decision == "BLOCK"
| summarize BlockCount = count(), PolicyRules = make_set(policy_rule_name), LatestTimestamp = max(timestamp)
    by user_id, agent_name, tenant_id
| where BlockCount >= blockThreshold
| project
    user_id,
    agent_name,
    tenant_id,
    BlockCount,
    PolicyRules,
    LatestTimestamp,
    Severity = iff(BlockCount > 5, "High", "Medium"),
    AlertTitle = strcat("Repeated policy blocks detected for user ", user_id, " on agent ", agent_name)
// Left outer join keeps the alert even when no authentication event is found
| join kind=leftouter (
    IdentityEvents
    | where event_type == "authentication"
    | where timestamp > ago(authLookback)
    // Keep only the most recent authentication event per user
    | summarize arg_max(timestamp, auth_status, auth_method) by user_id
    | project-rename auth_timestamp = timestamp
) on user_id
| project-reorder user_id, agent_name, tenant_id, BlockCount, PolicyRules, auth_method, Severity, AlertTitle
This rule detects a pattern typical of brute-force prompt injection: a single user submitting multiple policy-violating prompts to the same agent in rapid succession. It then joins the user's recent authentication events so an analyst can judge whether the authentication method is unusual.
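The threshold logic of such a rule can also be mirrored outside the SIEM, which is handy for checking expected results against synthetic events before deploying the query. The sketch below is a simplified stand-in, not a translation of the full rule: it covers only the windowed BLOCK count and omits the identity join, and the function name `repeated_blocks` is hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def repeated_blocks(events, now, window=timedelta(minutes=5), threshold=3):
    """Count BLOCK verdicts per (user, agent, tenant) inside the window ending at `now`."""
    counts = Counter()
    for event in events:
        # Normalize a trailing "Z" so older Python versions can parse the timestamp.
        ts = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
        in_window = now - window < ts <= now
        if (event["event_type"] == "policy_evaluation"
                and event["decision"] == "BLOCK" and in_window):
            counts[(event["user_id"], event["agent_name"], event["tenant_id"])] += 1
    return {key: n for key, n in counts.items() if n >= threshold}


now = datetime(2026, 6, 30, 14, 25, tzinfo=timezone.utc)
base = {"event_type": "policy_evaluation", "decision": "BLOCK",
        "user_id": "user_12345", "agent_name": "research_assistant",
        "tenant_id": "tenant_acme"}
events = [dict(base, timestamp=f"2026-06-30T14:2{m}:00Z") for m in (1, 2, 3)]
print(repeated_blocks(events, now))
```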
Data Retention and Compliance
SOC teams operating in regulated industries (healthcare, finance, government) must align AI telemetry retention with compliance obligations. HIPAA's six-year documentation retention requirement, for example, is commonly applied to audit logs. PCI DSS requires at least 12 months of audit log history, with the most recent three months immediately available for analysis. GDPR requires that personal data (including logs containing IP addresses or user identifiers) be retained only as long as necessary for the stated purpose.
A recommended retention strategy:
- Real-time hot storage: Raw AI security events retained in hot SIEM storage (searchable within seconds) for 7 to 30 days.
- Warm archive: Summarized or indexed events retained for 90 to 180 days, still searchable but with some latency.
- Cold archive: Compressed or redacted events retained for compliance purposes (1 to 7 years depending on regulation), not directly searchable but available for deep forensics or regulatory audit.
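The three tiers above amount to a lookup on event age. A minimal sketch, assuming the upper bound of each range as the default cutoff (30, 180, and 7 x 365 days) and a hypothetical function name; actual cutoffs should come from the organization's retention policy.

```python
def retention_tier(age_days: int, hot_days: int = 30,
                   warm_days: int = 180, cold_days: int = 7 * 365) -> str:
    """Return the storage tier an event of the given age belongs in."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days <= hot_days:
        return "hot"
    if age_days <= warm_days:
        return "warm"
    if age_days <= cold_days:
        return "cold"
    return "expired"   # past the longest retention period; eligible for deletion


for age in (3, 90, 1000, 3000):
    print(age, retention_tier(age))
```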
Best Practices for AI Security Telemetry
SOC teams should apply these best practices when designing and implementing AI telemetry collection:
1. Log at the gateway layer, not only the application layer. If you log only in application code, you miss threats that occur at the LLM provider boundary. A runtime control gateway evaluates and logs every action before it executes, so every action that passes through the gateway leaves a record.
2. Include identity and tenant context in every log. A SOC team cannot correlate AI incidents with identity risk, lateral movement, or multi-tenant isolation breaches without this context. Always include user ID, tenant ID, role, and originating IP.
3. Treat prompt and output data as sensitive. Do not log full prompts and responses to unsecured log streams. Implement tiered redaction: retain full content in encrypted, access-controlled storage only for the short incident-response window (7 to 30 days), keep hashed or redacted fields for longer-term trend analysis, and strip content from long-term archives.
4. Monitor for policy rule drift. If a single policy rule fires an unusually high number of times, the rule may be too broad (generating false positives) or the agent may be under attack. Create a SIEM rule that alerts on rule-level anomalies.
5. Correlate with application and identity events. An AI security incident is rarely isolated. Correlate AI telemetry with application logs (error spikes, database anomalies), identity logs (unusual authentication, permission changes), and infrastructure logs (API throttling, quota exhaustion) to reconstruct the full incident.
6. Test your detection rules regularly. Use red-team testing or synthetic traffic to verify that your SIEM rules correctly detect prompt injection, jailbreaks, unauthorized tool use, and data exfiltration. A detection rule that does not fire on a known attack vector is worse than no rule at all.
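The rule-level anomaly check in practice 4 can be prototyped by comparing current firing counts with a baseline. The sketch below is one simple approach, not a recommended statistical method: the 3x multiplier, the minimum count of 10, and the function name are all assumptions.

```python
from collections import Counter


def rule_anomalies(current_events, baseline_counts, factor=3.0, min_count=10):
    """Return rules whose current firing count exceeds factor x their baseline."""
    current = Counter(event["policy_rule_id"] for event in current_events)
    return {
        rule: count
        for rule, count in current.items()
        # min_count suppresses noise from rules that rarely fire at all.
        if count >= min_count and count > factor * baseline_counts.get(rule, 0)
    }


baseline = {"rule_sql_injection_v3": 4, "rule_pii_output": 20}
today = ([{"policy_rule_id": "rule_sql_injection_v3"}] * 15
         + [{"policy_rule_id": "rule_pii_output"}] * 25)
print(rule_anomalies(today, baseline))
```

A flagged rule still needs human review, since the same spike can mean either an overly broad rule or an active attack.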
Frequently Asked Questions
What logs do AI agents generate?
AI agents generate logs across five categories: gateway decision records (policy verdicts), prompt and response metadata (input content and output tokens), tool-call events (external function invocations and outcomes), policy-violation and threat-detection events (specific security findings), and identity and authorization context (user, tenant, and permission data). Each event type serves a distinct SOC function, from real-time alerting to compliance audit.
What telemetry do SOC teams need for AI security?
SOC teams need structured, timestamped, machine-readable logs that capture what the AI agent did, whether it violated a security policy, who triggered it, and what data it accessed or exposed. Core fields include decision verdicts, prompt and output metadata, tool invocations, threat classifications, user identity, tenant ID, and the policy rule that fired. This telemetry enables threat detection, incident response, and compliance verification.
How do you ingest AI agent logs into a SIEM?
AI telemetry can be ingested into most modern SIEMs via JSON or CEF formatted logs sent over Syslog, direct API calls, or cloud object storage (S3, Blob). Use the SIEM's native parsers or create custom field mappings to extract key fields and route events to appropriate indexes. Then create detection rules (KQL, SPL, etc.) that correlate AI events with application and identity events to detect patterns typical of AI security incidents.
What is AI security telemetry and what does it contain?
AI security telemetry is the comprehensive record of every AI agent action, security decision, and model invocation made by a system. It contains the user identity, the prompt and response content (often redacted), the tools the agent invoked, the policy rules evaluated, the security verdict (allow/block/constrain), threat classifications, token counts, latency, and cryptographic hashes for audit integrity. This telemetry enables real-time threat detection and forensic investigation of AI security incidents.
Why do SOC teams need AI-specific logs instead of just application logs?
Application logs typically capture only function names and return codes, not the actual prompt content, policy violations, or tool parameters. They do not include the real-time security decisions made at the LLM or runtime control layer. AI-specific telemetry logs these security decisions and incident context in real-time, before or at the moment an action executes, so SOC teams can detect and respond to threats immediately rather than discovering them hours or days later in application logs.
What is the difference between gateway decision logs and policy-violation events?
A gateway decision log records the verdict (allow/block/constrain/log) for every action evaluated by a security policy. A policy-violation event is a specialized log emitted only when a threat or violation is detected, with additional context like threat class (prompt injection, jailbreak, data exfiltration), confidence score, and specific artifacts (the injected text, the exposed PII, the unauthorized tool). Decision logs are high-volume (every action), while violation events are lower-volume and optimized for alerting and investigation.
How long should AI security telemetry be retained?
Retention depends on regulatory requirements and incident response needs. Hot storage (real-time searchable) typically spans 7 to 30 days. Warm storage (archived, slower search) covers 90 to 180 days. Cold storage (compliance-grade, encrypted, infrequently accessed) can run 1 to 7 years depending on regulation (HIPAA 6 years, PCI DSS 1 year, GDPR as-needed, others by contract). Consult your compliance officer and legal team to determine your organization's retention policy.
Can AI telemetry be used to prove compliance with AI regulations like the EU AI Act?
Telemetry can supply much of the evidence, although logs alone do not establish compliance. AI telemetry logs are a primary source of evidence for AI governance audits. The EU AI Act imposes record-keeping and logging obligations on high-risk AI systems, while ISO/IEC 42001 and the voluntary NIST AI RMF expect organizations to demonstrate that they have evaluated AI systems for risks and applied appropriate controls. Telemetry logs showing that policy-violation events were detected, logged, and remediated give auditors concrete evidence that the organization implemented technical controls and monitoring.
About Vaikora
Vaikora's runtime control platform emits structured, signed AI security telemetry designed for SIEM ingestion, with audit chain cryptographic signatures for compliance verification. The open-core Vaikora MCP server can be self-hosted, while the commercial Vaikora Control Plane provides pre-built SIEM connectors and compliance-ready log retention policies.