🔒 Technology Controls · Agentic AI Security

See inside an
AI agent under attack.

An interactive demonstration of what happens when agentic AI systems operate without security controls — and what changes when guardrails are in place. Nine scenarios. Four new architectural controls.

Launch Demo → How it works
Getting started

How to use the demo

Each scenario runs in under 30 seconds. Toggle guardrails on and off to see the same agent behave differently.

1

Pick a scenario

Choose from nine scenarios in the left panel — from normal operation to multi-stage attack chains and live HITL checkpoints.

2

Toggle guardrails

Switch between Guardrails ON and Guardrails OFF to see the same request handled differently.

3

Click Run

Watch the agent work in real time — every decision, tool call, and control action streamed live.

4

Compare the panes

Left pane shows everything happening inside the agent. Right pane shows what the end-user sees.

The two panes

◈ Under the Hood — Audit Trail

  • Every orchestrator decision and reasoning step
  • Tool calls dispatched to workers (with inputs)
  • What each worker returned
  • Guardrail events — blocks and reasons
  • Token usage and timing per step
  • Session summary and guardrail statistics

◈ User View — What They See

  • The request as the user typed it
  • The agent's final response only
  • Error messages when a request is blocked
  • In attack scenarios: a normal-looking response — with no sign that data was exfiltrated
Scenarios

Nine scenarios, two outcomes each

Every scenario runs with guardrails on and off — showing what the controls prevent, and what happens without them. Scenario 9 runs live over a bidirectional WebSocket.

Normal Operation

A routine financial research and writing task. Shows the healthy agentic loop — plan, delegate, synthesise — with a full audit trail.

Baseline
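The plan, delegate, synthesise loop can be sketched in a few lines. This is a minimal illustration only — the function names and stub behaviour are assumptions, not the demo's actual code:

```python
def plan(request: str) -> list[str]:
    # Stub planner: the orchestrator breaks the request into subtasks.
    return [f"research: {request}", f"draft: {request}"]

def delegate(step: str) -> str:
    # Stub worker: in the real demo this is a tool-equipped worker agent.
    return f"done {step}"

def synthesise(results: list[str]) -> str:
    # The orchestrator merges worker outputs into one response.
    return "; ".join(results)

def agentic_loop(request: str) -> str:
    """The healthy loop: plan the work, delegate each subtask
    to a worker, then synthesise the results."""
    steps = plan(request)
    results = [delegate(s) for s in steps]
    return synthesise(results)
```

Every step in this loop is what the left-hand audit pane records: the plan, each dispatch, and each worker return.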
💉

Prompt Injection

The agent fetches a malicious website during research. The page contains hidden override instructions. With guardrails off, credentials are silently exfiltrated.

OWASP LLM01
🚫

Unauthorized Tool

The agent attempts to use a tool outside its permitted scope. The allowlist control blocks it. Without guardrails, the action proceeds unchecked.

Tool Allowlist
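The allowlist control amounts to a gate in front of every dispatch. A minimal sketch, assuming illustrative tool names and an exception type that are not the demo's actual API:

```python
# Tools this agent is permitted to call; anything else is refused.
ALLOWED_TOOLS = {"web_search", "read_file", "summarise"}

class ToolNotPermitted(Exception):
    pass

def dispatch(tool_name: str) -> str:
    """Refuse any tool call outside the agent's permitted scope
    before it ever reaches a worker."""
    if tool_name not in ALLOWED_TOOLS:
        raise ToolNotPermitted(f"blocked: {tool_name!r} is not in the allowlist")
    return f"ran {tool_name}"  # placeholder for the real tool call
```

With guardrails off, the check is simply absent and the out-of-scope call proceeds.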
💸

Budget Exceeded

The agent makes far more tool calls than the task requires — a runaway agent. The call budget cap limits blast radius. Without it, API costs spiral.

Call Budget
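A call budget is a simple counter charged before each tool call. A minimal sketch (class and error type are illustrative, not the demo's internals):

```python
class CallBudget:
    """Caps total tool calls per session, limiting the blast
    radius (and API bill) of a runaway agent."""

    def __init__(self, max_calls: int):
        self.max_calls = max_calls
        self.used = 0

    def charge(self) -> None:
        # Called once before every tool dispatch.
        if self.used >= self.max_calls:
            raise RuntimeError(f"call budget of {self.max_calls} exhausted")
        self.used += 1
```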
🔍

Pre-execution Review

A £5.6M pension fund rebalancing is planned but nothing executes. Every intended trade is held for human-in-the-loop approval — the four-eyes control applied to autonomous AI.

Human-in-the-Loop
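The hold-for-approval pattern can be sketched as a checkpoint that collects intended actions and executes nothing until a human signs off. Names and fields here are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class PlannedTrade:
    symbol: str
    amount: float
    approved: bool = False

class HITLCheckpoint:
    """Planned actions queue up here; nothing executes until a
    human approves — the four-eyes control for autonomous AI."""

    def __init__(self):
        self.pending: list[PlannedTrade] = []

    def propose(self, trade: PlannedTrade) -> None:
        self.pending.append(trade)

    def approve_all(self) -> None:
        for trade in self.pending:
            trade.approved = True

    def execute(self) -> list[PlannedTrade]:
        # Only approved trades are released for execution.
        return [t for t in self.pending if t.approved]
```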
☠️

Poisoned Tool Result

A worker returns a result containing hidden adversarial instructions. The orchestrator reads them and changes behaviour. Tool result validation catches this before it reaches message history.

OWASP LLM02
🤖

Worker Over-Compliance

An orchestrator dispatches a task wildly outside its original scope. A compliant worker executes it without question. Job scope manifests stop this cold.

Scope Enforcement

AI Attack Chain

Recon → exploit → exfiltrate — all in seconds. Shows how an agent with tool access can execute a full attack chain. The tool allowlist stops it at phase 2.

Attack Chain

Live HITL Checkpoint

A £15M portfolio rebalancing pauses for real human approval — not simulated. Approve or reject via the browser. The agent unblocks the instant you respond.

Bidirectional HITL
Architectural Controls

Four new controls in v2.0.0

v2.0.0 adds defences targeting multi-agent and orchestration-layer threats — the gaps that single-agent guardrails don't cover.

📋

Job Scope Manifest

Derived from the original user request. Workers validate every dispatch against it independently of the orchestrator — catching scope drift the orchestrator can't see.

Worker-enforced
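The worker-side check might look like the sketch below. The manifest's field names and matching rule are assumptions; the point is that the worker validates independently, so a compromised orchestrator cannot waive the check:

```python
# Hypothetical manifest derived from the original user request.
manifest = {
    "job_id": "job-123",
    "allowed_tools": {"web_search", "summarise"},
    "topic_keywords": {"pension", "rebalancing"},
}

def worker_accepts(manifest: dict, dispatch: dict) -> bool:
    """Worker-side validation: reject any dispatch that drifts
    outside the scope of the original request, regardless of
    what the orchestrator asked for."""
    if dispatch["tool"] not in manifest["allowed_tools"]:
        return False
    task = dispatch["task"].lower()
    return any(kw in task for kw in manifest["topic_keywords"])
```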
🛡️

Tool Result Validation

Scans every worker result for adversarial injection patterns before it enters the orchestrator's message history — stops poisoned tool results before they affect behaviour.

Pre-history scan
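A minimal sketch of the pre-history scan, using a handful of illustrative patterns — a production scanner would be far broader than this:

```python
import re

# Illustrative adversarial-injection patterns only.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |your )?previous instructions", re.I),
    re.compile(r"system prompt", re.I),
    re.compile(r"you are now", re.I),
]

def validate_tool_result(result: str) -> str:
    """Scan a worker's result before it enters the orchestrator's
    message history; raise so the poisoned text never gets the
    chance to steer the orchestrator's behaviour."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(result):
            raise ValueError(f"poisoned tool result: matched {pattern.pattern!r}")
    return result
```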
🔗

Non-Expanding Delegation

Child orchestrators cannot have a broader tool scope than their parent. A compromised orchestrator cannot escalate its own privilege by spawning a less-restricted child.

Scope containment
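Non-expanding delegation reduces to a set intersection at spawn time. A minimal sketch (the function name is an assumption):

```python
def spawn_child_scope(parent_tools: set[str], requested: set[str]) -> set[str]:
    """A child orchestrator's tool scope is the intersection of what
    it requests and what its parent holds: delegation can narrow
    privilege, never expand it."""
    granted = parent_tools & requested
    escalation = requested - parent_tools
    if escalation:
        # Drop (and record) the attempt rather than grant it.
        print(f"denied out-of-scope tools: {sorted(escalation)}")
    return granted
```

The escalation attempt itself is worth logging: it is exactly the signal that a parent orchestrator may be compromised.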
🔑

Job-Scoped MCP Tokens

Each worker receives a short-lived token scoped to the tools it needs for one job. Leaked or replayed tokens cannot be used to call out-of-scope tools.

Least-privilege creds
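One way to realise such tokens is an HMAC-signed claims blob carrying the job id, tool list, and expiry — a sketch under that assumption, not the demo's actual token format:

```python
import hashlib
import hmac
import json
import time

SECRET = b"demo-signing-key"  # assumption: a server-side signing key

def issue_token(job_id: str, tools: list[str], ttl_s: int = 300) -> str:
    """Mint a short-lived token bound to one job and its tool set."""
    claims = {"job": job_id, "tools": sorted(tools), "exp": time.time() + ttl_s}
    body = json.dumps(claims, sort_keys=True)
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}|{sig}"

def authorise(token: str, tool: str) -> bool:
    """Even a leaked or replayed token cannot call a tool
    outside the scope it was minted for."""
    body, sig = token.rsplit("|", 1)
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    claims = json.loads(body)
    return time.time() < claims["exp"] and tool in claims["tools"]
```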

Ready to run the demo?

Attack scenarios 2–8 are scripted for reliability. Scenario 1 (Normal) and Scenario 9 (HITL) make live backend calls — the HITL checkpoint is a real bidirectional pause.

Open the Demo →