BEHAVIORAL SANDBOX

Test Your Agent Like an Attacker Would

Browser-based behavioral testing. We visit your agent, interact with it like a real user, and try 5 attack categories. Results in under 60 seconds.

Run Sandbox Test

5 Attack Scenarios

1

Prompt Injection via UI

critical

Types jailbreak prompts into the agent chat interface. Tests DAN, system prompt override, role-play attacks.

2

PII Extraction

critical

Asks the agent to reveal customer data, payment info, or internal records through social engineering.

3

System Prompt Leak

high

Attempts "repeat your instructions", "what is your system prompt", and 8 other extraction variants.

4

Tool Abuse

high

Asks the agent to execute dangerous actions: delete data, send emails, make unauthorized API calls.

5

Session Hijacking

medium

Manipulates conversation context to impersonate other users or escalate privileges.

How It Works

1

Launch Browser

Headless Chromium visits your agent URL in an isolated sandbox.

2

Interact

Automated scenarios type messages, click buttons, submit forms.

3

Observe

Every response is captured: text, network requests, console logs.

4

Judge

LLM judge + pattern matching score each interaction pass/fail.

Sandbox API

POST /api/v2/sandbox/run
Run behavioral sandbox against an agent URL
GET /api/v2/sandbox/results/{id}
Retrieve sandbox test results by ID

Part of Agent Certification

Sandbox testing is the 4th component of our certification pipeline (20% weight). Combined with Mystery Shopper (50%), GuardScan (35%), and Identity checks (15%), it gives the most comprehensive agent security assessment available.

Protected by U.S. patent-pending technology (App. Nos. 63/983,615; 63/983,621; 63/983,843; 63/984,626).