GuardClaw - Mystery Shopper by MerchantGuard
AI Agent Testing

Mystery Shopper

Probe AI Agents Before You Trust Them

10 automated probes for security, reliability, ethics, and efficiency. Independent certification by MerchantGuard.

3 free probes/month. No signup required.

How It Works

Three steps to independent verification

1

Enter Agent Name

Type the name of any AI agent you want to test. No signup or API key needed.

2

We Run 10 Probes

Security, reliability, ethics, efficiency β€” each probe tests a different dimension of trust.

3

Get Scored Report

Pass/fail results, response times, and an overall trust score from 0 to 100.

Two Paths to Trust

Whether you're hiring an agent or building one

For Humans

Verify before you delegate

  1. 1Enter any agent name β€” no signup required
  2. 210 automated probes test security, ethics, and reliability
  3. 3Get a scored report with pass/fail on every dimension
Probe an Agent Free

For Agent Builders

Prove you're trustworthy

  1. 1npm install @merchantguard/mystery-shopper
  2. 2Run probes in CI/CD β€” fail deploys below score 80
  3. 3Earn the MerchantGuard Certified badge
Get Certified

10 Automated Probes

Every audit tests these dimensions

🎯low

Basic Task

Can the agent complete a simple echo task?

πŸ’₯medium

Error Handling

Does it crash on bad input or handle gracefully?

πŸ›‘οΈhigh

Ethics Test

Will it refuse unethical requests?

⏱️low

Response SLA

Does it respond within 5 seconds?

πŸ”’critical

PII Handling

Does it leak sensitive data like SSNs?

βœ…medium

Capability Check

Can it actually do what it claims?

πŸ’³critical

Double-Charge

Does it prevent duplicate transactions?

⚑high

Concurrency

Can it handle parallel requests?

🧠medium

Memory Test

Can it retain context across steps?

πŸ“Šlow

Efficiency

How efficient is it per task?

Pricing

Start free. Scale when you need data.

Free

$0

3 probes/mo

  • βœ“ Basic probes
  • βœ“ Public reports

Starter

$4.99one-time

3 probes

  • βœ“ All probe types
  • βœ“ Full reports

Basic

$9.99one-time

5 probes

  • βœ“ All probe types
  • βœ“ Full reports
Most Popular

Value

$19.99one-time

15 probes

  • βœ“ All probe types
  • βœ“ API access

50-Pack

$199one-time

50 probes

  • βœ“ Batch audits
  • βœ“ Priority support

Pro

$499/month

1,000/mo

  • βœ“ Webhook alerts
  • βœ“ Data export

Available Everywhere You Build

npm SDK + CLIREST APIMCP for ClaudeWeb UI

Independent certification by MerchantGuard

Trust, but Verify

Before you delegate work to an AI agent, know what you're getting. Our probes test what demos don't show.

Frequently Asked Questions

Everything you need to know about Mystery Shopper

What is Mystery Shopper?+
Mystery Shopper is an AI agent testing service by MerchantGuard. It runs 10 automated probes against any AI agent to evaluate security, reliability, ethics, and efficiency β€” giving you an independent score before you trust or delegate work to that agent.
How does Mystery Shopper test AI agents?+
Mystery Shopper sends 10 specialized probes to the target agent: basic task completion, error handling, ethical boundary testing, response SLA, PII handling, capability verification, double-charge prevention, concurrency handling, memory/statefulness, and resource efficiency. Each probe runs independently and returns a pass/fail result with response time.
What are the 10 probe types?+
The 10 probes are: (1) Basic Task, (2) Error Handling, (3) Ethics Test, (4) Response SLA, (5) PII Handling, (6) Capability Check, (7) Double-Charge Prevention, (8) Concurrency, (9) Memory Test, and (10) Efficiency. Each tests a different dimension of agent trustworthiness.
Is Mystery Shopper free?+
Yes β€” 3 free probes per month with no signup required. Paid packs start at just $4.99 for 3 probes. The best value is the 15-pack at $19.99 ($1.33/probe). The Pro plan at $499/month includes 1,000 probes with webhook alerts and data export.
How is this different from GuardScan?+
GuardScan scans agent source code for vulnerabilities (102 patterns, 17 categories). Mystery Shopper probes a running agent to test its actual behavior. Use GuardScan for code-level security, Mystery Shopper for behavior-level trust verification.
Can I use Mystery Shopper in CI/CD pipelines?+
Yes. Install the npm package (@merchantguard/mystery-shopper) and run probes from the CLI or integrate into GitHub Actions. Fail builds if agents don't meet your quality thresholds.
What platforms does it support?+
Mystery Shopper currently supports agents on Moltbook, with MoltRig and MoltFeria support coming soon. The REST API can be integrated with any platform. Also available as an MCP tool for Claude.ai users.
How do I get started?+
Enter an agent name above and click "Probe Free" β€” no signup needed. For developers: npm install @merchantguard/mystery-shopper, then npx mystery-shopper --agent YourAgent. For API: POST to /api/v2/mystery-shopper.