Regression Testing for AI Agents

When your prompt changes, know what broke before your users do.

Veval captures real production traces and turns them into a regression baseline. Ship prompt changes with confidence.

Start free →

No credit card required

The Problem

"I updated my prompt and something broke in production — but my tests all passed."

LLM outputs change when prompts change

Traditional unit tests can't catch regressions

You find out when users complain

How It Works

From production trace to regression test in minutes

01

Capture

Instrument your LLM agent with the Veval SDK. Every production run is recorded as a trace.

02

Pin

Mark any trace as a regression test. Build a baseline from real production behaviour.

03

Detect

Run your test suite in CI. Veval replays traces and flags when outputs change.

import veval

veval.init(api_key="...")

# Every call to a decorated function is recorded as a trace.
@veval.trace
def run_agent(prompt: str) -> str:
    ...
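At its core, "replay and flag changes" means re-running each pinned input and diffing the new output against the recorded baseline. A minimal sketch of that idea (the baseline data and the `detect_regressions` helper are illustrative, not the Veval API):

```python
# Conceptual sketch of trace replay, not the Veval SDK internals.
from typing import Callable

# Pinned baseline: inputs paired with the outputs recorded in production.
baseline = [
    ("What's the status of order 123?", "Order 123 shipped on Monday."),
    ("Cancel order 456.", "Order 456 has been cancelled."),
]

def detect_regressions(agent: Callable[[str], str]) -> list[str]:
    """Re-run each pinned input; return the prompts whose output drifted."""
    return [prompt for prompt, expected in baseline if agent(prompt) != expected]
```

Run in CI, an empty result means the prompt change is safe to ship; any entries are regressions to review before merging.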

Pricing

Start free. Scale as you grow.

Free

$0/mo

  • 1k traces/mo
  • 50 test runs/mo
  • 7-day retention
Starter (Recommended)

$XX/mo

  • 10k traces/mo
  • 500 test runs/mo
  • 7-day retention

Growth

$XX/mo

  • 100k traces/mo
  • 5k test runs/mo
  • 14-day retention

Pro

$XX/mo

  • 250k traces/mo
  • Unlimited test runs
  • 30-day retention