Install and first diff
This page walks you through the three commands that take you from "just heard about Shadow" to "looking at a real behavioural diff." No API keys, no agent code changes, no CI.
1. Install
- Ships prebuilt wheels for Linux (manylinux_2_34 x86_64), macOS (arm64), and Windows (x86_64).
- Requires Python 3.11 or newer.
- Installs the shadow CLI script alongside the shadow Python package. Import and CLI names are shadow; the PyPI distribution name is shadow-diff because the bare shadow name is taken on PyPI by an unrelated 2015 utility.
Verify:
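One way to confirm the install from Python is sketched below. It uses only the standard library's importlib.metadata and the distribution name given above (shadow-diff). The helper name installed_version is hypothetical and is not part of Shadow.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "shadow-diff" is the PyPI distribution name; the import name is "shadow".
    found = installed_version("shadow-diff")
    print(f"shadow-diff {found}" if found else "shadow-diff is not installed")
```

The lookup goes by distribution name rather than import name, which matters here because the two differ.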
2. Scaffold a working scenario
The shadow quickstart command drops a shadow-quickstart/ directory containing:
- agent.py, a toy agent with three LLM calls
- config_a.yaml / config_b.yaml, baseline + candidate configs that differ on three known axes
- fixtures/baseline.agentlog + fixtures/candidate.agentlog, pre-recorded traces (no API keys needed to run the diff)
- QUICKSTART.md, next-step instructions
Use shadow quickstart path/to/dir to scaffold elsewhere, or --force to overwrite existing files.
3. Run the diff
You'll see:
- A nine-axis table (semantic, trajectory, safety, verbosity, latency, cost, reasoning, judge, conformance) with deltas and 95% confidence intervals.
- A low-n banner if there are fewer than 5 paired responses.
- A top-K divergences list with the Needleman-Wunsch-aligned first points of divergence.
- Recommendations: prescriptive one-line fixes.
- Per-pair drill-down: which specific turn drove the regression.
- A "What this means" paragraph in plain English.
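To give a feel for what a Needleman-Wunsch-aligned first point of divergence means, here is a minimal sketch of the textbook algorithm applied to two sequences of step names. It is an illustration only, not Shadow's implementation. The unit match/mismatch/gap scores, the choice to align step names, and the function names needleman_wunsch and first_divergence are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

Pair = Tuple[Optional[str], Optional[str]]


def needleman_wunsch(a: List[str], b: List[str],
                     match: int = 1, mismatch: int = -1, gap: int = -1) -> List[Pair]:
    """Globally align two sequences; None marks a gap on that side."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            aligned.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned.append((a[i - 1], None))
            i -= 1
        else:
            aligned.append((None, b[j - 1]))
            j -= 1
    aligned.reverse()
    return aligned


def first_divergence(aligned: List[Pair]) -> Optional[int]:
    """Index of the first aligned column where the two sides differ, else None."""
    for idx, (left, right) in enumerate(aligned):
        if left != right:
            return idx
    return None


# Hypothetical step names for a baseline and a candidate run.
baseline = ["search", "read", "answer"]
candidate = ["search", "answer"]
alignment = needleman_wunsch(baseline, candidate)
print(alignment)
print(first_divergence(alignment))
```

Aligning before comparing keeps a single dropped or inserted step from making every later step look different, so the reported divergence points at the step that actually changed.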
What's next
- Run Shadow on your own agent: Record your own agent
- Wire it into every PR: Wire into CI
- Understand each axis: Nine-axis diff