# Agent Setup Benchmark — v1.0 Release Notes
Version: 1.0
Released: 2026-04-20
Price: $4.99
Category: analytics
QA Status: APPROVED_FOR_RELEASE (Viper 2026-04-20)
## What It Does
Agent Setup Benchmark measures and scores the full OpenClaw first-run experience — timing every phase from install through API key configuration, memory setup, first skill install, first working message, cron scheduling, and channel connection. It produces a friction score (0–100) with phase-by-phase breakdowns showing exactly where new users lose time, plus targeted recommendations for closing gaps. An audit mode runs against any existing install to produce a health snapshot without live timing, useful for validating a production setup or generating community feedback data.
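The 0–100 friction score described above can be thought of as a weighted phase checklist. A minimal sketch of that idea follows — the phase names and weights here are illustrative assumptions, not the tool's actual formula:

```python
# Illustrative scoring sketch. Phase names and weights are assumptions;
# the shipped tool may weight phases differently.
WEIGHTS = {
    "install": 15,
    "api-keys": 25,
    "memory": 20,
    "skills": 10,
    "first-message": 10,
    "cron": 10,
    "channels": 10,
}

def friction_score(completed_phases):
    """Sum the weights of completed phases into a 0-100 score."""
    return sum(w for phase, w in WEIGHTS.items() if phase in completed_phases)
```

A fully complete setup scores 100; each missing phase subtracts its weight, which is what makes the phase-by-phase breakdown directly actionable.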
## Use Cases
- New OpenClaw users who want to verify their setup is complete and understand what they're missing — audit mode gives an instant health snapshot of API keys, memory files, skills, crons, and channel connections, with a clear friction score and specific recommendations
- Documentation contributors and community members who want structured, shareable data on their setup experience — friction scores and phase timings help identify where OpenClaw onboarding can be improved, and the JSON export is designed for community feedback threads
- Teams deploying OpenClaw to non-technical users who need a repeatable, measurable onboarding benchmark to ensure every new install meets a minimum friction score before handoff, catching gaps in channel config or memory setup before they become support tickets
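For the community-feedback use case, a shareable JSON export might look like the sketch below. The field names here are assumptions for illustration, not the shipped schema:

```python
import json
from datetime import datetime, timezone

def export_report(friction_score, phase_timings, path):
    """Write a shareable benchmark report as JSON (hypothetical shape)."""
    report = {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "friction_score": friction_score,
        # Elapsed seconds per phase, e.g. {"api-keys": 210.4}
        "phases": phase_timings,
    }
    with open(path, "w") as fh:
        json.dump(report, fh, indent=2)
    return report
```

Keeping the export flat and timestamped makes reports easy to diff across installs or paste into a feedback thread.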
## Requirements
| Requirement | Detail |
|-------------|--------|
| OpenClaw | Any current version |
| Python | 3.8+ |
| Dependencies | stdlib only (json, os, datetime, pathlib, time) — no pip installs |
| API Keys | None required |
| Permissions | Read access to ~/.openclaw/ directory |
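The requirements above are easy to verify up front. A minimal preflight sketch, assuming only the stdlib and the `~/.openclaw/` layout named in the table (the function name is illustrative):

```python
import os
import sys
from pathlib import Path
from typing import List

def check_prerequisites(root: Path = Path.home() / ".openclaw") -> List[str]:
    """Return a list of setup problems; an empty list means ready to run."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required")
    if not root.is_dir():
        problems.append(f"{root} not found -- is OpenClaw installed?")
    elif not os.access(root, os.R_OK):
        problems.append(f"no read access to {root}")
    return problems
```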
## Example Usage
Run a quick automated audit on an existing install:

    python3 scripts/setup_benchmark.py --mode audit

Start a new live benchmark session (timed):

    python3 scripts/setup_benchmark.py --mode start

Mark a phase complete during a live benchmark:

    python3 scripts/setup_benchmark.py --mode mark --phase "api-keys"

Complete the benchmark and generate a report:

    python3 scripts/setup_benchmark.py --mode complete --output /tmp/setup-report.json

Run demo mode (simulated data, no timing):

    python3 scripts/setup_benchmark.py --mode demo
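Because `start`, `mark`, and `complete` are separate invocations, the live benchmark has to persist timing state between runs. A minimal sketch of that mechanic, assuming a session file (the path and field names are illustrative, not the tool's actual on-disk format):

```python
import json
import time
from pathlib import Path

# Illustrative session-state location; the real tool may store this elsewhere.
SESSION = Path("/tmp/setup_benchmark_session.json")

def start():
    """Begin a timed session by recording the start timestamp."""
    SESSION.write_text(json.dumps({"started": time.time(), "phases": {}}))

def mark(phase):
    """Record elapsed seconds since session start for the named phase."""
    state = json.loads(SESSION.read_text())
    state["phases"][phase] = round(time.time() - state["started"], 1)
    SESSION.write_text(json.dumps(state))

def complete():
    """Finalize the session and return the accumulated timings."""
    state = json.loads(SESSION.read_text())
    state["total"] = round(time.time() - state["started"], 1)
    return state
```

Writing state back to disk after every `mark` is what lets phases be timed across independent CLI invocations.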
Expected output (audit mode):

    🔍 Agent Setup Audit — 2026-04-20 05:00 UTC
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
    ✅ OpenClaw installed      v2.x.x
    ✅ API key configured      OpenRouter (openai/gpt-4o)
    ✅ MEMORY.md exists        1,423 chars
    ✅ USER.md exists          892 chars
    ✅ Skills installed        10 skills
    ✅ Crons configured        25 jobs
    ⚠️ Telegram channel        not configured
    ⚠️ Discord channel         not configured

    Friction Score: 72/100 (GOOD — minor gaps in channel config)

    Recommendations:
    1. Connect a messaging channel (Telegram takes ~3 min via /connect telegram)
    2. Consider adding a second model provider as backup (OpenAI or Anthropic)
## Friction Score Scale
| Score | Rating | Meaning |
|-------|--------|---------|
| 90–100 | 🟢 Excellent | Setup in under 20 min, all phases complete |
| 70–89 | 🟡 Good | Setup complete, minor gaps |
| 50–69 | 🟠 Moderate | Key phases missing (channels, memory, skills) |
| 0–49 | 🔴 Poor | Core config incomplete — API keys or memory missing |
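The scale above is a straightforward threshold mapping. A minimal sketch (the function name is illustrative):

```python
def rating(score):
    """Map a 0-100 friction score to its rating band.

    Thresholds mirror the scale table: 90+, 70+, 50+, below 50.
    """
    if score >= 90:
        return "🟢 Excellent"
    if score >= 70:
        return "🟡 Good"
    if score >= 50:
        return "🟠 Moderate"
    return "🔴 Poor"
```

For example, the audit output shown earlier (72/100) falls in the 70–89 band and reads as Good.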