Babysitter User Guide
**Babysitter enforces obedience on agentic workforces: it runs your workflow as deterministic, code-defined orchestration on any supported harness, where the orchestrator can only do what your process permits. Manage extremely complex, multi-agent workflows with a hook-enforced mandatory stop after every step — enforcement, not assistance.**
New here? Jump straight to Start here for the 20-minute path, or use the task-based and role-based entry points below to go directly to the page you need.
---
Start here
The fastest path from zero to a working run:
1. Installation - install the CLI and your harness plugin (5 min)
2. Quickstart - run your first workflow (10 min)
3. First Run Deep Dive - understand what just happened (10 min)
Prefer to learn the ideas first? Read What is Babysitter? (2 min), then see **how the whole ecosystem fits together** (vision + diagram + runtime flow) and the Two-Loops Architecture.
Want the lay of the land first? The Ecosystem Overview tours every component (the core engine, the adapters family, atlas, genty, the observer dashboard, kradle, and kip-sdk) and helps you choose which you need.
---
I want to…
Task-based entry points — pick the goal that matches what you are doing right now.
| I want to… | Go to |
|---|---|
| **Create a process** (custom workflow) | Process Definitions → Custom Process tutorial |
| **Run on my harness** (Codex, Cursor, Gemini, …) | Install Matrix → Slash Commands |
| **Debug a run** (errors, stuck runs, recovery) | Troubleshooting → Error Catalog |
| **Write tests / set quality targets** | Quality Convergence → Best Practices |
| **Understand the architecture** | Architecture & How It Fits Together → Two-Loops Architecture |
| **Tour the components** | Ecosystem Overview → Adapter Types |
| **Run Babysitter from CI** | Adapters CLI → Configuration |
| **Look up a command or flag** | CLI Reference · Adapters CLI |
| **Learn a term** | Glossary |
---
By role and level
Role-based entry points — start where you fit, then follow the detailed Learning Paths below.
| You are a… | Start with |
|---|---|
| **New user** (first time) | Getting Started overview → Quickstart |
| **Process author** (build workflows) | Process Definitions → Custom Process tutorial |
| **CI / automation integrator** | Adapters CLI → Configuration → Security |
| **Technical lead / architect** | Two-Loops Architecture → Best Practices |
---
Quick Start
Get up and running with Babysitter in minutes.
| Step | Description | Time |
|---|---|---|
| Installation | Install the CLI and Claude Code plugin | 5 min |
| Quickstart | Configure your environment | 5 min |
| First Run | Execute your first babysitter workflow | 10 min |
---
What is Babysitter? (Start Here if You're New)
**Babysitter makes agentic work obedient.** It rests on three pillars:
1. **Deterministic process execution** - your workflow is real JavaScript code (async function process(inputs, ctx)), and the orchestrator can *only* do what that code permits. State is event-sourced in an immutable journal, so any run can be replayed and resumed from any point.
2. **Complex agentic workflows** - tasks, breakpoints, sleeps, parallel dispatch, dependencies, and sub-agent delegation across harnesses. A single headless entry point can orchestrate multi-agent work, delegating each task to whichever installed harness is best suited.
3. **Policy / process adherence (obedience)** - after *every* step there is a hook-enforced **mandatory stop**, a process check ("what does the process permit next?"), and a decision: permit the next task, or halt until a gate passes. **Enforcement, not assistance: gates block progression until satisfied; they're not suggestions.**
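To make the first and third pillars concrete, here is a minimal sketch of a process written as code and run under a toy orchestrator that stops after every step. It is an illustration under stated assumptions, not the Babysitter SDK: `createContext`, `ctx.task`, the `gate` callback, and `buildFeature` are hypothetical names invented for this example. Only the `(inputs, ctx)` signature comes from this guide.

```javascript
// Hypothetical sketch: none of these helpers come from the Babysitter SDK.
// The guide's process signature is `async function process(inputs, ctx)`;
// the function is named `buildFeature` here only to avoid shadowing
// Node's global `process` object.

// A toy orchestrator context. Every step goes through `ctx.task`, which
// journals the result and then stops to ask a gate whether to continue.
function createContext(gate) {
  const journal = []; // append-only record of completed steps
  return {
    journal,
    async task(name, work) {
      const result = await work();
      journal.push({ step: name, result });
      // Mandatory stop: the gate decides whether the run may advance.
      if (!gate(name, result)) {
        throw new Error(`Halted after "${name}": gate not satisfied`);
      }
      return result;
    },
  };
}

// The process itself is ordinary code: it can only do what it spells out.
async function buildFeature(inputs, ctx) {
  const code = await ctx.task("write-code", async () => `// ${inputs.feature}`);
  const passRate = await ctx.task("run-tests", async () => inputs.passRate);
  await ctx.task("deploy", async () => "deployed");
  return { code, passRate };
}

// Gate: block everything after "run-tests" unless 90% of tests pass.
const gate = (step, result) => step !== "run-tests" || result >= 0.9;

// Run the process once and report which steps were journaled.
async function demo(passRate) {
  const ctx = createContext(gate);
  try {
    await buildFeature({ feature: "login page", passRate }, ctx);
    return { status: "completed", steps: ctx.journal.map((e) => e.step) };
  } catch (err) {
    return {
      status: "halted",
      reason: err.message,
      steps: ctx.journal.map((e) => e.step),
    };
  }
}

demo(0.6).then((outcome) => console.log(outcome.status, outcome.steps));
```

In this sketch a run with a 60% pass rate never reaches the "deploy" step, because the gate refuses to permit it. The process code has no way around the check, since every step is routed through the context.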
The Problem Babysitter Solves
When you turn an AI agent loose on real work, it tends to keep going on its own judgment — skipping steps, declaring "done" without evidence, and drifting from the process you intended. Babysitter removes that discretion: the agent does exactly what the process permits, nothing more, and cannot advance past a gate it hasn't satisfied.
One illustration of how a gate works is the familiar "try, check, fix, repeat" loop — a code-defined gate keeps iterating until its quality criterion is met, then permits the next step. That quality convergence is *one* consequence of code-defined gates, not the whole product.
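The "try, check, fix, repeat" loop can be sketched in a few lines. This is a hedged illustration only: `runQualityGate` and its `attempt`, `score`, `target`, and `maxIterations` parameters are hypothetical names chosen for this example, and the scores are stand-in numbers rather than output from a real run.

```javascript
// Hypothetical sketch of a "try, check, fix, repeat" gate. The names are
// illustrative and are not part of the Babysitter API.
function runQualityGate({ attempt, score, target, maxIterations }) {
  const history = [];
  let artifact = null;
  let feedback = null;
  for (let i = 1; i <= maxIterations; i++) {
    artifact = attempt(artifact, feedback); // try (or fix, given feedback)
    const quality = score(artifact); // check
    history.push(quality);
    if (quality >= target) {
      return { passed: true, iterations: i, history, artifact };
    }
    feedback = quality; // gate not satisfied: feed the score back and repeat
  }
  // Bounded loop: report failure rather than advance past a failing gate.
  return { passed: false, iterations: maxIterations, history, artifact };
}

// Stand-in "agent" whose work improves along a fixed sequence of scores.
const scores = [60, 85, 95];
let tries = 0;
const result = runQualityGate({
  attempt: () => ({ version: ++tries }),
  score: (artifact) => scores[artifact.version - 1],
  target: 90,
  maxIterations: 5,
});

console.log(result.passed, result.iterations, result.history);
```

With the stand-in scores above, the gate passes on the third iteration. The `maxIterations` bound is a design choice in this sketch: a gate that can never be satisfied should end in a halted run, not an endless loop.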
How It Works (In Plain English)
Your process is code; the orchestrator enforces it. After each step it stops, checks what the process permits next, and only then permits the next task — or halts until a gate passes. The loop below shows one such gate (a quality gate) doing its job:
┌─────────────────────────────────────────────────────────────────┐
│ YOU: "Build a login page with tests" │
│ ↓ │
│ BABYSITTER: Enforces your process; one gate iterates: │
│ 1. AI writes code │
│ 2. Tests run → 60% pass │
│ 3. AI fixes failures │
│ 4. Tests run → 85% pass │
│ 5. AI fixes remaining issues │
│ 6. Tests run → 95% pass ✓ Target met! │
│ ↓ │
│ YOU: Review and approve the final result │
└─────────────────────────────────────────────────────────────────┘
Key Terms You'll See
| Term | What It Means | Example |
|---|---|---|
| **Process** | A workflow definition | "Build feature with TDD" |
| **Run** | One execution of a process | Running the TDD workflow for your login page |
| **Task** | A single step in the process | "Write tests", "Run linter", "Check coverage" |
| **Quality Gate** | A check that must pass | Tests must be 90% passing |
| **Breakpoint** | A pause for human approval | "Review this code before I deploy it" (handled in chat or via web UI) |
| **Iteration** | One try-check-fix cycle | Attempt #3 to pass the tests |
| **Convergence** | Improving until target met | Going from 60% → 85% → 95% |
Your First 5 Minutes
**What you'll do:**
1. Install Babysitter (1 command)
2. Run a simple workflow (1 command)
3. See it iterate until tests pass
4. Approve the result
**What you'll learn:**
- How the orchestrator only does what your process permits
- What the mandatory stop and process check do after each step
- How to approve at breakpoints
- What a quality gate looks like (one gate type among several)
**What you'll see:**
/babysitter:call build a calculator with add, subtract, multiply, divide using TDD
Creating run: calculator-20260125-143012
Process: TDD Quality Convergence
Target: 90% quality
Iteration 1: Quality 65/100 - Tests: 6/10 passing
→ AI fixing test failures...
Iteration 2: Quality 82/100 - Tests: 9/10 passing
→ AI improving code coverage...
Iteration 3: Quality 95/100 - Target met! ✅
Claude: The implementation is complete. Quality score: 95/100.
Do you approve the final result?
[Approve] [Request Changes]
You: [Approve]
Done! Your calculator module is ready.
**Note:** Breakpoints (approval prompts) are handled directly in the chat when using Claude Code. No external service needed!
**The main command:** /babysitter:call <your request> handles everything automatically.
→ **Start the Quick Start Tutorial**
---
Documentation Sections
Ecosystem & Architecture
The monorepo is one core engine surrounded by a family of components. Start with the architecture, then tour each piece.
| Page | Description |
|---|---|
| Architecture & How It Fits Together | Vision, a component diagram, and the runtime flow — how the engine, adapters, atlas, genty, kradle, and the dashboard cooperate |
| Ecosystem Overview | The whole monorepo and how to choose among components |
| babysitter-sdk | The core event-sourced orchestration engine (GA) |
| adapters (the family) | The multiplexer for all agents — a family of 20 package types, not one thing |
| atlas | The catalog / knowledge graph and atlas CLI (GA) |
| genty | The unified agent runtime and genty CLI (GA) |
| observer-dashboard | Real-time SSE run dashboard (GA) |
| kradle | Kubernetes-native Git forge with per-org assistant (**MVP**) |
| kip-sdk | Intended memory substrate — **spec/design only, no shipping code** |
---
Tutorials
Step-by-step learning guides that take you from beginner to expert.
| Tutorial | Level | Time | Description |
|---|---|---|---|
| Getting Started | Beginner | 20 min | Installation, setup, and your first run |
| Build a REST API | Beginner | 45 min | Create a complete REST API with TDD |
| Custom Process | Intermediate | 60 min | Build your own process definition |
| Multi-Phase Workflows | Advanced | 90 min | Orchestrate complex multi-phase development |
---
Features
Deep dives into Babysitter's core capabilities.
<!-- user-guide-index:features-table:start -->
| Feature | Description |
|---|---|
| **Two-Loops Architecture** | **Deterministic enforcement** - a symbolic orchestrator that can only do what your code permits, with a mandatory stop after every step (enforcement, not assistance) |
| **Process Definitions** | **Workflows as real JavaScript** - tasks, breakpoints, sleeps, parallel dispatch, dependencies, and sub-agent delegation orchestrated from code |
| **Adapters** | **Run complex agentic workflows on any supported harness** (v6) - harness-agnostic runtime, sub-agent delegation across harnesses, plus the host-side adapters CLI |
| **Journal System** | **Event-sourced, immutable journal** - deterministic replay and resume from any point |
| **Process Library** | **2,239 JavaScript process files in the live generated snapshot**, plus methodology, shared-process, skill, and agent layers discovered under library/ |
| Breakpoints | Human-in-the-loop approval gates - enforced pauses for critical decisions |
| Parallel Execution | Concurrent task execution and dependencies for faster results |
| Run Resumption | Continue interrupted workflows from any point via journal replay |
| Quality Convergence | One gate type among several - **five quality gate categories** (tests, code quality, static analysis, security, performance) with 90-score patterns; a consequence of code-defined gates |
| Best Practices | **Four guardrail layers**, multi-gate validation, workflow design, and team collaboration patterns |
<!-- user-guide-index:features-table:end -->
<!-- user-guide-index:process-library-highlight:start -->
<!-- user-guide-index:process-library-highlight:end -->
**Highlight:** The Process Library snapshot currently tracks 2,239 process files across 38 methodology families and the full specialization tree. Explore the library →
**Essential Reading:** Understanding the Two-Loops Architecture is key to designing reliable, bounded agentic workflows with proper guardrails and evidence-driven completion. For how the v6 subsystems fit together, start with the Architecture Overview.
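The Journal System and Run Resumption rows in the features table rest on one idea: state is rebuilt by replaying an append-only list of events. The sketch below shows that idea in miniature. The event types, the state shape, and the `applyEvent`, `replay`, and `remainingTasks` helpers are all assumptions made for illustration. They do not describe the actual Babysitter journal format.

```javascript
// Hypothetical sketch of event-sourced replay. The event shapes and
// function names are illustrative, not the Babysitter journal format.

// Fold one event into the run state without mutating the previous state.
function applyEvent(state, event) {
  switch (event.type) {
    case "TASK_COMPLETED":
      return { ...state, completed: [...state.completed, event.task] };
    case "BREAKPOINT_APPROVED":
      return { ...state, approvals: state.approvals + 1 };
    default:
      return state; // unknown events leave the state unchanged
  }
}

// Replaying the same journal prefix always produces the same state.
function replay(journal, upTo = journal.length) {
  const initial = { completed: [], approvals: 0 };
  return journal.slice(0, upTo).reduce(applyEvent, initial);
}

// Resuming: rebuild state, then work out which planned tasks remain.
function remainingTasks(plan, journal) {
  const done = new Set(replay(journal).completed);
  return plan.filter((task) => !done.has(task));
}

const journal = Object.freeze([
  { type: "TASK_COMPLETED", task: "write-tests" },
  { type: "TASK_COMPLETED", task: "implement" },
  { type: "BREAKPOINT_APPROVED" },
]);
const plan = ["write-tests", "implement", "run-linter", "deploy"];

console.log(replay(journal));
console.log(remainingTasks(plan, journal));
```

Because `replay` is a pure fold over the journal, an interrupted run can be resumed by replaying what was recorded and continuing with whatever `remainingTasks` reports. Passing a smaller `upTo` value reconstructs the state at an earlier point in the run.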
---
Harnesses
Babysitter v6 runs on 12 supported AI coding harnesses. Pick yours and follow its install and invocation guide.
| Harness | Description |
|---|---|
| Install Matrix | Every supported harness - install commands, invocation token, and per-harness hook model |
| Claude Code | Fully supported - /babysitter:* slash-commands and the babysit skill |
| Codex | Fully supported - $babysitter:* via the mention picker |
Migrating from the 0.0.x series? See the Migration Guide for every breaking change.
---
Reference
Technical specifications and lookup resources.
| Reference | Description |
|---|---|
| Slash Commands | **Core modes** (call, yolo, forever, plan) and utility commands for Claude Code |
| CLI Reference | Complete command-line interface documentation |
| Adapters CLI | The host-side adapters CLI - run, install, and manage any harness (v6) |
| Package & Plugin Map | Canonical public/internal docs map for active packages, apps, and harness plugins |
| Configuration | Environment variables and config file options |
| Security | Security model, trust boundaries, and hardening guidance |
| Error Catalog | All error codes with solutions |
| Glossary | Terminology and definitions |
| FAQ | Frequently asked questions |
| Troubleshooting | Common issues and resolutions |
---
Learning Paths
Choose a path based on your role and goals.
For Developers New to Babysitter
**Start here if this is your first time using Babysitter:**
1. **First:** Read the "What is Babysitter?" section above - it takes 2 minutes and explains the core concepts
2. **Then:** Complete the Getting Started tutorial (20 min) - you'll install and run your first workflow
3. **Practice:** Build your first project with REST API Tutorial (45 min)
4. **Reference:** Use the Glossary when you encounter unfamiliar terms (it has a quick-reference table at the top)
For Experienced Developers
1. Quick setup via Installation
2. Learn the Five Quality Gate Types for robust validation
3. Study Best Practices for workflow design
4. Reference the CLI for automation
For Technical Leads and Architects
1. **Start here:** Understand the Two-Loops Architecture philosophy
2. Study Quality Convergence for the 90-score convergence pattern
3. Review the Four Guardrail Layers for safety and control
4. Learn Journal System for audit compliance
5. Explore Custom Process for team workflows
For Quality Engineers
1. **Essential:** Study the Five Quality Gate Types
2. Review The 90-Score Convergence Pattern
3. Understand Evidence-Driven Completion
4. Apply Domain-Specific Targets from Best Practices
For DevOps and Automation Engineers
1. Install using Quickstart
2. Master the CLI Reference
3. Configure via Configuration Reference
4. Automate with Run Resumption
---
What's New
Version 6.0.0
- v6 launch edition: documented the harness-agnostic Adapters runtime across all 12 supported harnesses
- Unified the public npm surface around @a5c-ai/babysitter for the main CLI
- Split optional runtime orchestration into @a5c-ai/genty-platform
- Refreshed user-facing docs to match the current package and command boundaries
Recent Updates
| Version | Date | Highlights |
|---|---|---|
| 6.0.0 | 2026-06-22 | v6 launch edition: harness-agnostic Adapters runtime documented across 12 harnesses |
| 5.0.0 | 2026-04-25 | CLI/runtime package split clarified across public docs |
For the complete changelog, see the GitHub Releases.
---
Search Tips
Finding what you need quickly:
- **Commands:** Search for the command name (e.g., run:create, effects:get)
- **Errors:** Search for the error code or key words from the message
- **Concepts:** Use terms from the Glossary
- **Tasks:** Search for what you want to do (e.g., "resume", "breakpoint", "quality")
---
Getting Help
Documentation Resources
- FAQ - Common questions answered
- Troubleshooting - Problem resolution guides
- Error Catalog - Error codes and fixes
Community and Support
- **GitHub Issues:** Report bugs or request features
- **Discussions:** Community Q&A and discussions
---
Documentation Structure
This documentation follows the Diátaxis framework:
| Category | Purpose | User Mode |
|---|---|---|
| **Tutorials** | Learning through guided projects | Study |
| **Features** | Understanding capabilities | Study |
| **Reference** | Technical lookup information | Work |
| **How-to Guides** | Task-focused problem solving | Work |
---
Contributing
Found an issue with the documentation? Contributions are welcome.
1. Check existing issues first
2. Submit corrections via pull request
3. Follow the documentation style guide
---
*Last updated: 2026-06-23*