Agentic AI Atlas

Wiki article

Babysitter Plugin Architecture reference

How the babysitter orchestration system works across coding agent harnesses, from plugin generation through session lifecycle to run completion.

1. System Overview
2. The Three Muxes
3. Plugin Generation
4. Activation Modes
5. Session Lifecycle
6. Run Orchestration
7. Harness-Specific Flows

---

System Overview

The babysitter plugin installs into any supported coding agent (Claude Code, Codex, Pi, Gemini CLI, etc.) and provides orchestrated process execution. Three adapter layers compensate for the differences between harnesses.

Each layer solves a specific interoperability problem:

| Adapter | Problem | Solution |
| --- | --- | --- |
| hooks-adapter | Each harness has different hook event names, payloads, and output formats | Normalizes native events to canonical phases, runs unified handlers, renders results back to harness format |
| transport-adapter | Harnesses speak different API protocols (Anthropic, OpenAI, Google) | HTTP proxy translates between the harness's native protocol and any upstream provider |
| adapters | Harnesses have different CLI surfaces, capabilities, and plugin loading | Unified `adapters launch` resolves provider config, starts proxy if needed, spawns harness with correct args |

---

The Three Muxes

hooks-adapter: Hook Surface Normalization

Each harness fires lifecycle hooks differently. hooks-adapter normalizes them into a canonical interface:

**Adapter families:**

**shell-hook** (Claude Code, Codex, Cursor, Gemini CLI): Hook handlers are shell scripts invoked via subprocess. stdin receives JSON event, stdout returns JSON result.
**programmatic** (Pi, OpenCode): Hook handlers are in-process functions. No subprocess overhead.
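The shell-hook contract above (JSON event on stdin, JSON result on stdout) can be sketched as a pure function. This is a minimal illustration, not the hooks-adapter implementation: the `HookEvent` and `HookResult` shapes and the `runShellHook` name are assumptions made for the example.

```typescript
// Hypothetical event and result shapes; the real hooks-adapter types are not shown in this article.
interface HookEvent {
  phase: string;
  payload: Record<string, unknown>;
}

interface HookResult {
  ok: boolean;
  output?: Record<string, unknown>;
  error?: string;
}

// Takes the raw text a shell hook would read from stdin and returns the text
// it would write to stdout. Malformed input yields an error result instead of throwing.
function runShellHook(
  stdinText: string,
  handler: (event: HookEvent) => HookResult
): string {
  let event: HookEvent;
  try {
    event = JSON.parse(stdinText) as HookEvent;
  } catch {
    return JSON.stringify({ ok: false, error: "invalid JSON on stdin" });
  }
  return JSON.stringify(handler(event));
}
```

A programmatic adapter would call the same `handler` directly, skipping the serialization on both sides.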

**Canonical phases:** sessionStart, stop, sessionEnd, preToolUse, postToolUse, userPromptSubmit, notification, preCompact, beforePromptBuild

Each adapter maps native event names → canonical phases via mappings.ts sourced from the atlas graph.
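A rough sketch of what such a mapping lookup could look like follows. The table contents, the `example-harness` entry, and the `toCanonicalPhase` helper are illustrative assumptions; the actual `mappings.ts` is generated from the atlas graph and its shape is not shown here.

```typescript
// Canonical phases as listed in this article.
type CanonicalPhase =
  | "sessionStart"
  | "stop"
  | "sessionEnd"
  | "preToolUse"
  | "postToolUse"
  | "userPromptSubmit"
  | "notification"
  | "preCompact"
  | "beforePromptBuild";

// Hypothetical mapping table: adapter name -> native event name -> canonical phase.
// Native event names are placeholders chosen for illustration.
const exampleMappings: Record<string, Record<string, CanonicalPhase>> = {
  claude: {
    SessionStart: "sessionStart",
    Stop: "stop",
    PreToolUse: "preToolUse",
  },
  "example-harness": {
    session_started: "sessionStart",
    turn_finished: "stop",
  },
};

// Returns the canonical phase for a native event, or undefined when the
// adapter or the event is not mapped.
function toCanonicalPhase(
  adapter: string,
  nativeEvent: string
): CanonicalPhase | undefined {
  return exampleMappings[adapter]?.[nativeEvent];
}
```

Returning `undefined` for unmapped events lets the caller decide whether to ignore the event or report it.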

transport-adapter: Provider Protocol Bridge

When a harness needs to talk to a provider it doesn't support natively, transport-adapter runs as a local HTTP proxy.

**Engines:**

createOpenAICompletionEngine() — Foundry/Azure path. Handles input_schema → parameters tool normalization, streaming delta.tool_calls accumulation, tool_result → role:"tool" message translation.
createGoogleCompletionEngine() — Vertex/Gemini path. Handles functionCall/functionResponse translation, thoughtSignature server-side store for multi-turn preservation.
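The `input_schema` → `parameters` tool normalization mentioned for the OpenAI engine can be illustrated as below. The field names follow the public Anthropic and OpenAI tool-definition formats; the `toOpenAITool` helper itself is a hypothetical name and not taken from transport-adapter.

```typescript
// Anthropic-style tool definition: the JSON Schema lives under `input_schema`.
interface AnthropicTool {
  name: string;
  description?: string;
  input_schema: Record<string, unknown>;
}

// OpenAI-style tool definition: the same schema lives under `function.parameters`.
interface OpenAITool {
  type: "function";
  function: {
    name: string;
    description?: string;
    parameters: Record<string, unknown>;
  };
}

// Moves the schema from `input_schema` to `function.parameters` and wraps the
// tool in the `type: "function"` envelope. The description is copied only when present.
function toOpenAITool(tool: AnthropicTool): OpenAITool {
  return {
    type: "function",
    function: {
      name: tool.name,
      ...(tool.description !== undefined ? { description: tool.description } : {}),
      parameters: tool.input_schema,
    },
  };
}
```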

**When proxy is needed:** Determined by translateForHarness() — if the harness adapter declares proxyRequired: true for a given provider, transport-adapter bridges the gap.

adapters: Unified Launch Surface

`adapters launch` resolves provider config, decides whether a proxy is needed, prepares harness automation state, and spawns the harness.
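The launch sequence can be pictured as an ordered plan. The sketch below is purely illustrative: `LaunchRequest`, `LaunchPlan`, `planLaunch`, and the step names are assumptions, and the `proxyRequired` flag is passed in rather than resolved, since the resolution logic is not described here.

```typescript
// Hypothetical input: what the caller asked for, plus the already-resolved proxy flag.
interface LaunchRequest {
  harness: string;
  provider: string;
  proxyRequired: boolean;
  args: string[];
}

// Hypothetical output: the ordered steps and the command line to spawn.
interface LaunchPlan {
  steps: string[];
  command: string[];
}

// Builds the step list in the order the article describes. The proxy step is
// included only when the harness cannot talk to the provider natively.
function planLaunch(req: LaunchRequest): LaunchPlan {
  const steps: string[] = ["resolve-provider-config"];
  if (req.proxyRequired) {
    steps.push("start-proxy");
  }
  steps.push("prepare-automation-state", "spawn-harness");
  return { steps, command: [req.harness, ...req.args] };
}
```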

---

Plugin Generation

`npm run generate:plugins` compiles unified plugin source into harness-specific distributions via the extensions-adapter compiler.

**Per-harness output structure:**

| Harness | Plugin Format | Hook Mechanism |
| --- | --- | --- |
| Claude Code | `plugin.json` + `hooks/hooks.json` + shell scripts | Shell hooks: `babysitter-proxied-session-start.sh` → `a5c-hooks-adapter invoke --adapter claude` |
| Codex | `.codex-plugin/plugin.json` + `hooks/hooks.json` + shell scripts | Shell hooks via hooks.json (auto-detected at `./hooks/hooks.json`) |
| Pi | `package.json` with `pi.extensions` | In-process programmatic hooks |
| Gemini CLI | Gemini-native hook config | Shell hooks via adapter |

**Installation:**

Claude Code: `claude plugin marketplace add a5c-ai/babysitter-claude && claude plugin install --scope project babysitter@a5c.ai`
Codex: `codex plugin marketplace add a5c-ai/babysitter --ref staging --sparse .agents/plugins`
Others: `babysitter harness:install-plugin <harness> --workspace <cwd>`

---

Activation Modes

The babysitter plugin activates differently depending on how the harness is launched:

Hook-Driven (Interactive)

The harness runs interactively with native hook support. Hooks drive the orchestration loop — the stop hook decides whether to continue or yield.

**Key: hookDriven=true** — The stop hook controls the loop. When the agent finishes a turn, Claude Code fires the stop hook. The hook checks if the babysitter run needs more iterations and returns decision: "block" (continue) or "allow" (stop).
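The decision described above can be sketched as a small function. The `RunStatus` shape and the `decideStop` name are assumptions for illustration; only the `"block"` and `"allow"` decision values come from the text.

```typescript
// Hypothetical view of the active babysitter run, as the stop hook might see it.
interface RunStatus {
  completed: boolean;
  pendingEffects: number;
}

// Result returned to the harness: "block" keeps the agent going, "allow" lets it stop.
interface StopDecision {
  decision: "block" | "allow";
  reason?: string;
}

// Allows the stop when there is no active run or the run is complete;
// otherwise blocks it so the orchestration loop gets another iteration.
function decideStop(run: RunStatus | undefined): StopDecision {
  if (run === undefined || run.completed) {
    return { decision: "allow" };
  }
  return {
    decision: "block",
    reason: `${run.pendingEffects} effect(s) still pending`,
  };
}
```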

Agent-Driven (Non-Interactive)

The harness runs headless with -p or exec. No native hooks fire. The agent drives the loop in-turn by calling run:iterate repeatedly.

**Key: hookDriven=false** — The agent owns the loop. It calls run:iterate, executes effects, posts results, and loops until completion. No hooks needed.
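As a rough sketch, the agent-owned loop looks like the following. The `iterate`, `execute`, and `postResult` callbacks stand in for `run:iterate` and the surrounding CLI calls; their signatures, the `IterateResult` shape, and the `maxIterations` guard are assumptions made for the example.

```typescript
// Hypothetical effect and iteration result shapes.
interface PendingEffect {
  id: string;
  kind: "agent" | "skill" | "shell";
  title: string;
}

type IterateResult =
  | { status: "waiting"; effects: PendingEffect[] }
  | { status: "completed" };

// Calls iterate until the run reports completion, executing and posting each
// requested effect in between. Returns the number of iterate calls made.
function driveRun(
  iterate: () => IterateResult,
  execute: (effect: PendingEffect) => string,
  postResult: (effectId: string, result: string) => void,
  maxIterations = 100
): number {
  for (let i = 1; i <= maxIterations; i++) {
    const result = iterate();
    if (result.status === "completed") {
      return i;
    }
    for (const effect of result.effects) {
      postResult(effect.id, execute(effect));
    }
  }
  throw new Error("run did not complete within maxIterations");
}
```

The iteration cap is a defensive choice for the sketch, so a run that never completes fails loudly instead of looping forever.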

Bridge-Hooks (Emulated)

When the harness is non-interactive but the babysitter lifecycle needs hooks, `adapters launch --bridge-hooks` emulates them via CLI calls.

Bridge-Interactive (PTY Bridge)

The harness runs interactively via PTY but presents structured NDJSON output externally. Used when the harness needs a TTY for tool use but the caller wants machine-readable output.

---

Session Lifecycle

The `instructions:babysit-skill` Command

When the babysitter skill activates (via /babysitter:call or equivalent), it first calls instructions:babysit-skill to get orchestration guidance:

**Context detection:**

CI vs local, trigger type, repo info
Existing session state from ~/.a5c/state/hooks/sessions/
Active run state from .a5c/runs/
Library process suggestions matching active capabilities

Stop Hook Decision Logic

The stop hook is the key control point in hook-driven mode.

---

Run Orchestration

Run Lifecycle

Effect Types

Processes emit effects via ctx.task():

| Effect Kind | Executed By | Example |
| --- | --- | --- |
| `agent` | The coding agent (Claude Code, Codex, etc.) | "Write unit tests for the API module" |
| `skill` | A babysitter skill | "Run the TDD triplet skill" |
| `shell` | Direct shell command | "npm test", "git commit", "eslint --fix" |
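The three effect kinds in the table lend themselves to a discriminated routing step. The sketch below models only that routing; the signature of `ctx.task()` is not shown in this article, so `EffectRequest` and `executorFor` are hypothetical names.

```typescript
// Effect kinds from the table above.
type EffectKind = "agent" | "skill" | "shell";

// Hypothetical request shape: a kind plus a human-readable title.
interface EffectRequest {
  kind: EffectKind;
  title: string;
}

// Names the executor responsible for each effect kind. The switch is
// exhaustive over EffectKind, so adding a kind forces this function to be updated.
function executorFor(kind: EffectKind): string {
  switch (kind) {
    case "agent":
      return "coding agent";
    case "skill":
      return "babysitter skill";
    case "shell":
      return "shell command";
  }
}

// Produces a one-line routing description for an effect request.
function describeRouting(effect: EffectRequest): string {
  return `${effect.title} -> ${executorFor(effect.kind)}`;
}
```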

Journal Event Flow

Every state change is recorded in the run journal (.a5c/runs/<runId>/journal/):

RUN_CREATED → EFFECT_REQUESTED → EFFECT_RESOLVED → EFFECT_REQUESTED → EFFECT_RESOLVED → RUN_COMPLETED

The replay engine reconstructs state from journal events, enabling resumption after crashes or session switches.
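A minimal sketch of journal replay follows, assuming each event carries the fields shown. The event type names come from the flow above; the payload fields (`runId`, `effectId`, `result`), the `RunState` shape, and the `replay` function are assumptions and not the actual replay engine.

```typescript
// Journal events, with hypothetical payload fields.
type JournalEvent =
  | { type: "RUN_CREATED"; runId: string }
  | { type: "EFFECT_REQUESTED"; effectId: string }
  | { type: "EFFECT_RESOLVED"; effectId: string; result: string }
  | { type: "RUN_COMPLETED" };

// State reconstructed from the journal.
interface RunState {
  runId?: string;
  pending: string[];
  results: Record<string, string>;
  completed: boolean;
}

// Folds the events in order into a RunState. Effects that were requested but
// never resolved remain in `pending`, which is where a resumed session picks up.
function replay(events: JournalEvent[]): RunState {
  const state: RunState = { pending: [], results: {}, completed: false };
  for (const ev of events) {
    switch (ev.type) {
      case "RUN_CREATED":
        state.runId = ev.runId;
        break;
      case "EFFECT_REQUESTED":
        state.pending.push(ev.effectId);
        break;
      case "EFFECT_RESOLVED":
        state.pending = state.pending.filter((id) => id !== ev.effectId);
        state.results[ev.effectId] = ev.result;
        break;
      case "RUN_COMPLETED":
        state.completed = true;
        break;
    }
  }
  return state;
}
```

Replaying a journal that ends after a second `EFFECT_REQUESTED` yields one resolved result and one pending effect, which models a crash mid-run.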

---

Harness-Specific Flows

Claude Code

Codex

Pi

---

Provider Path Details

When a harness speaks a different protocol than the upstream provider, transport-adapter bridges the gap:

**Message translation details:**

| Direction | From | To | Key Translation |
| --- | --- | --- | --- |
| Request | Anthropic `tool_use` | OpenAI `role:"assistant"` + `tool_calls` | `input` → `function.arguments` (JSON string), `id` → `tool_calls[].id` |
| Request | Anthropic `tool_result` | OpenAI `role:"tool"` | `tool_use_id` → `tool_call_id`, `content` → `content` |
| Request | Anthropic `tool_use` | Google `functionCall` | `input` → `args`, `thoughtSignature` from server-side store |
| Request | Anthropic `tool_result` | Google `functionResponse` | `tool_use_id` → name lookup via `toolIdToName` map |
| Response | OpenAI `delta.tool_calls` | Anthropic `tool_use` stream | Accumulate chunks → `content_block_start` + `input_json_delta` |
| Response | Google `functionCall` | Anthropic `tool_use` stream | Extract `thoughtSignature` → store server-side |
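The two Anthropic → OpenAI request rows can be illustrated with a small translator. Field names follow the public Anthropic Messages and OpenAI Chat Completions formats; the `toOpenAIMessages` function is a hypothetical helper, and `tool_result` content is simplified to a plain string for the sketch.

```typescript
// Simplified Anthropic content blocks (tool_result content reduced to a string).
type AnthropicBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface AnthropicMessage {
  role: "user" | "assistant";
  content: AnthropicBlock[];
}

interface OpenAIToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

interface OpenAIAssistantMessage {
  role: "assistant";
  content: string | null;
  tool_calls?: OpenAIToolCall[];
}

type OpenAIMessage =
  | { role: "user"; content: string }
  | OpenAIAssistantMessage
  | { role: "tool"; tool_call_id: string; content: string };

// Translates one Anthropic message into the OpenAI messages that represent it.
function toOpenAIMessages(msg: AnthropicMessage): OpenAIMessage[] {
  const toolMessages: OpenAIMessage[] = [];
  const texts: string[] = [];
  const toolCalls: OpenAIToolCall[] = [];

  for (const block of msg.content) {
    if (block.type === "text") {
      texts.push(block.text);
    } else if (block.type === "tool_use") {
      // `input` is an object in Anthropic's format; OpenAI expects a JSON string.
      toolCalls.push({
        id: block.id,
        type: "function",
        function: { name: block.name, arguments: JSON.stringify(block.input) },
      });
    } else {
      // tool_result becomes its own role:"tool" message.
      toolMessages.push({
        role: "tool",
        tool_call_id: block.tool_use_id,
        content: block.content,
      });
    }
  }

  const text = texts.join("\n");
  if (msg.role === "assistant") {
    // Assistant turns carry text and tool calls; tool_result blocks are not expected here.
    const assistant: OpenAIAssistantMessage = {
      role: "assistant",
      content: text.length > 0 ? text : null,
    };
    if (toolCalls.length > 0) {
      assistant.tool_calls = toolCalls;
    }
    return [assistant];
  }

  // User turns: tool results first, then any remaining text as a user message.
  if (text.length > 0) {
    toolMessages.push({ role: "user", content: text });
  }
  return toolMessages;
}
```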
