agentic-toolkit/README.md
Khalim Conn-Kowlessar 3f27d10fb6 feat: add /ralph-loop skill (subscription-based runner)
Subscription-based counterpart to `agentic-toolkit run`. Instead of
sandcastle + Docker + Anthropic API, dispatches each ready ticket to
a fresh Claude Code subagent (general-purpose) — same fresh-context
property as per-container sandcastle runs, but zero infra.

Trade-off: no sandbox isolation. Recommend running on a clean checkout.

Mirrors the CLI runner's project schema, phase logic, branch naming,
status transitions, and idempotency. v1 fails on first error (no retry
state machine yet) — failure-handler.ts parity is future work.

README updated with two-path workflow diagram and comparison table.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-05 11:08:31 +00:00

agentic-toolkit

Domna's agentic toolkit. Two things in one repo:

  1. A curated, version-pinned skill set for Claude Code (Matt Pocock's skills + Domna's own), installable into any target repo with one script.
  2. A sandcastle-based runner that executes a GitHub Project of issues against a target repo, in either per-ticket-PR mode or single-PR mode.

Quick start (consume in a target repo)

From the root of any Domna repo:

curl -fsSL https://raw.githubusercontent.com/Hestia-Homes/agentic-toolkit/main/setup.sh | bash

This installs the curated skills into the repo and writes skills-lock.json. Re-run whenever the toolkit bumps its pinned versions.

After installation, run /setup-matt-pocock-skills once per repo to record the issue tracker, triage labels, and domain-doc layout.

Running the runner

The runner is invoked from inside this repo, pointing at a target repo.

Prerequisites:

  • Docker Desktop running on macOS
  • GITHUB_TOKEN env var with repo and project scopes
  • A GitHub Project (v2) created via /to-project (or manually with the same Status schema)

git clone https://github.com/Hestia-Homes/agentic-toolkit.git
cd agentic-toolkit
npm install
npm run build

GITHUB_TOKEN=ghp_xxx \
GITHUB_VIEWER_LOGIN=KhalimCK \
node bin/run-sandcastle.js run \
  --project 7 \
  --mode per-ticket \
  --owner Hestia-Homes \
  --repo assessment-model \
  --target-repo ~/Documents/hestia/assessment-model

Modes:

  • per-ticket — one PR per issue, phase-gated. Runner exits between phases; re-run after PRs merge.
  • single-pr — one PR for the whole DAG. Runner halts on any failure.

Workflow

                                                ┌─→ agentic-toolkit run --project N --mode <variant>   (Docker + ANTHROPIC_API_KEY)
/grill-me  →  /to-prd  →  /to-issues  →  /to-project ─┤
                                                └─→ /ralph-loop project=N mode=<variant>              (Claude Code subscription, no Docker)

The to-project and ralph-loop skills live under skills/engineering/ and are installed by setup.sh.

Pick a runner

| Path | Auth | Sandbox | Cost | Parallelism |
| --- | --- | --- | --- | --- |
| agentic-toolkit run (sandcastle) | ANTHROPIC_API_KEY | Docker container | API metered | Per-container |
| /ralph-loop (skill) | Claude Code subscription | None (host repo) | Subscription flat | Serial (one ticket at a time per Claude session) |

/ralph-loop is the zero-infra path: no Docker, no API key. It dispatches each ticket to a fresh Claude Code subagent (clean context per tick) using the same project schema and phase logic as the CLI runner. Trade-off: no sandbox isolation — run on a clean checkout. See skills/engineering/ralph-loop/SKILL.md.

Architecture (modules in src/modules/)

| Module | Role |
| --- | --- |
| PhaseScheduler | Pure: topological sort of Blocked by → ordered phases. |
| PromptBuilder | Pure: build the per-ticket agent prompt. |
| FailureHandler | Pure state machine: retry / skip / halt given variant + retry count. |
| ProjectStateClient | GitHub Projects v2 + Issues GraphQL (read state, claim, set status, comment). |
| BranchManager | Git + gh ops in the target repo (push, open PR). |
| AgentRunner | Wraps sandcastle.run() with Docker provider and Claude Code agent. |
| LoopOrchestrator | Wires the above; runs the per-tick algorithm. |
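
To illustrate what "topological sort of Blocked by → ordered phases" could look like, here is a minimal sketch. It is not the repo's PhaseScheduler: the `Ticket` shape and the `schedulePhases` name are hypothetical, and it assumes each ticket lists the issue numbers it is blocked by.

```typescript
// Hypothetical shape: an issue number plus the issue numbers that block it.
interface Ticket {
  id: number;
  blockedBy: number[];
}

// Groups tickets into phases. Phase 0 has no blockers; every later phase
// depends only on tickets from earlier phases. Throws on an unknown blocker
// or on a dependency cycle.
function schedulePhases(tickets: Ticket[]): number[][] {
  const known = new Set<number>(tickets.map((t) => t.id));
  for (const t of tickets) {
    for (const b of t.blockedBy) {
      if (!known.has(b)) {
        throw new Error(`Ticket ${t.id} is blocked by unknown ticket ${b}`);
      }
    }
  }

  const done = new Set<number>();
  let remaining = tickets.slice();
  const phases: number[][] = [];

  while (remaining.length > 0) {
    // A ticket is ready when all of its blockers sit in an earlier phase.
    const ready = remaining.filter((t) => t.blockedBy.every((b) => done.has(b)));
    if (ready.length === 0) {
      throw new Error("Cycle detected in Blocked by relations");
    }
    const phase = ready.map((t) => t.id).sort((a, b) => a - b);
    phases.push(phase);
    for (const id of phase) done.add(id);
    remaining = remaining.filter((t) => !done.has(t.id));
  }
  return phases;
}
```

Because the function takes plain data and returns plain data, it can be unit-tested without touching GitHub or Docker, which is presumably what "Pure" means in the table above.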

Variant differences

| Concern | per-ticket | single-pr |
| --- | --- | --- |
| Branches | one per issue | one per project, reused |
| PRs | one per issue | one for the whole DAG |
| Phase gates | yes (exit between phases) | no (topological order only) |
| HITL mid-run | issue parked; peers continue | runner halts |
| Failure after retry | skip + continue | halt |
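
The "Failure after retry" row can be read as a small pure decision function. The sketch below is an illustration, not the repo's FailureHandler: `decideOnFailure` and `MAX_RETRIES` are hypothetical names, and the single-retry limit is an assumption inferred from the "failed twice" wording in the Status field list.

```typescript
type Variant = "per-ticket" | "single-pr";
type FailureAction = "retry" | "skip" | "halt";

// Assumption: one retry is allowed, so a ticket gives up on its second failure.
const MAX_RETRIES = 1;

// retryCount is the number of retries already spent on this ticket.
function decideOnFailure(variant: Variant, retryCount: number): FailureAction {
  if (retryCount < MAX_RETRIES) return "retry";
  // Retries exhausted: per-ticket skips the ticket and continues with its
  // peers, while single-pr halts the whole run.
  return variant === "per-ticket" ? "skip" : "halt";
}
```

For example, `decideOnFailure("per-ticket", 0)` yields `"retry"`, while `decideOnFailure("single-pr", 1)` yields `"halt"`.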

Project Status field

/to-project configures a single-select Status field with these options:

  • Backlog: has unmet blockers.
  • Ready: the runner can pick it up; an AFK issue (no human input needed) whose blockers are all Done.
  • In progress: being executed by an agent right now.
  • In review: PR open, waiting for human merge.
  • Needs human: failed twice, or flagged as human-in-the-loop (HITL).
  • Done: issue closed (set automatically on PR merge by Projects' built-in workflow).
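
As a rough illustration of the Backlog and Ready definitions above, the check below decides whether a Backlog item could be promoted to Ready. The `ProjectItem` shape, the `afk` flag, and the `canPromoteToReady` name are hypothetical; the toolkit's actual promotion logic may differ.

```typescript
type Status =
  | "Backlog"
  | "Ready"
  | "In progress"
  | "In review"
  | "Needs human"
  | "Done";

// Hypothetical item shape; `afk` marks issues that need no human input.
interface ProjectItem {
  id: number;
  status: Status;
  afk: boolean;
  blockedBy: number[];
}

// A Backlog item qualifies for Ready once it is AFK and every blocker is Done.
function canPromoteToReady(item: ProjectItem, items: ProjectItem[]): boolean {
  if (item.status !== "Backlog" || !item.afk) return false;
  const statusById = new Map<number, Status>();
  for (const other of items) statusById.set(other.id, other.status);
  return item.blockedBy.every((b) => statusById.get(b) === "Done");
}
```

An item whose blocker is still In review stays in Backlog under this rule, which matches the per-ticket flow of re-running the runner after PRs merge.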

Development

npm install
npm test
npm run typecheck

Pure modules (PhaseScheduler, PromptBuilder, FailureHandler, ProjectStateClient parsers) are unit-tested. Integration with sandcastle / git / GraphQL is exercised manually before each release.
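
To show why a parser is straightforward to unit-test, here is a sketch of a pure function that flattens project items into a simple list. The response shape is an assumption modelled on a Projects v2 query that selects `content` and `fieldValueByName(name: "Status")`; the type and function names are hypothetical and are not taken from the repo.

```typescript
// Assumed shape of one item node in the GraphQL response (not from the repo).
interface RawProjectItem {
  content: { number?: number; title?: string } | null;
  fieldValueByName: { name?: string } | null;
}

interface ParsedItem {
  issueNumber: number;
  title: string;
  status: string | null;
}

// Keeps only items backed by a numbered issue; entries without a number
// (for example draft items) are skipped. A missing Status becomes null.
function parseProjectItems(nodes: RawProjectItem[]): ParsedItem[] {
  const parsed: ParsedItem[] = [];
  for (const node of nodes) {
    const content = node.content;
    if (content === null || content.number === undefined) continue;
    parsed.push({
      issueNumber: content.number,
      title: content.title ?? "",
      status: node.fieldValueByName?.name ?? null,
    });
  }
  return parsed;
}
```

A test can feed this function a hand-written fixture and compare the result, with no network access or token required.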

Out of scope (v1)

  • Remote / parallel runners across machines (local-first).
  • Slack / email failure notifications (issue comments only).
  • Stacked PRs and phase branches.
  • Cross-repo projects.
  • Pinning Matt Pocock skills to a specific commit SHA — setup.sh tracks HEAD for now; SHA pinning will land when the upstream skills CLI supports repo#sha.