# agentic-toolkit

Domna's agentic toolkit. Two things in one repo:

1. **A curated, version-pinned skill set** for Claude Code (Matt Pocock's skills + Domna's own), installable into any target repo with one script.
2. **A sandcastle-based runner** that executes a GitHub Project of issues against a target repo, in either per-ticket-PR mode or single-PR mode.

## Quick start (consume in a target repo)

From the root of any Domna repo:

```sh
curl -fsSL https://raw.githubusercontent.com/Hestia-Homes/agentic-toolkit/main/setup.sh | bash
```

This installs the curated skills into the repo and writes `skills-lock.json`. Re-run whenever the toolkit bumps its pinned versions.

After installation, run `/setup-matt-pocock-skills` once per repo to record the issue tracker, triage labels, and domain-doc layout.

## Running the runner

The runner is invoked from inside this repo, pointing at a target repo. Prerequisites:

- Docker Desktop running on macOS
- `GITHUB_TOKEN` env var with `repo` and `project` scopes
- A GitHub Project (v2) created via `/to-project` (or manually with the same Status schema)

```sh
git clone https://github.com/Hestia-Homes/agentic-toolkit.git
cd agentic-toolkit
npm install
npm run build

GITHUB_TOKEN=ghp_xxx \
GITHUB_VIEWER_LOGIN=KhalimCK \
node bin/run-sandcastle.js run \
  --project 7 \
  --mode per-ticket \
  --owner Hestia-Homes \
  --repo assessment-model \
  --target-repo ~/Documents/hestia/assessment-model
```

Modes:

- `per-ticket`: one PR per issue, phase-gated. Runner exits between phases; re-run after PRs merge.
- `single-pr`: one PR for the whole DAG. Runner halts on any failure.

## Workflow

```
                                                ┌─→ agentic-toolkit run --project N --mode (Docker + ANTHROPIC_API_KEY)
/grill-me → /to-prd → /to-issues → /to-project ─┤
                                                └─→ /ralph-loop project=N mode= (Claude Code subscription, no Docker)
```

`to-project` and `ralph-loop` skills live under `skills/engineering/` and are installed by `setup.sh`.

### Pick a runner

| Path | Auth | Sandbox | Cost | Parallelism |
|------|------|---------|------|-------------|
| `agentic-toolkit run` (sandcastle) | `ANTHROPIC_API_KEY` | Docker container | API metered | Per-container |
| `/ralph-loop` (skill) | Claude Code subscription | None (host repo) | Subscription flat | Serial (one ticket at a time per Claude session) |

`/ralph-loop` is the zero-infra path: no Docker, no API key. It dispatches each ticket to a fresh Claude Code subagent (clean context per tick) using the same project schema and phase logic as the CLI runner. Trade-off: no sandbox isolation, so run it on a clean checkout. See `skills/engineering/ralph-loop/SKILL.md`.

## Architecture (modules in `src/modules/`)

| Module               | Role                                                                          |
|----------------------|-------------------------------------------------------------------------------|
| `PhaseScheduler`     | Pure: topological sort of `Blocked by` → ordered phases.                      |
| `PromptBuilder`      | Pure: build the per-ticket agent prompt.                                      |
| `FailureHandler`     | Pure state machine: retry / skip / halt given variant + retry count.          |
| `ProjectStateClient` | GitHub Projects v2 + Issues GraphQL (read state, claim, set status, comment). |
| `BranchManager`      | Git + `gh` ops in the target repo (push, open PR).                            |
| `AgentRunner`        | Wraps `sandcastle.run()` with Docker provider and Claude Code agent.          |
| `LoopOrchestrator`   | Wires the above; runs the per-tick algorithm.                                 |

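
The `PhaseScheduler` and `FailureHandler` rows describe pure functions, so they are easy to illustrate in isolation. The following is a minimal TypeScript sketch, not the code in `src/modules/`: the names `Issue`, `schedulePhases`, `decideOnFailure`, and `MAX_RETRIES` are hypothetical, and the single-retry limit is an assumption inferred from the `Needs human` status being described as "failed twice". It groups issues into phases by their `Blocked by` links and picks retry / skip / halt from the variant and retry count.

```typescript
// Hypothetical shapes for illustration; the real modules may differ.
type Issue = { id: number; blockedBy: number[] };
type Variant = "per-ticket" | "single-pr";
type FailureAction = "retry" | "skip" | "halt";

// Group issues into ordered phases: an issue joins the first phase in which
// none of its blockers is still waiting to be scheduled.
function schedulePhases(issues: Issue[]): number[][] {
  let pending = issues.slice();
  const phases: number[][] = [];
  while (pending.length > 0) {
    const pendingIds = pending.map((issue) => issue.id);
    // Assumption: blockers that are not part of the input count as satisfied.
    const ready = pending.filter((issue) =>
      issue.blockedBy.every((blocker) => pendingIds.indexOf(blocker) === -1)
    );
    if (ready.length === 0) {
      // Every remaining issue waits on another remaining issue.
      throw new Error("Cycle in Blocked by links: " + pendingIds.join(", "));
    }
    const readyIds = ready.map((issue) => issue.id).sort((a, b) => a - b);
    phases.push(readyIds);
    pending = pending.filter((issue) => readyIds.indexOf(issue.id) === -1);
  }
  return phases;
}

// Assumption: one retry per ticket, so the second failure is final.
const MAX_RETRIES = 1;

// Decide what to do after a failed attempt. `retryCount` is the number of
// retries already used for this ticket.
function decideOnFailure(variant: Variant, retryCount: number): FailureAction {
  if (retryCount < MAX_RETRIES) {
    return "retry";
  }
  // After the retry: per-ticket skips and continues, single-pr halts.
  return variant === "per-ticket" ? "skip" : "halt";
}
```

With four issues where #2 and #3 are blocked by #1 and #4 is blocked by both, `schedulePhases` returns `[[1], [2, 3], [4]]`; in per-ticket mode each inner array would correspond to one phase gate.
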
### Variant differences

| Concern             | per-ticket                   | single-pr                   |
|---------------------|------------------------------|-----------------------------|
| Branches            | one per issue                | one per project, reused     |
| PRs                 | one per issue                | one for the whole DAG       |
| Phase gates         | yes (exit between phases)    | no (topological order only) |
| HITL mid-run        | issue parked; peers continue | runner halts                |
| Failure after retry | skip + continue              | halt                        |

### Project Status field

`/to-project` configures a single-select `Status` field with these options:

- `Backlog`: has unmet blockers.
- `Ready`: runner-pickable; AFK with all blockers Done.
- `In progress`: being executed by an agent right now.
- `In review`: PR open, waiting for human merge.
- `Needs human`: failed twice, or HITL.
- `Done`: issue closed (set automatically on PR merge by Projects' built-in workflow).

## Development

```sh
npm install
npm test
npm run typecheck
```

Pure modules (`PhaseScheduler`, `PromptBuilder`, `FailureHandler`, `ProjectStateClient` parsers) are unit-tested. Integration with sandcastle / git / GraphQL is exercised manually before each release.

## Out of scope (v1)

- Remote / parallel runners across machines (local-first).
- Slack / email failure notifications (issue comments only).
- Stacked PRs and phase branches.
- Cross-repo projects.
- Pinning Matt Pocock skills to a specific commit SHA. `setup.sh` tracks HEAD for now; SHA pinning will land when the upstream `skills` CLI supports `repo#sha`.