# agentic-toolkit

Domna's agentic toolkit. Two things in one repo:

1. **A curated, version-pinned skill set** for Claude Code (Matt Pocock's skills + Domna's own), installable into any target repo with one script.
2. **A sandcastle-based runner** that executes a GitHub Project of issues against a target repo, in either per-ticket-PR mode or single-PR mode.

## Quick start (consume in a target repo)

From the root of any Domna repo:

```sh
curl -fsSL https://raw.githubusercontent.com/Hestia-Homes/agentic-toolkit/main/setup.sh | bash
```

This installs the curated skills into the repo and writes `skills-lock.json`. Re-run whenever the toolkit bumps its pinned versions.

After installation, run `/setup-matt-pocock-skills` once per repo to record the issue tracker, triage labels, and domain-doc layout.

## Running the runner

The runner is invoked from inside this repo, pointing at a target repo.

Prerequisites:
- Docker Desktop running on macOS
- `GITHUB_TOKEN` env var with `repo` and `project` scopes
- `ANTHROPIC_API_KEY` env var (the sandcastle runner is API-metered; see "Pick a runner")
- A GitHub Project (v2) created via `/to-project` (or manually with the same Status schema)

```sh
git clone https://github.com/Hestia-Homes/agentic-toolkit.git
cd agentic-toolkit
npm install
npm run build

GITHUB_TOKEN=ghp_xxx \
GITHUB_VIEWER_LOGIN=KhalimCK \
node bin/run-sandcastle.js run \
  --project 7 \
  --mode per-ticket \
  --owner Hestia-Homes \
  --repo assessment-model \
  --target-repo ~/Documents/hestia/assessment-model
```
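
A launch wrapper could fail fast on missing credentials before starting Docker. The sketch below is a hypothetical helper (`missingEnv` is not part of the toolkit) that checks the three environment variables this README mentions. Whether the runner validates them itself is not assumed here.

```typescript
// Variable names are taken from this README; the helper itself is illustrative.
const REQUIRED_ENV = ["GITHUB_TOKEN", "GITHUB_VIEWER_LOGIN", "ANTHROPIC_API_KEY"] as const;

// Return the names of required variables that are unset or blank in `env`.
function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => (env[name] ?? "").trim() === "");
}

// Possible use in a Node script (not executed here):
//   const missing = missingEnv(process.env);
//   if (missing.length > 0) throw new Error(`Missing env vars: ${missing.join(", ")}`);
```

The environment is passed in as an argument so the check stays pure and easy to unit-test.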

Modes:

- `per-ticket` — one PR per issue, phase-gated. Runner exits between phases; re-run after PRs merge.
- `single-pr` — one PR for the whole DAG. Runner halts on any failure.

## Workflow

```
                                                ┌─→ agentic-toolkit run --project N --mode <variant>   (Docker + ANTHROPIC_API_KEY)
/grill-me  →  /to-prd  →  /to-issues  →  /to-project ─┤
                                                └─→ /ralph-loop project=N mode=<variant>              (Claude Code subscription, no Docker)
```

The `to-project` and `ralph-loop` skills live under `skills/engineering/` and are installed by `setup.sh`.

### Pick a runner

| Path | Auth | Sandbox | Cost | Parallelism |
|------|------|---------|------|-------------|
| `agentic-toolkit run` (sandcastle) | `ANTHROPIC_API_KEY` | Docker container | API metered | Per-container |
| `/ralph-loop` (skill) | Claude Code subscription | None — host repo | Subscription flat | Serial (one ticket at a time per Claude session) |

`/ralph-loop` is the zero-infra path: no Docker, no API key. It dispatches each ticket to a fresh Claude Code subagent (clean context per tick) using the same project schema and phase logic as the CLI runner. Trade-off: no sandbox isolation — run on a clean checkout. See `skills/engineering/ralph-loop/SKILL.md`.

## Architecture (modules in `src/modules/`)

| Module                 | Role                                                                                           |
|------------------------|------------------------------------------------------------------------------------------------|
| `PhaseScheduler`       | Pure: topological sort of `Blocked by` → ordered phases.                                       |
| `PromptBuilder`        | Pure: build the per-ticket agent prompt.                                                       |
| `FailureHandler`       | Pure state machine: retry / skip / halt given variant + retry count.                           |
| `ProjectStateClient`   | GitHub Projects v2 + Issues GraphQL (read state, claim, set status, comment).                  |
| `BranchManager`        | Git + `gh` ops in the target repo (push, open PR).                                             |
| `AgentRunner`          | Wraps `sandcastle.run()` with Docker provider and Claude Code agent.                           |
| `LoopOrchestrator`     | Wires the above; runs the per-tick algorithm.                                                  |
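
To make the `PhaseScheduler` row concrete, here is a minimal sketch of grouping issues into ordered phases with a layered topological sort (Kahn's algorithm). The `Ticket` shape and the `schedulePhases` name are assumptions for illustration, not the module's actual API.

```typescript
// Hypothetical ticket shape: an issue number plus the issues it is blocked by.
interface Ticket {
  id: number;
  blockedBy: number[];
}

// Group tickets into ordered phases: every ticket in phase k has all of its
// blockers in earlier phases. Throws on unknown blockers or cycles.
function schedulePhases(tickets: Ticket[]): number[][] {
  const known = new Set(tickets.map((t) => t.id));
  const remaining = new Map<number, Set<number>>();
  for (const t of tickets) {
    for (const b of t.blockedBy) {
      if (!known.has(b)) {
        throw new Error(`#${t.id} is blocked by unknown issue #${b}`);
      }
    }
    remaining.set(t.id, new Set(t.blockedBy));
  }

  const phases: number[][] = [];
  while (remaining.size > 0) {
    // Tickets with no unmet blockers form the next phase.
    const ready = Array.from(remaining.entries())
      .filter(([, blockers]) => blockers.size === 0)
      .map(([id]) => id)
      .sort((a, b) => a - b);
    if (ready.length === 0) {
      throw new Error("Cycle detected in the Blocked-by graph");
    }
    for (const id of ready) remaining.delete(id);
    // Mark this phase's tickets as satisfied for everything still waiting.
    remaining.forEach((blockers) => {
      for (const id of ready) blockers.delete(id);
    });
    phases.push(ready);
  }
  return phases;
}

const examplePhases = schedulePhases([
  { id: 3, blockedBy: [1, 2] },
  { id: 1, blockedBy: [] },
  { id: 2, blockedBy: [1] },
  { id: 4, blockedBy: [] },
]);
// examplePhases is [[1, 4], [2], [3]]
```

Because the function takes plain data and returns plain data, it can be unit-tested without GitHub or Docker, which matches the "Pure" label in the table.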

### Variant differences

| Concern             | per-ticket                   | single-pr                   |
|---------------------|------------------------------|-----------------------------|
| Branches            | one per issue                | one per project, reused     |
| PRs                 | one per issue                | one for the whole DAG       |
| Phase gates         | yes (exit between phases)    | no (topological order only) |
| HITL mid-run        | issue parked; peers continue | runner halts                |
| Failure after retry | skip + continue              | halt                        |
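
The "Failure after retry" row can be read as a small pure decision function. The sketch below is one way to express it; `onFailure` and `MAX_RETRIES` are hypothetical names, and the single-retry limit is an assumption inferred from the "failed twice" wording in the Status notes, not a confirmed setting.

```typescript
type Variant = "per-ticket" | "single-pr";
type FailureAction = "retry" | "skip" | "halt";

// Assumption: one retry per ticket, so the second failure is final.
const MAX_RETRIES = 1;

// Decide what to do after a ticket fails, given how many retries it has used.
function onFailure(variant: Variant, retryCount: number): FailureAction {
  if (retryCount < MAX_RETRIES) return "retry";
  // Retries exhausted: per-ticket skips the issue, single-pr stops the run.
  return variant === "per-ticket" ? "skip" : "halt";
}
```

With these assumptions, `onFailure("per-ticket", 1)` yields `"skip"` and `onFailure("single-pr", 1)` yields `"halt"`, while a first failure in either mode yields `"retry"`.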

### Project Status field

`/to-project` configures a single-select `Status` field with these options:

- `Backlog` — has unmet blockers.
- `Ready` — runner-pickable: an AFK issue (no human input needed) whose blockers are all `Done`.
- `In progress` — being executed by an agent right now.
- `In review` — PR open, waiting for human merge.
- `Needs human` — failed twice, or marked HITL (human-in-the-loop).
- `Done` — issue closed (set automatically on PR merge by Projects' built-in workflow).
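
As an illustration of the `Backlog` / `Ready` distinction, the options above can be modelled as a string union, with a helper that places a not-yet-started AFK issue based on its blockers. This is a sketch only: `queueStatus` is a hypothetical name, and HITL handling is deliberately left out.

```typescript
// The six Status options listed above.
type Status =
  | "Backlog"
  | "Ready"
  | "In progress"
  | "In review"
  | "Needs human"
  | "Done";

// Hypothetical helper: an AFK issue is Ready only when every blocker is Done.
// An issue with no blockers is Ready immediately.
function queueStatus(blockerStatuses: Status[]): "Backlog" | "Ready" {
  return blockerStatuses.every((s) => s === "Done") ? "Ready" : "Backlog";
}
```

For example, `queueStatus(["Done", "In review"])` returns `"Backlog"`, because a blocker with an open PR is not yet merged.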

## Development

```sh
npm install
npm test
npm run typecheck
```

Pure modules (`PhaseScheduler`, `PromptBuilder`, `FailureHandler`, `ProjectStateClient` parsers) are unit-tested. Integration with sandcastle / git / GraphQL is exercised manually before each release.

## Out of scope (v1)

- Remote / parallel runners across machines (local-first).
- Slack / email failure notifications (issue comments only).
- Stacked PRs and phase branches.
- Cross-repo projects.
- Pinning Matt Pocock skills to a specific commit SHA — `setup.sh` tracks HEAD for now; SHA pinning will land when the upstream `skills` CLI supports `repo#sha`.