mirror of
https://github.com/Hestia-Homes/agentic-toolkit.git
synced 2026-06-08 11:37:26 +00:00
Subscription-based counterpart to `agentic-toolkit run`. Instead of sandcastle + Docker + Anthropic API, dispatches each ready ticket to a fresh Claude Code subagent (general-purpose) — same fresh-context property as per-container sandcastle runs, but zero infra. Trade-off: no sandbox isolation. Recommend running on a clean checkout. Mirrors the CLI runner's project schema, phase logic, branch naming, status transitions, and idempotency. v1 fails on first error (no retry state machine yet) — failure-handler.ts parity is future work. README updated with two-path workflow diagram and comparison table. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
118 lines
5.5 KiB
Markdown
# agentic-toolkit

Domna's agentic toolkit. Two things in one repo:

1. **A curated, version-pinned skill set** for Claude Code (Matt Pocock's skills + Domna's own), installable into any target repo with one script.
2. **A sandcastle-based runner** that executes a GitHub Project of issues against a target repo, in either per-ticket-PR mode or single-PR mode.

## Quick start (consume in a target repo)

From the root of any Domna repo:

```sh
curl -fsSL https://raw.githubusercontent.com/Hestia-Homes/agentic-toolkit/main/setup.sh | bash
```

This installs the curated skills into the repo and writes `skills-lock.json`. Re-run whenever the toolkit bumps its pinned versions.

After installation, run `/setup-matt-pocock-skills` once per repo to record the issue tracker, triage labels, and domain-doc layout.

## Running the runner

The runner is invoked from inside this repo, pointing at a target repo.

Prerequisites:

- Docker Desktop running on macOS
- `GITHUB_TOKEN` env var with `repo` and `project` scopes
- A GitHub Project (v2) created via `/to-project` (or manually with the same Status schema)

```sh
git clone https://github.com/Hestia-Homes/agentic-toolkit.git
cd agentic-toolkit
npm install
npm run build

GITHUB_TOKEN=ghp_xxx \
GITHUB_VIEWER_LOGIN=KhalimCK \
node bin/run-sandcastle.js run \
  --project 7 \
  --mode per-ticket \
  --owner Hestia-Homes \
  --repo assessment-model \
  --target-repo ~/Documents/hestia/assessment-model
```

Modes:

- `per-ticket`: one PR per issue, phase-gated. Runner exits between phases; re-run after PRs merge.
- `single-pr`: one PR for the whole DAG. Runner halts on any failure.

## Workflow

```
                                                ┌─→ agentic-toolkit run --project N --mode <variant>   (Docker + ANTHROPIC_API_KEY)
/grill-me → /to-prd → /to-issues → /to-project ─┤
                                                └─→ /ralph-loop project=N mode=<variant>               (Claude Code subscription, no Docker)
```

The `to-project` and `ralph-loop` skills live under `skills/engineering/` and are installed by `setup.sh`.

### Pick a runner

| Path | Auth | Sandbox | Cost | Parallelism |
|------|------|---------|------|-------------|
| `agentic-toolkit run` (sandcastle) | `ANTHROPIC_API_KEY` | Docker container | Metered API usage | Per-container |
| `/ralph-loop` (skill) | Claude Code subscription | None (runs in the host repo) | Flat subscription | Serial (one ticket at a time per Claude session) |

`/ralph-loop` is the zero-infra path: no Docker, no API key. It dispatches each ticket to a fresh Claude Code subagent (clean context per tick) using the same project schema and phase logic as the CLI runner. Trade-off: there is no sandbox isolation, so run it on a clean checkout. See `skills/engineering/ralph-loop/SKILL.md`.

## Architecture (modules in `src/modules/`)

| Module               | Role                                                                          |
|----------------------|-------------------------------------------------------------------------------|
| `PhaseScheduler`     | Pure: topological sort of `Blocked by` → ordered phases.                      |
| `PromptBuilder`      | Pure: build the per-ticket agent prompt.                                      |
| `FailureHandler`     | Pure state machine: retry / skip / halt given variant + retry count.          |
| `ProjectStateClient` | GitHub Projects v2 + Issues GraphQL (read state, claim, set status, comment). |
| `BranchManager`      | Git + `gh` ops in the target repo (push, open PR).                            |
| `AgentRunner`        | Wraps `sandcastle.run()` with Docker provider and Claude Code agent.          |
| `LoopOrchestrator`   | Wires the above; runs the per-tick algorithm.                                 |

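To make the `PhaseScheduler` row concrete, here is a minimal sketch of turning `Blocked by` edges into ordered phases. It is not the module's actual code: the `Ticket` shape, the `schedulePhases` name, and the error messages are assumptions made for illustration. Phase 0 holds tickets with no blockers, and each later phase holds tickets whose blockers all sit in earlier phases.

```typescript
// Hypothetical shapes; the real PhaseScheduler types are not shown in this README.
type IssueNumber = number;

interface Ticket {
  number: IssueNumber;
  blockedBy: IssueNumber[]; // issue numbers taken from the ticket's `Blocked by` list
}

// Layered topological sort: repeatedly peel off every ticket whose blockers
// have all been scheduled already. Throws on unknown blockers and on cycles.
function schedulePhases(tickets: Ticket[]): IssueNumber[][] {
  const known = new Set(tickets.map((t) => t.number));
  const remaining = new Map<IssueNumber, Set<IssueNumber>>();
  for (const t of tickets) {
    for (const b of t.blockedBy) {
      if (!known.has(b)) {
        throw new Error(`#${t.number} is blocked by unknown issue #${b}`);
      }
    }
    remaining.set(t.number, new Set(t.blockedBy));
  }

  const phases: IssueNumber[][] = [];
  while (remaining.size > 0) {
    // Tickets with no unscheduled blockers form the next phase.
    const ready: IssueNumber[] = [];
    remaining.forEach((blockers, n) => {
      if (blockers.size === 0) ready.push(n);
    });
    if (ready.length === 0) {
      throw new Error("Cycle detected in the `Blocked by` graph");
    }
    ready.sort((a, b) => a - b); // deterministic order within a phase
    for (const n of ready) remaining.delete(n);
    remaining.forEach((blockers) => {
      for (const n of ready) blockers.delete(n);
    });
    phases.push(ready);
  }
  return phases;
}

const examplePhases = schedulePhases([
  { number: 3, blockedBy: [1, 2] },
  { number: 1, blockedBy: [] },
  { number: 2, blockedBy: [1] },
  { number: 4, blockedBy: [1] },
]);
console.log(JSON.stringify(examplePhases)); // [[1],[2,4],[3]]
```

Keeping the function free of I/O is what lets it be unit-tested in isolation; in `per-ticket` mode each inner array would correspond to one phase gate.
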
### Variant differences

| Concern             | per-ticket                   | single-pr                   |
|---------------------|------------------------------|-----------------------------|
| Branches            | one per issue                | one per project, reused     |
| PRs                 | one per issue                | one for the whole DAG       |
| Phase gates         | yes (exit between phases)    | no (topological order only) |
| HITL mid-run        | issue parked; peers continue | runner halts                |
| Failure after retry | skip + continue              | halt                        |

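The "Failure after retry" row can be read as a small pure decision function. The sketch below assumes a single retry per ticket (consistent with "failed twice" in the Status list) and uses hypothetical names (`decideOnFailure`, `MAX_RETRIES`); the real `FailureHandler` signature is not shown in this README.

```typescript
type Variant = "per-ticket" | "single-pr";
type FailureAction = "retry" | "skip" | "halt";

// Assumption: one retry, so a ticket fails at most twice before escalating.
const MAX_RETRIES = 1;

// Pure: the same (variant, retryCount) pair always yields the same action.
function decideOnFailure(variant: Variant, retryCount: number): FailureAction {
  if (retryCount < MAX_RETRIES) return "retry";
  // Retries exhausted: per-ticket skips this issue and lets peers continue,
  // while single-pr halts because every ticket shares one branch and PR.
  return variant === "per-ticket" ? "skip" : "halt";
}

console.log(decideOnFailure("per-ticket", 0)); // retry
console.log(decideOnFailure("per-ticket", 1)); // skip
console.log(decideOnFailure("single-pr", 1)); // halt
```
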
### Project Status field

`/to-project` configures a single-select `Status` field with these options:

- `Backlog`: has unmet blockers.
- `Ready`: runner-pickable; an AFK (away-from-keyboard, no human input needed) issue with all blockers `Done`.
- `In progress`: being executed by an agent right now.
- `In review`: PR open, waiting for human merge.
- `Needs human`: failed twice, or HITL (human-in-the-loop).
- `Done`: issue closed (set automatically on PR merge by Projects' built-in workflow).

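As a rough illustration of how the `Backlog`, `Ready`, and `Needs human` options relate, the sketch below derives a not-yet-started item's status from its blockers. The `BoardItem` shape, the `kind` field, and the `statusBeforeRun` name are hypothetical, and the rule that blockers are checked before the HITL flag is an assumption rather than documented behavior.

```typescript
type Status =
  | "Backlog"
  | "Ready"
  | "In progress"
  | "In review"
  | "Needs human"
  | "Done";

// Hypothetical item shape; the real project schema is defined by /to-project.
interface BoardItem {
  number: number;
  kind: "AFK" | "HITL";
  blockedBy: number[];
}

// Status an item should hold before any agent has picked it up.
function statusBeforeRun(item: BoardItem, statusOf: Map<number, Status>): Status {
  // A blocker that is missing from the map counts as unmet.
  const hasUnmetBlocker = item.blockedBy.some((n) => statusOf.get(n) !== "Done");
  if (hasUnmetBlocker) return "Backlog";
  return item.kind === "HITL" ? "Needs human" : "Ready";
}

const board = new Map<number, Status>([
  [1, "Done"],
  [2, "In review"],
]);
console.log(statusBeforeRun({ number: 3, kind: "AFK", blockedBy: [1] }, board)); // Ready
console.log(statusBeforeRun({ number: 4, kind: "AFK", blockedBy: [1, 2] }, board)); // Backlog
console.log(statusBeforeRun({ number: 5, kind: "HITL", blockedBy: [1] }, board)); // Needs human
```
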
## Development

```sh
npm install
npm test
npm run typecheck
```

Pure modules (`PhaseScheduler`, `PromptBuilder`, `FailureHandler`, `ProjectStateClient` parsers) are unit-tested. Integration with sandcastle / git / GraphQL is exercised manually before each release.

## Out of scope (v1)

- Remote / parallel runners across machines (local-first).
- Slack / email failure notifications (issue comments only).
- Stacked PRs and phase branches.
- Cross-repo projects.
- Pinning Matt Pocock skills to a specific commit SHA: `setup.sh` tracks HEAD for now; SHA pinning will land when the upstream `skills` CLI supports `repo#sha`.