agentic-toolkit/README.md
Khalim Conn-Kowlessar 1d8a77b29b feat: scaffold agentic-toolkit (runner + skills + setup)
Initial implementation of Domna's agentic toolkit per PRD #1:

- Runner CLI (src/cli.ts) wrapping sandcastle.run() with Docker provider
- Pure modules: PhaseScheduler, PromptBuilder, FailureHandler with tests
- Project Status v2 GraphQL client + parsers with tests
- BranchManager (git/gh wrapper) and LoopOrchestrator (per-tick algorithm)
- Variant-aware: per-ticket (one PR per issue, phase-gated, exit between phases)
  vs single-pr (one PR for the whole DAG, halt on failure)
- /to-project skill that creates a repo-level project, configures the Status
  schema the runner expects, and sets initial issue statuses
- setup.sh that installs Matt Pocock skills + Domna skills via npx skills

Out of scope at v1: remote runners, Slack notifications, stacked PRs,
cross-repo projects, SHA-pinning of upstream skills (tracks HEAD until the
skills CLI supports repo#sha).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 12:40:26 +01:00

agentic-toolkit

Domna's agentic toolkit. Two things in one repo:

  1. A curated, version-pinned skill set for Claude Code (Matt Pocock's skills + Domna's own), installable into any target repo with one script.
  2. A sandcastle-based runner that executes a GitHub Project of issues against a target repo, in either per-ticket-PR mode or single-PR mode.

Quick start (consume in a target repo)

From the root of any Domna repo:

curl -fsSL https://raw.githubusercontent.com/Hestia-Homes/agentic-toolkit/main/setup.sh | bash

This installs the curated skills into the repo and writes skills-lock.json. Re-run whenever the toolkit bumps its pinned versions.

After installation, run /setup-matt-pocock-skills once per repo to record the issue tracker, triage labels, and domain-doc layout.

Running the runner

Run the runner from a checkout of this repo and point it at the target repo with --target-repo.

Prerequisites:

  • Docker Desktop running on macOS
  • GITHUB_TOKEN env var with repo and project scopes
  • A GitHub Project (v2) created via /to-project (or manually with the same Status schema)

Clone and build the toolkit, then start a run:

git clone https://github.com/Hestia-Homes/agentic-toolkit.git
cd agentic-toolkit
npm install
npm run build

GITHUB_TOKEN=ghp_xxx \
GITHUB_VIEWER_LOGIN=KhalimCK \
node bin/run-sandcastle.js run \
  --project 7 \
  --mode per-ticket \
  --owner Hestia-Homes \
  --repo assessment-model \
  --target-repo ~/Documents/hestia/assessment-model

Modes:

  • per-ticket — one PR per issue, phase-gated. Runner exits between phases; re-run after PRs merge.
  • single-pr — one PR for the whole DAG. Runner halts on any failure.
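
The phase gate in per-ticket mode can be pictured as a small pure function. This is only a sketch under assumptions: `runnableNow`, the `number[][]` phase shape, and the `statusOf` lookup are hypothetical names, not the runner's actual API. It treats the first phase that still contains a non-Done issue as the active one and only offers that phase's Ready issues.

```typescript
// Status option names are taken from the Project Status field described in this README.
type Status = "Backlog" | "Ready" | "In progress" | "In review" | "Needs human" | "Done";

// Hypothetical helper: which issues may the runner pick up in this run?
// An empty result means there is nothing to do right now (gate closed or project finished).
function runnableNow(phases: number[][], statusOf: (issue: number) => Status): number[] {
  // The active phase is the first one that still has an issue that is not Done.
  const active = phases.find((phase) => phase.some((issue) => statusOf(issue) !== "Done"));
  if (active === undefined) return [];
  // Later phases stay gated until every issue in the active phase is Done.
  return active.filter((issue) => statusOf(issue) === "Ready");
}

// Example: issue 1 is merged, 2 is pickable, 3 is waiting for review, 4 is gated.
const exampleStatus: Record<number, Status> = { 1: "Done", 2: "Ready", 3: "In review", 4: "Backlog" };
const picked = runnableNow([[1], [2, 3], [4]], (issue) => exampleStatus[issue]);
console.log(picked);
```

Under this reading, once issue 2 also reaches In review the function returns an empty list, which corresponds to the runner exiting until the open PRs merge.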

Workflow

/grill-me  →  /to-prd  →  /to-issues  →  /to-project  →  agentic-toolkit run --project N --mode <variant>

to-project lives in skills/engineering/to-project/SKILL.md and is installed by setup.sh.

Architecture (modules in src/modules/)

| Module | Role |
| --- | --- |
| PhaseScheduler | Pure: topological sort of Blocked by → ordered phases. |
| PromptBuilder | Pure: build the per-ticket agent prompt. |
| FailureHandler | Pure state machine: retry / skip / halt given variant + retry count. |
| ProjectStateClient | GitHub Projects v2 + Issues GraphQL (read state, claim, set status, comment). |
| BranchManager | Git + gh ops in the target repo (push, open PR). |
| AgentRunner | Wraps sandcastle.run() with Docker provider and Claude Code agent. |
| LoopOrchestrator | Wires the above; runs the per-tick algorithm. |
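
As an illustration of the PhaseScheduler role, the sketch below groups issues into phases with a layered topological sort (Kahn's algorithm). It is not the module's real code: `schedulePhases` and the `BlockedBy` map shape are hypothetical, and ignoring blockers that live outside the project is an assumption.

```typescript
// Hypothetical input shape: issue number -> issue numbers it is "Blocked by".
type BlockedBy = Map<number, number[]>;

// Phase 0 holds issues with no blockers; phase k only depends on earlier phases.
function schedulePhases(blockedBy: BlockedBy): number[][] {
  const remaining = new Map<number, Set<number>>();
  for (const [issue, blockers] of blockedBy) {
    // Assumption: blockers that are not part of this project are ignored.
    remaining.set(issue, new Set(blockers.filter((b) => blockedBy.has(b))));
  }

  const phases: number[][] = [];
  while (remaining.size > 0) {
    // Every issue whose blockers are all scheduled joins the next phase.
    const ready = [...remaining.keys()]
      .filter((issue) => remaining.get(issue)!.size === 0)
      .sort((a, b) => a - b);
    if (ready.length === 0) {
      throw new Error(`Dependency cycle among issues: ${[...remaining.keys()].join(", ")}`);
    }
    for (const issue of ready) remaining.delete(issue);
    for (const blockers of remaining.values()) {
      for (const issue of ready) blockers.delete(issue);
    }
    phases.push(ready);
  }
  return phases;
}

// Issues 2 and 3 are blocked by 1; issue 4 is blocked by both 2 and 3.
const examplePhases = schedulePhases(
  new Map<number, number[]>([
    [1, []],
    [2, [1]],
    [3, [1]],
    [4, [2, 3]],
  ]),
);
console.log(examplePhases); // [[1], [2, 3], [4]]
```

Keeping the function free of I/O is what makes a module like this easy to unit-test: the input is plain data and a cycle surfaces as a thrown error rather than a hung run.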

Variant differences

| Concern | per-ticket | single-pr |
| --- | --- | --- |
| Branches | one per issue | one per project, reused |
| PRs | one per issue | one for the whole DAG |
| Phase gates | yes (exit between phases) | no (topological order only) |
| HITL (human-in-the-loop) mid-run | issue parked; peers continue | runner halts |
| Failure after retry | skip + continue | halt |
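
The "Failure after retry" row can be read as a tiny decision function. The sketch below is hypothetical: `onFailure` and `MAX_ATTEMPTS` are illustrative names, and the limit of two attempts is an assumption taken from the "failed twice" wording of the Needs human status.

```typescript
type Variant = "per-ticket" | "single-pr";
type Decision = "retry" | "skip" | "halt";

// Assumption: one retry is allowed, so the second failure is final.
const MAX_ATTEMPTS = 2;

// Decide what to do after an issue has failed `failedAttempts` times.
function onFailure(variant: Variant, failedAttempts: number): Decision {
  if (failedAttempts < MAX_ATTEMPTS) return "retry";
  // per-ticket parks the issue and lets its peers continue; single-pr stops the run.
  return variant === "per-ticket" ? "skip" : "halt";
}

console.log(onFailure("per-ticket", 1)); // "retry"
console.log(onFailure("per-ticket", 2)); // "skip"
console.log(onFailure("single-pr", 2)); // "halt"
```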

Project Status field

/to-project configures a single-select Status field with these options:

  • Backlog — has unmet blockers.
  • Ready — runner-pickable; AFK with all blockers Done.
  • In progress — being executed by an agent right now.
  • In review — PR open, waiting for human merge.
  • Needs human — failed twice, or HITL.
  • Done — issue closed (set automatically on PR merge by Projects' built-in workflow).
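
One way to picture how an initial status follows from these definitions is the sketch below. The `Issue` shape, the `kind` field, and `initialStatus` are hypothetical, and routing an unblocked HITL issue straight to Needs human is an assumed reading of the list above, not confirmed behaviour of /to-project.

```typescript
type Status = "Backlog" | "Ready" | "In progress" | "In review" | "Needs human" | "Done";

// Hypothetical issue shape; `kind` distinguishes agent-runnable (AFK) from human-in-the-loop work.
interface Issue {
  number: number;
  kind: "AFK" | "HITL";
  blockedBy: number[];
}

// Derive a starting status from the blockers that are already Done.
function initialStatus(issue: Issue, done: Set<number>): Status {
  const hasUnmetBlocker = issue.blockedBy.some((blocker) => !done.has(blocker));
  if (hasUnmetBlocker) return "Backlog";
  // Assumption: an unblocked HITL issue goes to a human rather than to the runner.
  return issue.kind === "HITL" ? "Needs human" : "Ready";
}

const doneIssues = new Set<number>([1]);
console.log(initialStatus({ number: 2, kind: "AFK", blockedBy: [1] }, doneIssues)); // "Ready"
console.log(initialStatus({ number: 3, kind: "AFK", blockedBy: [1, 2] }, doneIssues)); // "Backlog"
```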

Development

npm install
npm test
npm run typecheck

Pure modules (PhaseScheduler, PromptBuilder, FailureHandler, ProjectStateClient parsers) are unit-tested. Integration with sandcastle / git / GraphQL is exercised manually before each release.
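
To show why parsers are cheap to unit-test, here is a sketch of a pure function that turns project items into an issue-to-status map. The `RawItem` shape and `parseItems` are assumptions for illustration; the real ProjectStateClient parsers and the exact GraphQL selection may look different.

```typescript
const STATUSES = ["Backlog", "Ready", "In progress", "In review", "Needs human", "Done"] as const;
type Status = (typeof STATUSES)[number];

// Assumed payload shape: one node per project item, with the issue number
// and the name of the selected Status option (either may be missing).
interface RawItem {
  content: { number?: number } | null;
  fieldValueByName: { name?: string } | null;
}

// Map issue number -> Status, rejecting option names the runner does not know.
function parseItems(nodes: RawItem[]): Map<number, Status> {
  const result = new Map<number, Status>();
  for (const node of nodes) {
    const issueNumber = node.content?.number;
    // Assumption: items without an issue number (for example drafts) are skipped.
    if (issueNumber === undefined) continue;
    const name = node.fieldValueByName?.name;
    const status = STATUSES.find((s) => s === name);
    if (status === undefined) {
      throw new Error(`Issue #${issueNumber} has an unknown Status: ${String(name)}`);
    }
    result.set(issueNumber, status);
  }
  return result;
}

const exampleItems: RawItem[] = [
  { content: { number: 12 }, fieldValueByName: { name: "Ready" } },
  { content: null, fieldValueByName: null },
];
console.log(parseItems(exampleItems)); // Map(1) { 12 => 'Ready' }
```

Because the function takes plain data and returns plain data, a test only needs a hand-written payload and never touches the network.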

Out of scope (v1)

  • Remote / parallel runners across machines (local-first).
  • Slack / email failure notifications (issue comments only).
  • Stacked PRs and phase branches.
  • Cross-repo projects.
  • Pinning Matt Pocock skills to a specific commit SHA — setup.sh tracks HEAD for now; SHA pinning will land when the upstream skills CLI supports repo#sha.