mirror of https://github.com/Hestia-Homes/agentic-toolkit.git synced 2026-06-08 11:37:26 +00:00

Jun-te Kim 677a0add9d Merge pull request #5 from Hestia-Homes/feature/hestia-skills-instlaled error in setup		2026-05-13 17:01:36 +01:00
bin	feat: scaffold agentic-toolkit (runner + skills + setup)	2026-05-03 12:40:26 +01:00
scripts	added tdd	2026-05-13 15:45:44 +00:00
skills/engineering	added tdd	2026-05-13 15:45:44 +00:00
src	fix: use named docker import from sandcastle sandboxes/docker	2026-05-03 12:41:35 +01:00
.gitignore	added tdd	2026-05-13 15:45:44 +00:00
package.json	feat: scaffold agentic-toolkit (runner + skills + setup)	2026-05-03 12:40:26 +01:00
README.md	feat: add /ralph-loop skill (subscription-based runner)	2026-05-05 11:08:31 +00:00
setup.sh	error in setup	2026-05-13 16:00:41 +00:00
skills-lock.json	added tdd	2026-05-13 15:45:44 +00:00
skills.config.json	added tdd	2026-05-13 15:45:44 +00:00
tsconfig.json	feat: scaffold agentic-toolkit (runner + skills + setup)	2026-05-03 12:40:26 +01:00
vitest.config.ts	feat: scaffold agentic-toolkit (runner + skills + setup)	2026-05-03 12:40:26 +01:00

README.md

agentic-toolkit

Domna's agentic toolkit. Two things in one repo:

A curated, version-pinned skill set for Claude Code (Matt Pocock's skills + Domna's own), installable into any target repo with one script.
A sandcastle-based runner that executes a GitHub Project of issues against a target repo, in either per-ticket-PR mode or single-PR mode.

Quick start (consume in a target repo)

From the root of any Domna repo:

curl -fsSL https://raw.githubusercontent.com/Hestia-Homes/agentic-toolkit/main/setup.sh | bash

This installs the curated skills into the repo and writes skills-lock.json. Re-run whenever the toolkit bumps its pinned versions.

After installation, run /setup-matt-pocock-skills once per repo to record the issue tracker, triage labels, and domain-doc layout.

Running the runner

The runner is invoked from inside this repo, pointing at a target repo.

Prerequisites:

Docker Desktop running on macOS
GITHUB_TOKEN env var with repo and project scopes
ANTHROPIC_API_KEY env var (the sandcastle runner authenticates with it; see Pick a runner)
A GitHub Project (v2) created via /to-project (or manually with the same Status schema)

git clone https://github.com/Hestia-Homes/agentic-toolkit.git
cd agentic-toolkit
npm install
npm run build

GITHUB_TOKEN=ghp_xxx \
GITHUB_VIEWER_LOGIN=KhalimCK \
node bin/run-sandcastle.js run \
  --project 7 \
  --mode per-ticket \
  --owner Hestia-Homes \
  --repo assessment-model \
  --target-repo ~/Documents/hestia/assessment-model

Modes:

per-ticket — one PR per issue, phase-gated. Runner exits between phases; re-run after PRs merge.
single-pr — one PR for the whole DAG. Runner halts on any failure.
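
The prerequisites and modes above lend themselves to a small preflight check before the runner starts. The following is a minimal sketch, not the toolkit's actual code: `preflight` and `isMode` are hypothetical names, and treating `ANTHROPIC_API_KEY` as required is an assumption taken from the Auth column under "Pick a runner".

```typescript
// Hypothetical preflight check; names and messages are illustrative only.
const MODES = ["per-ticket", "single-pr"] as const;
type Mode = (typeof MODES)[number];

// Type guard: narrows an arbitrary string to one of the two documented modes.
function isMode(value: string): value is Mode {
  return (MODES as readonly string[]).includes(value);
}

// Returns a list of human-readable problems; an empty list means "good to go".
function preflight(
  mode: string,
  env: Record<string, string | undefined>,
): string[] {
  const problems: string[] = [];
  if (!isMode(mode)) {
    problems.push(`unknown --mode "${mode}" (expected ${MODES.join(" or ")})`);
  }
  // Assumption: both variables are needed for the sandcastle path.
  for (const name of ["GITHUB_TOKEN", "ANTHROPIC_API_KEY"]) {
    if (!env[name]) problems.push(`${name} is not set`);
  }
  return problems;
}
```

Passing the environment in as a plain record keeps the check pure, so it can be unit-tested without touching the real process environment.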

Workflow

                                                      ┌─→ agentic-toolkit run --project N --mode <variant>   (Docker + ANTHROPIC_API_KEY)
/grill-me  →  /to-prd  →  /to-issues  →  /to-project ─┤
                                                      └─→ /ralph-loop project=N mode=<variant>               (Claude Code subscription, no Docker)

The /to-project and /ralph-loop skills live under skills/engineering/ and are installed by setup.sh.

Pick a runner

| Path | Auth | Sandbox | Cost | Parallelism |
| --- | --- | --- | --- | --- |
| `agentic-toolkit run` (sandcastle) | `ANTHROPIC_API_KEY` | Docker container | API metered | Per-container |
| `/ralph-loop` (skill) | Claude Code subscription | None (host repo) | Subscription flat | Serial (one ticket at a time per Claude session) |

/ralph-loop is the zero-infra path: no Docker, no API key. It dispatches each ticket to a fresh Claude Code subagent (clean context per tick) using the same project schema and phase logic as the CLI runner. Trade-off: no sandbox isolation — run on a clean checkout. See skills/engineering/ralph-loop/SKILL.md.

Architecture (modules in `src/modules/`)

| Module | Role |
| --- | --- |
| `PhaseScheduler` | Pure: topological sort of `Blocked by` → ordered phases. |
| `PromptBuilder` | Pure: build the per-ticket agent prompt. |
| `FailureHandler` | Pure state machine: retry / skip / halt given variant + retry count. |
| `ProjectStateClient` | GitHub Projects v2 + Issues GraphQL (read state, claim, set status, comment). |
| `BranchManager` | Git + `gh` ops in the target repo (push, open PR). |
| `AgentRunner` | Wraps `sandcastle.run()` with Docker provider and Claude Code agent. |
| `LoopOrchestrator` | Wires the above; runs the per-tick algorithm. |
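
To make the `PhaseScheduler` role concrete, here is a minimal sketch of a layered topological sort over `Blocked by` relations. It is an illustration under stated assumptions, not the module's real code: the `Ticket` shape and the `schedulePhases` name are hypothetical, and blockers that are not part of the project are assumed to be already satisfied.

```typescript
// Hypothetical shape; the real PhaseScheduler types may differ.
interface Ticket {
  id: number; // issue number
  blockedBy: number[]; // issue numbers listed under "Blocked by"
}

// Groups tickets into ordered phases: every ticket in phase k depends only
// on tickets from earlier phases (a Kahn-style layered topological sort).
function schedulePhases(tickets: Ticket[]): number[][] {
  const known = new Set(tickets.map((t) => t.id));
  const done = new Set<number>();
  let remaining = [...tickets];
  const phases: number[][] = [];

  while (remaining.length > 0) {
    // Assumption: blockers outside this ticket set count as satisfied.
    const ready = remaining.filter((t) =>
      t.blockedBy.every((b) => !known.has(b) || done.has(b)),
    );
    if (ready.length === 0) {
      // Nothing can progress, so the remaining tickets block each other.
      throw new Error("Cycle detected in Blocked by relations");
    }
    phases.push(ready.map((t) => t.id).sort((a, b) => a - b));
    for (const t of ready) done.add(t.id);
    remaining = remaining.filter((t) => !done.has(t.id));
  }
  return phases;
}
```

For tickets 1 (no blockers), 2 and 3 (each blocked by 1), and 4 (blocked by 2 and 3), this sketch yields the phases `[[1], [2, 3], [4]]`.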

Variant differences

| Concern | per-ticket | single-pr |
| --- | --- | --- |
| Branches | one per issue | one per project, reused |
| PRs | one per issue | one for the whole DAG |
| Phase gates | yes (exit between phases) | no (topological order only) |
| HITL mid-run | issue parked; peers continue | runner halts |
| Failure after retry | skip + continue | halt |
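
The "Failure after retry" row can be read as a tiny pure decision function, which is roughly what `FailureHandler` is described as doing. This is a sketch only: `nextAction` and `MAX_RETRIES` are hypothetical names, and the single-retry limit is an assumption inferred from the "failed twice" wording of the Needs human status.

```typescript
type Variant = "per-ticket" | "single-pr";
type Action = "retry" | "skip" | "halt";

// Assumption: one retry per ticket, so a second failure is final.
const MAX_RETRIES = 1;

// Decides what to do after a ticket fails, given how many retries
// have already been spent on it.
function nextAction(variant: Variant, retryCount: number): Action {
  if (retryCount < MAX_RETRIES) return "retry";
  // Retries exhausted: per-ticket skips and continues, single-pr halts.
  return variant === "per-ticket" ? "skip" : "halt";
}
```

Because the function depends only on its two arguments, each cell of the table maps to a one-line unit test.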

Project Status field

/to-project configures a single-select Status field with these options:

Backlog — has unmet blockers.
Ready — runner-pickable; AFK with all blockers Done.
In progress — being executed by an agent right now.
In review — PR open, waiting for human merge.
Needs human — failed twice, or HITL.
Done — issue closed (set automatically on PR merge by Projects' built-in workflow).
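
The Status options map naturally onto a string union, and the Backlog → Ready rule can be expressed as a pure filter. The sketch below is illustrative: `ProjectItem`, its `hitl` flag, and `readyToPromote` are hypothetical names, and it assumes a HITL issue is never promoted to Ready automatically.

```typescript
type Status =
  | "Backlog"
  | "Ready"
  | "In progress"
  | "In review"
  | "Needs human"
  | "Done";

// Hypothetical item shape; the real client reads this from Projects v2.
interface ProjectItem {
  issue: number;
  status: Status;
  blockedBy: number[];
  hitl: boolean; // true when the issue needs a human in the loop
}

// Returns the Backlog issues that could move to Ready: AFK issues
// whose blockers are all Done.
function readyToPromote(items: ProjectItem[]): number[] {
  const statusOf = new Map<number, Status>(
    items.map((i): [number, Status] => [i.issue, i.status]),
  );
  return items
    .filter((i) => i.status === "Backlog" && !i.hitl)
    .filter((i) => i.blockedBy.every((b) => statusOf.get(b) === "Done"))
    .map((i) => i.issue);
}
```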

Development

npm install
npm test
npm run typecheck

Pure modules (PhaseScheduler, PromptBuilder, FailureHandler, ProjectStateClient parsers) are unit-tested. Integration with sandcastle / git / GraphQL is exercised manually before each release.

Out of scope (v1)

Remote / parallel runners across machines (local-first).
Slack / email failure notifications (issue comments only).
Stacked PRs and phase branches.
Cross-repo projects.
Pinning Matt Pocock skills to a specific commit SHA — setup.sh tracks HEAD for now; SHA pinning will land when the upstream skills CLI supports repo#sha.