Architecture
System Overview
claude-skill-tools is built around three CLI tools (composer, sandbox, and session-explorer) plus a shared utility layer. All three work with Claude Code CLI sessions that run in isolated git worktrees, but they serve different purposes:
- Composer is the orchestration engine. It runs multi-step compositions that chain specialized agent roles into a workflow.
- Sandbox manages the git worktrees where agents do their work, and runs the ralph developer/reviewer iteration loop.
- Session Explorer parses Claude’s JSONL session logs and produces analysis reports or hosts an interactive browser for exploring session data.
The shared layer provides path resolution, ANSI formatting, configuration management, and common utilities used by all three tools.
There are zero runtime dependencies. Everything installed into node_modules is a dev dependency: TypeScript, @types/node, and the Vitest test runner. Every shell command is executed via Node’s built-in child_process module. Every HTTP server uses Node’s built-in http module. There is no CLI framework, no argument parser library, and no test utility beyond Vitest.
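As one illustration of how far built-ins go, here is a minimal sketch of JSONL parsing in the spirit of Session Explorer. The record shape (`LogRecord` with a `type` field) and the function names are assumptions made for this example; the real session log schema and parser API are not described in this document.

```typescript
// Hypothetical record shape: only a string `type` field is assumed.
interface LogRecord {
  type: string;
  [key: string]: unknown;
}

function isLogRecord(value: unknown): value is LogRecord {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as Record<string, unknown>)["type"] === "string"
  );
}

// Parse JSONL text: one JSON value per line. Blank lines, malformed lines,
// and values that do not look like a LogRecord are skipped.
function parseJsonl(text: string): LogRecord[] {
  const records: LogRecord[] = [];
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const value: unknown = JSON.parse(trimmed);
      if (isLogRecord(value)) records.push(value);
    } catch {
      // Malformed line (for example, a partially written one): skip it.
    }
  }
  return records;
}

// Count records per type, the kind of aggregate a summary report could start from.
function countByType(records: LogRecord[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const record of records) {
    counts.set(record.type, (counts.get(record.type) ?? 0) + 1);
  }
  return counts;
}

const sampleLog = [
  '{"type":"user","text":"hi"}',
  "",
  '{"type":"assistant"}',
  '{"type":"assistant"', // malformed on purpose
  '{"note":"no type field"}',
  '{"type":"user"}',
].join("\n");

const sampleCounts = countByType(parseJsonl(sampleLog));
```

Skipping bad lines instead of throwing is a design choice for this sketch; a stricter parser might report them.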
Project Structure
prompts/                  Role prompt markdown files (analyst, architect,
                          developer, developer_single, reviewer, tester)
hooks/                    PreToolUse guard hook (sandbox-guard.sh)
src/
  bin/                    CLI entry point shims
    composer.ts           Re-exports src/composer/composer.ts
    sandbox.ts            Re-exports src/sandbox/sandbox.ts
    session-explorer.ts   Re-exports src/session-explorer/index.ts
  composer/               Orchestration engine
    config/
      compositions.ts     Composition definitions (step sequences)
      types.ts            Composition, Step, StepType, SessionState types
    composer.ts           CLI entry: arg parsing, welcome banner, signal handlers
    commands.ts           All command implementations (compose, distill, report)
    execution.ts          Step loop: runComposition(), template resolution,
                          interactive prompt (n/s/p/q)
    state.ts              JSON state persistence (load/save SessionState)
    tmux.ts               Split-pane execution for tmux environments
  sandbox/                Worktree management and ralph loop
    config/
      paths.ts            Sandbox-specific path constants
      types.ts            SandboxState, RalphIteration types
    sandbox.ts            CLI entry: command dispatch, worktree create/delete/list
    ralph-helpers.ts      Agent execution: runAgentWithTimer() (headless with
                          progress), runInteractiveAgentWithLog() (interactive
                          with transcript), generateReadableLog(), comment
                          parsing (parseComments, filterIgnored, expandRanges)
    distill.ts            Feature request synthesis from sandbox artifacts
    retro.ts              Learning extraction from completed sessions
    audit.ts              Audit log generation from audit-raw.jsonl
    sandbox-guard.ts      TypeScript guard hook (Windows compatibility)
  connectors/             External integrations
    ado-pull-request/
      create.ts           Branch push, PR description assembly, az repos pr create
    ado-work-item/
      fetch.ts            Work item fetch via az boards work-item show
  metrics/                Session tracking and cost analysis
    session-map.ts        Composer-to-Claude session ID mapping
                          (stored in ~/claude-skill-tools/session-maps/)
    session-metrics.ts    JSONL log parsing: token usage, cost, tool call
                          breakdowns; HTML/text/JSON report generation
    uuid.ts               Deterministic session ID generation
  session-explorer/       Single-session analysis and browsing
    index.ts              CLI entry: two modes (report + server)
    parser.ts             JSONL session log parser
    summary.ts            Metric computation from parsed sessions
    report.ts             HTML report generation
    server.ts             Local HTTP server for interactive browsing
  shared/                 Common utilities
    paths.ts              PACKAGE_ROOT, resolveRepoRoot(), state dir helpers
    config.ts             Config resolution: repo-level override at
                          .claude/.skill-state/config.json, falling back to
                          ~/claude-skill-tools/config.json
    ui.ts                 ANSI formatting: colors, banners, error blocks,
                          die(). Respects NO_COLOR env var.
    utils.ts              promptUser(), nowISO(), sleep(), copyDirIfExists()
tests/
  tier1/                  Pure function tests (no I/O, no mocking)
  tier2/                  Filesystem tests with temp dirs and .git/ setup
  helpers/
    fixtures.ts           createTempDir, removeTempDir, writeJson, writeFile
Key Design Decisions
Zero runtime dependencies
Every dependency is a maintenance burden, a supply chain risk, and a layer of indirection between you and the behavior of the tool. claude-skill-tools uses only Node built-ins:
- child_process.spawnSync and child_process.spawn for shell commands
- fs and path for file operations
- http for the session-explorer server
- readline for interactive prompts
This makes the tool fully auditable. You can read every line of code that executes.
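To make the point concrete, here is a minimal sketch of running a command with child_process.spawnSync and nothing else. The helper name `runCommand` and the `CommandResult` shape are hypothetical and are not taken from the project's source.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical result shape for this sketch.
interface CommandResult {
  code: number;
  stdout: string;
  stderr: string;
}

// Run a command synchronously and capture its output as UTF-8 text.
// No shell is involved: arguments are passed as an array, so no quoting is needed.
function runCommand(command: string, args: string[], cwd?: string): CommandResult {
  const result = spawnSync(command, args, { cwd, encoding: "utf8" });
  if (result.error) {
    // The process could not be started at all (for example, command not found).
    return { code: 1, stdout: "", stderr: result.error.message };
  }
  // status is null when the process was terminated by a signal; treat that as failure.
  return { code: result.status ?? 1, stdout: result.stdout, stderr: result.stderr };
}

// Usage: ask the current Node binary for its version.
const versionResult = runCommand(process.execPath, ["--version"]);
const missingResult = runCommand("definitely-not-a-real-command-xyz", []);
```

Returning a plain result object instead of throwing keeps the caller in charge of how failures are reported, which is one reasonable choice among several.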
Manual argument parsing
CLI arguments are parsed with switch/case blocks in each tool’s main function. There is no yargs, no commander, no minimist. This keeps the argument handling co-located with the command dispatch and avoids the abstraction overhead of a CLI framework for what are relatively simple argument structures.
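A minimal sketch of this style of parsing follows. The command names echo those listed for commands.ts, but the `--dry-run` flag, the `ParsedArgs` type, and the error messages are assumptions made for illustration, not the tool's actual interface.

```typescript
// Hypothetical parsed-argument shapes for this sketch.
type ParsedArgs =
  | { command: "compose"; composition: string; dryRun: boolean }
  | { command: "report"; sessionId: string }
  | { command: "help" };

function parseArgs(argv: string[]): ParsedArgs {
  const command: string | undefined = argv[0];
  const rest = argv.slice(1);

  switch (command) {
    case "compose": {
      let composition = "";
      let dryRun = false;
      for (const arg of rest) {
        switch (arg) {
          case "--dry-run": // hypothetical flag
            dryRun = true;
            break;
          default:
            if (arg.startsWith("--")) throw new Error(`Unknown option: ${arg}`);
            composition = arg;
        }
      }
      if (composition === "") throw new Error("compose requires a composition name");
      return { command: "compose", composition, dryRun };
    }
    case "report": {
      const sessionId = rest[0];
      if (sessionId === undefined) throw new Error("report requires a session ID");
      return { command: "report", sessionId };
    }
    case undefined:
    case "help":
    case "--help":
      return { command: "help" };
    default:
      throw new Error(`Unknown command: ${command}`);
  }
}

// A real entry point would call parseArgs(process.argv.slice(2)).
const exampleArgs = parseArgs(["compose", "feature", "--dry-run"]);
```

Because parsing and dispatch share one switch, adding a command means touching a single place.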
JSON state on disk
All persistent state is stored as JSON files. There is no database, no SQLite, no key-value store.
- Composer sessions: <repo>/.claude/.skill-state/composer/<sessionId>.json
- Sandbox state: <repo>/.claude/.skill-state/sandbox/<slug>.json
- Session maps: ~/claude-skill-tools/session-maps/
- Configuration: ~/claude-skill-tools/config.json (user-level), <repo>/.claude/.skill-state/config.json (repo-level override)
JSON files are human-readable, diffable, and trivially debuggable. When something goes wrong, you can open the state file and see exactly what the tool thinks is happening.
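The sketch below shows one way load/save for such a state file could look, using only fs, os, and path. The `DemoSessionState` fields and the function names are hypothetical; the real SessionState type and state.ts API may differ. Only the on-disk path layout is taken from the list above.

```typescript
import { existsSync, mkdirSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical state shape for this sketch.
interface DemoSessionState {
  sessionId: string;
  currentStep: number;
  vars: Record<string, string>;
  updatedAt: string;
}

// <repo>/.claude/.skill-state/composer/<sessionId>.json
function statePath(repoRoot: string, sessionId: string): string {
  return join(repoRoot, ".claude", ".skill-state", "composer", `${sessionId}.json`);
}

function saveState(repoRoot: string, state: DemoSessionState): void {
  const file = statePath(repoRoot, state.sessionId);
  mkdirSync(dirname(file), { recursive: true });
  // Pretty-printed so the file stays readable and diffable.
  writeFileSync(file, JSON.stringify(state, null, 2) + "\n", "utf8");
}

function loadState(repoRoot: string, sessionId: string): DemoSessionState | null {
  const file = statePath(repoRoot, sessionId);
  if (!existsSync(file)) return null;
  return JSON.parse(readFileSync(file, "utf8")) as DemoSessionState;
}

// Round trip in a throwaway directory standing in for a repo root.
const demoRoot = mkdtempSync(join(tmpdir(), "skill-state-demo-"));
saveState(demoRoot, {
  sessionId: "abc123",
  currentStep: 2,
  vars: { worktree: "/tmp/example-worktree" },
  updatedAt: new Date().toISOString(),
});
const reloadedState = loadState(demoRoot, "abc123");
const missingState = loadState(demoRoot, "does-not-exist");
rmSync(demoRoot, { recursive: true, force: true });
```

The type assertion after JSON.parse trusts the file contents; a stricter version would validate the parsed value before use.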
Template variables
Step commands in compositions use {placeholder} syntax that is resolved at runtime via resolveTemplate(). For example, a step command might be:
claude -p "Review the code in {worktree}" --system-prompt prompts/reviewer.md
The template engine replaces {worktree} with the actual worktree path from the session state. This keeps composition definitions declarative and readable.
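Placeholder substitution of this kind can be sketched in a few lines. This is not the project's resolveTemplate(): the name `resolveTemplateSketch`, the allowed placeholder characters, and the choice to leave unknown placeholders untouched are all assumptions.

```typescript
// Replace {name} placeholders with values from `vars`.
// Assumption: placeholders are identifier-like, and unknown ones are left as-is
// so that a missing variable is visible in the resulting command.
function resolveTemplateSketch(template: string, vars: Record<string, string>): string {
  return template.replace(/\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (match: string, name: string) => {
    // Own-property check avoids picking up inherited keys such as "constructor".
    const hasOwn = Object.prototype.hasOwnProperty.call(vars, name);
    const value = vars[name];
    return hasOwn && value !== undefined ? value : match;
  });
}

const resolvedCommand = resolveTemplateSketch(
  'claude -p "Review the code in {worktree}"',
  { worktree: "/tmp/example-worktree" },
);
// resolvedCommand is: claude -p "Review the code in /tmp/example-worktree"
```

An alternative design would throw on unknown placeholders, which fails earlier but makes partially specified templates unusable.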
ESM-only with .js extensions
The project is an ESM package ("type": "module" in package.json) targeting Node.js >= 18. All internal imports use .js extensions per NodeNext module resolution rules. Vitest resolves these to the corresponding .ts files during testing.
TypeScript strict mode
All strict checks are enabled, and the any type is not used anywhere. This catches a class of bugs at compile time that would otherwise surface as runtime errors in a tool that orchestrates long-running agent sessions.
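One example of a bug the compiler can catch is a forgotten case when a new variant is added to a union. The sketch below uses an exhaustiveness check with never; the step kinds in `DemoStepType` are invented for illustration and are not the project's StepType.

```typescript
// Hypothetical step kinds for this sketch.
type DemoStepType = "agent" | "shell" | "manual";

interface DemoStep {
  type: DemoStepType;
  name: string;
}

function describeStep(step: DemoStep): string {
  switch (step.type) {
    case "agent":
      return `run agent for ${step.name}`;
    case "shell":
      return `run shell command for ${step.name}`;
    case "manual":
      return `wait for user to finish ${step.name}`;
    default: {
      // If a member is added to DemoStepType without a matching case above,
      // step.type is no longer `never` here and this assignment fails to compile.
      const unreachable: never = step.type;
      throw new Error(`Unhandled step type: ${String(unreachable)}`);
    }
  }
}

const stepDescription = describeStep({ type: "shell", name: "build" });
```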
Data Flow
A typical composer run follows this path:
CLI invocation (src/bin/composer.ts)
|
v
Arg parsing (composer.ts main())
|
v
Command dispatch (commands.ts cmdCompose())
|
v
Session creation (state.ts) -- generates session ID, persists initial state
|
v
Composition lookup (config/compositions.ts) -- resolves named composition
|
v
Step loop (execution.ts runComposition())
|
+---> For each step:
| |
| v
| Template resolution -- replace {placeholders} with runtime values
| |
| v
| Shell execution -- spawnSync or spawn (tmux: split pane + poll)
| |
| v
| State capture -- branch name, worktree path, exit code
| |
| v
| Interactive prompt -- n(ext), s(kip), p(rev), q(uit)
| |
| v
| State persistence -- save updated SessionState to JSON
|
v
Composition complete
For sandbox ralph iterations, the flow is:
sandbox ralph (sandbox.ts)
|
v
Developer agent (headless via claude -p)
|-- Real-time progress display (runAgentWithTimer)
|-- Auto-commits changes on completion
|
v
Reviewer agent (headless via claude -p)
|-- Writes comments.md with findings
|
v
User decision point
|-- Accept: done
|-- Ignore specific comments: filter and re-iterate
|-- Re-iterate: developer runs again with reviewer feedback
|
v
Ralph log updated (ralph-log.md) -- tracks all iterations
State Storage
State is split between two locations based on scope and lifetime:
Repo-level state (.claude/.skill-state/)
Lives inside the repository, under a path that is typically gitignored. Contains state that is specific to a particular repo’s active work:
| Path | Contents |
|---|---|
| composer/<sessionId>.json | Composer session state: current step, template vars, timestamps |
| sandbox/<slug>.json | Sandbox state: worktree path, branch name, ralph iteration count |
| config.json | Repo-level config overrides (ADO org, field mappings) |
User-level state (~/claude-skill-tools/)
Lives in the user’s home directory. Contains state that spans repositories or persists beyond a single repo’s lifecycle:
| Path | Contents |
|---|---|
| session-maps/ | Mappings from composer session IDs to Claude CLI session IDs |
| parsed-cache/ | Cached parsed session data for session-explorer |
| config.json | User-level default configuration |
Config resolution
Configuration uses a two-level merge strategy. The repo-level config at .claude/.skill-state/config.json is merged over the user-level config at ~/claude-skill-tools/config.json on a per-field basis. This allows repository-specific overrides (such as a different ADO organization) without duplicating the entire config.
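A per-field merge can be sketched as follows. The `DemoConfig` fields are hypothetical stand-ins (only the ADO organization and field mappings are mentioned in this document), and the merge here is shallow: a repo-level fieldMappings object replaces the user-level one wholesale. The real implementation may merge nested fields differently.

```typescript
// Hypothetical config shape for this sketch.
interface DemoConfig {
  adoOrg?: string;
  adoProject?: string;
  fieldMappings?: Record<string, string>;
}

// Repo-level values win field by field; anything the repo config
// leaves unset falls back to the user-level value.
function mergeConfig(userLevel: DemoConfig, repoLevel: DemoConfig): DemoConfig {
  return {
    adoOrg: repoLevel.adoOrg ?? userLevel.adoOrg,
    adoProject: repoLevel.adoProject ?? userLevel.adoProject,
    fieldMappings: repoLevel.fieldMappings ?? userLevel.fieldMappings,
  };
}

const userConfig: DemoConfig = {
  adoOrg: "user-org",
  adoProject: "user-project",
  fieldMappings: { title: "System.Title" },
};
const repoConfig: DemoConfig = { adoOrg: "repo-org" };

const effectiveConfig = mergeConfig(userConfig, repoConfig);
// effectiveConfig.adoOrg comes from the repo; the other fields come from the user level.
```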
Testing Strategy
Tests use Vitest and are organized into two tiers based on their I/O requirements:
Tier 1: Pure function tests (tests/tier1/)
These test pure logic with no filesystem access, no mocking, and no side effects. They are fast, deterministic, and have the highest return on investment. Examples include slug generation, template resolution, and comment parsing.
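As an illustration of the kind of function that suits tier 1, here is a sketch of slug generation. The rules (lowercase, hyphen-separated, capped length) and the name `slugify` are assumptions; the project's actual slug rules are not specified in this document.

```typescript
// Turn a free-form title into a lowercase, hyphen-separated slug.
// Assumption: ASCII letters and digits are kept, everything else becomes a separator.
function slugify(title: string, maxLength = 40): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse each run of other characters into one hyphen
    .replace(/^-+|-+$/g, "") // trim leading and trailing hyphens
    .slice(0, maxLength)
    .replace(/-+$/, ""); // truncation may leave a trailing hyphen
}

const exampleSlug = slugify("Fix login bug (#42)!");
// exampleSlug is "fix-login-bug-42"
```

Because the function takes a string and returns a string with no I/O, a tier 1 test only needs to assert on input and output pairs.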
Tier 2: Filesystem tests (tests/tier2/)
These test behavior that requires a real filesystem. Each test:
- Creates a temporary directory
- Initializes a .git/ directory inside it
- Changes the working directory to the temp dir
- Calls _resetRepoRootCache() to clear the cached repo root
- Runs the test
- Restores the original working directory in afterAll
Shared fixtures (tests/helpers/fixtures.ts) provide createTempDir, removeTempDir, writeJson, and writeFile utilities.
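The tier 2 setup and teardown steps can be sketched with Node built-ins alone. The helper names below (`createTempRepo`, `removeTempRepo`, `withTempRepo`) are hypothetical and are not the fixtures from tests/helpers/fixtures.ts. The sketch wraps the steps in a single synchronous function instead of Vitest hooks so that it runs on its own.

```typescript
import { existsSync, mkdirSync, mkdtempSync, realpathSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Create a temp directory containing an empty .git/ directory, so that code
// which looks for a repo root by searching for .git/ treats it as a repository.
function createTempRepo(prefix = "skill-tools-test-"): string {
  // realpathSync resolves symlinked temp locations so the path matches process.cwd().
  const dir = realpathSync(mkdtempSync(join(tmpdir(), prefix)));
  mkdirSync(join(dir, ".git"));
  return dir;
}

function removeTempRepo(dir: string): void {
  rmSync(dir, { recursive: true, force: true });
}

// Run a synchronous `body` with the working directory set to a fresh temp repo,
// restoring the original working directory afterwards even if `body` throws.
function withTempRepo<T>(body: (repoDir: string) => T): T {
  const originalCwd = process.cwd();
  const dir = createTempRepo();
  try {
    process.chdir(dir);
    // In the real suite, _resetRepoRootCache() would be called at this point.
    return body(dir);
  } finally {
    process.chdir(originalCwd);
    removeTempRepo(dir);
  }
}

const cwdBefore = process.cwd();
const ranInsideTempRepo = withTempRepo(
  (dir) => existsSync(join(dir, ".git")) && process.cwd() === dir,
);
const cwdAfter = process.cwd();
```

Restoring the working directory in a finally block plays the same role as the afterAll step: a failing test cannot leave later tests running in a deleted directory.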
Coverage
Run npm run test:coverage for a v8 coverage report. There is no configured coverage threshold, but the goal is comprehensive tier 1 coverage for all pure logic and targeted tier 2 coverage for filesystem-dependent behavior.
No linter
There is no ESLint or Prettier configuration. TypeScript strict mode serves as the primary static analysis tool. Code style consistency is maintained by convention rather than enforcement tooling.