CLI-First Skill Design
Problem
When building agent skills (reusable capabilities), there's tension between:
- API-first design: Skills as functions/classes. Great for programmatic use, but hard to debug and test manually
- GUI-first design: Skills as visual tools. Easy for humans, but agents can't invoke them
Teams end up building two interfaces or choosing one audience over the other.
Solution
Design all skills as CLI tools first. A well-designed CLI is naturally dual-use: humans can invoke it from the terminal, and agents can invoke it via shell commands.
graph LR
A[Skill Logic] --> B[CLI Interface]
B --> C[Human: Terminal]
B --> D[Agent: Bash Tool]
B --> E[Scripts: Automation]
B --> F[Cron: Scheduled]
Core principles:
- One script, one skill: Each capability is a standalone executable
- Subcommands for operations: skill.sh list, skill.sh get <id>, skill.sh create
- Structured output: JSON for programmatic use, human-readable for TTY
- Exit codes: 0 for success, non-zero for errors (enables && chaining)
- Environment config: Credentials via env vars, not hardcoded
# Example: Trello skill as CLI
trello.sh boards # List all boards
trello.sh cards <BOARD_ID> # List cards on board
trello.sh create <LIST_ID> "Title" # Create card
trello.sh move <CARD_ID> <LIST_ID> # Move card
# Human usage
$ trello.sh boards
{"id": "abc123", "name": "Personal", "url": "..."}
{"id": "def456", "name": "Work", "url": "..."}
# Agent usage (via Bash tool)
Bash: trello.sh cards abc123 | jq '.[0].name'
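To make the core principles concrete, here is a minimal sketch of how a skill script might be laid out. It is an illustration under stated assumptions, not the implementation of any real skill: notes.sh, its hard-coded data, the NOTES_TOKEN variable, and the specific non-zero exit codes are all hypothetical.

```shell
#!/bin/bash
# notes.sh: a hypothetical skill, used only to illustrate the principles above.
# NOTES_TOKEN is an assumed variable name; the data is hard-coded for the demo.

# Structured output: JSON lines when stdout is piped, aligned text on a terminal.
# Titles here need no JSON escaping; real data would (for example via jq).
emit_note() {
  if [ -t 1 ]; then
    printf '%-6s %s\n' "$1" "$2"
  else
    printf '{"id": "%s", "title": "%s"}\n' "$1" "$2"
  fi
}

cmd_list() {
  emit_note n1 "Write docs"
  emit_note n2 "Ship release"
}

cmd_get() {
  case "${1:-}" in
    n1) emit_note n1 "Write docs" ;;
    n2) emit_note n2 "Ship release" ;;
    *)  echo "error: no such note: ${1:-}" >&2; return 3 ;;
  esac
}

main() {
  # Help text via --help or no arguments.
  case "${1:-}" in
    ""|-h|--help)
      echo "usage: notes.sh {list | get <id>}"
      return 0 ;;
  esac
  # Environment config: the credential comes from the environment, never the script.
  if [ -z "${NOTES_TOKEN:-}" ]; then
    echo "error: NOTES_TOKEN is not set" >&2
    return 2
  fi
  # Subcommands: one case branch per operation.
  case "$1" in
    list) cmd_list ;;
    get)  cmd_get "${2:-}" ;;
    *)    echo "error: unknown subcommand: $1" >&2; return 64 ;;
  esac
}

main "$@"
```

Because failures return non-zero, a caller can chain on success, for example `notes.sh get n1 && echo found`, and the same script prints a readable table when a person runs it in a terminal.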
How to use it
Skill structure:
~/.claude/skills/
├── trello/
│ └── scripts/
│ └── trello.sh # Main CLI entry point
├── asana/
│ └── scripts/
│ └── asana.sh
├── honeybadger/
│ └── scripts/
│ └── honeybadger.sh
└── priority-report/
└── scripts/
└── priority-report.sh # Composes other skills
CLI design checklist:
- [ ] Standalone executable with shebang (#!/bin/bash)
- [ ] Help text via --help or no-args
- [ ] Subcommands for CRUD operations
- [ ] JSON output (pipe to jq for formatting)
- [ ] Credentials from ~/.envrc or environment
- [ ] Meaningful exit codes
- [ ] Stderr for errors, stdout for data
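Several checklist items can be verified mechanically. The following is a rough sketch of such a smoke test, assuming a skill rejects unknown subcommands with a non-zero status; the script name check-skill.sh, the function check_skill, and the probe subcommand are hypothetical, and it covers only part of the checklist.

```shell
#!/bin/bash
# check-skill.sh: hypothetical smoke test for a few checklist items.
# The probe subcommand name below is made up; adjust it to your skill's conventions.

check_skill() {
  local skill="$1" out="" fail=0

  if [ ! -f "$skill" ]; then
    echo "FAIL: no such file: $skill" >&2
    return 1
  fi
  # Standalone executable with a shebang on the first line.
  [ -x "$skill" ] || { echo "FAIL: not executable" >&2; fail=1; }
  head -n 1 "$skill" | grep -q '^#!' || { echo "FAIL: missing shebang" >&2; fail=1; }
  # Help text must be available and exit successfully.
  "$skill" --help >/dev/null 2>&1 || { echo "FAIL: --help exited non-zero" >&2; fail=1; }

  # An unknown subcommand should exit non-zero and keep stdout empty,
  # because error text belongs on stderr.
  if out=$("$skill" __no_such_subcommand__ 2>/dev/null); then
    echo "FAIL: unknown subcommand exited 0" >&2; fail=1
  elif [ -n "$out" ]; then
    echo "FAIL: error text leaked to stdout" >&2; fail=1
  fi
  return "$fail"
}

# Run the checks when a path is given: ./check-skill.sh path/to/skill.sh
if [ "$#" -gt 0 ]; then
  check_skill "$1"
fi
```

The script follows the same conventions it checks: findings go to stderr and the exit status reports the overall result, so it can gate a commit hook or CI step.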
Composition example:
#!/bin/bash
# priority-report.sh composes multiple skill CLIs
echo "## GitHub"
gh pr list --search "review-requested:@me"
echo "## Trello"
~/.claude/skills/trello/scripts/trello.sh cards abc123
echo "## Asana"
~/.claude/skills/asana/scripts/asana.sh tasks personal
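In the composition above, a failing skill passes silently and the report looks complete. One possible refinement, sketched here as an assumption rather than part of the original script, is a small wrapper that uses each skill's exit code to flag failed sections while still running the rest. The names section and failures are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper for a composing script such as priority-report.sh.
failures=0

# section <title> <command...>: print a heading, run the command,
# and record a failure instead of aborting the whole report.
section() {
  local title="$1" rc=0
  shift
  echo "## $title"
  "$@" || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "warning: $title exited with status $rc" >&2
    failures=$((failures + 1))
  fi
}

# Usage inside priority-report.sh (commands taken from the example above):
#   section "GitHub" gh pr list --search "review-requested:@me"
#   section "Trello" ~/.claude/skills/trello/scripts/trello.sh cards abc123
#   section "Asana"  ~/.claude/skills/asana/scripts/asana.sh tasks personal
#   [ "$failures" -eq 0 ]   # final status is non-zero if any section failed
```

Warnings go to stderr, so the report on stdout stays clean for piping, and the final status tells an agent or cron job whether the report is partial.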
Trade-offs
Pros:
- Dual-use by default: Same interface for humans and agents
- Debuggable: Run manually to test, inspect output
- Composable: Pipe, chain, and combine with Unix tools
- Portable: Runs wherever bash and common tools such as jq are available; no language runtime to install
- Transparent: Agent's tool calls are visible shell commands
- Testable: Easy to write integration tests
Cons:
- Shell limitations: Complex data structures awkward in bash
- Error handling: Less structured than exceptions
- Performance: Process spawn overhead vs function calls
- State management: No persistent state between invocations
- Windows compatibility: Requires WSL or Git Bash
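The error-handling limitation can be softened with a convention: a distinct exit code per error class, plus a one-line JSON error on stderr. The sketch below is one possible convention, not an established standard; the function names, the card id c1, and the code values are assumptions chosen for illustration.

```shell
#!/bin/bash
# Hypothetical convention for a skill's errors; the codes are illustrative.
EXIT_NOT_FOUND=3
EXIT_USAGE=64

# report_error <kind> <message>: one JSON object on stderr, stdout untouched.
# Messages here need no JSON escaping; arbitrary text would.
report_error() {
  printf '{"error": "%s", "message": "%s"}\n' "$1" "$2" >&2
}

# Demo lookup over hard-coded data (card id c1 is made up).
lookup_card() {
  case "${1:-}" in
    "") report_error usage "card id required"
        return "$EXIT_USAGE" ;;
    c1) printf '{"id": "c1", "name": "Draft report"}\n' ;;
    *)  report_error not_found "no card with id $1"
        return "$EXIT_NOT_FOUND" ;;
  esac
}
```

A caller can then branch on the status, much like catching a specific exception type, and parse stderr only when it needs the details.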
When to use something else:
- High-frequency calls (>100/sec): Use in-process functions
- Complex object graphs: Use structured API
- Real-time streaming: Use WebSocket/SSE
References
- Unix Philosophy: "Write programs that do one thing and do it well"
- Dual-Use Tool Design pattern
- Claude Code skills directory structure
- 12-Factor App: Config via environment