AGENTS.md Playground
Hands-on exercises for writing effective AGENTS.md files. Each exercise targets a specific section — combine them in Exercise 6.
Exercise 1 — Mission Statement
Write a mission block (2–4 sentences). Pick a project:
| Option | Stack |
|---|---|
| A — E-commerce API | Python, FastAPI, PostgreSQL, Redis |
| B — Mobile backend | Go, gRPC, MongoDB |
| C — Design system | TypeScript, React, Storybook |
Reference Solution (Option A)
> **Project:** ShopAPI — RESTful e-commerce backend.
> **Stack:** Python 3.12, FastAPI, PostgreSQL 16, Redis 7.
> **Core constraint:** All endpoints must be idempotent;
> double-submit must never create duplicate orders.
Self-Check: understandable in 10 seconds | core constraint is a hard rule | no implementation details
Exercise 2 — Toolchain Registry
Create a toolchain table. Every row answers: "What command do I run?"
Reference Solution
| Action | Command | Authority |
|---|---|---|
| Build | uv run python -m build |
pyproject.toml |
| Test | uv run pytest -v --tb=short |
pyproject.toml |
| Lint | uv run ruff check . --fix |
pyproject.toml |
| Type check | uv run mypy app/ --strict |
pyproject.toml |
| Migrate | uv run alembic upgrade head |
alembic/ |
| Full verify | ruff check . && pytest -v && mypy app/ --strict |
Before every commit |
Self-Check: every action has a copy-pasteable command | "Authority" points to config file | no linter rules restated in prose
Exercise 3 — Judgment Boundaries
Write the three-tier system: NEVER (hard limits), ASK (pause for human), ALWAYS (proactive habits).
Reference Solution
NEVER: commit secrets/.env/credentials | run migrations without approval | add deps without discussion | force push to main | delete files to fix errors | guess on ambiguous specs
ASK: before DB schema changes | before modifying CI config | before deleting files | when multiple valid approaches exist
ALWAYS: explain plan before coding | run ruff check . && pytest -v after changes | handle all errors explicitly | write tests for new code
Self-Check: NEVER = irreversible/dangerous | ASK = judgment calls | ALWAYS = verifiable actions | no vague language
Exercise 4 — Escalation + Definition of Done
Write escalation rules (what to do when blocked) and a Definition of Done (exit checklist).
Reference Solution
Escalation:
- Tests fail 3× → stop, report full output and file path
- Missing dep → check
pyproject.toml, then ask before adding - Merge conflicts → stop, list files, never auto-resolve
- Unknown build error → share full traceback, do not retry blindly
- Never: delete files to fix errors, skip tests, force push
Definition of Done:
ruff check .exits 0pytest -vexits 0mypy app/ --strictexits 0- No new
# type: ignoreor# noqaadded - Commit follows:
type(scope): description
Exercise 5 — Full AGENTS.md
Combine Exercises 1–4 into a complete file. Add When… task-organized sections. Target: ≤ 150 lines.
Reference Solution
> **Project:** ShopAPI — RESTful e-commerce backend.
> **Stack:** Python 3.12, FastAPI, PostgreSQL 16, Redis 7.
> **Core constraint:** Idempotent endpoints; no duplicate orders.
## Toolchain
| Action | Command | Authority |
|---|---|---|
| Test | `uv run pytest -v --tb=short` | `pyproject.toml` |
| Lint | `uv run ruff check . --fix` | `pyproject.toml` |
| Type check | `uv run mypy app/ --strict` | `pyproject.toml` |
| Full verify | `ruff check . && pytest -v && mypy app/ --strict` | — |
## NEVER
- Commit secrets or `.env`
- Add dependencies without discussion
- Force push to main/develop
- Guess on ambiguous specs — stop and ask
## ASK
- Before DB schema changes / migrations
- Before deleting files
- When multiple valid approaches exist
## ALWAYS
- Explain plan before writing code
- Run full verify after changes
- Write tests for new code
## When Writing Code
- Follow existing patterns in `app/`
- Run `ruff check . --fix` after every file change
## When Reviewing Code
- Run `bandit -r app/` for security
- Verify coverage: `pytest --cov=app --cov-fail-under=80`
## Escalation
- Tests fail 3× → stop, report full output
- Missing dep → check pyproject.toml, then ask
- Merge conflicts → stop, list files, never auto-resolve
## Definition of Done
1. `ruff check .` exits 0 2. `pytest -v` exits 0 3. `mypy --strict` exits 0
4. Commit: `type(scope): description`
Exercise 6 — Monorepo Scoping
Design directory-scoped AGENTS.md files for this monorepo:
repo/
├── AGENTS.md ← root (shared rules)
├── services/
│ ├── api/ ← Python FastAPI
│ ├── web/ ← TypeScript React
│ └── worker/ ← Python Celery
└── packages/
└── shared-types/ ← TypeScript
Reference Solution
Root: mission + "never import across service boundaries" + make test-all / make lint-all
services/api/AGENTS.md: uv run pytest / uv run ruff check / uv run alembic upgrade head + "Pydantic in schemas/, SQLAlchemy in models/"
services/web/AGENTS.md: pnpm test / pnpm lint / pnpm build + "Components PascalCase, hooks use prefix, types from @shop/shared-types"
Self-Check: root = shared rules only | service files extend, not duplicate | each ≤ 150 lines
Audit Checklist
| # | Criterion |
|---|---|
| 1 | Total file ≤ 150 lines |
| 2 | Mission is 2–4 sentences |
| 3 | Every tool action has a copy-pasteable command |
| 4 | No linter/formatter rules restated in prose |
| 5 | NEVER / ASK / ALWAYS tiers present |
| 6 | Escalation covers: test fail, missing dep, conflict |
| 7 | Definition of Done uses commands, not opinions |
| 8 | No negative instructions — use tooling instead |
| 9 | No prose paragraphs — only tables, lists, commands |
| 10 | Agent can reproduce build commands verbatim |