AGENTS.md Playground

Hands-on exercises for writing effective AGENTS.md files. Each exercise targets a specific section — combine them in Exercise 6.

Exercise 1 — Mission Statement

Write a mission block (2–4 sentences). Pick a project:

Option	Stack
A — E-commerce API	Python, FastAPI, PostgreSQL, Redis
B — Mobile backend	Go, gRPC, MongoDB
C — Design system	TypeScript, React, Storybook

Reference Solution (Option A)

> **Project:** ShopAPI — RESTful e-commerce backend.
> **Stack:** Python 3.12, FastAPI, PostgreSQL 16, Redis 7.
> **Core constraint:** All endpoints must be idempotent;
> double-submit must never create duplicate orders.

Self-Check: understandable in 10 seconds | core constraint is a hard rule | no implementation details

Exercise 2 — Toolchain Registry

Create a toolchain table. Every row answers: "What command do I run?"

Reference Solution

Action	Command	Authority
Build	`uv run python -m build`	`pyproject.toml`
Test	`uv run pytest -v --tb=short`	`pyproject.toml`
Lint	`uv run ruff check . --fix`	`pyproject.toml`
Type check	`uv run mypy app/ --strict`	`pyproject.toml`
Migrate	`uv run alembic upgrade head`	`alembic/`
Full verify	`ruff check . && pytest -v && mypy app/ --strict`	Before every commit

Self-Check: every action has a copy-pasteable command | "Authority" points to config file | no linter rules restated in prose

Exercise 3 — Judgment Boundaries

Write the three-tier system: NEVER (hard limits), ASK (pause for human), ALWAYS (proactive habits).

Reference Solution

ASK: before DB schema changes | before modifying CI config | before deleting files | when multiple valid approaches exist

ALWAYS: explain plan before coding | run ruff check . && pytest -v after changes | handle all errors explicitly | write tests for new code

Self-Check: NEVER = irreversible/dangerous | ASK = judgment calls | ALWAYS = verifiable actions | no vague language

Exercise 4 — Escalation + Definition of Done

Write escalation rules (what to do when blocked) and a Definition of Done (exit checklist).

Reference Solution

Escalation:

Tests fail 3× → stop, report full output and file path
Missing dep → check pyproject.toml, then ask before adding
Merge conflicts → stop, list files, never auto-resolve
Unknown build error → share full traceback, do not retry blindly
Never: delete files to fix errors, skip tests, force push

Definition of Done:

ruff check . exits 0
pytest -v exits 0
mypy app/ --strict exits 0
No new # type: ignore or # noqa added
Commit follows: type(scope): description

Exercise 5 — Full AGENTS.md

Combine Exercises 1–4 into a complete file. Add When… task-organized sections. Target: ≤ 150 lines.

Reference Solution

> **Project:** ShopAPI — RESTful e-commerce backend.
> **Stack:** Python 3.12, FastAPI, PostgreSQL 16, Redis 7.
> **Core constraint:** Idempotent endpoints; no duplicate orders.

## Toolchain
| Action | Command | Authority |
|---|---|---|
| Test | `uv run pytest -v --tb=short` | `pyproject.toml` |
| Lint | `uv run ruff check . --fix` | `pyproject.toml` |
| Type check | `uv run mypy app/ --strict` | `pyproject.toml` |
| Full verify | `ruff check . && pytest -v && mypy app/ --strict` | — |

## NEVER
- Commit secrets or `.env`
- Add dependencies without discussion
- Force push to main/develop
- Guess on ambiguous specs — stop and ask

## ASK
- Before DB schema changes / migrations
- Before deleting files
- When multiple valid approaches exist

## ALWAYS
- Explain plan before writing code
- Run full verify after changes
- Write tests for new code

## When Writing Code
- Follow existing patterns in `app/`
- Run `ruff check . --fix` after every file change

## When Reviewing Code
- Run `bandit -r app/` for security
- Verify coverage: `pytest --cov=app --cov-fail-under=80`

## Escalation
- Tests fail 3× → stop, report full output
- Missing dep → check pyproject.toml, then ask
- Merge conflicts → stop, list files, never auto-resolve

## Definition of Done
1. `ruff check .` exits 0  2. `pytest -v` exits 0  3. `mypy --strict` exits 0
4. Commit: `type(scope): description`

Exercise 6 — Monorepo Scoping

Design directory-scoped AGENTS.md files for this monorepo:

repo/
├── AGENTS.md              ← root (shared rules)
├── services/
│   ├── api/               ← Python FastAPI
│   ├── web/               ← TypeScript React
│   └── worker/            ← Python Celery
└── packages/
    └── shared-types/      ← TypeScript

Reference Solution

Root: mission + "never import across service boundaries" + make test-all / make lint-all

services/api/AGENTS.md: uv run pytest / uv run ruff check / uv run alembic upgrade head + "Pydantic in schemas/, SQLAlchemy in models/"

services/web/AGENTS.md: pnpm test / pnpm lint / pnpm build + "Components PascalCase, hooks use prefix, types from @shop/shared-types"

Self-Check: root = shared rules only | service files extend, not duplicate | each ≤ 150 lines

Audit Checklist

#	Criterion
1	Total file ≤ 150 lines
2	Mission is 2–4 sentences
3	Every tool action has a copy-pasteable command
4	No linter/formatter rules restated in prose
5	NEVER / ASK / ALWAYS tiers present
6	Escalation covers: test fail, missing dep, conflict
7	Definition of Done uses commands, not opinions
8	No negative instructions — use tooling instead
9	No prose paragraphs — only tables, lists, commands
10	Agent can reproduce build commands verbatim

AGENTS.md Playground

Exercise 1 — Mission Statement

Exercise 2 — Toolchain Registry

Exercise 3 — Judgment Boundaries

Exercise 4 — Escalation + Definition of Done

Exercise 5 — Full AGENTS.md

Exercise 6 — Monorepo Scoping

Audit Checklist

See also