## Templates
Copy these into your project to start a research loop immediately.
### research.md

The single source of truth for any autoresearch session. The agent reads this at the start of every iteration.

```markdown
# Research: [Goal Title]
## Goal
[Plain-language description of what you're trying to achieve]
## Success Metric
- Metric: [metric_name_with_units]
- Target: [< 0.5 | > 0.95 | == 0]
- Direction: [minimize | maximize]
## Constraints
- Max iterations: 20
- Evaluator: python evaluate.py
- Keep policy: score_improvement
- Guard: [command that must always pass, e.g. python -m pytest]
- Noise runs: 3
- Min delta: 0
## History
| # | Change | Metric | Result | Timestamp |
|---|--------|--------|--------|-----------|
| 0 | Baseline ([description]) | [value] | -- | [YYYY-MM-DD] |
```
### evaluate.py

Generic evaluator template. Replace the measurement logic with your domain's.

```python
#!/usr/bin/env python3
"""
Evaluator contract: print {"pass": bool, "score": float} then exit 0.
Wrap the main command in: timeout 5m python evaluate.py
"""
import json, sys, statistics, time, subprocess
TARGET = 0.5 # change to your target
DIRECTION = "min" # "min" or "max"
N_RUNS = 3 # set to 1 if not noisy

def measure() -> float:
    """Replace this with your actual measurement."""
    times = []
    for _ in range(N_RUNS):
        t0 = time.perf_counter()
        subprocess.run(["python", "main.py"], check=True, capture_output=True)
        times.append(time.perf_counter() - t0)
    return statistics.median(times)

score = measure()
passed = (score < TARGET) if DIRECTION == "min" else (score > TARGET)
print(json.dumps({"pass": passed, "score": round(score, 4)}))
sys.exit(0)
```
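The printed JSON line is the entire interface between the evaluator and whatever drives the loop. Below is a minimal sketch of the keep-or-revert step built on that contract; it assumes the project is a git checkout, that GNU coreutils `timeout` is available, and that `MIN_DELTA` and `DIRECTION` mirror the constraints in research.md.

```python
# Minimal sketch of the keep-or-revert step (not part of the templates).
# Assumes a git checkout and an evaluate.py that follows the contract above.
import json
import subprocess

MIN_DELTA = 0.0    # "Min delta" from research.md
DIRECTION = "min"  # "Direction" from research.md

def evaluate() -> dict:
    """Run the evaluator under a timeout and parse its one-line JSON output."""
    out = subprocess.run(
        ["timeout", "5m", "python", "evaluate.py"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)  # {"pass": bool, "score": float}

def keep_or_revert(best_score: float) -> float:
    """Commit the working tree if the score improved enough, else revert it."""
    result = evaluate()
    improvement = (
        best_score - result["score"] if DIRECTION == "min"
        else result["score"] - best_score
    )
    if improvement > MIN_DELTA:
        subprocess.run(["git", "commit", "-am", "autoresearch: keep"], check=True)
        return result["score"]
    subprocess.run(["git", "checkout", "--", "."], check=True)
    return best_score
```

With the keep policy `score_improvement` and `Min delta: 0`, any strict improvement is kept; raise `MIN_DELTA` when run-to-run noise makes tiny gains meaningless.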
### autoresearch-results.tsv

The machine-readable results log. Eight columns, tab-separated.

```
iteration  commit   metric  delta    status    description            timestamp   notes
0          a1b2c3d  2.3991  0.0      baseline  recursive quicksort    2026-04-20
1          b2c3d4e  0.8709  -1.5282  keep      radix sort base 256    2026-04-20
2          -        0.9122  +0.0413  discard   radix sort base 128    2026-04-20  worse
3          c3d4e5f  0.4979  -0.3730  keep      micro-optimized radix  2026-04-20  TARGET MET
```
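Appending rows by hand invites drift in column order; a few lines of Python keep the log consistent. A sketch, with the hypothetical helper `log_result` and the assumption that the header row already exists:

```python
# Hypothetical helper for appending one row to autoresearch-results.tsv;
# the field order matches the eight-column header above.
import csv

def log_result(iteration, commit, metric, delta, status,
               description, timestamp, notes=""):
    with open("autoresearch-results.tsv", "a", newline="") as f:
        csv.writer(f, delimiter="\t").writerow(
            [iteration, commit, f"{metric:.4f}", f"{delta:+.4f}",
             status, description, timestamp, notes])

# e.g. the iteration-1 row from the sample above:
log_result(1, "b2c3d4e", 0.8709, -1.5282, "keep",
           "radix sort base 256", "2026-04-20")
```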
### research_log.md

Detailed per-iteration notes. Append after every iteration.

```markdown
# Research Log: [Goal Title]
## Iteration 1
**Hypothesis**: [specific, testable hypothesis with mechanism]
**Change**: [what was changed, file:line]
**Result**: [score] — [KEEP | DISCARD]
**Notes**: [what was learned, what to try next]
## Iteration 2
...
```
### AGENTS.md (for research projects)

```markdown
# Research Agent Instructions
## Goal
[One-sentence goal from research.md]
## Workflow
1. Read research.md — understand goal, metric, full history
2. Read git log — what has been tried?
3. Propose ONE hypothesis
4. Make ONE change
5. Run evaluator: python evaluate.py
6. Keep or revert based on score
7. Log to research.md and research_log.md
8. Repeat
## Forbidden
- Do not modify evaluate.py or test files
- Do not make more than one change per iteration
- Do not ask "should I continue?" — keep going until target or max_iterations
```
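Nothing in AGENTS.md depends on the agent's goodwill: the loop can be enforced mechanically. A sketch of that outer loop, reusing `evaluate()` and `keep_or_revert()` from the earlier snippet; `propose_and_apply_change()` is a hypothetical stand-in for the agent's hypothesis-and-edit step.

```python
# Sketch of the outer loop, enforcing the research.md constraints:
# max iterations, a guard that must always pass, one change per iteration.
import subprocess

MAX_ITERATIONS = 20
GUARD = ["python", "-m", "pytest"]  # "Guard" from research.md
TARGET = 0.5                        # direction "min"

def propose_and_apply_change():
    """Hypothetical stand-in: the agent proposes ONE hypothesis, makes ONE change."""
    raise NotImplementedError

best = evaluate()["score"]  # baseline; evaluate() from the earlier sketch
for i in range(1, MAX_ITERATIONS + 1):
    propose_and_apply_change()
    if subprocess.run(GUARD).returncode != 0:
        # Guard failed: the change is never even scored.
        subprocess.run(["git", "checkout", "--", "."], check=True)
        continue
    best = keep_or_revert(best)
    if best < TARGET:
        break  # target met before max_iterations
```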