# Project 01 — Your First Research Loop

**Paired with:** Lecture 01 + Lecture 02
**Starter code:** `projects/project-01/starter/`
**Solution:** `projects/project-01/solution/`
## What You'll Build

Run the canonical autoresearch example: optimize a Python sort function from ~2.4 seconds to under 0.5 seconds on 1 million integers. You'll use `/autoresearch:plan` to scaffold the project and `/autoresearch` to run the loop.
## Learning Objectives

By completing this project, you will:

- Convert a vague goal ("make it faster") into a mechanical metric (`median_time_s < 0.5`)
- Write an evaluator that outputs `{"pass": bool, "score": float}`
- Run the 5-stage loop and read the resulting `research.md` history
- Understand what gets committed and why
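The evaluator contract is just JSON on stdout. A minimal sketch of the shape the loop expects (the real evaluator is generated for you in Step 1; the `TARGET_S` constant and `evaluate` helper here are illustrative, not part of the starter code):

```python
import json

TARGET_S = 0.5  # the target from the metric: median_time_s < 0.5

def evaluate(median_time_s: float) -> dict:
    """Map a raw measurement onto the {"pass": bool, "score": float} contract."""
    return {"pass": median_time_s < TARGET_S, "score": median_time_s}

# A 0.498 s median passes the target.
print(json.dumps(evaluate(0.498)))  # prints {"pass": true, "score": 0.498}
```

The loop only ever reads this one JSON object, which is what makes the metric "mechanical": no human judgment is needed to decide whether an experiment succeeded.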
## Starting Point

```
projects/project-01/starter/
├── sort.py         ← the slow implementation (recursive quicksort)
├── test_sort.py    ← correctness tests (never modify)
└── task-prompt.md  ← your mission
```

`sort.py` contains a recursive quicksort on a list of 1 million integers. It takes ~2.4 seconds. Your goal: get it under 0.5 seconds without breaking the tests.
## Step 1 — Run the Plan Wizard

```bash
cd projects/project-01/starter/
/autoresearch:plan
```

Answer the wizard questions:
- Goal: Reduce `sort.py` execution time to under 0.5s on 1M integers
- Metric: `median_time_s`, direction: minimize, target: `< 0.5`
- Noisy? Yes — benchmarks vary. Set `noise_runs: 3`
- Scope: `sort.py` only. Forbidden: `test_sort.py`
- Guard: `python -m pytest test_sort.py`
- Max iterations: 20
The wizard generates `benchmark.py` and `research.md`.
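The generated `benchmark.py` isn't shown in the starter listing. Conceptually, it times the sort `noise_runs` times, takes the median, and prints the pass/score JSON. A rough sketch of that logic, using Python's built-in `sorted` as a stand-in for the entry point in `sort.py` (all names here are assumptions, not the wizard's actual output):

```python
import json
import random
import statistics
import time

NOISE_RUNS = 3   # matches noise_runs: 3 from the plan
TARGET_S = 0.5   # matches the target: median_time_s < 0.5

def time_sort(sort_fn, data) -> float:
    """Time one run of sort_fn on a fresh copy of data."""
    xs = list(data)
    start = time.perf_counter()
    sort_fn(xs)
    return time.perf_counter() - start

data = [random.randrange(1_000_000) for _ in range(100_000)]
# Median over NOISE_RUNS runs smooths out per-run benchmark noise.
times = [time_sort(sorted, data) for _ in range(NOISE_RUNS)]
median_time_s = statistics.median(times)
print(json.dumps({"pass": median_time_s < TARGET_S, "score": round(median_time_s, 3)}))
```

Taking the median (rather than the minimum or mean) is a common choice for noisy timing benchmarks: it ignores a single outlier run in either direction.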
## Step 2 — Run the Loop

```bash
/autoresearch
```

Watch the loop run. After 3–5 iterations you should see:
| Iteration | Experiment | median_time_s | Decision |
|---|---|---|---|
| 3 | radix sort base 256 | 0.871 | keep |
| 4 | radix sort base 65536 | 0.573 | keep |
| 5 | micro-optimized radix | 0.498 | keep ✓ (target met) |

## Step 3 — Read the Results
Open `research.md` and `research_log.md`. Answer these questions:

- How many iterations did it take to hit the target?
- Which experiments were discarded? Why?
- What does the git log look like? (`git log --oneline`)
## Expected Outcome

- Final best: `median_time_s = 0.498`
- Target: `< 0.5` ✓
- Iterations used: 5 of 20

## Verification
```bash
python benchmark.py
# Expected: {"pass": true, "score": 0.498}

python -m pytest test_sort.py
# Expected: all tests pass
```

## Hints
- If the loop is running slowly, it's because `noise_runs: 3` runs the benchmark 3 times per iteration. This is correct behavior — benchmarks are noisy.
- If you see `GUARD FAILED`, the sort is producing wrong results. Check which change broke correctness.
- The solution directory contains a completed `research.md` with a full 5-iteration history you can compare against.
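For reference, the winning experiments in the iteration table use a standard technique: LSD radix sort processes integers one byte (base 256) at a time with a counting/bucketing pass per digit, avoiding comparisons entirely. A self-contained sketch for non-negative integers (not the solution's exact code):

```python
import random

def radix_sort_base256(xs):
    """LSD radix sort, base 256: one stable bucketing pass per byte,
    from least significant byte to most significant."""
    if not xs:
        return xs
    max_val = max(xs)
    shift = 0
    while (max_val >> shift) > 0:
        buckets = [[] for _ in range(256)]
        for x in xs:
            buckets[(x >> shift) & 0xFF].append(x)
        # Stable concatenation preserves order established by earlier passes.
        xs = [x for bucket in buckets for x in bucket]
        shift += 8
    return xs

data = [random.randrange(1_000_000) for _ in range(10_000)]
assert radix_sort_base256(data) == sorted(data)
```

Base 65536 (two bytes per pass, as in iteration 4) halves the number of passes at the cost of 256× more buckets per pass, which is the kind of trade-off the loop explores experimentally.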