
ML Handbook: ML Engineering with AI
A handbook for ML engineers whose primary implementation engine is an AI agent. Practice for the era when the bottleneck is deciding and verifying, not typing.
Table of Contents
35 chapters
Foundations
5 chapters
- 01
Why AI-augmented ML is different
The interesting changes are second-order. Most posts focus on the first-order wins; the leverage is in what happens when the cost of trying an idea collapses.
3 min read
- 02
The new bottlenecks
An honest accounting of an AI-augmented week shows that "writing code" rarely tops the list anymore. Three things take its place.
3 min read
- 03
Claude Code mental model
To use the agent well, you need an honest model of what it is. Most failures come from treating it as something else.
3 min read
- 04
Agent context files
The highest-leverage category of files in any AI-assisted ML repo is the set agents read to figure out how to behave. CLAUDE.md is the most visible. It is one of several. They form a layered system; teams that treat…
8 min read
- 05
Operating principles
The short list. Print it. Pin it next to your monitor. Apply it on every task.
2 min read
Iteration Loop
6 chapters
- 06
Classic vs AI-augmented loop
The loop has the same shape. The economics are different.
3 min read
- 07
Hypothesis → experiment → measure
The loop survives 100× speedup only if each step is structured. Here is the structure we use.
4 min read
- 08
Reproducibility is non-negotiable
You generate results 5–10× faster. If those results are not reproducible, you generate garbage 5–10× faster. Internalize this chapter before the others.
4 min read
- 09
Designing research loops
The mechanics of building an experiment harness that an agent can drive on its own: a bounded action space, machine-readable results, and a hard budget.
5 min read
- 10
Reward & stop conditions
The two parts of an auto-research loop where teams quietly fool themselves. Each warrants its own discussion.
5 min read
- 11
Case studies
Three concrete shapes of auto-research that work today, and one that does not. Use them as templates, not gospel — adapt to your own constraints.
5 min read
Tooling Stack
5 chapters
- 12
Claude Code as the engine
We default to Claude Code as the primary implementation agent. The reasoning is in Foundations; this chapter is about configuration that actually pays off.
5 min read
- 13
Experiment tracking
Pick one. Use it on every run. The worst tracker used consistently beats the best tracker used sometimes.
4 min read
- 14
Data & model versioning
Code without versioning is unsupportable; data and weights are no different. This chapter is about the practical minimum.
4 min read
- 15
Repo conventions
A repo layout that is friendly to humans and agents. Both audiences benefit from the same property: predictability.
5 min read
- 16
SOTA tool roundup (2026)
A pragmatic survey, not a hype list. For each category we name what exists, what we'd actually pick, and what to skip.
6 min read
Three Modalities
3 chapters
- 17
Traditional ML with AI
Tabular, sklearn, XGBoost, LightGBM, classical NLP. This is where the AI-augmented workflow shines brightest, because…
4 min read
- 18
Deep learning with AI
PyTorch, JAX, fine-tuning, distributed training. The economics are different from traditional ML: each experiment costs real money, runs take hours not seconds, and a wrong configuration can burn a day of GPU time.
6 min read
- 19
Agentic AI with AI
Building agent systems with the help of agents. The framing is recursive but the practice is concrete: you are designing software whose primary primitive is "LLM call + tool use," and your IDE is also an agent.
6 min read
Failure Modes
6 chapters
- 20
Data leakage & dataset mixing
The classical ML failure mode, amplified. Agents refactor data pipelines five times in an afternoon and lose the split each time.
4 min read
- 21
Hardcoded paths & magic constants
The most common AI-written bug by volume. Easy to write. Easy to miss in review. Painful to find later.
4 min read
- 22
Fabricated benchmarks
The most damaging failure mode in AI-augmented ML. The agent produces a number that looks like a measurement but came from nowhere. The number lands in a model card, a slide, a blog post, a tweet. Then someone tries to…
5 min read
- 23
Silent test failures
A test that always passes is worse than no test. It looks like coverage and provides none. Agents produce these by accident with surprising regularity. The patterns below are the common ones; learn to spot them on sight.
4 min read
- 24
Hallucinated APIs & versions
The agent's training data is a snapshot. The libraries you use have moved since. Result: confident code calling functions that no longer exist, with arguments that were renamed, against APIs that were deprecated two…
4 min read
- 25
Guardrails checklist
The consolidated list. Use it before merging any AI-generated PR that touches data, training, eval, or anything customer-facing. Two minutes. Saves hours.
4 min read
Documenting with AI
5 chapters
- 26
Doc-driven development
The shortest path to maintainable AI-augmented code: write the doc first, generate the code from the doc, keep them in sync.
4 min read
- 27
Numbers must be measured
The core rule, restated: every number in user-facing material comes from a measurement script in the repo, or a cited public source with a working URL. Nothing else.
5 min read
- 28
The same-commit rule
When you ship code, you ship the doc that describes it. Same commit. Always.
4 min read
- 29
Model & dataset cards
Model cards and dataset cards are the primary external interface for an ML artifact. The agent can draft them well; only you can verify them.
5 min read
- 30
Client deliverables (DOCX / PPTX / XLSX)
Markdown is the lingua franca for engineers. DOCX, PPTX, and XLSX are the lingua franca for everyone else — clients, leadership, finance, sales, regulators. Treating them as second-class cuts you off from collaboration.
6 min read
Recipes
5 chapters
- 31
Recipe: spinning up a new project
A 30-minute path from empty directory to "agent can ship features here." Skip nothing the first time you do it; everything below has earned its place.
5 min read
- 32
Recipe: adding a baseline
The first model in any new project is the baseline. Not the headline model. The baseline. Without it, no future result has a number to beat.
5 min read
- 33
Recipe: running a benchmark
A benchmark is a runnable measurement script that produces a number you can publish. This recipe walks through writing and running one for inference performance — the most common case.
5 min read
- 34
Recipe: writing a model card
The 30-minute version. Produces a card that survives external review without embarrassing you.
5 min read
- 35
Recipe: debugging with Claude
Patterns that consistently produce a fix in under an hour.
6 min read
