Manual bisect is fine for 10 steps. For 20+, or when reproduction is annoying (build, run, click, check), automate it.
git bisect run <cmd> tells Git to execute <cmd> at each candidate commit. The command's exit code decides the verdict:
- 0 — commit is good
- 125 — skip (couldn't test this commit; usually means it won't build)
- any other 1-127 — commit is bad
- 128 or higher — abort the bisect (e.g. the script itself errored)
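To make the mapping above concrete, here is a minimal sketch of the decision table as a shell function. The name verdict_for is a hypothetical helper for illustration only; it is not part of Git, and it simply mirrors the rules listed above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Git): translate a probe's exit code
# into the verdict that `git bisect run` would draw from it.
verdict_for() {
  local code=$1
  if [ "$code" -eq 0 ]; then
    echo good          # probe passed
  elif [ "$code" -eq 125 ]; then
    echo skip          # commit could not be tested
  elif [ "$code" -ge 1 ] && [ "$code" -le 127 ]; then
    echo bad           # probe failed
  else
    echo abort         # 128 and up: bisect stops
  fi
}
```

For example, verdict_for 1 prints bad and verdict_for 125 prints skip, which is why setup failures need an explicit exit 125 rather than whatever status the failing tool happens to return.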
A typical regression-hunt script:
#!/usr/bin/env bash
set -e
npm install --silent || exit 125 # can't even install? skip
npm run build || exit 125 # build broke for unrelated reasons? skip
npm test -- --grep "checkout flow" # the actual probe
# npm test exits 0 = good, non-zero = bad — exactly what bisect wants
Then:
git bisect start HEAD v1.4.0 # bad commit first, then good, in one shot
git bisect run ./scripts/probe.sh
Git checks out a commit, runs the script, marks the commit, picks the next midpoint, and repeats. You go get coffee. Ten minutes later it prints <hash> is the first bad commit. Run git bisect reset afterwards to return to where you started.
Tips:
- Make the probe deterministic. Flaky probes give wrong answers; you'll bisect to a commit that didn't actually introduce the bug.
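- Map setup failures to 125 explicitly (|| exit 125). set -e alone is not enough: it stops the script at the first failing command, but the script then exits with that command's status, usually 1, which bisect reads as "bad."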
- Keep the probe fast. It runs once per bisect step (log₂ N times). Strip it to the minimum that proves the bug.
- For really expensive probes, narrow the range manually first (git bisect good <ref> to skip ancient history) before handing off to run.
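When the probe cannot be made fully deterministic, one common workaround is to run it several times and call the commit good only if every run passes. The sketch below assumes that approach; all_pass is a hypothetical helper name, and the repeat count is something you would tune to how flaky the test is.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command N times.
# Returns 0 only if every run succeeds; returns 1 at the first failure.
all_pass() {
  local attempts=$1
  shift
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    "$@" || return 1
    i=$((i + 1))
  done
  return 0
}
```

In the probe script, the last line would then become something like all_pass 5 npm test -- --grep "checkout flow". This multiplies the probe's runtime on good commits, so it trades speed for reliability; keep the count as low as the flakiness allows.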