Jun 7, 2026 · 14 min read · trunk-based · workflows · feature-flags · ci · team

Trunk-based development: when it wins and when it doesn't

Trunk-based development works for teams that have automated tests, feature flags, and a fast review culture. Without those, it hurts. Here is when TBD wins, when it loses, and why.

A friend of mine joined a 20-person engineering team last year. On her first day, the lead told her "we just moved to trunk-based development — we ship to main all the time, it is great." Three months later, she sent me a message. The team was deploying once a week, main was red half the time, and two engineers had quit.

What happened? The team adopted the surface of trunk-based development. They deleted develop. They merged feature branches faster. But they did not have the things that make trunk-based development actually work — automated tests they trusted, feature flags to hide unfinished work, and a code review culture fast enough to keep up.

Trunk-based development (TBD) is the modern default for web teams that deploy often and have the supporting practices in place. Outside that profile, the answer may be different. This post explains the prerequisites, the daily mechanics, where TBD genuinely wins, and where it quietly fails.

What "trunk-based" actually means

Everyone works on one shared branch — usually called main or trunk. New work goes onto small, short-lived branches that live for hours, not days. When the branch is ready, it merges back to main. There is no long-lived develop branch, no release branches that sit around for weeks, no integration branches.

The point is not that everyone commits directly to main. Most TBD teams use pull requests and code review. The point is that the gap between starting work and merging it is small. A branch born on Monday morning should be gone by Monday evening.

Paul Hammant's trunkbaseddevelopment.com is the most thorough reference for the model. He defines it as "a source-control branching model where developers collaborate on code in a single branch called trunk, resist any pressure to create other long-lived development branches by employing documented techniques."

That phrase — "documented techniques" — is doing a lot of work in the definition. Without those techniques, you do not have TBD. You just have a develop branch that you renamed.

The three prerequisites

Before a team can run TBD without pain, three things must be in place. Skip any of them and the workflow will hurt more than it helps.

1. Automated tests you actually trust

If you merge to main 20 times a day, you cannot test every change by hand. The test suite has to catch regressions. If engineers ignore failing tests because they are "flaky" or "always like that," main will break, and the team will lose trust in the workflow.

What "trust" looks like in practice:

A green test suite means the code is safe to deploy. A red one blocks merging.
Tests run in under 10 minutes for the most common changes. (Slower than that, and people start merging without waiting.)
Flaky tests get fixed or removed within a few days, not ignored for months.

The 2018 DORA State of DevOps report found that test automation is one of the strongest predictors of high software-delivery performance. TBD without it is not TBD; it is chaos.

2. Feature flags

A feature flag is a switch in code. Something like:

// Render the new UI only for users who have the flag turned on.
function SearchPage({ user }) {
  if (featureFlags.newSearchUI.enabled(user)) {
    return <NewSearchUI />;
  }
  return <OldSearchUI />;
}

Flags let you merge half-finished work to main without showing it to users. The new code ships in the build and its tests pass, but until you flip the flag, no real user sees the change.

This is what makes "short-lived branches" possible for big features. Without flags, a two-week feature requires a two-week branch. With flags, the same feature merges in small pieces every day, each one hidden until the whole thing is ready.
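To make the "merged but hidden" idea concrete, here is a minimal sketch of flag evaluation. It assumes a hypothetical in-memory flag table; FlagRule, FlagTable, isEnabled, and the user IDs are illustrative names, not the API of any flag vendor.

```typescript
// Hypothetical in-memory flag table; a real system would load this from a
// flag service or config file. All names here are illustrative.
type FlagRule = {
  enabled: boolean;       // global switch: true means everyone sees the feature
  allowUsers?: string[];  // optional allowlist, e.g. QA and product testers
};

type FlagTable = Record<string, FlagRule>;

const flagTable: FlagTable = {
  new_search_ui: { enabled: false, allowUsers: ["qa-1", "pm-7"] },
  sort_by_date_v1: { enabled: false },
};

// Unknown flags evaluate to false, so a typo in a flag name hides the
// feature instead of exposing unfinished work.
function isEnabled(flags: FlagTable, flagName: string, userId: string): boolean {
  if (!Object.prototype.hasOwnProperty.call(flags, flagName)) return false;
  const rule = flags[flagName];
  if (rule.enabled) return true;
  return (rule.allowUsers ?? []).indexOf(userId) !== -1;
}

// Both code paths live on main; the flag decides which one a user gets.
function searchVariant(userId: string): "new" | "old" {
  return isEnabled(flagTable, "new_search_ui", userId) ? "new" : "old";
}
```

Defaulting unknown flags to off is a design choice in this sketch: it fails safe, at the cost of silently hiding a feature when a flag name is misspelled.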

Common flag tools include LaunchDarkly, Optimizely, Unleash, Flagsmith, and many homegrown systems. The vendor matters less than the discipline of using flags consistently. Martin Fowler's Feature Toggles article is the clearest writeup of the patterns and trade-offs.

A practical warning: feature flags accumulate. After a year, your code is full of if (flag.enabled) checks for features that shipped six months ago. Plan to clean them up, or the codebase rots.
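One way to keep cleanup honest is to list flags that have been fully rolled out for too long. The sketch below assumes a hypothetical flag registry that records when each flag reached 100%; FlagRecord, fullyRolledOutAt, and findStaleFlags are made-up names for illustration.

```typescript
// Hypothetical registry entry; field names are assumptions for this sketch.
type FlagRecord = {
  key: string;
  fullyRolledOutAt: string | null; // ISO 8601 date the flag hit 100%, or null if still rolling out
};

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the keys of flags that have been at 100% for more than maxAgeDays.
// These are candidates for deleting both the flag and the old code path.
function findStaleFlags(flags: FlagRecord[], now: Date, maxAgeDays: number): string[] {
  return flags
    .filter((flag) => {
      if (flag.fullyRolledOutAt === null) return false;
      const ageMs = now.getTime() - Date.parse(flag.fullyRolledOutAt);
      return ageMs > maxAgeDays * DAY_MS;
    })
    .map((flag) => flag.key);
}
```

A check like this could run in CI or a weekly job and post the list to the team channel; the threshold is a team decision, not something the workflow dictates.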

3. A fast review culture

If a pull request takes two days to get reviewed, branches are not short-lived. They are just delayed.

"Fast review" does not mean rushed review. It means a team norm that reviews happen within a few hours during the working day, not when someone gets around to it. Teams hit this in different ways — pairing, a rotating "reviewer of the day," small PRs that take 10 minutes to read, async chat reminders. The mechanism is less important than the result.

If your PRs sit for days waiting for review, fix that before adopting TBD. Otherwise work that was finished in hours will still sit on a branch for days, and the workflow will not deliver its benefits.
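Review speed is easy to measure before you commit to the workflow. The sketch below computes median review wait and median branch lifetime from PR timestamps. It assumes you can export those timestamps from your Git host; PrTiming and its field names are hypothetical.

```typescript
// Hypothetical PR timing record; in practice these timestamps would come
// from your Git host's API. Field names are assumptions for this sketch.
type PrTiming = {
  branchCreatedAt: string; // all fields are ISO 8601 timestamps
  prOpenedAt: string;
  firstReviewAt: string;
  mergedAt: string;
};

const HOUR_MS = 60 * 60 * 1000;

function hoursBetween(startIso: string, endIso: string): number {
  return (Date.parse(endIso) - Date.parse(startIso)) / HOUR_MS;
}

function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of an empty list is undefined");
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Review wait: PR opened until first review.
// Branch lifetime: branch created until merge.
function reviewStats(prs: PrTiming[]): { medianReviewWaitHours: number; medianBranchLifetimeHours: number } {
  return {
    medianReviewWaitHours: median(prs.map((p) => hoursBetween(p.prOpenedAt, p.firstReviewAt))),
    medianBranchLifetimeHours: median(prs.map((p) => hoursBetween(p.branchCreatedAt, p.mergedAt))),
  };
}
```

The median is used rather than the mean so that one PR forgotten over a holiday does not hide an otherwise healthy review habit.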

A day on a trunk-based team

Here is what a normal day looks like for one engineer.

9:30 a.m. Pull latest main. Read the team channel for any "do not deploy" notices.

git checkout main
git pull --rebase

9:45 a.m. Pick up a small task — "Add a 'sort by date' option to the customer list." Estimated half a day.

git checkout -b sort-by-date

11:30 a.m. First version works. Tests pass locally. The new UI is wrapped in a feature flag (sort_by_date_v1), turned off in production.

git add .
git commit -m "feat(customers): add sort-by-date behind flag"
git push -u origin sort-by-date

11:35 a.m. Open a PR. Tag a teammate.

12:30 p.m. Teammate reviews, leaves three small comments. Engineer addresses them over lunch.

1:30 p.m. PR is approved. Merge queue picks it up, re-tests against the latest main, merges. The change is on main, but the feature flag is off, so no user sees it yet.

2:00 p.m. Start the next small change — extending the new sort to include the "by status" option, also behind the same flag.

5:00 p.m. Two more PRs merged. The flag is still off for real users, but the QA environment has it on, and the product manager has been clicking around to give feedback.

Friday morning. The flag flips on for 10% of users. Engineering watches dashboards. No errors. Flag goes to 100% by end of day. The next sprint, the flag and the old code path get cleaned up.

That is the rhythm. Small pieces. Same day. Flag-guarded. No long-lived branches.
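The Friday step, where the flag goes to 10% of users and then 100%, is usually done with deterministic bucketing so that a given user stays in or out between page loads. Here is a minimal sketch, assuming a simple FNV-1a style hash over the flag key and user ID. rolloutBucket and inRollout are illustrative names, and real flag services use their own hashing schemes.

```typescript
// FNV-1a style 32-bit hash over UTF-16 code units. It is a small
// non-cryptographic hash, used here only to spread users across buckets.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    // Multiply by the FNV prime 16777619, written as shifts, and wrap to 32 bits.
    hash = (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return hash;
}

// Maps a (flag, user) pair to a stable bucket in the range 0..99.
// Including the flag key means different flags pick different 10% groups.
function rolloutBucket(flagKey: string, userId: string): number {
  return fnv1a(`${flagKey}:${userId}`) % 100;
}

// A user is in the rollout when their bucket falls below the percentage.
// Raising the percentage only ever adds users; nobody is dropped.
function inRollout(flagKey: string, userId: string, percent: number): boolean {
  return rolloutBucket(flagKey, userId) < percent;
}
```

Because the bucket depends only on the flag key and user ID, moving from 10% to 100% never flips a user from the new experience back to the old one.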

Where TBD genuinely wins

The evidence for TBD as a high-performing pattern is strong but correlational. High-performing teams tend to use short branches, but they also tend to deploy often, automate tests, and run blameless retros. TBD travels with these habits, so it is hard to credit the results to the branching model alone.

Three settings where TBD wins clearly:

Web SaaS, daily or more deploys. If you control the servers, can deploy whenever you want, and have feature flags, TBD removes branch overhead with no real downside. Most modern web companies fall into this bucket.

Large monorepos. Google's 2016 paper "Why Google Stores Billions of Lines of Code in a Single Repository" describes a monorepo with thousands of engineers, all committing to one trunk. The whole system is built on the assumption that branches are short, work is incremental, and the trunk is always good. Meta and Stripe describe similar setups in their engineering blogs.

Teams using GitHub merge queues. GitHub's merge queue (generally available since July 2023) re-tests each PR against the latest main, plus any PRs queued ahead of it, before it actually merges. CI workflows opt in by also running on the merge_group event. This solves the "two PRs each pass tests but together break main" problem that gets worse as a team grows. With a merge queue, a 50-engineer team can run TBD without main going red every afternoon.
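The "two PRs each pass tests but together break main" problem is easier to see in a toy model. The sketch below is not how GitHub implements its queue; it is a simplified, one-at-a-time simulation with hypothetical types (RepoState, Change) in which "tests pass" just means every call site points at a function that still exists.

```typescript
// Toy repository: defined functions plus call sites written as "file:function".
type RepoState = { defined: string[]; callSites: string[] };

// A change is a diff authored against some earlier snapshot of main.
type Change = {
  title: string;
  addDefs: string[];
  removeDefs: string[];
  addCalls: string[];
  removeCalls: string[];
};

function applyChange(repo: RepoState, change: Change): RepoState {
  const keep = (items: string[], removed: string[]) =>
    items.filter((item) => removed.indexOf(item) === -1);
  return {
    defined: keep(repo.defined, change.removeDefs).concat(change.addDefs),
    callSites: keep(repo.callSites, change.removeCalls).concat(change.addCalls),
  };
}

// Stand-in for a test suite: every call must target a defined function.
function testsPass(repo: RepoState): boolean {
  return repo.callSites.every((site) => repo.defined.indexOf(site.split(":")[1]) !== -1);
}

// Simplified merge queue: test each change on top of the latest main,
// merge it only if the combined result is green.
function runMergeQueue(main: RepoState, queue: Change[]): { main: RepoState; merged: string[]; rejected: string[] } {
  const merged: string[] = [];
  const rejected: string[] = [];
  let current = main;
  for (const change of queue) {
    const candidate = applyChange(current, change);
    if (testsPass(candidate)) {
      current = candidate;
      merged.push(change.title);
    } else {
      rejected.push(change.title);
    }
  }
  return { main: current, merged, rejected };
}

const baseMain: RepoState = { defined: ["getUser"], callSites: ["orders.ts:getUser"] };

// PR 1 renames getUser and updates the only caller it knows about.
const renameGetUser: Change = {
  title: "refactor: rename getUser to fetchUser",
  addDefs: ["fetchUser"], removeDefs: ["getUser"],
  addCalls: ["orders.ts:fetchUser"], removeCalls: ["orders.ts:getUser"],
};

// PR 2 adds a new caller of the old name. Green on its own base, red after PR 1.
const addProfilePage: Change = {
  title: "feat: add profile page",
  addDefs: [], removeDefs: [],
  addCalls: ["profile.ts:getUser"], removeCalls: [],
};
```

Running runMergeQueue(baseMain, [renameGetUser, addProfilePage]) merges the rename and rejects the profile page, so main stays green and the second author gets the failure on their PR instead of on everyone's trunk.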

Where TBD loses

There are real environments where TBD is the wrong tool. Be honest about your situation before adopting it.

Versioned software with long support windows

Suppose you ship an SDK that thousands of customer apps depend on. You have v3.0 in the wild, v3.1 about to ship, and v3.2 in development. A security bug is discovered in v3.0. Customers who have not upgraded need a patch.

Plain TBD, with one trunk and no long-lived branches, has no good answer to this. The fix needs to land on a release/3.0 branch that has not received new features in months. You need a structured way to ship parallel timelines. That is what Gitflow and release-branch workflows are for. Post #7 of this series covers the mechanics in detail.

Slow test suites

If your test suite takes two hours, TBD does not work. Engineers cannot wait two hours to merge a small change. Either tests get skipped, or the queue backs up, or branches get longer to amortise the wait — and then it is not TBD anymore.

Fix the slow tests first. Parallelise, split, mock external services, run only affected tests for small changes. Until the suite is fast, TBD will fight you.
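"Run only affected tests" usually means walking the import graph from each test file and checking whether it reaches a changed file. Here is a minimal sketch, assuming a hypothetical pre-computed import graph; the file names and the ImportGraph shape are made up, and real build tools derive this graph for you.

```typescript
// Hypothetical import graph: each file maps to the files it imports.
type ImportGraph = Record<string, string[]>;

const importGraph: ImportGraph = {
  "customers/list.test.ts": ["customers/list.ts"],
  "customers/list.ts": ["shared/sort.ts", "shared/api.ts"],
  "billing/invoice.test.ts": ["billing/invoice.ts"],
  "billing/invoice.ts": ["shared/api.ts"],
  "shared/sort.ts": [],
  "shared/api.ts": [],
};

// Depth-first search: does `start` equal, or transitively import, a changed file?
function reachesChange(graph: ImportGraph, start: string, changed: string[]): boolean {
  const seen: Record<string, boolean> = {};
  const stack: string[] = [start];
  while (stack.length > 0) {
    const current = stack.pop() as string;
    if (seen[current]) continue; // guards against import cycles
    seen[current] = true;
    if (changed.indexOf(current) !== -1) return true;
    for (const dep of graph[current] ?? []) stack.push(dep);
  }
  return false;
}

// Test files (by the ".test.ts" naming convention assumed here) that need to run.
function affectedTests(graph: ImportGraph, changed: string[]): string[] {
  return Object.keys(graph)
    .filter((file) => /\.test\.ts$/.test(file))
    .filter((file) => reachesChange(graph, file, changed))
    .sort();
}
```

In this example a change to shared/sort.ts selects only the customer list test, while a change to shared/api.ts selects both test files. Changes to files outside the graph select nothing, so a real setup would want a fallback that runs the full suite for config or build changes.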

Regulated industries with formal change control

Medical devices, financial trading, aerospace, parts of healthcare — these industries often require formal change-control records. Every change needs an approval trail. Some require formal QA cycles on a frozen build before release.

TBD can work here, but the supporting paperwork has to fit. Often a more structured workflow (GitLab Flow, Release Flow, or even Gitflow) gives auditors what they want with less custom tooling.

Teams that do not deploy often

If you ship one big release every quarter, the case for TBD is weaker. The benefits — fast feedback, small merges, low integration cost — exist, but you pay setup cost for tooling you do not fully use. A simpler workflow like GitHub Flow with periodic release tags often fits better.

Teams without feature flag infrastructure

This is the most common trap. A team reads that TBD is "the modern way," removes their develop branch, and starts merging unfinished work to main. With no feature flags, that unfinished work is now shipped to users on the next deploy. Bugs appear. The team either reverts to a longer branch lifetime (now you are back to the old workflow with extra steps), or ships broken software.

Install the flag system before you adopt the workflow.

Three teams running TBD in different shapes

Abstract advice is hard to apply. Here are three realistic teams running TBD, each at a different scale.

Team A: Three-person startup. One repo, daily deploys, no formal QA. Engineers commit directly to main for trivial changes and open PRs for anything non-trivial. PRs get reviewed within two hours, often by the only other engineer. Feature flags are a 50-line homegrown library. Branch lifetime is measured in hours, not days. The whole workflow document is two paragraphs.

Team B: 25-engineer SaaS. Monorepo with 30 services. Deploys 15–25 times per day. Every PR goes through a merge queue. Feature flags are managed in LaunchDarkly. CODEOWNERS routes payments PRs to the payments team and infra PRs to SRE. Tests for a typical PR run in 6 minutes; the merge queue re-tests against the latest main before merging. Median branch lifetime is 11 hours.

Team C: 200-engineer commerce platform. Monorepo. Deploys per service, often 50+ deploys per day across services. Three-tier review for risky areas (payments, identity, infra). Sharded CI runs only affected tests for a typical PR. Flags are managed through an internal platform with rollout dashboards. There are no develop-style branches at all. Releases are continuous; "release notes" are a generated changelog from PR titles.

The pattern: TBD looks similar in shape at every scale, but the supporting infrastructure scales with the team. Small teams need almost no tooling. Big teams need a lot. The workflow itself does not change much.
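Team C's generated changelog is a good example of tooling that stays small. The sketch below groups PR titles written in the "type(scope): summary" style into sections. It is an assumption that the team uses this title convention, and KNOWN_TYPES, buildChangelog, and renderChangelog are illustrative names.

```typescript
// Conventional-commit style titles: "type(scope): summary". The set of
// recognised types below is an assumption for this sketch.
const KNOWN_TYPES = ["feat", "fix", "perf", "refactor", "docs", "chore"];
const TITLE_PATTERN = /^(\w+)(?:\(([^)]+)\))?:\s*(.+)$/;

type Changelog = Record<string, string[]>;

// Groups titles by type; anything that does not match goes under "other".
function buildChangelog(prTitles: string[]): Changelog {
  const sections: Changelog = {};
  for (const title of prTitles) {
    const match = TITLE_PATTERN.exec(title);
    let section = "other";
    let entry = title;
    if (match !== null && KNOWN_TYPES.indexOf(match[1]) !== -1) {
      section = match[1];
      entry = match[2] ? `${match[2]}: ${match[3]}` : match[3];
    }
    if (sections[section] === undefined) sections[section] = [];
    sections[section].push(entry);
  }
  return sections;
}

// Renders sections in alphabetical order as plain text.
function renderChangelog(changelog: Changelog): string {
  return Object.keys(changelog)
    .sort()
    .map((section) => [section + ":"].concat(changelog[section].map((e) => "  - " + e)).join("\n"))
    .join("\n");
}
```

This only works if PR titles are kept meaningful, which is one more reason small, single-purpose PRs fit the workflow well.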

What about deployment cadence?

A frequent question: "do I have to deploy every commit to use TBD?"

No. TBD requires main to be deployable, not that it is deployed on every commit. Many TBD teams deploy on a schedule. Some examples:

A small team deploys twice a day, at 11am and 4pm, with a person watching.
A medium team deploys every PR merge during business hours, with automatic rollback on error budget breach.
A larger team deploys per-service on each merge, with canary stages and automated promotion to full traffic.

The discipline is about main always being shippable. Whether you ship is a deployment decision, not a Git decision. Confusing the two is one of the most common reasons teams reject TBD ("we cannot deploy 20 times a day, so TBD is not for us"). You do not need to. You only need main to be safe whenever you decide to ship.

Migration is a separate problem

You may read this post, agree with all of it, and still not know how to get there from your current Gitflow setup. That is normal. Migration is harder than adoption.

Post #8 in this series — Migrating from Gitflow to trunk-based development — is a step-by-step playbook for that move. The short version: gradual, over 2–4 months, in five phases. Do not try to migrate overnight. Teams that try usually grind to a halt and revert.

Common myths

Myth 1: "TBD means no branches." Wrong. TBD uses branches all the time — they just live for hours. The model is "branches are cheap and short," not "branches are forbidden." A PR is still a branch.

Myth 2: "TBD means every commit ships to users." Wrong. TBD requires every commit on main to be safe to deploy. It does not require you to actually deploy every commit. Many TBD teams deploy on a schedule (twice a day, once a week). The discipline is that main is always shippable, not that it is always shipped.

Myth 3: "Feature flags are only for A/B tests." Wrong. Most TBD flags are release flags — used to hide unfinished work for a few days until it is ready, then removed. Some flags do live longer for experiments or per-customer toggles, but the everyday use is short-lived "merge it, hide it, ship it when ready, delete the flag."