Multi-repo coordination: submodules, subtrees, and internal packages compared
Three honest approaches to sharing code across many repos — Git submodules, Git subtrees, and internal package registries. When each fits, and the pitfalls of each.
A team I worked with last year shipped a small API change. The change touched a shared client library, two backend services, and one mobile app — four separate repositories. By the time the change was live in production, fourteen PRs had been opened, in a specific order, across four repos. Two of them merged before the others, breaking a staging environment for two days. The release shipped late.
This is the multi-repo coordination problem. When code is split across many repositories, changes that span multiple repos are awkward. There is no atomic commit. There is no single "merge button." You stitch the change together across repos and hope nothing breaks while you do it.
There are three honest approaches: Git submodules, Git subtrees, and internal package registries. None of them is universally right. This post compares them — what each one is, when each fits, what each one's failure modes look like.
The problem, more precisely
You have multiple repositories. They share some code or depend on each other. You want changes to flow between them without manual copy-paste.
Three flavors of the problem:
Shared library. Repo A is a service. Repo B is another service. Both need the same auth-client code. You want one source of truth for auth-client.
Shared types or contracts. Repo A is a backend. Repo B is a frontend. They share TypeScript types describing the API. Changes to those types must propagate.
Cross-repo coordinated change. A new API field is added in repo A. The frontend in repo B needs to use it. The mobile app in repo C also needs to use it. All three repos need to update around the same time.
The three approaches handle these differently.
Approach 1: Git submodules
A Git submodule is a pointer from one repo (the "superproject") to a specific commit in another repo (the "submodule"). The submodule's files appear inside the superproject's working directory, but they are managed as a separate Git repo with its own history.
The reference is the git-submodule(1) man page.
Add a submodule:
cd my-service
git submodule add https://github.com/org/auth-client libs/auth-client
git commit -m "Add auth-client as submodule"
Clone a repo that has submodules:
git clone --recurse-submodules https://github.com/org/my-service
# or, if already cloned:
git submodule update --init --recursive
Update the submodule to a newer commit:
cd libs/auth-client
git pull origin main
cd ../..
git add libs/auth-client
git commit -m "Bump auth-client to latest"
What submodules are good for:
- A shared library that changes rarely and where you want explicit, pinned-version control over which commit each consumer uses.
- Vendoring open-source dependencies whose history you want to keep visible.
- Cases where the submodule has its own CI, release cadence, and ownership separate from the superproject.
What submodules are bad at:
- Detached HEAD. When you check out a submodule via the superproject, you land on a detached HEAD by default. New engineers commit changes there, the next git submodule update or checkout moves HEAD elsewhere, and their commits seem to have "disappeared": they were never on a branch.
- CI surface. Every CI job in the superproject needs to know about submodules. Forget --recurse-submodules once and the build fails in a confusing way.
- Coordinated changes. Changing the submodule plus the superproject in one logical change still requires two PRs in two repos.
- Tooling support. Many tools (IDE search, linters, deploy systems) have rough edges around submodules. Expect to write workarounds.
A practical warning. Submodules confuse new engineers. If you adopt them, write a one-page document explaining the workflow and put it in CONTRIBUTING.md. Otherwise the team will spend hours debugging "lost commits" every few months.
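A small guard can turn the "confusing build failure" into a readable one. The sketch below is one possible shape, not a standard tool: check_submodule_status is a hypothetical helper name, and it only inspects the one-character prefix that git submodule status puts in front of each line.

```shell
# Reads `git submodule status --recursive` output on stdin.
# Each line starts with one status character:
#   ' ' in sync with the pinned commit
#   '-' submodule not initialized
#   '+' checked-out commit differs from the pinned one
#   'U' submodule has merge conflicts
check_submodule_status() {
  problems=$(grep -E '^[-+U]' || true)
  if [ -n "$problems" ]; then
    echo "Submodules are not in the pinned state:" >&2
    echo "$problems" >&2
    echo "Try: git submodule update --init --recursive" >&2
    return 1
  fi
  echo "submodules OK"
}

# Typical use at the start of a CI job or in a pre-push hook:
#   git submodule status --recursive | check_submodule_status || exit 1
```

The same check is worth pasting into the CONTRIBUTING.md page, since the '+' case is also what a local checkout looks like right before someone forgets to commit a new pointer.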
Approach 2: Git subtrees
A Git subtree copies another repo's history into your repo at a subdirectory. After the subtree is added, the files appear as if they were always part of your repo. You can edit them like any other files. Periodically, you can sync changes with the upstream repo.
Subtrees are implemented as a contrib script (git-subtree). The reference is git-subtree(1). The script ships with Git on most distributions.
Add a subtree:
cd my-service
git subtree add --prefix=libs/auth-client \
https://github.com/org/auth-client main --squash
After this command, libs/auth-client is a normal directory in your repo. No submodule indirection. No detached HEAD. Engineers can edit files there like any others.
Pull upstream changes:
git subtree pull --prefix=libs/auth-client \
https://github.com/org/auth-client main --squash
Push local changes back to the source repo (less common):
git subtree push --prefix=libs/auth-client \
https://github.com/org/auth-client my-changes-branch
What subtrees are good for:
- Pulling in a third-party library whose code you might want to patch locally.
- "I want this code in my repo but I might occasionally sync upstream improvements."
- Avoiding the operational complexity of submodules for consumers who do not care about pinning.
What subtrees are bad at:
- Harder for publishers. If you are the owner of auth-client and have ten consumers, subtree push from each consumer creates a tangled history.
- Bigger repos. Subtrees copy history into your repo. Your repo grows. Submodules avoid this.
- Less common knowledge. Most engineers know submodules exist. Few know subtrees. Expect to teach.
- Merge confusion. Subtree merges can look strange in git log if you do not use --squash. Always use --squash unless you have a specific reason not to.
Approach 3: Internal package registries
Publish your shared code as a versioned package to an internal registry. Consumers depend on it like any third-party dependency.
This is what most modern teams do. The infrastructure exists for every major language:
- JavaScript/TypeScript. GitHub Packages, GitLab Package Registry, JFrog Artifactory, Verdaccio (self-hosted).
- Python. Private PyPI servers, JFrog Artifactory, AWS CodeArtifact, GitHub Packages.
- Java/Kotlin. Maven Central with internal mirroring, JFrog Artifactory, Sonatype Nexus.
- Rust. Private cargo registries, Artifactory.
- Go. Private Go module proxies (Athens), JFrog Artifactory.
The workflow:
- The auth-client repo is its own repo, with its own tests, CI, and release process.
- When it ships v1.4.0, CI publishes the package to the internal registry.
- Consumer repos depend on auth-client@^1.4.0 in their package manifest.
- Consumers upgrade on their own schedule.
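The "CI publishes on release" step usually hinges on turning a Git tag into a package version. As a minimal sketch, assuming releases are tagged vMAJOR.MINOR.PATCH as in the example above (version_from_tag is a hypothetical helper, and the actual publish command depends on your registry):

```shell
# Prints MAJOR.MINOR.PATCH for tags like v1.4.0; fails for anything else,
# so a stray tag such as "v1.4" or "test-build" never reaches the registry.
version_from_tag() {
  tag=$1
  if printf '%s\n' "$tag" |
      grep -Eq '^v(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)$'; then
    printf '%s\n' "${tag#v}"
  else
    echo "refusing to publish: '$tag' is not a vMAJOR.MINOR.PATCH tag" >&2
    return 1
  fi
}

# In the release job (the tag variable name varies by CI system):
#   version=$(version_from_tag "$RELEASE_TAG") || exit 1
#   ...then hand "$version" to your package manager's publish step.
```

Rejecting malformed tags up front keeps the registry's version list clean, which matters because published versions are usually hard or impossible to delete.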
What packages are good for:
- Almost everything. This is the default modern answer for most teams.
- Clear version semantics (semver).
- Standard tooling — your existing package manager already knows how to handle it.
- Each consumer chooses its own upgrade timing.
- IDE and lint tooling treats it like any other library — no special cases.
What packages are bad at:
- Coordinated changes that span repos. A breaking change in auth-client requires publishing a new major version, then upgrading each consumer one at a time. Atomic cross-repo commits are still impossible.
- Versioning chaos. With ten internal packages and twenty consumers, dependency resolution can get tangled. Set up Renovate or Dependabot to keep things sane.
- Release overhead. Every internal package needs a publish pipeline. For trivially shared code (one file, three lines), this can feel heavy. (For that case, copy-paste plus a comment may genuinely be simpler — be honest with yourself.)
A decision framework
| Situation | Try |
|---|---|
| Shared library, multiple consumers, you control both ends | Internal package |
| Vendoring an open-source library whose history you want intact | Submodule or subtree |
| Local patches to an upstream library, occasional upstream syncs | Subtree (with --squash) |
| Strict pinning required (security, regulated industry) | Submodule |
| You actually need atomic cross-project commits | Monorepo (not multi-repo) |
| One-off code share with no future upstream syncing planned | Copy-paste with a comment |
If atomic cross-repo commits are something you need regularly, multi-repo is the wrong shape. Consider moving to a monorepo. Post #5 in this series (Monorepo Git techniques) covers what that looks like.
Coordinating cross-repo changes that need to ship together
The hardest multi-repo problem is the change that must span repos atomically — a breaking API change in repo A that the frontend in repo B and mobile app in repo C must adopt at the same time. Three patterns help.
Pattern 1: Backwards-compatible rollout
Make the change in three steps:
- First, ship a release of repo A that supports both the old API and the new API. Both work; the new one is preferred.
- Update consumers in repos B and C to use the new API. Ship them at their own pace.
- Once all consumers are migrated, ship a second release of A that removes the old API.
This avoids the atomic-coordination problem entirely. It costs one extra release, but it lets each consumer move on its own schedule.
Most cross-repo breaking changes can be reshaped into this pattern. Library and SDK authors do this all the time — it is the standard discipline of "deprecate first, remove later."
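The "supports both" release can be very small. Here is a miniature version of the idea for a shared shell helper; auth_header and its --token= option are invented for illustration, and a real API would apply the same shape to endpoints or function signatures.

```shell
# Transitional release of a hypothetical shared helper.
# New interface:  auth_header --token=VALUE
# Old interface:  auth_header VALUE   (deprecated, removed in the next release)
auth_header() {
  case "${1:-}" in
    --token=*)
      token=${1#--token=}
      ;;
    "")
      echo "usage: auth_header --token=VALUE" >&2
      return 2
      ;;
    *)
      # Old call style still works, but tells the caller to migrate.
      echo "auth_header: positional token is deprecated, use --token=VALUE" >&2
      token=$1
      ;;
  esac
  printf 'Authorization: Bearer %s\n' "$token"
}
```

Both call styles produce the same output, so consumers can migrate one at a time. The warning on stderr doubles as a migration tracker: when it stops showing up in consumer logs, the old branch can be deleted.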
Pattern 2: Feature flags
The new API is implemented in repo A but gated behind a feature flag. Consumers in B and C update their code to use the new API, also flag-gated. On launch day, flags flip in all repos simultaneously.
This works when the flag system spans repos (a shared flag service that all three repos query). It is more operationally complex than backwards-compatibility but useful when backwards-compatible rollout is genuinely impossible.
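To make the shape concrete, here is a toy sketch in which a plain file stands in for the shared flag service. Everything named here is an assumption for illustration: the FLAGS_FILE variable, the new_auth_api flag, and the endpoint paths.

```shell
# Stand-in for a shared flag service: FLAGS_FILE holds lines like
# "new_auth_api=on". A real setup would query a service instead.
flag_enabled() {
  grep -qxF "$1=on" "${FLAGS_FILE:-/etc/flags.conf}" 2>/dev/null
}

# Code in each repo branches on the same flag name, so one flip
# switches every consumer at once.
login_endpoint() {
  if flag_enabled new_auth_api; then
    echo "/v2/login"
  else
    echo "/v1/login"
  fi
}
```

A missing file or a missing flag falls through to the old path, so the default is the safe one and rollback is just flipping the flag off again.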
Pattern 3: Coordinated release window
Plan a release window where all three repos ship together, with a runbook and a rollback plan. This is the "fourteen PRs, opened in a specific order" pattern from the opening of this post. It works, but it is expensive and error-prone. Save it for genuinely rare events.
The order of these patterns matters: try backwards-compatible rollout first, feature flags second, coordinated release window only when neither works. Most teams default to pattern 3 because it feels obvious, then suffer the consequences.
Common pitfalls and how to avoid them
Submodule pitfall: forgotten updates
The classic submodule failure: someone updates the submodule but forgets to commit the new pointer in the superproject. The next person clones, gets the old version, and is confused.
Fix: Add a CI check that fails if the commit pinned in the superproject is behind a known-good commit of the submodule. Do not count on your CI system having a built-in step for this; a short script is usually enough.
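One way to script such a check is git merge-base --is-ancestor, which succeeds when the first commit is an ancestor of (or the same as) the second. A minimal sketch, where pin_includes, the libs/auth-client path, and the KNOWN_GOOD_SHA variable are illustrative names rather than anything Git or your CI provides:

```shell
# Succeeds when <known-good> is an ancestor of (or equal to) <pinned>,
# i.e. the pin is at or ahead of the known-good commit.
# usage: pin_includes <submodule-dir> <pinned-sha> <known-good-sha>
pin_includes() {
  git -C "$1" merge-base --is-ancestor "$3" "$2"
}

# In CI, after `git submodule update --init`:
#   pinned=$(git rev-parse HEAD:libs/auth-client)   # commit recorded in the superproject
#   pin_includes libs/auth-client "$pinned" "$KNOWN_GOOD_SHA" || {
#     echo "auth-client pin is behind $KNOWN_GOOD_SHA" >&2
#     exit 1
#   }
```

The ancestry test needs enough history to connect the two commits, so a shallow submodule clone may have to be deepened before the check runs.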
Subtree pitfall: history pollution
Without --squash, subtree merges import every upstream commit into your repo. Your history fills with commits from a library you do not own.
Fix: Always use --squash for subtree add and pull. The trade-off is you lose granular history of upstream changes, but you keep your repo's history readable.
Package pitfall: version drift
Repo A depends on auth-client@^1.4.0. Repo B depends on auth-client@^1.7.0. A bug in auth-client@1.4.2 is fixed in auth-client@1.7.3. The range ^1.4.0 would accept 1.7.3, but Repo A's lockfile still resolves to 1.4.2 and nobody refreshes it. The bug ships from Repo A for months.
Fix: Run dependency-update tooling (Renovate, Dependabot, or equivalent). Define an upgrade SLA — "all internal packages updated within 30 days of a minor release." Audit periodically.
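The periodic audit can start as a very small script. The sketch below assumes you can already produce one "repo resolved-version" line per consumer (from lockfiles or your registry's usage data); version_lt and report_stale are hypothetical helper names, and only plain MAJOR.MINOR.PATCH versions are handled.

```shell
# Succeeds when version $1 is older than version $2 (MAJOR.MINOR.PATCH only).
version_lt() {
  IFS=. read -r a1 a2 a3 <<EOF
$1
EOF
  IFS=. read -r b1 b2 b3 <<EOF
$2
EOF
  [ "$a1" -lt "$b1" ] && return 0
  [ "$a1" -gt "$b1" ] && return 1
  [ "$a2" -lt "$b2" ] && return 0
  [ "$a2" -gt "$b2" ] && return 1
  [ "$a3" -lt "$b3" ]
}

# Reads "repo resolved-version" lines on stdin and prints every repo
# whose resolved version is older than the fixed version given as $1.
report_stale() {
  fixed=$1
  while read -r repo version; do
    if version_lt "$version" "$fixed"; then
      echo "$repo is on $version, fix landed in $fixed"
    fi
  done
}

# Example with the versions from the scenario above:
report_stale 1.7.3 <<'EOF'
repo-a 1.4.2
repo-b 1.7.3
EOF
# prints: repo-a is on 1.4.2, fix landed in 1.7.3
```

Comparing numerically field by field matters here: a plain string comparison would rank 1.10.0 below 1.7.3.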
Package pitfall: breaking changes without notice
A maintainer ships auth-client@2.0.0 with breaking changes. Consumers do not notice. Their build still works against ^1.x. The new feature they wanted is locked behind v2. Eventually one consumer upgrades and breaks production.
Fix: Maintain a changelog. Announce breaking changes in a team channel. Pin major versions explicitly in consumer manifests, then upgrade deliberately.
Internal package versioning conventions
If you adopt internal packages, the team needs versioning conventions. The most common:
Semantic versioning (semver). MAJOR.MINOR.PATCH. Increment MAJOR on breaking changes, MINOR on backwards-compatible features, PATCH on backwards-compatible fixes. semver.org is the spec. This is the dominant convention for libraries.
Pinning policy. Decide whether consumers pin to exact versions (auth-client@1.4.2), caret ranges (^1.4.0 — accepts compatible updates), or tilde ranges (~1.4.0 — accepts patch updates only). For internal packages, caret is usually the right default: it picks up bug fixes automatically while resisting accidental breaking changes.
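The difference between caret and tilde is easier to see when you can test a version against both. This is a simplified sketch of the two rules, not a replacement for your package manager's resolver: in_range is a hypothetical helper, it assumes MAJOR is at least 1 (many tools treat 0.x caret ranges more narrowly), and it ignores pre-release tags.

```shell
# usage: in_range caret|tilde BASE CANDIDATE   (versions are MAJOR.MINOR.PATCH)
# Simplified: assumes MAJOR >= 1 and no pre-release suffixes.
in_range() {
  kind=$1
  IFS=. read -r bmaj bmin bpat <<EOF
$2
EOF
  IFS=. read -r cmaj cmin cpat <<EOF
$3
EOF
  # Both range kinds require the same MAJOR and a candidate >= base.
  [ "$cmaj" -eq "$bmaj" ] || return 1
  if [ "$cmin" -lt "$bmin" ]; then return 1; fi
  if [ "$cmin" -eq "$bmin" ] && [ "$cpat" -lt "$bpat" ]; then return 1; fi
  case $kind in
    caret) return 0 ;;                  # any later MINOR or PATCH is accepted
    tilde) [ "$cmin" -eq "$bmin" ] ;;   # MINOR must stay the same
    *) echo "unknown range kind: $kind" >&2; return 2 ;;
  esac
}

in_range caret 1.4.0 1.7.3 && echo "^1.4.0 accepts 1.7.3"
in_range tilde 1.4.0 1.7.3 || echo "~1.4.0 rejects 1.7.3"
in_range caret 1.4.0 2.0.0 || echo "^1.4.0 rejects 2.0.0"
```

All three lines print, which is the policy in a nutshell: caret follows minor releases, tilde stays on one minor line, and neither crosses a major version.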
Release cadence. Define how often the package ships. "After every merge to main" is one option (best for fast-moving packages). "Weekly batches" is another (better when consumers do not want a flood of updates).
Changelog discipline. Every release gets a one-line changelog entry. Most projects use the Keep a Changelog format or Conventional Commits to generate changelogs automatically.
The combination of these conventions makes internal packages predictable. Without them, "what version are we on?" becomes a recurring 20-minute conversation.
Three scenarios: when each approach is right
Three short scenarios to illustrate where each approach fits.
Scenario A: shared UI components library. A team builds a design system used by 6 frontend apps across the company. The library has its own version cadence, its own maintainers, and consumers want clear semver guarantees so they can choose when to upgrade.
→ Internal package is the right answer. Publish to GitHub Packages or equivalent. Consumers pin to caret versions.
Scenario B: vendored open-source library with local patches. A backend team uses an open-source rate-limiter. They have two custom patches not yet accepted upstream. They want to occasionally pull upstream improvements.
→ Subtree is the right answer. Pull upstream changes with git subtree pull --squash. Apply patches as normal commits. When upstream accepts a patch, the next pull eliminates the local diff.
Scenario C: third-party library that must be auditable. A regulated industry team uses a cryptography library. The compliance team requires that the exact commit shipped to production is identifiable and verifiable.
→ Submodule is the right answer. The submodule pin is the audit artifact. Updates are explicit and reviewed.
Notice how the same kind of problem ("we use some external code") has three different right answers depending on the constraints. No approach is universally correct.
Common myths
Myth 1: "Submodules are deprecated." Wrong. Submodules are not deprecated. They are an explicit, well-supported part of Git. They are also not the right answer for most modern internal code-sharing — that does not make them deprecated. They are the right tool for narrow use cases (strict pinning, third-party vendoring).
Myth 2: "Subtrees are 'better submodules'." Wrong. Subtrees solve a different problem. Submodules pin to a specific commit you can update later. Subtrees copy code in. The choice depends on whether you want a pointer or a copy.
Myth 3: "If you have multiple repos, you should always use a monorepo instead." Wrong. Multi-repo is the right shape for strongly independent teams, open-source releases, and projects with very different toolchains. The pain shows up when teams that should coordinate are split across repos. Match the shape to the coordination need.
What to read next
- Monorepo Git techniques: sparse checkout, partial clones, and staying fast at scale — the alternative when coordination needs outweigh independence.
- Release branches and hotfix workflows that don't lose commits — how to manage versioned releases of an internal package.
- Scaling a Git workflow from solo to large team — when repo shape decisions become forcing functions.
- Choosing a Git workflow: a decision guide for real teams — the series overview.