Skip to content

TDR-006: Alembic Multi-Head on Main — Card Templates Drop and Stripe Payments Chains Never Merged

Description

apps/backend/swisper/alembic/ on origin/main currently has two unmerged migration heads. Running the canonical command alembic upgrade head (singular) fails with:

FAILED: Multiple head revisions are present for given argument 'head';
please specify a specific target revision, '<branchname>@head' to narrow
to a specific head, or 'heads' for all heads

Current heads on origin/main

                                          ┌── 20260413_drop_card_templates  (head)
... 20260308_add_data_objects_retracted_at ┘
... 20260408_cached_thoughts ── 20260414_payment_tables ── 20260417_purchase_correlation_id_and_user_default_pm ── 20260422_purchase_plaintext_columns  (head)
Head Revision file Down-revision Origin
20260413_drop_card_templates 20260413_drop_card_templates_table.py 20260308_add_data_objects_retracted_at Card-templates cleanup PR
20260422_purchase_plaintext_columns 20260422_purchase_plaintext_columns.py 20260417_purchase_correlation_id_and_user_default_pm End of Stripe payments chain (20260414_payment_tables20260417_…20260422_…)

The two chains diverge at unrelated ancestors (20260308_add_data_objects_retracted_at vs 20260408_cached_thoughts) and never re-converge. Neither PR's reviewer caught the divergence.

Why this exists

Two PRs added migrations in parallel without seeing each other:

  1. The card-templates cleanup PR added 20260413_drop_card_templates_table.py against the then-current head it picked (20260308_add_data_objects_retracted_at).
  2. The Stripe-payments PR series (#1400 and follow-ups) extended the 20260408_cached_thoughts chain with three new revisions (202604142026041720260422).

This violates the project's stated convention from CLAUDE.md:

Alembic — One Alembic migration per PR — consolidate instead of stacking

A merge migration was never created when the second of the two heads landed.

Separate but adjacent: apps/backend/config-service/app/alembic/versions/036_seed_stripe_price_ids.py uses INSERT … ON CONFLICT (name) DO NOTHING to seed payments.stripe_price.basic and payments.stripe_price.pro. On a fresh DB this works. On a pre-existing local DB whose configuration table was created before migration 036 was authored (any developer who ran the stack before April 22), this silently inserts zero rows because the rows happen to already exist with values that conflict — or never inserts because the table state at upgrade time is inconsistent with the migration's assumptions. We hit this during the rebase: backend startup failed with Configuration key 'payments.stripe_price.pro' not found in cache until we manually inserted the rows. This is a smaller adjacent debt to flag here so the resolution PR can fix both.

Impact

Developer experience

  • ⚠️ Stack bring-up requires non-canonical commands. New developers (and anyone rebuilding their local DB) cannot follow standard alembic workflow — alembic upgrade head fails. The workaround is alembic upgrade heads (plural), but that is not in any onboarding doc.
  • ⚠️ CI/test isolation may mask or amplify this. If CI uses fresh DBs every run, both heads apply cleanly. But any process that relies on alembic current returning a single revision (some tooling does) may misbehave.

Production deploys

  • ⚠️ Production migrations require explicit head naming. Whoever deploys must know to use alembic upgrade heads or specify both revisions. A scripted alembic upgrade head in a deploy pipeline would fail.
  • ⚠️ Future migrations are blocked from following convention. Any new migration added on top of either head leaves the other head still unmerged. The next author has to either pick a head (extending the divergence) or write the merge migration we should have written before. The longer it goes, the more the divergence ossifies.

Data correctness

  • No data correctness impact. Both heads have been applied independently in the local environment we tested and produce the expected schema. The graph is divergent but both branches are individually consistent.

Adjacent: configuration cache failure

  • ⚠️ Backend boots with a runtime error on stacks where 036 was a no-op. Configuration validation failed: payments.stripe_price.pro not found in cache on startup. Tests that exercise the API entrypoint fail until the rows are manually inserted into config_service.configuration.

Overall impact: Low-Medium — the system works once you know the workarounds, but every developer hits this and burns time figuring it out, and the convention is silently violated as a result.

Proposed Resolution

Step 1 — Merge the alembic heads (one PR, one commit)

Create a merge migration that joins both heads:

cd apps/backend
alembic merge -m "merge card_templates_drop and purchase_plaintext heads" \
  20260413_drop_card_templates 20260422_purchase_plaintext_columns

This generates a new revision file with both existing heads as down_revision (a tuple). The migration body should be empty (no schema changes — it is a graph-merge, not a content change).

After the merge, alembic heads returns a single head, and alembic upgrade head (singular) works again.

Step 2 — Fix the 036 seed to be idempotent on existing DBs

Either:

Option A (preferred): Replace INSERT … ON CONFLICT (name) DO NOTHING with INSERT … ON CONFLICT (name) DO UPDATE SET value = COALESCE(NULLIF(configuration.value, ''), EXCLUDED.value). This makes 036 self-healing — re-running it on a DB that has the rows but with empty values populates them; on a DB that doesn't have the rows, it inserts them.

Option B: Add a re-runnable seed script outside alembic (e.g., scripts/seed_required_config.py) that backend startup calls on first boot when keys are missing. Alembic stays for schema; seeds become a backend responsibility.

Option A is smaller and keeps the seed in alembic where the migration history is. Recommended.

Step 3 — Add a CI guard against future multi-head divergence

Add a CI step (in .github/workflows/) that runs:

heads=$(docker compose exec -T backend alembic heads | grep -c 'head')
[ "$heads" -le 1 ] || { echo "Multiple alembic heads detected — merge required."; exit 1; }

This catches the failure at PR time rather than at deploy time. Same check for apps/backend/config-service/app/alembic/.

Step 4 — Document the convention

Add a one-liner to apps/backend/swisper/alembic/README.md (create if absent):

If your branch's alembic heads shows more than one head after rebasing on main, run alembic merge -m "<message>" <head1> <head2> and commit the empty merge migration before opening the PR.

Effort

  • Step 1: 15 min (create merge migration, run, verify, PR).
  • Step 2: 30 min (rewrite 036 ON CONFLICT clause, manually verify against a stale DB, PR).
  • Step 3: 30 min (CI step + verification).
  • Step 4: 10 min.

Total: ~1.5 hours for the proper fix. The CI guard is the most valuable item — it makes this class of issue unrepresentable going forward.

Priority

Medium.

Why not High: - ✅ System functions once the workaround is known. - ✅ No data-correctness or security impact. - ✅ Production deploy can use alembic upgrade heads as a stopgap.

Why not Low: - ⚠️ Every new developer onboarding hits this. - ⚠️ Every alembic upgrade head in scripts/CI/runbooks is a latent failure. - ⚠️ Each new migration that extends just one head deepens the divergence and increases the eventual merge-migration burden. - ⚠️ The convention violation will repeat (no CI guard) until Step 3 ships.

When to escalate to High: - If a deploy fails because of alembic upgrade head being scripted somewhere. - If a third independent head appears (would mean two PRs both ignored the divergence).

  • Convention: CLAUDE.md"Alembic — One Alembic migration per PR — consolidate instead of stacking"
  • Heads in scope:
  • apps/backend/swisper/alembic/versions/20260413_drop_card_templates_table.py
  • apps/backend/swisper/alembic/versions/20260422_purchase_plaintext_columns.py
  • (chain) apps/backend/swisper/alembic/versions/20260414_add_payment_tables.py, 20260417_purchase_correlation_id_and_user_default_pm.py
  • Adjacent seed bug: apps/backend/config-service/app/alembic/versions/036_seed_stripe_price_ids.py
  • Config keys validated at backend startup: apps/backend/swisper/core/initialize_configuration.py (line ~79)
  • Originating PRs (best-guess; verify before resolving):
  • Card-templates drop: PR introducing 20260413_drop_card_templates_table.py
  • Stripe payments: #1400 (Feature/stripe integration) and follow-ups

Status Updates

  • 2026-04-26: Identified during voice-v3-ear-tools rebase onto main. Workaround applied locally (alembic upgrade heads + manual INSERT INTO config_service.configuration). No code changes shipped.