Production Runbook: Vertex AI Embedding Migration

Overview

This runbook covers the migration from Kvant BGE-M3 (1024 dims) to Vertex AI gemini-embedding-001 (2000 dims) for all embedding operations.

Migration Type: Breaking change (requires maintenance window)
Estimated Duration: 30-60 minutes depending on data volume
Risk Level: Medium (data loss possible if not backed up)

Prerequisites

[ ] Vertex AI credentials configured in production environment
[ ] GOOGLE_APPLICATION_CREDENTIALS pointing to service account JSON
[ ] Service account has aiplatform.endpoints.predict permission
[ ] Maintenance window scheduled and communicated

Pre-Migration Checklist

1. Verify Current State

# Check current embedding dimensions
kubectl exec -it <backend-pod> -- python -c "
from sqlmodel import Session
from sqlalchemy import text
from swisper.core.db import engine

with Session(engine) as session:
    for table in ['facts', 'people', 'document_chunks', 'email_chunks', 'attachment_chunks', 'system_facts']:
        result = session.execute(text(f'''
            SELECT COUNT(*),
                   COUNT(embedding),
                   MIN(vector_dims(embedding)),
                   MAX(vector_dims(embedding))
            FROM {table}
        '''))
        row = result.fetchone()
        print(f'{table}: total={row[0]}, with_embedding={row[1]}, dims={row[2]}-{row[3]}')
"

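The per-table line printed above can be read mechanically. The following is a minimal sketch, not part of the Swisper codebase: `classify_table` is a hypothetical helper that takes one row in the same shape as the query result (total, with_embedding, min dims, max dims) and reports whether the table matches the dimension you expect (1024 before the migration, 2000 after). The sample rows are illustrative, not real counts.

```python
from typing import Optional, Tuple

# Row shape matches the SELECT above: (total, with_embedding, min_dims, max_dims).
Row = Tuple[int, int, Optional[int], Optional[int]]


def classify_table(row: Row, expected_dims: int) -> str:
    """Return 'empty', 'mixed', 'unexpected' or 'ok' for one table's row."""
    _total, with_embedding, min_dims, max_dims = row
    if with_embedding == 0:
        # MIN/MAX come back as NULL (None) when no row has an embedding.
        return "empty"
    if min_dims != max_dims:
        # Rows with different vector sizes coexist in the table.
        return "mixed"
    if min_dims != expected_dims:
        return "unexpected"
    return "ok"


if __name__ == "__main__":
    # Illustrative rows only, not production data.
    samples = {
        "facts": (120, 118, 1024, 1024),
        "system_facts": (0, 0, None, None),
    }
    for table, row in samples.items():
        print(f"{table}: {classify_table(row, expected_dims=1024)}")
```

Any table reported as "mixed" or "unexpected" before the migration is worth investigating before the maintenance window starts.
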
2. Backup Database

# Full database backup
kubectl exec -it <db-pod> -- pg_dump -U postgres -d app -F c -f /tmp/pre_embedding_migration_backup.dump

# Copy backup to safe location
kubectl cp <db-pod>:/tmp/pre_embedding_migration_backup.dump ./pre_embedding_migration_backup.dump

# Verify backup
ls -la ./pre_embedding_migration_backup.dump
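
`ls -la` only confirms that a file exists and shows its size. To check that the copy was not truncated in transit, one option is to compare checksums. The sketch below computes a SHA-256 digest of the local file; it assumes you can produce a matching digest on the database pod (for example with `sha256sum`, if that tool is present in the image, which this runbook does not confirm). `sha256_of` is a hypothetical helper name.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Stream the file in chunks so large dumps do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    local_copy = Path("./pre_embedding_migration_backup.dump")
    if local_copy.exists():
        # Compare this value with the digest computed on the db pod.
        print(sha256_of(local_copy))
    else:
        print(f"{local_copy} not found")
```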

3. Verify Vertex AI Connectivity

kubectl exec -it <backend-pod> -- python -c "
from swisper.gateways.llm.legacy.providers.vertex import LangChainVertex
import asyncio

async def test():
    adapter = LangChainVertex()
    embedding = await adapter.embed_text('test connectivity')
    print(f'Vertex AI OK - dimensions: {len(embedding)}')

asyncio.run(test())
"

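The connectivity test above prints the dimension but does not fail if it is wrong. A stricter check could look like the sketch below. It assumes the target dimension is 2000, as stated in this runbook, and that `embed_text` returns a flat sequence of floats; `validate_embedding` is a hypothetical helper, not an existing Swisper function.

```python
import math
from typing import Sequence

EXPECTED_DIMS = 2000  # target dimension stated in this runbook


def validate_embedding(vector: Sequence[float], expected_dims: int = EXPECTED_DIMS) -> None:
    """Raise ValueError if the vector has the wrong size or unusable values."""
    if len(vector) != expected_dims:
        raise ValueError(f"expected {expected_dims} dimensions, got {len(vector)}")
    if not all(math.isfinite(value) for value in vector):
        raise ValueError("embedding contains NaN or infinite values")
    if not any(value != 0.0 for value in vector):
        raise ValueError("embedding is all zeros")


if __name__ == "__main__":
    # Synthetic vector standing in for a real API response.
    validate_embedding([0.01] * EXPECTED_DIMS)
    print("embedding looks valid")
```

Calling such a check on the result of `adapter.embed_text(...)` would turn a dimension mismatch into a hard failure before the maintenance window begins.
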
Migration Steps

Step 1: Enable Maintenance Mode

# Scale down backend to prevent new writes
kubectl scale deployment helvetiq-backend --replicas=0

# Verify no active connections (the psql session running this query is excluded; expect 0)
kubectl exec -it <db-pod> -- psql -U postgres -d app -c "SELECT count(*) FROM pg_stat_activity WHERE datname = 'app' AND pid <> pg_backend_pid();"

Step 2: Run Alembic Migration

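# Note: Step 1 left the deployment at 0 replicas, so <backend-pod> in this step and in Step 4
# must be a separately started pod with the backend image and database access.
# Apply schema migration (changes vector columns from 1024 to 2000)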
kubectl exec -it <backend-pod> -- alembic upgrade head

# Verify migration applied
kubectl exec -it <backend-pod> -- alembic current
# Expected: m13_migrate_embeddings_to_3072 (head)
# (the revision name mentions 3072; per this runbook the columns are resized to 2000)

Step 3: Update LLM Node Configuration

kubectl exec -it <db-pod> -- psql -U postgres -d app -c "
UPDATE llm_node_configuration
SET provider = 'vertex', model = 'gemini-embedding-001'
WHERE node_name = 'embedding';
"

# Verify
kubectl exec -it <db-pod> -- psql -U postgres -d app -c "
SELECT node_name, provider, model
FROM llm_node_configuration
WHERE node_name = 'embedding';
"

Step 4: Run Re-Embedding Script

# Dry run first (no changes)
kubectl exec -it <backend-pod> -- python -m swisper.scripts.migrate_embeddings_to_vertex --dry-run

# If dry run looks good, run actual migration
kubectl exec -it <backend-pod> -- python -m swisper.scripts.migrate_embeddings_to_vertex

# Monitor progress in logs
kubectl logs -f <backend-pod>

Recovery Options:

# If migration fails partway, resume from offset:
kubectl exec -it <backend-pod> -- python -m swisper.scripts.migrate_embeddings_to_vertex --table facts --offset 500

# Migrate single table:
kubectl exec -it <backend-pod> -- python -m swisper.scripts.migrate_embeddings_to_vertex --table people
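
How the script interprets `--offset` is not documented here. As a mental model only, offset-based resumption is commonly implemented by walking the rows in a stable order (for example by primary key) in fixed-size windows, so the last logged offset tells you where to restart. The sketch below shows that pattern; `batch_windows` and the batch size of 500 are assumptions for illustration, not the actual behavior of `migrate_embeddings_to_vertex`.

```python
from typing import Iterator, Tuple


def batch_windows(total_rows: int, batch_size: int, start_offset: int = 0) -> Iterator[Tuple[int, int]]:
    """Yield (offset, limit) pairs covering rows [start_offset, total_rows)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if start_offset < 0:
        raise ValueError("start_offset must not be negative")
    offset = start_offset
    while offset < total_rows:
        limit = min(batch_size, total_rows - offset)
        yield offset, limit
        offset += limit


if __name__ == "__main__":
    # Resuming a hypothetical 1200-row table after the first 500 rows were done.
    for offset, limit in batch_windows(total_rows=1200, batch_size=500, start_offset=500):
        print(f"re-embed rows at OFFSET {offset} LIMIT {limit}")
```

With this model, resuming is only safe if the row order is the same on every run and no rows were inserted or deleted in between, which holds while the backend is scaled down.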

Step 5: Verify Migration

# Check embedding dimensions
kubectl exec -it <db-pod> -- psql -U postgres -d app -c "
SELECT 'facts' as table_name, COUNT(*), vector_dims(embedding)
FROM facts WHERE embedding IS NOT NULL GROUP BY vector_dims(embedding)
UNION ALL
SELECT 'people', COUNT(*), vector_dims(embedding)
FROM people WHERE embedding IS NOT NULL GROUP BY vector_dims(embedding);
"

# Expected output:
#  table_name | count | vector_dims
# -----------+-------+-------------
#  facts     |   XXX |        2000
#  people    |   XXX |        2000
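
The query above covers only `facts` and `people`, while the pre-migration check lists six tables. The sketch below generates the same style of `UNION ALL` query for all six, so it can be pasted into `psql`. The table list is copied from the pre-migration check; `build_dims_query` is a hypothetical helper, and the column aliases are my own choice.

```python
from typing import List

# Table names taken from the pre-migration check in this runbook.
TABLES = [
    "facts",
    "people",
    "document_chunks",
    "email_chunks",
    "attachment_chunks",
    "system_facts",
]


def build_dims_query(tables: List[str]) -> str:
    """Build one UNION ALL query reporting row counts per embedding dimension."""
    if not tables:
        raise ValueError("at least one table is required")
    parts = []
    for table in tables:
        # Names are interpolated into SQL, so accept plain identifiers only.
        if not table.isidentifier():
            raise ValueError(f"unsafe table name: {table!r}")
        parts.append(
            f"SELECT '{table}' AS table_name, COUNT(*) AS row_count, "
            f"vector_dims(embedding) AS dims\n"
            f"FROM {table} WHERE embedding IS NOT NULL "
            f"GROUP BY vector_dims(embedding)"
        )
    return "\nUNION ALL\n".join(parts) + ";"


if __name__ == "__main__":
    print(build_dims_query(TABLES))
```

A table that has no embeddings produces no row in the result, so a missing table name in the output means "nothing to check" rather than an error.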

Step 6: Scale Backend Back Up

kubectl scale deployment helvetiq-backend --replicas=2

# Verify pods are healthy
kubectl get pods -l app=helvetiq-backend
kubectl logs <backend-pod> | head -50

Step 7: Smoke Test

1. Open the application
2. Test entity resolution: "Send email to David"
3. Test semantic search: "What do I know about my family?"
4. Test fact extraction: Say something about yourself

Rollback Procedure

If migration fails or causes issues:

Option A: Restore from Backup

# Scale down backend
kubectl scale deployment helvetiq-backend --replicas=0

# Restore database (if the dump is no longer in /tmp on the db pod, copy the local backup back with kubectl cp first)
kubectl exec -it <db-pod> -- pg_restore -U postgres -d app -c /tmp/pre_embedding_migration_backup.dump

# Revert LLM config
kubectl exec -it <db-pod> -- psql -U postgres -d app -c "
UPDATE llm_node_configuration
SET provider = 'kvant', model = 'inference-bge-m3'
WHERE node_name = 'embedding';
"

# Rollback Alembic
kubectl exec -it <backend-pod> -- alembic downgrade m12_role_to_string

# Scale up
kubectl scale deployment helvetiq-backend --replicas=2

Option B: Re-deploy Previous Version

# Rollback to previous deployment
kubectl rollout undo deployment/helvetiq-backend

# Restore database (if the dump is no longer in /tmp on the db pod, copy the local backup back with kubectl cp first)
kubectl exec -it <db-pod> -- pg_restore -U postgres -d app -c /tmp/pre_embedding_migration_backup.dump

Post-Migration

[ ] Remove backup file after 7 days of stable operation
[ ] Update monitoring dashboards for new embedding dimensions
[ ] Document any issues encountered

Contacts

On-call Engineer: [Name]
Database Admin: [Name]
Product Owner: [Name]

Appendix: Files Changed

File	Change
`apps/backend/swisper/gateways/llm/legacy/providers/vertex.py`	Added gemini-embedding-001 implementation
`apps/backend/swisper/persistence/models/fact.py`	Changed Vector(1024) to Vector(2000)
`apps/backend/swisper/persistence/models/document.py`	Changed Vector(1024) to Vector(2000)
`apps/backend/swisper/alembic/versions/m13_migrate_embeddings_to_3072.py`	Schema migration
`apps/backend/swisper/scripts/migrate_embeddings_to_vertex.py`	Re-embedding script
`apps/backend/swisper/gateways/llm/adapter.py`	Updated comments