Production Runbook: Vertex AI Embedding Migration
Overview
This runbook covers the migration from Kvant BGE-M3 (1024 dims) to Vertex AI gemini-embedding-001 (2000 dims) for all embedding operations.
Migration Type: Breaking change (requires maintenance window)
Estimated Duration: 30-60 minutes depending on data volume
Risk Level: Medium (data loss possible if not backed up)
Prerequisites
- [ ] Vertex AI credentials configured in production environment
- [ ] `GOOGLE_APPLICATION_CREDENTIALS` pointing to service account JSON
- [ ] Service account has `aiplatform.endpoints.predict` permission
- [ ] Maintenance window scheduled and communicated
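The credentials prerequisite can be checked locally before the maintenance window starts. The following is a minimal sketch, not part of the Swisper codebase: `check_credentials_file` is a hypothetical helper that only confirms the variable is set and points at a readable service account key. It does not verify the `aiplatform.endpoints.predict` permission, which has to be confirmed in IAM or through the connectivity test.

```python
import json
import os
from pathlib import Path


def check_credentials_file(env=None):
    """Return (ok, message) for the GOOGLE_APPLICATION_CREDENTIALS setting.

    Hypothetical helper: checks the file only, not IAM permissions.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return False, "GOOGLE_APPLICATION_CREDENTIALS is not set"
    file = Path(path)
    if not file.is_file():
        return False, f"credentials file not found: {path}"
    try:
        data = json.loads(file.read_text())
    except json.JSONDecodeError:
        return False, f"credentials file is not valid JSON: {path}"
    # Service account key files carry a "type" field set to "service_account".
    if not isinstance(data, dict) or data.get("type") != "service_account":
        return False, "credentials file is not a service account key"
    return True, f"service account: {data.get('client_email', 'unknown')}"
```

Calling `check_credentials_file()` with no arguments reads the current process environment, so it can be run inside the backend pod.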
Pre-Migration Checklist
1. Verify Current State
# Check current embedding dimensions
kubectl exec -it <backend-pod> -- python -c "
from sqlmodel import Session
from sqlalchemy import text
from swisper.core.db import engine

with Session(engine) as session:
    for table in ['facts', 'people', 'document_chunks', 'email_chunks', 'attachment_chunks', 'system_facts']:
        result = session.execute(text(f'''
            SELECT COUNT(*),
                   COUNT(embedding),
                   MIN(vector_dims(embedding)),
                   MAX(vector_dims(embedding))
            FROM {table}
        '''))
        row = result.fetchone()
        print(f'{table}: total={row[0]}, with_embedding={row[1]}, dims={row[2]}-{row[3]}')
"
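The four values printed per table can be read mechanically. The sketch below is a hypothetical helper (`classify_table_state` is not an existing Swisper function) that turns one result row into a status, assuming the pre-migration dimension is 1024 as stated in the overview. Note that `MIN` and `MAX` return NULL for a table with no embeddings, which shows up as `dims=None-None` in the output above.

```python
def classify_table_state(total, with_embedding, min_dims, max_dims, expected_dims=1024):
    """Classify one row of the pre-migration check (hypothetical helper).

    Returns one of: "empty", "mixed", "unexpected", "partial", "ok".
    """
    if with_embedding == 0:
        # No embeddings at all: MIN/MAX are NULL (None in Python).
        return "empty"
    if min_dims != max_dims:
        # Rows with different vector sizes coexist in the same column.
        return "mixed"
    if min_dims != expected_dims:
        # Uniform, but not the dimension the runbook assumes.
        return "unexpected"
    if with_embedding < total:
        # Some rows have no embedding yet.
        return "partial"
    return "ok"
```

Any status other than "ok", "partial", or "empty" is worth investigating before the schema migration runs.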
2. Backup Database
# Full database backup
kubectl exec -it <db-pod> -- pg_dump -U postgres -d app -F c -f /tmp/pre_embedding_migration_backup.dump
# Copy backup to safe location
kubectl cp <db-pod>:/tmp/pre_embedding_migration_backup.dump ./pre_embedding_migration_backup.dump
# Verify backup
ls -la ./pre_embedding_migration_backup.dump
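A directory listing only confirms that the file exists. As a slightly stronger check, the sketch below (a hypothetical helper, not part of the runbook tooling) rejects an empty file, confirms the archive starts with the `PGDMP` signature that pg_dump writes for custom-format (`-F c`) dumps, and returns a SHA-256 checksum that can be recorded and compared before any restore.

```python
import hashlib
from pathlib import Path

PG_CUSTOM_MAGIC = b"PGDMP"  # first bytes of a pg_dump custom-format archive


def backup_fingerprint(path, chunk_size=1024 * 1024):
    """Return (size_in_bytes, sha256_hex) after basic sanity checks."""
    file = Path(path)
    size = file.stat().st_size
    if size == 0:
        raise ValueError(f"backup is empty: {path}")
    digest = hashlib.sha256()
    with file.open("rb") as handle:
        if handle.read(len(PG_CUSTOM_MAGIC)) != PG_CUSTOM_MAGIC:
            raise ValueError(f"not a custom-format pg_dump archive: {path}")
        handle.seek(0)
        # Hash in chunks so large dumps are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return size, digest.hexdigest()
```

For example, `backup_fingerprint("./pre_embedding_migration_backup.dump")` can be run on the workstation that holds the copied file. This does not prove the dump is restorable; `pg_restore --list` on the file is a further check.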
3. Verify Vertex AI Connectivity
kubectl exec -it <backend-pod> -- python -c "
from swisper.gateways.llm.legacy.providers.vertex import LangChainVertex
import asyncio

async def test():
    adapter = LangChainVertex()
    embedding = await adapter.embed_text('test connectivity')
    print(f'Vertex AI OK - dimensions: {len(embedding)}')

asyncio.run(test())
"
Migration Steps
Step 1: Enable Maintenance Mode
# Scale down backend to prevent new writes
kubectl scale deployment helvetiq-backend --replicas=0
# Verify no active connections (the count includes this psql session, so expect 1)
kubectl exec -it <db-pod> -- psql -U postgres -d app -c "SELECT count(*) FROM pg_stat_activity WHERE datname = 'app';"
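Connections may take a few seconds to close after the deployment is scaled down, so a single query can give a misleading result. The sketch below shows one way to poll until the database is idle. It is an illustration only: `wait_for_idle` is a hypothetical helper, and `count_connections` is any callable supplied by the operator that returns the current count (for instance by running the query above). The default threshold of 1 allows for the session that runs the check itself.

```python
import time


def wait_for_idle(count_connections, max_allowed=1, attempts=10,
                  delay_seconds=3.0, sleep=time.sleep):
    """Poll until the connection count drops to max_allowed or below.

    Returns the attempt number on which the database was idle.
    Raises TimeoutError if it never becomes idle.
    """
    for attempt in range(1, attempts + 1):
        count = count_connections()
        if count <= max_allowed:
            return attempt
        if attempt < attempts:
            # Wait before the next poll, but not after the final one.
            sleep(delay_seconds)
    raise TimeoutError(
        f"still {count} connections after {attempts} attempts"
    )
```

The `sleep` parameter exists so the function can be exercised without real delays.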
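Step 2: Run Alembic Migration
# Note: Step 1 scaled helvetiq-backend to zero replicas, so <backend-pod> here must be a pod
# that is still running (for example, a one-off pod started from the backend image).
# The same applies to every later command that execs into <backend-pod> before Step 6.
# Apply schema migration (changes vector columns from 1024 to 2000)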
kubectl exec -it <backend-pod> -- alembic upgrade head
# Verify migration applied
kubectl exec -it <backend-pod> -- alembic current
# Expected: m13_migrate_embeddings_to_3072 (head)
Step 3: Update LLM Node Configuration
kubectl exec -it <db-pod> -- psql -U postgres -d app -c "
UPDATE llm_node_configuration
SET provider = 'vertex', model = 'gemini-embedding-001'
WHERE node_name = 'embedding';
"
# Verify
kubectl exec -it <db-pod> -- psql -U postgres -d app -c "
SELECT node_name, provider, model
FROM llm_node_configuration
WHERE node_name = 'embedding';
"
Step 4: Run Re-Embedding Script
# Dry run first (no changes)
kubectl exec -it <backend-pod> -- python -m swisper.scripts.migrate_embeddings_to_vertex --dry-run
# If dry run looks good, run actual migration
kubectl exec -it <backend-pod> -- python -m swisper.scripts.migrate_embeddings_to_vertex
# Monitor progress in logs
kubectl logs -f <backend-pod>
Recovery Options:
# If migration fails partway, resume from offset:
kubectl exec -it <backend-pod> -- python -m swisper.scripts.migrate_embeddings_to_vertex --table facts --offset 500
# Migrate single table:
kubectl exec -it <backend-pod> -- python -m swisper.scripts.migrate_embeddings_to_vertex --table people
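To make the resume option concrete, the sketch below illustrates how offset-based resumption can work in general. It is not the implementation of `migrate_embeddings_to_vertex`, whose internals are not shown in this runbook: `reembed`, the row layout, and `embed_batch` are all hypothetical. The idea is that rows are processed in a stable order, and when a batch fails the function reports the offset of the first unprocessed row so the next run can start there.

```python
def reembed(rows, embed_batch, batch_size=100, start_offset=0):
    """Re-embed rows[start_offset:] in batches (illustrative only).

    rows        -- list of dicts with a "text" key, in a stable order
    embed_batch -- callable taking a list of texts, returning a list of vectors
    Returns (next_offset, error). error is None when all rows were processed;
    otherwise next_offset is the value to pass as start_offset on the retry.
    """
    offset = start_offset
    while offset < len(rows):
        batch = rows[offset:offset + batch_size]
        try:
            vectors = embed_batch([row["text"] for row in batch])
        except Exception as error:
            # Nothing from this batch was written, so resume exactly here.
            return offset, error
        for row, vector in zip(batch, vectors):
            row["embedding"] = vector
        offset += len(batch)
    return offset, None
```

Resuming by offset is only safe when the row order is identical between runs (for example, ordering by primary key) and no rows are inserted or deleted in between, which is one reason the backend is scaled down during the migration.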
Step 5: Verify Migration
# Check embedding dimensions
kubectl exec -it <db-pod> -- psql -U postgres -d app -c "
SELECT 'facts' as table_name, COUNT(*), vector_dims(embedding)
FROM facts WHERE embedding IS NOT NULL GROUP BY vector_dims(embedding)
UNION ALL
SELECT 'people', COUNT(*), vector_dims(embedding)
FROM people WHERE embedding IS NOT NULL GROUP BY vector_dims(embedding);
"
# Expected output:
# table_name | count | vector_dims
# -----------+-------+-------------
# facts | XXX | 2000
# people | XXX | 2000
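If the re-embedding was interrupted, the query can return more than one row per table, one for each distinct dimension. The sketch below shows one way to flag that automatically from psql's default aligned output. Both functions are hypothetical helpers written for this illustration, and the expected dimension of 2000 is taken from the expected output above.

```python
def parse_dims_output(output):
    """Parse 'table | count | dims' rows from psql's aligned output.

    Header, separator, and footer lines are skipped because they do not
    have exactly three columns with numeric count and dims values.
    """
    rows = []
    for line in output.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) != 3 or not parts[1].isdigit() or not parts[2].isdigit():
            continue
        rows.append((parts[0], int(parts[1]), int(parts[2])))
    return rows


def find_stale_rows(rows, expected_dims=2000):
    """Return the (table, count, dims) rows whose dims differ from expected."""
    return [row for row in rows if row[2] != expected_dims]
```

An empty list from `find_stale_rows` means every embedded row in the checked tables has the new dimension. Tables not included in the query, such as the chunk tables, are not covered by this check.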
Step 6: Scale Backend Back Up
kubectl scale deployment helvetiq-backend --replicas=2
# Verify pods are healthy
kubectl get pods -l app=helvetiq-backend
# Show the first 50 log lines (no -f, so the command exits even if fewer lines exist)
kubectl logs <backend-pod> | head -50
Step 7: Smoke Test
- Open the application
- Test entity resolution: "Send email to David"
- Test semantic search: "What do I know about my family?"
- Test fact extraction: Say something about yourself
Rollback Procedure
If migration fails or causes issues:
Option A: Restore from Backup
# Scale down backend
kubectl scale deployment helvetiq-backend --replicas=0
# Restore database
kubectl exec -it <db-pod> -- pg_restore -U postgres -d app -c /tmp/pre_embedding_migration_backup.dump
# Revert LLM config
kubectl exec -it <db-pod> -- psql -U postgres -d app -c "
UPDATE llm_node_configuration
SET provider = 'kvant', model = 'inference-bge-m3'
WHERE node_name = 'embedding';
"
# Rollback Alembic
kubectl exec -it <backend-pod> -- alembic downgrade m12_role_to_string
# Scale up
kubectl scale deployment helvetiq-backend --replicas=2
Option B: Re-deploy Previous Version
# Rollback to previous deployment
kubectl rollout undo deployment/helvetiq-backend
# Restore database
kubectl exec -it <db-pod> -- pg_restore -U postgres -d app -c /tmp/pre_embedding_migration_backup.dump
Post-Migration
- [ ] Remove backup file after 7 days of stable operation
- [ ] Update monitoring dashboards for new embedding dimensions
- [ ] Document any issues encountered
Contacts
- On-call Engineer: [Name]
- Database Admin: [Name]
- Product Owner: [Name]
Appendix: Files Changed
| File | Change |
|---|---|
| `apps/backend/swisper/gateways/llm/legacy/providers/vertex.py` | Added gemini-embedding-001 implementation |
| `apps/backend/swisper/persistence/models/fact.py` | Changed Vector(1024) to Vector(2000) |
| `apps/backend/swisper/persistence/models/document.py` | Changed Vector(1024) to Vector(2000) |
| `apps/backend/swisper/alembic/versions/m13_migrate_embeddings_to_3072.py` | Schema migration |
| `apps/backend/swisper/scripts/migrate_embeddings_to_vertex.py` | Re-embedding script |
| `apps/backend/swisper/gateways/llm/adapter.py` | Updated comments |