Fact System — Architecture
This content was migrated from Documentation/fact_entity_preference_extraction.md and
restructured into audience sections. Review for accuracy against
the current codebase.
Context and Purpose
The Fact System exists to give Swisper persistent, personalized memory. Without it, every conversation would start from zero — the assistant would not know the user's name, family, allergies, or preferences.
The driving requirements behind its design are:
- Automatic extraction — Users should not have to explicitly save information. The system learns from natural conversation
- Entity-first attribution — Every fact must be correctly linked to the right person. Misattributed facts (storing a son's hobby under a colleague's name) are worse than missing facts
- Non-blocking extraction — Fact extraction must not add latency to the user's response. It runs in the background while the response streams
- Semantic retrieval — Facts must be retrievable by meaning, not just exact text. "What does Martin eat?" should surface dietary facts even if stored as "Martin is vegetarian"
Architecture Overview
The Fact System spans multiple nodes in the Global Supervisor graph and a set of backend services for persistence and retrieval.
flowchart TB
    subgraph Extraction["Extraction Pipeline"]
        MSG([User Message]) --> IC[Intent Classification]
        IC -->|has_extractable_facts| FE[Fact Extraction Node]
        IC -->|entities detected| ER[Entity Resolution Node]
        FE --> EM[Extraction Merge Node]
        ER --> EM
        EM -->|conflict detected| FCR[Fact Conflict Resolution]
        EM -->|no conflict| PERSIST
    end
    subgraph Retrieval["Retrieval Pipeline"]
        IC -->|needs_semantic_retrieval| SR[Semantic Retrieval]
        IC -->|temporal query| TR[Temporal Retrieval]
        SR --> MA[Memory Assembly]
        TR --> MA
    end
    subgraph Storage["Persistence Layer"]
        PERSIST[Fact Persistence Service]
        PERSIST --> PG[(PostgreSQL + pgvector)]
        PERSIST --> EMBED[Embedding Generation]
        EMBED -->|Vertex AI| PG
        MA --> CACHE[(Redis Cache)]
    end
    subgraph Output["To UI Response"]
        MA --> UI[UI Response Nodes]
        UI -->|facts woven into response| RESP([User Response])
    end
Flow summary: Intent classification determines which pipeline stages run. Fact extraction and entity resolution run in parallel, then merge. The merge step links facts to entities, detects conflicts, and persists. Retrieval pulls relevant facts via vector search and temporal queries, assembling them for the UI response nodes.
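The routing described in the flow summary can be sketched as a small function that maps classification flags to pipeline stages. This is only an illustration: `has_extractable_facts` and `needs_semantic_retrieval` appear in the diagram, while `entities_detected`, `is_temporal_query`, the `IntentFlags` class, and the stage names are hypothetical stand-ins for whatever GlobalSupervisorState actually carries.

```python
from dataclasses import dataclass


@dataclass
class IntentFlags:
    """Hypothetical subset of the flags set by intent classification."""
    has_extractable_facts: bool = False
    entities_detected: bool = False        # assumed name for the "entities detected" edge
    needs_semantic_retrieval: bool = False
    is_temporal_query: bool = False        # assumed name for the "temporal query" edge


def select_stages(flags: IntentFlags) -> list[str]:
    """Return the stages that should run for one user message."""
    stages: list[str] = []
    if flags.has_extractable_facts:
        stages.append("fact_extraction")
    if flags.entities_detected:
        stages.append("entity_resolution")
    # The merge step only has work to do if an extraction branch ran.
    if stages:
        stages.append("extraction_merge")
    if flags.needs_semantic_retrieval:
        stages.append("semantic_retrieval")
    if flags.is_temporal_query:
        stages.append("temporal_retrieval")
    # Memory assembly combines whichever retrieval branches ran.
    if flags.needs_semantic_retrieval or flags.is_temporal_query:
        stages.append("memory_assembly")
    return stages
```

A message with no flags set selects no stages, which matches the idea that intent classification gates all fact-related work.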
Component Responsibilities
| Component | Responsibility |
| --- | --- |
| Fact Extraction Node | LLM call to extract facts from the user message. Applies the persistence test (would the user want this remembered next week?) |
| Entity Resolution Node | Resolves entity mentions against the user's contact database. Detects ambiguity. Creates new entities when red-flag analysis confirms a new person |
| Extraction Merge Node | Links extracted facts to resolved entities via subject_entity_id. Detects fact conflicts. Filters facts about ambiguous/skipped entities |
| Fact Conflict Resolution Node | Routes conflicting facts (changed email, updated job) through HITL for user confirmation |
| Semantic Retrieval Node | Vector similarity search against stored fact embeddings. Returns top-K relevant facts |
| Temporal Retrieval Node | Time-based fact retrieval for date-anchored queries (birthdays, scheduled events, travel) |
| Memory Assembly Node | Merges semantic and temporal retrieval results into unified fact context |
| Fact Persistence Service | Stores facts to PostgreSQL with embedding generation, deduplication, and confidence scoring |
| Fact Retrieval Service | Orchestrates vector search with relevance thresholds and entity-scoped boosting |
| Preference Extraction Service | Extracts communication preferences from conversation. Conditional: only runs when the has_preferences flag is set |
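To make the top-K search with a relevance threshold concrete, here is a minimal pure-Python sketch of the ranking math. In the real system the search runs inside PostgreSQL via pgvector; the function names, the `k` default, and the `max_distance` cutoff below are illustrative assumptions, not values taken from the codebase.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance = 1 - cosine similarity (0.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def top_k_facts(
    query: list[float],
    fact_embeddings: dict[str, list[float]],
    k: int = 3,                 # assumed default
    max_distance: float = 0.5,  # assumed relevance threshold
) -> list[str]:
    """Return up to k fact texts, closest first, dropping weak matches."""
    scored = sorted(
        (cosine_distance(query, emb), text)
        for text, emb in fact_embeddings.items()
    )
    return [text for dist, text in scored if dist <= max_distance][:k]
```

Entity-scoped boosting is not modelled here; it would adjust the distance before the threshold is applied.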
Data Model
Fact Schema
| Field | Type | Purpose |
| --- | --- | --- |
| fact_id | UUID | Primary key |
| text | encrypted string | Fact content (max 200 chars), PGP-encrypted |
| type | FactType enum | Category: Schedule, Travel, Health, Relationship, Work, Hobby, etc. |
| embedding | vector(2000) | pgvector embedding for semantic search (Vertex AI gemini-embedding-001) |
| confidence | float | 0.0–1.0, boosted to 0.9 for critical types (Allergy, Medical, Health) |
| subject_entity_id | UUID (nullable) | Foreign key to persons table. NULL for user-about-self facts |
| relation_to_user | boolean | true = fact about the user; false = fact about an entity |
| scope | string | "avatar" (all workspaces) or "workspace" (workspace-specific) |
| durability | string | "durable", "long", "medium", "short" |
| sensitivity | string | "low", "medium", "high"; controls privacy mode filtering |
| temporal_marker | string | Raw time reference ("next Friday", "June 15th") |
| time_anchor | date | Resolved date for temporal queries |
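As a rough illustration of the schema, the fields above could be modelled as a plain dataclass. This is a sketch, not the actual SQLModel class: encryption and the embedding column are omitted, the default values are assumptions, and "boosted to 0.9" is interpreted here as raising confidence to at least 0.9.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional
from uuid import UUID, uuid4

CRITICAL_TYPES = {"Allergy", "Medical", "Health"}
CRITICAL_CONFIDENCE = 0.9
MAX_TEXT_LENGTH = 200


@dataclass
class Fact:
    text: str
    type: str                     # stands in for the FactType enum
    confidence: float
    relation_to_user: bool
    subject_entity_id: Optional[UUID] = None
    scope: str = "avatar"         # assumed default
    durability: str = "medium"    # assumed default
    sensitivity: str = "low"      # assumed default
    temporal_marker: Optional[str] = None
    time_anchor: Optional[date] = None
    fact_id: UUID = field(default_factory=uuid4)

    def __post_init__(self) -> None:
        if len(self.text) > MAX_TEXT_LENGTH:
            raise ValueError("fact text exceeds 200 characters")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within 0.0 and 1.0")
        # Assumption: the boost is a floor, so higher scores are kept.
        if self.type in CRITICAL_TYPES:
            self.confidence = max(self.confidence, CRITICAL_CONFIDENCE)
```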
Entity Schema (Persons Table)
| Field | Type | Purpose |
| --- | --- | --- |
| person_id | UUID | Primary key |
| display_name | encrypted string | Name (PGP-encrypted) |
| role_to_user | PersonRole enum | partner, child, parent, friend, colleague, pet, service_provider, other |
| aliases | string array | Alternative names and references |
| user_id, avatar_id | UUID | Ownership scoping |
Fact Attribution Model
| Attribution Type | subject_entity_id | relation_to_user | Example |
| --- | --- | --- | --- |
| User fact | NULL | true | "User is allergic to peanuts" |
| Entity fact | person UUID | false | "Leo loves pizza" |
| Orphan fact | NULL | false | Must be prevented; skipped during extraction |
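The attribution rules can be expressed as a small classifier over the two columns. The sketch below is illustrative only: the `Attribution` enum and function name are hypothetical, and because the table does not define the combination of a person UUID with relation_to_user = true, rejecting it is this sketch's own assumption.

```python
from enum import Enum
from typing import Optional
from uuid import UUID, uuid4


class Attribution(Enum):
    USER = "user"
    ENTITY = "entity"
    ORPHAN = "orphan"


def classify_attribution(
    subject_entity_id: Optional[UUID], relation_to_user: bool
) -> Attribution:
    """Map (subject_entity_id, relation_to_user) to an attribution type."""
    if subject_entity_id is None:
        # NULL entity: either a user fact or an orphan that must be skipped.
        return Attribution.USER if relation_to_user else Attribution.ORPHAN
    if relation_to_user:
        # Not covered by the attribution table; treated as invalid here.
        raise ValueError("entity-linked fact cannot also be a user fact")
    return Attribution.ENTITY


def should_persist(subject_entity_id: Optional[UUID], relation_to_user: bool) -> bool:
    """Orphan facts are dropped before they reach storage."""
    return classify_attribution(subject_entity_id, relation_to_user) is not Attribution.ORPHAN
```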
Key Design Decisions
1. Skip Ambiguous Entities Over Guessing
- Chosen: When entity resolution is ambiguous (confidence < 0.8), skip both the entity and associated facts entirely
- Rejected: Store with NULL entity; store with best-guess entity; create a new entity
- Rationale: Orphan facts (NULL entity, not about user) are unusable and pollute the database. Wrong attribution is worse than missing data. The system loses some valid facts but prevents entity explosion and misattribution
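A minimal sketch of this skip rule, assuming the merge step receives extracted facts keyed by the name they mention and resolved entities carrying a confidence score. The 0.8 threshold comes from the decision above; the class names, field names, and dictionary output shape are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

AMBIGUITY_THRESHOLD = 0.8  # below this, the entity counts as ambiguous


@dataclass
class ResolvedEntity:
    mention: str
    person_id: Optional[str]
    confidence: float


@dataclass
class ExtractedFact:
    text: str
    subject_mention: Optional[str]  # None means the fact is about the user


def link_facts(facts: list[ExtractedFact], entities: list[ResolvedEntity]) -> list[dict]:
    """Link facts to confidently resolved entities; drop everything else."""
    confident = {
        e.mention: e.person_id
        for e in entities
        if e.person_id is not None and e.confidence >= AMBIGUITY_THRESHOLD
    }
    linked: list[dict] = []
    for fact in facts:
        if fact.subject_mention is None:
            linked.append({"text": fact.text, "subject_entity_id": None, "relation_to_user": True})
        elif fact.subject_mention in confident:
            linked.append({
                "text": fact.text,
                "subject_entity_id": confident[fact.subject_mention],
                "relation_to_user": False,
            })
        # Facts about ambiguous or unresolved entities are skipped, never stored as orphans.
    return linked
```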
2. Non-Blocking Background Extraction
- Chosen: Fact extraction runs as a background task, not blocking the response
- Rejected: Synchronous extraction before response generation
- Rationale: Extraction takes 3–4.5 seconds (LLM call + storage). Making users wait that long would be unacceptable. Current-turn information is available directly from the message; extracted facts are needed for future turns
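One common way to get this behaviour in Python is to schedule extraction with `asyncio.create_task` and return the response without awaiting it. The sketch below assumes an asyncio-based service; the function names and the in-memory `store` are placeholders, and the short sleep stands in for the LLM call and storage.

```python
import asyncio


async def extract_and_persist(message: str, store: list[str]) -> None:
    """Placeholder for the slow path: LLM extraction plus storage."""
    await asyncio.sleep(0.05)  # stands in for the multi-second extraction
    store.append(f"fact from: {message}")


async def stream_response(message: str) -> str:
    """Placeholder for response generation."""
    await asyncio.sleep(0)
    return f"reply to: {message}"


async def handle_turn(message: str, store: list[str], background: set) -> str:
    # Schedule extraction without awaiting it, so the reply is not delayed.
    task = asyncio.create_task(extract_and_persist(message, store))
    # Keep a reference so the task is not garbage-collected before it finishes.
    background.add(task)
    task.add_done_callback(background.discard)
    return await stream_response(message)


async def demo() -> tuple[str, list[str], list[str]]:
    store: list[str] = []
    background: set = set()
    reply = await handle_turn("hi", store, background)
    before = list(store)                 # extraction has not finished yet
    await asyncio.gather(*background)    # wait for background work to drain
    return reply, before, store
```

The reply is produced while the store is still empty, which mirrors the rationale: the current turn does not need the extracted facts, only later turns do.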
3. Unified LLM Call for Facts + Entities + Topics
- Chosen: Single LLM call extracts all three outputs
- Rejected: Separate LLM calls for each extraction type
- Rationale: More token-efficient and faster. The LLM has full context to make coherent entity-fact linking decisions within a single pass
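To illustrate what a single-call contract might look like, the sketch below parses one JSON response into the three outputs. The JSON keys (`facts`, `entities`, `topics`), the `subject` field, and the `ExtractionResult` class are assumptions made for this example; the actual prompt and response schema are not described in this document.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ExtractionResult:
    facts: list[dict] = field(default_factory=list)
    entities: list[dict] = field(default_factory=list)
    topics: list[str] = field(default_factory=list)


def parse_unified_output(raw: str) -> ExtractionResult:
    """Split one LLM response into facts, entities, and topics."""
    payload = json.loads(raw)
    return ExtractionResult(
        facts=payload.get("facts", []),
        entities=payload.get("entities", []),
        topics=payload.get("topics", []),
    )


# Hypothetical response for "My son Leo loves pizza".
example = json.dumps({
    "facts": [{"text": "Leo loves pizza", "subject": "Leo"}],
    "entities": [{"mention": "Leo", "role_to_user": "child"}],
    "topics": ["food"],
})
result = parse_unified_output(example)
```

Because facts and entities arrive in the same payload, each fact can reference an entity mention that the same pass produced, which is the coherence benefit the rationale describes.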
4. Semantic Deduplication Over String Matching
- Chosen: LLM-based entity resolution with red-flag detection and coherence analysis
- Rejected: String matching, fuzzy matching, phonetic matching
- Rationale: "Martin (son, 8 years old)" and "Martin (colleague, accountant)" share the same name but are clearly different people. Only semantic understanding can detect age-career contradictions and relationship impossibilities
Interfaces and Contracts
| Interface | Direction | Format | Consumer |
| --- | --- | --- | --- |
| Intent Classification → Fact Pipeline | Inbound | Optimization flags in GlobalSupervisorState | Fact Extraction, Entity Resolution nodes |
| Fact Persistence → PostgreSQL | Outbound | SQLModel ORM with pgvector | Facts table with embeddings |
| Fact Retrieval → Semantic Search | Bidirectional | Vector similarity query (cosine distance) | Memory Assembly node |
| Extraction Merge → Conflict Queue | Outbound | Redis key-value (chat_id → conflicts) | Fact Conflict Resolution node |
| Fact System → UI Response | Outbound | facts_by_entity dict in state | UI Response Nodes for prompt injection |
| Facts API → Settings UI | Bidirectional | REST API (/api/v1/users/{user_id}/facts) | Frontend fact management |
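The facts_by_entity contract is only named above, so the following is a guess at its general shape: retrieved facts grouped under the entity they describe, ready for prompt injection. The `"user"` key for user-about-self facts, the input dictionary layout, and the name lookup are all hypothetical.

```python
from collections import defaultdict
from typing import Optional

USER_KEY = "user"  # hypothetical key for facts with a NULL subject_entity_id


def group_facts_by_entity(
    facts: list[dict], display_names: dict[str, str]
) -> dict[str, list[str]]:
    """Group fact texts by display name, falling back to the raw entity id."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for fact in facts:
        entity_id: Optional[str] = fact["subject_entity_id"]
        key = USER_KEY if entity_id is None else display_names.get(entity_id, entity_id)
        grouped[key].append(fact["text"])
    return dict(grouped)


facts_by_entity = group_facts_by_entity(
    [
        {"text": "User is allergic to peanuts", "subject_entity_id": None},
        {"text": "Leo loves pizza", "subject_entity_id": "p1"},
    ],
    {"p1": "Leo"},
)
```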
Known Trade-offs and Debt
| Item | Impact | Remediation |
| --- | --- | --- |
| Fact decay is batch-only | A fact_decay_batch background job (swisper/jobs/) reduces fact relevance over time, but decay is not applied inline during retrieval, so old facts may still rank highly until the next batch run | Consider integrating decay scoring directly into the retrieval query for real-time relevance adjustment |
| Embedding model lock-in | Changing the embedding model (currently Vertex AI gemini-embedding-001, dimension 2000) requires re-embedding all stored facts | Maintain embedding model version metadata; build a migration pipeline for re-embedding |
| Preference persistence gap | Conversational preferences ("be brief") are session-only. Users must use Settings UI for permanent changes, which is not discoverable | Consider auto-promoting frequently repeated session preferences to permanent |
| Entity resolution is extraction-time only | Once an entity decision is made, it's not revisited. A wrong decision persists unless manually corrected | Add periodic entity reconciliation or allow users to merge/split entities via the Settings UI |
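As one possible shape for the inline decay remediation in the first row, retrieval could multiply each similarity score by an age-based factor. The sketch below uses exponential half-life decay keyed on the durability field; the half-life values and function name are invented for illustration and do not reflect how fact_decay_batch currently works.

```python
from datetime import date
from typing import Optional

# Hypothetical half-lives in days; None means the fact never decays.
HALF_LIFE_DAYS: dict[str, Optional[float]] = {
    "durable": None,
    "long": 365.0,
    "medium": 90.0,
    "short": 14.0,
}


def decayed_score(similarity: float, durability: str, stored_on: date, today: date) -> float:
    """Scale a similarity score by exponential decay based on fact age."""
    half_life = HALF_LIFE_DAYS[durability]
    if half_life is None:
        return similarity
    age_days = max((today - stored_on).days, 0)
    # After one half-life the score is halved, after two it is quartered.
    return similarity * 0.5 ** (age_days / half_life)
```

With these assumed values, a "short" fact stored 14 days ago keeps half of its similarity score, while a "durable" fact keeps all of it.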