
Fact System — Architecture

This content was migrated from Documentation/fact_entity_preference_extraction.md and restructured into audience sections. Review for accuracy against the current codebase.

Context and Purpose

The Fact System exists to give Swisper persistent, personalized memory. Without it, every conversation would start from zero — the assistant would not know the user's name, family, allergies, or preferences.

The driving requirements behind its design are:

  • Automatic extraction — Users should not have to explicitly save information. The system learns from natural conversation
  • Entity-first attribution — Every fact must be correctly linked to the right person. Misattributed facts (storing a son's hobby under a colleague's name) are worse than missing facts
  • Non-blocking extraction — Fact extraction must not add latency to the user's response. It runs in the background while the response streams
  • Semantic retrieval — Facts must be retrievable by meaning, not just exact text. "What does Martin eat?" should surface dietary facts even if stored as "Martin is vegetarian"

Architecture Overview

The Fact System spans multiple nodes in the Global Supervisor graph and a set of backend services for persistence and retrieval.

```mermaid
flowchart TB
    subgraph Extraction["Extraction Pipeline"]
        MSG([User Message]) --> IC[Intent Classification]
        IC -->|has_extractable_facts| FE[Fact Extraction Node]
        IC -->|entities detected| ER[Entity Resolution Node]
        FE --> EM[Extraction Merge Node]
        ER --> EM
        EM -->|conflict detected| FCR[Fact Conflict Resolution]
        EM -->|no conflict| PERSIST
    end

    subgraph Retrieval["Retrieval Pipeline"]
        IC -->|needs_semantic_retrieval| SR[Semantic Retrieval]
        IC -->|temporal query| TR[Temporal Retrieval]
        SR --> MA[Memory Assembly]
        TR --> MA
    end

    subgraph Storage["Persistence Layer"]
        PERSIST[Fact Persistence Service]
        PERSIST --> PG[(PostgreSQL + pgvector)]
        PERSIST --> EMBED[Embedding Generation]
        EMBED -->|Vertex AI| PG
        MA --> CACHE[(Redis Cache)]
    end

    subgraph Output["To UI Response"]
        MA --> UI[UI Response Nodes]
        UI -->|facts woven into response| RESP([User Response])
    end
```

Flow summary: Intent classification determines which pipeline stages run. Fact extraction and entity resolution run in parallel, then merge. The merge step links facts to entities and checks for conflicts: conflicting facts are routed to Fact Conflict Resolution, and the rest are handed to the Fact Persistence Service. Retrieval pulls relevant facts via vector search and temporal queries, and Memory Assembly combines them for the UI response nodes.
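The stage selection described in the flow summary can be sketched as a small planning function. This is only an illustration under stated assumptions, not the actual graph wiring: `IntentFlags` and `plan_stages` are hypothetical names, as are the flags `entities_detected` and `is_temporal_query`. Only `has_extractable_facts`, `needs_semantic_retrieval`, and `has_preferences` are named in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntentFlags:
    """Hypothetical container for the flags set by intent classification."""
    has_extractable_facts: bool = False
    entities_detected: bool = False        # assumed name
    needs_semantic_retrieval: bool = False
    is_temporal_query: bool = False        # assumed name
    has_preferences: bool = False


def plan_stages(flags: IntentFlags) -> list[str]:
    """Return the pipeline stages that would run for one message."""
    stages: list[str] = []
    # Extraction side: the two nodes are independent, then merge.
    if flags.has_extractable_facts:
        stages.append("fact_extraction")
    if flags.entities_detected:
        stages.append("entity_resolution")
    if flags.has_extractable_facts or flags.entities_detected:
        stages.append("extraction_merge")
    # Retrieval side: either retriever feeds memory assembly.
    if flags.needs_semantic_retrieval:
        stages.append("semantic_retrieval")
    if flags.is_temporal_query:
        stages.append("temporal_retrieval")
    if flags.needs_semantic_retrieval or flags.is_temporal_query:
        stages.append("memory_assembly")
    if flags.has_preferences:
        stages.append("preference_extraction")
    return stages
```

A message that carries no flags yields an empty plan, so no fact-related stage runs for it.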

Component Responsibilities

| Component | Responsibility |
| --- | --- |
| Fact Extraction Node | LLM call to extract facts from the user message. Applies the persistence test (would the user want this remembered next week?) |
| Entity Resolution Node | Resolves entity mentions against the user's contact database. Detects ambiguity. Creates new entities when red-flag analysis confirms a new person |
| Extraction Merge Node | Links extracted facts to resolved entities via subject_entity_id. Detects fact conflicts. Filters facts about ambiguous/skipped entities |
| Fact Conflict Resolution Node | Routes conflicting facts (changed email, updated job) through HITL for user confirmation |
| Semantic Retrieval Node | Vector similarity search against stored fact embeddings. Returns top-K relevant facts |
| Temporal Retrieval Node | Time-based fact retrieval for date-anchored queries (birthdays, scheduled events, travel) |
| Memory Assembly Node | Merges semantic and temporal retrieval results into unified fact context |
| Fact Persistence Service | Stores facts to PostgreSQL with embedding generation, deduplication, and confidence scoring |
| Fact Retrieval Service | Orchestrates vector search with relevance thresholds and entity-scoped boosting |
| Preference Extraction Service | Extracts communication preferences from conversation. Conditional: runs only when the has_preferences flag is set |
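The retrieval behaviour listed above (top-K similarity search, a relevance threshold, and entity-scoped boosting) can be illustrated in plain Python. This is a sketch under assumptions, not the service's code: in the real system the search runs in PostgreSQL with pgvector, and `StoredFact`, `retrieve`, the threshold of 0.5, and the boost of 0.1 are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class StoredFact:
    """Hypothetical in-memory stand-in for a row in the facts table."""
    text: str
    embedding: Sequence[float]
    subject_entity_id: Optional[str] = None


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query, facts, *, top_k=5, min_score=0.5,
             focus_entity_id=None, entity_boost=0.1):
    """Return up to top_k (score, fact) pairs, best first."""
    scored = []
    for fact in facts:
        score = cosine_similarity(query, fact.embedding)
        if score < min_score:
            continue  # threshold is applied to raw similarity, before boosting
        if focus_entity_id is not None and fact.subject_entity_id == focus_entity_id:
            score += entity_boost  # favour facts about the entity in focus
        scored.append((score, fact))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Applying the threshold before the boost is a design choice in this sketch: it keeps an irrelevant fact from being surfaced only because it belongs to the entity in focus.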

Data Model

Fact Schema

| Field | Type | Purpose |
| --- | --- | --- |
| fact_id | UUID | Primary key |
| text | encrypted string | Fact content (max 200 chars), PGP-encrypted |
| type | FactType enum | Category: Schedule, Travel, Health, Relationship, Work, Hobby, etc. |
| embedding | vector(2000) | pgvector embedding for semantic search (Vertex AI gemini-embedding-001) |
| confidence | float | 0.0–1.0, boosted to 0.9 for critical types (Allergy, Medical, Health) |
| subject_entity_id | UUID (nullable) | Foreign key to persons table. NULL for user-about-self facts |
| relation_to_user | boolean | true = fact about the user; false = fact about an entity |
| scope | string | "avatar" (all workspaces) or "workspace" (workspace-specific) |
| durability | string | "durable", "long", "medium", "short" |
| sensitivity | string | "low", "medium", "high"; controls privacy mode filtering |
| temporal_marker | string | Raw time reference ("next Friday", "June 15th") |
| time_anchor | date | Resolved date for temporal queries |
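Two of the constraints in the schema, the 200-character limit on `text` and the confidence boost for critical types, can be sketched as a validation step. `FactDraft` is a hypothetical class, not the project's SQLModel model; `type` is a plain string here because the full `FactType` member list is not given, and "boosted to 0.9" is read as a floor, which is an assumption.

```python
from dataclasses import dataclass
from typing import Optional
from uuid import UUID

MAX_TEXT_CHARS = 200
CRITICAL_TYPES = frozenset({"Allergy", "Medical", "Health"})
CRITICAL_CONFIDENCE_FLOOR = 0.9  # assumed to act as a minimum, never a cap


@dataclass
class FactDraft:
    """Hypothetical pre-persistence view of a fact (no encryption, no embedding)."""
    text: str
    type: str
    confidence: float
    subject_entity_id: Optional[UUID] = None
    relation_to_user: bool = True

    def __post_init__(self) -> None:
        if len(self.text) > MAX_TEXT_CHARS:
            raise ValueError(f"fact text exceeds {MAX_TEXT_CHARS} characters")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")
        if self.type in CRITICAL_TYPES:
            # Raise low confidence to the floor; leave higher values untouched.
            self.confidence = max(self.confidence, CRITICAL_CONFIDENCE_FLOOR)
```

For example, `FactDraft("User is allergic to peanuts", "Allergy", 0.6)` ends up with a confidence of 0.9, while a `Hobby` fact with the same raw score keeps 0.6.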

Entity Schema (Persons Table)

| Field | Type | Purpose |
| --- | --- | --- |
| person_id | UUID | Primary key |
| display_name | encrypted string | Name (PGP-encrypted) |
| role_to_user | PersonRole enum | partner, child, parent, friend, colleague, pet, service_provider, other |
| aliases | string array | Alternative names and references |
| user_id, avatar_id | UUID | Ownership scoping |

Fact Attribution Model

| Attribution Type | subject_entity_id | relation_to_user | Example |
| --- | --- | --- | --- |
| User fact | NULL | true | "User is allergic to peanuts" |
| Entity fact | person UUID | false | "Leo loves pizza" |
| Orphan fact | NULL | false | Must be prevented; skipped during extraction |
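The attribution table maps directly onto a small classifier over the two columns. The sketch below uses hypothetical names (`Attribution`, `classify_attribution`, `is_storable`). The table does not cover the fourth combination (entity ID set and `relation_to_user` true), so it is labelled `UNDEFINED` here and treated as not storable, which is an assumption.

```python
from enum import Enum
from typing import Optional
from uuid import UUID


class Attribution(Enum):
    USER_FACT = "user_fact"
    ENTITY_FACT = "entity_fact"
    ORPHAN = "orphan"
    UNDEFINED = "undefined"  # combination not described in the table


def classify_attribution(subject_entity_id: Optional[UUID],
                         relation_to_user: bool) -> Attribution:
    """Classify a fact by its (subject_entity_id, relation_to_user) pair."""
    if subject_entity_id is None:
        return Attribution.USER_FACT if relation_to_user else Attribution.ORPHAN
    return Attribution.UNDEFINED if relation_to_user else Attribution.ENTITY_FACT


def is_storable(subject_entity_id: Optional[UUID], relation_to_user: bool) -> bool:
    """Only user facts and entity facts may be persisted; orphans are skipped."""
    kind = classify_attribution(subject_entity_id, relation_to_user)
    return kind in (Attribution.USER_FACT, Attribution.ENTITY_FACT)
```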

Key Design Decisions

1. Skip Ambiguous Entities Over Guessing

  • Chosen: When entity resolution is ambiguous (confidence < 0.8), skip both the entity and associated facts entirely
  • Rejected: Store with NULL entity; store with best-guess entity; create a new entity
  • Rationale: Orphan facts (NULL entity, not about user) are unusable and pollute the database. Wrong attribution is worse than missing data. The system loses some valid facts but prevents entity explosion and misattribution
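The skip rule can be sketched as a filter in the merge step. The 0.8 threshold comes from the decision above; everything else (`Resolution`, `ExtractedFact`, `link_or_skip`, and the idea that facts carry the entity mention they refer to) is a hypothetical structure chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

AMBIGUITY_THRESHOLD = 0.8  # below this, the resolution counts as ambiguous


@dataclass
class Resolution:
    mention: str
    person_id: Optional[str]
    confidence: float


@dataclass
class ExtractedFact:
    text: str
    mention: Optional[str]  # None means the fact is about the user


def link_or_skip(facts, resolutions):
    """Return (kept, skipped); kept holds (fact, person_id or None) pairs."""
    by_mention = {r.mention: r for r in resolutions}
    kept, skipped = [], []
    for fact in facts:
        if fact.mention is None:
            kept.append((fact, None))  # user fact: NULL entity is valid here
            continue
        res = by_mention.get(fact.mention)
        if res is None or res.person_id is None or res.confidence < AMBIGUITY_THRESHOLD:
            skipped.append(fact)  # never store with a guessed or NULL entity
        else:
            kept.append((fact, res.person_id))
    return kept, skipped
```

Returning the skipped facts separately, rather than dropping them silently, would make it possible to log or measure how many valid facts the rule costs.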

2. Background Extraction (Non-Blocking)

  • Chosen: Fact extraction runs as a background task, not blocking the response
  • Rejected: Synchronous extraction before response generation
  • Rationale: Extraction takes 3–4.5 seconds (LLM call + storage). Making users wait that long would be unacceptable. Current-turn information is available directly from the message; extracted facts are needed for future turns
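The non-blocking pattern can be illustrated with `asyncio`. This is a generic sketch of background execution, not the project's task runner: `extract_and_persist`, `stream_response`, and `handle_turn` are hypothetical, and the sleep stands in for the LLM call and storage.

```python
import asyncio
from typing import AsyncIterator


async def extract_and_persist(message: str, store: list[str]) -> None:
    """Stand-in for the slow extraction path (LLM call + storage)."""
    await asyncio.sleep(0.05)
    store.append(f"fact from: {message}")


async def stream_response(message: str) -> AsyncIterator[str]:
    """Stand-in for the streamed assistant reply."""
    for chunk in ("Got ", "it", "!"):
        await asyncio.sleep(0)  # yield control, as real streaming would
        yield chunk


async def handle_turn(message: str, store: list[str]):
    # Start extraction first, but do not await it before responding.
    task = asyncio.create_task(extract_and_persist(message, store))
    reply = "".join([chunk async for chunk in stream_response(message)])
    # Return the task so the caller keeps a reference until it completes.
    return reply, task
```

Keeping a reference to the task matters: the event loop holds only weak references to tasks, so a task that nothing else refers to can be garbage-collected before it finishes.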

3. Unified LLM Call for Facts + Entities + Topics

  • Chosen: Single LLM call extracts all three outputs
  • Rejected: Separate LLM calls for each extraction type
  • Rationale: A single call is more token-efficient and faster than three separate calls. The LLM has full context to make coherent entity-fact linking decisions within a single pass

4. Semantic Deduplication Over String Matching

  • Chosen: LLM-based entity resolution with red-flag detection and coherence analysis
  • Rejected: String matching, fuzzy matching, phonetic matching
  • Rationale: "Martin (son, 8 years old)" and "Martin (colleague, accountant)" share the same name but are clearly different people. Only semantic understanding can detect age-career contradictions and relationship impossibilities

Interfaces and Contracts

| Interface | Direction | Format | Consumer |
| --- | --- | --- | --- |
| Intent Classification → Fact Pipeline | Inbound | Optimization flags in GlobalSupervisorState | Fact Extraction, Entity Resolution nodes |
| Fact Persistence → PostgreSQL | Outbound | SQLModel ORM with pgvector | Facts table with embeddings |
| Fact Retrieval → Semantic Search | Bidirectional | Vector similarity query (cosine distance) | Memory Assembly node |
| Extraction Merge → Conflict Queue | Outbound | Redis key-value (chat_id → conflicts) | Fact Conflict Resolution node |
| Fact System → UI Response | Outbound | facts_by_entity dict in state | UI Response Nodes for prompt injection |
| Facts API → Settings UI | Bidirectional | REST API (/api/v1/users/{user_id}/facts) | Frontend fact management |
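The `facts_by_entity` contract is described only as a dict in state, so its exact shape is not known from this document. As one plausible reading, the sketch below groups fact texts under an entity name and renders them for prompt injection. The `"user"` key, the input dict fields, and both function names are assumptions.

```python
from collections import defaultdict

USER_KEY = "user"  # hypothetical key for user-about-self facts


def group_facts_by_entity(facts):
    """Group fact texts by entity name; facts without an entity go under USER_KEY."""
    grouped = defaultdict(list)
    for fact in facts:
        key = fact.get("entity_name") or USER_KEY
        grouped[key].append(fact["text"])
    return dict(grouped)


def render_prompt_block(facts_by_entity):
    """Render the grouped facts as plain text, entities in sorted order."""
    lines = []
    for entity in sorted(facts_by_entity):
        lines.append(f"{entity}:")
        lines.extend(f"- {text}" for text in facts_by_entity[entity])
    return "\n".join(lines)
```

Grouping by entity before rendering keeps each person's facts together in the prompt, which reflects the entity-first attribution goal stated earlier.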

Known Trade-offs and Debt

| Item | Impact | Remediation |
| --- | --- | --- |
| Fact decay is batch-only | A fact_decay_batch background job (swisper/jobs/) reduces fact relevance over time, but decay is not applied inline during retrieval, so old facts may still rank highly until the next batch run | Consider integrating decay scoring directly into the retrieval query for real-time relevance adjustment |
| Embedding model lock-in | Changing the embedding model (currently Vertex AI gemini-embedding-001, dimension 2000) requires re-embedding all stored facts | Maintain embedding model version metadata; build a migration pipeline for re-embedding |
| Preference persistence gap | Conversational preferences ("be brief") are session-only. Users must use Settings UI for permanent changes, which is not discoverable | Consider auto-promoting frequently repeated session preferences to permanent |
| Entity resolution is extraction-time only | Once an entity decision is made, it's not revisited. A wrong decision persists unless manually corrected | Add periodic entity reconciliation or allow users to merge/split entities via the Settings UI |
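The inline decay remediation could take the form of an age-based multiplier applied to the similarity score at query time. The sketch below is one possible shape, not the behaviour of `fact_decay_batch`: the exponential half-life model, the half-life values per durability level, and the availability of a creation date for each fact are all assumptions.

```python
from datetime import date

# Hypothetical half-lives in days; None means the fact never decays.
HALF_LIFE_DAYS = {"durable": None, "long": 365.0, "medium": 90.0, "short": 14.0}


def decayed_score(similarity: float, durability: str,
                  created: date, today: date) -> float:
    """Scale a similarity score by exponential decay over the fact's age."""
    half_life = HALF_LIFE_DAYS[durability]
    if half_life is None:
        return similarity
    age_days = max((today - created).days, 0)  # guard against future dates
    return similarity * 0.5 ** (age_days / half_life)
```

With these assumed values, a "short" fact loses half its score after 14 days, while a "durable" fact such as an allergy keeps its full score indefinitely.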