Summarization
The Summarization System compresses long conversations to keep Swisper fast and cost-efficient. When a conversation exceeds either of its configurable thresholds (20 messages or roughly 4,000 tokens), the system summarizes the older messages, keeps the most recent exchanges verbatim, and regenerates the chat title to reflect how the topic has evolved.
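As a rough illustration of the threshold check, the sketch below uses the two limits quoted above. The names `needs_summarization` and `estimate_tokens` are hypothetical, and the four-characters-per-token heuristic is an assumption made for the example, not Swisper's actual estimator.

```python
# Thresholds quoted in the text; in Swisper they are configurable.
MAX_MESSAGES = 20
MAX_TOKENS = 4_000


def estimate_tokens(text: str) -> int:
    """Crude token estimate (assumption: about 4 characters per token)."""
    return len(text) // 4


def needs_summarization(
    messages: list[str],
    max_messages: int = MAX_MESSAGES,
    max_tokens: int = MAX_TOKENS,
) -> bool:
    """Return True when either threshold is exceeded."""
    if len(messages) > max_messages:
        return True
    total_tokens = sum(estimate_tokens(m) for m in messages)
    return total_tokens > max_tokens
```

Checking the message count first keeps the common case cheap, since the token estimate only runs when the count alone does not trigger summarization.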
The system uses smart loading: when a summary exists, it loads only the summary and the last 4 messages from the database, which avoids reading the full history of a long conversation.
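The loading rule could look something like the following sketch, written against SQLite for the sake of a self-contained example. The `chats` and `messages` tables, their columns, and the `load_context` function are all hypothetical; Swisper's real schema and data layer may differ.

```python
import sqlite3

RECENT_MESSAGES = 4  # from the text: the last 4 messages are loaded verbatim


def load_context(conn: sqlite3.Connection, chat_id: int):
    """Return (summary, messages), reading as few rows as possible.

    Assumes hypothetical tables chats(id, summary) and
    messages(id, chat_id, content), with message ids increasing over time.
    """
    row = conn.execute(
        "SELECT summary FROM chats WHERE id = ?", (chat_id,)
    ).fetchone()
    summary = row[0] if row else None

    if summary is None:
        # No summary yet: fall back to the full history.
        rows = conn.execute(
            "SELECT content FROM messages WHERE chat_id = ? ORDER BY id",
            (chat_id,),
        ).fetchall()
        return None, [r[0] for r in rows]

    # Summary exists: fetch only the newest rows, then restore chronological order.
    rows = conn.execute(
        "SELECT content FROM messages WHERE chat_id = ? ORDER BY id DESC LIMIT ?",
        (chat_id, RECENT_MESSAGES),
    ).fetchall()
    return summary, [r[0] for r in reversed(rows)]
```

Pushing the limit into the query means the database returns at most four message rows, regardless of how long the conversation has grown.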
Key Components
| Component | Purpose |
|---|---|
| Summarization Check | Evaluates message count and token estimates against thresholds |
| Summarization Node | Generates summary via LLM and regenerates chat title (computation-only, no DB writes) |
| Smart Loading | Loads only summary + recent messages when a summary exists, avoiding full history reads |
| Message Persist | Writes summary and title to database atomically at end of turn |
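The split between a computation-only node and a separate persist step can be sketched as below. This is an illustration under stated assumptions, not Swisper's implementation: `SummaryResult`, `summarization_node`, `persist_result`, the prompt strings, and the `chats(id, summary, title)` table are hypothetical, and the LLM is modeled as a plain callable.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SummaryResult:
    summary: str
    title: str


def summarization_node(
    older_messages: list[str], llm: Callable[[str], str]
) -> SummaryResult:
    """Computation only: calls the LLM and returns values, with no DB access."""
    transcript = "\n".join(older_messages)
    summary = llm("Summarize this conversation:\n" + transcript)
    title = llm("Write a short chat title for:\n" + summary)
    return SummaryResult(summary=summary, title=title)


def persist_result(
    conn: sqlite3.Connection, chat_id: int, result: SummaryResult
) -> None:
    """Write summary and title in one transaction.

    Using the connection as a context manager commits both updates on
    success and rolls both back if either statement raises.
    """
    with conn:
        conn.execute(
            "UPDATE chats SET summary = ? WHERE id = ?", (result.summary, chat_id)
        )
        conn.execute(
            "UPDATE chats SET title = ? WHERE id = ?", (result.title, chat_id)
        )
```

Keeping the node free of writes means a failed or retried turn cannot leave a summary stored without its matching title; the only write happens once, at the end of the turn.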
Documentation Sections
- Overview — What this module does and who it serves
- Architecture — System design, components, and trade-offs