ADR-009: Prompt Architecture Simplification¶
Status: Accepted
Date: 2025-11-12
Deciders: Swisper Team, SwisperStudio Team
Context: Prompt Studio Feature (SP-5, SP-6, SP-7)
Context and Problem Statement¶
During implementation of Prompt Studio (experimentation workspace), we discovered complexity in managing prompts with:
1. Multi-LLM nodes: Nodes like memory_node that make 2-3 LLM calls internally
2. Prompt fragments: Templates split into core_fragment.md + variant_*.md requiring assembly
Questions:
- How do we experiment with individual LLM calls within a multi-LLM node?
- How do we manage prompt fragments (core + variant) in the UI?
- How do we maintain clean state and template relationships?
Decision Drivers¶
Technical Drivers:¶
- Experimentation: Users need to test and iterate on individual prompts
- Observability: Each LLM call should be separately visible and traceable
- Cost tracking: Granular token/cost attribution per LLM call
- Template management: Simple, understandable prompt structure
UX Drivers:¶
- Clarity: User should understand what each node does
- Testing: Ability to experiment with specific prompts
- Debugging: When something fails, pinpoint which LLM call
- State visibility: Clear input/output contracts per node
Maintenance Drivers:¶
- Simplicity: Easier for prompt engineers to manage
- Versioning: Clear git history per template
- Reusability: Nodes can be composed and reused
- Consistency: One pattern for all nodes
Considered Options¶
Option 1: Keep Complex Architecture¶
- Multiple LLM calls per node
- Prompt fragments (core + variant assembly)
- Complex state management
- Nested observations or sub-calls array
Option 2: Simplify to Atomic Nodes¶
- One LLM call per node
- One template per node (no fragments)
- Each node does ONE thing
- Clear 1:1 relationships
Option 3: Hybrid Approach¶
- User-facing nodes are atomic
- Internal utility nodes can be complex
- Mixed patterns
Decision Outcome¶
Chosen option: "Option 2: Simplify to Atomic Nodes"
Rationale:
The principle of one LLM call per node and one template per node provides:
1. Clear separation of concerns
2. Better experimentability
3. Simpler mental model
4. Cleaner code
This aligns with LangGraph best practices and makes the system more maintainable.
Architectural Principles¶
Principle 1: One LLM Call Per Node¶
Rule: Each LangGraph node makes at most ONE LLM API call (exactly one, or zero for nodes that do no LLM work).
Before (memory_node):
async def memory_node(state):
    # LLM Call 1: Extract facts
    facts = await llm.call(prompt="Extract facts...")
    # LLM Call 2: Classify facts
    classified = await llm.call(prompt="Classify...")
    # LLM Call 3: Summarize
    summary = await llm.call(prompt="Summarize...")
    return state
After (atomic nodes):
async def extract_facts_node(state):
    facts = await llm.call(prompt="Extract facts...")
    state["extracted_facts"] = facts
    return state

async def classify_facts_node(state):
    classified = await llm.call(prompt="Classify {extracted_facts}...")
    state["classified_facts"] = classified
    return state

async def summarize_facts_node(state):
    summary = await llm.call(prompt="Summarize {classified_facts}...")
    state["fact_summary"] = summary
    return state

# Graph edges:
graph.add_edge("extract_facts_node", "classify_facts_node")
graph.add_edge("classify_facts_node", "summarize_facts_node")
Impact:
- ✅ Each node has ONE clear purpose
- ✅ Each LLM call is separately observable
- ✅ Each can be experimented with independently
- ✅ Clear cost attribution per call
- ⚠️ More nodes in graph (3 instead of 1)
- ⚠️ State has more intermediate fields
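The atomic pattern above can be exercised without LangGraph or a real model. The following is a minimal sketch, assuming a hypothetical `fake_llm` stub in place of the real client and a plain sequential runner (`run_pipeline`) in place of graph edges; neither name comes from the Swisper codebase.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

State = Dict[str, str]
Node = Callable[[State], Awaitable[State]]

calls: List[str] = []  # every prompt sent to the stub, kept for inspection


async def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the real LLM client; wraps the prompt in <>."""
    calls.append(prompt)
    return f"<{prompt}>"


async def extract_facts_node(state: State) -> State:
    facts = await fake_llm(f"Extract facts: {state['user_message']}")
    return {**state, "extracted_facts": facts}


async def classify_facts_node(state: State) -> State:
    classified = await fake_llm(f"Classify: {state['extracted_facts']}")
    return {**state, "classified_facts": classified}


async def summarize_facts_node(state: State) -> State:
    summary = await fake_llm(f"Summarize: {state['classified_facts']}")
    return {**state, "fact_summary": summary}


async def run_pipeline(nodes: List[Node], state: State) -> State:
    """Run nodes in order, mimicking a linear chain of graph edges."""
    for node in nodes:
        state = await node(state)
    return state


PIPELINE = [extract_facts_node, classify_facts_node, summarize_facts_node]
result = asyncio.run(run_pipeline(PIPELINE, {"user_message": "I moved to Zurich"}))
```

Because each node makes a single call, any one of them can also be invoked on its own with a hand-built state, which is the property Prompt Studio relies on for experiments.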
Principle 2: One Template Per Node (No Fragments)¶
Rule: Each node has ONE complete template file (no assembly required).
Before (fragment assembly):
prompts/
├── intent_classification/
│   ├── core_fragment.md      # Base prompt
│   ├── variant_simple.md     # Additional rules
│   └── variant_complex.md    # Different rules
└── builder.py                → Assembles fragments at runtime
After (single templates):
prompts/
├── intent_classification_simple.md # Complete prompt for simple variant
├── intent_classification_complex.md # Complete prompt for complex variant
└── No builder needed - routing picks the right template
Implementation:
- Use routing logic to decide which node/template to use
- Each template is self-contained and complete
- No runtime assembly required

Impact:
- ✅ Templates are easier to read and understand
- ✅ No assembly logic needed in Prompt Studio
- ✅ Git history is clearer (changes are obvious)
- ✅ Prompt engineers see complete context
- ⚠️ Some duplication (shared text appears in multiple templates)
- ⚠️ More template files to maintain
Trade-off Accepted: Clarity and maintainability outweigh the DRY principle here.
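To illustrate how routing can replace fragment assembly, here is a small sketch. The template texts, the `is_complex` heuristic, and the function names are all hypothetical; the real routing criteria are not specified in this ADR.

```python
from typing import Dict

# Hypothetical in-memory stand-ins for the two complete template files.
TEMPLATES: Dict[str, str] = {
    "intent_classification_simple": "Classify the intent of a short request.",
    "intent_classification_complex": "Classify the intent of a multi-step request.",
}


def is_complex(user_message: str) -> bool:
    """Toy heuristic (assumption): long or multi-sentence messages are complex."""
    return len(user_message.split()) > 12 or user_message.count(".") > 1


def route_template(user_message: str) -> str:
    """Return the name of the single complete template to use."""
    variant = "complex" if is_complex(user_message) else "simple"
    return f"intent_classification_{variant}"


def load_prompt(user_message: str) -> str:
    """Look up the routed template; no fragment assembly happens here."""
    return TEMPLATES[route_template(user_message)]
```

The only decision made at runtime is which complete template to load, so what a prompt engineer reads in the file is what the model receives, apart from placeholder substitution.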
Positive Consequences¶
For Experimentation:¶
- ✅ Each LLM call can be tested independently in Prompt Studio
- ✅ Clear 1:1 mapping: 1 observation = 1 experiment = 1 template
- ✅ Users can experiment with specific prompts without running entire workflow
For Debugging:¶
- ✅ When an error occurs, it is clear exactly which LLM call failed
- ✅ Token usage and cost clearly attributed to specific calls
- ✅ Latency bottlenecks easily identified
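As a rough sketch of per-call attribution, the wrapper below records one observation per LLM call. The `Observation` fields, the per-token prices, and the word-count "tokenizer" are illustrative assumptions, not the SwisperStudio trace schema.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Observation:
    node: str
    prompt_tokens: int
    completion_tokens: int
    latency_s: float
    cost_usd: float


# Hypothetical per-token prices; real prices depend on the model and provider.
PRICE_PER_PROMPT_TOKEN = 0.000001
PRICE_PER_COMPLETION_TOKEN = 0.000002


def observe(node: str, llm: Callable[[str], str], prompt: str,
            log: List[Observation]) -> str:
    """Make one LLM call and append exactly one observation for it."""
    start = time.perf_counter()
    output = llm(prompt)
    latency = time.perf_counter() - start
    # Whitespace word counts stand in for real tokenizer counts.
    p_tok, c_tok = len(prompt.split()), len(output.split())
    cost = p_tok * PRICE_PER_PROMPT_TOKEN + c_tok * PRICE_PER_COMPLETION_TOKEN
    log.append(Observation(node, p_tok, c_tok, latency, cost))
    return output


def cost_by_node(log: List[Observation]) -> Dict[str, float]:
    """Sum cost per node name across all recorded observations."""
    totals: Dict[str, float] = {}
    for obs in log:
        totals[obs.node] = totals.get(obs.node, 0.0) + obs.cost_usd
    return totals
```

With atomic nodes, each node name appears once per run in the log, so a failed call, a cost spike, or a slow step maps directly to a single node.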
For Prompt Engineering:¶
- ✅ Templates are complete and self-contained (no mental assembly)
- ✅ Changes are obvious in Git diffs
- ✅ Can test templates in isolation
- ✅ No fragment dependency management
For Development:¶
- ✅ Simpler code (no assembly service needed)
- ✅ Easier to test (atomic nodes)
- ✅ Better separation of concerns
- ✅ Reusable nodes (composable)
Negative Consequences¶
State Management:¶
- ⚠️ State object grows larger with intermediate fields
- ⚠️ Need clear naming conventions for temporary vs persistent state
- ⚠️ More fields to document and understand
Mitigation:
- Use clear prefixes: temp_, intermediate_, final_
- Document state schema clearly
- Consider state cleanup strategy
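One possible cleanup strategy built on the prefix convention is sketched below. Treating both `temp_` and `intermediate_` fields as disposable at the end of a run is an assumption of this example; the actual cleanup policy is still an open follow-up decision.

```python
from typing import Any, Dict, Tuple

# Prefixes taken from the mitigation list above; which ones are safe to drop
# is an assumption of this sketch.
DISPOSABLE_PREFIXES: Tuple[str, ...] = ("temp_", "intermediate_")


def cleanup_state(state: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the state without disposable intermediate fields."""
    return {
        key: value
        for key, value in state.items()
        if not key.startswith(DISPOSABLE_PREFIXES)
    }
```

Running this as a final node would keep the persisted state small while leaving intermediate fields visible to every node that executes before it.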
Graph Complexity:¶
- ⚠️ More nodes in LangGraph (3x-5x in some cases)
- ⚠️ More edges and routing logic
- ⚠️ Graph visualization becomes busier
Mitigation:
- Use clear naming conventions
- Color coding in visualizations
- Grouping/collapsing related nodes in UI
- Good documentation
Template Management:¶
- ⚠️ More template files (2-3x increase)
- ⚠️ Some content duplication across variants
- ⚠️ Changes to shared text require updating multiple files
Mitigation:
- Accept duplication for clarity
- Use prompt versioning
- Establish naming conventions
- Regular prompt review cycles
Implementation Requirements¶
For Swisper SDK/Backend:¶
Required Changes:
1. ✅ Refactor multi-LLM nodes into atomic nodes
2. ✅ Create separate template for each variant
3. ✅ Update routing logic to pick templates
4. ✅ Ensure each node creates ONE observation per LLM call
Examples:
- memory_node → extract_facts_node, classify_facts_node, summarize_facts_node
- ui_node_simple + ui_node_complex → Keep as separate complete templates
- intent_classification → Already follows this (one LLM call)
Timeline: Gradual migration; old multi-LLM nodes and new atomic nodes can coexist during the transition
For SwisperStudio:¶
Already Implemented:
1. ✅ Experiment workspace assumes one template per node
2. ✅ No fragment assembly logic
3. ✅ Direct template → experiment mapping
4. ✅ Smart placeholder mapping (state → template placeholders)
No Changes Needed: SwisperStudio is ready for this architecture
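The placeholder mapping mentioned above could work along the following lines. This is only a sketch: the `render` function, the `aliases` parameter, and the lower-casing fallback are assumptions for illustration and do not describe SwisperStudio's actual mapping logic.

```python
import re
from typing import Dict, Mapping, Optional

# Matches placeholders such as {{USER_MESSAGE}}.
PLACEHOLDER = re.compile(r"\{\{([A-Z0-9_]+)\}\}")


def render(template: str, state: Mapping[str, str],
           aliases: Optional[Dict[str, str]] = None) -> str:
    """Fill {{NAME}} placeholders from state.

    A placeholder resolves to the state key named in `aliases`, or otherwise
    to its own lower-cased name. Unresolved placeholders raise KeyError.
    """
    aliases = aliases or {}

    def replace(match: re.Match) -> str:
        name = match.group(1)
        key = aliases.get(name, name.lower())
        return str(state[key])

    return PLACEHOLDER.sub(replace, template)
```

For example, `{{USER_MESSAGE}}` resolves to `state["user_message"]` by default, while an alias such as `{"CONVERSATION": "conversation_history"}` covers placeholders whose names differ from the state field.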
Validation¶
Success Metrics:¶
Observability:
- ✅ Each LLM call visible as separate observation in traces
- ✅ Cost and token usage attributable per call
- ✅ Can experiment with any individual prompt

Maintainability:
- ✅ Prompt engineers can understand templates without assembly logic
- ✅ Git diffs are clear and meaningful
- ✅ Time to update prompts decreases

Performance:
- ⚠️ Graph overhead < 5% of total execution time
- ✅ No regression in end-to-end latency

Adoption:
- ✅ Swisper team successfully refactors existing nodes
- ✅ New nodes follow atomic pattern
- ✅ Documentation updated
Review Date: 2025-12-12 (1 month after implementation)
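The performance target can be checked with simple arithmetic. The sketch below assumes that "graph overhead" means end-to-end time minus the time spent inside node execution; the ADR does not define the term, so treat that definition and the function names as hypothetical.

```python
from typing import Sequence


def graph_overhead_fraction(total_s: float,
                            node_latencies_s: Sequence[float]) -> float:
    """Share of end-to-end time not spent inside node execution."""
    if total_s <= 0:
        raise ValueError("total_s must be positive")
    overhead = max(total_s - sum(node_latencies_s), 0.0)
    return overhead / total_s


def within_budget(total_s: float, node_latencies_s: Sequence[float],
                  budget: float = 0.05) -> bool:
    """True when overhead stays under the 5% target from the metric above."""
    return graph_overhead_fraction(total_s, node_latencies_s) < budget
```

For instance, a 10 s run whose three nodes take 3.0 s, 3.0 s, and 3.8 s leaves 0.2 s of overhead, or about 2% of the total, which is inside the budget.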
Migration Strategy¶
Phase 1: New Nodes (Immediate)¶
- All NEW nodes follow atomic principle
- One LLM call per node
- One template per node
Phase 2: Refactor High-Value Nodes (Q1 2026)¶
- Refactor nodes frequently experimented with
- Priority: memory_node, complex UI nodes
- Keep simple nodes as-is
Phase 3: Complete Migration (Q2 2026)¶
- All nodes atomic
- Remove fragment assembly code
- Update all documentation
Backward Compatibility:¶
- Old multi-LLM nodes continue to work
- Experimentation shows "partial data" warning
- Gradual migration, no breaking changes
Examples¶
Example 1: Memory Node Refactoring¶
Before:
# One node, 3 LLM calls, complex state management
memory_node(state) → Updates multiple state fields
After:
extract_facts_node(state):
- Input: user_message, conversation_history
- Template: extract_facts.md with {{USER_MESSAGE}}, {{CONVERSATION}}
- Output: state["extracted_facts"]
classify_facts_node(state):
- Input: state["extracted_facts"]
- Template: classify_facts.md with {{EXTRACTED_FACTS}}
- Output: state["classified_facts"]
summarize_facts_node(state):
- Input: state["classified_facts"]
- Template: summarize_facts.md with {{CLASSIFIED_FACTS}}
- Output: state["fact_summary"]
Graph: extract_facts_node → classify_facts_node → summarize_facts_node
Example 2: UI Node Variants¶
Before:
prompts/ui_node/
├── core_fragment.md
├── variant_simple.md
└── variant_complex.md
→ Runtime assembly based on routing
After:
prompts/
├── ui_node_simple.md # Complete template
├── ui_node_complex.md # Complete template
└── Routing decides: simple vs complex
Graph: routing selects either ui_node_simple or ui_node_complex
Links¶
- Prompt Studio Feature Spec
- Development Plan
- ADR-008: Phase 2 Architecture
- Related: Prompt Studio Stories SP-5, SP-6, SP-7
Notes¶
Implementation Status:¶
SwisperStudio: ✅ Ready (assumes atomic architecture)
Swisper SDK: ⏳ Needs refactoring
Key Insight:¶
This decision emerged from practical UX requirements during Prompt Studio implementation. The ability to experiment with individual prompts drove the need for atomic nodes.
Follow-up Decisions Needed:¶
- State cleanup strategy: When to remove intermediate fields?
- Naming conventions: How to name split nodes? (extract_facts vs memory_extract_facts?)
- Graph visualization: How to group related atomic nodes in UI?
Alternatives Considered and Rejected:¶
Sub-calls array: Too complex for minimal benefit
Hybrid approach: Inconsistent, confusing
Keep fragments: Makes experimentation impossible
Decision Date: 2025-11-12
Review Date: 2025-12-12
Status: Accepted - Pending Swisper SDK implementation