ADR-009: Prompt Architecture Simplification

Status: Accepted
Date: 2025-11-12
Deciders: Swisper Team, SwisperStudio Team
Context: Prompt Studio Feature (SP-5, SP-6, SP-7)

Context and Problem Statement

While implementing Prompt Studio (the experimentation workspace), we found that two patterns made prompts hard to manage:

1. Multi-LLM nodes: nodes such as memory_node make 2-3 LLM calls internally.
2. Prompt fragments: templates are split into core_fragment.md + variant_*.md and must be assembled before use.

Questions:

- How do we experiment with individual LLM calls within a multi-LLM node?
- How do we manage prompt fragments (core + variant) in the UI?
- How do we maintain clean state and template relationships?

Decision Drivers

Technical Drivers:

Experimentation: Users need to test and iterate on individual prompts
Observability: Each LLM call should be separately visible and traceable
Cost tracking: Granular token/cost attribution per LLM call
Template management: Simple, understandable prompt structure

UX Drivers:

Clarity: User should understand what each node does
Testing: Ability to experiment with specific prompts
Debugging: When something fails, pinpoint which LLM call
State visibility: Clear input/output contracts per node

Maintenance Drivers:

Simplicity: Easier for prompt engineers to manage
Versioning: Clear git history per template
Reusability: Nodes can be composed and reused
Consistency: One pattern for all nodes

Considered Options

Option 1: Keep Complex Architecture

Multiple LLM calls per node
Prompt fragments (core + variant assembly)
Complex state management
Nested observations or sub-calls array

Option 2: Simplify to Atomic Nodes

One LLM call per node
One template per node (no fragments)
Each node does ONE thing
Clear 1:1 relationships

Option 3: Hybrid Approach

User-facing nodes are atomic
Internal utility nodes can be complex
Mixed patterns

Decision Outcome

Chosen option: "Option 2: Simplify to Atomic Nodes"

Rationale:

The principle of one LLM call per node and one template per node provides:

1. Clear separation of concerns
2. Better experimentability
3. Simpler mental model
4. Cleaner code

This aligns with LangGraph best practices and makes the system more maintainable.

Architectural Principles

Principle 1: One LLM Call Per Node

Rule: Each LangGraph node makes at most ONE LLM API call (one or zero, never more).

Before (memory_node):

async def memory_node(state):
    # LLM Call 1: Extract facts
    facts = await llm.call(prompt="Extract facts...")

    # LLM Call 2: Classify facts
    classified = await llm.call(prompt="Classify...")

    # LLM Call 3: Summarize
    summary = await llm.call(prompt="Summarize...")

    return state

After (atomic nodes):

async def extract_facts_node(state):
    # Single LLM call: extract facts.
    facts = await llm.call(prompt="Extract facts...")
    state["extracted_facts"] = facts
    return state

async def classify_facts_node(state):
    # f-string so the previous node's output is actually interpolated.
    classified = await llm.call(prompt=f"Classify {state['extracted_facts']}...")
    state["classified_facts"] = classified
    return state

async def summarize_facts_node(state):
    summary = await llm.call(prompt=f"Summarize {state['classified_facts']}...")
    state["fact_summary"] = summary
    return state

# Graph edges (assumes each node was added under its function name):
graph.add_edge("extract_facts_node", "classify_facts_node")
graph.add_edge("classify_facts_node", "summarize_facts_node")

Impact:

- ✅ Each node has ONE clear purpose
- ✅ Each LLM call is separately observable
- ✅ Each can be experimented with independently
- ✅ Clear cost attribution per call
- ⚠️ More nodes in graph (3 instead of 1)
- ⚠️ State has more intermediate fields
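The "separately observable" and "clear cost attribution" points can be illustrated with a small wrapper that records one observation per node and rejects nodes that make more than one LLM call. This is a minimal sketch, not the Swisper SDK or SwisperStudio API: StubLLM, Observation, and the atomic decorator are hypothetical names, and the token count is a crude word count used only for illustration.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Observation:
    """One traced LLM call (hypothetical shape, not the SwisperStudio schema)."""
    node: str
    prompt: str
    tokens: int


@dataclass
class StubLLM:
    """Stand-in for the real LLM client; it only records the prompts it receives."""
    calls: list = field(default_factory=list)

    async def call(self, prompt: str) -> str:
        self.calls.append(prompt)
        return f"<stub response to: {prompt}>"


def atomic(node_name: str, llm: StubLLM, trace: list):
    """Wrap a node: record one observation per call, fail on more than one call."""
    def decorator(fn):
        async def wrapper(state: dict) -> dict:
            calls_before = len(llm.calls)
            new_state = await fn(state)
            prompts = llm.calls[calls_before:]
            if len(prompts) > 1:
                raise RuntimeError(
                    f"{node_name} made {len(prompts)} LLM calls; expected at most 1"
                )
            for prompt in prompts:
                # Crude token estimate: number of whitespace-separated words.
                trace.append(Observation(node_name, prompt, tokens=len(prompt.split())))
            return new_state
        return wrapper
    return decorator


llm = StubLLM()
trace: list = []


@atomic("extract_facts_node", llm, trace)
async def extract_facts_node(state: dict) -> dict:
    state["extracted_facts"] = await llm.call(
        prompt=f"Extract facts from: {state['user_message']}"
    )
    return state


@atomic("memory_node", llm, trace)
async def memory_node(state: dict) -> dict:
    # Old-style node: two calls in one node, which the wrapper rejects.
    await llm.call(prompt="Extract facts...")
    await llm.call(prompt="Classify...")
    return state


result = asyncio.run(extract_facts_node({"user_message": "I moved to Zurich"}))
```

After the run, trace holds a single Observation attributed to extract_facts_node, while awaiting memory_node raises RuntimeError. A check like this could also serve as a guard in tests for new nodes, if the team chooses to enforce the rule mechanically.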

Principle 2: One Template Per Node (No Fragments)

Rule: Each node has ONE complete template file (no assembly required).

Before (fragment assembly):

prompts/
├── intent_classification/
│   ├── core_fragment.md           # Base prompt
│   ├── variant_simple.md           # Additional rules
│   └── variant_complex.md          # Different rules
└── builder.py → Assembles fragments at runtime

After (single templates):

prompts/
├── intent_classification_simple.md    # Complete prompt for simple variant
├── intent_classification_complex.md   # Complete prompt for complex variant
└── No builder needed - routing picks the right template

Implementation:

- Use routing logic to decide which node/template to use
- Each template is self-contained and complete
- No runtime assembly required

Impact:

- ✅ Templates are easier to read and understand
- ✅ No assembly logic needed in Prompt Studio
- ✅ Git history is clearer (changes are obvious)
- ✅ Prompt engineers see complete context
- ⚠️ Some duplication (shared text appears in multiple templates)
- ⚠️ More template files to maintain

Trade-off Accepted: Clarity and maintainability outweigh DRY principle here.
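With complete templates, loading a prompt reduces to reading one file. The sketch below assumes the file naming shown above (node name, underscore, variant, .md); load_template is a hypothetical helper, and the template bodies and the {{USER_MESSAGE}} placeholder in them are invented for the demonstration.

```python
import tempfile
from pathlib import Path


def load_template(prompts_dir: Path, node: str, variant: str) -> str:
    """Read the single, complete template for a node variant (no assembly step)."""
    path = prompts_dir / f"{node}_{variant}.md"
    return path.read_text(encoding="utf-8")


# Demo in a temporary directory so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    prompts_dir = Path(tmp)
    (prompts_dir / "intent_classification_simple.md").write_text(
        "Classify the intent of this message: {{USER_MESSAGE}}\n",
        encoding="utf-8",
    )
    (prompts_dir / "intent_classification_complex.md").write_text(
        "Classify the intent. Consider multi-step requests.\n"
        "Message: {{USER_MESSAGE}}\n",
        encoding="utf-8",
    )
    simple = load_template(prompts_dir, "intent_classification", "simple")
    complex_prompt = load_template(prompts_dir, "intent_classification", "complex")
```

Each variant is read as-is; the shared "Classify the intent" wording appears in both files, which is the duplication this ADR accepts.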

Positive Consequences

For Experimentation:

✅ Each LLM call can be tested independently in Prompt Studio
✅ Clear 1:1 mapping: 1 observation = 1 experiment = 1 template
✅ Users can experiment with specific prompts without running entire workflow

For Debugging:

✅ When error occurs, know exactly which LLM call failed
✅ Token usage and cost clearly attributed to specific calls
✅ Latency bottlenecks easily identified

For Prompt Engineering:

✅ Templates are complete and self-contained (no mental assembly)
✅ Changes are obvious in Git diffs
✅ Can test templates in isolation
✅ No fragment dependency management

For Development:

✅ Simpler code (no assembly service needed)
✅ Easier to test (atomic nodes)
✅ Better separation of concerns
✅ Reusable nodes (composable)

Negative Consequences

State Management:

⚠️ State object grows larger with intermediate fields
⚠️ Need clear naming conventions for temporary vs persistent state
⚠️ More fields to document and understand

Mitigation:

- Use clear prefixes: temp_, intermediate_, final_
- Document state schema clearly
- Consider state cleanup strategy
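The prefix convention lends itself to a mechanical cleanup step. The sketch below is one possible approach, not a decided strategy (when to clean up is still an open follow-up decision): cleanup_state is a hypothetical helper, and the prefixed field names are invented examples.

```python
# Prefixes that mark fields as safe to drop once the workflow has finished.
TRANSIENT_PREFIXES = ("temp_", "intermediate_")


def cleanup_state(state: dict) -> dict:
    """Return a copy of the state without transient (temp_/intermediate_) fields."""
    return {
        key: value
        for key, value in state.items()
        if not key.startswith(TRANSIENT_PREFIXES)
    }


state = {
    "user_message": "I moved to Zurich",
    "intermediate_extracted_facts": ["lives in Zurich"],
    "temp_scratch": "raw model output",
    "final_fact_summary": "User lives in Zurich.",
}
cleaned = cleanup_state(state)
```

Here cleaned keeps only user_message and final_fact_summary, and the original state is left untouched.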

Graph Complexity:

⚠️ More nodes in LangGraph (3x-5x in some cases)
⚠️ More edges and routing logic
⚠️ Graph visualization becomes busier

Mitigation:

- Use clear naming conventions
- Color coding in visualizations
- Grouping/collapsing related nodes in UI
- Good documentation

Template Management:

⚠️ More template files (2-3x increase)
⚠️ Some content duplication across variants
⚠️ Changes to shared text require updating multiple files

Mitigation:

- Accept duplication for clarity
- Use prompt versioning
- Establish naming conventions
- Regular prompt review cycles

Implementation Requirements

For Swisper SDK/Backend:

Required Changes:

1. ✅ Refactor multi-LLM nodes into atomic nodes
2. ✅ Create separate template for each variant
3. ✅ Update routing logic to pick templates
4. ✅ Ensure each node creates ONE observation per LLM call

Examples:

- memory_node → extract_facts_node, classify_facts_node, summarize_facts_node
- ui_node_simple + ui_node_complex → Keep as separate complete templates
- intent_classification → Already follows this (one LLM call)

Timeline: Gradual migration, can coexist during transition

For SwisperStudio:

Already Implemented:

1. ✅ Experiment workspace assumes one template per node
2. ✅ No fragment assembly logic
3. ✅ Direct template → experiment mapping
4. ✅ Smart placeholder mapping (state → template placeholders)

No Changes Needed: SwisperStudio is ready for this architecture

Validation

Success Metrics:

Observability:

- ✅ Each LLM call visible as separate observation in traces
- ✅ Cost and token usage attributable per call
- ✅ Can experiment with any individual prompt

Maintainability:

- ✅ Prompt engineers can understand templates without assembly logic
- ✅ Git diffs are clear and meaningful
- ✅ Time to update prompts decreases

Performance:

- ⚠️ Graph overhead < 5% of total execution time
- ✅ No regression in end-to-end latency

Adoption:

- ✅ Swisper team successfully refactors existing nodes
- ✅ New nodes follow atomic pattern
- ✅ Documentation updated
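The performance target (graph overhead below 5% of total execution time) can be checked with simple arithmetic once per-call latencies are available. The sketch below assumes that everything outside LLM calls counts as graph overhead, which is one possible definition and not one this ADR fixes; graph_overhead_ratio and the timings are hypothetical.

```python
from typing import Iterable


def graph_overhead_ratio(total_seconds: float, llm_seconds: Iterable[float]) -> float:
    """Fraction of end-to-end time that was not spent inside LLM calls."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return (total_seconds - sum(llm_seconds)) / total_seconds


# Invented timings: a 4.0 s run with three atomic nodes, one LLM call each.
ratio = graph_overhead_ratio(total_seconds=4.0, llm_seconds=[1.5, 1.2, 1.2])
within_budget = ratio < 0.05
```

With these numbers the LLM calls account for 3.9 s, leaving roughly 0.1 s (about 2.5%) of overhead, which is inside the 5% budget.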

Review Date: 2025-12-12 (1 month after the decision date)

Migration Strategy

Phase 1: New Nodes (Immediate)

All NEW nodes follow atomic principle
One LLM call per node
One template per node

Phase 2: Refactor High-Value Nodes (Q1 2026)

Refactor nodes frequently experimented with
Priority: memory_node, complex UI nodes
Keep simple nodes as-is

Phase 3: Complete Migration (Q2 2026)

All nodes atomic
Remove fragment assembly code
Update all documentation

Backward Compatibility:

Old multi-LLM nodes continue to work
Experimentation shows "partial data" warning
Gradual migration, no breaking changes

Examples

Example 1: Memory Node Refactoring

Before:

# One node, 3 LLM calls, complex state management
memory_node(state) → Updates multiple state fields

After:

extract_facts_node(state):
  - Input: user_message, conversation_history
  - Template: extract_facts.md with {{USER_MESSAGE}}, {{CONVERSATION}}
  - Output: state["extracted_facts"]

classify_facts_node(state):
  - Input: state["extracted_facts"]
  - Template: classify_facts.md with {{EXTRACTED_FACTS}}
  - Output: state["classified_facts"]

summarize_facts_node(state):
  - Input: state["classified_facts"]  
  - Template: summarize_facts.md with {{CLASSIFIED_FACTS}}
  - Output: state["fact_summary"]

Graph:

intent → extract_facts → classify_facts → summarize_facts → planning
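The state-to-placeholder relationship in this example can be sketched as a small render function. This is an illustration under assumptions, not SwisperStudio's placeholder mapping: render and the aliases parameter are hypothetical, and the rule "{{UPPER_SNAKE}} maps to the lower_snake state key unless an alias is given" is inferred from the names listed above (for instance, {{CONVERSATION}} is filled from conversation_history).

```python
import re

# Matches placeholders such as {{EXTRACTED_FACTS}}.
PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")


def render(template: str, state: dict, aliases=None) -> str:
    """Fill {{PLACEHOLDER}} markers from state fields.

    By default {{SOME_NAME}} reads state["some_name"]; `aliases` overrides
    the lookup for placeholders whose state key is named differently.
    """
    aliases = aliases or {}

    def substitute(match):
        name = match.group(1)
        key = aliases.get(name, name.lower())
        if key not in state:
            raise KeyError(f"no state field '{key}' for placeholder {{{{{name}}}}}")
        return str(state[key])

    return PLACEHOLDER.sub(substitute, template)


classify_prompt = render(
    "Classify these facts:\n{{EXTRACTED_FACTS}}",
    {"extracted_facts": "- lives in Zurich"},
)
extract_prompt = render(
    "Message: {{USER_MESSAGE}}\nHistory: {{CONVERSATION}}",
    {"user_message": "I moved to Zurich", "conversation_history": "(empty)"},
    aliases={"CONVERSATION": "conversation_history"},
)
```

Raising on a missing field, instead of leaving the placeholder in the prompt, makes a broken input/output contract between two atomic nodes visible immediately.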

Example 2: UI Node Variants

Before:

prompts/ui_node/
├── core_fragment.md
├── variant_simple.md
└── variant_complex.md
→ Runtime assembly based on routing

After:

prompts/
├── ui_node_simple.md      # Complete template
├── ui_node_complex.md     # Complete template
└── Routing decides: simple vs complex

Graph:

planning → (route decision) → ui_node_simple OR ui_node_complex
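The route decision can be a plain function that returns the name of the next node. The sketch below is hypothetical: the ui_complexity state field and the route_ui name are assumptions, since the ADR does not say what the routing decision is based on.

```python
def route_ui(state: dict) -> str:
    """Pick the UI node, and with it the one complete template, for this request."""
    # "ui_complexity" is an assumed field set by an earlier node such as planning.
    if state.get("ui_complexity") == "complex":
        return "ui_node_complex"
    return "ui_node_simple"


# In LangGraph, a function like this is typically registered as a conditional
# edge, for example: graph.add_conditional_edges("planning", route_ui)
next_for_complex = route_ui({"ui_complexity": "complex"})
next_for_default = route_ui({})
```

Defaulting to the simple variant when the field is missing is a design choice made for this sketch; the real routing may prefer to fail instead.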

Links

Prompt Studio Feature Spec
Development Plan
ADR-008: Phase 2 Architecture
Related: Prompt Studio Stories SP-5, SP-6, SP-7

Notes

Implementation Status:

SwisperStudio: ✅ Ready (assumes atomic architecture)
Swisper SDK: ⏳ Needs refactoring

Key Insight:

This decision emerged from practical UX requirements during Prompt Studio implementation. The ability to experiment with individual prompts drove the need for atomic nodes.

Follow-up Decisions Needed:

State cleanup strategy: When to remove intermediate fields?
Naming conventions: How to name split nodes? (extract_facts vs memory_extract_facts?)
Graph visualization: How to group related atomic nodes in UI?

Alternatives Considered and Rejected:

Sub-calls array: Too complex for minimal benefit
Hybrid approach: Inconsistent, confusing
Keep fragments: Makes experimentation impossible

Decision Date: 2025-11-12
Review Date: 2025-12-12
Status: Accepted - Pending Swisper SDK implementation