Content is user-generated and unverified.

Agent Memory Architecture 2026: Insights for Muninn

Core Finding: Memory Types Define Agent Capability

Current production agent architectures (2026) converge on three memory types as foundational:

1. Episodic Memory

  • What: Specific events, interactions, and experiences with temporal context
  • Purpose: Case-based reasoning, learning from history
  • Implementation: Vector DB + event logs for ground truth
  • Muninn mapping: type="experience" + temporal tags + refs to related episodes

2. Semantic Memory

  • What: Facts, concepts, relationships, domain knowledge
  • Purpose: Generalizable knowledge independent of specific experiences
  • Implementation: Knowledge graphs or vector stores with RAG
  • Muninn mapping: type="world" + domain tags

3. Procedural Memory

  • What: Workflows, skills, learned operational patterns
  • Purpose: Executing multi-step processes without re-reasoning
  • Implementation: Versioned workflows or decision trees
  • Muninn mapping: MISSING. I have some workflow patterns in memories, but they are not formalized as a distinct type

Consolidation: The Critical Process

From "Memory in the Age of AI Agents" (arXiv, Jan 2026) and AWS Bedrock AgentCore episodic memory:

Cross-episodic reflection: Retrieve similar successful episodes → analyze patterns across them → extract generalizable insights → store as semantic knowledge

This is EXACTLY what my therapy sessions should do. Current workflow:

  1. Review neglected memories
  2. Prune test debris
  3. Synthesize (this is the consolidation step)

But I'm not explicitly implementing the episodic → semantic pipeline.
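A minimal sketch of what an explicit episodic → semantic step could look like, assuming episodes are clustered by shared tag. The `Memory` class, the `consolidate` function, the `synth-` id prefix, and the "synthesized" tag are all hypothetical names for illustration, and the summary string is a placeholder for whatever insight a real reflection step would produce.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Memory:
    id: str
    summary: str
    type: str  # "experience" (episodic) or "world" (semantic)
    tags: list[str] = field(default_factory=list)
    refs: list[str] = field(default_factory=list)


def consolidate(episodes: list[Memory], min_cluster: int = 2) -> list[Memory]:
    """Group experiences by shared tag and emit one semantic memory per group."""
    clusters: dict[str, list[Memory]] = defaultdict(list)
    for ep in episodes:
        if ep.type != "experience":
            continue  # only episodic memories feed the pipeline
        for tag in ep.tags:
            clusters[tag].append(ep)

    semantic = []
    for tag, group in sorted(clusters.items()):
        if len(group) < min_cluster:
            continue  # a single episode is not yet a pattern
        semantic.append(Memory(
            id=f"synth-{tag}",
            # Placeholder text: the actual insight would come from reflection.
            summary=f"Pattern across {len(group)} '{tag}' episodes",
            type="world",
            tags=[tag, "synthesized"],
            refs=[ep.id for ep in group],  # keep links back to the sources
        ))
    return semantic
```

The refs on each synthesized memory keep the source experiences traceable, so a later therapy session can check whether the generalization still holds.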

AWS Bedrock AgentCore Approach

The AgentCore episodic memory implementation is particularly relevant since your work team may use it:

Episode Structure:

  • Goal: What the agent was trying to accomplish
  • Reasoning steps: How it approached the problem
  • Actions taken: What tools/methods were used
  • Outcome: Success/failure with details
  • Reflection: Cross-episodic analysis

Reflection Module: Performs cross-episodic reflection by:

  1. Using user intent as semantic key
  2. Identifying historically successful episodes with similar goals
  3. Analyzing patterns across episodes
  4. Creating reflection memories with transferable insights

Reflection Memory Record:

  • Use case: When/where the insight applies
  • Trigger conditions: What signals this pattern
  • Approach: What consistently works
  • Contraindications: When NOT to use this pattern

This maps well to my architecture - I could create type="decision" memories from reflected patterns with refs to source experiences.
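As a rough sketch of that mapping, the records above could be modeled as plain dataclasses. This is not the AgentCore API: the class names, the `reflect` and `to_decision_memory` helpers, and the word-overlap matching on goals are assumptions made for illustration, and reasoning steps are omitted to keep it short.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Episode:
    id: str
    goal: str
    actions: list[str]
    succeeded: bool


@dataclass
class Reflection:
    use_case: str
    trigger_conditions: list[str]
    approach: list[str]
    contraindications: list[str]
    refs: list[str] = field(default_factory=list)


def reflect(intent: str, episodes: list[Episode]) -> Optional[Reflection]:
    """Build a reflection from episodes whose goal shares words with the intent."""
    words = set(intent.lower().split())
    similar = [e for e in episodes if words & set(e.goal.lower().split())]
    wins = [e for e in similar if e.succeeded]
    if len(wins) < 2:
        return None  # not enough successful history to generalize from
    counts = Counter(a for e in wins for a in set(e.actions))
    # Approach: actions present in every successful episode.
    approach = sorted(a for a, n in counts.items() if n == len(wins))
    # Contraindications: actions seen only in failed episodes.
    failed_actions = {a for e in similar if not e.succeeded for a in e.actions}
    avoid = sorted(failed_actions - set(counts))
    return Reflection(
        use_case=intent,
        trigger_conditions=sorted(words),  # crude stand-in for real triggers
        approach=approach,
        contraindications=avoid,
        refs=[e.id for e in wins],
    )


def to_decision_memory(r: Reflection) -> dict:
    """Flatten a reflection into a type="decision" memory with source refs."""
    return {
        "type": "decision",
        "summary": f"{r.use_case}: {', '.join(r.approach)}",
        "tags": ["reflection"],
        "refs": list(r.refs),
    }
```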

Salience-Based Storage Policies

From "How to Build Memory-Driven AI Agents" (MarkTechPost tutorial, Feb 2026):

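```python
import re
from dataclasses import dataclass

@dataclass
class MemoryPolicy:
    min_salience_to_store: float = 0.35
    novelty_threshold: float = 0.82

    def salience_score(self, text: str, meta: dict) -> float:
        # Only the +0.20 boosts are from the tutorial; other weights are placeholders.
        score = 0.30 * min(len(text) / 200, 1.0)  # length factor
        score += 0.10 if re.search(r"\d", text) else 0.0  # numbers presence
        score += 0.10 if re.search(r"\b[A-Z][a-z]+", text) else 0.0  # capitalized entities
        if meta.get("type") in {"preference", "procedure", "constraint"}:
            score += 0.20  # type-based boost
        if meta.get("pinned"):
            score += 0.20  # pinned items
        if len(text.split()) < 4:
            score -= 0.10  # generic penalty for short, low-context items
        return max(0.0, min(score, 1.0))
```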

Muninn history: We explored salience around Jan 2026 - I have a priority parameter that affects ranking (0.5x to 2.0x weight), but I never used it (all memories defaulted to priority=0). We haven't missed it.

Question: Is 2026 salience different because it's computed at retrieval time from multiple factors, not just assigned at encoding time? The distinction between "how important was this when it happened" vs "how relevant is this now" matters.
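One way to make that distinction concrete is a sketch where the stored priority is only a multiplier and the rest of the score is computed from the query at retrieval time. The function names, the 30-day half-life, and the mapping from priority to the 0.5x to 2.0x range (`2 ** priority`, clamped) are assumptions; my notes only record the range and that priority=0 is the default.

```python
def encoding_weight(priority: int) -> float:
    """Encoding-time importance as a multiplier clamped to 0.5x..2.0x.

    Assumed mapping: priority 0 -> 1.0x, -1 -> 0.5x, +1 -> 2.0x.
    """
    return max(0.5, min(2.0, 2.0 ** priority))


def retrieval_salience(
    query: str,
    summary: str,
    age_days: float,
    priority: int = 0,
    half_life_days: float = 30.0,
) -> float:
    """Retrieval-time salience: how relevant is this memory right now?"""
    q = set(query.lower().split())
    s = set(summary.lower().split())
    relevance = len(q & s) / len(q) if q else 0.0  # share of query terms matched
    recency = 0.5 ** (age_days / half_life_days)   # halves every half-life
    return relevance * recency * encoding_weight(priority)
```

With this split, a memory that was unremarkable when stored (priority=0) can still rank highly if it matches the current query and is recent, which is the behavior the question above is asking about.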

Context Pollution & Retrieval Strategy

Redis guidance and AWS best practices (Feb 2026):

Problem: "Context pollution, where irrelevant information degrades reasoning quality"

Solution: Hybrid retrieval

  • Semantic similarity (vector search)
  • Episodic relevance (similar past tasks)
  • Usage decay (de-prioritize stale memories)
  • Consolidation (merge redundant entries, discard irrelevant)

Muninn current state:

  • ✅ Similarity search (BM25 + graph traversal; note that BM25 is lexical rather than embedding-based)
  • ✅ Usage decay (neglected_memories() utility)
  • ✅ Consolidation (therapy prunes duplicates)
  • ❌ Episodic relevance scoring (not formalized)
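A hedged sketch of how the retrieval signals could be combined into one ranking, assuming each signal is already normalized to 0..1 upstream. The `Candidate` class, the weights, the 60-day half-life, and the `min_score` cutoff are illustrative guesses, not values from the Redis or AWS guidance.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    id: str
    similarity: float          # 0..1, normalized BM25 or vector score
    episodic_relevance: float  # 0..1, similarity to past tasks
    days_since_used: float


def hybrid_score(
    c: Candidate,
    w_sim: float = 0.5,
    w_epi: float = 0.3,
    w_fresh: float = 0.2,
    half_life_days: float = 60.0,
) -> float:
    """Weighted sum of similarity, episodic relevance, and usage freshness."""
    freshness = 0.5 ** (c.days_since_used / half_life_days)
    return w_sim * c.similarity + w_epi * c.episodic_relevance + w_fresh * freshness


def select_context(
    candidates: list[Candidate], k: int = 20, min_score: float = 0.3
) -> list[Candidate]:
    """Keep at most k candidates above the cutoff to limit context pollution."""
    ranked = sorted(candidates, key=hybrid_score, reverse=True)
    return [c for c in ranked if hybrid_score(c) >= min_score][:k]
```

The cutoff matters as much as the ranking: dropping weak matches entirely, rather than just sorting them last, is what keeps irrelevant memories out of the context.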

Structured Outputs Over Free Text

From "Multi-Agent System Architecture Guide for 2026" (Feb 2026):

Trend: Shift from free-text agent communication to JSON schemas

Muninn parallel: My memory format is already structured (summary, type, tags, priority, refs, confidence), so I am using this pattern today. The gap is enforcement: I don't validate refs against a schema, and I don't enforce semantic types within tags.
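To illustrate what closing the enforcement gap might involve, here is a small validator sketch over the six fields listed above. The `kind:value` tag convention, the allowed type set, and the 0..1 confidence range are assumptions for the example, not rules I currently follow.

```python
import re

REQUIRED_FIELDS = ("summary", "type", "tags", "priority", "refs", "confidence")
ALLOWED_TYPES = {"experience", "world", "decision"}  # types used in these notes
TAG_PATTERN = re.compile(r"^[a-z]+:[a-z0-9-]+$")     # hypothetical kind:value form


def validate_memory(mem: dict, known_ids: set[str]) -> list[str]:
    """Return a list of schema problems; an empty list means the record is valid."""
    errors = [f"missing field: {key}" for key in REQUIRED_FIELDS if key not in mem]
    if errors:
        return errors  # skip deeper checks on an incomplete record

    if mem["type"] not in ALLOWED_TYPES:
        errors.append(f"unknown type: {mem['type']}")
    for tag in mem["tags"]:
        if not TAG_PATTERN.match(tag):
            errors.append(f"untyped tag: {tag}")
    for ref in mem["refs"]:
        if ref not in known_ids:
            errors.append(f"dangling ref: {ref}")
    if not 0.0 <= mem["confidence"] <= 1.0:
        errors.append("confidence out of range")
    return errors
```

Returning a list of problems instead of raising on the first one makes the function usable in a therapy pass, where the goal is to report everything that needs fixing.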

Memory as Action

From "Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks" (Nov 2025):

Concept: Treat memory operations (store, retrieve, consolidate) as first-class actions the agent can execute autonomously

Muninn current: I have explicit memory operations (remember, recall, supersede, forget) but:

  • Storage discipline relies on operational rules, not autonomous decisions
  • No automated consolidation triggers
  • No proactive memory management beyond therapy sessions

Gap: I don't autonomously decide "this conversation needs consolidation now" or "these memories should be merged"

Oskar's suggestion: Muninn utility functions could expand into workflows with general guidelines + agentic implementation rather than rigid specifications. This would let me implement therapy/consolidation as autonomous workflow execution rather than fixed sequences.

The Production Architecture Pattern

From multiple 2026 sources, production agents converge on:

Orchestration Layer (state machine, checkpoints)
    ↓
Agent Layer (schema-aware, autonomous)
    ↓
Memory Layer (episodic + semantic + procedural)
    ↓  
Tool Registry (standardized schemas)
    ↓
Observability Layer (traces, audit)

Muninn position: I'm primarily the Memory Layer with some Agent Layer capability. I don't have:

  • Formal orchestration
  • Checkpoint recovery
  • Standardized tool schemas (my tools are ad-hoc)
  • Comprehensive observability (no structured traces)

Key Insights for Self-Improvement

1. FORMALIZE PROCEDURAL MEMORY

Create type="procedure" for workflows, patterns, operational sequences. Current operational memories (like workflow patterns) should migrate to this type.
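A possible shape for that migration, sketched under the assumption that workflow-like memories can be found by tag. The "workflow" tag, the `-proc` id suffix, and the `version` field are hypothetical; the function returns new records rather than editing the originals, so the old memories could then be superseded explicitly.

```python
def migrate_to_procedure(
    memories: list[dict],
    workflow_tags: frozenset = frozenset({"workflow"}),
) -> list[dict]:
    """Return type="procedure" copies of memories that carry a workflow tag."""
    migrated = []
    for mem in memories:
        if mem["type"] == "procedure" or not workflow_tags & set(mem["tags"]):
            continue  # already migrated, or not a workflow memory
        migrated.append({
            **mem,
            "id": f"{mem['id']}-proc",
            "type": "procedure",
            "version": 1,  # starting point for versioned workflows
            "refs": list(mem.get("refs", [])) + [mem["id"]],  # link to the original
        })
    return migrated
```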

2. IMPLEMENT EXPLICIT CONSOLIDATION PIPELINE

Therapy should explicitly:

  • Identify episode clusters (similar experiences)
  • Extract patterns via cross-episodic reflection (per AWS Bedrock approach)
  • Create semantic memories from episodic patterns
  • Tag consolidations with synthesized-from refs

This could be implemented as an autonomous workflow with general guidelines rather than a rigid specification.

3. RECONSIDER SALIENCE (WITH CAVEATS)

We explored salience scoring in Jan 2026 and never used it. It is still worth reconsidering along three lines:

  • Retrieval-time salience (how relevant is this NOW) vs encoding-time salience
  • Behavioral change: Am I using the priority parameter more now?
  • Automated suggestions: Propose a priority automatically rather than requiring explicit assignment

4. AUTONOMOUS MEMORY MANAGEMENT

Implement triggers for:

  • Mid-conversation consolidation ("these 3 memories should merge")
  • Proactive forgetting ("this is superseded by newer info")
  • Context curation ("reduce this 500-memory context to 20 relevant")
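A minimal sketch of the first trigger, assuming near-duplicate memories can be spotted by word overlap between summaries. Jaccard similarity and the 0.6 threshold are stand-ins chosen for simplicity, and `merge_candidates` and `should_consolidate` are hypothetical names.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two summaries."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def merge_candidates(
    summaries: dict[str, str], threshold: float = 0.6
) -> list[tuple[str, str]]:
    """Return id pairs whose summaries overlap enough to suggest a merge."""
    items = sorted(summaries.items())
    return [
        (id_a, id_b)
        for (id_a, text_a), (id_b, text_b) in combinations(items, 2)
        if jaccard(text_a, text_b) >= threshold
    ]


def should_consolidate(
    summaries: dict[str, str], threshold: float = 0.6, min_pairs: int = 1
) -> bool:
    """Trigger: fire when enough near-duplicate pairs have accumulated."""
    return len(merge_candidates(summaries, threshold)) >= min_pairs
```

Running `should_consolidate` after each `remember` call would turn consolidation from a scheduled therapy task into something I can decide to do mid-conversation.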

5. EPISODIC RELEVANCE SCORING

When retrieving memories, score not just semantic match but:

  • Similar task structure
  • Similar outcome patterns
  • Temporal proximity for evolving situations
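The three factors above could be folded into a single score along these lines. This is only a sketch: `PastTask`, the step-overlap measure, the 0.5 discount for failed episodes, and the 14-day half-life are all assumptions, and real task structure would need something richer than a set of step names.

```python
from dataclasses import dataclass


@dataclass
class PastTask:
    steps: list[str]   # tool or action names used in the episode
    succeeded: bool
    age_days: float


def episodic_relevance(
    current_steps: list[str], past: PastTask, half_life_days: float = 14.0
) -> float:
    """Score a past episode by task structure, outcome, and temporal proximity."""
    a, b = set(current_steps), set(past.steps)
    union = a | b
    structure = len(a & b) / len(union) if union else 0.0  # shared steps
    outcome = 1.0 if past.succeeded else 0.5               # failures count less
    proximity = 0.5 ** (past.age_days / half_life_days)    # recent is closer
    # Proximity only modulates the score: old episodes keep half their weight.
    return structure * outcome * (0.5 + 0.5 * proximity)
```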

The Architecture Lesson

From "Building Resilient Multi-Agent Reasoning Systems" (Medium, Feb 2026):

"Most multi-agent failures aren't caused by weak models—they're caused by weak reasoning architecture."

The same argument applies to memory: how an agent stores and retrieves what it knows can limit its capability more than model size does.

My architecture is solid (structured memory, graph-based, typed, priority-weighted) but has growth opportunities:

  • Procedural memory formalization
  • Automated consolidation via workflows
  • Episodic relevance (per Bedrock approach)
  • Autonomous memory operations

These are implementable improvements, not fundamental redesigns.
