What is Agent Memory?
Conceptually, memory is how agents build “continuity of self.” Concretely, it’s a combination of mechanisms that store and retrieve useful information:
- Short-term memory: session-scoped working memory and conversation history—what’s happening right now and in the last few turns.
- Long-term memory: durable knowledge that persists across sessions—facts, preferences, and patterns the agent should retain.
- Episodic memory: structured records of interactions and events over time—who asked what, what the agent did, and how it turned out.
- Context manager: a discipline for combining global, session, task, and local state into a just-right context sent to the model.
This layered design balances immediacy (short-term), durability (long-term), chronology (episodic), and precision (context management). The result is an agent that feels consistent, learns from experience, and remains efficient.
For a deeper conceptual and practical tour, see the Memory System Guide.
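To make the layering concrete, here is a minimal, standard-library-only sketch. `LayeredMemory` and its methods are hypothetical names chosen for illustration; they are not the SuperOptiX classes, and the real system does considerably more.

```python
from collections import deque


class LayeredMemory:
    """Toy stand-in for the three memory layers (not the SuperOptiX API)."""

    def __init__(self, window=3):
        # Short-term: a rolling window that silently drops the oldest turns.
        self.short_term = deque(maxlen=window)
        # Long-term: durable facts keyed by name.
        self.long_term = {}
        # Episodic: an append-only log of what happened and how it ended.
        self.episodes = []

    def observe(self, message):
        self.short_term.append(message)

    def learn(self, key, fact):
        self.long_term[key] = fact

    def record_episode(self, events, outcome):
        self.episodes.append({"events": list(events), "outcome": outcome})


memory = LayeredMemory(window=2)
for turn in ["hi", "I use TypeScript", "show me an example"]:
    memory.observe(turn)
memory.learn("language", "User prefers TypeScript")
memory.record_episode(["user_question", "agent_answer"], outcome="success")

print(list(memory.short_term))   # only the last two turns remain
print(memory.long_term["language"])
print(memory.episodes[0]["outcome"])
```

The short-term window forgets the first turn, while the learned preference and the episode record stay available.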
How SuperOptiX Memory Works
SuperOptiX provides a multi-layer memory model that you can use directly from Python, through DSPy adapters driven by JSON-like configs, or declaratively through SuperSpec (YAML).
- Short-term memory captures rolling conversation context and working notes. Use it for ephemeral state and the last N messages.
- Long-term memory persists knowledge with optional semantic search—store guidance (“always return runnable code”), user preferences, and domain facts. Enable embeddings if you want recall by meaning, not just literal keywords.
- Episodic memory tracks episodes and events—great for analytics and learning (e.g., “episode resolved successfully,” “user preferred example-based explanations”).
- The context manager merges relevant state across scopes to build clean, bounded prompts for the LLM.
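One way to picture the context manager's job is a priority-ordered merge under a fixed budget. The sketch below is an assumption about the general technique, not the SuperOptiX implementation: `build_context` is a hypothetical helper, and it counts characters as a crude stand-in for tokens.

```python
def build_context(persona: str, memories: list[str], history: list[str],
                  max_chars: int = 100) -> str:
    """Merge persona, recalled memories, and recent turns into a bounded string."""
    # Priority order: persona first, then memories, then turns (newest first).
    candidates = [f"Persona: {persona}"]
    candidates += [f"Memory: {m}" for m in memories]
    candidates += [f"Turn: {t}" for t in reversed(history)]

    kept, used = [], 0
    for line in candidates:
        cost = len(line) + 1  # +1 for the newline separator
        if used + cost > max_chars:
            break  # budget exhausted: drop everything of lower priority
        kept.append(line)
        used += cost
    return "\n".join(kept)


context = build_context(
    persona="Helpful writer",
    memories=["Always provide runnable code snippets"],
    history=["old question", "latest question"],
    max_chars=100,
)
print(context)
```

With this budget the newest turn fits and the older one is dropped, which is the "bounded" part of the behavior.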
Choosing a Memory Backend
Pick the backend that matches your deployment needs:
- file: portable, zero-ops JSON/pickle storage; great for demos and quick local runs.
- sqlite: reliable embedded database; sensible default for most agents.
- redis: networked, high-throughput in-memory store for production workloads.
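To illustrate the trade-off between the file and sqlite options, the following sketch implements the same tiny key/value interface twice using only the standard library. `JsonFileStore` and `SqliteStore` are hypothetical classes written for this example; they are not the SuperOptiX `FileBackend` or `SQLiteBackend`.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


class JsonFileStore:
    """Hypothetical file backend: the whole store is one JSON document."""

    def __init__(self, path):
        self.path = Path(path)

    def _load(self):
        return json.loads(self.path.read_text()) if self.path.exists() else {}

    def put(self, key, value):
        data = self._load()
        data[key] = value
        self.path.write_text(json.dumps(data))  # rewrites the full file each time

    def get(self, key):
        return self._load().get(key)


class SqliteStore:
    """Hypothetical SQLite backend: one key/value table with transactions."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)"
        )

    def put(self, key, value):
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, json.dumps(value))
            )

    def get(self, key):
        row = self.conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return json.loads(row[0]) if row else None


with tempfile.TemporaryDirectory() as tmp:
    stores = [JsonFileStore(Path(tmp) / "memory.json"), SqliteStore(":memory:")]
    for store in stores:
        store.put("guideline", "Always provide runnable code snippets")
    values = [store.get("guideline") for store in stores]
print(values)
```

The file version is easy to inspect and copy around but rewrites everything on each update; the SQLite version gets indexed lookups and atomic writes from the database.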
Use Memory from Python (Public API)
Below are usage-only examples for working with memory in your own Python code.
from superoptix.memory import AgentMemory, FileBackend, SQLiteBackend
# RedisBackend is also available if you install and configure redis
# Create an agent memory (defaults to SQLite)
memory = AgentMemory(agent_id="writer-assistant")
# Short-term: store ephemeral context
memory.remember("User prefers TypeScript", memory_type="short", ttl=3600)
# Long-term: store durable knowledge with categories/tags
memory.remember(
    "Always provide runnable code snippets",
    memory_type="long",
    category="authoring_guidelines",
    tags=["writing", "code", "quality"],
)
# Recall (semantic search if embeddings are enabled)
results = memory.recall("runnable code", memory_type="long", limit=5)
for r in results:
    print(r["content"])
# Track an interaction episode with events
episode_id = memory.start_interaction({"user_id": "alice"})
memory.add_interaction_event("user_question", "How to configure memory backends?")
# ... generate your response ...
memory.end_interaction({"success": True})
# Introspection and housekeeping
print(memory.get_memory_summary())
memory.cleanup_memory()
# Explicit backend selection
file_memory = AgentMemory("file-demo", backend=FileBackend(".superoptix/memory"))
sqlite_memory = AgentMemory("sqlite-demo", backend=SQLiteBackend(".superoptix/mem.db"))
Configure Memory via DSPy Adapters (JSON)
SuperOptiX integrates memory into DSPy-based agents through adapters. You don’t need to wire up internals. Provide a JSON-like configuration dict (or load it from a .json file), and the adapter will:
- retrieve relevant long-term memories for the query,
- include recent short-term conversation snippets,
- manage episodes and events,
- persist useful insights after responses.
See DSPy’s adapter documentation for background on the adapter pattern.
How DSPy Adapters Integrate with Memory
The DSPy adapter creates a memory-enhanced agent module that automatically handles the complete memory lifecycle:
Memory Initialization: When you create a DSPyAdapter, it automatically instantiates an AgentMemory system based on your config. The adapter reads the memory.enabled and memory.enable_embeddings flags to configure the memory system appropriately.
Memory-Enhanced Agent Module: The adapter generates a custom DSPy module (MemoryEnhancedAgentModule) that wraps your agent logic with memory operations. This module:
- Starts an interaction episode when processing begins
- Retrieves relevant memories before generating responses
- Stores conversation history and insights after completion
- Manages the complete interaction lifecycle
Context Building Process: Before sending a query to the LLM, the adapter:
- Searches long-term memory for semantically relevant knowledge
- Retrieves recent conversation context from short-term memory
- Merges persona information, relevant memories, and conversation history
- Builds a clean, bounded context string for the model
Memory Persistence: After the LLM generates a response, the adapter:
- Stores the Q&A pair in short-term memory for immediate context
- Adds the interaction to the conversation history
- Logs events (user query, agent response) to the episodic memory
- Ends the interaction episode with success/failure metadata
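The lifecycle described above can be sketched as a single function. This is an illustration under stated assumptions, not the adapter's code: `run_with_memory` and `echo_llm` are hypothetical, plain lists stand in for the memory layers, and keyword overlap stands in for semantic search.

```python
def run_with_memory(query, long_term, short_term, episodes, llm):
    """Hypothetical lifecycle: start episode, retrieve, generate, persist, end."""
    # Start an episode and log the incoming query as its first event.
    episode = {"events": [("user_query", query)], "outcome": None}
    episodes.append(episode)

    # Retrieve: keyword overlap stands in for semantic search here.
    words = set(query.lower().split())
    relevant = [m for m in long_term if words & set(m.lower().split())]
    recent = short_term[-2:]

    # Build a small context and call the model.
    prompt = "\n".join(relevant + recent + [query])
    success = True
    try:
        answer = llm(prompt)
    except Exception:
        answer, success = "", False

    # Persist the exchange and close the episode with its outcome.
    short_term.append(f"Q: {query} | A: {answer}")
    episode["events"].append(("agent_response", answer))
    episode["outcome"] = {"success": success}
    return answer


def echo_llm(prompt):
    """Stand-in for a real model call; reports how much context it received."""
    return f"seen {len(prompt.splitlines())} lines"


long_term = ["enable memory with the memory section", "prefer short answers"]
short_term, episodes = [], []
answer = run_with_memory("how to enable memory", long_term, short_term, episodes, echo_llm)
print(answer, episodes[0]["outcome"])
```

Keeping the episode bookkeeping around the model call means a failed generation still produces a closed episode with `success: False`.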
Example JSON config (save as agent.config.json)
{
  "llm": {
    "provider": "ollama",
    "model": "llama3.2:1b",
    "api_base": "http://localhost:11434",
    "temperature": 0.2
  },
  "persona": {
    "name": "MemoryDemo",
    "description": "Demonstrates SuperOptiX layered memory"
  },
  "memory": {
    "enabled": true,
    "enable_embeddings": true
  }
}
Advanced Memory Configuration
You can fine-tune memory behavior through additional configuration options:
{
  "llm": {
    "provider": "ollama",
    "model": "llama3.2:1b",
    "api_base": "http://localhost:11434"
  },
  "persona": {
    "name": "AdvancedMemoryBot",
    "description": "Advanced memory configuration example"
  },
  "memory": {
    "enabled": true,
    "enable_embeddings": true,
    "short_term_capacity": 200,
    "memory_retrieval": {
      "max_memories": 5,
      "min_similarity": 0.3,
      "include_conversation_history": true
    },
    "episodic_tracking": {
      "auto_start_episodes": true,
      "event_logging": true,
      "outcome_tracking": true
    }
  }
}
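To show how settings like max_memories and min_similarity typically interact, here is a small sketch of threshold-then-top-k retrieval with cosine similarity. It assumes that min_similarity is a cosine-score cutoff and max_memories a cap on results; the source does not spell out how SuperOptiX applies them, and `cosine` and `retrieve` are hypothetical helpers using hand-written toy vectors instead of real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, store, max_memories=5, min_similarity=0.3):
    """Drop weak matches first, then keep the best `max_memories`."""
    scored = [(cosine(query_vec, vec), text) for text, vec in store]
    kept = [pair for pair in scored if pair[0] >= min_similarity]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in kept[:max_memories]]


# Toy "embeddings": real ones would come from an embedding model.
store = [
    ("Always provide runnable code snippets", [1.0, 0.0, 0.0]),
    ("User prefers TypeScript", [0.6, 0.8, 0.0]),
    ("Office closes at 5pm", [0.0, 0.0, 1.0]),
]
top = retrieve([1.0, 0.0, 0.0], store, max_memories=2, min_similarity=0.3)
print(top)
```

Raising min_similarity trims marginal matches; lowering max_memories shortens the prompt even when many memories pass the threshold.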
Run with the DSPy adapter
import json
import asyncio
from superoptix.adapters.dspy_adapter import DSPyAdapter
# Or: from superoptix.adapters.observability_enhanced_dspy_adapter import ObservabilityEnhancedDSPyAdapter
with open("agent.config.json", "r") as f:
    config = json.load(f)

adapter = DSPyAdapter(config)
# adapter = ObservabilityEnhancedDSPyAdapter(config)  # for detailed tracing/debugging

async def main():
    result = await adapter.run({
        "query": "Remind me how to enable memory in SuperSpec.",
        "context": {"user_id": "alice"},  # optional context becomes part of the episode
    })
    print(result["result"])
    print("Memory stats:", result.get("memory_stats") or result.get("observability", {}).get("memory_stats"))

asyncio.run(main())
Memory Statistics and Monitoring
The adapter returns comprehensive memory statistics with each response:
# Example response structure
{
    "result": "To enable memory in SuperSpec, add the memory section...",
    "episode_id": "ep_12345",
    "memory_stats": {
        "interactions": 15,
        "short_term_items": 8,
        "long_term_items": 42,
        "active_episodes": 1
    }
}
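A response shaped like the one above can feed a simple health check. The sketch below is hypothetical: `extract_memory_stats` mirrors the top-level/observability fallback used in the run example, and `short_term_usage` assumes that short_term_items counts against the short_term_capacity setting, which the source does not confirm.

```python
def extract_memory_stats(response: dict) -> dict:
    """Return memory stats from the top level or from the observability section."""
    nested = response.get("observability", {}).get("memory_stats")
    return response.get("memory_stats") or nested or {}


def short_term_usage(stats: dict, capacity: int) -> float:
    """Fraction of short-term capacity in use (assumed relationship)."""
    return stats.get("short_term_items", 0) / capacity


response = {
    "result": "example answer",
    "episode_id": "ep_12345",
    "memory_stats": {
        "interactions": 15,
        "short_term_items": 8,
        "long_term_items": 42,
        "active_episodes": 1,
    },
}
stats = extract_memory_stats(response)
usage = short_term_usage(stats, capacity=200)  # 200 mirrors short_term_capacity
print(f"short-term usage: {usage:.0%}")
```

A service could log this ratio per request and trigger cleanup when it approaches 1.0.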
Observability-Enhanced Adapter
For production deployments, use the ObservabilityEnhancedDSPyAdapter, which provides:
- Detailed memory operation tracing
- Performance metrics for memory operations
- Debug breakpoints for memory inspection
- Integration with external observability tools (MLflow, Langfuse)
Tip: To extend observability, include an observability section in the adapter config:
"observability": {
  "debug_mode": false,
  "trace_memory": true,
  "enable_breakpoints": false
}
Configure Memory in SuperSpec (YAML)
SuperSpec is SuperOptiX’s declarative DSL. You describe your agent, and SuperOptiX compiles it to a runnable DSPy pipeline. Learn about SuperSpec at the SuperSpec page and browse the full SuperSpec documentation.
apiVersion: agent/v1
kind: AgentSpec
metadata:
  name: memory-demo
  id: memory_demo
  namespace: demo
  level: genies
spec:
  language_model:
    location: local
    provider: ollama
    model: llama3.1:8b
    api_base: http://localhost:11434
    temperature: 0.7
    max_tokens: 2048
  memory:
    enabled: true
    short_term:
      enabled: true
      max_tokens: 2000
      window_size: 10
    long_term:
      enabled: true
      storage_type: local # file | sqlite | redis
      max_entries: 500
      persistence: true
    episodic:
      enabled: true
      max_episodes: 100
    context_manager:
      enabled: true
      max_context_length: 4000
      context_strategy: sliding_window
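As a rough picture of what window_size, max_tokens, and a sliding_window strategy imply together, the sketch below keeps the newest messages subject to both limits. It is an assumption about the general technique rather than the compiled pipeline's behavior; `sliding_window` is a hypothetical helper, and whitespace-separated words stand in for real tokens.

```python
def sliding_window(messages, window_size=10, max_tokens=2000):
    """Keep the newest messages, limited by count and by a token budget."""
    if window_size <= 0:
        return []
    recent = messages[-window_size:]  # count limit: last N messages

    kept, used = [], 0
    for msg in reversed(recent):      # walk from newest to oldest
        cost = len(msg.split())       # crude token estimate: whitespace words
        if used + cost > max_tokens:
            break                     # token limit reached: drop older messages
        kept.append(msg)
        used += cost
    return list(reversed(kept))       # restore chronological order


messages = [f"message number {i}" for i in range(1, 6)]
window = sliding_window(messages, window_size=3, max_tokens=7)
print(window)
```

Here the count limit keeps three messages, and the token budget then trims the oldest of those.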
Compile and run with the Super CLI
# Ensure a local model is installed (Ollama is the default backend)
super model install llama3.1:8b
# Compile and run the agent
super agent compile memory_demo
super agent run memory_demo --goal "Show me how memory works in SuperOptiX"
For a complete overview of the SuperOptiX platform, visit the SuperOptiX website. For a deep dive into memory systems and examples, check out the Memory System Guide.
Practical Patterns and Tips
- Start with sqlite for persistence; use file for simple portability; use redis for high-throughput services.
- Use short-term memory for rolling conversation context; use long-term memory for durable knowledge with categories and tags.
- Treat episodic memory as your analytics backbone: start episodes around conversations/tasks, log events, and end with outcomes.
- Enable embeddings when you need “by-meaning” recall; leave them off to save compute when keyword search is enough.
- For long-running services, periodically call the cleanup API (for example, memory.cleanup_memory()) to keep memory lean.
- Use the observability-enhanced adapter for production deployments to monitor memory performance and debug issues.
- Configure appropriate memory retrieval limits to balance context richness with prompt efficiency.
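For intuition about what TTL-based cleanup involves, here is a minimal sketch of an expiring store. `ExpiringStore` is hypothetical and does not describe how `cleanup_memory()` works internally; the injectable clock is only there so the behavior can be demonstrated without waiting.

```python
import time


class ExpiringStore:
    """Hypothetical store where every item carries an expiry deadline."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}

    def remember(self, key, value, ttl):
        # Store the value together with the time at which it expires.
        self._items[key] = (value, self._clock() + ttl)

    def cleanup(self):
        """Delete expired items and return how many were removed."""
        now = self._clock()
        expired = [k for k, (_, deadline) in self._items.items() if deadline <= now]
        for key in expired:
            del self._items[key]
        return len(expired)

    def keys(self):
        return sorted(self._items)


now = [0.0]                                # fake clock we can advance by hand
store = ExpiringStore(clock=lambda: now[0])
store.remember("draft", "working note", ttl=60)
store.remember("preference", "User prefers TypeScript", ttl=3600)

now[0] = 61.0                              # one minute later
removed = store.cleanup()
print(removed, store.keys())
```

Running a cleanup like this on a timer keeps short-lived working notes from accumulating in a long-running service.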