What is Memory Hierarchy?
A memory hierarchy is an architectural pattern that organizes a memory system into multiple levels, or tiers, based on characteristics such as access speed, capacity, and persistence. Like computer memory hierarchies (registers, cache, RAM, disk), agent memory hierarchies typically include fast, limited-capacity working memory for immediate context, intermediate short-term memory for recent interactions, and slower but higher-capacity long-term memory for historical information and learned knowledge.
Each level in the hierarchy serves different purposes and has different performance characteristics. Working memory (often the model's context window) provides immediate access but very limited capacity. Short-term memory might use in-memory buffers or caches to store recent conversation history with fast access but limited retention. Long-term memory uses persistent storage (vector databases, knowledge graphs) with large capacity but slower retrieval, storing accumulated knowledge and experiences that persist across sessions.
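The three tiers described above can be sketched as a small data structure. This is a minimal illustration, not a reference implementation: the class name `TieredMemory`, its methods, and the default sizes are hypothetical, a plain list stands in for the context window, and a dict stands in for a persistent store such as a vector database.

```python
from collections import deque


class TieredMemory:
    """Illustrative three-tier memory; all names and sizes are assumptions."""

    def __init__(self, working_size=3, short_term_size=5):
        self.working_size = working_size
        self.working = []                                 # stand-in for the context window
        self.short_term = deque(maxlen=short_term_size)   # bounded recent-history buffer
        self.long_term = {}                               # stand-in for persistent storage

    def remember(self, key, text):
        """Record an item in every tier; the bounded tiers drop their oldest entry."""
        self.long_term[key] = text
        self.short_term.append((key, text))   # deque discards the oldest when full
        self.working.append((key, text))
        if len(self.working) > self.working_size:
            self.working.pop(0)

    def recall(self, key):
        """Search tiers from fastest to slowest; return (tier_name, text) or None."""
        for name, tier in (("working", self.working), ("short_term", self.short_term)):
            for stored_key, text in tier:
                if stored_key == key:
                    return name, text
        if key in self.long_term:
            return "long_term", self.long_term[key]
        return None


memory = TieredMemory(working_size=2, short_term_size=3)
for i in range(5):
    memory.remember(f"m{i}", f"message {i}")

print(memory.recall("m4"))  # found in working memory
print(memory.recall("m2"))  # found in short-term memory
print(memory.recall("m0"))  # found only in long-term memory
```

In a real agent, the long-term lookup would usually be a similarity search rather than an exact key match; the exact-match dict is used here only to keep the sketch self-contained.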
The hierarchy enables efficient memory management by keeping frequently accessed or recently used information in the faster levels while moving less-used information to slower, higher-capacity storage. Effective hierarchical systems implement policies for promoting important information to faster levels, demoting or consolidating old information into slower levels, and coordinating retrieval across levels. This architecture lets agents balance two competing demands: comprehensive memory coverage and responsive performance.
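One simple way to realize the promotion and demotion policies is a least-recently-used (LRU) rule between a small fast tier and a large slow tier. The sketch below is only one possible policy, chosen for illustration; the class `PromotingMemory` and its method names are hypothetical, and real systems may instead promote by importance score or consolidate items by summarizing them.

```python
from collections import OrderedDict


class PromotingMemory:
    """Two-tier store with LRU demotion and promote-on-access (illustrative only)."""

    def __init__(self, fast_capacity=2):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()   # small tier, ordered from least to most recently used
        self.slow = {}              # large tier, stand-in for slower persistent storage

    def put(self, key, value):
        """Store an item in the fast tier, demoting older items if it overflows."""
        self.fast[key] = value
        self.fast.move_to_end(key)  # mark as most recently used
        self._demote_overflow()

    def get(self, key):
        """Look in the fast tier first; promote the item if it is found in the slow tier."""
        if key in self.fast:
            self.fast.move_to_end(key)
            return self.fast[key]
        if key in self.slow:
            value = self.slow.pop(key)
            self.put(key, value)    # promotion: the item moves back to the fast tier
            return value
        return None

    def _demote_overflow(self):
        # Demotion: move least recently used items to the slow tier.
        while len(self.fast) > self.fast_capacity:
            old_key, old_value = self.fast.popitem(last=False)
            self.slow[old_key] = old_value


store = PromotingMemory(fast_capacity=2)
for key, value in [("a", 1), ("b", 2), ("c", 3)]:
    store.put(key, value)

store.get("a")            # "a" was demoted earlier, so this access promotes it
print(list(store.fast))   # keys currently in the fast tier
print(list(store.slow))   # keys currently in the slow tier
```

The fast tier never exceeds its capacity, so lookups there stay cheap, while nothing is lost: every demoted item remains retrievable from the slow tier.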