Memory Retrieval: Pattern Completion in Hippocampus and Vector Search
Recall is reconstruction, not playback
You do not retrieve a memory the way a tape recorder plays a tape. You retrieve fragments, and the brain fills in the rest. That is why memories are often confidently wrong, why a small cue (a smell, a song) can pull back a whole scene, and why two siblings remember the same dinner differently. Retrieval is constructive. It is the brain's pattern-completion system filling in the gaps from a partial cue, and modern vector search does almost exactly the same thing.
The biology
Hippocampal subregion CA3 is the pattern-completion engine. CA3 is densely interconnected (about 4% of its neurons connect to each other), forming an auto-associative network. When given a partial cue, recurrent dynamics complete the pattern by activating the rest of the original memory trace. The technical name for this is attractor dynamics: a partial input is "pulled" toward the closest stored pattern.
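The attractor idea is easy to see in a toy Hopfield-style network, the classic computational model of CA3-like auto-association. This is a sketch, not a biological simulation: patterns are stored as connection weights via Hebbian learning, and recurrent updates pull a partial cue toward the nearest stored pattern.

```python
import numpy as np

def store(patterns):
    # Hebbian outer-product learning; zero the diagonal (no self-connections)
    n = patterns.shape[1]
    w = sum(np.outer(p, p) for p in patterns) / n
    np.fill_diagonal(w, 0.0)
    return w

def complete(w, cue, steps=5):
    # Recurrent dynamics: each update pulls the state toward the
    # nearest stored attractor, filling in the missing entries
    state = cue.astype(float).copy()
    for _ in range(steps):
        h = w @ state
        state = np.where(h >= 0, 1.0, -1.0)
    return state

p1 = np.array([1] * 8 + [-1] * 8, dtype=float)
p2 = np.array([1, -1] * 8, dtype=float)
w = store(np.stack([p1, p2]))

cue = p1.copy()
cue[:4] = 0.0                       # partial cue: first quarter missing
recalled = complete(w, cue)
print(np.array_equal(recalled, p1))  # → True
```

The network recovers the full pattern from the fragment, which is the essence of what "a smell pulls back a whole scene" means computationally.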
Spreading activation in neocortical semantic networks then propagates from the activated concepts to related ones. If "hospital" gets activated, "nurse," "doctor," and "stethoscope" come along for the ride.
Critically, retrieval is reconstructive, not reproductive. Memories are blended with current knowledge, schemas, and expectations. This is why eyewitness testimony is famously unreliable. The same constructive process is also why inference works: you can answer questions you were never explicitly told the answer to, by pattern-completing across related stored knowledge.
The technology
Vector similarity search (cosine similarity, dot product) is the direct pattern-completion analog. A partial cue (the search embedding) triggers the most similar stored pattern. HNSW (Hierarchical Navigable Small World) graphs, the dominant production index, achieve over 95% recall (here meaning the fraction of true nearest neighbors returned) with roughly logarithmic search complexity, fast enough for real-time use.
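The scoring step fits in a few lines of brute-force numpy. A production system would delegate the neighbor search to an HNSW index rather than scanning every row, but the cosine arithmetic is the same:

```python
import numpy as np

def cosine_search(query, index, k=3):
    # Normalize rows and the query so a dot product equals cosine similarity
    q = query / np.linalg.norm(query)
    m = index / np.linalg.norm(index, axis=1, keepdims=True)
    scores = m @ q
    top = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in top]

# Toy "memory store": three stored embeddings
index = np.array([[1.0, 0.0, 0.0],
                  [0.7, 0.7, 0.0],
                  [0.0, 0.0, 1.0]])
cue = np.array([0.9, 0.1, 0.0])  # a partial/noisy cue
print(cosine_search(cue, index, k=2))
```

In practice you normalize the stored vectors once at index time rather than on every query.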
Hybrid retrieval has become the production standard, combining three pathways: semantic search (dense vectors capturing meaning), keyword search (BM25 capturing exact terms), and graph traversal (following entity relationships, the closest analog to spreading activation). Reciprocal Rank Fusion (Cormack et al., SIGIR 2009) elegantly combines rankings from incompatible scoring systems using rank positions alone: score(d) = sum over systems of 1 / (k + rank(d)), where k is a smoothing constant (typically 60). RRF is now baked into Elasticsearch, OpenSearch, Azure AI Search, and most production stacks.
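The RRF formula is small enough to implement directly. This sketch assumes each retriever returns a ranked list of document ids:

```python
def rrf(rankings, k=60):
    # rankings: one ranked list of doc ids per retrieval system.
    # k=60 is the constant from the original Cormack et al. paper.
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["d3", "d1", "d2"]   # dense-vector ranking
keyword  = ["d1", "d4", "d3"]   # BM25 ranking
graph    = ["d1", "d3", "d5"]   # graph-traversal ranking
print(rrf([semantic, keyword, graph]))  # → ['d1', 'd3', 'd4', 'd2', 'd5']
```

Note that d1 wins despite never being ranked first by the semantic pathway: consistent mid-rank agreement across systems beats a single top ranking, which is the whole point of fusing by rank rather than by raw score.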
Re-ranking models (Cohere Rerank, ColBERT) apply deeper relevance assessment as a second stage, paralleling how the brain applies additional evaluation to initially activated memories.
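The two-stage pattern itself is simple; all the intelligence lives in the scoring model. In this sketch, `cross_score` stands in for whatever re-ranking model you call, and the word-overlap scorer below is a toy stand-in, not a real cross-encoder:

```python
def rerank(query, candidates, cross_score, top_k=3):
    # cross_score(query, doc) -> relevance; a real system would call a
    # cross-encoder here (e.g. a Cohere Rerank or ColBERT scorer)
    return sorted(candidates,
                  key=lambda d: cross_score(query, d),
                  reverse=True)[:top_k]

# Toy stand-in scorer: word overlap. A real cross-encoder reads both
# texts jointly, which is what makes the second stage worth its latency.
def overlap(query, doc):
    return len(set(query.split()) & set(doc.split()))

docs = ["the hippocampus completes patterns",
        "vector search finds similar items",
        "patterns in the hippocampus"]
print(rerank("hippocampus patterns", docs, overlap, top_k=2))
```

The first stage over-retrieves cheaply; the second stage spends real compute only on the shortlist.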
Multi-hop retrieval, decomposing queries into sub-questions with sequential retrievals, is the closest computational analog to spreading activation. HyDE (Hypothetical Document Embeddings) generates a hypothetical answer, embeds it, then retrieves real documents matching that vector. This improves recall for underspecified queries by optimizing retrieval cues, which is a near-perfect parallel of the encoding specificity principle from cognitive psychology.
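HyDE itself is only a few lines of control flow once you have a generator, an embedder, and an index. In this sketch everything except the control flow is a toy stand-in: `generate`, `embed`, and `search` are placeholders for your LLM, embedding model, and vector index.

```python
def hyde_retrieve(question, search, generate, embed, k=5):
    # Generate a hypothetical answer, then search with ITS embedding:
    # answers live closer to answers in embedding space than questions do
    hypothetical = generate(f"Write a short passage that answers: {question}")
    return search(embed(hypothetical), k)

# Toy stand-ins to show the flow (a real system plugs in an LLM,
# an embedding model, and a vector index here):
generate = lambda prompt: "aspirin inhibits cox enzymes"
embed = lambda text: set(text.split())          # "embedding" = token set
def search(qvec, k):
    docs = ["cox enzymes are inhibited by aspirin",
            "paris is the capital of france"]
    return sorted(docs, key=lambda d: -len(qvec & set(d.split())))[:k]

print(hyde_retrieve("how does aspirin work?", search, generate, embed, k=1))
```

The underspecified question never mentions "cox" or "enzymes", but the hypothetical answer does, so the retrieval cue lands in the right neighborhood.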
The constructive nature of LLM generation also mirrors the constructive nature of human retrieval. The model blends retrieved fragments with parametric knowledge and current context, producing reconstructed rather than replayed responses. That includes the same failure mode: confabulation in humans, hallucination in models, both cases where pattern completion produces a plausible answer that is not actually accurate.
Where the gap is
Vector similarity search is production-grade with sub-millisecond queries at billion scale. Hybrid retrieval is becoming the default architecture. Multi-hop retrieval remains an active research area: getting an LLM to reliably decompose a complex question into multiple retrieval steps and combine the results is still flaky. HippoRAG (NeurIPS 2024) outperforms standard RAG by up to 20% on multi-hop QA, but there is plenty of room above that.
Practical implication: if you only have one retrieval pathway, you are leaving recall on the table. Hybrid retrieval (semantic + keyword + at least a light graph traversal) plus a re-ranker on top is the production sweet spot for most use cases, and it lines up directly with how the brain blends multiple cues to reconstruct a memory.
Series footer
← Previous: Memory Consolidation · Series anchor · Next: Memory Reconsolidation →