Semantic Memory: How Brains and Vector Databases Both Lose the Source
The "what is it" system
Quick: what is a hospital? You can probably reel off a dozen things (it has doctors, patients, hallways that smell faintly of disinfectant) without recalling a specific hospital you have visited. That is semantic memory at work. It stores general world knowledge with the original episodic context stripped off. The capital of France is Paris. A nurse works in a hospital. Water is wet. None of these come with a "where I learned this" tag, and that is the feature, not the bug.
The biology
Semantic memory is distributed across the neocortex, with the anterior temporal lobe (ATL) acting as a "semantic hub" that integrates features from all the modality-specific cortical regions (Patterson et al., 2007). When you think about a hammer, your motor cortex lights up a little (you know how to swing one), your visual cortex contributes (you know what one looks like), and your auditory cortex contributes too (you know what one sounds like). The ATL stitches it all together.
The classic spreading activation model (Collins & Loftus, 1975) treats semantic memory as a network: when you activate one concept ("doctor"), activation spreads to related concepts ("nurse," "hospital," "stethoscope"). Concepts that are close in the network are recalled faster and more easily. That is why seeing "doctor" measurably speeds up recognition of "nurse" but does next to nothing for "tractor."
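To make the idea concrete, here is a minimal sketch of spreading activation in plain Python. The toy graph, association strengths, decay factor, and depth are all invented for illustration; real models fit these from behavioral data.

```python
from collections import defaultdict

# Toy semantic network: concept -> [(related concept, association strength)].
# Nodes and weights are illustrative, not empirical values.
GRAPH = {
    "doctor":   [("nurse", 0.9), ("hospital", 0.8), ("stethoscope", 0.7)],
    "nurse":    [("doctor", 0.9), ("hospital", 0.8)],
    "hospital": [("doctor", 0.8), ("nurse", 0.8), ("hallway", 0.5)],
    "tractor":  [("farm", 0.9), ("plow", 0.8)],
}

def spread_activation(source, depth=2, decay=0.5):
    """Propagate activation outward from a source concept.

    Activation weakens with each hop (decay) and with weaker associations,
    so nearby concepts end up most active."""
    activation = defaultdict(float)
    activation[source] = 1.0
    frontier = [(source, 1.0)]
    for _ in range(depth):
        next_frontier = []
        for node, energy in frontier:
            for neighbor, strength in GRAPH.get(node, []):
                delta = energy * strength * decay
                if delta > activation[neighbor]:
                    activation[neighbor] = delta
                    next_frontier.append((neighbor, delta))
        frontier = next_frontier
    return dict(activation)

print(spread_activation("doctor"))
# "nurse" and "hospital" come out highly active; "tractor" gets nothing.
# That asymmetry is the priming effect in miniature.
```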
The Complementary Learning Systems theory (McClelland, McNaughton, & O'Reilly, 1995) explains how semantic memory comes to exist in the first place. The hippocampus rapidly encodes specific episodes. During sleep, those episodes are gradually replayed to the neocortex, which extracts the regularities across many episodes (every hospital had nurses; every hospital had hallways; therefore "hospital" implies "has nurses"). Over time, the gist becomes a slow-learned, distributed semantic representation, and the original episodic source can be forgotten without losing the knowledge.
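You can get the flavor of that extraction step with an almost embarrassingly simple sketch: treat each episode as a bag of features, keep whatever recurs across most of them, and throw the episodes away. The episodes and the threshold below are made up for illustration; CLS describes a gradual, interleaved learning process, not a one-shot filter.

```python
from collections import Counter

# Hypothetical episodic records: each visit stored with its specific context.
episodes = [
    {"place": "St. Mary's",          "features": {"nurses", "hallways", "disinfectant smell", "vending machine"}},
    {"place": "County General",      "features": {"nurses", "hallways", "disinfectant smell", "gift shop"}},
    {"place": "Children's Hospital", "features": {"nurses", "hallways", "balloons"}},
]

def consolidate(episodes, threshold=0.8):
    """Keep only the features that recur across most episodes.

    The output is the 'gist' of a hospital; the place names
    (the episodic source) are deliberately dropped."""
    counts = Counter(f for ep in episodes for f in ep["features"])
    n = len(episodes)
    return {feature for feature, c in counts.items() if c / n >= threshold}

print(consolidate(episodes))
# {'nurses', 'hallways'} -- the regularities survive, the sources do not.
```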
The technology
The mapping here is more direct than for any other memory component:
- Knowledge graphs like Neo4j store entities and relationships explicitly. This is the most direct technical analog to spreading activation networks. Graphiti and GraphRAG-style retrieval pipelines both lean on this.
- Vector databases (Pinecone, Weaviate, Qdrant, Milvus, Chroma) store knowledge as high-dimensional embeddings. Embeddings capture semantic similarity through distributed representations, which is structurally close to how the neocortex represents concepts: as patterns of activation across many neurons rather than as specific cells (see the sketch after this list).
- RAG systems connect LLMs to external semantic memory. Microsoft's GraphRAG (2024) reports up to 99% search precision for structured domains by building entity-centric knowledge graphs with community summaries.
- LLM parametric knowledge (the weights themselves) is the most direct analog to neocortical semantic storage. A trained LLM has compressed vast amounts of world knowledge into distributed weights through gradient descent, mirroring the slow learning of distributed representations described in CLS theory.
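Here is roughly what the vector-database item looks like in code, using sentence-transformers as a stand-in encoder. The model name and the cosine-similarity recall loop are illustrative choices, not a recommendation for any particular stack.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # any embedding model would do

# A common default model, chosen here only for illustration.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Semantic memory as a bag of decontextualized facts: no source attached.
facts = [
    "A nurse works in a hospital.",
    "The capital of France is Paris.",
    "Hammers are swung to drive nails.",
]
fact_vectors = model.encode(facts, normalize_embeddings=True)

def recall(query, k=2):
    """Return the k stored facts closest to the query in embedding space."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = fact_vectors @ q          # cosine similarity (vectors are normalized)
    top = np.argsort(-scores)[:k]
    return [(facts[i], float(scores[i])) for i in top]

print(recall("Who works at a hospital?"))
# The nurse fact surfaces first -- retrieved by meaning, not by where it came from.
```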
Hybrid systems are converging. Graphiti's semantic subgraph maintains entity nodes with evolving summaries and relationship edges, which is essentially a computational spreading-activation network. Mem0's extracted facts represent decontextualized semantic memories. Cognee grounds knowledge graphs in formal ontologies (OWL, SNOMED CT, FIBO) with fuzzy matching for entity-to-class mapping. These are all converging on the same underlying idea: store the pattern, lose the specific source.
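The shared pattern is easy to sketch, even though each system's actual schema differs. The EntityNode class and its concatenating observe method below are invented for illustration; in a real system the merge step would typically be an LLM call.

```python
from dataclasses import dataclass, field

@dataclass
class EntityNode:
    """A semantic-memory entity: a name, an evolving summary, and typed edges.

    An illustrative stand-in for the pattern shared by Graphiti, Mem0,
    and Cognee, not any of their actual data models."""
    name: str
    summary: str = ""
    edges: dict[str, set[str]] = field(default_factory=dict)

    def observe(self, new_fact: str, summarize=lambda old, new: f"{old} {new}".strip()):
        # In a real system `summarize` would be an LLM call that merges the
        # new fact into the existing summary; here it just concatenates.
        self.summary = summarize(self.summary, new_fact)

    def relate(self, relation: str, other: str):
        self.edges.setdefault(relation, set()).add(other)

hospital = EntityNode("hospital")
hospital.observe("Employs doctors and nurses.")
hospital.observe("Hallways tend to smell of disinfectant.")
hospital.relate("employs", "nurse")
print(hospital.summary)  # the gist accumulates; the conversations it came from are gone
```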
Where the gap is
Basic semantic memory is well-solved. The frontier is harder: automatic consolidation. In humans, semantic memory grows from episodic memory as a byproduct of replay during sleep. In AI systems, you typically have to either (a) retrain the LLM on new data, which is expensive and infrequent, or (b) keep the new knowledge in retrieval rather than parametric storage, which is fast but never gets the integration benefits of true consolidation. The pipeline that automatically extracts semantic knowledge from accumulated episodic experience and integrates it into a long-term store remains a research problem.
Dynamic ontology learning is also nascent. Cognee, Graphiti, and a few others are starting to learn schema structure from data rather than requiring it upfront, but most production systems still rely on hand-crafted schemas.
Practical implication: if you want a system that "knows things," vector search plus a knowledge graph plus an LLM will get you most of the way there. If you want a system that learns things in the human sense (gradually integrating new patterns into a unified store), you are still gluing pieces together.
Series footer
← Previous: Episodic Memory · Series anchor · Next: Procedural Memory →