Vector Store Memory
Stores information as vector embeddings in an external database and retrieves relevant memories via semantic similarity search. This is the backbone of RAG (Retrieval-Augmented Generation) — the dominant paradigm for giving LLMs access to knowledge beyond their training data and context window.
Structure
Information is chunked, embedded into vectors, and stored. At query time, the input is embedded and compared against stored vectors. The most semantically similar results are injected into the prompt alongside the user's query.
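The flow above can be sketched end to end in a few lines. This is a minimal in-memory version: the `embed` function is a toy hash-based stand-in for a real embedding model, and `VectorStore` is an illustrative class, not any particular database's API.

```python
import hashlib
import math

def embed(text, dim=64):
    """Toy embedding: hash character trigrams into a fixed-size vector.
    A real system would call an embedding model here instead."""
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        trigram = text[i : i + 3].lower()
        h = int(hashlib.md5(trigram.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]  # unit-normalized

class VectorStore:
    """Minimal in-memory store: write path = embed and store,
    read path = embed query and rank by similarity."""

    def __init__(self):
        self.entries = []  # (vector, chunk, metadata)

    def add(self, chunk, metadata=None):
        self.entries.append((embed(chunk), chunk, metadata or {}))

    def query(self, text, top_k=3):
        q = embed(text)
        # Vectors are unit-normalized, so dot product equals cosine similarity.
        scored = sorted(
            self.entries,
            key=lambda e: sum(a * b for a, b in zip(q, e[0])),
            reverse=True,
        )
        return [chunk for _, chunk, _ in scored[:top_k]]

store = VectorStore()
store.add("The billing API rate limit is 100 requests per minute.")
store.add("Our office dog is named Biscuit.")
store.add("Invoices are generated on the first day of each month.")

question = "How often can I call the billing API?"
context = store.query(question, top_k=2)
# Retrieved chunks are injected into the prompt alongside the query.
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
```

Even with the toy embedding, the rate-limit chunk outscores the unrelated ones for this query, because it shares far more character trigrams with it.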
Mechanism
- Write Path
  - Text is split into chunks (sentences, paragraphs, or semantic units)
  - Each chunk is embedded into a high-dimensional vector via an embedding model
  - Vectors are stored in a vector database with metadata (source, timestamp, tags)
  - Common databases: Pinecone, Chroma, Weaviate, Qdrant, pgvector, FAISS
- Read Path
  - The user query is embedded using the same embedding model
  - Nearest neighbors are found via cosine similarity, dot product, or L2 distance
  - The top-K most relevant chunks are retrieved
  - Retrieved chunks are injected into the LLM prompt as context
  - Optional: a reranking step to improve relevance before injection
- Lifecycle
  - Created: documents are embedded and indexed (batch or streaming)
  - Updated: new documents are added incrementally; stale entries are deleted or re-embedded
  - Persists: indefinitely, as long-term memory
  - Scales: from thousands to billions of vectors, depending on the database
  - Maintenance: changing the embedding model requires re-indexing the full corpus
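The three similarity measures named in the read path differ only in how they treat vector magnitude. Once vectors are unit-normalized, cosine similarity and dot product coincide, and L2 distance becomes a monotone function of both, so all three produce the same ranking. A quick check in plain Python:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

a = normalize([1.0, 2.0, 3.0])
b = normalize([2.0, 1.0, 0.5])

# For unit vectors: cosine == dot, and ||a - b||^2 == 2 - 2 * (a . b)
assert abs(cosine(a, b) - dot(a, b)) < 1e-9
assert abs(l2(a, b) ** 2 - (2 - 2 * dot(a, b))) < 1e-9
```

This is why many vector databases let you pick the metric per index: on normalized embeddings the choice mostly affects speed and convention, not which neighbors come back.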
Key Characteristics
- Semantic retrieval — finds relevant content by meaning, not keyword matching
- Scales massively — can index entire knowledge bases, codebases, or document corpora
- Persistent — survives across sessions, deployments, and restarts
- Retrieval quality varies — embedding model and chunking strategy critically impact results
- No reasoning over relationships — finds similar content but can't traverse connections
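Chunking strategy is one of the quality levers called out above. A common baseline is fixed-size chunks with overlap, so a sentence that straddles a chunk boundary still appears whole in at least one chunk. A minimal sketch; the size and overlap values are illustrative, and production splitters usually break on sentence or token boundaries instead of raw characters:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks, each sharing
    `overlap` characters with its neighbor."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start : start + chunk_size]
        if piece:
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "".join(chr(97 + i % 26) for i in range(500))
chunks = chunk_text(doc, chunk_size=200, overlap=50)
# Each chunk's last 50 characters reappear as the next chunk's first 50.
```

Smaller chunks give sharper retrieval targets but lose surrounding context; larger chunks keep context but dilute the embedding. Tuning this trade-off is usually worth more than swapping databases.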
When to Use
- You need to give an agent access to a large knowledge base (docs, code, policies)
- Information is too large to fit in the context window
- Queries are semantically diverse — you can't predict what will be relevant
- You're building RAG pipelines for question answering or document chat
- Long-term memory needs to persist across sessions and scale over time