Conversation Buffer
The most fundamental memory pattern. The full conversation history — every user and assistant message — is passed directly into the LLM's context window on each turn. No external storage, no transformation. The context window is the memory.
Structure
Every new message is appended to the buffer, and the entire buffer is sent to the model on each turn. When the buffer exceeds the context window, messages are either truncated from the front (a sliding window) or the request fails with a context-length error.
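The front-truncation described above can be sketched in a few lines. This is a minimal illustration, not any library's API; `truncate_front` is a hypothetical helper name:

```python
def truncate_front(messages, max_messages):
    """Sliding window: drop the oldest messages, keep the most recent."""
    return messages[-max_messages:]

# Ten messages in, but only the last four survive truncation.
history = [{"role": "user", "content": f"msg {i}"} for i in range(10)]
recent = truncate_front(history, 4)
```

Real systems typically truncate by token count rather than message count, using the model's own tokenizer to measure the buffer.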
Mechanism
Write Path
- Every user message and assistant response is appended to a message list
- No transformation or processing — raw messages are stored as-is
- Buffer grows linearly with conversation length
- No deduplication or compression

Read Path
- Entire message history is injected into the LLM prompt on each turn
- Model sees all prior context — nothing is hidden or filtered
- Retrieval is trivial: just include everything
- Most recent messages naturally have the strongest influence on the model

Lifecycle
- Created: when a conversation starts
- Updated: every turn (append new messages)
- Expires: when the conversation ends or the context window is exceeded
- Variants: sliding window (keep the last K turns), token-limited buffer (keep up to N tokens)
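The write path, read path, and sliding-window variant above fit in one small class. This is a sketch under assumed names (`ConversationBuffer`, `append`, `to_prompt` are illustrative, not a real library's interface):

```python
class ConversationBuffer:
    """Raw message list; the whole buffer is the prompt on each turn."""

    def __init__(self, max_turns=None):
        # Created when the conversation starts.
        self.messages = []
        # Sliding-window variant: keep only the last K turns if set.
        self.max_turns = max_turns

    def append(self, role, content):
        # Write path: store the raw message as-is, no transformation.
        self.messages.append({"role": role, "content": content})
        if self.max_turns is not None:
            # One turn = one user message + one assistant message.
            self.messages = self.messages[-2 * self.max_turns:]

    def to_prompt(self):
        # Read path: include everything, most recent messages last.
        return list(self.messages)


buf = ConversationBuffer(max_turns=2)
for i in range(3):
    buf.append("user", f"question {i}")
    buf.append("assistant", f"answer {i}")
# Three turns were written, but the window keeps only the last two.
```

A token-limited variant would replace the `max_turns` check with a loop that pops messages from the front until the buffer fits a token budget.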
Key Characteristics
- Zero effort — this is the default behavior of every LLM chat interface
- Perfect recall within window — the model sees every message verbatim
- Hard ceiling — bounded by context window size
- No persistence — memory dies when the session ends
- Cost scales linearly — longer conversations mean more input tokens per turn
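The cost point is worth making concrete: the buffer grows linearly, but because the whole buffer is resent every turn, *cumulative* input tokens grow quadratically with conversation length. A toy calculation, assuming a fixed token count per message:

```python
def cumulative_input_tokens(turns, tokens_per_message=50):
    """Total input tokens billed across a conversation that resends
    the full buffer (user + assistant message per turn) every turn."""
    total = 0
    buffer_len = 0
    for _ in range(turns):
        buffer_len += 2  # each turn appends one user + one assistant message
        total += buffer_len * tokens_per_message
    return total
```

Doubling the conversation length roughly quadruples total input-token spend, which is why long sessions on a plain buffer get expensive fast.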
When to Use
- Short, focused conversations that fit in the context window
- Tasks where every previous message matters (debugging sessions, code reviews)
- You're building a prototype and don't need persistence yet
- The conversation is unlikely to exceed the model's context limit
- You need the simplest possible memory implementation