Conversation Buffer

The most fundamental memory pattern. The full conversation history — every user and assistant message — is passed directly in the LLM's context window on each turn. No external storage, no transformation. The context window is the memory.


Structure

Every new message is appended to the buffer, and the entire buffer is sent to the model on each turn. When the buffer exceeds the context window, messages are either truncated from the front (a sliding window) or the request fails with a context-length error.
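A minimal sketch of this structure (all names here are hypothetical, and the ~4-characters-per-token heuristic is an assumption, not a real tokenizer):

```python
class ConversationBuffer:
    """Append-only message list with front-truncation (sliding window)."""

    def __init__(self, max_tokens=8000):
        self.messages = []          # full history, in order
        self.max_tokens = max_tokens

    def _approx_tokens(self, text):
        # Rough heuristic: ~4 characters per token (an assumption;
        # a real system would use the model's tokenizer).
        return len(text) // 4 + 1

    def _total_tokens(self):
        return sum(self._approx_tokens(m["content"]) for m in self.messages)

    def append(self, role, content):
        self.messages.append({"role": role, "content": content})
        # Sliding window: drop the oldest messages until we fit the budget.
        while self._total_tokens() > self.max_tokens and len(self.messages) > 1:
            self.messages.pop(0)

# A tiny budget makes the truncation visible:
buf = ConversationBuffer(max_tokens=50)
buf.append("user", "x" * 100)       # ~26 tokens, fits
buf.append("assistant", "y" * 100)  # pushes total over 50 -> oldest dropped
```

Note that truncation here is all-or-nothing per message; production variants often pin the system prompt so it is never evicted.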


Mechanism

  • Every user message and assistant response is appended to a message list
  • No transformation or processing — raw messages are stored as-is
  • Buffer grows linearly with conversation length
  • No deduplication or compression
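The mechanism above can be sketched as a single chat turn, assuming a hypothetical `call_model` function standing in for a real LLM API call:

```python
history = []  # the raw buffer: no transformation, no compression

def call_model(messages):
    # Stand-in for a real LLM API call; here it just reports buffer size.
    return f"(reply to {len(messages)} messages)"

def chat_turn(user_text):
    history.append({"role": "user", "content": user_text})
    reply = call_model(history)  # the entire buffer is sent, as-is
    history.append({"role": "assistant", "content": reply})
    return reply

chat_turn("hello")
chat_turn("how are you?")
# The buffer grows linearly: two messages per turn, stored verbatim.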

Key Characteristics

  • Zero effort — this is the default behavior of every LLM chat interface
  • Perfect recall within window — the model sees every message verbatim
  • Hard ceiling — bounded by context window size
  • No persistence — memory dies when the session ends
  • Cost scales linearly — longer conversations mean more input tokens per turn
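The cost point is worth making concrete. A back-of-envelope sketch (the 100-tokens-per-message figure is an assumption for illustration):

```python
tokens_per_message = 100

def input_tokens_on_turn(n):
    # On turn n the buffer holds 2*(n-1) earlier messages plus the
    # new user message: 2n - 1 messages in total.
    return (2 * n - 1) * tokens_per_message

# Per-turn input cost grows linearly with the turn number...
per_turn = [input_tokens_on_turn(n) for n in (1, 10, 50)]

# ...which means cumulative tokens sent over a whole conversation
# grow quadratically: sum of (2n - 1) over n = 1..50 is 50**2.
total = sum(input_tokens_on_turn(n) for n in range(1, 51))
```

So while each individual turn scales linearly, the total input-token bill for a 50-turn conversation is 2,500 message-sends' worth, not 100.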

When to Use

  • Short, focused conversations that fit in the context window
  • Tasks where every previous message matters (debugging sessions, code reviews)
  • You're building a prototype and don't need persistence yet
  • The conversation is unlikely to exceed the model's context limit
  • You need the simplest possible memory implementation