GitHub Copilot

GitHub's AI-powered coding assistant, built into VS Code, JetBrains IDEs, and other editors. Combines fill-in-the-middle (FIM) completion with chat, agent mode, and MCP integration. Multi-model (GPT-4o, Claude Sonnet).


Architecture


Inline Completion Pipeline

A multi-stage pipeline revealed through reverse engineering of the VS Code extension:

1. Relevant Document Selection — queries the 20 most recently accessed files of the same language.

2. Prompt Wishlist — assembles context with six element types sorted by priority: BeforeCursor, AfterCursor, SimilarFile, ImportedFile, LanguageMarker, PathMarker.

3. Token Budget Fulfillment — elements added until the token budget is exhausted.

4. Contextual Filter — logistic regression over 11 features (language type, acceptance history, cursor context) suppresses requests below a 15% confidence threshold.

5. Fill-in-the-Middle — the model receives both code before the cursor (prefix) and after (suffix), generating the middle. Default: 15% of tokens reserved for suffix.
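Steps 2–3 amount to a greedy packing problem: wishlist elements carry priorities, and the highest-priority elements are kept until the token budget runs out. A minimal sketch, with illustrative names and numbers rather than Copilot's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class PromptElement:
    kind: str       # wishlist element type, e.g. "BeforeCursor"
    priority: int   # higher-priority elements pack first
    tokens: int     # token cost of this element
    text: str

def fulfill_budget(elements: list[PromptElement], budget: int) -> list[PromptElement]:
    """Greedily keep the highest-priority elements that fit in the budget."""
    chosen, used = [], 0
    for el in sorted(elements, key=lambda e: e.priority, reverse=True):
        if used + el.tokens <= budget:
            chosen.append(el)
            used += el.tokens
    return chosen
```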

The original Copilot used a 12B parameter "cushman-ml" model, not the full 175B GPT-3.
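The fill-in-the-middle split in step 5 can be illustrated as follows. Whitespace words stand in for real tokens, and the function name is hypothetical; the only claim taken from the source is the 15% suffix reservation:

```python
def build_fim_prompt(document: str, cursor: int, budget: int = 2048,
                     suffix_frac: float = 0.15) -> dict:
    """Split the document at the cursor into a prefix/suffix pair for a
    fill-in-the-middle request, reserving 15% of the budget for the suffix."""
    prefix_words = document[:cursor].split()
    suffix_words = document[cursor:].split()
    suffix_budget = int(budget * suffix_frac)
    prefix_budget = budget - suffix_budget
    return {
        "prefix": " ".join(prefix_words[-prefix_budget:]),  # keep text nearest the cursor
        "suffix": " ".join(suffix_words[:suffix_budget]),
    }
```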


Context Gathering

| Method | Mechanism |
| --- | --- |
| Neighboring Tabs | Jaccard similarity over sliding 60-line windows from open files |
| Import Graph | When imports are added, the corresponding files are retrieved and ranked |
| Symbol References | LSP data or AST traversal finds referenced symbols |
| Vector Embeddings | Semantically similar files identified through embeddings |
| Snippet Scoring | Each snippet scored by proximity, similarity, relevance, and recency |
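The neighboring-tabs mechanism can be sketched as token-set Jaccard similarity over sliding windows of an open file. The tokenization here is simplified; Copilot's actual windowing and tokenization details may differ:

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two token sets."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

def best_window_similarity(snippet: str, neighbor_file: str, window: int = 60) -> float:
    """Slide a `window`-line window over the neighboring file and return
    the best Jaccard similarity against the snippet's token set."""
    target = set(snippet.split())
    lines = neighbor_file.splitlines()
    best = 0.0
    for i in range(max(1, len(lines) - window + 1)):
        tokens = set(" ".join(lines[i:i + window]).split())
        best = max(best, jaccard(target, tokens))
    return best
```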

Chat Architecture

Three-layer architecture per Microsoft's documentation:

1. Extension (Local) — captures prompt, checks local index, identifies relevant files, tokenizes and bundles context.

2. Proxy (Cloud) — sanitization, compliance checks, rate limiting. Forwards to backend LLM. No persistent storage of prompts or suggestions.

3. Backend LLM — generates streaming response.

One round trip per message. TLS v1.2 with ephemeral key exchange for forward secrecy.
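The three layers can be sketched as a simple request chain. All function names and the matching logic here are hypothetical, and the backend stream is stubbed; the sketch only mirrors the responsibilities described above:

```python
from typing import Iterator

def extension_stage(prompt: str, local_index: dict[str, str]) -> dict:
    """Local layer: find files sharing a word with the prompt and bundle
    them as context (a stand-in for the real local index lookup)."""
    words = set(prompt.split())
    files = [name for name, text in local_index.items() if words & set(text.split())]
    return {"prompt": prompt, "context": files}

def proxy_stage(request: dict, blocklist: set[str]) -> dict:
    """Proxy layer: sanitize the prompt; nothing is persisted."""
    kept = [w for w in request["prompt"].split() if w not in blocklist]
    return {**request, "prompt": " ".join(kept)}

def backend_llm(request: dict) -> Iterator[str]:
    """Backend layer: stream the response token by token (stubbed)."""
    yield from ("echo:", request["prompt"])
```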


Agent Mode

Introduced February 2025. Operates as an agentic loop:

  1. Determines relevant context and files autonomously
  2. Offers code changes and terminal commands
  3. Monitors correctness and iterates to fix issues

Context provided: user query, summarized workspace structure, machine context, tool descriptions. Tools include read_file, workspace search, terminal commands, and compiler/linter diagnostics.
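The loop above reduces to this sketch: a planner (the model) chooses a tool, observes the result, and stops when it judges the task complete. Tool names mirror those listed; the dispatch logic is hypothetical:

```python
def agent_loop(task: str, tools: dict, plan, max_steps: int = 10) -> list:
    """Run the agentic loop: `plan(task, observations)` returns a
    (tool_name, args) pair, or None when the task is judged complete.
    Each tool result is fed back as an observation for the next step."""
    observations = []
    for _ in range(max_steps):
        step = plan(task, observations)
        if step is None:            # model decides it is done
            break
        name, args = step
        observations.append((name, tools[name](*args)))
    return observations
```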

MCP integration (January 2026) ships a default GitHub MCP server for repository, issue, and PR management.


Telemetry

Acceptance is measured by checking whether a suggestion remains in the code at 15 s, 30 s, 2 min, 5 min, and 10 min after insertion, using word-level edit distance with a 50% threshold.
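A minimal sketch of that retention check, assuming the suggestion is compared against the code now occupying the region where it was inserted (the real measurement pipeline is more involved):

```python
def word_edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed over words rather than characters."""
    aw, bw = a.split(), b.split()
    prev = list(range(len(bw) + 1))
    for i, x in enumerate(aw, 1):
        cur = [i]
        for j, y in enumerate(bw, 1):
            cur.append(min(prev[j] + 1,              # delete a word
                           cur[j - 1] + 1,           # insert a word
                           prev[j - 1] + (x != y)))  # substitute a word
        prev = cur
    return prev[-1]

def still_retained(suggestion: str, region: str, threshold: float = 0.5) -> bool:
    """At a checkpoint, the suggestion counts as retained if the edit
    distance is within the threshold (50% of its word count)."""
    n = len(suggestion.split())
    return n > 0 and word_edit_distance(suggestion, region) / n <= threshold
```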


Patterns Used

| Pattern | How It's Used |
| --- | --- |
| RAG | Context gathering via Jaccard similarity, imports, AST, and embeddings |
| Pipeline | Extension → Proxy → LLM three-stage request flow |
| Streaming | Real-time token delivery for completions and chat |
| ReAct | Agent mode's autonomous tool-use loop |
| Tool Router | Agent selects from file read, search, and terminal tools |
| MCP | GitHub MCP server for repo/issue/PR integration |