Devin

Devin is Cognition Labs' autonomous software engineering agent. It runs in a fully isolated VM with a browser, terminal, and code editor. It separates intelligence (Brain) from execution (DevBox) and keeps a persistent knowledge base across sessions.

Architecture

Brain / Body Separation

Devin's architecture splits intelligence from execution:

Brain — the reasoning layer hosted in Cognition's Azure tenant. Each session gets an isolated container. This is where the LLM reasons and decides actions.

DevBox — the execution environment in the customer's VPC or Cognition's cloud. A full Ubuntu 24.04 VM on bare metal instances (AWS i3 or Azure Lasv3) with git, Python, Java, Docker, VSCode server, VNC, and proprietary scripts.

The Brain and the DevBox communicate over a WebSocket connection on HTTPS port 443. No customer data is stored at rest outside the customer's environment.
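To make the Brain / DevBox split concrete, here is a minimal sketch of the kind of message exchange such a design implies. The `Action` and `Observation` schemas, their field names, and the JSON framing are assumptions made for illustration. They are not Cognition's wire protocol, and no real WebSocket is opened.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Action:
    """A command the Brain sends to the DevBox (hypothetical schema)."""
    session_id: str
    tool: str      # e.g. "shell", "editor", "browser"
    payload: str


@dataclass
class Observation:
    """The result the DevBox sends back (hypothetical schema)."""
    session_id: str
    exit_code: int
    output: str


MESSAGE_TYPES = {"Action": Action, "Observation": Observation}


def encode(message) -> str:
    """Serialize a message to a JSON text frame, tagged with its type name."""
    return json.dumps({"type": type(message).__name__, **asdict(message)})


def decode(frame: str):
    """Rebuild the dataclass instance from a JSON text frame."""
    data = json.loads(frame)
    kind = data.pop("type")
    return MESSAGE_TYPES[kind](**data)


def devbox_handle(frame: str) -> str:
    """Stand-in for the DevBox: pretends to run the action and reports success."""
    action = decode(frame)
    result = Observation(action.session_id, 0, f"ran {action.tool}: {action.payload}")
    return encode(result)


# The Brain decides on an action; the DevBox executes it and replies.
reply = decode(devbox_handle(encode(Action("s1", "shell", "pytest -q"))))
print(reply.output)
```

In this sketch the reasoning side only ever sees serialized observations, which is the property the separation is meant to provide.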

Sub-Agent System

Devin uses specialized sub-agents within the agentic loop:

Agent	Role
Code Editor Agent	File manipulation and code generation
Command Line Agent	Terminal command execution
Error Handler Agent	Analyzes failures using command output and RAG memory, then triggers iterative refinement
Browser Agent	Web research via sandboxed browser (VNC)

The system operates in tight feedback loops: test failures and linter errors trigger autonomous iteration until the build passes.
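The feedback loop described above can be sketched as a small dispatcher that keeps running the build, hands failures to an error handler, and applies the suggested fix until the build passes. All handler names and the `max_iterations` limit are hypothetical. The stand-in functions simulate sub-agents rather than calling any real Devin API.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical sub-agent interface: takes an instruction, returns (ok, output).
Handler = Callable[[str], Tuple[bool, str]]


def make_flaky_shell(failures: int) -> Handler:
    """Shell stand-in whose build fails `failures` times, then passes."""
    state = {"left": failures}

    def run(cmd: str) -> Tuple[bool, str]:
        if state["left"] > 0:
            state["left"] -= 1
            return False, f"{cmd}: 1 test failed"
        return True, f"{cmd}: all tests passed"

    return run


def analyze_error(output: str) -> Tuple[bool, str]:
    """Error-handler stand-in: turns a failure message into a proposed fix."""
    return True, f"fix for [{output}]"


def apply_edit(instruction: str) -> Tuple[bool, str]:
    """Code-editor stand-in: pretends to apply the proposed fix."""
    return True, f"applied edit: {instruction}"


def run_until_green(agents: Dict[str, Handler], build_cmd: str,
                    max_iterations: int = 5) -> List[str]:
    """Run the build, and on failure analyze, edit, and retry."""
    log: List[str] = []
    for _ in range(max_iterations):
        ok, output = agents["shell"](build_cmd)
        log.append(output)
        if ok:
            return log
        _, fix = agents["error_handler"](output)
        _, applied = agents["editor"](fix)
        log.append(applied)
    raise RuntimeError("build still failing after max_iterations")


agents = {
    "shell": make_flaky_shell(failures=2),
    "error_handler": analyze_error,
    "editor": apply_edit,
}
log = run_until_green(agents, "make test")
for entry in log:
    print(entry)
```

The iteration cap is a design choice of this sketch: an unbounded retry loop would never hand control back to the user when a fix cannot be found.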

Interactive Planning

Devin 2.0 introduced a structured planning flow:

Scan — automatically scans the codebase to understand context
Plan — develops a detailed plan with relevant files and findings
Review — users can edit and approve the plan before execution
Execute — autonomous execution with dynamic replanning as needed

If the user changes direction mid-task, Devin revises its plan and continues.
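One way to picture the Scan, Plan, Review, Execute flow is as a small state machine in which execution is only reachable through an approval step. The class below is an illustrative sketch. `PlanningSession`, its methods, and the `revisions` counter are hypothetical names, not part of Devin's interface.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Phase(Enum):
    SCAN = "scan"
    PLAN = "plan"
    REVIEW = "review"
    EXECUTE = "execute"


@dataclass
class PlanningSession:
    task: str
    phase: Phase = Phase.SCAN
    findings: List[str] = field(default_factory=list)
    steps: List[str] = field(default_factory=list)
    revisions: int = 0

    def _require(self, expected: Phase) -> None:
        """Reject calls made in the wrong phase."""
        if self.phase is not expected:
            raise RuntimeError(f"expected {expected.value}, in {self.phase.value}")

    def scan(self, files: List[str]) -> None:
        """Record the relevant files found in the codebase."""
        self._require(Phase.SCAN)
        self.findings = list(files)
        self.phase = Phase.PLAN

    def plan(self, steps: List[str]) -> None:
        """Draft a plan and hand it to the user for review."""
        self._require(Phase.PLAN)
        self.steps = list(steps)
        self.phase = Phase.REVIEW

    def approve(self, edited_steps: Optional[List[str]] = None) -> None:
        """User approves the plan, optionally after editing it."""
        self._require(Phase.REVIEW)
        if edited_steps is not None:
            self.steps = list(edited_steps)
        self.phase = Phase.EXECUTE

    def replan(self, new_steps: List[str]) -> None:
        """Revise the plan mid-execution and keep executing."""
        self._require(Phase.EXECUTE)
        self.steps = list(new_steps)
        self.revisions += 1


session = PlanningSession(task="upgrade logging library")
session.scan(["app/log.py", "requirements.txt"])
session.plan(["bump version", "fix call sites"])
session.approve(edited_steps=["bump version", "fix call sites", "run tests"])
session.replan(["pin version instead", "run tests"])
print(session.phase.value, session.steps, session.revisions)
```

In this sketch a replan stays in the execute phase, which matches the description that Devin revises its plan and continues. Whether a revised plan needs fresh approval is not stated in the source.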

Memory and State

Layer	Scope	Mechanism
Knowledge Base	Cross-session	Persistent instructions — coding standards, deployment workflows, naming conventions
Session Memory	Single session	Context maintained across the agentic loop, restorable to previous states
DeepWiki	Cross-session	Auto-indexes all repos every few hours, generates wiki-style docs with architecture diagrams
VM Snapshots	Cross-session	Full VM state saved and restored for continuity

Performance degrades once a session consumes more than roughly 10 ACUs (Agent Compute Units).
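The distinction between cross-session knowledge and restorable session memory can be illustrated with a toy model. This is a sketch under assumptions: `SessionMemory`, `checkpoint`, and `restore` are hypothetical names, and real session state would include far more than a list of events.

```python
import copy
from typing import Dict, List


class SessionMemory:
    """Toy model: shared knowledge plus per-session context with checkpoints."""

    def __init__(self, knowledge: Dict[str, str]) -> None:
        self.knowledge = dict(knowledge)        # cross-session instructions
        self.context: List[str] = []            # single-session working context
        self._checkpoints: List[List[str]] = []

    def record(self, event: str) -> None:
        """Append an event to the session context."""
        self.context.append(event)

    def checkpoint(self) -> int:
        """Save a copy of the current context and return its id."""
        self._checkpoints.append(copy.deepcopy(self.context))
        return len(self._checkpoints) - 1

    def restore(self, checkpoint_id: int) -> None:
        """Roll the session context back to a saved checkpoint."""
        self.context = copy.deepcopy(self._checkpoints[checkpoint_id])


memory = SessionMemory({"naming": "use snake_case for Python modules"})
memory.record("cloned repo")
saved = memory.checkpoint()
memory.record("attempted risky refactor")
memory.restore(saved)
print(memory.context)
print(memory.knowledge["naming"])
```

Restoring a checkpoint discards only session context. The knowledge entries are untouched, mirroring the single-session versus cross-session scopes in the table.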

Sandbox Environment

Each session runs in an isolated VM:

Infrastructure: Bare metal instances for ad-hoc VM creation
Isolation: Per-session VMs prevent conflict between sessions
Security: AES-256 encryption at rest, TLS 1.3+ in transit, secrets decrypted at session start then re-encrypted
Tools: Shell, code editor, browser, Docker, external integrations (SonarQube, Veracode, Jira, Slack)

Performance

From Cognition's 2025 production review:

67% of PRs merged (up from 34% at launch)
Best suited for tasks with clear requirements that would take a junior engineer 4-8 hours
Excels at code migrations (SAS to PySpark, Angular to React, .NET Framework to .NET Core)
Security fixes: 20x efficiency gain — 1.5 minutes vs human average of 30 minutes per vulnerability
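The headline ratios follow directly from the figures quoted above. The snippet below recomputes them. The per-hour throughput numbers are derived here for illustration and are not stated in the source.

```python
# Figures quoted in the review summary above.
human_minutes_per_fix = 30.0
devin_minutes_per_fix = 1.5
merge_rate_at_launch = 0.34
merge_rate_now = 0.67

# Efficiency gain on security fixes: 30 / 1.5.
speedup = human_minutes_per_fix / devin_minutes_per_fix

# Derived throughput, assuming fixes are done back to back.
human_fixes_per_hour = 60.0 / human_minutes_per_fix
devin_fixes_per_hour = 60.0 / devin_minutes_per_fix

# Relative improvement in the share of PRs merged.
merge_rate_ratio = merge_rate_now / merge_rate_at_launch

print(f"security fix speedup: {speedup:.0f}x")
print(f"fixes per hour: {devin_fixes_per_hour:.0f} vs {human_fixes_per_hour:.0f}")
print(f"merge rate ratio: {merge_rate_ratio:.2f}x")
```

The merge rate roughly doubled (about 1.97x), while the 20x figure applies only to the per-vulnerability fix time.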

Patterns Used

Pattern	How It's Used
Plan-and-Execute	Interactive planning with user approval before execution
Hierarchical Agent	Brain delegates to specialized sub-agents (editor, shell, browser, error)
Code Execution	Full VM sandbox with terminal, Docker, and build tools
Reflection	Error handler analyzes failures and triggers iterative fixes
File-Based Memory	Knowledge base persists across sessions
Human-in-the-Loop	Plan review and approval before autonomous execution
Knowledge Graph	DeepWiki indexes repos into interconnected documentation
