Runtime.
The Feature
The Code Mode Issue
A quarterly about machines that think — and the substrates that hold them
“The agent has always been writing.”
The agent has always been writing
The industry has operated under a flawed assumption: that AI agents fundamentally work by calling tools from a menu. Cloudflare’s 2025–26 releases demonstrate this model was misguided. The actual primitive is code itself.
Large language models possess extensive training on production code but minimal exposure to bespoke tool schemas. Rather than asking models to use unfamiliar formats, platforms should meet them where they’re genuinely fluent: actual programming languages.
This shift has structural implications beyond token efficiency. When agents write executable code, that code requires a secure runtime. This enables agents to operate as persistent infrastructure rather than ephemeral tools, decoupled from individual sessions or devices.
The Code Mode Thesis
Two tools only: search and execute. A 99.9% reduction from naive implementations. A 1,000-token footprint regardless of endpoint count.
Traditional MCP servers present tools as discrete options in a catalog, consuming tokens in context for each tool definition. Cloudflare’s infrastructure — roughly 2,500 API endpoints — would require approximately 1.17 million tokens if implemented this way, exceeding every production model’s context window.
The solution: provide two tools only. A search function returns relevant OpenAPI specification slices on demand. An execute function runs JavaScript the model authors. Pagination, retries, conditional logic, and API chaining happen within code blocks rather than through narrated tool sequences.
The result: a fixed ~1,000-token footprint regardless of endpoint count. A 99.9% reduction from naive implementations. Performance improvements beyond mere efficiency — models prove more capable writing familiar TypeScript than interpreting synthetic tool schemas.
The implementation presents code in a typed JavaScript environment where tools become methods on a codemode.* namespace with auto-generated TypeScript definitions. OpenAPI $ref structures are pre-resolved. Authentication remains host-side, never embedded in model-generated code.
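A minimal sketch of what model-authored Code Mode output might look like. The codemode.* methods below are hypothetical stand-ins for the auto-generated, typed API surface, stubbed in-memory so the control flow — pagination and API chaining inside ordinary code rather than narrated tool calls — actually runs:

```typescript
// Hypothetical stand-ins for the codemode.* namespace; data is stubbed
// in-memory so the sketch is runnable anywhere.
interface Zone { id: string; name: string }
interface Page<T> { items: T[]; nextCursor?: string }

const zoneData: Zone[] = [
  { id: "z1", name: "example.com" },
  { id: "z2", name: "example.org" },
  { id: "z3", name: "example.net" },
];

const codemode = {
  zones: {
    // Cursor-based pagination, two items per page.
    async list(opts: { cursor?: string }): Promise<Page<Zone>> {
      const start = opts.cursor ? Number(opts.cursor) : 0;
      const next = start + 2;
      return {
        items: zoneData.slice(start, next),
        nextCursor: next < zoneData.length ? String(next) : undefined,
      };
    },
  },
  dns: {
    // A chained second call per result (placeholder implementation).
    async countRecords(zoneId: string): Promise<number> {
      return zoneId.length;
    },
  },
};

// The model writes ordinary code: loop over pages, chain a follow-up
// call per item -- no per-step tool-call narration in the transcript.
async function totalDnsRecords(): Promise<number> {
  let cursor: string | undefined;
  let total = 0;
  do {
    const page = await codemode.zones.list({ cursor });
    for (const zone of page.items) {
      total += await codemode.dns.countRecords(zone.id);
    }
    cursor = page.nextCursor;
  } while (cursor);
  return total;
}
```

The point is not the stubbed data but the shape: one typed namespace replaces thousands of individual tool definitions, and iteration happens in the runtime rather than in the model's context window.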
Inside the Dynamic Worker
V8 isolates start in milliseconds, consume single-digit megabytes, and carry battle-tested security posture. The difference from containers is the whole business model.
Traditional approaches run AI-generated code in containers: hundreds of milliseconds startup time, hundreds of megabytes memory footprint. This architecture becomes economically unviable at consumer scale — maintaining warm containers per user or reusing containers across users both present unacceptable tradeoffs.
Cloudflare employed V8 isolates — existing infrastructure for running untrusted code at CDN edges. These start in milliseconds, consume single-digit megabytes, and carry battle-tested security posture.
The Dynamic Worker Loader enables Workers to instantiate fresh isolates with runtime-specified code on the same physical machine. No pool coordination. No sizing decisions. Execution occurs in the same region as the request, microseconds after code generation completes.
Default Dynamic Workers provide no filesystem access, no environment variables, and blocked outbound connectivity. Code gains capabilities only through explicit host-side fetcher handlers, ensuring secrets remain outside the sandbox.
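The capability-injection pattern can be sketched as follows. Here `new Function` is only a stand-in for the isolate boundary — real isolation comes from V8 isolates and the Workers runtime, not from this technique — and the hostname allowlist and token are illustrative. What the sketch does show accurately is the shape of the contract: generated code receives no ambient authority, only a host-provided fetcher, and the secret never enters the sandboxed scope:

```typescript
// The token lives only in host scope; generated code never sees it.
const API_TOKEN = "secret-token";

// Host-side fetcher handler: enforces an allowlist and would attach
// auth headers. Stubbed (no real network call) so the sketch runs anywhere.
async function hostFetcher(url: string): Promise<string> {
  if (new URL(url).hostname !== "api.example.com") {
    throw new Error("outbound blocked: " + url);
  }
  return `ok (authorized with ${API_TOKEN.length}-char token)`;
}

// "Model-generated" code as a string. Its only capability is `fetcher`.
const generated = `
  return fetcher("https://api.example.com/v4/zones");
`;

// Stand-in for loading the code into a fresh isolate and invoking it.
async function runSandboxed(code: string): Promise<string> {
  const fn = new Function("fetcher", code);
  return fn(hostFetcher);
}
```

A request to an unlisted host rejects, mirroring the default-deny outbound posture described above.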
Project Think
An ephemeral agent is a tool. A durable agent is infrastructure. The execution ladder, sub-agents, session trees, self-authored extensions.
Contemporary coding agents operate within single sessions on local machines, requiring repeated manual setup and serving only individual users. Project Think establishes primitives for persistent, multi-tenant agent infrastructure.
The Execution Ladder
Agents operate across graduated sandbox tiers, ascending only when tasks demand:
- Workspace — Persistent filesystem with SQLite backing, surviving across sessions
- Isolate — Fast, stateless V8 execution for routine operations
- npm Runtime — Package resolution when dependencies are needed
- Browser — Full Chrome DevTools Protocol for interaction-based tasks
- Sandbox — Linux containers for legacy dependencies
Most operations remain at rung 2, the isolate. Climbing higher increases cost and security surface simultaneously.
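A toy selector illustrates the "ascend only when tasks demand" rule: pick the cheapest tier that satisfies the task's requirements and never climb further. The tier names follow the list above; the requirement flags and ordering logic are illustrative, not Project Think's actual scheduler:

```typescript
type Tier = "workspace" | "isolate" | "npm" | "browser" | "sandbox";

// Illustrative requirement flags for a task.
interface TaskNeeds {
  persistence?: boolean;    // files must outlive the session
  packages?: boolean;       // needs npm dependency resolution
  browser?: boolean;        // needs Chrome DevTools Protocol interaction
  legacyBinaries?: boolean; // needs a full Linux userland
}

// Check the most demanding requirements first; fall through to the
// fast, stateless isolate (rung 2) for routine work.
function pickTier(needs: TaskNeeds): Tier {
  if (needs.legacyBinaries) return "sandbox";
  if (needs.browser) return "browser";
  if (needs.packages) return "npm";
  if (needs.persistence) return "workspace";
  return "isolate";
}
```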
Sub-agents and Session Trees
Agents create isolated children with dedicated storage and typed RPC communication to parents. Sessions form tree structures where branches can be managed independently, with full-text searchable histories resembling version-controlled conversations.
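The tree structure can be sketched in miniature. Real Project Think sessions are backed by SQLite with typed RPC between parent and child; this in-memory version only shows the shape — branching, per-branch histories, and full-text search across the tree:

```typescript
class Session {
  children: Session[] = [];
  history: string[] = [];
  constructor(public id: string, public parent?: Session) {}

  // Branch off an independently managed child session.
  branch(id: string): Session {
    const child = new Session(id, this);
    this.children.push(child);
    return child;
  }

  log(message: string): void {
    this.history.push(message);
  }

  // Naive full-text search over this session and all descendants.
  search(term: string): { session: string; message: string }[] {
    const hits = this.history
      .filter((m) => m.includes(term))
      .map((m) => ({ session: this.id, message: m }));
    for (const child of this.children) hits.push(...child.search(term));
    return hits;
  }
}
```

Searching from the root surfaces matches from any branch, tagged with the branch they came from — the "version-controlled conversation" feel described above.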
Self-authored Extensions
Agents can generate custom tools at runtime, discovering needs and extending capabilities dynamically. New tools inherit sandbox restrictions, providing technical safeguards though philosophical questions remain.
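One way the inheritance of sandbox restrictions could work, sketched with an illustrative registry: every tool an agent registers at runtime is wrapped so invocations pass through the same policy guard the parent already enforces, meaning self-authored tools cannot widen their own permissions. The names and the guard predicate are assumptions, not Project Think's API:

```typescript
type Tool = (input: string) => string;

class ToolRegistry {
  private tools = new Map<string, Tool>();
  // The guard is fixed at construction; later registrations inherit it.
  constructor(private allowed: (input: string) => boolean) {}

  register(name: string, tool: Tool): void {
    // Wrap the agent-generated tool so the sandbox policy always applies.
    this.tools.set(name, (input) => {
      if (!this.allowed(input)) throw new Error(`policy violation in ${name}`);
      return tool(input);
    });
  }

  invoke(name: string, input: string): string {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`unknown tool: ${name}`);
    return tool(input);
  }
}
```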
Memory as Infrastructure
Agent knowledge accumulation typically disappears after sessions end. Memory profiles make it durable, shareable, and propagatable.
What an agent learns typically vanishes when its session ends. Agent Memory preserves this knowledge as durable infrastructure, accessible across sessions, agents, and team members.
Memory profiles function as named containers attachable to agents but optionally shared. Development teams can propagate institutional knowledge: conventions discovered by one agent become available to colleagues’ agents. Code review bots and coding agents share memory so feedback shapes future generation.
Core Operations
- Ingest: Bulk path for context compaction, converting conversations to memories
- Remember: Spot storage of important information during operation
- Recall: Retrieval scoped to current query requirements without conversation blocking
- List & forget: Human oversight mechanisms for memory management
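The four operations above can be sketched as an in-memory profile. Real Agent Memory is durable, shared infrastructure; the method names follow the list, while the storage and the naive keyword scoring in recall are illustrative:

```typescript
interface Memory { id: number; text: string }

class MemoryProfile {
  private memories: Memory[] = [];
  private nextId = 1;

  // Remember: spot storage of one important fact.
  remember(text: string): number {
    const id = this.nextId++;
    this.memories.push({ id, text });
    return id;
  }

  // Ingest: bulk path, e.g. compacting a conversation into memories.
  ingest(conversation: string[]): number[] {
    return conversation.map((line) => this.remember(line));
  }

  // Recall: retrieval scoped to the query (naive keyword overlap).
  recall(query: string, limit = 3): string[] {
    const terms = query.toLowerCase().split(/\s+/);
    return this.memories
      .map((m) => ({
        m,
        score: terms.filter((t) => m.text.toLowerCase().includes(t)).length,
      }))
      .filter((s) => s.score > 0)
      .sort((a, b) => b.score - a.score)
      .slice(0, limit)
      .map((s) => s.m.text);
  }

  // List & forget: human oversight over what the agent retains.
  list(): Memory[] { return [...this.memories]; }
  forget(id: number): void {
    this.memories = this.memories.filter((m) => m.id !== id);
  }
}
```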
Internal Cloudflare usage demonstrates the benefit: agents learn when previous comments were misguided or when flagged patterns had legitimate justification, resulting in quieter, more precise future feedback.
Field Guide
Code Mode — Architectural pattern providing a single “write code” tool with a typed API rather than individual tool menus.
Durable Object — Stateful primitive: single-threaded, globally addressable object with persistent storage backing each Agent.
Dynamic Worker — Runtime-instantiated Worker executing model-specified code in fresh isolates.
Execution Ladder — Project Think’s graduated sandbox hierarchy climbed only when necessary.
Memory Profile — Named container of agent memories, shareable across agents and humans.
Project Think — Preview-stage package for long-running, durable agent infrastructure.
Sub-agent — Isolated child agent with dedicated SQLite and typed parent communication.
Workspace — Persistent filesystem surviving across sessions and machines.