Introducing SuperCodeMode: Optimize Code Mode with GEPA. Run Anywhere.

There is a real shift happening in how teams build AI agents on top of MCP (Model Context Protocol). For a while, the default pattern was simple: expose lots of tools, pass them to the model, and let the agent figure it out. That works when the tool surface is small. It starts breaking down when APIs grow, workflows become multi-step, and tool metadata begins consuming too much context.

That is one reason Code Mode is getting so much attention. And with the emergence of Code Mode MCP implementations, the pattern is becoming much more practical for real MCP workflows: keep the tool surface small, let the model discover capabilities, generate code for orchestration, run that code in a sandbox, and return only what is needed. But after working with Code Mode across different MCP setups, one thing became very clear: the server is only half the story.

The quality of outcomes depends heavily on the client-side behavior layer:

  • How the system prompt is written
  • How the Code Mode strategy is described
  • How tools are named or aliased
  • How tool descriptions are phrased
  • How the agent is guided to discover first, then execute

Two teams can use the same MCP server and still get very different results based on this layer alone. That is exactly why we built SuperCodeMode.

What Is SuperCodeMode?

SuperCodeMode is a GEPA-powered, backend-agnostic toolkit for optimizing Code Mode-style client behavior in MCP workflows.

In one line: Optimize Code Mode with GEPA. Run anywhere.

SuperCodeMode is built to optimize the parts of the client that materially affect agent behavior and reliability, including:

  • system_prompt
  • codemode_description
  • tool_alias_map
  • tool_description_overrides

These look like “just text fields,” but in practice they influence whether the model discovers the right tools, chooses the right execution path, writes stable code, and returns useful results. SuperCodeMode uses GEPA to optimize that behavior systematically instead of relying on one-off prompt tweaks.
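To make the behavior layer concrete, here is a minimal sketch of what those four fields might hold. The field names come from the list above; the surrounding dictionary structure and the example values are illustrative, not SuperCodeMode's actual API.

```python
# Hypothetical sketch of a client-side behavior configuration.
# The four field names match the list above; the values and the
# dict structure are illustrative, not SuperCodeMode's actual API.
client_behavior = {
    "system_prompt": (
        "You are a coding agent. Discover available capabilities "
        "before executing anything."
    ),
    "codemode_description": (
        "Prefer writing one script that composes discovered tools "
        "over issuing many individual tool calls."
    ),
    "tool_alias_map": {
        # Map a verbose server-side tool name to a stable, readable alias.
        "crm.contacts.search_v2": "search_contacts",
    },
    "tool_description_overrides": {
        "search_contacts": "Find contacts by name or email.",
    },
}
```

Each of these is "just text" to the server, which is exactly why two clients against the same server can behave so differently.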

Why We Built It

Code Mode is a strong pattern, and we want to build on it, because it solves a real engineering problem: tool-context overhead. When large APIs are exposed as hundreds or thousands of individual tools, agent performance suffers in predictable ways:

  • Too much context spent on tool metadata
  • Noisy or brittle tool selection
  • Harder multi-step composition
  • Poor scaling as the API surface grows

Code Mode improves this by shifting from “call one tool at a time” toward a code-first orchestration pattern:

  • Discover capabilities dynamically
  • Generate code for multi-step workflows
  • Execute in a controlled sandbox
  • Return only the relevant result
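The four steps above can be sketched as a single loop. The callables here are placeholders for real discovery, code-generation, sandbox, and summarization components; this shows the shape of the flow, not an actual API.

```python
# Sketch of the code-first orchestration loop described above. The four
# callables are placeholders for real discovery, codegen, sandbox, and
# summarization components; this is the shape of the flow, not an API.
def run_code_mode(task, discover, generate, execute, summarize):
    capabilities = discover()            # 1. discover capabilities dynamically
    code = generate(task, capabilities)  # 2. generate code for the workflow
    raw_output = execute(code)           # 3. run in a controlled sandbox
    return summarize(raw_output)         # 4. return only the relevant result
```

The key property is the last step: the agent's context only ever sees the summarized result, not every intermediate tool payload.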

That is a major improvement. But once you adopt Code Mode, the next challenge shows up quickly: how do you optimize Code Mode behavior without getting locked into one backend or runtime provider? SuperCodeMode is our answer to that problem.

GEPA-First, Backend-Agnostic by Design

SuperCodeMode is intentionally built around a clean separation of concerns and focused on GEPA.

  • GEPA handles optimization
  • Runners and backends handle execution
  • Users choose the runtime that fits their environment

This design keeps the optimization workflow stable while allowing runtime execution to vary. In practice, that means you can optimize once and run across different environments without rewriting your whole stack.
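One way to picture this separation: optimized artifacts are plain data, and any backend that satisfies a small runner interface can execute with them. The `Runner` protocol and the two stub backends below are hypothetical, not SuperCodeMode's actual class hierarchy.

```python
# Hypothetical illustration of "optimize once, run anywhere": any backend
# satisfying a small Runner interface can execute the generated code.
# These classes are stand-ins, not SuperCodeMode's actual runners.
from typing import Protocol


class Runner(Protocol):
    def execute(self, code: str) -> str: ...


class LocalRunner:
    """Stand-in for a local (stdio) execution backend."""

    def execute(self, code: str) -> str:
        return f"local:{code}"


class HttpRunner:
    """Stand-in for a remote (streamable HTTP) execution backend."""

    def execute(self, code: str) -> str:
        return f"http:{code}"


def run_with(runner: Runner, code: str) -> str:
    # The same optimized prompts and aliases can drive either backend.
    return runner.execute(code)
```

Because the optimization loop never touches backend internals, swapping `LocalRunner` for `HttpRunner` does not invalidate the optimized configuration.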

SuperCodeMode can be used with:

  • Cloudflare MCP (streamable HTTP)
  • Local MCP servers (stdio)
  • Docker-backed execution
  • Monty (Python-native sandbox path)
  • Custom/internal bridges

This matters because teams have different constraints. Some want a managed remote path. Some need fast local iteration. Some need stricter sandboxing. Some cannot depend on Docker. Some already have internal infrastructure. SuperCodeMode does not force a single answer.

What Makes SuperCodeMode Different

There are now multiple interesting Code Mode implementations in the ecosystem, and that is a good thing. SuperCodeMode is not trying to replace those implementations. Instead, it focuses on a gap that becomes important very quickly in production and research workflows:
optimization quality, benchmarking, and observability for Code Mode client behavior.

1) GEPA-First Optimization (Not Just Execution)

Many projects show how to run Code Mode. SuperCodeMode helps you improve it over time using GEPA. That means you can evaluate candidate prompt and routing configurations, capture traces, and optimize behavior with a repeatable loop.
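The outer shape of that repeatable loop is simple to sketch. This is only the evaluate-and-select skeleton a GEPA-style optimizer drives, not GEPA itself, which additionally evolves candidates through reflective mutation.

```python
# Generic sketch of the evaluate-and-select loop a GEPA-style optimizer
# drives over candidate prompt/routing configurations. This is only the
# outer skeleton, not GEPA's actual algorithm (which also performs
# reflective candidate mutation).
def select_best(candidates, evaluate):
    """Score each candidate configuration and return the best with its score."""
    scored = [(evaluate(c), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score
```

In practice `evaluate` would run real tasks against an MCP backend and score the traces; here it is just any scoring function.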

2) Backend-Agnostic Runtime Support

SuperCodeMode is useful even if you do not control the server-side Code Mode implementation. You can optimize the client-side behavior while keeping your existing MCP server unchanged, including Cloudflare or local MCP deployments.

3) Built-in Benchmarking and Comparison

SuperCodeMode includes practical benchmarking workflows so you can compare strategies using real outcomes, not just intuition.

For example, you can compare:

  • tool_call (naive execution-first policy)
  • codemode_baseline
  • codemode_optimized

This makes it easier to answer questions like:

  • Did optimization actually improve reliability?
  • Did tool routing behavior improve?
  • Did error rates or execution quality improve?
  • How do results vary by runtime backend?
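Answering those questions comes down to aggregating per-strategy outcomes from run records. The record shape below (`strategy`, `success`) is illustrative, not SuperCodeMode's actual benchmark schema.

```python
# Hypothetical aggregation over benchmark run records. The record shape
# (strategy, success) is illustrative, not SuperCodeMode's actual schema.
from collections import defaultdict


def summarize_runs(runs):
    """Compute per-strategy success rate from a list of run records."""
    totals = defaultdict(lambda: [0, 0])  # strategy -> [successes, count]
    for run in runs:
        totals[run["strategy"]][0] += int(run["success"])
        totals[run["strategy"]][1] += 1
    return {s: ok / n for s, (ok, n) in totals.items()}


runs = [
    {"strategy": "tool_call", "success": False},
    {"strategy": "codemode_baseline", "success": True},
    {"strategy": "codemode_optimized", "success": True},
    {"strategy": "codemode_optimized", "success": True},
]
```

The same aggregation can be split by runtime backend to see whether results hold across environments.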

4) Observability-Ready by Default

SuperCodeMode includes a shared telemetry schema and supports multiple observability backends so teams can inspect behavior, compare runs, and debug failures more effectively.

Supported paths include:

  • JSONL
  • OTLP
  • LangSmith
  • Langfuse
  • MLflow
  • Logfire

We also added compact run summaries and benchmark summaries to make CI workflows and experiment
comparisons much easier to manage.

What Ships in SuperCodeMode

The first public release is intentionally focused and practical. It includes the core pieces needed to
start using and evaluating Code Mode optimization workflows immediately.

  • Python package and CLI (scm)
  • GEPA-centric Code Mode optimization flows
  • MCP runners (stdio and streamable HTTP)
  • Cloudflare MCP support
  • Local / Docker / Monty execution backends
  • Benchmarking commands and summary artifacts
  • Observability integrations and a shared telemetry schema
  • Documentation and runnable examples

The goal for this release was not to be “everything.” The goal was to build a strong foundation that is useful today and extensible tomorrow.

A Practical Example of Why This Helps

Imagine you are already using a Code Mode-compatible MCP setup, but the results are inconsistent. Maybe the model executes too early before discovering the right capability. Maybe it picks the wrong tool. Maybe the generated code is brittle. Maybe behavior changes across runtime environments.

SuperCodeMode helps you improve that system without changing the server. You can keep the same MCP backend and optimize the client behavior layer with GEPA: prompting, Code Mode guidance, aliasing, and tool-facing descriptions. This is especially useful when the server is shared, managed by another team, or outside your direct control. It also gives you a clean way to benchmark and compare behavior across different runtime paths using the
same optimization framework.

Quick Start

If you want to try SuperCodeMode immediately, the CLI is simple to start with:

pip install supercodemode
scm doctor
scm showcase --runner mcp-http
scm benchmark --runner mcp-stdio

From there, you can move into optimization runs, benchmark comparisons, and observability integrations.

Who This Is For

SuperCodeMode is a strong fit if you are:

  • Building MCP-based agent systems
  • Experimenting with Code Mode patterns
  • Using GEPA for optimization loops
  • Comparing Cloudflare and local MCP paths
  • Trying to avoid backend lock-in
  • Running evaluations and wanting better telemetry and benchmark visibility

If you only need a simple demo of a single Code Mode server, this may be more than you need. But if you
care about optimization quality, reproducibility, and portability, this becomes valuable very quickly.

Ecosystem Context

We see SuperCodeMode as part of a broader ecosystem around Code Mode and MCP, not as a closed system. If you are new to the pattern, the existing Code Mode writeups and the MCP documentation are good starting points.

SuperCodeMode’s contribution to this ecosystem is focused on GEPA-driven optimization, benchmarking, and runtime flexibility.

Project Links

Closing

Code Mode is a strong pattern for the next generation of MCP workflows. SuperCodeMode makes it practical to optimize that pattern with GEPA while keeping your runtime choices open.

Optimize Code Mode with GEPA. Run anywhere.