Code Mode in RLM Code with UTCP and Cloudflare MCP Servers

Code Mode is getting real momentum across the MCP ecosystem. Cloudflare has introduced a Code Mode MCP
path, and in parallel we can already run Code Mode-style workflows in RLM Code for controlled
experiments and benchmark-driven comparisons.

If you are new to the concept, Code Mode is a tooling pattern where an agent plans and executes coding
tasks through a structured tool contract instead of relying only on long conversational context. In
practice, this means reproducible runs, clearer tool traces, and better benchmarking discipline across
different MCP backends.

This post is a research-first setup guide for two backends: one that matches the current Code Mode
contract directly, and one remote backend that you can still evaluate safely with a compatible strategy.

Execution Contract and Backend Strategy

After publishing the RLM Code release and recording a live demo, the most common question we received was
simple: can we run one Code Mode workflow across different MCP backends and still benchmark it rigorously?

The answer is yes, with one important detail about tool contracts in the current release.

UTCP local MCP (@utcp/code-mode-mcp) with strategy=codemode
Cloudflare remote MCP (Cloudflare announcement) with strategy=tool_call

Recommended flow (TL;DR)

Connect both MCP servers and verify tool visibility with /mcp-tools.
Run UTCP jobs with strategy=codemode for native bridge compatibility.
Run Cloudflare jobs with strategy=tool_call when tool names differ from the Code Mode contract.
Use rlm bench compare and rlm bench report for side-by-side artifact review.

Quick takeaways: Use strategy=codemode with UTCP, use
strategy=tool_call with Cloudflare in this release, and benchmark both with identical prompts
and steps so your compare output stays defensible.

Demo video

YouTube: Code Mode in RLM Code

Quick strategy matrix

Backend	Strategy in this release	Why
UTCP Code Mode MCP	`codemode`	Matches current bridge contract expected by harness
Cloudflare remote MCP	`tool_call`	Tool surface can differ from current Code Mode bridge names

Configuration

Add both MCP servers to your project's rlm_config.yaml. You can cross-check config and command
conventions in the RLM Code docs.

UTCP local bridge

mcp_servers:
    utcp-codemode:
      name: utcp-codemode
      description: "Local Code Mode MCP bridge"
      enabled: true
      auto_connect: false
      timeout_seconds: 30
      retry_attempts: 3
      transport:
        type: stdio
        command: npx
        args:
          - "@utcp/code-mode-mcp"

Cloudflare remote bridge

mcp_servers:
    cloudflare-codemode:
      name: cloudflare-codemode
      description: "Cloudflare MCP via remote bridge"
      enabled: true
      auto_connect: false
      timeout_seconds: 30
      retry_attempts: 3
      transport:
        type: stdio
        command: npx
        args:
          - "mcp-remote"
          - "https://mcp.cloudflare.com/mcp"

Cloudflare note: on first connect, mcp-remote can request interactive authentication if you
are not already logged in. Complete auth once, then reconnect. Package link: mcp-remote.
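Before connecting, it can help to check that both entries have the same shape. The following is a minimal Python sketch that works on the two entries above written out as plain dicts. The names REQUIRED_KEYS and check_servers are hypothetical, and the required-key set is an assumption for illustration, not RLM Code's actual schema. Loading rlm_config.yaml itself is left out.

```python
# Hypothetical sanity check; not part of RLM Code.
# ASSUMPTION: every server entry should carry these keys.
REQUIRED_KEYS = {"name", "enabled", "transport"}


def check_servers(mcp_servers):
    """Return a list of problems; an empty list means the entries look consistent."""
    problems = []
    for key, server in mcp_servers.items():
        missing = REQUIRED_KEYS - server.keys()
        if missing:
            problems.append(f"{key}: missing {sorted(missing)}")
            continue
        if server["name"] != key:
            problems.append(f"{key}: name field is {server['name']!r}")
        transport = server["transport"]
        if transport.get("type") == "stdio" and not transport.get("command"):
            problems.append(f"{key}: stdio transport needs a command")
    return problems


# The two entries from this post, mirrored as Python dicts.
MCP_SERVERS = {
    "utcp-codemode": {
        "name": "utcp-codemode",
        "enabled": True,
        "transport": {"type": "stdio", "command": "npx",
                      "args": ["@utcp/code-mode-mcp"]},
    },
    "cloudflare-codemode": {
        "name": "cloudflare-codemode",
        "enabled": True,
        "transport": {"type": "stdio", "command": "npx",
                      "args": ["mcp-remote", "https://mcp.cloudflare.com/mcp"]},
    },
}

if __name__ == "__main__":
    for problem in check_servers(MCP_SERVERS):
        print(problem)
```

A check like this only catches typos in the config shape. /mcp-connect and /mcp-tools remain the real test of whether a server is reachable.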

Demo commands (steps=3)

UTCP with Code Mode

/mcp-connect utcp-codemode
/mcp-tools utcp-codemode
/harness run "analyze this repo, find TODO/FIXME, and create report.json" steps=3 mcp=on strategy=codemode mcp_server=utcp-codemode

Cloudflare with tool_call

/mcp-connect cloudflare-codemode
/mcp-tools cloudflare-codemode
/harness run "list available tools and run one safe read-only action, then summarize in 3 bullets" steps=3 mcp=on strategy=tool_call mcp_server=cloudflare-codemode

Research compare workflow

/rlm bench preset=generic_smoke mode=harness strategy=codemode mcp=on mcp_server=utcp-codemode limit=1 steps=3
/rlm bench preset=generic_smoke mode=harness strategy=tool_call mcp=on mcp_server=cloudflare-codemode limit=1 steps=3
/rlm bench compare candidate=latest baseline=previous
/rlm bench report candidate=latest baseline=previous format=markdown
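The two bench commands above differ only in strategy and mcp_server. A small generator makes that explicit and removes the risk of a mistyped preset or step count. This is a minimal Python sketch: bench_command and RUNS are hypothetical helper names, and the output is plain text to paste into the RLM Code prompt, not an API call.

```python
# Hypothetical helper: renders the /rlm bench lines from this post so that
# preset, limit, and steps are guaranteed identical across backends.

def bench_command(strategy, mcp_server, preset="generic_smoke", limit=1, steps=3):
    """Build one /rlm bench command string with shared, fixed parameters."""
    return (
        f"/rlm bench preset={preset} mode=harness strategy={strategy} "
        f"mcp=on mcp_server={mcp_server} limit={limit} steps={steps}"
    )


# (strategy, server) pairs taken from the strategy matrix in this post.
RUNS = [
    ("codemode", "utcp-codemode"),
    ("tool_call", "cloudflare-codemode"),
]

commands = [bench_command(strategy, server) for strategy, server in RUNS]

if __name__ == "__main__":
    for command in commands:
        print(command)
```

Only the strategy and the server vary between the generated lines, so any difference in the compare output can be attributed to the backend and strategy pairing and not to a drifted parameter.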

How Code Mode works (technical architecture)

At a high level, Code Mode in RLM Code is a harness strategy on top of MCP. The architecture has three
layers:

Harness layer: task orchestration, prompting, guardrails, telemetry.
MCP bridge contract: tools exposed to the harness.
Provider implementation: UTCP, Cloudflare, or custom server runtime.

In this release, strategy=codemode expects the bridge tools search_tools and
call_tool_chain. UTCP exposes this contract directly, so it runs natively. Cloudflare can
expose different tool names, so we use tool_call there today. Repo: SuperagenticAI/rlm-code.
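The contract rule in the paragraph above can be written as a small check. This is a minimal Python sketch, assuming you copy the tool names shown by /mcp-tools into a list by hand. The names missing_bridge_tools and choose_strategy are hypothetical and not part of RLM Code, and the harness may resolve tools differently internally.

```python
# Tool names that strategy=codemode expects in this release (from this post).
CODEMODE_TOOLS = frozenset({"search_tools", "call_tool_chain"})


def missing_bridge_tools(tool_names):
    """Return the Code Mode bridge tools that a server does not expose."""
    return sorted(CODEMODE_TOOLS - set(tool_names))


def choose_strategy(tool_names):
    """Pick 'codemode' only when both bridge tools are present, else 'tool_call'."""
    return "tool_call" if missing_bridge_tools(tool_names) else "codemode"


if __name__ == "__main__":
    # Hypothetical tool lists, for illustration only.
    native_bridge = ["search_tools", "call_tool_chain", "list_tools"]
    other_surface = ["search", "execute"]
    print(choose_strategy(native_bridge))
    print(choose_strategy(other_surface), missing_bridge_tools(other_surface))
```

Falling back to tool_call when any bridge tool is missing follows the recommendation in this post. A server that exposes the same capability under different names would still be classified as tool_call by this check.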

Guardrails and sandbox responsibilities

RLM Code is responsible for planner guardrails and harness-level controls. MCP providers are responsible
for their own runtime execution boundaries. For research quality and safer iterations, keep a strict sandbox
posture on the RLM side and run deterministic benchmark presets.

Why Cloudflare may show “could not resolve call_tool_chain/search_tools”

That error means the selected server does not expose the exact tool names required by the current Code
Mode strategy. It does not mean Cloudflare MCP is broken. It means the server's tool names do not match
the bridge contract in this release.

Practical fix: keep Cloudflare runs on strategy=tool_call and keep UTCP runs on
strategy=codemode until a dedicated Cloudflare Code Mode strategy is added.

Troubleshooting checklist

Confirm the active server with /mcp-tools <server-name> before launching harness runs.
Re-run Cloudflare auth if mcp-remote prompts or stalls on first connect.
Keep the same task prompt, preset, and steps across both runs to avoid noisy benchmark deltas.
Store generated reports in versioned artifacts so baseline/candidate comparisons stay reproducible.
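For the last checklist item, one possible way to version reports is to store each one under a directory named by a content hash, next to the run parameters that produced it. This is a minimal Python sketch under stated assumptions: store_report, the bench_artifacts directory, and the report.md and meta.json file names are all hypothetical, and the report text is whatever you saved from rlm bench report.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def store_report(report_text, run_meta, root="bench_artifacts"):
    """Write a report and its run parameters under a content-addressed directory."""
    meta_json = json.dumps(run_meta, indent=2, sort_keys=True)
    # Hash parameters and report together so identical runs map to one directory.
    payload = (meta_json + "\n" + report_text).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()[:12]
    run_dir = Path(root) / digest
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / "report.md").write_text(report_text, encoding="utf-8")
    (run_dir / "meta.json").write_text(meta_json, encoding="utf-8")
    return run_dir


if __name__ == "__main__":
    # Demo writes into a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        meta = {"preset": "generic_smoke", "steps": 3,
                "strategy": "codemode", "mcp_server": "utcp-codemode"}
        print(store_report("# placeholder report\n", meta, root=tmp).name)
```

Committing the resulting directory to version control keeps the baseline and the candidate traceable to the exact parameters that produced them.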

Research and benchmark possibilities

Run the same preset across multiple MCP backends to isolate tool-surface effects.
Compare strategy cost and completion behavior under fixed steps and fixed prompts.
Track regression gates over time with bench compare and bench validate.
Keep benchmark artifacts for reproducibility and paper-style reporting.
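To make the regression-gate idea concrete, here is a minimal Python sketch of a gate over two metric dictionaries. It is not how bench compare or bench validate work internally. The function regression_gate, the metric name pass_rate, and the 0.05 tolerance are hypothetical, and the numbers are made-up illustration values, not benchmark results.

```python
def regression_gate(baseline, candidate, max_drop=0.05):
    """Return metrics where the candidate fell more than max_drop below baseline.

    ASSUMPTION: every metric is 'higher is better' and on a comparable scale.
    """
    failures = {}
    for name, base_value in baseline.items():
        cand_value = candidate.get(name)
        if cand_value is None:
            failures[name] = "missing in candidate"
        elif base_value - cand_value > max_drop:
            failures[name] = f"dropped by {base_value - cand_value:.3f}"
    return failures


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    baseline = {"pass_rate": 0.80}
    candidate = {"pass_rate": 0.60}
    failures = regression_gate(baseline, candidate)
    print("gate failed" if failures else "gate passed", failures)
```

An explicit tolerance keeps small run-to-run noise from failing the gate, which matters when a run is limited to a few steps and a single task.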

Final takeaway

You can demonstrate both approaches in one workflow today. UTCP gives you native Code Mode in the current
RLM release. Cloudflare gives you a strong remote MCP path with tool_call. Together they form a
practical benchmark matrix for real research and release decisions.


Superagentic AI Blog
