Local-first · CLI + MCP + Observability as first-class peers

The AI gateway
your agents can operate.

Switchmaxxer is an open-source, local-first LLM reverse-proxy where the CLI, MCP, and observability surfaces are first-class peers — designed from day one to be operated by agents, not just by humans.

Switchmaxxer routes LLM traffic like a railway switchboard — one operator, multiple lines
Quick start
# point any OpenAI SDK at Switchmaxxer
from openai import OpenAI
import os

client = OpenAI(
    base_url="http://127.0.0.1:4080/v1",
    api_key=os.environ["SWITCHMAXXER_INBOUND_API_KEY"],
)

response = client.chat.completions.create(
    model="claude-sonnet-4-6",   # named route, not a provider model
    messages=[{"role": "user", "content": "Hello"}]
)
CLI
Human operators, with audit ledger
MCP
Agent operators, with audit ledger
Observability
Traces, benches & optimize runs in-process
100
% local — no telemetry, no cloud
Ledger
Every control-plane mutation, persisted

One proxy. Every provider.

Clients talk to Switchmaxxer. Switchmaxxer resolves named routes from the catalog, applies runtime policy from configuration, translates Anthropic / OpenAI message dialects on the fly if needed, and tracks every observation in a local data store off the hot path.

Switchmaxxer architecture: client applications on the left connect through the local gateway to OpenAI, Anthropic, Google Gemini, OpenRouter, and other providers on the right

Inbound Layer

Two dialects on one port: /v1/chat/completions (OpenAI) and /anthropic/v1/messages (Anthropic). Any compatible SDK connects without modification.
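The second path can be exercised with nothing but the standard library. This sketch builds (without sending) the request an Anthropic-compatible client would POST to the gateway; the endpoint path comes from this page, while the body fields follow the public Anthropic Messages API shape and the route name is the example used elsewhere here:

```python
import json
import urllib.request

def build_messages_request(base="http://127.0.0.1:4080", route="claude-sonnet-4-6"):
    """Build (but do not send) an Anthropic-dialect request for the gateway."""
    body = {
        "model": route,  # a named route, resolved by the gateway, not a provider model
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Hello"}],
    }
    return urllib.request.Request(
        f"{base}/anthropic/v1/messages",
        data=json.dumps(body).encode(),
        headers={"content-type": "application/json"},
        method="POST",
    )

req = build_messages_request()
# with a running gateway: urllib.request.urlopen(req)
```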

Routing Layer

Named routes live in catalog.json alongside service_providers and models. Per-route api_mode controls outbound dialect; timeout_ms and streaming limits are policy, not request flags.
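The routing model above can be sketched as a catalog fragment. The field names (service_providers, models, api_mode, timeout_ms, api_key_env) come from this page; the overall shape is an illustrative assumption, not the actual schema:

```json
{
  "service_providers": [
    { "name": "anthropic", "api_key_env": "SWITCHMAXXER_ANTHROPIC_API_KEY" }
  ],
  "models": [
    { "name": "claude-sonnet-4-6", "service_provider": "anthropic" }
  ],
  "routes": [
    {
      "name": "claude-sonnet-4-6",
      "model": "claude-sonnet-4-6",
      "api_mode": "anthropic",
      "timeout_ms": 60000
    }
  ]
}
```

Clients hold only the route name; rebinding it to another model or provider is a catalog edit, not a client change.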

Upstream Layer

Auth injection from api_key_env (or owner-only secrets.json). DNS-pinned endpoints, screened against private and loopback addresses unless explicitly opted in per provider.

Observability Layer

One in-process observability store powers logs, observations, traces, benchmarks, optimize runs, and the Control Plane Audit Ledger. Off the hot path, queryable from CLI and MCP alike.

Operators and agents share one surface.

Every operator capability is an agent capability. Not a subset. Not a sandbox. The same surface, with capability tiers you grant explicitly.

CLI

Drive the gateway directly: gateway run, config validate, routes, trace, bench, optimize, ledger. Stable exit codes and --json on every surface that matters.

MCP

The same surface, agent-fitted. Capability tiers — read, mutation, privileged — are granted explicitly in configuration. Your agent operates Switchmaxxer within the trust envelope you set.

Observability

One store, one query layer, one mental model. Traces, benchmarks, optimize decisions, and the Control Plane Audit Ledger live in the same observability database — readable from CLI and MCP.

Agents can finally optimize their own routes.

Point two routes at the same model — claude-sonnet-4-6 direct and openrouter-claude-sonnet-4-6 through OpenRouter — and let your agent ask: "which one is fastest right now?"

1. Bench — controlled, fresh, persisted

Run measured benchmarks against any route, any path (gateway / direct / both), any prompt. No stale data. No mystery provenance. Every run lands in the local store.

switchmaxxer bench --routes <a>,<b> --path both

2. Optimize — cost or latency, with receipts

Score routes by cost against catalog rate cards and a reference token workload, or by latency through the benchmark runtime. Recommendations persist with the run that backed them.

switchmaxxer optimize --model <m> --objective latency
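The cost objective in step 2 reduces to a rate card times a reference token workload. A minimal sketch of that scoring; the rate-card field names and prices here are made-up illustrations, not catalog values:

```python
def score_cost(rate_card: dict, workload: dict) -> float:
    """Cost of the reference workload given a per-million-token rate card (USD)."""
    return (workload["input_tokens"] * rate_card["input_per_mtok"]
            + workload["output_tokens"] * rate_card["output_per_mtok"]) / 1_000_000

# hypothetical reference workload and rate cards for two routes to the same model
workload = {"input_tokens": 800_000, "output_tokens": 200_000}
routes = {
    "claude-sonnet-4-6": {"input_per_mtok": 3.0, "output_per_mtok": 15.0},
    "openrouter-claude-sonnet-4-6": {"input_per_mtok": 3.3, "output_per_mtok": 15.0},
}

# lowest projected cost wins the recommendation
best = min(routes, key=lambda r: score_cost(routes[r], workload))
```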

3. Apply — safely, with rollback

optimize apply writes a pre-apply catalog snapshot before mutating. optimize restore rolls back if something goes sideways. Privileged MCP clients can do both.

switchmaxxer optimize apply <run-id> --reload --verify
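The snapshot-then-mutate pattern behind optimize apply / optimize restore can be illustrated in a few lines. This is the general pattern, not Switchmaxxer's implementation; file names and the mutate callback are assumptions:

```python
import json
import pathlib
import shutil
import tempfile

def apply_with_snapshot(catalog: pathlib.Path, mutate) -> pathlib.Path:
    """Write a pre-apply snapshot of the catalog, then mutate it in place."""
    snapshot = catalog.with_suffix(".pre-apply.json")
    shutil.copy2(catalog, snapshot)  # snapshot lands on disk before any mutation
    data = json.loads(catalog.read_text())
    catalog.write_text(json.dumps(mutate(data), indent=2))
    return snapshot

def restore_snapshot(catalog: pathlib.Path, snapshot: pathlib.Path) -> None:
    """Roll the catalog back to its pre-apply state."""
    shutil.copy2(snapshot, catalog)

# demo on a throwaway file
catalog = pathlib.Path(tempfile.mkdtemp()) / "catalog.json"
catalog.write_text(json.dumps({"routes": []}))
snap = apply_with_snapshot(catalog, lambda d: {**d, "routes": [{"name": "new"}]})
restore_snapshot(catalog, snap)
assert json.loads(catalog.read_text()) == {"routes": []}
```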

4. Audit — every mutation, ledgered

Every control-plane mutation writes to the Control Plane Audit Ledger. Inspect with ledger list / ledger show, or expose to privileged MCP clients.

switchmaxxer ledger list --json
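The ledger's append-only shape can be sketched as JSON lines; the record fields here are illustrative assumptions, not the actual ledger schema:

```python
import json
import pathlib
import tempfile
import time

def ledger_append(path: pathlib.Path, actor: str, action: str, detail: dict) -> None:
    """Append one control-plane mutation record; the file is append-only."""
    record = {"ts": time.time(), "actor": actor, "action": action, "detail": detail}
    with path.open("a") as f:
        f.write(json.dumps(record) + "\n")

ledger = pathlib.Path(tempfile.mkdtemp()) / "ledger.jsonl"
ledger_append(ledger, "cli", "routes.update", {"route": "claude-sonnet-4-6"})
ledger_append(ledger, "mcp", "optimize.apply", {"run_id": "example"})
records = [json.loads(line) for line in ledger.read_text().splitlines()]
assert [r["actor"] for r in records] == ["cli", "mcp"]
```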

Drop-in compatible.

Point existing SDK clients at Switchmaxxer. No client code changes required when switching providers or models.

openai_example.py
from openai import OpenAI
import os

# Point at Switchmaxxer instead of api.openai.com
client = OpenAI(
    base_url="http://127.0.0.1:4080/v1",
    api_key=os.environ["SWITCHMAXXER_INBOUND_API_KEY"],
)

# Use a named route — Switchmaxxer resolves the upstream provider
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize this document."}],
    stream=True,
)

for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")  # content is None on the final chunk

Works with your existing stack.

Any client that speaks the OpenAI or Anthropic wire protocol connects to Switchmaxxer without modification.

Control plane · Agent surface

Agents inspect traces, run benchmarks, request optimize recommendations, and apply or restore route changes — gated by read, mutation, privileged tiers.

Deployment

Run as a systemd --user service. Logs land in journald and surface through switchmaxxer gateway logs with normalized JSON.
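That deployment can be sketched as a minimal user unit; the unit name, working directory, and binary path are assumptions for a typical checkout:

```ini
# ~/.config/systemd/user/switchmaxxer.service
[Unit]
Description=Switchmaxxer gateway
After=network.target

[Service]
# %h expands to the user's home directory; adjust to your checkout
WorkingDirectory=%h/switchmaxxer
ExecStart=%h/switchmaxxer/switchmaxxer gateway run
Restart=on-failure

[Install]
WantedBy=default.target
```

Enable with systemctl --user enable --now switchmaxxer, then tail with journalctl --user -u switchmaxxer or switchmaxxer gateway logs.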

Observability · In-process

Node 22 node:sqlite in-process. Observations, traces, benchmarks, optimize runs, and the audit ledger all share one store.

OpenAI · Provider · Inbound dialect

Upstream provider and inbound protocol. Any OpenAI-compatible client connects to /v1/chat/completions unchanged.

Anthropic · Provider · Inbound dialect

Upstream provider and inbound protocol. Anthropic-compatible clients connect to /anthropic/v1/messages on the same port.

OpenRouter · Provider

Stand a route up against the same model through OpenRouter, then let optimize pick the best path on cost or latency.

One store. One query layer. One mental model.

Observations flow into traces, traces feed benchmark runs, benchmark runs back optimize recommendations — and every mutation is ledgered. All one observability store, all queryable from CLI and MCP.

Trace, verify, repair

Reconstruct the full lifecycle of any request. Verify trace completeness, repair gaps when something dropped mid-flight, and inspect by request id.

switchmaxxer trace show <request-id> --json

Observations, structured

Every meaningful event in a request's lifecycle is captured and queryable. Walk observations directly when you need finer grain than a trace summary.

switchmaxxer trace observations --route claude-sonnet-4-6

Named routes, swappable upstreams

Clients hold the route name; catalog.json holds the binding. Swap the upstream model or provider without touching a single client.

switchmaxxer routes update <route> --service-provider <p>

Whole-store retention

One command for the whole observability store; bench and optimize histories also have their own scoped prune, delete, and clear.

switchmaxxer prune --older-than 30d --json

Local defaults. Explicit policy.

Switchmaxxer is designed around safe defaults and explicit configuration rather than permissive assumptions.

Loopback by default

Binds to 127.0.0.1. Inbound auth is required unless allow_unauthenticated_gateway is explicitly enabled on a loopback bind — and even then, browser cross-site request signals are rejected.

Provider keys via env, not files

Resolved through api_key_env by default. Optional owner-only secrets.json for machine-specific overrides. Inline api_key values are gated behind privileged surfaces and warn during validation.

Owner-only file modes

Config, catalog, and secrets fail closed on group/world permission bits. chmod 0600 or the gateway refuses to read them. No quiet permissive fallbacks.
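The fail-closed check described above is straightforward to express; this is a sketch of the idea (Switchmaxxer itself runs on Node), not its implementation:

```python
import os
import stat
import tempfile

def assert_owner_only(path: str) -> None:
    """Fail closed when any group/world permission bit is set (the 0600 rule)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        raise PermissionError(f"{path}: mode {oct(mode)} is not owner-only; chmod 0600")

# demo: a group-readable file is rejected, an owner-only file is accepted
fd, path = tempfile.mkstemp()
os.close(fd)
os.chmod(path, 0o644)
try:
    assert_owner_only(path)
except PermissionError:
    pass  # the gateway would refuse to read this file
os.chmod(path, 0o600)
assert_owner_only(path)  # passes once owner-only
os.unlink(path)
```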

SSRF-screened upstreams

Provider endpoints are DNS-pinned and screened against private and loopback addresses unless allow_private_endpoints is opted in per provider.
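The screening step can be sketched with the standard library; this shows the address classes being rejected, not Switchmaxxer's actual resolver or pinning logic:

```python
import ipaddress
import socket

def endpoint_blocked(host: str) -> bool:
    """Resolve an endpoint and reject private, loopback, or link-local addresses."""
    for info in socket.getaddrinfo(host, None):
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            return True
    return False

assert endpoint_blocked("127.0.0.1")      # loopback
assert endpoint_blocked("192.168.1.1")    # RFC 1918 private
assert not endpoint_blocked("8.8.8.8")    # public address passes the screen
```

A real implementation also pins the resolved address for the outbound connection, so a second DNS lookup cannot swap in a private IP after the check.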

MCP capability tiers

read, mutation, privileged are granted explicitly in configuration. Default is read-only. Mutation and privileged tiers are opt-in for trusted local automation only.

Control Plane Audit Ledger

Every mutation — CLI or MCP — writes to the ledger. Inspect with ledger list / ledger show. optimize apply snapshots catalog state so optimize restore can roll it back.

Start routing in minutes.

Clone, build, copy the example config and catalog, point your SDK at 127.0.0.1:4080.

Install & start
# Requires Node.js 22+
git clone https://github.com/<your-org>/switchmaxxer.git
cd switchmaxxer
npm install
npm run build

cp configs/config.example.json config.json
cp configs/catalog.example.json catalog.json
chmod 0600 config.json catalog.json

export SWITCHMAXXER_OPENAI_API_KEY=...
export SWITCHMAXXER_INBOUND_API_KEY=...

./switchmaxxer config validate
./switchmaxxer gateway run