SLASHED — 50% off any direct API price. Period.

— The Catalogue · per 1M output tokens

Eleven models. One key.

A unified gateway across Anthropic, Google, and OpenAI — not the headline, the plumbing. One OpenAI-compatible endpoint. One canonical ID per model. Half the rack price.

Anthropic

Claude Opus 4.6

$15.00$7.50

Anthropic

Claude Sonnet 4.6

$3.00$1.50

Anthropic

Claude Haiku 4.5

$0.25$0.13

Google

Gemini 3.1 Pro

$3.50$1.75

Google

Gemini 3 Flash

$0.075$0.038

OpenAI

GPT-5.4

$5.00$2.50

OpenAI

GPT-5.3 Codex

$5.00$2.50

OpenAI

GPT-5.2

$4.00$2.00

OpenAI

GPT-5.2 Codex

$4.00$2.00

OpenAI

GPT-5.5 / Codex 5.5

$5.00$2.50

OpenAI

GPT-5.1 Codex Max

$8.00$4.00

— Migration Notes · before / after

Three changed characters. Bill drops 50%.

If your code talks to OpenAI, it already talks to SLASHED. Change the URL. Change the key prefix. Keep everything else.

DIRECT

# Paying full rack rate
from openai import OpenAI

client = OpenAI(
  base_url="https://api.openai.com/v1",
  api_key="sk-...",
)
client.chat.completions.create(
  model="gpt-5.4",
  messages=[...]
)

SLASHED

# Half the bill, same call
from openai import OpenAI

client = OpenAI(
  base_url="https://api.slashed.pro/v1",
  api_key="sl-...",
)
client.chat.completions.create(
  model="gpt-5.4",
  messages=[...]
)

Full reference at api.slashed.pro — endpoints, error codes, rate limits, every supported model ID.

— Tenets · vs other gateways · May 2026

A gateway should be invisible.

Same OpenAI-compatible spec. We just didn't break it. Six contract tenets we hold ourselves to — verifiable against the public OpenAPI spec and the live endpoint.

// COMMON ANTI-PATTERN

// SLASHED DOES THIS

01 · Tax

Hidden system preamble

Some routers silently prepend an undisclosed system preamble to every call. Billed to you. Cached on their side, paid by you forever.

01 · Zero

No hidden context

SLASHED bills only the tokens you submit and the tokens we return. No router-authored preamble, no silent system prompt, no platform context on the invoice. Ever.

02 · Ghost

Silent empty completions

Some gateways return an empty completion for unhealthy upstreams — no error, just a 200 with no content. The failure is invisible; the debug bill is yours.

02 · Live

Content is populated

Supported models return usable content or a structured error. Empty completions are treated as gateway failures, not customer-facing success. We page on it.

03 · Tiers

Duplicate-SKU pseudo-models

Same model. Two IDs. Two prices. "Tier optimisation." Translation: pay more for the same compute behind a different label.

03 · One

Canonical model IDs

Each model has one public ID and one public price. Capability controls are parameters, not duplicate SKUs with separate billing treatment.

04 · Shape

Off-spec response fields

Some gateways inconsistently move model output into undocumented fields. Your client crashes. You add a try/catch. You move on.

04 · Spec

Stable response schema

Assistant output stays in the OpenAI-compatible content field. Auxiliary fields never become undocumented substitutes for the customer-visible answer. Boring is a feature.

05 · Fog

Headline savings, no rate card

Some routers advertise eye-catching savings with no public price table. "Talk to sales for Pro." Reconciliation requires a phone call.

05 · Card

Published rate card

Every model carries a public per-token rate. Savings claims reconcile to direct list without sales calls, blended tiers, or undisclosed exceptions. The CFO can paste it into Excel.

06 · 502

Wrong HTTP semantics

Ask for a model that doesn't exist? Some gateways return 502 Bad Gateway. Your retry loop hammers them. You can't tell their bug from yours.

06 · Code

Correct HTTP semantics

Invalid model names resolve as 400 client errors. Upstream 5xx codes are reserved for actual upstream failures. Your retry, alerting, and incident logic just works.

Half the bill. All the receipts. None of the excuses.

Import your qua- key → — See your 30-day savings in 60 seconds. No card. No phone call.

— Tenets · four pillars, one promise

No discovery decks. No tier-1 maze.

— 01

Drop-in compatible

Works with Claude Code, Codex, Cursor, Factory Droid, OpenCode, raw OpenAI SDKs. Zero vendor abstractions to learn.

— 02

Encrypted, never stored

TLS 1.3 in transit. Prompts and completions live only as long as the request. No retention, no training, no leaks.

— 03

Real-time spend

Live token counter, per-key caps, per-seat ceilings, daily digests. The dashboard mirrors the API, line for line.

— 04

Operators, not tickets

Email hits engineers who read traces and ship fixes — no ticket maze, no tier-1 hand-off. Incidents get owners, timelines, postmortems. Not vibes. Not vanished threads.

— By the Numbers · live invoice

Live invoice. No surprises.

Connect any sl-, qua-, sk-, or or- key. The dashboard imports your last 30 days of usage, projects what you would have paid direct, and shows the delta in real time. CFO-shareable, board-shareable, audit-trail-clean.

Open dashboard → Read the API →

Your AI bill, slashed in half.

Drop in. Don't refactor.

Watch the bill fall.

Direct API — that's 50% off. Every model. No fake tiers.

Eleven models. One key.

Three changed characters. Bill drops 50%.

A gateway should be invisible.

No discovery decks. No tier-1 maze.

Drop-in compatible

Encrypted, never stored

Real-time spend

Operators, not tickets

Live invoice. No surprises.

Stop overpaying. Start shipping.