Smart routing, semantic caching, cost optimization, and real-time analytics. All behind a single API key. Works with 400+ models from 60+ providers.
Platform Features
A complete platform for building, routing, caching, monitoring, and optimizing AI-powered applications.
Intelligent request routing across AI providers with automatic failover, load balancing, and latency-based selection.
Exact-match and semantic caching to cut costs by up to 90% and reduce latency on repeated queries.
Extensible middleware pipeline with rate limiting, request transforms, guardrails, and custom logic.
Live dashboards with cost tracking, latency percentiles, token usage, and per-model breakdowns.
Version-controlled prompt management with A/B testing, rollback, and production deployment.
Record and replay API sessions for debugging, regression testing, and compliance auditing.
Automated cost optimization with budget alerts, model recommendations, and spend forecasting.
Live model health monitoring with latency benchmarks, availability tracking, and failover triggers.
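The exact-match and semantic caching described above can be sketched roughly like this. This is a toy illustration, not AllRoutes internals: the `embed` function is a crude stand-in for a real embedding model, and `SemanticCache` is a hypothetical name.

```python
import hashlib
import math

def embed(text: str) -> list[float]:
    # Stand-in for a real embedding model: a normalized bag-of-letters vector.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

class SemanticCache:
    def __init__(self, threshold: float = 0.95):
        self.exact = {}        # prompt hash -> cached response
        self.entries = []      # (embedding, cached response)
        self.threshold = threshold

    def get(self, prompt: str):
        key = hashlib.sha256(prompt.encode()).hexdigest()
        if key in self.exact:               # exact-match hit: skip the model call
            return self.exact[key]
        query = embed(prompt)
        for vec, response in self.entries:  # semantic hit: a similar past prompt
            if cosine(query, vec) >= self.threshold:
                return response
        return None                         # miss: caller goes to the model

    def put(self, prompt: str, response: str):
        key = hashlib.sha256(prompt.encode()).hexdigest()
        self.exact[key] = response
        self.entries.append((embed(prompt), response))

cache = SemanticCache(threshold=0.9)
cache.put("Explain quantum computing", "Quantum computing uses qubits...")
print(cache.get("Explain quantum computing") is not None)   # exact hit
print(cache.get("explain quantum computing!") is not None)  # semantic hit
```

The semantic tier is what lets rephrased queries reuse a cached answer instead of triggering a fresh (billed) completion.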
Getting Started
Go from zero to production-ready AI routing in minutes, not months.
Point your existing OpenAI SDK to AllRoutes. No code changes required beyond the base URL.
Define routing strategies, fallback chains, and quality thresholds. AllRoutes picks the best model.
Semantic caching, cost analytics, and automated recommendations reduce your spend on autopilot.
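The fallback chains from step two behave like ordered retries: if the preferred model errors out, the gateway moves down the chain until one succeeds. A toy sketch of that logic, where the hypothetical `call_model` stands in for a real provider call (and pretends `gpt-4o` is down):

```python
class ProviderError(Exception):
    pass

def call_model(model: str, prompt: str) -> str:
    # Stand-in for a real provider call; simulate gpt-4o being unavailable.
    if model == "gpt-4o":
        raise ProviderError("provider unavailable")
    return f"{model}: answer to {prompt!r}"

def route_with_fallback(prompt: str, chain: list[str]) -> str:
    last_error = None
    for model in chain:           # try each model in chain order
        try:
            return call_model(model, prompt)
        except ProviderError as err:
            last_error = err      # fall through to the next model
    raise last_error              # whole chain failed

result = route_with_fallback(
    "Explain quantum computing",
    ["gpt-4o", "claude-sonnet-4", "gemini-2.0-flash"],
)
print(result)  # claude-sonnet-4 answers after gpt-4o fails
```

In production the same idea drives automatic failover: a provider outage just means the next model in the chain serves the request.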
Developer Experience
AllRoutes is fully OpenAI-compatible. Switch your base URL and instantly access 400+ models from 60+ providers.
```python
from allroutes import AllRoutes

client = AllRoutes(api_key="allroutes_sk_...")

response = client.chat.completions.create(
    model="auto",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing"},
    ],
    routing={
        "strategy": "cost-optimized",
        "fallback": ["gpt-4o", "claude-sonnet-4", "gemini-2.0-flash"],
    },
    cache=True,
)

print(response.choices[0].message.content)
print(f"Cost: ${response.usage.cost:.4f}")
print(f"Provider: {response.provider}")
```
```typescript
import AllRoutes from "allroutes";

const client = new AllRoutes({ apiKey: "allroutes_sk_..." });

const response = await client.chat.completions.create({
  model: "auto",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain quantum computing" },
  ],
  routing: {
    strategy: "cost-optimized",
    fallback: ["gpt-4o", "claude-sonnet-4", "gemini-2.0-flash"],
  },
  cache: true,
});

console.log(response.choices[0].message.content);
console.log(`Cost: $${response.usage.cost.toFixed(4)}`);
console.log(`Provider: ${response.provider}`);
```
```bash
curl https://api.allroutes.ai/v1/chat/completions \
  -H "Authorization: Bearer allroutes_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing"}
    ],
    "routing": {
      "strategy": "cost-optimized",
      "fallback": ["gpt-4o", "claude-sonnet-4", "gemini-2.0-flash"]
    },
    "cache": true
  }'
```
Testimonials
See why hundreds of teams trust AllRoutes to power their AI infrastructure.
"AllRoutes cut our AI infrastructure costs by 42% in the first month. The semantic caching alone paid for itself."
"We switched from managing 6 different provider SDKs to one AllRoutes call. Our integration code went from 2,000 lines to 50."
"The automatic failover saved us during the OpenAI outage. Our users didn't even notice -- AllRoutes switched to Claude in under 200ms."
"PromptVault's A/B testing helped us improve response quality by 28%. We can iterate on prompts without redeploying."
"AllRoutes' smart routing finds the cheapest model that meets our quality threshold. It's like autopilot for AI spend."
"The real-time analytics dashboard gives us visibility we never had. We can see exactly where every dollar goes."
Why AllRoutes?
See how AllRoutes stacks up against going direct or using other gateways.
| Feature | AllRoutes | Direct API | OpenRouter |
|---|---|---|---|
| OpenAI-compatible API | ✓ | ✕ | ✓ |
| 400+ models | ✓ | ✕ | ✓ |
| Automatic failover | ✓ | ✕ | ~ |
| Semantic caching | ✓ | ✕ | ✕ |
| Cost-optimized routing | ✓ | ✕ | ✕ |
| Real-time analytics | ✓ | ✕ | ~ |
| Prompt management | ✓ | ✕ | ✕ |
| Session replay | ✓ | ✕ | ✕ |
| Plugin system | ✓ | ✕ | ✕ |
| Budget controls | ✓ | ✕ | ~ |
| Self-hostable | ✓ | ~ | ✕ |
| Guardrails | ✓ | ✕ | ✕ |
Pricing
Start free, scale as you grow. No hidden fees, no surprises.
Need enterprise? View all plans →
Get in Touch
Have questions? We'd love to hear from you. Send us a message and we'll respond as soon as possible.
Chat with our team in real-time. We typically respond within minutes.
[email protected] -- We'll get back to you within 24 hours.
Talk to Sales about custom plans, SLAs, and dedicated infrastructure.