Developer Documentation

Build with RouteAll.ai

One API key. 50+ frontier AI models. OpenAI-compatible. Deploy in 5 minutes.

OpenAI Compatible · Credits Never Expire · Global Low Latency

Quickstart

1. Create an account
2. Generate an API key
3. Make your first request

Python
from openai import OpenAI
client = OpenAI(api_key="your-routeall-api-key", base_url="https://api.routeall.ai/v1")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
4. Choose a subscription

Authentication

HTTP Header
Authorization: Bearer YOUR_API_KEY
⚠️ Keep your API key secret: never commit it to source control or expose it in client-side code.

Using Environment Variables

Python
import os
client = OpenAI(api_key=os.environ["ROUTEALL_API_KEY"], base_url="https://api.routeall.ai/v1")

Base URL & Endpoints

Base URL
https://api.routeall.ai/v1

Available Endpoints

POST  /v1/chat/completions
GET   /v1/models
POST  /v1/embeddings
POST  /v1/images/generations

Chat Completions

Request Parameters

Parameter    Type     Required  Description
model        string   Yes       Model ID, e.g. gpt-4o, claude-opus-4-6
messages     array    Yes       Array of role + content objects
stream       boolean  No        Enable streaming. Default: false
temperature  number   No        Randomness, 0–2. Default: 1
max_tokens   integer  No        Maximum tokens in the response
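The parameters above map one-to-one onto the JSON request body. A minimal sketch of assembling and sanity-checking that body (the `build_payload` helper is illustrative, not part of any SDK):

```python
import json

def build_payload(model, messages, stream=False, temperature=1.0, max_tokens=None):
    """Assemble a /v1/chat/completions request body; only non-default optionals are sent."""
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be in [0, 2]")
    payload = {"model": model, "messages": messages}
    if stream:
        payload["stream"] = True
    if temperature != 1.0:
        payload["temperature"] = temperature
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return json.dumps(payload)
```

Omitting optional fields that are at their defaults keeps requests small and lets the server apply its own defaults.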

Example Request

cURL
curl https://api.routeall.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello!"}]}'

Example Response

200 OK
{
  "id": "chatcmpl-abc123",
  "model": "gpt-4o",
  "choices": [{"message": {"role": "assistant", "content": "Hello! How can I help?"}, "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 10, "completion_tokens": 8, "total_tokens": 18}
}
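Because the response follows the standard OpenAI schema, the reply text and token counts come straight out of the parsed JSON. A quick sketch using the example body above:

```python
# Parsed /v1/chat/completions response (the example body above)
response = {
    "id": "chatcmpl-abc123",
    "model": "gpt-4o",
    "choices": [{"message": {"role": "assistant", "content": "Hello! How can I help?"},
                 "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 10, "completion_tokens": 8, "total_tokens": 18},
}

reply = response["choices"][0]["message"]["content"]
total_tokens = response["usage"]["total_tokens"]
print(reply)         # Hello! How can I help?
print(total_tokens)  # 18
```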

Streaming

Python
from openai import OpenAI
client = OpenAI(api_key="YOUR_KEY", base_url="https://api.routeall.ai/v1")
stream = client.chat.completions.create(
    model="claude-sonnet-4-6", stream=True,
    messages=[{"role":"user","content":"Tell me a story"}]
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
Node.js
import OpenAI from 'openai';
const client = new OpenAI({apiKey: 'YOUR_KEY', baseURL: 'https://api.routeall.ai/v1'});
const stream = await client.chat.completions.create({
  model:'gpt-4o', stream:true,
  messages:[{role:'user',content:'Tell me a story'}]
});
for await (const chunk of stream)
  process.stdout.write(chunk.choices[0]?.delta?.content||'');
cURL
curl https://api.routeall.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","stream":true,"messages":[{"role":"user","content":"Hello"}]}'

List Models

cURL
curl https://api.routeall.ai/v1/models -H "Authorization: Bearer YOUR_API_KEY"
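The /v1/models response uses the OpenAI list format (`{"object": "list", "data": [...]}`), so extracting model IDs is a one-liner. A sketch against an illustrative (not real) response:

```python
# Illustrative /v1/models response in the OpenAI list format
sample = {
    "object": "list",
    "data": [
        {"id": "gpt-4o", "object": "model"},
        {"id": "claude-sonnet-4-6", "object": "model"},
    ],
}

def model_ids(response):
    """Return the model IDs from a /v1/models response, sorted for stable display."""
    return sorted(item["id"] for item in response["data"])

print(model_ids(sample))  # ['claude-sonnet-4-6', 'gpt-4o']
```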

Python SDK

Shell
pip install openai
Python
from openai import OpenAI
import os
client = OpenAI(api_key=os.getenv("ROUTEALL_API_KEY"), base_url="https://api.routeall.ai/v1")
resp = client.chat.completions.create(model="gpt-4o", messages=[{"role":"user","content":"Hi"}])
print(resp.choices[0].message.content)

Node.js SDK

Shell
npm install openai
TypeScript
import OpenAI from 'openai';
const client = new OpenAI({apiKey: process.env.ROUTEALL_API_KEY, baseURL: 'https://api.routeall.ai/v1'});
const res = await client.chat.completions.create({model:'claude-opus-4-6', messages:[{role:'user',content:'Hello!'}]});
console.log(res.choices[0].message.content);

cURL

cURL
curl https://api.routeall.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello!"}]}'

Go

Shell
go get github.com/sashabaranov/go-openai
Go
package main

import (
	"context"
	"fmt"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	cfg := openai.DefaultConfig("YOUR_API_KEY")
	cfg.BaseURL = "https://api.routeall.ai/v1"
	client := openai.NewClientWithConfig(cfg)
	resp, err := client.CreateChatCompletion(context.Background(), openai.ChatCompletionRequest{
		Model:    "gpt-4o",
		Messages: []openai.ChatCompletionMessage{{Role: "user", Content: "Hello!"}},
	})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(resp.Choices[0].Message.Content)
}

Supported Models

OpenAI

gpt-4o
gpt-4.1
gpt-4.1-mini
gpt-5
o1
o4-mini

Anthropic Claude

claude-opus-4-6
claude-sonnet-4-6
claude-haiku-4-5-20251001
claude-opus-4-6-thinking

Google Gemini

gemini-2.5-pro
gemini-2.5-flash
gemini-3-pro-preview
gemini-2.0-flash

xAI Grok

grok-4
grok-3
grok-3-mini
grok-4-fast-reasoning

Pricing

Plan     Monthly Fee  Credits  Discount  gpt-4o Input
Default  Free         —        —         $1.875/1M
Starter  $8           $8       5% off    $1.781/1M
Builder  $35          $40      15% off   $1.594/1M
Pro ⭐   $130         $160     25% off   $1.406/1M
Scale    $450         $600     30% off   $1.313/1M
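The gpt-4o Input column is simply the Default rate ($1.875 per 1M input tokens) with the plan discount applied; the listed prices round the result to three decimals. A quick check of that arithmetic:

```python
BASE_GPT4O_INPUT = 1.875  # USD per 1M input tokens at the Default (undiscounted) rate

def discounted_rate(discount_pct):
    """Per-1M-token rate after applying a plan's percentage discount."""
    return BASE_GPT4O_INPUT * (1 - discount_pct / 100)

for plan, pct in [("Starter", 5), ("Builder", 15), ("Pro", 25), ("Scale", 30)]:
    print(f"{plan}: ${discounted_rate(pct)}/1M")
```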
♾️ Credits never expire.

Error Handling

HTTP Code  Error Type             Description
400        invalid_request_error  Bad request
401        authentication_error   Invalid or missing API key
402        payment_required       Insufficient credits
404        model_not_found        Model ID not recognized
429        rate_limit_error       Too many requests
500        server_error           Internal error; retry
503        service_unavailable    Upstream provider unavailable

Python Error Handling

Python
from openai import OpenAI, APIError, AuthenticationError, RateLimitError

client = OpenAI(api_key="YOUR_KEY", base_url="https://api.routeall.ai/v1")
try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limit hit")
except APIError as e:
    print(f"Error {e.status_code}: {e.message}")

Rate Limits

Plan     Requests / Min  Tokens / Min
Default  60              100,000
Starter  120             200,000
Builder  300             500,000
Pro      600             1,000,000
Scale    1,200           2,000,000
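When a plan's limit is exceeded the API returns 429 rate_limit_error, so clients should retry with exponential backoff. A minimal sketch of such a helper (`with_backoff` is illustrative; in real code, narrow the except clause to openai.RateLimitError):

```python
import random
import time

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn, retrying on failure with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:  # narrow this to openai.RateLimitError in real code
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))

# Usage: with_backoff(lambda: client.chat.completions.create(...))
```

The jitter spreads out retries from concurrent clients so they do not all hit the limit again at the same instant.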

FAQ

Do I need to change my existing code?
No. RouteAll.ai is OpenAI-compatible: keep your existing OpenAI SDK code and change only the base URL (https://api.routeall.ai/v1) and the API key.

Do credits expire?
No. Credits never expire.

Which region will my requests be served from?

Can I use the Anthropic or Gemini SDK directly?

Model ratio vs group ratio?

How do I get enterprise pricing?