skill
Use FastRouter's official skill.md file to give your AI coding assistants knowledge of FastRouter features
---
name: FastRouter
description: Use when routing AI requests through a unified LLM gateway, managing multiple providers, setting up virtual model aliases, configuring fallback policies, enabling BYOK (Bring Your Own Key), processing batch requests, tracking costs, enforcing guardrails, evaluating model outputs, or integrating the MCP Gateway. Also use when setting up FastRouter as a model provider in OpenClaw — triggered by phrases like "set up fastrouter", "add fastrouter provider", "configure fastrouter with API key sk-v1-xxxxx", or "update fastrouter models". Reach for this skill when building AI applications that need multi-provider support, reliability, cost optimization, multimodal capabilities, or analytics.
metadata:
docs-proj: fastrouter
version: "1.0"
---Product Summary
FastRouter.ai is an enterprise-grade LLM Gateway that acts as a control plane for routing requests across 100+ AI models from multiple providers through a single OpenAI-compatible API. It provides intelligent routing, automatic failover, cost governance, multimodal support (text, image, video, audio), and built-in observability — eliminating vendor lock-in while reducing AI spend. No setup fees, no monthly minimums, and free credits to start.
FastRouter.ai sits between your application and LLM providers (OpenAI, Anthropic, Google, xAI, Meta, Groq, Mistral, and more), handling request routing, credential management, cost tracking, error fallback, and performance optimization. Key differentiators include the Auto Router (dynamic model selection by cost/latency/quality), Virtual Model Aliases (custom model pools with policy-driven selection), per-key budget controls, batch processing, custom evaluations, guardrails, MCP Gateway, and native support for multiple API formats (OpenAI Chat Completions, OpenAI Responses, Anthropic Messages, Gemini Native).
See docs.fastrouter.ai for full documentation.
When to Use
Reach for FastRouter when:
Multi-model routing: You need to route requests across providers for cost, latency, or quality optimization
High availability: You want automatic failover and retries across providers when one goes down
Cost governance: You need per-key budgets, rate limits, model restrictions, and real-time spend tracking
Model comparison: You want side-by-side evaluation of model outputs in interactive playgrounds
Multimodal pipelines: You're working with text, image, video, or audio through one API with consistent auth and billing
Batch processing: You're running bulk operations (up to 50K requests) across OpenAI and Anthropic models
Enterprise controls: You need RBAC, BYOK, provisioning keys, project isolation, and audit logging
IDE integration: You want a drop-in replacement for OpenAI in Cursor, Cline, Claude Code, and other tools
OpenClaw setup: A user wants to add FastRouter as a provider in OpenClaw (see OpenClaw section below)
Quick Reference
Base URLs
https://api.fastrouter.ai/api/v1
Primary API gateway (LLM endpoints)
https://api.fastrouter.ai/prod/
Provisioning / admin endpoints only
https://api.fastrouter.ai/v1/models
Model catalog (for OpenClaw setup)
Authentication
All endpoints use Bearer token authentication:
Core API Endpoints
Chat Completions
POST
/api/v1/chat/completions
Text generation, function calling, structured outputs
Responses
POST
/api/v1/responses
OpenAI Responses API format
Embeddings
POST
/api/v1/embeddings
Vector embeddings
Image Generation
POST
/api/v1/images/generations
Image creation (GPT Image 1, DALL-E)
Image Edit
POST
/api/v1/images/edits
Image editing
Video Generation
POST
/api/v1/videos
Video creation (Veo 3, Sora 2, Kling, etc.)
Video Status
POST
/api/v1/getAsyncResponse
Poll video generation status
Audio Transcription
POST
/v1/audio/transcriptions
Speech-to-text (Whisper)
Audio Translation
POST
/v1/audio/translations
Audio to English text
Text-to-Audio
POST
/api/v1/chat/completions
Audio generation (ace-step model)
Audio Status
POST
/api/v1/getAsyncResponse
Poll audio generation status
List Models
GET
/api/v1/models
Full model catalog with metadata
Generations
GET
/api/v1/generation
Request generation details/stats
Moderations
POST
(see docs)
Content moderation
Model Naming Convention
Models use the provider/model-name format:
openai/gpt-5.4
anthropic/claude-4.5-sonnet
google/gemini-3.1-pro-preview
x-ai/grok-4.1-fast
perplexity/sonar-pro
Append :online to enable web search on any model: x-ai/grok-4.1-fast:online
Common Model IDs
OpenAI
openai/gpt-5.4-nano, openai/gpt-5.4, openai/gpt-5.4-mini, openai/o4-mini, openai/gpt-image-1, openai/sora-2-pro
Anthropic
anthropic/claude-opus-4.6, anthropic/claude-sonnet-4.6, anthropic/claude-haiku-4.5
google/gemini-3.1-pro-preview, google/gemini-3.1-flash-image-preview, google/veo3.1-fast, google/gemini-3.1-flash-lite-preview
xAI
x-ai/grok-4, x-ai/grok-4.20-beta
Minimax
minimax/minimax-m2.7, minimax/minimax-m2.5-highspeed
Perplexity
perplexity/sonar-pro, perplexity/sonar-reasoning-pro
Audio
ace-step/prompt-to-audio
Video
kling-ai/kling-v3, wanx/wan-v2-6, bytedance/seedance-pro
Request Parameters
FastRouter-specific parameters (pass via extra_body in OpenAI SDK):
reasoning
object
{"max_tokens": N} — enable reasoning/thinking tokens
tags
array
["tag1", "tag2"] — custom metadata tags for analytics filtering
Key Types
API Key
Standard LLM requests via Authorization: Bearer
Provisioning Key
Admin-only; creates/manages Service Account Keys
Service Account Key
Scoped access keys with per-key budgets, rate limits, model restrictions
Routing Strategy Types
Auto Router
Dynamic model selection by cost/latency/quality — no configuration needed
Virtual Model Aliases
Custom pool of models with policy-driven selection (weighted, priority, round-robin, cost-optimized)
Fallback Models
Prioritized fallback chain across providers for high availability
Provider Routing
Multi-provider selection for same model (lowest latency, lowest cost, round robin, weighted, priority)
Decision Guidance
When to Use Each Routing Strategy
No strong model preference, want best value
Auto Router
A/B testing or gradual rollout across models
Virtual Model Alias (weighted)
Critical workload needing guaranteed uptime
Fallback Models (priority chain)
Same model available on multiple providers
Provider Routing (lowest latency/cost)
Category-based tasks needing different models
Virtual Model Alias (category routing)
Cost-first, quality-second selection
Auto Router (cost-optimized mode)
When to Use BYOK vs. FastRouter Credits
Getting started / low volume
FastRouter credits (simpler)
You have existing provider contracts
BYOK — use your own rate limits and billing
Need specific model versions
BYOK — you control the provider account
Azure or AWS Bedrock integration
BYOK — required for these providers
Simplicity is priority
FastRouter credits
Cost optimization across providers
BYOK + FastRouter credits as fallback
When to Use Batch vs. Real-time
Dataset processing (10K+ rows)
✅ Yes
❌ No
User-facing chatbot
❌ No
✅ Yes
Scheduled overnight jobs
✅ Yes
❌ No
Latency-sensitive requests
❌ No
✅ Yes
OpenAI or Anthropic only
✅ Yes
✅ Yes
Cost-sensitive bulk inference
✅ Yes
❌ No
Workflows
1. Basic Setup (Python / OpenAI SDK)
2. Basic Setup (cURL)
3. Structured JSON Output
4. Streaming with Reasoning Tokens
5. Web Search
6. Batch Processing
Upload a JSONL file to the dashboard or API. Each line follows this format:
Limits: up to 50,000 requests per file. Results delivered as downloadable JSONL within 24h.
7. Provisioning Keys (Programmatic Key Management)
Provisioning endpoints:
Create key
POST /prod/createServiceKey
Update key
POST /prod/updateServiceKey
List keys
POST /prod/getServiceKeys
Delete key
POST /prod/deleteServiceKey
Important: Save the
api_key_id_hashreturned on creation — required for all future updates and deletions.
OpenClaw Integration
Use this section when a user wants to set up FastRouter as a model provider in OpenClaw. Triggered by phrases like "set up fastrouter", "add fastrouter provider", "configure fastrouter", "update fastrouter models", or when a user provides an API key starting with sk-v1-.
Inputs
API Key (required): Starts with
sk-v1-followed by a hex string. Ask for it if not provided.Base URL (optional): Defaults to
https://api.fastrouter.ai
Steps
Step 1 — Extract the API key
Parse the API key from the user's message. Must start with sk-v1-. Do NOT proceed without one.
Step 2 — Fetch the live model list
Step 3 — Filter models
Keep only models where:
is_activeistruearchitecture.output_modalitiesincludes"text"architecture.input_modalitiesincludes"text"or"image"
For each qualifying model, extract:
id— the model identifiercontext_length— context window sizetop_provider.max_completion_tokens— max output tokens (if 0 or missing, usemin(context_length, 8192))Input types: list of
"text"and/or"image"from input_modalities
Step 4 — Build the provider config
Do NOT include a "name" key — OpenClaw rejects it:
Step 5 — Update openclaw.json
Use
readto load~/.openclaw/openclaw.jsonMerge the provider into
models.providers.fastrouter(preserving all other config)Add model references to
agents.defaults.models— for each model:"fastrouter/MODEL_ID": {}Use
writeto save the updated config
Step 6 — Restart the gateway (requires user approval)
Step 7 — Report to user
Tell the user:
How many models were added
They can switch models with
/model fastrouter/MODEL_IDSuggest popular models: claude, gpt, gemini, deepseek variants
OpenClaw Error Handling
API unreachable
Tell user the FastRouter API may be down; try again later
No qualifying models
Warn that no text/image models were found
Config file missing
Create the full structure from scratch
Invalid API key format
Ask user to double-check their key starts with sk-v1-
OpenClaw Notes
Provider key in config is
fastrouterExisting fastrouter config is replaced with fresh model list on update
All other providers and settings are preserved
Cost is set to zero — FastRouter handles billing separately
Video-only and audio-only models are excluded from the model list
Common Gotchas
Base URLs:
api.fastrouter.aiboth serve LLM endpoints; provisioning usesapi.fastrouter.ai/prod/exclusivelyModel naming is required: Always use
provider/model-nameformat (e.g.,openai/gpt-4o, notgpt-4o); requests with bare model names may fail or route incorrectlyReasoning token continuity: When using reasoning-capable models, you must pass
reasoning_detailsfrom the assistant message back in follow-up requests or you will get errors in multi-turn conversationsAudio and video are async: Text-to-audio and video generation return a job ID; you must poll
getPromptToAudioResponseorgetVideoResponsefor the result — there is no synchronous responseAudio endpoint path differs: Transcription/translation use
/v1/audio/...(no/apiprefix), unlike other endpoints at/api/v1/...Batch file consistency: All requests in a JSONL batch must target the same endpoint (chat completions OR embeddings), but can mix models and providers within that constraint
Batch provider support: Batch processing currently supports OpenAI and Anthropic only — other providers are not available in batch mode
BYOK billing: When using Bring Your Own Key, rate limits and costs are governed by your provider account, not FastRouter credits
Provisioning Keys ≠ API Keys: Provisioning Keys cannot make LLM requests; they are admin-only for key lifecycle management
Service key hash: Save the
api_key_id_hashreturned fromcreateServiceKey— it is the only identifier for future updates and deletionsWeb search pricing:
:onlinesuffix and search-preview models cost $5 per 1,000 answers in addition to model token costsNano Banana (gemini-2.5-flash-image): Use through Chat Completions, not the Image Generation endpoint; supports custom
aspectRatioparameterStreaming + tool calls: When using
stream: truewith function calling, tool calls only appear in the final streaming chunkFree credits expire: Promotional credits expire in 30 days
Verification Checklist
Resources
Main website
https://fastrouter.ai
Documentation home
https://docs.fastrouter.ai
Model catalog
https://fastrouter.ai/models
Chat Completions API
https://docs.fastrouter.ai/api-reference/chat-completions
Responses API
https://docs.fastrouter.ai/api-reference/responses
Embeddings API
https://docs.fastrouter.ai/api-reference/embeddings
List Models API
https://docs.fastrouter.ai/api-reference/models
Auto Router
https://docs.fastrouter.ai/api-reference/auto-router
Image Generation API
https://docs.fastrouter.ai/api-reference/image-generation
Video Generation API
https://docs.fastrouter.ai/api-reference/video-generation
Transcriptions & Translations
https://docs.fastrouter.ai/api-reference/transcriptions-and-translations-api
Text-to-Audio API
https://docs.fastrouter.ai/api-reference/text-to-audio-generation-api
Batch Processing API
https://docs.fastrouter.ai/api-reference/batch-processing
Generations (Request Details)
https://docs.fastrouter.ai/api-reference/generations
Error Codes
https://docs.fastrouter.ai/api-reference/error-codes
Virtual Model Aliases
https://docs.fastrouter.ai/virtual-model-aliases
Fallback Models
https://docs.fastrouter.ai/fallback-models
Provider Routing Strategies
https://docs.fastrouter.ai/provider-routing-strategies
Automatic Model Selection
https://docs.fastrouter.ai/automatic-model-selection
BYOK (External Keys), Custom Endpoint & Models
https://docs.fastrouter.ai/add-external-keys-byok
Guardrails
https://docs.fastrouter.ai/guardrails
Custom Evaluations
https://docs.fastrouter.ai/custom-evaluations
Video Evaluations
https://docs.fastrouter.ai/video-evaluations
Flex Inference
https://docs.fastrouter.ai/explore-features/flex-pricing
Prompt Caching
https://docs.fastrouter.ai/prompt-caching
Batch Processing
https://docs.fastrouter.ai/batch-processing
Structured Outputs
https://docs.fastrouter.ai/structured-outputs
Function Calling
https://docs.fastrouter.ai/function-calling
Reasoning Tokens
https://docs.fastrouter.ai/reasoning-tokens
Response Caching
https://docs.fastrouter.ai/response-caching
MCP Gateway
https://docs.fastrouter.ai/mcp-gateway
PDF Processing
https://docs.fastrouter.ai/pdf-processing
Web Search
https://docs.fastrouter.ai/web-search
Dynamic Tags
https://docs.fastrouter.ai/dynamic-tags-per-request
File & Image Inputs
https://docs.fastrouter.ai/file-and-image-inputs
Tracing
https://docs.fastrouter.ai/tracing
Alerts
https://docs.fastrouter.ai/alerts
Credits
https://docs.fastrouter.ai/credits
Provisioning Keys
https://docs.fastrouter.ai/provisioning-keys
Keys & Settings
https://docs.fastrouter.ai/keys-and-settings
Projects
https://docs.fastrouter.ai/projects
Organization & Members
https://docs.fastrouter.ai/organization-and-members
IDE Integrations
https://docs.fastrouter.ai/integrations/ide-integrations
Claude Code Integration
https://docs.fastrouter.ai/integrations/claude-code
Hermes Agent Integration
https://docs.fastrouter.ai/integrations/running-hermes-agent-with-fastrouter
Changelog
https://docs.fastrouter.ai/changelog
For full documentation and navigation, see: https://docs.fastrouter.ai
Last updated
