Function Calling
FastRouter supports Function Calling for models capable of planning and invoking tools or functions. This allows LLMs to return structured function calls instead of natural language responses.
Overview
Function calling empowers LLMs to:
Identify when a tool or function is needed based on user input.
Select the appropriate function from a provided set of tools.
Generate structured JSON arguments to invoke that function.
When you provide a list of tools in your API request, compatible models can choose to respond with one or more function calls. You then execute those functions in your application code and feed the results back to the model in a subsequent request. This creates a multi-turn conversation loop for tasks like data retrieval, API integrations, or complex workflows.
FastRouter.ai routes your requests to the best available providers (e.g., Google AI Studio, OpenAI) while supporting OpenAI-compatible formats for tools. This includes parallel function calling for models that support it.
Key Benefits:
Build agents that interact with real-world APIs (e.g., weather services, calendars).
Handle complex queries by breaking them into tool-based steps.
Improve reliability with structured outputs over free-form text.
Supported Models
FastRouter.ai supports function calling on models that natively offer this capability. Here's a partial list (check the Models page for the latest):
Google Models: Gemini 2.5 Pro, Gemini 2.5 Flash and others
OpenAI Models: GPT-5, GPT-4.1, GPT-4o, o4-mini, o3-mini and others
Anthropic Models: Claude Opus 4.1, Claude Sonnet 4.5, Claude Haiku 4.5 and others
xAI Models: Grok 4, Grok 3 and others
Use the provider field in your request to route to a specific backend if needed.
Usage
To use function calling:
1. Define your tools in the tools array of your /chat/completions request. Each tool follows the OpenAI-compatible schema (a JSON object with name, description, and parameters).
2. Send the request to FastRouter.ai's endpoint: https://api.fastrouter.ai/api/v1/chat/completions.
3. If the model responds with a tool_calls array, execute those functions in your code.
4. Append the tool results as a new message (with role: "tool") and send a follow-up request so the model can generate a final response.
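Steps 1 and 2 might be sketched in Python as follows. The get_weather tool and the model ID are hypothetical placeholders; substitute your own tool definitions and a model from the Models page:

```python
import requests

# Hypothetical tool definition in the OpenAI-compatible schema:
# name, description, and JSON Schema parameters.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

payload = {
    "model": "openai/gpt-4o",  # assumed model ID; check the Models page
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether to call a tool
}

def send(api_key: str) -> dict:
    """POST the request to FastRouter.ai's chat completions endpoint."""
    resp = requests.post(
        "https://api.fastrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        json=payload,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```

With tool_choice set to "auto", the model may answer directly or emit tool_calls, so your code should handle both cases.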
Request Parameters:
tools: Array of tool definitions.
tool_choice: Optional; controls how the model uses tools (e.g., "auto" for automatic selection, "none" to disable, or specify a tool name).
Response Format:
If a tool is called, the response includes choices[0].message.tool_calls, an array of objects with function.name and function.arguments (a JSON string).
Execute the function and respond with a message like:
{"role": "tool", "content": "JSON result", "tool_call_id": "call_id_from_response"}
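For instance, a tool call in the response might be parsed and answered like this. The response fragment and the weather result are illustrative, shaped to match the fields described above:

```python
import json

# Illustrative response fragment containing one tool call.
response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "tool_calls": [{
                "id": "call_abc123",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": '{"city": "Paris"}',
                },
            }],
        },
    }],
}

tool_call = response["choices"][0]["message"]["tool_calls"][0]
# function.arguments arrives as a JSON string, so decode it first.
args = json.loads(tool_call["function"]["arguments"])

# Execute your function (result here is made up) and wrap it as a tool message.
result = {"city": args["city"], "temp_c": 18}
tool_message = {
    "role": "tool",
    "content": json.dumps(result),
    "tool_call_id": tool_call["id"],
}
```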
For authentication, use your API key in the Authorization: Bearer YOUR_API_KEY header.
Executing Tools: Multi-Turn Example
After receiving a tool_calls response, execute the function in your code and send the result back. Here's how to handle a full loop.
Python (Using requests)
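A sketch of the full loop, assuming the hypothetical get_weather tool and model ID below (replace both with your own):

```python
import json
import requests

URL = "https://api.fastrouter.ai/api/v1/chat/completions"

# Hypothetical local implementation of the tool the model may call.
def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 18}

TOOL_REGISTRY = {"get_weather": get_weather}

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def chat(api_key: str, messages: list) -> dict:
    """Send one /chat/completions request and return the assistant message."""
    resp = requests.post(
        URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json={"model": "openai/gpt-4o",  # assumed model ID
              "messages": messages, "tools": TOOLS},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]

def run_loop(api_key: str, user_prompt: str) -> str:
    messages = [{"role": "user", "content": user_prompt}]
    message = chat(api_key, messages)
    # Keep executing tools until the model answers in plain text.
    while message.get("tool_calls"):
        messages.append(message)  # echo the assistant turn back verbatim
        for call in message["tool_calls"]:
            fn = TOOL_REGISTRY[call["function"]["name"]]
            result = fn(**json.loads(call["function"]["arguments"]))
            messages.append({
                "role": "tool",
                "content": json.dumps(result),
                "tool_call_id": call["id"],
            })
        message = chat(api_key, messages)
    return message["content"]

# Example (requires a real key):
# print(run_loop("YOUR_API_KEY", "What's the weather in Paris?"))
```

The loop also handles parallel function calling: if the model emits several tool_calls in one turn, each gets its own tool message before the follow-up request.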
Node.js (Using fetch)
Important: Preserving Reasoning Details in Multi-Turn Tool Calls
When using models that support reasoning (extended thinking), such as Google's Gemini-3-Pro, the response may include reasoning_details in the assistant message. You must pass these reasoning details back in subsequent requests to maintain context continuity and allow the model to continue reasoning from where it left off.
Why This Matters
Models with reasoning capabilities generate internal thinking blocks that inform their decisions
Omitting reasoning_details in follow-up requests can lead to errors.
Preserving this context ensures the model maintains its chain of thought across tool execution.
Example: Handling Reasoning Details with Tool Calls
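A sketch of carrying reasoning_details forward. The assistant message below (including the shape of the reasoning_details entries) is illustrative; the key point is that the message is appended to the follow-up request unmodified:

```python
import json

# Illustrative assistant message from a reasoning-capable model:
# it carries both tool_calls and reasoning_details.
assistant_message = {
    "role": "assistant",
    "content": None,
    "reasoning_details": [
        {"type": "reasoning.text",
         "text": "The user wants the weather, so call get_weather."},
    ],
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    }],
}

# Build the follow-up request: append the assistant message as-is
# (do NOT strip reasoning_details), then add the tool result.
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    assistant_message,
    {
        "role": "tool",
        "content": json.dumps({"city": "Paris", "temp_c": 18}),
        "tool_call_id": "call_1",
    },
]
```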
Best Practices
Error Handling: If tool execution fails, return an error message in the content field.
Security: Validate arguments before executing tools to prevent injection attacks.
Streaming: Function calling works with stream: true, but tool calls appear in the final chunk.
Costs: Tool calls count toward token usage; monitor via the response's usage field.
Testing: Start with simple tools and iterate based on model behavior.
Reasoning Details: When using models with reasoning capabilities, preserve reasoning_details from the assistant message and include it in subsequent requests to maintain context continuity.
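The error-handling practice can be sketched as follows, with a hypothetical lookup_order tool standing in for your own function. On failure, the error goes into the content field of the tool message so the model can recover rather than the whole exchange breaking:

```python
import json

def lookup_order(order_id: str) -> dict:
    # Hypothetical tool that may raise on bad input.
    if not order_id.startswith("ord_"):
        raise ValueError(f"unknown order id: {order_id}")
    return {"order_id": order_id, "status": "shipped"}

def safe_tool_message(tool_call_id: str, fn, **kwargs) -> dict:
    """Run a tool; on failure, report the error in the content field
    so the model can see what went wrong and adjust."""
    try:
        content = json.dumps(fn(**kwargs))
    except Exception as exc:
        content = json.dumps({"error": str(exc)})
    return {"role": "tool", "content": content, "tool_call_id": tool_call_id}

# Success and failure both produce a well-formed tool message.
ok = safe_tool_message("call_1", lookup_order, order_id="ord_42")
failed = safe_tool_message("call_2", lookup_order, order_id="bad-id")
```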