Function Calling

FastRouter supports Function Calling for models capable of planning and invoking tools or functions. This allows LLMs to return structured function calls instead of natural language responses.

Overview

Function calling empowers LLMs to:

  • Identify when a tool or function is needed based on user input.

  • Select the appropriate function from a provided set of tools.

  • Generate structured JSON arguments to invoke that function.

When you provide a list of tools in your API request, compatible models can choose to respond with one or more function calls. You then execute those functions in your application code and feed the results back to the model in a subsequent request. This creates a multi-turn conversation loop for tasks like data retrieval, API integrations, or complex workflows.

FastRouter.ai routes your requests to the best available providers (e.g., Google AI Studio, OpenAI) while supporting OpenAI-compatible formats for tools. This includes parallel function calling for models that support it.

Key Benefits:

  • Build agents that interact with real-world APIs (e.g., weather services, calendars).

  • Handle complex queries by breaking them into tool-based steps.

  • Improve reliability with structured outputs over free-form text.

Supported Models

FastRouter.ai supports function calling on models that natively offer this capability. Here's a partial list (check the Models page for the latest):

  • Google Models: Gemini 2.5 Pro, Gemini 2.5 Flash and others

  • OpenAI Models: GPT-5, GPT-4.1, GPT-4o, o4-mini, o3-mini and others

  • Anthropic Models: Claude Opus 4.1, Claude Sonnet 4.5, Claude Haiku 4.5 and others

  • xAI Models: Grok 4, Grok 3 and others

Use the provider field in your request to route to a specific backend if needed.

Usage

To use function calling:

  1. Define your tools in the tools array of your /chat/completions request. Each tool follows the OpenAI-compatible schema: an object with "type": "function" and a function object containing name, description, and parameters.

  2. Send the request to FastRouter.ai's endpoint: https://api.fastrouter.ai/api/v1/chat/completions.

  3. If the model responds with a tool_calls array in the response, execute the functions in your code.

  4. Append the tool results as a new message (with role: "tool") and send a follow-up request to let the model generate a final response.
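Steps 1–2 can be sketched as a request payload. This is a minimal sketch: the get_weather tool, its schema, and the model slug are illustrative assumptions, not part of the FastRouter docs.

```python
import json

# Hypothetical tool definition following the OpenAI-compatible schema.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}

def build_request(user_message: str) -> dict:
    """Assemble the /chat/completions request body from steps 1-2."""
    return {
        "model": "openai/gpt-4o",  # illustrative model slug
        "messages": [{"role": "user", "content": user_message}],
        "tools": [WEATHER_TOOL],
        "tool_choice": "auto",  # let the model decide whether to call a tool
    }

payload = build_request("What's the weather in Paris?")
print(json.dumps(payload, indent=2))
```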

Request Parameters:

  • tools: Array of tool definitions.

  • tool_choice: Optional; controls how the model uses tools (e.g., "auto" for automatic selection, "none" to disable, or specify a tool name).
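The tool_choice values above can be sketched as follows; the forced-function form follows the OpenAI-compatible schema, and get_weather is a hypothetical tool name.

```python
# "auto": the model decides; "none": tools are never called.
tool_choice_auto = "auto"
tool_choice_none = "none"

# Forcing a specific tool uses an object, not a bare name string:
tool_choice_forced = {
    "type": "function",
    "function": {"name": "get_weather"},  # hypothetical tool name
}
```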

Response Format:

  • If a tool is called, the response will include choices[0].message.tool_calls—an array of objects with function.name and function.arguments (JSON string).

  • Execute the function and respond with a message like: {"role": "tool", "content": "JSON result", "tool_call_id": "call_id_from_response"}.
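The echo-back step can be sketched like this; the message shape follows the OpenAI-compatible format, and the call object is a made-up example of a tool_calls entry.

```python
import json

def tool_result_message(tool_call: dict, result) -> dict:
    """Wrap a tool's return value as the role:"tool" message the model expects."""
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],  # must match the id from tool_calls
        "content": json.dumps(result),    # JSON result as a string
    }

# Example tool_calls entry, as it appears in choices[0].message.tool_calls.
call = {
    "id": "call_abc123",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
}
msg = tool_result_message(call, {"city": "Paris", "temp_c": 21})
```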

For authentication, use your API key in the Authorization: Bearer YOUR_API_KEY header.

Executing Tools: Multi-Turn Example

After receiving a tool_calls response, execute the function in your code and send the result back. Here's how to handle a full loop.

Python (Using requests)
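A minimal sketch of the full loop with requests. The get_weather implementation and the model slug are illustrative assumptions, and error handling is omitted for brevity.

```python
import json

API_URL = "https://api.fastrouter.ai/api/v1/chat/completions"

def get_weather(city: str) -> dict:
    # Hypothetical local implementation of the tool.
    return {"city": city, "temp_c": 21}

TOOL_IMPLS = {"get_weather": get_weather}

def run_tool_loop(payload: dict, api_key: str, post=None) -> dict:
    """POST the request; while the model returns tool_calls, execute them,
    append the results as role:"tool" messages, and re-send."""
    if post is None:
        import requests  # only needed for real HTTP calls
        post = requests.post
    headers = {"Authorization": f"Bearer {api_key}"}
    while True:
        reply = post(API_URL, headers=headers, json=payload).json()
        message = reply["choices"][0]["message"]
        if not message.get("tool_calls"):
            return message  # final natural-language answer
        payload["messages"].append(message)  # keep the assistant turn
        for call in message["tool_calls"]:
            args = json.loads(call["function"]["arguments"])
            result = TOOL_IMPLS[call["function"]["name"]](**args)
            payload["messages"].append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
```

The post parameter defaults to requests.post; making it injectable keeps the loop testable without network access.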


Important: Preserving Reasoning Details in Multi-Turn Tool Calls

When using models that support reasoning (extended thinking), such as Google's Gemini-3-Pro, the response may include reasoning_details in the assistant message. You must pass these reasoning details back in subsequent requests to maintain context continuity and allow the model to continue reasoning from where it left off.

Why This Matters

  • Models with reasoning capabilities generate internal thinking blocks that inform their decisions

  • Omitting reasoning_details in follow-up requests can lead to errors

  • Preserving this context ensures the model maintains its chain of thought across tool execution

Example: Handling Reasoning Details with Tool Calls
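A minimal sketch, assuming the assistant message carries a reasoning_details field whose internal shape should be treated as opaque and echoed back verbatim; the placeholder contents below are illustrative, not FastRouter's actual format.

```python
def assistant_turn_for_history(message: dict) -> dict:
    """Build the assistant message to append to the history before the
    tool results, preserving reasoning_details verbatim when present."""
    turn = {"role": "assistant", "content": message.get("content")}
    if message.get("tool_calls"):
        turn["tool_calls"] = message["tool_calls"]
    if message.get("reasoning_details"):
        # Do not drop or rewrite this: echo exactly what the API returned.
        turn["reasoning_details"] = message["reasoning_details"]
    return turn

# Illustrative response message from a reasoning-capable model.
model_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{"id": "call_1", "type": "function",
                    "function": {"name": "get_weather", "arguments": "{}"}}],
    "reasoning_details": [{"text": "…"}],  # opaque placeholder; pass through as-is
}
history_turn = assistant_turn_for_history(model_message)
```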

Best Practices

  • Error Handling: If tool execution fails, return an error message in the content field.

  • Security: Validate arguments before executing tools to prevent injection attacks.

  • Streaming: Function calling works with stream: true, but tool calls appear in the final chunk.

  • Costs: Tool calls count toward token usage—monitor via the response's usage field.

  • Testing: Start with simple tools and iterate based on model behavior.

  • Reasoning Details: When using models with reasoning capabilities, preserve reasoning_details from the assistant message and include it in subsequent requests to maintain context continuity.
