Alerts
Alerts help you monitor your API usage and performance in real time. Set up custom thresholds to get notified when metrics exceed expected values or change relative to historical baselines.
Overview
Alerts evaluate your selected metrics at regular intervals and notify you when conditions are met. Each alert can have two severity levels:
- **Warning** — Early indicator that a metric is trending toward a problem
- **Critical** — Immediate attention required; the metric has exceeded acceptable limits
Alerts are scoped to specific Projects, API Keys, and Models, giving you granular control over what you monitor.


Available Metrics
Performance
| Metric | Description | Unit |
| --- | --- | --- |
| Response Time | Median (p50) end-to-end latency from request receipt to response completion | ms |
| Time to First Token | Median (p50) time from request submission until the first response token begins streaming | ms |
| Throughput | Median request throughput for streaming responses (`stream=true`) | req/s |
Reliability
| Metric | Description | Unit |
| --- | --- | --- |
| Error Count | Total number of failed requests within the evaluation window | errors |
| Error Rate | Percentage of requests that failed | % |
Usage & Cost
| Metric | Description | Unit |
| --- | --- | --- |
| Token Consumption | Total tokens consumed (input + output) | tokens |
| Daily Spend | Cumulative spending for the current day (resets at midnight UTC) | $ |
| Monthly Spend | Cumulative spending for the current month (resets on the 1st) | $ |
| Total Requests | Total number of API requests | requests |
Scoping Alerts
Each alert can be scoped to monitor specific subsets of your traffic:
Projects
Select which projects to include in the alert evaluation. You can choose:
- **All Projects** — Monitor aggregate metrics across your entire organization
- **Specific Projects** — Monitor one or more selected projects
API Keys
Select which API keys to include:
- **All Keys** — Monitor all API keys within the selected projects
- **Specific Keys** — Monitor one or more selected API keys
Models
Select which models to include:
- **All Models** — Monitor requests to any model
- **Specific Models** — Monitor requests to selected models only (e.g., only GPT-4.1 and Claude 4.5 Sonnet)
Example: You could create an alert that monitors Response Time only for your "Production" project, only for requests using a particular API key, and only for GPT-4.1 requests.
Alert Types
Static Value
Triggers when the metric crosses a fixed threshold.
Use cases:
- Response Time > 2000 ms
- Error Rate > 5%
- Daily Spend > $500
Configuration:
| Setting | Description |
| --- | --- |
| Condition | Above or Below |
| Warning Threshold | Value that triggers a warning (optional) |
| Critical Threshold | Value that triggers a critical alert (required) |
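The static-value logic above can be sketched in a few lines. This is purely illustrative; the function and argument names are assumptions, not part of the product's API:

```python
from typing import Optional

def evaluate_static(value: float, condition: str,
                    critical: float, warning: Optional[float] = None) -> str:
    """Return 'critical', 'warning', or 'ok' for a static-value alert."""
    def crossed(threshold: float) -> bool:
        # 'above' alerts fire when the metric exceeds the threshold,
        # 'below' alerts when it falls under it.
        return value > threshold if condition == "above" else value < threshold

    if crossed(critical):
        return "critical"
    if warning is not None and crossed(warning):
        return "warning"
    return "ok"

# A 2200 ms median response time crosses a 1500 ms warning
# threshold but not a 3000 ms critical threshold:
print(evaluate_static(2200, "above", critical=3000, warning=1500))  # warning
```

Note that the critical threshold is checked first, so a single evaluation yields at most one severity.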
Percentage Change
Triggers when the metric changes by a specified percentage compared to a historical baseline.
Use cases:
- Response Time increased 50% vs same time yesterday
- Request volume dropped 30% vs same time last week
- Error Count spiked 100% vs previous hour
Configuration:
| Setting | Description |
| --- | --- |
| Comparison Period | Historical baseline to compare against |
| Condition | Above or Below |
| Warning Threshold (%) | Percentage change that triggers a warning |
| Critical Threshold (%) | Percentage change that triggers a critical alert |
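A percentage-change evaluation can be sketched similarly (again illustrative only; names and exact boundary behavior are assumptions):

```python
from typing import Optional

def evaluate_pct_change(current: float, baseline: float, condition: str,
                        critical_pct: float,
                        warning_pct: Optional[float] = None) -> str:
    """Return 'critical', 'warning', or 'ok' for a percentage-change alert."""
    if baseline == 0:
        return "ok"  # no baseline to compare against
    change = (current - baseline) / baseline * 100
    # 'above' watches for increases, 'below' for decreases of at
    # least the configured percentage.
    magnitude = change if condition == "above" else -change
    if magnitude >= critical_pct:
        return "critical"
    if warning_pct is not None and magnitude >= warning_pct:
        return "warning"
    return "ok"

# Error count doubled vs. the baseline window (+100%):
print(evaluate_pct_change(24, 12, "above", critical_pct=100, warning_pct=50))  # critical
```

The same function covers drop detection: a Below condition with a 30% threshold fires once the metric is at least 30% under its baseline.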
Evaluation Interval
The evaluation interval determines how often the alert checks your metrics. Choose based on how quickly you need to detect issues:
| Interval | Typical Metrics | Notes |
| --- | --- | --- |
| 5 min | Error Rate, Error Count | Fastest detection; may be noisy for volatile metrics |
| 15 min | Response Time, TTFT, Throughput | Balances speed and noise reduction |
| 30 min | Response Time, Token Consumption | Smooths transient spikes |
| 1 hour | Spend metrics, usage patterns | Good for slow-changing metrics |
| Daily | Daily Spend, Monthly Spend | End-of-day summaries |
Recommendations by Metric
| Metric | Recommended Interval |
| --- | --- |
| Error Rate / Error Count | 5 min |
| Response Time / TTFT | 15–30 min |
| Throughput | 15 min |
| Token Consumption | 30–60 min |
| Daily/Monthly Spend | 1 hour or Daily |
Comparison Periods
When using Percentage Change alerts, you compare the current value against a historical baseline. The comparison period determines which historical window to use.
| Comparison Period | Baseline Window | Best For |
| --- | --- | --- |
| Previous Period | The immediately preceding interval | Detecting sudden spikes |
| Same time 1 hour ago | Same interval, 1 hour earlier | Intra-day patterns |
| Same time 1 day ago | Same interval, 24 hours earlier | Daily patterns (e.g., business hours vs off-hours) |
| Same time 1 week ago | Same interval, 7 days earlier | Weekly patterns (e.g., weekday vs weekend) |
| Same time 1 month ago | Same interval, 30 days earlier | Monthly patterns, seasonal trends |
How Comparison Works
The system compares two time windows of equal length: the current window (the most recent evaluation interval) and a baseline window of the same length, shifted back by the comparison period.

Example: An alert runs at 3:00 PM with a 15-minute interval, comparing to "Same time 1 day ago". The current window is 2:45–3:00 PM today; the baseline window is 2:45–3:00 PM yesterday.
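The window arithmetic can be made concrete with a short sketch (illustrative; the function name and signature are assumptions, not a documented API):

```python
from datetime import datetime, timedelta, timezone

def comparison_windows(run_at, interval, offset):
    """Return ((cur_start, cur_end), (base_start, base_end)).

    Both windows span one evaluation interval; the baseline window is
    the same window shifted back by the comparison-period offset.
    """
    current = (run_at - interval, run_at)
    baseline = (run_at - offset - interval, run_at - offset)
    return current, baseline

# Alert runs at 3:00 PM UTC with a 15-minute interval,
# comparing to "Same time 1 day ago":
run = datetime(2025, 6, 10, 15, 0, tzinfo=timezone.utc)
cur, base = comparison_windows(run, timedelta(minutes=15), timedelta(days=1))
# cur:  2:45-3:00 PM today; base: 2:45-3:00 PM yesterday
```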
Percentage Change Formula

% Change = ((Current − Previous) / Previous) × 100

Example:

- Current Response Time: 450 ms
- Previous Response Time: 300 ms
- % Change: ((450 − 300) / 300) × 100 = 50%
If your alert threshold is "Above 40%", this would trigger.
Thresholds
Each alert supports two threshold levels:
Warning Threshold
An early indicator that the metric is trending toward a problem. Useful for:
- Getting advance notice before issues become critical
- Allowing time to investigate before escalation
- Tracking trends that may need attention
Critical Threshold
The primary alert trigger indicating immediate attention is needed. This is the main threshold that should reflect your SLA or operational limits.
Condition Direction
| Condition | Triggers When | Typical Metrics |
| --- | --- | --- |
| Above | The metric exceeds the threshold | Response Time, Error Rate, Spend |
| Below | The metric falls below the threshold | Throughput, Request Volume |
Tip: Set your Warning threshold at roughly 50–70% of your Critical threshold to give yourself time to react before the situation escalates.
Notification Behavior
Notification Channels
- **Email** — Send alerts to specified email addresses
- **Organization Owners** — Automatically notify all organization owners
- **Project Members** — Automatically notify all members of affected projects
Alert States
Alerts transition between three states:

- **OK** — The metric is within all configured thresholds
- **Warning** — The warning threshold has been crossed
- **Critical** — The critical threshold has been crossed

When the metric returns within thresholds, the alert transitions back to OK and is considered resolved.
Examples
Example 1: High Response Time Alert
Goal: Get notified when API response times are slow
| Setting | Value |
| --- | --- |
| Metric | Response Time |
| Scope | Production project, All keys, All models |
| Alert Type | Static Value |
| Condition | Above |
| Warning Threshold | 1500 ms |
| Critical Threshold | 3000 ms |
| Evaluation Interval | 15 min |

Behavior: Every 15 minutes, the alert calculates the median response time for the scoped traffic. If it exceeds 1500 ms, a warning is sent; if it exceeds 3000 ms, a critical alert is sent.
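For reference, the same setup could be captured as a simple configuration object. The field names below are illustrative only, not the product's actual API schema:

```python
# Hypothetical representation of the "High Response Time" alert above.
high_response_time_alert = {
    "metric": "Response Time",             # median (p50) latency, in ms
    "scope": {
        "projects": ["Production"],
        "api_keys": "all",
        "models": "all",
    },
    "type": "Static Value",
    "condition": "Above",
    "warning_threshold_ms": 1500,
    "critical_threshold_ms": 3000,
    "evaluation_interval_minutes": 15,
}
```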
Example 2: Error Rate Spike Detection
Goal: Detect sudden increases in error rate compared to normal
| Setting | Value |
| --- | --- |
| Metric | Error Rate |
| Scope | All projects, All keys, All models |
| Alert Type | Percentage Change |
| Comparison Period | Same time 1 day ago |
| Condition | Above |
| Warning Threshold | 50% |
| Critical Threshold | 100% |
| Evaluation Interval | 5 min |

Behavior: Every 5 minutes, the alert compares the current error rate to the same 5-minute window yesterday. If today's error rate is 50% higher, a warning is sent; if it is 100% higher (doubled), a critical alert is sent.
Example 3: Daily Spend Limit
Goal: Get notified before exceeding daily budget
| Setting | Value |
| --- | --- |
| Metric | Daily Spend |
| Scope | Production project, All keys, All models |
| Alert Type | Static Value |
| Condition | Above |
| Warning Threshold | $400 |
| Critical Threshold | $500 |
| Evaluation Interval | 1 hour |

Behavior: Every hour, the alert checks the cumulative daily spend, sending a warning at $400 and a critical alert at $500.
Example 4: Traffic Drop Detection
Goal: Detect if request volume suddenly drops (may indicate an outage)
| Setting | Value |
| --- | --- |
| Metric | Total Requests |
| Scope | Production project, All keys, All models |
| Alert Type | Percentage Change |
| Comparison Period | Same time 1 hour ago |
| Condition | Below |
| Warning Threshold | 30% |
| Critical Threshold | 50% |
| Evaluation Interval | 5 min |

Behavior: Every 5 minutes, the alert compares the request count to the same window 1 hour ago. If traffic has dropped by 30%, a warning is sent; if it has dropped by 50%, a critical alert is sent.
Best Practices
- **Start with Critical thresholds only** — Add Warning thresholds once you understand your baseline metrics.
- **Use appropriate intervals** — Don't use 5-minute intervals for metrics that naturally fluctuate; you'll get too many false positives.
- **Leverage Percentage Change for anomalies** — Static thresholds work well for known limits, but percentage change is better for detecting unusual patterns.
- **Scope alerts appropriately** — Create separate alerts for Production vs Staging environments rather than one alert for everything.
- **Set up resolved notifications** — Knowing when an issue is resolved is as important as knowing when it started.
- **Document your thresholds** — Keep a record of why you chose specific threshold values so future team members understand the rationale.
FAQ
Q: What happens if there's no data in the evaluation window?
A: The alert maintains its current state. If there were zero requests in the window, metrics like Error Rate cannot be calculated, so the alert remains unchanged.
Q: Can I have multiple alerts for the same metric?
A: Yes. You might have one alert for Production and another for Staging, or different thresholds for different models.
Q: When do cumulative metrics (Daily Spend, Monthly Spend) reset?
A: Daily Spend resets at midnight UTC. Monthly Spend resets on the 1st of each month at midnight UTC.
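Those reset boundaries can be computed directly (the helper below is hypothetical, shown only to make the UTC reset rules concrete):

```python
from datetime import datetime, timezone

def spend_reset_boundaries(now):
    """Return the UTC instants at which Daily and Monthly Spend last reset."""
    now = now.astimezone(timezone.utc)
    daily = now.replace(hour=0, minute=0, second=0, microsecond=0)
    monthly = daily.replace(day=1)
    return daily, monthly

day_start, month_start = spend_reset_boundaries(
    datetime(2025, 6, 10, 15, 30, tzinfo=timezone.utc)
)
# day_start   -> 2025-06-10 00:00 UTC
# month_start -> 2025-06-01 00:00 UTC
```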
Q: What timezone are alerts evaluated in?
A: All alert evaluations use UTC. "Same time 1 day ago" means the same UTC time yesterday.
Q: Can I pause an alert without deleting it?
A: Yes. You can pause an alert from the Alerts list page. Paused alerts retain their configuration but do not evaluate or send notifications.