KPI Rules

Define threshold rules to monitor key performance indicators and get alerted when those thresholds are breached.

What are KPI Rules?

KPI (Key Performance Indicator) rules let you define thresholds for important metrics and automatically create alerts or incidents when those thresholds are breached.

Built-in Metrics

  • Latency - End-to-end response time (p50, p95, p99, avg)
  • Token Usage - Input/output tokens per run
  • Cost - Estimated cost per run or total
  • Error Rate - Percentage of failed runs
  • Throughput - Runs per minute/hour

Custom Metrics

You can also track custom metrics defined in your code via the SDK.
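
As a rough sketch of how that can look, the snippet below reuses the KPIConfig value extractor shown later on this page; the satisfaction_score metadata key and the custom_kpi.py filename are hypothetical stand-ins for whatever your own code records.

custom_kpi.py
from turingpulse import instrument, KPIConfig

@instrument(
    agent_id="customer-support",
    kpis=[
        # Hypothetical custom metric: alert if the recorded satisfaction
        # score for a run drops below 0.7.
        KPIConfig(
            kpi_id="satisfaction_score",
            value=lambda ctx: ctx.metadata.get("satisfaction_score", 1.0),
            alert_threshold=0.7,
            comparator="lt",
            severity="warning",
        ),
    ],
)
def handle_query(query: str):
    # `agent` is your existing agent object, as in the example further below.
    return agent.run(query)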

Creating KPI Rules via UI

Step 1: Navigate to Thresholds

Go to Controls → Thresholds in the sidebar.

Step 2: Create New Rule

Click Create Rule and fill in:

  • Name - Descriptive name (e.g., "High Latency Alert")
  • Workflow - Select specific workflow or "All Workflows"
  • Metric - Choose from dropdown or enter custom metric ID
  • Aggregation - avg, sum, p50, p95, p99, max, min
  • Comparator - Greater than, Less than, etc.
  • Threshold - Numeric value
  • Window - Time window for aggregation (5m, 15m, 1h, etc.)

Step 3: Configure Alerts

  • Severity - Warning or Critical
  • Auto-create Incident - Toggle to create incidents automatically
  • Alert Channels - Select notification channels

Step 4: Enable and Save

Toggle the rule to enabled and click Save.

Creating KPI Rules via SDK

Inline with @instrument

agent.py
from turingpulse import instrument, KPIConfig

@instrument(
    agent_id="customer-support",
    kpis=[
        # Alert if latency > 5 seconds
        KPIConfig(
            kpi_id="latency_ms",
            use_duration=True,
            alert_threshold=5000,
            comparator="gt",
            severity="warning",
        ),
        
        # Alert if cost > $1 per run
        KPIConfig(
            kpi_id="cost_usd",
            value=lambda ctx: ctx.metadata.get("total_cost", 0),
            alert_threshold=1.0,
            comparator="gt",
            severity="critical",
            auto_create_incident=True,
        ),
        
        # Alert if token count > 4000
        KPIConfig(
            kpi_id="total_tokens",
            value=lambda ctx: (
                ctx.metadata.get("input_tokens", 0) + 
                ctx.metadata.get("output_tokens", 0)
            ),
            alert_threshold=4000,
            comparator="gt",
        ),
    ]
)
def handle_query(query: str):
    return agent.run(query)

Via API

create_rule.py
import requests

response = requests.post(
    "https://api.turingpulse.ai/v1/kpi-rules",
    headers={"Authorization": "Bearer sk_live_..."},
    json={
        "name": "High Latency Alert",
        "workflow_id": "customer-support",  # or "*" for all
        "metric": "latency_ms",
        "aggregation": "p95",
        "comparator": "gt",
        "threshold": 5000,
        "window": "5m",
        "severity": "warning",
        "auto_create_incident": False,
        "alert_channels": ["email:ops@company.com"],
        "enabled": True,
    }
)
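
To confirm the rule was created and keep its identifier for later updates or deletion, something like the following works as a follow-on to create_rule.py. The exact response shape (an "id" field in the JSON body) is an assumption, not documented here.

# Continuing from create_rule.py above. raise_for_status() surfaces HTTP
# errors; the "id" field in the JSON body is an assumed response shape.
response.raise_for_status()
rule = response.json()
print("Created KPI rule:", rule.get("id"))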

KPI Configuration Reference

KPIConfig Options

  • kpi_id (str) - Unique identifier for the KPI
  • use_duration (bool) - Use execution duration as the value
  • value (callable) - Function to extract the value from the run context
  • alert_threshold (float) - Threshold value for alerting
  • comparator (str) - One of gt, lt, gte, lte, eq
  • severity (str) - warning or critical
  • auto_create_incident (bool) - Create an incident on breach

Comparators

  • gt - Greater than (e.g., alert if latency > 5000ms)
  • lt - Less than (e.g., alert if accuracy < 0.8)
  • gte - Greater than or equal (e.g., alert if errors >= 10)
  • lte - Less than or equal (e.g., alert if throughput <= 5/min)
  • eq - Equal to (e.g., alert if status == "failed")
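
As an illustration of a less-than rule, the payload below follows the fields from the create_rule.py example above; the accuracy metric itself is hypothetical and would come from your own code.

# Sketch of an "lt" rule payload, reusing the fields from create_rule.py.
# The "accuracy" metric is a hypothetical custom metric.
low_accuracy_rule = {
    "name": "Low Accuracy Alert",
    "workflow_id": "customer-support",
    "metric": "accuracy",
    "aggregation": "avg",
    "comparator": "lt",
    "threshold": 0.8,
    "window": "1h",
    "severity": "critical",
    "auto_create_incident": True,
    "enabled": True,
}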

Viewing KPI Alerts

When a KPI threshold is breached:

  1. An alert is created and visible in Operations → Overview → KPIs tab
  2. Notifications are sent to configured alert channels
  3. If auto_create_incident is enabled, an incident is created

ℹ️ Alert Deduplication: Alerts are deduplicated within the configured window, so multiple breaches within the same window won't create duplicate alerts.

Best Practices

  • Start with warnings - Use warning severity first, then escalate to critical once you understand normal behavior.
  • Use appropriate windows - Short windows (5m) for real-time alerts, longer windows (1h) for trend-based alerts.
  • Set realistic thresholds - Base thresholds on historical data, not arbitrary values.
  • Group related KPIs - Create rules for related metrics together (e.g., latency + error rate).

Next Steps