Concepts

Understand the core concepts that power TuringPulse observability and governance.

Traces & Spans

A trace represents one complete execution of your AI workflow. Each trace contains one or more spans, each representing an individual operation within that execution.

Trace Structure

Trace ID - Unique identifier for the entire execution
Root Span - The top-level span representing the workflow entry point
Child Spans - Nested operations like LLM calls, tool invocations, etc.
Metadata - Custom attributes attached to traces and spans

Span Types

Type	Description
`llm`	LLM API calls (OpenAI, Anthropic, etc.)
`tool`	Tool/function invocations
`retrieval`	Vector search and RAG operations
`agent`	Agent decision-making steps
`chain`	Sequential chain executions
`custom`	User-defined operations
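
To make the relationship between traces, spans, and span types concrete, here is a minimal sketch of one possible data model. The class and method names (`Trace`, `Span`, `SpanType`, `add_span`) are assumptions made for this illustration and are not the TuringPulse SDK API.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class SpanType(str, Enum):
    """The span types listed in the table above."""
    LLM = "llm"
    TOOL = "tool"
    RETRIEVAL = "retrieval"
    AGENT = "agent"
    CHAIN = "chain"
    CUSTOM = "custom"


@dataclass
class Span:
    name: str
    kind: SpanType
    parent_id: Optional[str] = None  # None marks the root span
    metadata: dict = field(default_factory=dict)
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex)


@dataclass
class Trace:
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    spans: list = field(default_factory=list)

    def add_span(self, name, kind, parent=None, **metadata):
        """Create a span, link it to its parent (if any), and record it."""
        parent_id = parent.span_id if parent is not None else None
        span = Span(name, kind, parent_id, metadata)
        self.spans.append(span)
        return span

    def root(self):
        """Return the span that has no parent."""
        return next(s for s in self.spans if s.parent_id is None)

    def children(self, span):
        """Return the spans nested directly under the given span."""
        return [s for s in self.spans if s.parent_id == span.span_id]


# One execution: an agent step that retrieves documents, then calls an LLM.
trace = Trace()
root = trace.add_span("chat-assistant", SpanType.AGENT)
trace.add_span("search-docs", SpanType.RETRIEVAL, parent=root, top_k=5)
trace.add_span("generate-answer", SpanType.LLM, parent=root, model="example-model")
```

Storing spans as a flat list with parent IDs keeps the structure easy to serialize while still allowing the tree to be rebuilt on demand.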

Workflows

A workflow is a logical grouping of traces that represent the same AI application or agent. Workflows help you organize and compare runs over time.

Workflow Properties

Workflow ID - Unique identifier (e.g., "chat-assistant")
Workflow Name - Human-readable display name
Version - Optional version tracking for A/B testing
Environment - Development, staging, or production

Best Practice

Use consistent workflow IDs across environments to track the same agent through development to production.
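
As an illustration of this practice, the sketch below keeps the workflow ID fixed and varies only the environment. The `Workflow` class, its field names, and the `for_environment` helper are hypothetical and only mirror the properties listed above.

```python
from dataclasses import dataclass, replace
from typing import Optional

ENVIRONMENTS = ("development", "staging", "production")


@dataclass(frozen=True)
class Workflow:
    workflow_id: str               # stable across environments
    name: str                      # human-readable display name
    environment: str
    version: Optional[str] = None  # optional, e.g. for A/B testing

    def __post_init__(self):
        if self.environment not in ENVIRONMENTS:
            raise ValueError(f"unknown environment: {self.environment!r}")


def for_environment(workflow, environment):
    """Return a copy of the workflow that differs only in its environment."""
    return replace(workflow, environment=environment)


dev = Workflow("chat-assistant", "Chat Assistant", "development", version="v2")
prod = for_environment(dev, "production")
```

Because `dev` and `prod` share the same `workflow_id`, runs from both environments can be grouped and compared under one workflow.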

Projects & Organizations

TuringPulse uses a hierarchical structure to organize your data:

Organization - Top-level container for your team
Project - Groups related workflows (e.g., "Customer Support")
Workflow - Individual AI application or agent
Run - Single execution of a workflow, captured as a trace

Access Control

Permissions are managed at the organization and project levels:

Admin - Full access to all settings and data
Member - View and create traces, run evaluations
Viewer - Read-only access to dashboards and traces
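
The role descriptions above can be read as a simple permission check. The following is a sketch only: the action names (`view_traces`, `create_traces`, and so on) are invented for this example, and the real permission model may be more granular.

```python
from enum import Enum


class Role(Enum):
    ADMIN = "admin"
    MEMBER = "member"
    VIEWER = "viewer"


# Hypothetical action names, derived from the role descriptions above.
ROLE_ACTIONS = {
    Role.MEMBER: {"view_traces", "create_traces", "run_evaluations"},
    Role.VIEWER: {"view_dashboards", "view_traces"},
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if the role may perform the action."""
    if role is Role.ADMIN:
        return True  # Admin has full access to all settings and data
    return action in ROLE_ACTIONS[role]
```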

Evaluations

Evaluations score the quality of your AI outputs. TuringPulse supports three types of evaluation metrics:

Metric Types

Type	Description	Example
Heuristic	Rule-based metrics computed locally	ROUGE, BLEU, JSON validity
LLM-as-Judge	LLM-powered scoring	Relevance, coherence, safety
Custom	Your own metric functions	Domain-specific scoring
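
For a sense of what a heuristic metric looks like, here is a minimal JSON-validity check written as a plain Python function, plus a helper that averages any metric over a batch of outputs. The function names and the 0.0 to 1.0 score convention are assumptions for this sketch, not a documented TuringPulse interface.

```python
import json


def json_validity(output: str) -> float:
    """Return 1.0 if the output parses as JSON, otherwise 0.0."""
    try:
        json.loads(output)
    except (TypeError, ValueError):  # JSONDecodeError is a ValueError
        return 0.0
    return 1.0


def mean_score(outputs, metric=json_validity):
    """Average a metric over a non-empty batch of outputs."""
    if not outputs:
        raise ValueError("outputs must not be empty")
    return sum(metric(o) for o in outputs) / len(outputs)
```

A custom metric can follow the same shape: any function that takes an output and returns a number can be passed as `metric`.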

Evaluation Modes

Online - Evaluate traces in real time as they're logged
Offline - Batch evaluate historical traces
Experiment - Compare different configurations

KPIs & Thresholds

Key Performance Indicators (KPIs) are metrics you want to track and alert on. TuringPulse provides built-in KPIs and supports custom definitions.

Built-in KPIs

Latency - End-to-end response time
Token Usage - Input/output tokens per run
Cost - Estimated cost per run
Error Rate - Percentage of failed runs
Throughput - Runs per minute/hour
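
The sketch below shows one way these KPIs could be derived from a window of runs. The `Run` fields and the `summarize` function are hypothetical, and averaging per run is an assumption; other aggregations such as percentiles are equally reasonable.

```python
from dataclasses import dataclass


@dataclass
class Run:
    latency_ms: float
    input_tokens: int
    output_tokens: int
    cost_usd: float
    failed: bool = False


def summarize(runs, window_minutes):
    """Aggregate per-run measurements into KPI values for one time window."""
    n = len(runs)
    if n == 0:
        raise ValueError("runs must not be empty")
    return {
        "avg_latency_ms": sum(r.latency_ms for r in runs) / n,
        "avg_tokens": sum(r.input_tokens + r.output_tokens for r in runs) / n,
        "avg_cost_usd": sum(r.cost_usd for r in runs) / n,
        "error_rate_pct": 100 * sum(r.failed for r in runs) / n,
        "throughput_per_min": n / window_minutes,
    }


runs = [
    Run(100, 10, 20, 0.5),
    Run(200, 30, 40, 0.5),
    Run(300, 50, 50, 1.0),
    Run(400, 0, 0, 0.0, failed=True),
]
kpis = summarize(runs, window_minutes=2)
```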

Threshold Alerts

Configure thresholds to get alerted when a KPI moves outside its acceptable range:

Warning - The KPI is approaching its limit
Critical - The KPI has exceeded its limit and requires attention
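
A two-level threshold can be expressed as a small classification function. This sketch assumes a KPI where higher values are worse (latency, cost, error rate) and treats a value equal to a threshold as having reached it; the function name and return labels are illustrative.

```python
def classify(value, warning, critical):
    """Map a KPI value to "ok", "warning", or "critical"."""
    if warning > critical:
        raise ValueError("warning threshold must not exceed critical threshold")
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "ok"


# Example: latency in seconds with a 2 s warning and 3 s critical threshold.
levels = [classify(v, warning=2.0, critical=3.0) for v in (1.2, 2.5, 3.0)]
```

For a KPI where lower values are worse, such as an evaluation score, the comparisons would be reversed.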

Drift & Anomalies

TuringPulse automatically detects changes in your AI system's behavior:

Drift Detection

Drift occurs when the statistical properties of your outputs change over time. This can indicate:

Model updates or degradation
Changes in input distribution
Prompt modifications
External API changes
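
One simple way to check for this kind of change is to compare a recent window of a numeric output property (for example, output length) against a baseline window. The sketch below uses a mean-shift score based on the baseline's standard deviation. It is an illustrative technique under stated assumptions, not a description of how TuringPulse detects drift, and the threshold of 3.0 is arbitrary.

```python
import math
import statistics


def mean_shift_score(baseline, current):
    """How many standard errors the current mean lies from the baseline mean."""
    base_mean = statistics.fmean(baseline)
    base_std = statistics.stdev(baseline)  # needs at least two baseline values
    shift = abs(statistics.fmean(current) - base_mean)
    if base_std == 0:
        return 0.0 if shift == 0 else math.inf
    standard_error = base_std / math.sqrt(len(current))
    return shift / standard_error


def has_drifted(baseline, current, threshold=3.0):
    """Flag drift when the mean shift exceeds the chosen threshold."""
    return mean_shift_score(baseline, current) > threshold


baseline_lengths = [100, 102, 98, 101, 99, 100, 103, 97]
stable_window = [101, 99, 100, 100]
shifted_window = [120, 118, 122, 121]
```

A mean comparison only catches shifts in the average. Changes in spread or shape would need a distribution-level comparison instead.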

Anomaly Detection

Anomalies are individual runs that deviate significantly from normal behavior. TuringPulse uses statistical methods to identify outliers in:

Latency
Token usage
Output length
Error patterns
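
The source does not specify which statistical methods are used. As one common example, the sketch below flags outliers with a modified z-score based on the median and the median absolute deviation (MAD), which is less sensitive to the outliers themselves than a mean-based z-score. The function name and the 3.5 cutoff are choices made for this illustration.

```python
import statistics


def find_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread around the median, so nothing can be scored
    return [
        i
        for i, v in enumerate(values)
        if 0.6745 * abs(v - median) / mad > threshold
    ]


latencies_ms = [120, 125, 118, 122, 119, 950, 121]
outlier_indices = find_outliers(latencies_ms)
```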

Governance

TuringPulse provides governance features to ensure AI safety and compliance:

Human-in-the-Loop (HITL)

Require human approval before certain actions are executed. Use cases:

High-stakes decisions
Sensitive data access
External API calls
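
Conceptually, HITL places an approval gate in front of an action. A minimal sketch of that pattern follows, with the reviewer modeled as a callback. `run_with_approval`, `ApprovalDenied`, and `example_reviewer` are hypothetical names, and a real deployment would wait for a human decision rather than call a function.

```python
class ApprovalDenied(Exception):
    """Raised when the reviewer rejects a proposed action."""


def run_with_approval(description, action, approve):
    """Execute the action only if the reviewer approves its description."""
    if not approve(description):
        raise ApprovalDenied(description)
    return action()


def example_reviewer(description):
    # Stand-in for a human reviewer: rejects anything that mentions deletion.
    return "delete" not in description
```

The action is passed as a callable so that nothing runs until the approval check has passed.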

Human-after-the-Loop (HATL)

Review and audit actions after they've been executed. Use cases:

Quality assurance sampling
Compliance auditing
Training data collection
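
For quality assurance sampling, one common approach is to pick a reproducible fraction of runs by hashing the run ID, so the same run is always either in or out of the review sample. This is a sketch of that idea only; the function name and the use of SHA-256 are assumptions, not TuringPulse behavior.

```python
import hashlib


def selected_for_review(run_id: str, sample_rate: float) -> bool:
    """Deterministically select roughly `sample_rate` of runs for human review."""
    digest = hashlib.sha256(run_id.encode("utf-8")).digest()
    # Map the first 4 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return bucket < sample_rate
```

Hash-based selection needs no stored state and gives the same answer wherever it is evaluated.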

Human-on-the-Loop (HOTL)

Monitor AI actions in real time with the ability to intervene. Use cases:

Live customer interactions
Critical system operations
Escalation workflows

Incidents

TuringPulse automatically creates an incident when it detects any of the following:

KPI threshold breaches
Drift detection alerts
Anomaly clusters
Error rate spikes

Each incident includes root cause analysis, affected traces, and recommended actions.

Next Steps

Quickstart - Get started in 5 minutes
Log Traces - Start capturing telemetry
Evaluation Overview - Learn about metrics
Governance Overview - Set up review workflows