Basic usage
Use the experiment() function to submit an experiment to Gentrace. It manages the lifecycle of a Gentrace experiment, automatically starting and finishing the experiment while providing context for evaluation functions like eval() / evalOnce() and evalDataset() / eval_dataset().
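The overall call shape can be sketched as follows. The experiment and evalOnce functions below are local synchronous stand-ins that mirror the shape of the SDK functions (the real ones are async and come from the Gentrace SDK after initialization; the pipeline ID is a placeholder):

```typescript
// Minimal synchronous stand-ins mirroring the shape of the SDK's
// experiment() and evalOnce(). The real functions are async and are
// imported from the Gentrace SDK; this only illustrates the call shape.
const log: string[] = [];

function evalOnce(name: string, fn: () => unknown): void {
  fn();           // run the test case
  log.push(name); // the real SDK reports the result to Gentrace
}

function experiment(pipelineId: string, body: () => void): void {
  log.push("start"); // experiment run created in Gentrace
  try {
    body();          // evaluations run inside the experiment context
  } finally {
    log.push("finish"); // run is marked complete even if the body throws
  }
}

// Usage: submit one experiment containing a single test case.
experiment("00000000-0000-0000-0000-000000000000", () => {
  evalOnce("greeting-test", () => "hello"); // call your AI pipeline here
});
```

The key point is the nesting: evaluation calls only make sense inside the experiment callback, because that is where the experiment context exists.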
Important: The experiment() function is designed to work with evaluation functions. To fully understand how to use experiments effectively, you should also review:
- eval() / evalOnce() - For running individual test cases
- evalDataset() / eval_dataset() - For batch evaluation against datasets
These functions must be called within an experiment context to properly track and group your test results.

When the pipeline_id parameter is omitted, the experiment automatically submits to the default pipeline. In Python, use @experiment without parentheses when no parameters are provided. In TypeScript, use experiment(async () => { ... }). This is convenient for quick testing, but we recommend explicitly specifying the pipeline ID for production use.

Overview
An experiment in Gentrace represents a collection of test cases or evaluations run against your AI pipeline. The experiment() function:
- Creates an experiment run in Gentrace with a unique experiment ID
- Provides context for evaluation functions to associate their results with the experiment
- Manages lifecycle by automatically starting and finishing the experiment
- Captures metadata and organizes test results for analysis
Parameters
Advanced usage
With metadata
Multiple test cases
Context and lifecycle
The experiment() function manages the experiment lifecycle automatically:
1. Start - Creates a new experiment run in Gentrace
2. Context - Provides experiment context to nested evaluation functions
3. Execution - Runs your experiment callback/function
4. Finish - Marks the experiment as complete in Gentrace with status updates
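The four steps above can be sketched as a tiny wrapper. The names here (runExperiment, ExperimentRun, the status values) are illustrative, not the SDK's internals; the point is the start/context/execute/finish ordering and the status update on failure:

```typescript
type Status = "running" | "complete" | "error";

interface ExperimentRun {
  id: string;
  status: Status;
  results: string[];
}

// Hypothetical sketch of the four lifecycle steps.
function runExperiment(body: (run: ExperimentRun) => void): ExperimentRun {
  // 1. Start: create a run with a unique ID (hardcoded here for brevity)
  const run: ExperimentRun = { id: "run-1", status: "running", results: [] };
  try {
    // 2 + 3. Context and Execution: the run is handed to the callback,
    // which records its evaluation results against it
    body(run);
    // 4. Finish: mark the run complete
    run.status = "complete";
  } catch {
    run.status = "error"; // status update when the callback throws
  }
  return run;
}

const ok = runExperiment((run) => { run.results.push("case-1"); });
const failed = runExperiment(() => { throw new Error("boom"); });
```

Note that the finish step runs whether or not the callback succeeds, so every run ends in a terminal status.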
Accessing experiment context
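The SDK propagates the experiment context implicitly, so nested evaluation functions can find the active experiment without it being passed through arguments. A minimal sketch of how that mechanism can work in Node using AsyncLocalStorage (this illustrates the propagation pattern, not the SDK's actual internals; withExperiment and currentExperiment are hypothetical helpers):

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

interface ExperimentContext {
  experimentId: string;
}

const storage = new AsyncLocalStorage<ExperimentContext>();

// Hypothetical helper: everything invoked inside `body` sees the context.
function withExperiment<T>(experimentId: string, body: () => T): T {
  return storage.run({ experimentId }, body);
}

// Hypothetical helper: read the active context from anywhere in the call tree.
function currentExperiment(): ExperimentContext | undefined {
  return storage.getStore();
}

let seen: string | undefined;
withExperiment("exp-123", () => {
  // A nested evaluation helper can look up the experiment it belongs to.
  seen = currentExperiment()?.experimentId;
});
```

Outside withExperiment, currentExperiment() returns undefined, which is why evaluation functions must be called within an experiment context.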
Error handling
The experiment function handles errors gracefully and automatically associates all errors and exceptions with the OpenTelemetry span. When an error occurs within an experiment or evaluation, it is captured as span events and attributes, providing full traceability in your observability stack.

OTEL span error integration
When errors occur within experiments:
- Automatic Capture: All Error objects (TypeScript) and exceptions (Python) are automatically captured as span events
- Stack Traces: Full stack traces are preserved in span attributes for debugging
- Error Attributes: Error messages, types, and metadata are recorded as span attributes
- Span Status: The span status is automatically set to ERROR when unhandled exceptions occur
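A sketch of how an exception maps onto span events, attributes, and status. A plain object stands in for a real span here; in practice the OpenTelemetry API provides span.recordException() and span.setStatus() for the same effect, and the attribute names below follow OTEL's exception semantic conventions:

```typescript
// Plain stand-in for an OTEL span, for illustration only.
interface FakeSpan {
  events: { name: string; attributes: Record<string, string> }[];
  status: "UNSET" | "OK" | "ERROR";
}

function captureError(span: FakeSpan, err: Error): void {
  // The exception becomes a span event carrying type, message, and stack.
  span.events.push({
    name: "exception",
    attributes: {
      "exception.type": err.name,
      "exception.message": err.message,
      "exception.stacktrace": err.stack ?? "",
    },
  });
  // Unhandled exceptions flip the span status to ERROR.
  span.status = "ERROR";
}

const span: FakeSpan = { events: [], status: "UNSET" };
try {
  throw new Error("model returned empty output");
} catch (err) {
  captureError(span, err as Error);
}
```

Because the error lands on the span rather than being swallowed, any OTEL-compatible backend can surface it alongside the experiment's traces.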
Best practices
1. Use descriptive names and metadata
2. Group related tests
Organize related test cases within a single experiment.

3. Handle async operations properly
Requirements
- Gentrace SDK Initialization: Must call init() with a valid API key. The SDK automatically configures OpenTelemetry for you. For custom OpenTelemetry setups, see the manual setup guide.
- Valid Pipeline ID: Must provide a valid UUID for an existing Gentrace pipeline.
Related functions
- init() - Initialize the Gentrace SDK
- interaction() - Instrument AI functions for tracing within experiments
- evalDataset() / eval_dataset() - Run tests against a dataset within an experiment
- evalOnce() / eval() - Run individual test cases within an experiment
- traced() - Alternative approach for tracing functions