evalDataset() for batch evaluations, test cases can also be accessed and managed independently.
Basic usage
Overview
The SDK is built by Stainless and provides type-safe access to Gentrace entities. The testCases object exposes methods to list, create, update, and delete test cases.
Test case structure
Each test case contains:
- Unique identifier
- Human-readable name for the test case
- Dictionary/object containing the input data for your AI function
- Optional expected outputs for validation
- UUID of the dataset this test case belongs to
- Creation timestamp
- Last update timestamp
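As a rough illustration, the fields described above could be modelled with a TypeScript interface like the one below. The property names (id, name, inputs, expectedOutputs, datasetId, createdAt, updatedAt), the interface name, and the timestamp format are all assumptions inferred from the descriptions, not names confirmed by this page; consult the SDK's generated types for the actual shape.

```typescript
// Hypothetical shape of a test case, inferred from the field descriptions above.
// Every property name here is an assumption, not the SDK's confirmed type.
interface TestCaseSketch {
  id: string; // unique identifier
  name: string; // human-readable name for the test case
  inputs: Record<string, unknown>; // input data for your AI function
  expectedOutputs?: Record<string, unknown>; // optional expected outputs for validation
  datasetId: string; // UUID of the dataset this test case belongs to
  createdAt: string; // creation timestamp (ISO 8601 string assumed)
  updatedAt: string; // last update timestamp (ISO 8601 string assumed)
}

// An illustrative value; the data is made up for the example.
const example: TestCaseSketch = {
  id: "tc-1",
  name: "Greets a user by name",
  inputs: { userName: "Ada" },
  expectedOutputs: { greeting: "Hello, Ada!" },
  datasetId: "00000000-0000-0000-0000-000000000000",
  createdAt: "2024-01-01T00:00:00.000Z",
  updatedAt: "2024-01-01T00:00:00.000Z",
};
```

Marking expectedOutputs as optional mirrors the description of expected outputs as optional; the other fields are treated as always present.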
Resource methods
Create a test case
Retrieve a test case
Delete a test case
List with filters
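The four operations listed in this section (create, retrieve, delete, and list with filters) can be sketched with a small in-memory stand-in. This is not the Gentrace client: the class name, method signatures, and the datasetId filter are hypothetical, chosen only to show the behaviour each operation implies. A real client would make network requests, so its methods would presumably be asynchronous.

```typescript
// In-memory stand-in for the operations named above. NOT the Gentrace client:
// the class, method names, and parameter shapes are hypothetical.
interface StoredTestCase {
  id: string;
  name: string;
  inputs: Record<string, unknown>;
  datasetId: string;
}

class InMemoryTestCases {
  private items: StoredTestCase[] = [];
  private nextId = 1;

  // Create: assign an identifier and store the record.
  create(params: Omit<StoredTestCase, "id">): StoredTestCase {
    const testCase: StoredTestCase = { id: `tc-${this.nextId++}`, ...params };
    this.items.push(testCase);
    return testCase;
  }

  // Retrieve: look up one record by identifier, failing loudly if absent.
  retrieve(id: string): StoredTestCase {
    const matches = this.items.filter((tc) => tc.id === id);
    if (matches.length === 0) throw new Error(`test case ${id} not found`);
    return matches[0];
  }

  // Delete: returns true if a record was removed.
  delete(id: string): boolean {
    const before = this.items.length;
    this.items = this.items.filter((tc) => tc.id !== id);
    return this.items.length < before;
  }

  // List with filters: optionally restrict results to one dataset.
  list(filter: { datasetId?: string } = {}): StoredTestCase[] {
    return this.items.filter(
      (tc) => filter.datasetId === undefined || tc.datasetId === filter.datasetId,
    );
  }
}

const store = new InMemoryTestCases();
const first = store.create({ name: "short prompt", inputs: { prompt: "hi" }, datasetId: "ds-a" });
store.create({ name: "long prompt", inputs: { prompt: "tell me a story" }, datasetId: "ds-a" });
store.create({ name: "other dataset", inputs: { prompt: "hey" }, datasetId: "ds-b" });

const inDatasetA = store.list({ datasetId: "ds-a" }); // two records at this point
const removed = store.delete(first.id); // removes the first record
```

Filtering by dataset is used here because each test case belongs to exactly one dataset; any other filters the real list method accepts are not shown.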
Common usage with evalDataset()
Test cases are frequently used with evalDataset() for running batch evaluations.
The interaction parameter should be a function wrapped with interaction() for proper OpenTelemetry tracing within experiments.
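To make the pattern concrete, here is a minimal sketch of the idea: a function is wrapped once, then run against the inputs of every test case. Both interactionSketch and evalDatasetSketch are local stand-ins written for this example, with hypothetical signatures; they are not imported from the Gentrace SDK and do no real tracing. The sketch is synchronous for brevity, whereas real evaluations would normally be asynchronous.

```typescript
// Local stand-ins only: neither function below comes from the Gentrace SDK,
// and their signatures are hypothetical.
type Inputs = Record<string, unknown>;

interface Case {
  name: string;
  inputs: Inputs;
}

// Stand-in for interaction(): wraps a function so that each call is recorded,
// roughly where a tracing span would be opened and closed.
const tracedCalls: string[] = [];
function interactionSketch<T>(label: string, fn: (inputs: Inputs) => T): (inputs: Inputs) => T {
  return (inputs) => {
    tracedCalls.push(label);
    return fn(inputs);
  };
}

// Stand-in for evalDataset(): run the wrapped function once per test case.
function evalDatasetSketch<T>(
  cases: Case[],
  run: (inputs: Inputs) => T,
): { name: string; output: T }[] {
  return cases.map((c) => ({ name: c.name, output: run(c.inputs) }));
}

// The function under evaluation, wrapped before it is handed to the runner.
const greet = interactionSketch("greet", (inputs) => `Hello, ${String(inputs["userName"])}!`);

const results = evalDatasetSketch(
  [
    { name: "greets Ada", inputs: { userName: "Ada" } },
    { name: "greets Linus", inputs: { userName: "Linus" } },
  ],
  greet,
);
```

Wrapping the function before passing it to the runner means every per-test-case call goes through the same instrumentation point, which is the role the text assigns to interaction().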
See also
- evalDataset() - Common usage pattern for batch evaluations
- Datasets - Managing datasets that contain test cases