Version: 4.7.66

Test Cases

🛑 Alpha

OpenTelemetry support is currently in alpha and may undergo significant changes.

The test cases SDK provides programmatic access to Gentrace test cases. While commonly used with evalDataset() for batch evaluations, test cases can also be accessed and managed independently.

Overview

The SDK is built by Stainless and provides type-safe access to Gentrace entities. The testCases object exposes methods to list, create, update, and delete test cases.

Basic usage

typescript
import { init, testCases } from 'gentrace';

init({
  apiKey: process.env.GENTRACE_API_KEY,
});

// List test cases from a dataset
const testCasesList = await testCases.list({
  datasetId: 'your-dataset-id',
});

// Access the test cases
for (const testCase of testCasesList.data) {
  console.log(testCase.name);
  console.log(testCase.inputs);
}

Test case structure

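Each test case contains:

  • name (optional): Human-readable name for the test case
  • id (optional): Unique identifier
  • inputs: Dictionary/object containing the input data for your AI function
  • expectedOutputs (optional): Dictionary/object containing the expected output data, as passed to create()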
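The fields described in this section can be written down as a type for use in your own helper code. The sketch below is illustrative only: TestCaseLike and isTestCaseLike are hypothetical names, not the SDK's exported types, and the field types are assumptions inferred from this page (expectedOutputs is taken from the create example).

```typescript
// Hypothetical shape inferred from this page; not the SDK's exported type.
interface TestCaseLike {
  id?: string;
  name?: string;
  inputs: Record<string, unknown>;
  expectedOutputs?: Record<string, unknown>; // assumed optional, as in the create example
}

// Returns true when `value` is a plain (non-array) object.
function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Type guard: checks that `inputs` is an object and that optional fields,
// when present, have the assumed types.
function isTestCaseLike(value: unknown): value is TestCaseLike {
  if (!isPlainObject(value)) return false;
  const idOk = value.id === undefined || typeof value.id === 'string';
  const nameOk = value.name === undefined || typeof value.name === 'string';
  const expectedOk =
    value.expectedOutputs === undefined || isPlainObject(value.expectedOutputs);
  return isPlainObject(value.inputs) && idOk && nameOk && expectedOk;
}

const exampleCase: TestCaseLike = {
  name: 'Basic AI question',
  inputs: { query: 'What is AI?' },
};
```

A guard like this can be handy when test cases are loaded from a local JSON file before being sent to the API.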

Resource methods

Create a test case

typescript
const testCase = await testCases.create({
  datasetId: 'your-dataset-id',
  inputs: { query: 'What is AI?' },
  name: 'Basic AI question',
  expectedOutputs: { answer: 'Artificial Intelligence is...' }, // optional
});

Retrieve a test case

typescript
const testCase = await testCases.retrieve('test-case-id');
console.log(testCase.inputs);

Delete a test case

typescript
await testCases.delete('test-case-id');

List with filters

typescript
// Filter by pipeline
const testCasesList = await testCases.list({
  pipelineId: 'pipeline-id',
  // or use pipelineSlug: 'pipeline-slug'
});
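This page shows three filter keys for list(): datasetId, pipelineId, and pipelineSlug. One way to keep call sites from mixing them accidentally is a small discriminated union that produces the parameter object. This is only a sketch: ListFilter and toListParams are hypothetical helpers, and whether the API accepts several filters at once is not stated here, so the one-filter-per-call rule is an assumption of the sketch rather than a documented constraint.

```typescript
// Hypothetical helper types; not part of the SDK.
type ListFilter =
  | { kind: 'dataset'; datasetId: string }
  | { kind: 'pipelineId'; pipelineId: string }
  | { kind: 'pipelineSlug'; pipelineSlug: string };

// Builds a parameter object containing exactly one of the filter keys
// shown on this page.
function toListParams(filter: ListFilter): Record<string, string> {
  switch (filter.kind) {
    case 'dataset':
      return { datasetId: filter.datasetId };
    case 'pipelineId':
      return { pipelineId: filter.pipelineId };
    case 'pipelineSlug':
      return { pipelineSlug: filter.pipelineSlug };
  }
}

const bySlug = toListParams({ kind: 'pipelineSlug', pipelineSlug: 'pipeline-slug' });
console.log(bySlug);
```

The resulting object has the same keys as the list() examples on this page; the SDK's own parameter type may be stricter than Record<string, string>.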

Common usage with evalDataset()

Test cases are frequently used with evalDataset() for running batch evaluations:

typescript
import { evalDataset, testCases } from 'gentrace';

const DATASET_ID = 'your-dataset-id';

await evalDataset({
  // Fetch the dataset's test cases when the evaluation starts
  data: async () => {
    const testCasesList = await testCases.list({ datasetId: DATASET_ID });
    return testCasesList.data;
  },
  interaction: yourAIFunction, // See interaction() docs
});

The interaction parameter should be a function wrapped with interaction() for proper OpenTelemetry tracing within experiments.
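To make the data flow concrete, here is a local stand-in that runs a function over a list of test cases without calling Gentrace at all. It is not how evalDataset() is implemented and it does no tracing. LocalTestCase, answerQuery, and runLocally are hypothetical names, and the sketch assumes each test case's inputs object is what gets passed to your function.

```typescript
// Hypothetical local shape; mirrors the fields described on this page.
interface LocalTestCase {
  name?: string;
  inputs: Record<string, unknown>;
}

// Stand-in for yourAIFunction: echoes the `query` input instead of calling a model.
async function answerQuery(inputs: Record<string, unknown>): Promise<string> {
  const query = typeof inputs.query === 'string' ? inputs.query : '';
  return `echo: ${query}`;
}

// Loads test cases via `data`, then applies `fn` to each case's inputs.
// Assumption: the evaluated function receives the `inputs` object of each case.
async function runLocally(
  data: () => Promise<LocalTestCase[]>,
  fn: (inputs: Record<string, unknown>) => Promise<string>,
): Promise<{ name: string; output: string }[]> {
  const cases = await data();
  return Promise.all(
    cases.map(async (testCase, index) => ({
      name: testCase.name ?? `case-${index}`,
      output: await fn(testCase.inputs),
    })),
  );
}

// Usage with in-memory data instead of testCases.list().
runLocally(
  async () => [{ name: 'Basic AI question', inputs: { query: 'What is AI?' } }],
  answerQuery,
).then((results) => console.log(results));
```

A dry run like this can help check that a function handles the input keys stored in a dataset before starting a real experiment.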

See also