Test Cases
🛑Alpha
OpenTelemetry support is currently in alpha and may undergo significant changes.
The test cases SDK provides programmatic access to Gentrace test cases. While commonly used with evalDataset()
for batch evaluations, test cases can also be accessed and managed independently.
Overview
The SDK is built by Stainless and provides type-safe access to Gentrace entities. The testCases object exposes methods to list, create, update, and delete test cases.
Basic usage
typescript
import { init, testCases } from 'gentrace';

init({
  apiKey: process.env.GENTRACE_API_KEY,
});

// List test cases from a dataset
const testCasesList = await testCases.list({ datasetId: 'your-dataset-id' });

// Access the test cases
for (const testCase of testCasesList.data) {
  console.log(testCase.name);
  console.log(testCase.inputs);
}
python
import asyncio
import os

from gentrace import init, test_cases

init(api_key=os.environ["GENTRACE_API_KEY"])


async def main():
    # List test cases from a dataset
    test_case_list = await test_cases.list(dataset_id="your-dataset-id")

    # Access the test cases
    for test_case in test_case_list.data:
        print(test_case.name)
        print(test_case.inputs)


# `await` is only valid inside a coroutine, so run the example via asyncio
asyncio.run(main())
Test case structure
Each test case contains:
- name (optional): Human-readable name for the test case
- id (optional): Unique identifier
- inputs: Dictionary/object containing the input data for your AI function
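The fields listed above can be mirrored locally with a small typed container, which can be handy for building fixtures or checking payloads before sending them. This is a minimal sketch, not the SDK's own type: LocalTestCase and from_payload are hypothetical names, and the objects returned by the SDK may carry additional fields.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class LocalTestCase:
    """Hypothetical local mirror of the test case fields described above."""

    inputs: Dict[str, Any]       # input data for your AI function
    name: Optional[str] = None   # human-readable name (optional)
    id: Optional[str] = None     # unique identifier (optional)


def from_payload(payload: Dict[str, Any]) -> LocalTestCase:
    """Build a LocalTestCase from a plain dict, requiring only `inputs`."""
    if not isinstance(payload.get("inputs"), dict):
        raise ValueError("a test case needs an 'inputs' dictionary")
    return LocalTestCase(
        inputs=payload["inputs"],
        name=payload.get("name"),
        id=payload.get("id"),
    )


case = from_payload({"name": "Basic AI question", "inputs": {"query": "What is AI?"}})
print(case.name, case.inputs)
```

Only inputs is treated as required here, matching the list above where name and id are marked optional.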
Resource methods
Create a test case
typescript
const testCase = await testCases.create({
  datasetId: 'your-dataset-id',
  inputs: { query: 'What is AI?' },
  name: 'Basic AI question',
  expectedOutputs: { answer: 'Artificial Intelligence is...' }, // optional
});
python
test_case = await test_cases.create(
    dataset_id="your-dataset-id",
    inputs={"query": "What is AI?"},
    name="Basic AI question",
    expected_outputs={"answer": "Artificial Intelligence is..."},  # optional
)
Retrieve a test case
typescript
const testCase = await testCases.retrieve('test-case-id');
console.log(testCase.inputs);
python
test_case = await test_cases.retrieve("test-case-id")
print(test_case.inputs)
Delete a test case
typescript
await testCases.delete('test-case-id');
python
await test_cases.delete("test-case-id")
List with filters
typescript
// Filter by pipeline
const testCasesList = await testCases.list({
  pipelineId: 'pipeline-id',
  // or use pipelineSlug: 'pipeline-slug'
});
python
# Filter by pipeline
test_case_list = await test_cases.list(
    pipeline_id="pipeline-id",
    # or use pipeline_slug="pipeline-slug"
)
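Because the examples show dataset_id, pipeline_id, and pipeline_slug as alternative ways to scope a listing, it may help to choose the filter in one place before calling list. The helper below is a hypothetical sketch: build_list_filter is not part of the SDK, and the rule that exactly one filter is passed is an assumption of this example, not something the API is documented here to enforce.

```python
from typing import Dict, Optional


def build_list_filter(
    dataset_id: Optional[str] = None,
    pipeline_id: Optional[str] = None,
    pipeline_slug: Optional[str] = None,
) -> Dict[str, str]:
    """Return keyword arguments for a list call, requiring exactly one filter."""
    candidates = {
        "dataset_id": dataset_id,
        "pipeline_id": pipeline_id,
        "pipeline_slug": pipeline_slug,
    }
    # Keep only the filters that were actually provided
    chosen = {key: value for key, value in candidates.items() if value}
    if len(chosen) != 1:
        raise ValueError("pass exactly one of dataset_id, pipeline_id, pipeline_slug")
    return chosen


filters = build_list_filter(pipeline_slug="pipeline-slug")
print(filters)  # could then be passed on as test_cases.list(**filters)
```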
Common usage with evalDataset()
Test cases are frequently used with evalDataset() for running batch evaluations:
typescript
await evalDataset({
  data: async () => {
    const testCasesList = await testCases.list({ datasetId: DATASET_ID });
    return testCasesList.data;
  },
  interaction: yourAIFunction, // See interaction() docs
});
python
async def fetch_test_cases():
    test_case_list = await test_cases.list(dataset_id=DATASET_ID)
    return test_case_list.data


await eval_dataset(
    data=fetch_test_cases,
    interaction=your_ai_function,  # See interaction() docs
)
The interaction parameter should be a function wrapped with interaction() for proper OpenTelemetry tracing within experiments.
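To make the shape of this pattern concrete without a Gentrace account, the sketch below wires a data callable and an interaction function together using a tiny stand-in runner. FakeTestCase, echo_interaction, and run_batch are hypothetical and not part of Gentrace. The stand-in assumes the interaction is called with each test case's inputs as keyword arguments, which may differ from how eval_dataset actually invokes it, and it performs no tracing.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class FakeTestCase:
    """Hypothetical stand-in for a test case returned by the SDK."""

    name: str
    inputs: Dict[str, Any]


async def fetch_test_cases() -> List[FakeTestCase]:
    # Real code would await test_cases.list(...) and return its .data
    return [
        FakeTestCase(name="Basic AI question", inputs={"query": "What is AI?"}),
        FakeTestCase(name="Follow-up", inputs={"query": "What is ML?"}),
    ]


async def echo_interaction(query: str) -> str:
    # Placeholder for your AI function
    return f"answer to: {query}"


async def run_batch(data, interaction) -> Dict[str, Any]:
    """Stand-in runner: load the test cases, then call the interaction once per case."""
    cases = await data()
    outputs = await asyncio.gather(*(interaction(**case.inputs) for case in cases))
    return {case.name: output for case, output in zip(cases, outputs)}


results = asyncio.run(run_batch(fetch_test_cases, echo_interaction))
print(results)
```

The point of the sketch is the division of labor: the data callable only fetches test cases, and the interaction only maps one set of inputs to an output, so either side can be swapped independently.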
See also
- Test Cases API Reference - Full API documentation
- evalDataset() - Common usage pattern for batch evaluations
- Datasets API Reference - Managing datasets that contain test cases