Create and manage datasets with test cases for AI evaluation in Gentrace.
Datasets in Gentrace are collections of test cases used to evaluate your AI models and pipelines. They provide a structured way to organize your test data, track performance over time, and ensure consistent evaluation across different model versions.
// List all test cases in a dataset
const testCasesList = await testCases.list({
  datasetId: 'your-dataset-id'
});

// Access the test cases
for (const testCase of testCasesList.data) {
  console.log(testCase.name);
  console.log(testCase.inputs);
}
The easiest way to import large amounts of test data is through CSV files. Your CSV should have columns for the test case name, inputs, and expected outputs.
Example CSV structure:
name,input_query,input_context,expected_response
"Billing question","How much does the premium plan cost?","New customer inquiry","The premium plan costs $29/month..."
"Technical issue","Login not working","Existing customer","Please try clearing your browser cache..."
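To make the column layout concrete, the following TypeScript sketch groups one already-parsed CSV row into a name, inputs, and expected outputs. The `input_` / `expected_` prefix rule is inferred from the example header above, and `TestCaseRecord` and `rowToTestCase` are hypothetical names, not part of the Gentrace SDK; the importer's actual column mapping may differ.

```typescript
// Hypothetical shape for one test case, modeled on the example columns.
interface TestCaseRecord {
  name: string;
  inputs: Record<string, string>;
  expectedOutputs: Record<string, string>;
}

// Assumption: "input_*" columns are inputs and "expected_*" columns are
// expected outputs, as suggested by the example header.
function rowToTestCase(row: Record<string, string>): TestCaseRecord {
  const record: TestCaseRecord = {
    name: row["name"] ?? "",
    inputs: {},
    expectedOutputs: {},
  };
  for (const [column, value] of Object.entries(row)) {
    if (column.startsWith("input_")) {
      // Strip the prefix so "input_query" becomes the key "query".
      record.inputs[column.slice("input_".length)] = value;
    } else if (column.startsWith("expected_")) {
      record.expectedOutputs[column.slice("expected_".length)] = value;
    }
  }
  return record;
}

// One row from the example CSV, keyed by column name.
const exampleRow = {
  name: "Technical issue",
  input_query: "Login not working",
  input_context: "Existing customer",
  expected_response: "Please try clearing your browser cache...",
};

const exampleTestCase = rowToTestCase(exampleRow);
console.log(JSON.stringify(exampleTestCase, null, 2));
```

Columns that match neither prefix (other than `name`) are ignored in this sketch, which is a design choice for illustration rather than documented importer behavior.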
To import via the web interface:
Navigate to your dataset in the Gentrace dashboard
You can also import JSON or JSONL files with structured test case data:
[
  {
    "name": "Billing question",
    "inputs": {
      "query": "How much does the premium plan cost?",
      "context": "New customer inquiry"
    },
    "expectedOutputs": {
      "response": "The premium plan costs $29/month and includes..."
    }
  }
]
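JSONL stores one JSON object per line instead of wrapping the records in an array. As a rough sketch, an array like the one above can be converted with plain `JSON.stringify`. The `JsonTestCase` type and `toJsonl` helper are hypothetical names used only for illustration, and the exact fields the importer accepts are assumed to match the JSON example.

```typescript
// Hypothetical type mirroring the JSON example's fields.
type JsonTestCase = {
  name: string;
  inputs: Record<string, unknown>;
  expectedOutputs: Record<string, unknown>;
};

// JSONL: one compact JSON object per line, with no enclosing array
// and no commas between records.
function toJsonl(cases: JsonTestCase[]): string {
  return cases.map((testCase) => JSON.stringify(testCase)).join("\n");
}

const cases: JsonTestCase[] = [
  {
    name: "Billing question",
    inputs: {
      query: "How much does the premium plan cost?",
      context: "New customer inquiry",
    },
    expectedOutputs: {
      response: "The premium plan costs $29/month and includes...",
    },
  },
  {
    name: "Technical issue",
    inputs: { query: "Login not working", context: "Existing customer" },
    expectedOutputs: {
      response: "Please try clearing your browser cache...",
    },
  },
];

const jsonl = toJsonl(cases);
console.log(jsonl);
```

Because `JSON.stringify` without an indent argument emits no line breaks, each record stays on a single line, which is what line-by-line JSONL readers expect.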
Each pipeline can have a “golden dataset”, a special dataset that represents your core test cases. You can mark a dataset as golden when creating or updating it:
// Mark a dataset as golden during creation
const goldenDataset = await datasets.create({
  name: 'Golden Test Cases',
  description: 'Core evaluation test cases',
  pipelineId: 'your-pipeline-id',
  isGolden: true
});

// Or update an existing dataset to be golden
await datasets.update('dataset-id', {
  isGolden: true
});