The Python eval_dataset() and TypeScript evalDataset() functions run a series of evaluations against a dataset using a provided interaction() function. They must be called within the context of an experiment(), and they automatically create an individual test span for each test case in the dataset.
The Gentrace SDK automatically configures OpenTelemetry when you
call init(). If you have an existing OpenTelemetry setup or need
custom configuration, see the manual setup
guide.
Returns a sequence of results from the interaction function. Failed test cases (due to validation errors) will have None values in the corresponding positions.
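Because failed positions stay in the returned sequence, downstream code usually needs to separate successes from failures. The following is a minimal sketch of that step, assuming failed positions come back as null or undefined in TypeScript (the text above only specifies None for Python). The `Result` type and `partitionResults` helper are hypothetical and not part of the Gentrace SDK.

```typescript
// Hypothetical result type: whatever your interaction function returns.
type Result = { answer: string };

// Split a results array into successful values and the indices of failed cases.
// Assumption: a failed position holds null or undefined (None in Python).
function partitionResults<T>(results: Array<T | null | undefined>): {
  succeeded: T[];
  failedIndices: number[];
} {
  const succeeded: T[] = [];
  const failedIndices: number[] = [];
  results.forEach((result, index) => {
    if (result === null || result === undefined) {
      failedIndices.push(index);
    } else {
      succeeded.push(result);
    }
  });
  return { succeeded, failedIndices };
}

// Example: the second test case failed validation, so its slot is empty.
const sample: Array<Result | null> = [{ answer: 'a' }, null, { answer: 'c' }];
const { succeeded, failedIndices } = partitionResults(sample);
console.log(succeeded.length, failedIndices);
```

Keeping the indices of failed cases makes it possible to map each failure back to the test case at the same position in the dataset.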
Schema validation ensures that your test cases have the correct structure and data types before being passed to your interaction function. Use Zod for TypeScript or Pydantic for Python to define your input schemas.
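To make "correct structure and data types" concrete, here is a dependency-free sketch of the kind of check a schema such as z.object({ prompt: z.string() }) expresses. It is only an illustration: in real code you would pass a Zod or Pydantic schema, and the `PromptInputs` type and `isPromptInputs` guard below are hypothetical names.

```typescript
// Hypothetical input shape: an object with a single string field `prompt`.
type PromptInputs = { prompt: string };

// Type guard: true only when `value` is a non-null object with a string `prompt`.
function isPromptInputs(value: unknown): value is PromptInputs {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as Record<string, unknown>).prompt === 'string'
  );
}

// Candidate inputs: the second has the wrong key, the third the wrong type.
const candidates: unknown[] = [
  { prompt: 'Valid input' },
  { invalid: 'data' },
  { prompt: 42 },
];

// Only candidates that pass the check would reach the interaction function.
const accepted = candidates.filter(isPromptInputs);
console.log(accepted.length);
```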
You can provide test cases from any source by implementing a custom data provider function. Each data point must conform to the TestInputs structure from above. This is useful when you want to use test cases fetched from the Gentrace API via the test cases SDK, or to define them directly inline.
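A custom data provider can be as small as a function that maps records from some other source into test cases. The sketch below assumes each test case is an object with an `inputs` field, as in the inline examples on this page; the `TestCase` and `QaInputs` types, the `rawRows` data, and the `loadTestCases` function are hypothetical names used for illustration.

```typescript
// Assumed test-case shape: an object with an `inputs` field.
// Any further fields of the real structure are not modeled here.
type TestCase<T> = { inputs: T };
type QaInputs = { prompt: string };

// Hypothetical raw records, e.g. rows loaded from a file or another service.
const rawRows = [
  { question: 'What is OpenTelemetry?' },
  { question: 'What is a span?' },
];

// Data provider: builds an array of test cases from the raw records.
function loadTestCases(): TestCase<QaInputs>[] {
  return rawRows.map((row) => ({ inputs: { prompt: row.question } }));
}

console.log(loadTestCases());
```

A function like `loadTestCases` could then be supplied as the data option, in the same way the inline arrow function is supplied in the evalDataset example on this page.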
The dataset evaluation functions handle errors gracefully and automatically associate all errors and exceptions with OpenTelemetry spans. When validation or interaction errors occur, they are captured as span events and attributes.
experiment(async () => {
  await evalDataset({
    data: () => [
      { inputs: { prompt: 'Valid input' } },
      { inputs: { invalid: 'data' } }, // This might fail validation
      { inputs: { prompt: 'Another valid input' } },
    ],
    schema: z.object({ prompt: z.string() }),
    interaction: async (inputs) => {
      // This function might throw errors
      if (inputs.prompt.includes('error')) {
        throw new Error('Simulated error');
      }
      return await myAIFunction(inputs);
    },
  });

  // evalDataset continues processing other test cases even if some fail
  // All errors are automatically captured in OpenTelemetry spans
});
Gentrace SDK Initialization: Must call init() with a valid API key. The SDK automatically configures OpenTelemetry for you. For custom OpenTelemetry setups, see the manual setup guide.
Experiment Context: Must be called within an experiment() function
Valid Data Provider: The data function must return an array of test cases