Using local test data
Gentrace allows you to use local test data for evaluations, rather than using test case data defined in datasets.
Using local test data is currently in alpha and may undergo significant changes.
This feature is currently available only in our TypeScript SDK. It has not yet been implemented in the Python SDK.
Using createTestRunners()
To use local test data, call the createTestRunners() function. It creates test runners from your local data, which you can then use to run your evaluations and submit the results.
typescript
// submitTestRunners is called at the end of this example, so it is imported here
// (assumed to be exported from @gentrace/core alongside createTestRunners)
import {
  createTestRunners,
  submitTestRunners,
  LocalTestData,
  Pipeline,
} from "@gentrace/core";
import { initPlugin } from "@gentrace/openai";

// Initialize Gentrace and OpenAI plugin
const plugin = await initPlugin({ apiKey: process.env.OPENAI_KEY });

const pipeline = new Pipeline({
  slug: "your-pipeline-slug",
  plugins: { openai: plugin },
});

// Define local test data
const localData: LocalTestData[] = [
  {
    name: "Test Case 1",
    inputs: { prompt: "Convert this sentence to JSON: John is 10 years old." },
  },
  // Add more test cases as needed
];

// Create test runners using local data
const testRunners = createTestRunners(pipeline, localData);

// Process test cases
for (const [runner, testCase] of testRunners) {
  // Use runner.openai to make API calls
  const completion = await runner.openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [{ role: "user", content: testCase.inputs.prompt }],
  });

  // Optionally, add local evaluations
  runner.addEval({
    name: "example-eval",
    value: 0.8,
    // Add more evaluation details as needed
  });
}

// Submit test runners
const result = await submitTestRunners(pipeline, testRunners);
console.log("Test result ID:", result.resultId);
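The example above passes a hardcoded value of 0.8 to runner.addEval(). In practice you would likely compute that value from the model output. The following is a minimal sketch of one way to do so for the JSON-conversion prompt used above. The scoreJsonOutput helper and its scoring scheme (1, 0.5, 0) are hypothetical and are not part of the Gentrace SDK.

```typescript
// Hypothetical helper (not part of the Gentrace SDK): scores a model reply
// for a "convert this sentence to JSON" prompt.
// Returns 1 for a JSON object, 0.5 for any other valid JSON value, 0 otherwise.
function scoreJsonOutput(output: string | null | undefined): number {
  if (!output) {
    return 0; // empty or missing reply
  }
  try {
    const parsed: unknown = JSON.parse(output.trim());
    const isObject =
      typeof parsed === "object" && parsed !== null && !Array.isArray(parsed);
    return isObject ? 1 : 0.5;
  } catch {
    return 0; // reply could not be parsed as JSON
  }
}

console.log(scoreJsonOutput('{"name": "John", "age": 10}')); // 1
console.log(scoreJsonOutput("John is 10 years old.")); // 0
```

Assuming the completion follows the usual OpenAI chat response shape, the result of scoreJsonOutput(completion.choices[0].message.content) could be passed as the value field of runner.addEval() in place of the fixed 0.8.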
Viewing results
After submitting your test runners, you can view the results in the Gentrace UI. The evaluations created using local test data will render in the same way as other evaluations in Gentrace, allowing for easy comparison and analysis.
For more information on how to compare results and view them in different formats, refer to our guide on Building comparisons.