Version: 4.7.8

Test results - Run test

The runTest() function also creates a test result but simplifies the operation by pulling test cases and submitting the test result in a single function.

As part of this process, you specify a PipelineRun class instance that captures intermediate generative steps that will be associated with the test result.

Learn more about how to use this function in a guided way in the tracing docs.

Example

typescript
import { init, runTest, Pipeline } from "@gentrace/core";
import { generateAiResponse } from "../pipelines";

const PIPELINE_SLUG = "my-pipeline";

init({
  apiKey: process.env.GENTRACE_API_KEY,
});

const pipeline = new Pipeline({
  slug: PIPELINE_SLUG,
});

await runTest(
  PIPELINE_SLUG,
  async (testCase) => {
    const runner = pipeline.start();

    const outputs = await runner.measure(
      (inputs) => {
        return {
          example: generateAiResponse(inputs),
        };
      },
      [testCase.inputs],
    );

    await runner.submit();

    // 🚧 Passing the runner back from this function is very important
    return [outputs, runner];
  },
);

Arguments

pipelineSlug: string

testFunction: (testCase: TestCase) => Promise<[any, PipelineRun]>

This function accepts a test case as a parameter. The return value of this function is an array where the first element is the output of the test case and the second element is the PipelineRun class instance that captures intermediate generative steps.

typescript
await runTest(
  PIPELINE_SLUG,
  async (testCase) => {
    const runner = pipeline.start();
    const outputs = await runner.measure(
      (inputs) => {
        return {
          example: generateAiResponse(inputs),
        };
      },
      [testCase.inputs],
    );
    await runner.submit();
    // 🚧 Passing the runner back from this function is very important
    return [outputs, runner];
  },
);

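context?: { name: string, metadata: MetadataValueObject }

Optional name and metadata to attach to the test result. In the example below, each metadata entry is an object with a type and a value.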

typescript
await runTest(
  PIPELINE_SLUG,
  async (testCase) => {
    const runner = pipeline.start();

    const outputs = await runner.measure(
      (inputs) => {
        console.log("inputs", inputs);
        // Return a fixed HTML string as the output
        return {
          example:
            "<h1>Example</h1><div>This is an <strong>example</strong></div>",
        };
      },
      [testCase.inputs],
      {
        context: {
          render: {
            type: "html",
            key: "example",
          },
        },
      },
    );

    await runner.submit();

    return [outputs, runner];
  },
  {
    name: "Rendering HTML",
    metadata: {
      promptString: {
        type: "string",
        value: "What is the basic unit of life?",
      },
    },
  },
);

caseFilter?: (testCase: TestCase) => boolean

Optional filter function that is called for each test case. For example, you can define a function to only run test cases that have a certain name prefix.

typescript
await runTest(
  PIPELINE_SLUG,
  async (testCase) => {
    const runner = pipeline.start();

    const outputs = await runner.measure(
      (inputs) => {
        return {
          yourOutputKey: "Your output value",
        };
      },
      [testCase.inputs],
    );

    await runner.submit();

    return [outputs, runner];
  },
  (testCase) => testCase.name.startsWith("Production test case:"),
);
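
If several filters are needed, small predicate builders can keep them readable. The following is a minimal sketch, not part of the SDK: `NamedCase`, `byNamePrefix`, and `allOf` are hypothetical names, and `NamedCase` is a local stand-in that assumes only the `name` field of a test case is inspected.

```typescript
// Local stand-in for the SDK's TestCase type (assumption: only `name` is used).
interface NamedCase {
  name: string;
}

type CaseFilter<T extends NamedCase> = (testCase: T) => boolean;

// Hypothetical helper: keep cases whose name starts with the given prefix.
function byNamePrefix<T extends NamedCase>(prefix: string): CaseFilter<T> {
  return (testCase) => testCase.name.startsWith(prefix);
}

// Hypothetical helper: keep cases that satisfy every supplied filter.
function allOf<T extends NamedCase>(...filters: CaseFilter<T>[]): CaseFilter<T> {
  return (testCase) => filters.every((filter) => filter(testCase));
}

const productionOnly = byNamePrefix<NamedCase>("Production test case:");

// Sample data for illustration only.
const sampleCases: NamedCase[] = [
  { name: "Production test case: refund request" },
  { name: "Draft: refund request" },
];

// Names of the cases that pass the filter.
const selectedNames = sampleCases.filter(productionOnly).map((c) => c.name);
```

A filter built this way has the same `(testCase) => boolean` shape as the inline arrow function in the example above, so it could be passed in its place.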

Return value

The function returns an object containing the test result ID as a UUID string. Here's an example response structure.

json
{
"resultId": "FACB6642-4725-4FAE-9323-634E72533C89"
}

You can then use this ID to retrieve the test result using the getTestResult() function or check the status with the getTestResultStatus() function.
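
If you need to wait on the status, one option is a small polling loop. The sketch below is an assumption-laden illustration, not SDK code: `pollUntil` is a hypothetical helper, and because the shape of the status response is not described on this page, the completion check is left to the caller.

```typescript
// Hypothetical helper: call `check` repeatedly until `isDone` accepts its
// result, waiting `intervalMs` between attempts, and give up after `maxAttempts`.
async function pollUntil<T>(
  check: () => Promise<T>,
  isDone: (value: T) => boolean,
  options: { intervalMs: number; maxAttempts: number },
): Promise<T> {
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    const value = await check();
    if (isDone(value)) {
      return value;
    }
    // Sleep only if another attempt will follow.
    if (attempt < options.maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, options.intervalMs));
    }
  }
  throw new Error(`Not done after ${options.maxAttempts} attempts`);
}

// Possible wiring (assumption: getTestResultStatus accepts the result ID;
// supply your own completion check for whatever status it returns):
//   const status = await pollUntil(
//     () => getTestResultStatus(resultId),
//     (s) => /* your completion check */ true,
//     { intervalMs: 2000, maxAttempts: 30 },
//   );
```

Bounding the number of attempts keeps a stalled evaluation from blocking a CI job indefinitely.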

Types

🛠️ MetadataValueObject

type: string

{ [key: string]: any }
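
As a rough illustration, the shape above can be restated as a TypeScript type and used to build the `metadata` object shown in the context example. This is a sketch under assumptions: the type is redeclared locally rather than imported, and `stringMetadata` is a hypothetical convenience helper.

```typescript
// Local restatement of the documented shape: a required `type` plus any other keys.
type MetadataValueObject = { type: string; [key: string]: any };

// Hypothetical helper: wrap plain strings as { type: "string", value } entries.
function stringMetadata(
  values: Record<string, string>,
): Record<string, MetadataValueObject> {
  const metadata: Record<string, MetadataValueObject> = {};
  for (const [key, value] of Object.entries(values)) {
    metadata[key] = { type: "string", value };
  }
  return metadata;
}

// Mirrors the context object from the "Rendering HTML" example.
const resultContext = {
  name: "Rendering HTML",
  metadata: stringMetadata({
    promptString: "What is the basic unit of life?",
  }),
};
```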