Skip to main content
Version: 4.5.0

Results with select cases

You may not want to create a result containing the runs for every test case. Instead, you may want to create a result with only a subset of test cases.

For example, if your pipeline degrades on Japanese language queries, you may want to create a result with those test cases only.

submitTestResult() / submit_test_result()

View documentation for simple submission of test results here. You can use this function in a couple ways to accomplish the same goal.

Filter TestCase list

typescript
import { getTestCases, init, submitTestResult } from "@gentrace/core";
import { aiPipeline } from "../pipelines"; // TODO: replace with your own pipeline
 
init({
apiKey: process.env.GENTRACE_API_KEY ?? "",
});
 
const PIPELINE_SLUG = "guess-the-year";
 
const japaneseCases = (await getTestCases(PIPELINE_SLUG))
.filter(tc => tc.name.startsWith("Japanese"));
 
const outputsList : Record<String, any> = []
 
for (const testCase of japaneseCases) {
const outputs = aiPipeline(testCase);
outputsList.push(outputs);
}
 
await submitTestResult(
PIPELINE_SLUG,
japaneseCases,
outputs
);

Pull specific TestCase

typescript
import { getTestCase, init, submitTestResult } from "@gentrace/core";
import { aiPipeline } from "../pipelines"; // TODO: replace with your own pipeline
 
init({
apiKey: process.env.GENTRACE_API_KEY
});
 
const PIPELINE_SLUG = "guess-the-year";
 
const TEST_CASE_ID = "38485a80-4291-4f33-8797-36d4e9f9ad3f";
 
const failingCase = await getTestCase(TEST_CASE_ID);
 
const caseOutputs : Record<string, any> = aiPipeline(failingCase);
 
// Creates result only for this failing case
await submitTestResult(
PIPELINE_SLUG,
[failingCase],
[caseOutputs],
);

runTest() / run_test()

View documentation for the callback submission of test results here. You can use a filter function to specify the cases you want to run.

typescript
await runTest(
PIPELINE_SLUG,
async (testCase) => {
const runner = pipeline.start();
 
const outputs = await runner.measure(
(inputs) => {
return {
yourOutputKey: "Your output value",
}
},
[testCase.inputs],
);
 
await runner.submit();
 
return [outputs, runner];
},
(testCase) => testCase.name.startsWith("Japanese test case:")
);

The specified callback will only fire for test cases that match the filter function.

If you want to run the callback only on a single test case, you can write a modified filter function that only returns true for that test case.

typescript
await runTest(
PIPELINE_SLUG,
async (testCase) => {
const runner = pipeline.start();
 
const outputs = await runner.measure(
(inputs) => {
return {
yourOutputKey: "Your output value",
}
},
[testCase.inputs],
);
 
await runner.submit();
 
return [outputs, runner];
},
(testCase) => testCase.id === "38485a80-4291-4f33-8797-36d4e9f9ad3f"
(parameter) testCase: Omit<TestCase, "createdAt" | "updatedAt" | "archivedAt">
);