Version: 2.0.0

Test case basics

Test cases are example scenarios that your generative AI pipeline might encounter. Each test case belongs to exactly one pipeline. A test case contains:

  • A unique name
  • Inputs that will be passed to your AI pipeline
  • Expected output (optional, depending on the evaluators that you need)
  • Expected output steps (optional, also depending on the evaluators that you need)
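To make the list above concrete, here is a minimal sketch of how a test case could be modeled in TypeScript. The type name `TestCaseSketch` and its field names are assumptions chosen for illustration, not the SDK's actual types, and the email addresses are hypothetical placeholders.

```typescript
// Hypothetical shape for illustration only; not an official Gentrace type.
type JsonObject = Record<string, unknown>;

interface TestCaseSketch {
  // Unique, human-readable name
  name: string;
  // Parameters passed to the AI pipeline (must be a JSON object)
  inputs: JsonObject;
  // Optional: only needed by evaluators that compare against an ideal output
  expectedOutputs?: JsonObject;
  // Optional: only needed by evaluators that inspect intermediate steps.
  // The exact structure is not specified here, so it is left as unknown.
  expectedSteps?: unknown;
}

// Example values are placeholders, not data from a real pipeline.
const emailCase: TestCaseSketch = {
  name: "Email bragging about superiority",
  inputs: {
    query: "bragging about superiority",
    sender: "sender@example.com",
    receiver: "receiver@example.com",
  },
  expectedOutputs: { value: "Dear Joker..." },
};
```

Only `name` and `inputs` are required in this sketch, mirroring the two optional items in the list.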

Schema

This section breaks down the test case schema in more detail.

Name

Simple, human-readable name for your test case.

Inputs

Inputs are the parameters to your AI pipeline, expressed as a JSON string.

Let's say you have a simple AI pipeline (a function with an OpenAI invocation) that composes an email. The function accepts a sender, receiver, and query as strings.

typescript
import { init } from "@gentrace/core";
import { OpenAI } from "@gentrace/openai";

init({
  apiKey: 'my-gentrace-api-key', // TODO: Add your Gentrace API key
});

const openai = new OpenAI({
  apiKey: 'my-open-ai-api-key', // TODO: Add your OpenAI API key
});

export const compose = async ({
  sender,
  receiver,
  query,
}: {
  sender: string;
  receiver: string;
  query: string;
}) => {
  const response = await openai.chat.completions.create({
    pipelineId: "draft",
    model: "gpt-3.5-turbo",
    temperature: 0,
    messages: [
      {
        role: "system",
        content: `Write a concise and complete email from ${sender} to ${receiver} ${query}.`,
      },
    ],
  });
  return {
    content: response.choices[0]!.message!.content,
    pipelineRunId: response.pipelineRunId!,
  };
};
 

The following JSON object could represent the inputs to this pipeline.

json
{
  "query": "bragging about superiority",
  "sender": "[email protected]",
  "receiver": "[email protected]"
}

Exact match required

Each key in the inputs must exactly match a parameter that your AI pipeline expects. The inputs must be a JSON object. Arrays and primitive types (e.g. numbers, strings, booleans) are not permitted.
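The two rules above can be checked locally before you create a test case. The following is a minimal sketch using only standard TypeScript and `JSON.parse`. The helper names `parseInputs` and `keyMismatches` are hypothetical and are not part of the Gentrace SDK.

```typescript
// Hypothetical helpers for illustration; not part of the Gentrace SDK.

// True only for plain JSON objects: rejects arrays, null, and primitives.
function isJsonObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Parses a JSON string and rejects anything that is not a JSON object.
function parseInputs(raw: string): Record<string, unknown> {
  const parsed: unknown = JSON.parse(raw);
  if (!isJsonObject(parsed)) {
    throw new TypeError("Test case inputs must be a JSON object");
  }
  return parsed;
}

// Reports keys the pipeline expects but the inputs lack, and vice versa.
function keyMismatches(
  inputs: Record<string, unknown>,
  expectedKeys: readonly string[],
): { missing: string[]; unexpected: string[] } {
  const actualKeys = Object.keys(inputs);
  return {
    missing: expectedKeys.filter((key) => !actualKeys.includes(key)),
    unexpected: actualKeys.filter((key) => !expectedKeys.includes(key)),
  };
}
```

For the `compose` function shown earlier, the expected keys would be `sender`, `receiver`, and `query`. A test case whose inputs produce an empty `missing` and an empty `unexpected` list matches the function signature exactly.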

Expected outputs (optional)

This object captures the expected, ideal outputs of your pipeline.

Referring to the code example in the prior section, the expected output would be the ideal chat completion string returned from the function. Here's an example string that could work well as the expected output for this test case.

Dear Joker,
It has come to our attention that instances of bragging about superiority with respect to the
Justice League have been made, and we want to emphasize that such behavior is not condoned
or representative of our organization's values of justice, respect, and collaboration.
Best,
Superman

The expected output must be supplied as a JSON object rather than a bare string, e.g. { "value": "Dear Joker..." }
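As a minimal sketch of that wrapping step, the snippet below places a multi-line email under the `value` key and serializes it. The helper name `toExpectedOutputs` is hypothetical. The `value` key follows the example above, and the email body is abbreviated for illustration.

```typescript
// Hypothetical helper for illustration; not part of the Gentrace SDK.
// Wraps a plain string in a JSON object under the "value" key.
function toExpectedOutputs(text: string): { value: string } {
  return { value: text };
}

// Abbreviated placeholder body; newlines are written as \n in the source string.
const idealEmail = "Dear Joker,\nSuch behavior is not condoned.\nBest,\nSuperman";

const expectedOutputs = toExpectedOutputs(idealEmail);

// JSON.stringify escapes the newlines, so the result is safe to paste as JSON.
const expectedOutputsJson = JSON.stringify(expectedOutputs);
```

Serializing with `JSON.stringify` rather than concatenating strings by hand avoids invalid JSON when the email contains quotes or line breaks.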

Small datasets

If you only need to specify a few test cases, you can create them directly from the UI by selecting "New test case".
