Test case basics
Test cases are example scenarios that your generative AI pipeline might encounter. Each test case belongs to exactly one pipeline. A test case contains:
- A unique name
- Inputs that will be passed to your AI pipeline
- Expected output (optional, depending on the evaluators that you need)
- Expected output steps (optional, also depending on the evaluators that you need)
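As a rough mental model, a test case can be pictured as a small record with these four fields. The sketch below uses a plain Python dataclass. The class name, field names, and the shape of the steps field are illustrative assumptions, not the Gentrace SDK's actual types.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class SketchTestCase:
    """Illustrative stand-in for a test case; not a Gentrace SDK type."""

    name: str                                           # unique, human-readable name
    inputs: Dict[str, Any]                              # parameters passed to the pipeline
    expected_outputs: Optional[Dict[str, Any]] = None   # optional
    expected_steps: Optional[List[Any]] = None          # optional; shape is assumed


# Example values are illustrative only.
example = SketchTestCase(
    name="Superman to Joker",
    inputs={
        "sender": "Superman",
        "receiver": "Joker",
        "query": "bragging about superiority",
    },
)
```

Both optional fields default to `None`, mirroring the fact that they are only needed by some evaluators.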
Schema
This section breaks down the test case schema in more detail.
Name
Simple, human-readable name for your test case.
Inputs
Inputs are the parameters to your AI pipeline, expressed as a JSON string.
Let's say you have a simple AI pipeline (a function with an OpenAI invocation) that composes an email. The function accepts a sender, receiver, and query as strings.
typescript
import { init } from "@gentrace/core";
import { OpenAI } from "@gentrace/openai";

init({
  apiKey: 'my-gentrace-api-key', // TODO: Add your Gentrace API key
});

const openai = new OpenAI({
  apiKey: 'my-open-ai-api-key', // TODO: Add your OpenAI API key
});

export const compose = async ({
  sender,
  receiver,
  query,
}: {
  sender: string;
  receiver: string;
  query: string;
}) => {
  const response = await openai.chat.completions.create({
    pipelineId: "draft",
    model: "gpt-3.5-turbo",
    temperature: 0,
    messages: [
      {
        role: "system",
        content: `Write a concise and complete email from ${sender} to ${receiver} ${query}.`,
      },
    ],
  });

  return {
    content: response.choices[0]!.message!.content,
    pipelineRunId: response.pipelineRunId!,
  };
};
python
import openai


def compose(sender, receiver, query):
    prompt = f"Write a concise and complete email from {sender} to {receiver} {query}."
    # Note: openai.ChatCompletion is the pre-1.0 interface of the openai package.
    chat_completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return chat_completion.choices[0].message.content
The following JSON object could represent the inputs to this pipeline.
json
{
  "sender": "Superman",
  "receiver": "Joker",
  "query": "bragging about superiority"
}
Each key in the inputs should exactly match a parameter that your AI pipeline expects. The inputs must be a JSON object. Arrays and primitive types (e.g. numbers, strings, booleans) are not permitted.
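One way to catch mismatched keys early is to check a test case's inputs against the pipeline function's signature before running it. The sketch below is a plain-Python illustration and not part of the Gentrace SDK: `validate_inputs` is a hypothetical helper, and `compose` is a stub with the same parameters as the example pipeline so that no API call is made.

```python
import inspect
import json


def compose(sender, receiver, query):
    """Stub with the same parameters as the example pipeline (no API call)."""
    return f"Email from {sender} to {receiver} {query}."


def validate_inputs(inputs_json, pipeline_fn):
    """Parse a JSON inputs string and check it against pipeline_fn's parameters.

    Hypothetical helper: raises ValueError if the inputs are not a JSON
    object or if the keys differ from the function's parameter names.
    """
    inputs = json.loads(inputs_json)
    if not isinstance(inputs, dict):
        raise ValueError("inputs must be a JSON object, not an array or primitive")

    expected = set(inspect.signature(pipeline_fn).parameters)
    missing = expected - set(inputs)
    unexpected = set(inputs) - expected
    if missing or unexpected:
        raise ValueError(
            f"missing keys: {sorted(missing)}, unexpected keys: {sorted(unexpected)}"
        )
    return inputs


# Illustrative input values; the keys mirror compose's parameters.
inputs = validate_inputs(
    '{"sender": "Superman", "receiver": "Joker", "query": "bragging about superiority"}',
    compose,
)
email = compose(**inputs)
```

Because the validated inputs are a dictionary keyed by parameter name, they can be passed straight to the pipeline with `**inputs`.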
Expected outputs (optional)
This object captures the expected, ideal outputs of your pipeline.
Referring to the code example in the prior section, the expected output would be the ideal chat completion string returned from the function. Here's an example string that could work well as the expected output for this test case.
Dear Joker,

It has come to our attention that instances of bragging about superiority with respect to the Justice League have been made, and we want to emphasize that such behavior is not condoned or representative of our organization's values of justice, respect, and collaboration.

Best,
Superman
Because expected outputs are stored as an object, this string needs to be wrapped in a JSON structure, e.g. { "value": "Dear Joker..." }
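A minimal sketch of that wrapping step, using Python's standard `json` module. The `value` key comes from the example above; the helper name `wrap_expected_output` and the shortened email text are assumptions for illustration.

```python
import json


def wrap_expected_output(text):
    """Wrap a plain expected-output string in a JSON object under "value"."""
    return json.dumps({"value": text})


# Shortened, illustrative email body with real line breaks.
expected_json = wrap_expected_output("Dear Joker,\n\nBest,\nSuperman")

# Parsing the JSON string recovers the original multi-line text.
round_tripped = json.loads(expected_json)["value"]
```

Serializing with `json.dumps` rather than concatenating strings by hand means line breaks and quotation marks in the email are escaped correctly.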
Small datasets
If you only need to specify a few test cases, you can create them directly from the UI by selecting "New test case".