Production evaluation
Gentrace evaluators, whether built in Gentrace or external, can be used in production.
Once configured, evaluators produce live grades on your production data.
Collect data in production
To collect data in production, first set up production data collection.
You can do this in two different ways:
- Basic: Use the simple versions of our OpenAI or Pinecone integrations
- Advanced: Set up full tracing
Once this is set up, data will start appearing in your live data view.
Attach evaluators
Now, attach evaluators to production.
Enable evaluators for production
To enable evaluators for production, navigate to the evaluator page, then set a sampling probability.
The evaluator will run against a random sample of your data according to this probability.
A given evaluator will only run in production up to a maximum of 100 times per day.
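The sampling probability and daily cap described above can be pictured with a small sketch. This is not Gentrace's implementation: the names `shouldEvaluate`, `SamplerState`, and `DAILY_CAP` are hypothetical, and resetting the counter at the UTC day boundary is an assumption made only for illustration.

```typescript
// Hypothetical sketch of probability sampling with a per-day cap.
// Not Gentrace's actual code; all names here are illustrative.
const DAILY_CAP = 100; // the per-evaluator daily maximum stated above

interface SamplerState {
  day: string; // UTC date (YYYY-MM-DD) the counter applies to (assumed reset boundary)
  runsToday: number; // evaluator runs already counted for that day
}

function shouldEvaluate(
  state: SamplerState,
  samplingProbability: number, // value between 0 and 1
  now: Date = new Date(),
  random: () => number = Math.random,
): boolean {
  // Start a fresh counter when the (assumed) UTC day changes.
  const today = now.toISOString().slice(0, 10);
  if (state.day !== today) {
    state.day = today;
    state.runsToday = 0;
  }
  // Once the cap is reached, nothing else runs that day.
  if (state.runsToday >= DAILY_CAP) return false;
  // Otherwise run on a random sample of the incoming data.
  if (random() >= samplingProbability) return false;
  state.runsToday += 1;
  return true;
}
```

Under these assumptions, a probability of 1 on a high-traffic pipeline would use up the daily budget on the first 100 records of the day, while a lower probability spreads the same budget across more of the day's traffic.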
Process if necessary
Depending on how you're submitting data, you may find that you need to process your data before it can be evaluated.
In your evaluator, add a processor to process your data in production.
For example, in my safety evaluator, I use this processor:
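```typescript
function process({ outputs, steps }) {
  // Development: outputs are already processed
  if (outputs.value) return { value: outputs.value };
  // Production: read the completion text from the trace
  return { value: outputs.choices[0].message.content };
}
```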
This processor uses outputs.value in development (where outputs are already processed) and outputs.choices[0].message.content from our trace in production.
For more details on processors, see Processors.
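To sanity-check a processor like the one above before attaching it to an evaluator, it can help to run it locally against both output shapes. The following is a minimal sketch under stated assumptions: the `ChatOutputs` type and the sample payloads are made up for illustration and model only the fields the processor reads, and the function is named `processOutputs` here (rather than `process`) only to avoid clashing with Node's global `process` when run outside Gentrace.

```typescript
// Assumed shapes: only the fields the processor reads are modelled.
interface ChatOutputs {
  value?: string; // present when outputs are already processed (development)
  choices?: { message: { content: string } }[]; // chat-completion shape (production)
}

// Same logic as the processor above, with optional chaining so a trace
// without choices yields undefined instead of throwing.
function processOutputs({ outputs }: { outputs: ChatOutputs; steps?: unknown[] }) {
  if (outputs.value) return { value: outputs.value };
  return { value: outputs.choices?.[0]?.message.content };
}

// Hypothetical sample payloads for each environment.
const developmentOutputs: ChatOutputs = { value: "Hello" };
const productionOutputs: ChatOutputs = {
  choices: [{ message: { content: "Hello from the trace" } }],
};

const fromDevelopment = processOutputs({ outputs: developmentOutputs });
const fromProduction = processOutputs({ outputs: productionOutputs });
```

Both calls return an object with a single `value` field, so the evaluator sees the same shape regardless of where the data came from.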