Version: 4.7.66

Production evaluation

Gentrace evaluators, whether built in Gentrace or external, can be used in production.

When successfully configured, you'll get live grades on your production data, as shown below.

Prod eval view

Collect data in production

First, set up production data collection.

You can do this in two ways:

Basic: Use the simple versions of our OpenAI or Pinecone integrations
Advanced: Set up full tracing

Once this is set up, data will start to appear in your live data view.

Live data view

Attach evaluators

Now, attach evaluators to production.

Enable evaluators for production

To enable evaluators for production, navigate to the evaluator page, then set a sampling probability.

Sampling probability

The evaluator will run against a random sample of your data according to this probability.
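As a rough illustration of what sampling by probability means, here is a minimal sketch. It is not Gentrace's implementation; `shouldEvaluate` and its parameters are hypothetical names used only to show the idea that each record is independently selected with the configured probability.

```typescript
// Hypothetical helper, not part of the Gentrace SDK.
// Decides whether a single production record is sampled for evaluation.
function shouldEvaluate(
  samplingProbability: number,
  random: () => number = Math.random,
): boolean {
  if (
    Number.isNaN(samplingProbability) ||
    samplingProbability < 0 ||
    samplingProbability > 1
  ) {
    throw new RangeError("samplingProbability must be between 0 and 1");
  }
  // random() returns a value in [0, 1), so a probability of 0 never
  // samples and a probability of 1 always samples.
  return random() < samplingProbability;
}

// With a probability of 0.25, roughly one record in four is evaluated.
const sampled = shouldEvaluate(0.25);
console.log(sampled ? "evaluate this record" : "skip this record");
```

The `random` parameter exists only so the sketch can be exercised deterministically; in normal use the default `Math.random` is enough.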

Maximum throughput

Each evaluator runs in production at most 100 times per day.
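To see how the sampling probability and the daily limit interact, the following sketch estimates how many runs to expect. The helper names and the `dailyRecords` parameter are hypothetical, and the result is only a rough estimate, since actual sampling is random.

```typescript
// Documented limit: an evaluator runs in production at most 100 times per day.
const MAX_RUNS_PER_DAY = 100;

// Hypothetical helper: rough expected number of evaluator runs per day.
function estimateDailyRuns(
  dailyRecords: number,
  samplingProbability: number,
): number {
  return Math.min(dailyRecords * samplingProbability, MAX_RUNS_PER_DAY);
}

// Hypothetical helper: the sampling probability above which the daily
// limit, rather than the probability, determines how many runs you get.
function probabilityThatReachesCap(dailyRecords: number): number {
  return Math.min(1, MAX_RUNS_PER_DAY / dailyRecords);
}

console.log(estimateDailyRuns(100, 0.5)); // 50
console.log(estimateDailyRuns(1000, 0.5)); // 100 (limited by the daily maximum)
console.log(probabilityThatReachesCap(400)); // 0.25
```

In other words, on high-traffic pipelines a large sampling probability mostly changes how early in the day the limit is reached, not how many records are graded.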

Process if necessary

Depending on how you're submitting data, you may find that you need to process your data before it can be evaluated.

In your evaluator, add a processor to process your data in production.

For example, in my safety evaluator, I use this processor:

typescript
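function process({ outputs, steps }) {
  // Development: outputs are already processed, so pass the value through.
  if (outputs.value) return { value: outputs.value };
  // Production: read the completion text from the traced response.
  return { value: outputs.choices[0].message.content };
}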

This processor will use outputs.value in development (where outputs are already processed) and outputs.choices[0].message.content from our trace in production.
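The two cases can be checked locally with a typed sketch like the one below. The `ProcessorInput` type and the sample payloads are assumptions modeled on the fields the processor reads, not a schema published by Gentrace. The function is named `processForEvaluation` here only to avoid clashing with Node's global `process`; inside an evaluator it would be the `process` function shown above.

```typescript
// Assumed input shape, inferred from the fields the processor reads.
type ProcessorInput = {
  outputs: {
    value?: string;
    choices?: { message: { content: string } }[];
  };
  steps?: unknown[];
};

// Same logic as the processor above, with a guard for missing choices.
function processForEvaluation({ outputs }: ProcessorInput): {
  value: string | undefined;
} {
  if (outputs.value) return { value: outputs.value };
  return { value: outputs.choices?.[0]?.message.content };
}

// Development-style payload: the output is already processed.
const development = processForEvaluation({
  outputs: { value: "I can't help with that." },
});

// Production-style payload: the output is nested inside the traced response.
const production = processForEvaluation({
  outputs: {
    choices: [{ message: { content: "I can't help with that." } }],
  },
});

console.log(development.value === production.value); // true
```

Either payload yields the same `{ value }` object, so the evaluator can grade development and production data with one processor.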

For more details on processors, see Processors.
