Derivations are snippets of code that run against traces and experiments to extract data, monitor for errors, and more. Derivations can optionally run an agent to analyze the trace (a variant of LLM-as-judge that we call Agent-as-judge). They show up as columns in the Gentrace UI.

[Screenshot: Derivations example]

Structure of a derivation

Language

Write derivations in Python or JavaScript.

Return type

All derivations must return a typed value. The type is specified in the dropdown at the bottom of the derivation and must match the return type of the function. Some types can be marked as “eval”; eval derivations are averaged to compute a trace’s score.

[Screenshot: Derivation return type dropdown]
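
For example, a derivation whose return type is set to boolean and marked as eval might check that the trace completed without errors. This is a minimal sketch (the full function signature is covered in the next section); the trace.spans array and its error field are assumptions about the trace shape, not a documented schema.

function evaluate({ trace }) {
  // Assumed trace shape: each span exposes an `error` field.
  // Returns a boolean, matching a "boolean" return type marked as eval.
  return trace.spans.every((span) => !span.error);
}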

Function signature and arguments

Derivations are functions that receive the following arguments (the JavaScript signature is shown below):
  • The trace
  • (If available) The source test case from the test dataset
  • All other derivations in the same view
function evaluate({
  // The trace data to analyze
  trace,
  // The source test case from the test dataset (if available)
  testCase,
  // All other derivations in the same view, spread as individual
  // camelCase properties
  ...otherDerivations
}) {

  ...

  // The return type must match the type specified in the
  // dropdown at the bottom of the derivation.
  return ...
}
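
As a concrete sketch, the derivation below checks the trace's output against the test case's expected output, gating on another derivation in the same view. The field names trace.output and testCase.expectedOutput, and the isEnglish derivation, are illustrative assumptions, not fixed names.

function evaluate({ trace, testCase, isEnglish }) {
  // `isEnglish` stands in for another (hypothetical) derivation in the
  // same view, available here as a camelCase property.
  if (!isEnglish) return false;

  // Assumed field names; adjust to match your trace and dataset shape.
  return trace.output === testCase?.expectedOutput;
}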

LLM-as-judge (Agent-as-judge)

Derivations can use an LLM to analyze traces via callAgent() (JavaScript) or call_agent() (Python). The function accepts the agent's instructions, resources to attach (such as the trace), a JSON schema for the structured output, and, optionally, images.
// A minimal structured-output call
const { count } = await callAgent({
  instructions: "How many r's in strawberry?",
  jsonSchema: {
    type: 'object',
    properties: {
      count: { type: 'number' },
    },
    required: ['count'],
  },
});

// Analyzing a user message (assumes `userMessage` was extracted
// from the trace earlier in the derivation)
const { sentiment } = await callAgent({
  instructions: 'Analyze the sentiment of: ' + userMessage,
  jsonSchema: {
    type: 'object',
    properties: {
      sentiment: {
        type: 'string',
        enum: ['positive', 'neutral', 'negative'],
      },
      confidence: {
        type: 'number',
        minimum: 0,
        maximum: 1,
      },
    },
    required: ['sentiment', 'confidence'],
  },
});

// Analyzing a trace
const { sentiment, longestMessage } = await callAgent({
  instructions:
    'Get the sentiment of the longest user message in this trace',
  resources: [{ type: 'trace' }],
  jsonSchema: {
    type: 'object',
    properties: {
      sentiment: {
        type: 'string',
        enum: ['positive', 'neutral', 'negative'],
      },
      confidence: {
        type: 'number',
        minimum: 0,
        maximum: 1,
      },
      longestMessage: { type: 'string' },
    },
    required: ['sentiment', 'confidence', 'longestMessage'],
  },
});
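
Putting the pieces together, a derivation can call the agent and return a field from its structured output. The sketch below assumes a derivation whose return type is set to string; declaring the function async is also an assumption, and the instructions and schema are illustrative.

async function evaluate({ trace }) {
  // The JSON schema constrains the structured output the agent returns.
  const { sentiment } = await callAgent({
    instructions: 'Classify the overall sentiment of this trace',
    resources: [{ type: 'trace' }],
    jsonSchema: {
      type: 'object',
      properties: {
        sentiment: {
          type: 'string',
          enum: ['positive', 'neutral', 'negative'],
        },
      },
      required: ['sentiment'],
    },
  });

  // Must match the return type selected in the dropdown ("string" here).
  return sentiment;
}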

Running derivations

Derivations run in the context of a view and are triggered in three ways:
  • Automatically by Gentrace Chat
  • Automatically on trace ingest when sampled according to the view’s auto-run settings
  • Manually, by:
    • Pressing “Run last 10” or “Run last 100” in the top bar of the view
    • Right-clicking on a column header in the table
    • Right-clicking on a row or cell in the traces table
    • Pressing “Run” with a derivation selected
[Screenshot: Manual run]

Example derivations

Use the prompts below in Gentrace Chat to analyze your traces.

Understand agent execution

  • Summarize the entire trace as a series of steps
  • Extract the user message that triggered the agent
  • Extract the final assistant message (if present)
  • Show which tools were used, and how many times

Understand user experience

  • Extract the user's name and organization (if present in the trace)
  • Rate the user's frustration level based on the trace

Measure cost and performance

  • Show the total number of LLM calls
  • Show the number of input, output, and/or total tokens across the trace
  • Show the total number of tool calls

Monitor for errors

  • Did the agent satisfy the user's request?
  • Show the number of failed tool calls
  • Show the number of failed LLM calls

Write evaluations with LLM-as-judge

  • Compare the factualness of the assistant's response to expected output
  • Show the percentage of assertions that pass in the trace
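
For instance, the first prompt above could produce a derivation along these lines. This is a sketch, not the exact code Gentrace Chat generates; testCase.expectedOutput, the async function declaration, and the boolean eval return type are assumptions.

async function evaluate({ trace, testCase }) {
  // `testCase.expectedOutput` is an assumed dataset field name.
  const { factual } = await callAgent({
    instructions:
      'Is the assistant response in this trace factually consistent ' +
      'with this expected output? ' + (testCase?.expectedOutput ?? ''),
    resources: [{ type: 'trace' }],
    jsonSchema: {
      type: 'object',
      properties: {
        factual: { type: 'boolean' },
      },
      required: ['factual'],
    },
  });

  // Boolean return marked as eval, so it contributes to the trace's score.
  return factual;
}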