Version: 4.7.14

Tracing

Gentrace tracing enriches data collection with additional context about the steps your AI pipeline went through to reach a particular output.

Traces can be collected in both experiments and production. In either case, they can be processed and then evaluated.

Setup

Traces are collected using runners, which must first be initialized with a Pipeline configuration object. This object is used to track the steps of your pipeline.

typescript
import { Pipeline } from "@gentrace/core";
 
const PIPELINE_SLUG = "compose"
 
const pipeline = new Pipeline({
  slug: PIPELINE_SLUG,
});
 
export const compose = async (...) => {
  // Runner will capture the trace
  const runner = pipeline.start();
  
  // ...
};

python
import gentrace
PIPELINE_SLUG = "compose"
pipeline = gentrace.Pipeline(
    PIPELINE_SLUG,
)
def compose(...):
    # Runner automatically captures and meters invocations to OpenAI
    runner = pipeline.start()
    
    # ...

Collection

Next, instrument your code by routing your pipeline's calls through the Gentrace pipeline runner.

This can be done in two ways:

Automatically with our plugins

Our OpenAI and Pinecone plugins can automatically trace multiple steps within your code.

As an example, here's how you could use the OpenAI plugin to track a two-step pipeline.

typescript
import { Pipeline } from "@gentrace/core";
import { initPlugin } from "@gentrace/openai";
 
const PIPELINE_SLUG = "compose"
 
const plugin = await initPlugin({
  apiKey: process.env.OPENAI_KEY,
});
 
const pipeline = new Pipeline({
  slug: PIPELINE_SLUG,
  plugins: {
    openai: plugin
  }
});
 
export const compose = async (
  sender: string,
  receiver: string,
  query: string
) => {
  // Runner automatically captures and meters invocations to OpenAI
  const runner = pipeline.start();
  
  // This is a near type-match of the official OpenAI Node.JS package handle.
  const openai = runner.openai;
  
  // FIRST STAGE (INITIAL DRAFT).
  // Since we're using the OpenAI handle provided by our runner, we capture inputs
  // and outputs automatically as a distinct step.
  const initialDraftResponse = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    temperature: 0.8,
    messages: [
      {
        role: "system",
        content: `Write an email on behalf of ${sender} to ${receiver}: ${query}`,
      },
    ],
  });
 
  const initialDraft = initialDraftResponse.choices[0]!.message!.content;
  
  // SECOND STAGE (SIMPLIFICATION)
  // We also automatically capture inputs and outputs as a step here too.
  const simplificationResponse = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    temperature: 0.8,
    messages: [
      {
        role: "system",
        content: "Simplify the following email draft for clarity: \n" + initialDraft,
      },
    ],
  });
  
  const simplification = simplificationResponse.choices[0]!.message!.content;
  
  await runner.submit();
 
  return [simplification, runner];
};

python
import os
import gentrace
PIPELINE_SLUG = "compose"
pipeline = gentrace.Pipeline(
    PIPELINE_SLUG,
    openai_config={
        # Read the key from the environment (process.env is Node.js-only)
        "api_key": os.getenv("OPENAI_KEY"),
    },
)
def compose(sender, receiver, query):
    # Runner automatically captures and meters invocations to OpenAI
    runner = pipeline.start()
    # Near type-match of the official OpenAI Python package handle.
    openai = runner.get_openai()
    # FIRST STAGE (INITIAL DRAFT).
    # Since we're using the OpenAI handle provided by our runner, we capture inputs
    # and outputs automatically as a distinct step.
    initial_draft_response = openai.chat.completions.create(
      messages=[
        {
          "role": "system",
          "content": f"Write an email on behalf of {sender} to {receiver}: {query}"
        },
      ],
      model="gpt-3.5-turbo"
    )
    
    initial_draft = initial_draft_response.choices[0].message.content
    
    # SECOND STAGE (SIMPLIFICATION)
    # We also automatically capture inputs and outputs as a step here too.
    simplification_response = openai.chat.completions.create(
      messages=[
        {
          "role": "system",
          "content": "Simplify the following email draft for clarity: \n" + initial_draft
        },
      ],
      model="gpt-3.5-turbo"
    )
    
    simplification = simplification_response.choices[0].message.content
    
    runner.submit()
    
    return [simplification, runner]

Manually with our core tracer functions

The Gentrace pipeline runner tracks OpenAI and Pinecone out of the box: through runner.openai and runner.pinecone in TypeScript, and through runner.get_openai() and runner.get_pinecone() in Python.

However, many AI pipelines are more complex than single-step LLM/vector store queries. They might involve multiple network requests to databases/external APIs to construct more complex prompts.

If you want to track network calls to databases or APIs, you can wrap your invocations with runner.measure(). Inputs and outputs are automatically captured as they are passed into measure().
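
To make the idea of wrapping a call concrete, here is a minimal sketch of what a measure-style wrapper does conceptually: it times the call and records the arguments and return value as a step. ToyRunner, Step, and get_organization_usernames are hypothetical names invented for this illustration. This is not the Gentrace implementation, and the real measure() may record different fields.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Step:
    """One recorded step: what went in, what came out, how long it took."""
    name: str
    inputs: dict
    output: Any
    elapsed_ms: float


@dataclass
class ToyRunner:
    """Hypothetical stand-in for a pipeline runner; not the Gentrace class."""
    steps: list = field(default_factory=list)

    def measure(self, func: Callable[..., Any], **kwargs: Any) -> Any:
        # Time the call and record its keyword arguments and return value.
        start = time.perf_counter()
        output = func(**kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000
        self.steps.append(Step(func.__name__, dict(kwargs), output, elapsed_ms))
        return output


def get_organization_usernames(org_id: str) -> list:
    # Placeholder for a real database query.
    return ["ada", "grace"] if org_id == "org-1" else []


runner = ToyRunner()
usernames = runner.measure(get_organization_usernames, org_id="org-1")
```

The wrapped function still returns its normal value, so the calling code does not change shape. The only difference is that the call now leaves a record behind in runner.steps.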

Measure

SDK reference for the measure() function

Checkpoint

SDK reference for the checkpoint() function

typescript
export const composeEmailForMembers = async (
  sender: string,
  organizationId: string,
  organizationName: string,
) => {
  // Runner captures and meters invocations to OpenAI
  const runner = pipeline.start();
  
  const usernames = await runner.measure(
    (organizationId) => {
      // Sends a database call to retrieve the names of users in the organization
      return getOrganizationUsernames(organizationId);
    },
    [organizationId],
  );
  
  const openai = runner.openai;
  
  const initialDraftResponse = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    temperature: 0.8,
    messages: [
      {
        role: "system",
        content: `
Write an email on behalf of ${sender} to the members of organization ${organizationName}.
These are the names of the members: ${usernames.join(", ")}
`
      },
    ],
  });
 
  const initialDraft = initialDraftResponse.choices[0]!.message!.content;
 
  await runner.submit();  
  
  return [initialDraft, runner];
}

python
def compose_email_for_members(sender, organization_id, organization_name):
    runner = pipeline.start()
    usernames = runner.measure(
        # TODO: replace with your own function that you want to measure
        get_organization_usernames,
        org_id=organization_id,
    )
    openai = runner.get_openai()
    initial_draft_response = openai.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": f"Write an email on behalf of {sender} to the members of organization {organization_name}. These are the names of the members: {', '.join(usernames)}"
            },
        ],
        model="gpt-3.5-turbo"
    )
    initial_draft = initial_draft_response.choices[0].message.content
    runner.submit()
    return [initial_draft, runner]

Submission

In production

Traces are submitted asynchronously using the runner's submit function. This function should be called after all steps have been completed.

typescript
export const compose = async (...) => {
  // Runner will capture the trace
  const runner = pipeline.start();
  
  // ...
  
  runner.submit(); // can be awaited if you want to wait for the submission to complete
};

python
def compose(...):
    # Runner automatically captures and meters invocations to OpenAI
    runner = pipeline.start()
    
    # ...
    
    runner.submit() # always asynchronous
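
As a rough illustration of why submit() belongs after the last step, the sketch below models non-blocking submission with a background thread. ToyRunner, record(), and the backend list are hypothetical and exist only for this example. How Gentrace actually delivers traces is not described here, so treat this as one possible design rather than the SDK's behavior.

```python
import threading


class ToyRunner:
    """Hypothetical runner used only to illustrate non-blocking submission."""

    def __init__(self, sink: list):
        self.steps = []
        self._sink = sink  # stands in for the tracing backend

    def record(self, name: str, output: str) -> None:
        self.steps.append({"name": name, "output": output})

    def submit(self) -> threading.Thread:
        # Snapshot the steps so later changes do not affect the payload.
        payload = list(self.steps)
        worker = threading.Thread(target=self._sink.append, args=(payload,))
        worker.start()
        return worker  # the caller may ignore this or join() it


backend = []
runner = ToyRunner(backend)
runner.record("draft", "Hello team")
runner.record("simplify", "Hi team")
handle = runner.submit()  # returns without waiting for delivery
handle.join()             # only needed when delivery must be confirmed
```

In this sketch the payload is a snapshot taken at submit time, so any step recorded afterwards would be missing from the submitted trace.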

During experimentation / test

To attach traces during testing, you must change both the generative AI pipeline and the test script.

First, modify the pipeline to return a [output(s), runner] pair.

typescript
export const compose = async (...) => {
  const runner = pipeline.start();
  
  const output = // ...
  
  runner.submit(); // you can leave this in production, but more work needs to happen for test to succeed
  
  // this will be returned to the test script
  return [output, runner]
};

python
def compose(...):
    # Runner automatically captures and meters invocations to OpenAI
    runner = pipeline.start()
    
    output = ...  # run your pipeline steps here
    
    runner.submit() # you can leave this in production, but more work needs to happen for test to succeed
    
    # this will be returned to the test script
    return [output, runner]

Then, modify the test script to run tests using a single runTest (TypeScript) or run_test() (Python) invocation, as shown below.

Note that the callback function is executed once per test case associated with the pipeline, and that a single invocation creates a single test result report.

typescript
import { init, runTest, Configuration } from "@gentrace/core";
import { compose } from "../src/compose"; // TODO: REPLACE WITH YOUR PIPELINE
 
init({ apiKey: process.env.GENTRACE_API_KEY });
 
const PIPELINE_SLUG = "your-pipeline-id";
 
async function main() {
  // The callback passed to the second parameter is invoked once
  // for every test case associated with the pipeline.
  //
  // This assumes you have previously created test cases for the specified
  // pipeline that have three parameters: sender, receiver, and query.
  await runTest(PIPELINE_SLUG, async (testCase) => {
    return compose(
      testCase.inputs.sender,
      testCase.inputs.receiver,
      testCase.inputs.query
    );
  });
}
 
main();

python
import os
import gentrace
import pipelines
gentrace.init(
    # Read the key from the environment (process.env is Node.js-only)
    api_key=os.getenv("GENTRACE_API_KEY"),
)
def test_case_callback(test_case):
    return pipelines.compose(
        test_case.get("inputs").get("sender"), 
        test_case.get("inputs").get("receiver"), 
        test_case.get("inputs").get("query")
    )
def main():
    # The callback passed to the second parameter is invoked once for every test
    # case associated with the pipeline.
    #
    # This assumes you have previously created test cases for the specified
    # pipeline that have three parameters: sender, receiver, and query.
    gentrace.run_test(pipelines.PIPELINE_SLUG, test_case_callback)
main()
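
The callback contract described above (one call per test case, each returning an [output, runner] pair, all collected into one report) can be sketched with a toy harness. toy_run_test, the dict-based runner, and the sample test cases are hypothetical and are not part of the Gentrace SDK. The real run_test() fetches test cases from Gentrace and uploads the results, which this sketch does not attempt.

```python
from typing import Callable


def toy_run_test(test_cases: list, callback: Callable[[dict], list]) -> dict:
    """Hypothetical harness: one callback call per test case, one report overall."""
    results = []
    for test_case in test_cases:
        # The callback returns the [output, runner] pair produced by the pipeline.
        output, runner = callback(test_case)
        results.append({
            "case": test_case["name"],
            "output": output,
            "steps": list(runner["steps"]),
        })
    return {"results": results}


def compose(sender: str, receiver: str, query: str) -> list:
    # Stand-in pipeline; a plain dict plays the role of the runner here.
    runner = {"steps": ["draft", "simplify"]}
    output = f"{sender} -> {receiver}: {query}"
    return [output, runner]


cases = [
    {"name": "intro", "inputs": {"sender": "Ann", "receiver": "Bo", "query": "say hi"}},
    {"name": "followup", "inputs": {"sender": "Ann", "receiver": "Cy", "query": "check in"}},
]

report = toy_run_test(
    cases,
    lambda tc: compose(
        tc["inputs"]["sender"], tc["inputs"]["receiver"], tc["inputs"]["query"]
    ),
)
```

Returning the runner alongside the output is what lets the harness pair each test case's result with the steps that produced it.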
