Monitoring (OpenAI) quickstart
This guide covers how to get started with Gentrace for monitoring OpenAI. Gentrace monitors your logged OpenAI requests and tracks speed, cost, and aggregate statistics.
Installation
Installation instructions are provided for both TypeScript and Python.
Please only use this library on the server-side. Using it on the client-side will reveal your API key.
First, install our core package.
```bash
# Execute only one, depending on your package manager
npm i @gentrace/core
yarn add @gentrace/core
pnpm i @gentrace/core
```
```bash
# Execute only one, depending on your package manager
pip install gentrace-py
poetry add gentrace-py
```
If you want to use our provider SDK handlers, you must install our associated plugin SDKs. These SDKs have a direct dependency on the officially supported SDK for their respective providers. We match the official SDK's types whenever possible.
@gentrace/openai@v4
This section requires Gentrace's official OpenAI plugin. The plugin version matches the major version of the official OpenAI Node.js SDK.
```shell
npm install @gentrace/openai@v4
```
These npm packages will only work with Node.js versions >= 16.16.0.
This PyPI package will only work with Python versions >= 3.7.1.
Usage
Usage examples are provided for both TypeScript and Python.
We designed our SDKs to mostly preserve the original interface to OpenAI's client library. You can simply insert the following lines of code before your OpenAI invocations.
```typescript
import { init } from "@gentrace/core";
import { OpenAI } from "@gentrace/openai";

// This function globally initializes Gentrace with the appropriate
// credentials. Constructors like OpenAI() will transparently use
// these credentials to authenticate with Gentrace.
init({ apiKey: process.env.GENTRACE_API_KEY });

const openai = new OpenAI({
  apiKey: process.env.OPENAI_KEY,
});
```
The `OpenAI` class is virtually identical to its equivalent in the official SDK.
You can then execute your OpenAI functions against the `openai` handle directly.
```typescript
async function createEmbedding() {
  const embeddingResponse = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: "Example input",
    // IMPORTANT: Supply a Gentrace Pipeline slug to track this invocation
    pipelineSlug: "create-test-embedding",
  });

  console.log("Pipeline run ID: ", embeddingResponse.pipelineRunId);
}

createEmbedding();
```
We designed our SDKs to mostly preserve the original interface to OpenAI's client library. You can simply insert the following lines of code before your OpenAI invocations.
```python
import os

import gentrace

gentrace.init(api_key=os.getenv("GENTRACE_API_KEY"))

openai = gentrace.OpenAI(api_key=os.getenv("OPENAI_KEY"))
```
The `gentrace.OpenAI()` constructor automatically tracks invocations to OpenAI and asynchronously forwards information to our service. Our SDK will not increase request latency to OpenAI.
Here's an example usage for creating embeddings.
```python
import os

import gentrace

gentrace.init(api_key=os.getenv("GENTRACE_API_KEY"))

openai = gentrace.OpenAI(api_key=os.getenv("OPENAI_KEY"))

# Our SDK transparently sends this information async to our servers.
result = openai.embeddings.create(
    input="sample text",
    model="text-embedding-3-small",
    # IMPORTANT: Supply a Gentrace Pipeline slug to track this invocation
    pipeline_slug="create-sample-embedding",
)

# We modify the OpenAI SDK return value to include a PipelineRun ID. This ID is
# important to tie feedback to a particular AI generation.
print("Pipeline run ID: ", result.pipelineRunId)

# Since we send information asynchronously, we provide a function that allows
# you to wait until all requests are sent. This should be used only in development.
gentrace.flush()
```
You should provide a Pipeline slug as a request parameter to any method that you want to instrument. This slug associates OpenAI invocations with that identifier on our service. If you omit the slug, we will not track telemetry for that invocation.
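For example, a call without a slug passes through to OpenAI unchanged and produces no Gentrace telemetry. A minimal sketch, reusing the setup from the example above:

```python
# Imports and initialization as shown above.
# Without a pipeline_slug, this call behaves like the plain OpenAI SDK and
# Gentrace records no telemetry for it.
untracked = openai.embeddings.create(
    input="sample text",
    model="text-embedding-3-small",
)
```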
The PipelineRun ID on the `create()` return value comes from the Gentrace SDK. We provide it so you can uniquely associate feedback with AI-generated content. If you do not provide a Pipeline slug, the `create()` functions will not return a PipelineRun ID.
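For instance, you might persist the PipelineRun ID alongside the generated output so that feedback collected later can reference it. A minimal sketch, where `save_generation` is a hypothetical persistence function in your own application:

```python
# Imports and initialization as shown above.
result = openai.embeddings.create(
    input="sample text",
    model="text-embedding-3-small",
    pipeline_slug="create-sample-embedding",
)

# Store the PipelineRun ID next to the output so user feedback can later be
# tied back to this exact generation.
save_generation({  # hypothetical helper in your own application
    "embedding": result.data[0].embedding,
    "gentrace_pipeline_run_id": result.pipelineRunId,
})
```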
Asynchronous commands
We support instrumenting the asynchronous methods of OpenAI's SDK.
```python
import asyncio
import os

import gentrace


async def main():
    gentrace.init(api_key=os.getenv("GENTRACE_API_KEY"))

    openai = gentrace.AsyncOpenAI(api_key=os.getenv("OPENAI_KEY"))

    result = await openai.embeddings.create(
        input="sample text",
        model="text-embedding-3-small",
        pipeline_slug="testing-value",
    )

    gentrace.flush()

    # We still modify the OpenAI SDK return value to include a PipelineRun ID
    print("Pipeline run ID: ", result.pipelineRunId)


asyncio.run(main())
```
Keep these notes in mind when using async functions.
- The SDK still returns a PipelineRun ID that you can use to uniquely associate feedback to a generation.
- Even if you await the asynchronous invocation, the SDK does not wait for Gentrace's telemetry request to complete (see the sketch below).
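To see the second point in practice, you can time the awaited call; the elapsed time reflects only the OpenAI round trip. A minimal sketch, assuming the same environment variables as the examples above:

```python
import asyncio
import os
import time

import gentrace

gentrace.init(api_key=os.getenv("GENTRACE_API_KEY"))

openai = gentrace.AsyncOpenAI(api_key=os.getenv("OPENAI_KEY"))


async def main():
    start = time.perf_counter()
    result = await openai.embeddings.create(
        input="sample text",
        model="text-embedding-3-small",
        pipeline_slug="testing-value",
    )
    elapsed = time.perf_counter() - start

    # `elapsed` covers only the OpenAI round trip; Gentrace telemetry is
    # forwarded asynchronously in the background after the call returns.
    print(f"OpenAI call took {elapsed:.3f}s")


asyncio.run(main())
```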
Content templates for the `openai.chat.completions.create()` interface
The other difference between the Gentrace-instrumented SDK and the official SDK is how prompts are specified for `openai.chat.completions.create()` requests.
In the official version of the SDK, you specify your chat completion input as an object array with `role` and `content` key-pairs defined.
```typescript
// ❌ Official OpenAI SDK invocation
const chatCompletionResponse = await openai.chat.completions.create({
  messages: [
    {
      role: "user",
      content: "Hello Vivek!",
    },
  ],
  model: "gpt-3.5-turbo",
});
```
In our SDK, if part of the content is dynamically generated, you should instead provide `contentTemplate` and `contentInputs` key-pairs to separate the static and dynamic information, respectively. This helps us display the generation better in our UI and track version changes internally.
We use Mustache templating with the Mustache.js library to render the final content that is sent to OpenAI.
```typescript
// ✅ Gentrace-instrumented OpenAI SDK
const chatCompletionResponse = await openai.chat.completions.create({
  messages: [
    {
      role: "user",
      contentTemplate: "Hello {{ name }}!",
      contentInputs: { name: "Vivek" },
    },
  ],
  model: "gpt-3.5-turbo",
  pipelineSlug: "testing-pipeline-id",
});
```
Note: We still allow you to specify the original `content` key-value pair in the message object if you want to incrementally migrate your invocations.
Consult OpenAI's Node.js SDK documentation to learn more about the original SDK.
Content templates for the `openai.chat.completions.create()` interface
The other difference between the Gentrace-instrumented SDK and the official SDK is how prompts are specified for `openai.chat.completions.create()` requests.
In the official version of the SDK, you specify your chat completion input as an array of dictionaries with `role` and `content` key-pairs defined.
```python
import os

from openai import OpenAI

openai = OpenAI(api_key=os.getenv("OPENAI_KEY"))

# ❌ Official OpenAI SDK invocation
result = openai.chat.completions.create(
    messages=[{"role": "user", "content": "Hello Vivek!"}],
    model="gpt-3.5-turbo",
)
```
In our SDK, if part of the content is dynamically generated, you should instead provide `contentTemplate` and `contentInputs` key-pairs to separate the static and dynamic information, respectively. This helps us display the generation better in our UI and track version changes internally.
We use Mustache templating with the Pystache library to render the final content that is sent to OpenAI.
```python
# ✅ Gentrace-instrumented OpenAI SDK
import os

import gentrace

gentrace.init(api_key=os.getenv("GENTRACE_API_KEY"))

openai = gentrace.OpenAI(api_key=os.getenv("OPENAI_KEY"))

openai.chat.completions.create(
    messages=[
        {
            "role": "user",
            "contentTemplate": "Hello {{ name }}!",
            "contentInputs": {"name": "Vivek"},
        }
    ],
    model="gpt-3.5-turbo",
    pipeline_slug="test-hello-world-templatized",
)
```
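To see the final content that gets sent to OpenAI, you can render a template yourself with Pystache. A minimal sketch (`pystache` is installable from PyPI):

```python
import pystache

# Render the contentTemplate with its contentInputs, as described above.
content = pystache.render("Hello {{ name }}!", {"name": "Vivek"})
print(content)  # Hello Vivek!
```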
Note: We still allow you to specify the original `content` key-value pair in the dictionary if you want to incrementally migrate your invocations.
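For example, during an incremental migration you might mix templated and plain messages in the same request. A sketch assuming mixed messages are accepted as described above (the slug is illustrative):

```python
# Imports and initialization as shown above.
openai.chat.completions.create(
    messages=[
        # Already migrated: static template plus dynamic inputs
        {
            "role": "user",
            "contentTemplate": "Hello {{ name }}!",
            "contentInputs": {"name": "Vivek"},
        },
        # Not yet migrated: the original plain `content` key still works
        {"role": "user", "content": "How are you?"},
    ],
    model="gpt-3.5-turbo",
    pipeline_slug="test-incremental-migration",  # illustrative slug
)
```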
Consult OpenAI's Python SDK documentation to learn more about the original SDK.
Streaming
We transparently wrap OpenAI's Node streaming functionality.
```typescript
// Imports and initialization truncated
async function main() {
  const streamChat = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Say this is a test' }],
    stream: true,
  });

  // This PipelineRun ID actually hasn't been created on the server yet.
  // It's created asynchronously after the final stream event is processed.
  console.log("Pipeline run ID: ", streamChat.pipelineRunId);

  for await (const part of streamChat) {
    console.log(part.choices[0]?.delta?.content || '');
  }

  // Stream data is coalesced and sent to our server.
}

main();
```
We support data streaming of OpenAI's API for both synchronous and asynchronous SDK methods. Our SDK records the start time immediately before the invocation and the end time immediately after the final stream event for that invocation is received.
Synchronous
```python
import os

import gentrace

gentrace.init(api_key=os.getenv("GENTRACE_API_KEY"))

openai = gentrace.OpenAI(api_key=os.getenv("OPENAI_KEY"))

result = openai.chat.completions.create(
    pipeline_slug="testing-chat-completion-value",
    messages=[{"role": "user", "content": "Hello!"}],
    model="gpt-3.5-turbo",
    stream=True,
)

pipeline_run_id = None
for value in result:
    if hasattr(value, "pipelineRunId"):
        pipeline_run_id = value.pipelineRunId

print("Result: ", pipeline_run_id)

gentrace.flush()
```
Asynchronous
```python
import asyncio
import os

import gentrace

gentrace.init(api_key=os.getenv("GENTRACE_API_KEY"))

openai = gentrace.AsyncOpenAI(api_key=os.getenv("OPENAI_KEY"))


async def main():
    result = await openai.chat.completions.create(
        pipeline_slug="testing-chat-completion-value",
        messages=[{"role": "user", "content": "Hello!"}],
        model="gpt-3.5-turbo",
        stream=True,
    )

    pipeline_run_id = None

    # 👀 The async iteration is key here!
    async for value in result:
        if hasattr(value, "pipelineRunId"):
            pipeline_run_id = value.pipelineRunId

    gentrace.flush()

    print("Result: ", pipeline_run_id)


asyncio.run(main())
```
Keep in mind that the PipelineRun ID is included in the payload for every event that's returned from the server.
We measure the total time a generation takes from the first byte received while iterating the stream to the last event the stream yields.
Before sending the streamed events to Gentrace, we coalesce them into a single payload to improve readability.
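You can mirror that coalescing on the client by concatenating the streamed deltas yourself. A minimal sketch based on the synchronous streaming example above:

```python
# Imports and initialization as shown above.
result = openai.chat.completions.create(
    pipeline_slug="testing-chat-completion-value",
    messages=[{"role": "user", "content": "Hello!"}],
    model="gpt-3.5-turbo",
    stream=True,
)

full_text = ""
for chunk in result:
    # Some events may carry no content delta; skip them.
    if getattr(chunk, "choices", None) and chunk.choices[0].delta.content:
        full_text += chunk.choices[0].delta.content

# `full_text` approximates the single coalesced payload sent to Gentrace.
print(full_text)
```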