Version: 4.7.8

Redaction

End users may not want their generative AI data shared until they consent. The Gentrace SDK provides a redaction feature that removes sensitive fields from a pipeline run before it is submitted to Gentrace.

Node.js only

The redaction feature is only available in the Node.js SDK. We will add support for our Python SDK shortly.

Serialization

Our pipeline runner class provides a toJson() method that serializes the data to a JSON string.

typescript
import { init, Pipeline, PipelineRun } from "@gentrace/core";
import { initPlugin } from "@gentrace/openai";

init({
  apiKey: process.env.GENTRACE_API_KEY ?? "",
});

const plugin = await initPlugin({
  apiKey: process.env.OPENAI_KEY ?? "",
});

const pipeline = new Pipeline({
  slug: "chat-message",
  plugins: {
    openai: plugin,
  },
});

const runner = pipeline.start();

const completion = await runner.openai.chat.completions.create({
  model: "gpt-4-turbo-preview",
  messages: [
    {
      role: "user",
      content: "Convert this sentence to JSON: John is 10 years old.",
    },
  ],
});

// Serialize the run instead of submitting it right away
const json = runner.toJson();

// Store to disk (e.g. localStorage, indexedDB, etc.) until the user consents.
// The "user-data" key matches the key read back in the next example.
localStorage.setItem("user-data", json);

Once your end user consents, you can submit the data to Gentrace.

typescript
import { init, Pipeline, PipelineRun } from "@gentrace/core";

init({
  apiKey: process.env.GENTRACE_API_KEY ?? "",
});

// Load the JSON object from disk (e.g. localStorage, indexedDB, etc.)
const json = localStorage.getItem("user-data") as string;

const response = await PipelineRun.submitFromJson(json);
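
The store-then-submit flow above can be sketched as a small in-memory gate. This is an illustration only: `ConsentGate`, `hold`, and `release` are hypothetical names that are not part of the Gentrace SDK, and a real application would persist the strings rather than keep them in memory.

```typescript
// Hypothetical helper (not part of the Gentrace SDK): keeps serialized runs
// back until the user consents.
class ConsentGate {
  private pending: string[] = [];

  // Keep a serialized run, i.e. the string returned by runner.toJson().
  hold(json: string): void {
    this.pending.push(json);
  }

  // Number of runs still waiting for consent.
  get size(): number {
    return this.pending.length;
  }

  // Call once the user consents: returns every held run and empties the gate.
  release(): string[] {
    const released = this.pending;
    this.pending = [];
    return released;
  }
}
```

After consent, each released string could then be passed to `PipelineRun.submitFromJson()`, for example in a loop over `gate.release()`.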

Fine-grained redaction

If you want to submit only a subset of the data, pass the selectFields parameter at submission time to specify a field whitelist. Only the whitelisted fields are sent.

The SDK expects Lodash object path selectors for the whitelisted fields.

typescript
import { init, Pipeline, PipelineRun } from "@gentrace/core";

init({
  apiKey: process.env.GENTRACE_API_KEY ?? "",
});

// Load the JSON object from disk (e.g. localStorage, indexedDB, etc.)
const json = localStorage.getItem("user-data") as string;

const redactedObject = await PipelineRun.submitFromJson(json, {
  selectFields: {
    inputs: [["messages", "0", "role"]],
    outputs: ["messages[0].content"],
    modelParams: true,
  },
});

Lodash selectors can take two forms:

  • A string that represents the path to the field, e.g. messages[0].content
  • An array of strings (string[]) whose elements are the path segments, e.g. ["messages", "0", "role"]
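
To make the two selector forms concrete, here is a minimal sketch of how a whitelist could be applied to an object. It is not the SDK's implementation: `toSegments`, `getAt`, and `pickFields` are hypothetical helpers written for illustration, and they handle only simple dot and bracket paths. Lodash's own `_.toPath`, `_.get`, and `_.pick` cover more cases.

```typescript
type Selector = string | Array<string | number>;

// Normalize either selector form into a list of path segments.
// Assumption: only simple paths such as "messages[0].content" are supported.
function toSegments(selector: Selector): string[] {
  if (Array.isArray(selector)) return selector.map(String);
  return selector.match(/[^.[\]]+/g) ?? [];
}

// Read the value at a path, or undefined if any segment is missing.
function getAt(data: unknown, segments: string[]): unknown {
  let current: unknown = data;
  for (const key of segments) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Build a copy of `data` that contains only the whitelisted fields.
function pickFields(
  data: Record<string, unknown>,
  selectors: Selector[],
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const selector of selectors) {
    const segments = toSegments(selector);
    const value = getAt(data, segments);
    if (value === undefined) continue; // nothing to copy for this selector
    let target: Record<string, unknown> = result;
    for (let i = 0; i < segments.length - 1; i++) {
      const key = segments[i] ?? "";
      // Create an array when the next segment is a numeric index.
      const nextIsIndex = /^\d+$/.test(segments[i + 1] ?? "");
      if (target[key] === undefined) target[key] = nextIsIndex ? [] : {};
      target = target[key] as Record<string, unknown>;
    }
    target[segments[segments.length - 1] ?? ""] = value;
  }
  return result;
}
```

With this sketch, `"messages[0].role"` and `["messages", "0", "role"]` resolve to the same segments, so either form whitelists the same field.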

Multi-step redaction

If your run has multiple steps, you can provide a function to specify the redaction for each step.

typescript
import { init, Pipeline, PipelineRun } from "@gentrace/core";

init({
  apiKey: process.env.GENTRACE_API_KEY ?? "",
});

// Load the JSON object from disk (e.g. localStorage, indexedDB, etc.)
const json = localStorage.getItem("user-data") as string;

const redactedObject = await PipelineRun.submitFromJson(json, {
  selectFields: (steps) => {
    return steps.map((step, index) => ({
      // Only show the inputs for the first step
      inputs: index === 0 ? ["messages[0].role", "messages[0].content"] : false,
      outputs: false,
      modelParams: true,
    }));
  },
});

Redaction array length must match

The redaction array should contain one entry per step in the pipeline run. If you provide fewer entries than there are steps, the last entry in the array is applied to the remaining steps.
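
The fallback described above can be pictured with a short sketch. It only illustrates the documented rule and is not the SDK's code: `padSelections` and the `StepSelection` type are hypothetical names, and the handling of an empty or overlong array is an assumption that the source does not specify.

```typescript
type FieldSelection = boolean | Array<string | string[]>;

interface StepSelection {
  inputs: FieldSelection;
  outputs: FieldSelection;
  modelParams: FieldSelection;
}

// Extend a shorter selection array so every step has an entry,
// reusing the last provided entry for the remaining steps.
function padSelections(
  selections: StepSelection[],
  stepCount: number,
): StepSelection[] {
  const last = selections[selections.length - 1];
  // Assumption: an empty array is treated as an error in this sketch.
  if (last === undefined) {
    throw new Error("At least one selection is required");
  }
  // Assumption: extra entries beyond the step count are ignored.
  const padded = selections.slice(0, stepCount);
  while (padded.length < stepCount) {
    padded.push(last);
  }
  return padded;
}
```

For a four-step run with two entries, the second entry would apply to steps two, three, and four.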

Dynamic selector length

A single Lodash selector cannot match a variable number of array elements. If you want to whitelist fields across an array whose length is not known in advance, construct the selector array dynamically.

typescript
import { init, Pipeline, PipelineRun } from "@gentrace/core";

init({
  apiKey: process.env.GENTRACE_API_KEY ?? "",
});

// Load the JSON object from disk (e.g. localStorage, indexedDB, etc.)
const json = localStorage.getItem("user-data") as string;

const redactedObject = await PipelineRun.submitFromJson(json, {
  selectFields: (steps) => {
    return steps.map((step) => ({
      // Build one role selector and one content selector per input message
      inputs: step.inputs.messages.flatMap(
        (_message: Record<string, any>, messageIndex: number) => [
          ["messages", String(messageIndex), "role"],
          ["messages", String(messageIndex), "content"],
        ],
      ),
      outputs: false,
      modelParams: true,
    }));
  },
});

Submit without serialization

If you want to submit data without serializing it first, you can pass the selectFields parameter directly to the runner's submit() method.

typescript
// `pipeline` is the Pipeline instance created in the Serialization example
const runner = pipeline.start();

const completion = await runner.openai.chat.completions.create({
  model: "gpt-4-turbo-preview",
  messages: [
    {
      role: "user",
      content: "Convert this sentence to JSON: John is 10 years old.",
    },
  ],
});

// Submit immediately, sending only the whitelisted fields
await runner.submit({
  selectFields: {
    inputs: [["messages", "0", "role"]],
    outputs: ["messages[0].content"],
    modelParams: true,
  },
});

Resubmission with more data

If you want to submit limited data (e.g. request metadata) before the user consents and then submit the full payload after consent, you can invoke submit() or submitFromJson() multiple times for the same run, passing a different selectFields value each time.

typescript
import { init, Pipeline, PipelineRun } from "@gentrace/core";

init({
  apiKey: process.env.GENTRACE_API_KEY ?? "",
});

// Load the JSON object from disk (e.g. localStorage, indexedDB, etc.)
const json = localStorage.getItem("user-data") as string;

const redactedObject = await PipelineRun.submitFromJson(json, {
  selectFields: {
    inputs: false,
    outputs: false,
    modelParams: true,
  },
});

// ⌛ time passes + ✅ user consents. The json variable must have the same `id` field.

const fullRedactedObject = await PipelineRun.submitFromJson(json, {
  selectFields: {
    inputs: true,
    outputs: true,
    modelParams: true,
  },
});

Resubmission deletes the prior record

When you resubmit data for a run, the previously submitted record is deleted from Gentrace and replaced by the new submission.
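
The two-stage submission can be expressed as a pair of small helpers. This is a sketch under stated assumptions: `selectFieldsFor` and `runId` are hypothetical names, and the code assumes the serialized run exposes its `id` as a top-level string field, which the source implies but does not spell out.

```typescript
interface StageSelection {
  inputs: boolean;
  outputs: boolean;
  modelParams: boolean;
}

// Before consent only model parameters are shared; after consent everything is.
function selectFieldsFor(consented: boolean): StageSelection {
  return { inputs: consented, outputs: consented, modelParams: true };
}

// Read the `id` of a serialized run.
// Assumption: the id is a top-level string field of the JSON.
function runId(json: string): string | undefined {
  const parsed: unknown = JSON.parse(json);
  if (parsed !== null && typeof parsed === "object" && "id" in parsed) {
    const id = (parsed as { id: unknown }).id;
    return typeof id === "string" ? id : undefined;
  }
  return undefined;
}
```

Comparing `runId()` of the JSON used for each submission is one way to check that a resubmission targets the same run before calling `PipelineRun.submitFromJson(json, { selectFields: selectFieldsFor(true) })`.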