Redaction
End users may not want their generative AI data shared until they consent. The Gentrace SDK provides a redaction feature that removes sensitive fields from a run before it is shared.
The redaction feature is only available in the Node.js SDK. We will add support for our Python SDK shortly.
Serialization
Our pipeline runner class provides a toJson() method that serializes the data to a JSON string.
typescript
import { init, Pipeline, PipelineRun } from "@gentrace/core";
import { initPlugin } from "@gentrace/openai";

init({
  apiKey: process.env.GENTRACE_API_KEY ?? "",
});

const plugin = await initPlugin({
  apiKey: process.env.OPENAI_KEY ?? "",
});

const pipeline = new Pipeline({
  slug: "chat-message",
  plugins: {
    openai: plugin,
  },
});

const runner = pipeline.start();

const completion = await runner.openai.chat.completions.create({
  model: "gpt-4-turbo-preview",
  messages: [
    {
      role: "user",
      content: "Convert this sentence to JSON: John is 10 years old.",
    },
  ],
});

const json = runner.toJson();

// Store to disk (e.g. localStorage, indexedDB, etc.)
Once your end user consents, you can submit the data to Gentrace.
typescript
import { init, Pipeline, PipelineRun } from "@gentrace/core";

init({
  apiKey: process.env.GENTRACE_API_KEY ?? "",
});

// Load the JSON object from disk (e.g. localStorage, indexedDB, etc.)
const json = localStorage.getItem("user-data") as string;

const response = await PipelineRun.submitFromJson(json);
Fine-grained redaction
If you want to submit a subset of the data, you can use the selectFields parameter upon submission to specify a field whitelist.
Our SDK expects that users will provide Lodash object selectors to specify their whitelisted fields.
typescript
import { init, Pipeline, PipelineRun } from "@gentrace/core";

init({
  apiKey: process.env.GENTRACE_API_KEY ?? "",
});

// Load the JSON object from disk (e.g. localStorage, indexedDB, etc.)
const json = localStorage.getItem("user-data") as string;

const redactedObject = await PipelineRun.submitFromJson(json, {
  selectFields: {
    inputs: [["messages", "0", "role"]],
    outputs: ["messages[0].content"],
    modelParams: true,
  },
});
Lodash selectors can take two forms.
- The first form is a string that represents the path to the field: messages[0].content
- The second form is an array of strings (string[]) that represents the path to the field: ["messages", "0", "role"]
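To make the equivalence of the two forms concrete, here is a minimal sketch that resolves both against the same object. The helpers toSegments and getAtPath are hypothetical names written for this illustration. They are a simplified stand-in, not Lodash's or the SDK's implementation, and they only handle plain keys and bracketed indices.

```typescript
// Illustrative only: a simplified stand-in for Lodash-style path handling.
type Selector = string | string[];

// Convert either selector form into a list of path segments.
function toSegments(selector: Selector): string[] {
  if (Array.isArray(selector)) {
    return selector;
  }
  // "messages[0].content" -> ["messages", "0", "content"]
  return selector
    .replace(/\[(\w+)\]/g, ".$1")
    .split(".")
    .filter((segment) => segment.length > 0);
}

// Walk the object along the path; returns undefined if any segment is missing.
function getAtPath(source: unknown, selector: Selector): unknown {
  let current: unknown = source;
  for (const segment of toSegments(selector)) {
    if (current === null || typeof current !== "object") {
      return undefined;
    }
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

const sampleInputs = {
  messages: [
    { role: "user", content: "Convert this sentence to JSON: John is 10 years old." },
  ],
};

// Both forms point at the same field.
const viaString = getAtPath(sampleInputs, "messages[0].role");
const viaArray = getAtPath(sampleInputs, ["messages", "0", "role"]);
console.log(viaString, viaArray); // user user
```

The array form avoids string parsing entirely, which can be convenient when path segments are computed at runtime.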
Multi-step redaction
If your run has multiple steps, you can provide a function to specify the redaction for each step.
typescript
import { init, Pipeline, PipelineRun } from "@gentrace/core";

init({
  apiKey: process.env.GENTRACE_API_KEY ?? "",
});

// Load the JSON object from disk (e.g. localStorage, indexedDB, etc.)
const json = localStorage.getItem("user-data") as string;

const redactedObject = await PipelineRun.submitFromJson(json, {
  selectFields: (steps) => {
    return steps.map((step, index) => ({
      // Only show the inputs for the first step
      inputs: index === 0 ? ["messages[0].role", "messages[0].content"] : false,
      outputs: false,
      modelParams: true,
    }));
  },
});
The redaction array should contain one redaction object per step in the pipeline run. If you provide fewer redaction objects than there are steps, we will copy the last element in the array to the remaining steps.
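The fallback for a short redaction array can be pictured with the sketch below. The StepRedaction type and the fillRedactions helper are hypothetical, modeled on the examples in this section. They are not part of the SDK and only mirror the behavior described above.

```typescript
// Hypothetical shape of one step's redaction rule, modeled on the examples above.
type FieldRule = boolean | Array<string | string[]>;

interface StepRedaction {
  inputs: FieldRule;
  outputs: FieldRule;
  modelParams: FieldRule;
}

// Hypothetical helper (not an SDK function): when there are fewer rules than
// steps, repeat the last rule for every remaining step.
function fillRedactions(rules: StepRedaction[], stepCount: number): StepRedaction[] {
  if (rules.length === 0) {
    throw new Error("At least one redaction rule is required");
  }
  const filled: StepRedaction[] = [];
  for (let i = 0; i < stepCount; i++) {
    filled.push(rules[Math.min(i, rules.length - 1)]);
  }
  return filled;
}

// Two rules for a run with three steps: the third step reuses the second rule.
const shortRules: StepRedaction[] = [
  { inputs: ["messages[0].role"], outputs: false, modelParams: true },
  { inputs: false, outputs: false, modelParams: true },
];

const filledRules = fillRedactions(shortRules, 3);
console.log(filledRules.length); // 3
console.log(filledRules[2] === shortRules[1]); // true
```

Returning exactly one object per step from your selectFields function avoids relying on this fallback.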
Dynamic selector length
Lodash selectors do not support dynamic lengths. If you want to redact a dynamic number of fields, you can construct the selector array dynamically.
typescript
import { init, Pipeline, PipelineRun } from "@gentrace/core";

init({
  apiKey: process.env.GENTRACE_API_KEY ?? "",
});

// Load the JSON object from disk (e.g. localStorage, indexedDB, etc.)
const json = localStorage.getItem("user-data") as string;

const redactedObject = await PipelineRun.submitFromJson(json, {
  selectFields: (steps) => {
    return steps.map((step) => {
      return {
        // Build one array-form selector per field of every message
        inputs: step.inputs.messages.flatMap(
          (_message: Record<string, any>, messageIndex: number) => [
            ["messages", String(messageIndex), "role"],
            ["messages", String(messageIndex), "content"],
          ]
        ),
        outputs: false,
        modelParams: true,
      };
    });
  },
});
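To see what a dynamically constructed selector list looks like, the sketch below builds array-form selectors for a given number of messages without calling the SDK. The messageSelectors helper is a hypothetical name used only for illustration. It assumes selectors take the string[] form described earlier.

```typescript
type PathSelector = string[];

// Hypothetical helper: one array-form selector per (message, field) pair.
function messageSelectors(messageCount: number, fields: string[]): PathSelector[] {
  const built: PathSelector[] = [];
  for (let index = 0; index < messageCount; index++) {
    for (const field of fields) {
      built.push(["messages", String(index), field]);
    }
  }
  return built;
}

const builtSelectors = messageSelectors(2, ["role", "content"]);
console.log(JSON.stringify(builtSelectors));
// [["messages","0","role"],["messages","0","content"],["messages","1","role"],["messages","1","content"]]
```

Logging the generated list like this is a cheap way to check which fields would be whitelisted before anything is submitted.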
Submit without serialization
If you want to submit data without serializing it, you can pass the selectFields parameter into the submit method.
typescript
const runner = pipeline.start();

const completion = await runner.openai.chat.completions.create({
  model: "gpt-4-turbo-preview",
  messages: [
    {
      role: "user",
      content: "Convert this sentence to JSON: John is 10 years old.",
    },
  ],
});

await runner.submit({
  selectFields: {
    inputs: [["messages", "0", "role"]],
    outputs: ["messages[0].content"],
    modelParams: true,
  },
});
Resubmission with more data
If you want to submit limited data (e.g. request metadata) before the user consents and then submit the full payload after consent, you can invoke submit() or submitFromJson() multiple times for the same run.
typescript
import { init, Pipeline, PipelineRun } from "@gentrace/core";

init({
  apiKey: process.env.GENTRACE_API_KEY ?? "",
});

// Load the JSON object from disk (e.g. localStorage, indexedDB, etc.)
const json = localStorage.getItem("user-data") as string;

const redactedObject = await PipelineRun.submitFromJson(json, {
  selectFields: {
    inputs: false,
    outputs: false,
    modelParams: true,
  },
});

// ⌛ time passes + ✅ user consents. The json variable must have the same `id` field.
const fullRedactedObject = await PipelineRun.submitFromJson(json, {
  selectFields: {
    inputs: true,
    outputs: true,
    modelParams: true,
  },
});
When you resubmit data, the prior record is deleted from Gentrace.