Trajectory SDK supports pluggable PII redaction through transforms. Transforms run after trajectory building, so your raw data never touches disk unredacted.

Using a Transform

Pass transforms to tj.init():
import trajectory_sdk as tj  # assumed import alias; this page does not show it
from trajectory_sdk.transforms.pii_transform import RegexPiiTransform
from trajectory_sdk.primitives.primitives import PiiPolicy

redactor = RegexPiiTransform(PiiPolicy(name="default", rules=["EMAIL", "PHONE"]))

tj.init(
    provider="langsmith",
    project_id="your-project-id",
    transforms=[redactor],
)

# All imports will have PII redacted automatically
trajectories = tj.import_conversations(bulk=True)
Transforms are applied to both standard and bulk imports.

Writing a Custom Transform

Custom PII transforms extend BasePiiTransform and implement two methods:
  • transform(trajectory) — return a new trajectory with PII removed
  • preview(trajectories) — dry-run that reports what would be redacted
import re
from dataclasses import replace
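from trajectory_sdk.transforms.pii_transform import BasePiiTransform
# RedactionPreview (used in preview() below) must also be imported from the SDK;
# its module path is not shown on this page.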

class EmailTransform(BasePiiTransform):
    # [A-Za-z] rather than [A-Z|a-z]: inside a character class, "|" is a literal pipe
    pattern = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')

    def transform(self, trajectory):
        new_steps = []
        for step in trajectory.steps:
            new_msgs = [
                replace(msg, content=self.pattern.sub("[REDACTED]", msg.content or ""))
                for msg in step.messages
            ]
            new_steps.append(replace(step, messages=new_msgs))
        return replace(trajectory, steps=new_steps)

    def preview(self, trajectories):
        count = sum(
            len(self.pattern.findall(msg.content or ""))
            for t in trajectories for s in t.steps for msg in s.messages
        )
        return RedactionPreview(
            total_rule_counts={"EMAIL": count},
            samples=[],
        )
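The key property of transform() above is that it returns a new trajectory rather than mutating its input. The following is a minimal, self-contained sketch of that pattern. The Message, Step, and Trajectory classes are hypothetical stand-ins, not the SDK's real types, which may carry more fields or use different names.

```python
import re
from dataclasses import dataclass, replace
from typing import Optional, Tuple


# Hypothetical stand-ins for the SDK's trajectory types (assumption).
@dataclass(frozen=True)
class Message:
    content: Optional[str]


@dataclass(frozen=True)
class Step:
    messages: Tuple[Message, ...]


@dataclass(frozen=True)
class Trajectory:
    steps: Tuple[Step, ...]


EMAIL = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


def redact_emails(trajectory: Trajectory) -> Trajectory:
    """Return a redacted copy; the input trajectory is left unchanged."""
    new_steps = tuple(
        replace(
            step,
            messages=tuple(
                # A missing (None) content is normalized to "" here,
                # mirroring the `msg.content or ""` idiom used above.
                replace(msg, content=EMAIL.sub("[REDACTED]", msg.content or ""))
                for msg in step.messages
            ),
        )
        for step in trajectory.steps
    )
    return replace(trajectory, steps=new_steps)


original = Trajectory(
    steps=(Step(messages=(Message("Mail me at ana@example.com"), Message(None))),)
)
redacted = redact_emails(original)

print(redacted.steps[0].messages[0].content)  # redacted copy
print(original.steps[0].messages[0].content)  # original is untouched
```

Frozen dataclasses are used in the sketch so that an accidental in-place edit raises an error instead of silently changing the source trajectory.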

Preview Before Redacting

Use preview() to check what a transform would catch without modifying any data:
redactor = EmailTransform()
report = redactor.preview(trajectories)
print(report.total_rule_counts)  # e.g. {'EMAIL': 42}
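A dry run only needs to count matches and, optionally, keep a few of them for inspection. The sketch below shows one way to do that over plain strings. PreviewReport is a hypothetical stand-in for RedactionPreview: it reuses the two field names shown on this page, but treating samples as a list of matched strings is an assumption, as are the preview_emails helper and its max_samples parameter.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class PreviewReport:
    """Hypothetical stand-in for the SDK's RedactionPreview (assumption)."""
    total_rule_counts: Dict[str, int] = field(default_factory=dict)
    samples: List[str] = field(default_factory=list)


EMAIL = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


def preview_emails(texts: Iterable[Optional[str]], max_samples: int = 3) -> PreviewReport:
    """Count email matches without modifying any input text."""
    report = PreviewReport(total_rule_counts={"EMAIL": 0})
    for text in texts:
        matches = EMAIL.findall(text or "")
        report.total_rule_counts["EMAIL"] += len(matches)
        # Keep only the first few matches as samples.
        for match in matches:
            if len(report.samples) < max_samples:
                report.samples.append(match)
    return report


texts = ["Contact ana@example.com or bob@example.org", "no address here", None]
report = preview_emails(texts)
print(report.total_rule_counts)
print(report.samples)
```

Because the samples in this sketch hold the raw matched values, a report built this way is itself sensitive and should not be logged or stored alongside redacted data.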

Built-in Rules

RegexPiiTransform supports the following PiiRule values:
Rule            What it matches
"EMAIL"         Email addresses
"PHONE"         Phone numbers
"SSN"           Social Security Numbers
"CREDIT_CARD"   Credit card numbers
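To make the idea of named rules concrete, here is a rough sketch of how a rule name could map to a regular expression and how a selected subset of rules could be applied to text. The patterns are illustrative only (US-style formats, no checksum validation) and are not the expressions RegexPiiTransform actually uses; the PATTERNS table and redact helper are hypothetical names.

```python
import re
from typing import Dict, List

# Illustrative patterns only (assumption); the SDK's real expressions
# are not shown on this page and are likely more thorough.
PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CREDIT_CARD": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
}


def redact(text: str, rules: List[str]) -> str:
    """Apply only the selected rules, one after another."""
    for rule in rules:
        text = PATTERNS[rule].sub("[REDACTED]", text)
    return text


sample = "Reach ana@example.com or 555-867-5309; SSN 123-45-6789."
print(redact(sample, ["EMAIL", "PHONE"]))
print(redact(sample, ["SSN"]))
```

Only the rules named in the list are applied, which matches the way the policy in the first example enables "EMAIL" and "PHONE" but leaves other kinds of data alone.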