Skip to main content
Trajectory SDK supports pluggable PII redaction. Redactors run after trajectory building but before staging or saving — your raw data never touches disk unredacted.

Using a Redactor

Pass a redactor to import_conversations():
trajectories = tj.import_conversations(
    conversations,
    redactor=EmailRedactor(),
)
This works with both standard and bulk imports:
trajectories = tj.import_conversations(
    bulk=True,
    source="./export.parquet",
    redactor=EmailRedactor(),
)

Writing a Custom Redactor

Redactors extend BasePiiRedactor and implement two methods:
  • redact(trajectory) — return a new trajectory with PII removed
  • preview(trajectories) — dry-run that reports what would be redacted
import re
from dataclasses import replace
from trajectory_sdk.storage.pii import BasePiiRedactor

class EmailRedactor(BasePiiRedactor):
    pattern = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b')

    def redact(self, trajectory):
        new_steps = []
        for step in trajectory.steps:
            new_msgs = [
                replace(msg, content=self.pattern.sub("[REDACTED]", msg.content or ""))
                for msg in step.messages
            ]
            new_steps.append(replace(step, messages=new_msgs))
        return replace(trajectory, steps=new_steps)

    def preview(self, trajectories):
        count = sum(
            len(self.pattern.findall(msg.content or ""))
            for t in trajectories for s in t.steps for msg in s.messages
        )
        return {"emails_found": count}

Preview Before Redacting

Use preview() to check what a redactor would catch without modifying any data:
redactor = EmailRedactor()
report = redactor.preview(trajectories)
print(report)  # {"emails_found": 42}