PII & secret redaction for AI apps

Stop leaking secrets into your AI traces.

A small, dependency-light Python library that redacts PII and API keys from prompts, agent traces and tool-call arguments — deterministically, in-process, before they reach Langfuse, Datadog or your own logs.

$ pip install traceredact

PyPI v0.2.1 · Python 3.11–3.13 · CI passing · License Apache-2.0 · 3 dependencies

agent_trace.json

{
  "user": { "email": "[email protected]" → [REDACTED:pii], "plan": "pro" },
  "tool_call": {
    "name": "charge_card",
    "args": { "card": "4111 1111 1111 1111" → [REDACTED:pii] }
  },
  "config": { "openai_key": "sk-aB3kZ9qRsuVwXy012345" → [REDACTED:secret] },
  "note": "shipped to your logger ⟶ redacted first"
}

Why traceredact

Built for the way AI apps actually leak.

Field-name denylists miss the real cases. traceredact reads the values.

🔍

Content-based, not key-name

Catches a live sk-… key or a card number even under an innocuous JSON key, or buried in free-form prompt text — not just fields called password.
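To make the difference from a field-name denylist concrete, here is a minimal sketch of value-based matching. The two patterns and the `redact_text` helper are illustrative assumptions, not traceredact's actual rule set or API.

```python
import re

# Illustrative patterns only (assumptions, far simpler than real rules).
PATTERNS = {
    "secret": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),
    "pii": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact_text(text: str) -> str:
    """Replace every pattern match in a string with a [REDACTED:<kind>] tag."""
    for kind, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{kind}]", text)
    return text


# The key name "note" gives no hint that the value holds a secret.
record = {"note": "retry with sk-aB3kZ9qRsuVwXy012345 please"}
redacted = {key: redact_text(value) for key, value in record.items()}
```

A denylist keyed on names such as password would pass this record through untouched, because only the value reveals the key.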

🧩

Tool-call arguments by path

Walks nested dicts, lists, pydantic models and dataclasses, redacting string leaves wherever they sit — and reports the exact json_path.

⛓️

CI-gateable & drop-in

traceredact scan exits non-zero on findings. Wrap your OpenAI / Anthropic client or LangChain handler to redact in-flight. Streaming supported.
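For your own logs, one drop-in point is a standard-library `logging.Filter`. The sketch below is a generic pattern, not traceredact's integration: `RedactingFilter` and `stand_in_redactor` are hypothetical names, and the redactor is any callable that maps a string to a redacted string.

```python
import logging
from typing import Callable


class RedactingFilter(logging.Filter):
    """Pass each record's message through a redactor before handlers see it."""

    def __init__(self, redactor: Callable[[str], str]) -> None:
        super().__init__()
        self._redactor = redactor

    def filter(self, record: logging.LogRecord) -> bool:
        # Format first, so values passed as %-style args are redacted too.
        record.msg = self._redactor(record.getMessage())
        record.args = None
        return True  # keep the record, now redacted


def stand_in_redactor(text: str) -> str:
    """Placeholder for a real redactor; masks one hard-coded demo key."""
    return text.replace("sk-aB3kZ9qRsuVwXy012345", "[REDACTED:secret]")


logger = logging.getLogger("traces")
logger.addFilter(RedactingFilter(stand_in_redactor))
```

A filter attached to a logger only sees records logged on that logger; attaching it to a handler instead would cover every record routed to that handler.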

🔒

Deterministic & safe

Runs in-process, retains no data, is ReDoS-safe and fails closed. Three dependencies, fully typed. A redaction pass should never be your next incident.
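Fail-closed means that if redaction itself errors, the raw payload is withheld rather than passed through. A minimal sketch of that idea follows; the `fail_closed` wrapper and the `[REDACTED:error]` placeholder are assumptions for illustration, not the library's documented behavior.

```python
from typing import Any, Callable

FAILED = "[REDACTED:error]"  # hypothetical placeholder, not a traceredact tag


def fail_closed(redactor: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Wrap a redactor so any exception yields a placeholder, never the raw input."""

    def safe(payload: Any) -> Any:
        try:
            return redactor(payload)
        except Exception:
            # Failing open would return `payload` here and leak it downstream.
            return FAILED

    return safe


def broken_redactor(payload: Any) -> Any:
    raise ValueError("rule engine crashed")


safe_redact = fail_closed(broken_redactor)
shipped = safe_redact({"openai_key": "sk-aB3kZ9qRsuVwXy012345"})
```

The trade-off is that a bug in a rule costs you a log line instead of exposing a secret.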

Who it's for

For teams shipping AI to production.

If you log prompts, agent traces or tool calls, you're already exposed.

🚀

AI-first startups

10–200 people logging to Langfuse, Helicone, Datadog or your own DB. Add a redaction layer in under five minutes.

📋

Hit by a privacy review

A security questionnaire or a near-miss leak just made “what's in our traces?” an urgent question. Gate CI on it today.

⚖️

GDPR / EU AI Act

Transparency duties turn PII in traces into a legal liability you have to budget for. “We didn't know it was logged” is not a defense.

Five minutes

One call. Redacted copy + findings.

python

from traceredact import redact

result = redact({
    "args": {
        "email": "[email protected]",
        "key":   "sk-1234567890abcdefABCD",
    }
})

result.value

{
  "args": {
    "email": "[REDACTED:pii]",
    "key":   "[REDACTED:secret]",
  }
}
# result.findings → paths: args.email, args.key

Coverage

Secrets & PII, out of the box.

Data-driven rules with checksum validation (Luhn, IBAN mod-97) and an entropy fallback.
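The three checks named above are standard algorithms and can be sketched in a few lines each. The function names and length bounds below are illustrative assumptions, not traceredact's internals; the point is that a checksum filters out digit runs that only look like a card or IBAN.

```python
import math
from collections import Counter


def luhn_valid(number: str) -> bool:
    """Luhn check for card-like numbers; spaces and hyphens are ignored."""
    compact = number.replace(" ", "").replace("-", "")
    if not (compact.isascii() and compact.isdigit()) or not 12 <= len(compact) <= 19:
        return False
    total = 0
    for i, ch in enumerate(reversed(compact)):
        digit = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def iban_valid(iban: str) -> bool:
    """IBAN mod-97 check: move the first four chars to the end, map letters to 10..35."""
    compact = iban.replace(" ", "").upper()
    if not (compact.isascii() and compact.isalnum()) or not 15 <= len(compact) <= 34:
        return False
    rearranged = compact[4:] + compact[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)  # 'A' -> 10 ... 'Z' -> 35
    return int(numeric) % 97 == 1


def shannon_entropy(text: str) -> float:
    """Bits per character; random-looking tokens score higher than ordinary words."""
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

An entropy fallback would typically compare this score against a tuned threshold for tokens that match no specific rule; the threshold itself is a design choice not stated here.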

OpenAI · Anthropic · AWS · GitHub · Slack · Stripe · Google · SendGrid · Twilio · HuggingFace · npm · PyPI · Azure · JWT · PEM / PGP keys · Bearer tokens · webhooks · high-entropy
email · phone · credit card · IBAN · IP · US SSN

How it compares

Specialized where the others are generic.

Most masking is tied to one platform or only matches field names. traceredact reads values, covers secrets and PII, and works with any backend.

Capability	traceredact	Langfuse masking	Helicone omit-logs	Presidio / DLP
Matches the value, not just key names	✓	regex	—	✓ PII
Secrets — API keys, tokens, private keys	✓	—	—	partial
Tool-call arguments by JSON path	✓	—	—	—
Works with any logger / backend	✓	Langfuse only	Helicone only	varies
Granularity	per value	per field	drops whole log	per entity
Footprint	3 deps, no models	platform feature	platform feature	heavy / ML

Your traces shouldn't be your next leak.

Install it, gate CI on it, wrap your SDK client. Free and open-source.

$ pip install traceredact
