PII & secret redaction for AI apps

Stop leaking secrets into your AI traces.

A small, dependency-light Python library that redacts PII and API keys from prompts, agent traces and tool-call arguments — deterministically, in-process, before they reach Langfuse, Datadog or your own logs.

$ pip install traceredact
View source
PyPI v0.2.1 · Python 3.11 – 3.13 · CI passing · License Apache-2.0 · 3 deps
agent_trace.json
{
  "user": { "email": "[email protected]"[REDACTED:pii], "plan": "pro" },
  "tool_call": {
    "name": "charge_card",
    "args": { "card": "4111 1111 1111 1111"[REDACTED:pii] }
  },
  "config": { "openai_key": "sk-aB3kZ9qRsuVwXy012345"[REDACTED:secret] },
  "note": "shipped to your logger ⟶ redacted first"
}
Why traceredact

Built for the way AI apps actually leak.

Field-name denylists miss the real cases. traceredact reads the values.

🔍

Content-based, not key-name

Catches a live sk-… key or a card number even under an innocuous JSON key, or buried in free-form prompt text — not just fields called password.
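To make the difference concrete, here is a minimal sketch contrasting the two approaches. The pattern and the denylist below are illustrative assumptions, not traceredact's actual rules.

```python
import re

# Hypothetical rule for illustration only; not traceredact's real pattern set.
SECRET_PATTERN = re.compile(r"\bsk-[A-Za-z0-9]{20,}\b")
DENYLISTED_KEYS = {"password", "api_key", "secret"}


def key_name_scan(record: dict[str, str]) -> list[str]:
    """Flag a field only when its key name looks sensitive."""
    return [key for key in record if key.lower() in DENYLISTED_KEYS]


def content_scan(record: dict[str, str]) -> list[str]:
    """Flag a field when its value matches a secret pattern, whatever the key."""
    return [key for key, value in record.items() if SECRET_PATTERN.search(value)]


# A key pasted into free-form text under an innocuous field name.
record = {"note": "use sk-aB3kZ9qRsuVwXy012345 for staging", "plan": "pro"}
```

The key-name scan reports nothing for this record, while the content scan flags the note field.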

🧩

Tool-call arguments by path

Walks nested dicts, lists, pydantic models and dataclasses, redacting string leaves wherever they sit — and reports the exact json_path.
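The idea of walking a structure and reporting paths can be sketched in a few lines. This is an assumption-laden illustration, not the library's implementation: the walk function, the email pattern and the path format are hypothetical, and pydantic models are left out to keep it standard-library only.

```python
import re
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any

# Illustrative email pattern only; production rules would be stricter.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def walk(value: Any, path: str = "") -> tuple[Any, list[str]]:
    """Return a redacted copy of value plus the path of every redacted leaf."""
    if is_dataclass(value) and not isinstance(value, type):
        value = asdict(value)  # treat dataclass instances like plain dicts
    if isinstance(value, dict):
        out: dict[Any, Any] = {}
        found: list[str] = []
        for key, item in value.items():
            child = f"{path}.{key}" if path else str(key)
            out[key], paths = walk(item, child)
            found.extend(paths)
        return out, found
    if isinstance(value, (list, tuple)):
        items: list[Any] = []
        hits: list[str] = []
        for index, item in enumerate(value):
            new_item, paths = walk(item, f"{path}[{index}]")
            items.append(new_item)
            hits.extend(paths)
        return items, hits
    if isinstance(value, str):
        new_text, count = EMAIL.subn("[REDACTED:pii]", value)
        return new_text, ([path] if count else [])
    return value, []  # numbers, booleans and None pass through unchanged


@dataclass
class ToolCall:  # hypothetical shape of a tool call, for the demo only
    name: str
    args: dict


payload = {"calls": [ToolCall("send_mail", {"to": "ana@example.com"})], "plan": "pro"}
clean, paths = walk(payload)
```

Here paths holds a single entry, calls[0].args.to, and the input payload is left unmodified because the walk builds a copy.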

⛓️

CI-gateable & drop-in

traceredact scan exits non-zero on findings. Wrap your OpenAI / Anthropic client or LangChain handler to redact in-flight. Streaming supported.

🔒

Deterministic & safe

No data retained, in-process, ReDoS-safe, fail-closed. Three dependencies, fully typed. A redaction pass should never be your next incident.
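Fail-closed means that if the redaction step itself breaks, the original text must not slip through. A minimal sketch of that behaviour as a standard-library logging filter follows; RedactingFilter, the placeholder string and the injected redact_text callable are assumptions for illustration, not part of traceredact's documented API.

```python
import logging
from typing import Callable

PLACEHOLDER = "[REDACTED:error]"  # hypothetical marker used when redaction fails


class RedactingFilter(logging.Filter):
    """Redact every log message; on any redaction error, drop the content."""

    def __init__(self, redact_text: Callable[[str], str]) -> None:
        super().__init__()
        self._redact_text = redact_text

    def filter(self, record: logging.LogRecord) -> bool:
        try:
            message = self._redact_text(record.getMessage())
        except Exception:
            # Fail closed: never fall back to the unredacted message.
            message = PLACEHOLDER
        record.msg, record.args = message, ()
        return True  # keep the record, now carrying only safe text


def broken_redactor(text: str) -> str:
    """Stand-in for a rule engine that crashes mid-pass."""
    raise RuntimeError("rule engine failed")


record = logging.LogRecord("app", logging.INFO, "app.py", 1, "key=%s", ("sk-test",), None)
RedactingFilter(broken_redactor).filter(record)
```

After the filter runs, the record's message is the placeholder rather than the raw key. Attaching the filter to a handler with handler.addFilter(...) applies it to everything that handler emits.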

Who it's for

For teams shipping AI to production.

If you log prompts, agent traces or tool calls, you're already exposed.

🚀

AI-first startups

10–200 people logging to Langfuse, Helicone, Datadog or your own DB. Add a redaction layer in under five minutes.

📋

Hit by a privacy review

A security questionnaire or a near-miss leak just made “what's in our traces?” an urgent question. Gate CI on it today.

⚖️

GDPR / EU AI Act

Transparency duties make PII-in-traces a budgeted legal liability. “We didn't know it was logged” is not a defense.

Five minutes

One call. Redacted copy + findings.

python
from traceredact import redact

result = redact({
    "args": {
        "email": "[email protected]",
        "key":   "sk-1234567890abcdefABCD",
    }
})
result.value
# {
#   "args": {
#     "email": "[REDACTED:pii]",
#     "key":   "[REDACTED:secret]",
#   }
# }
# result.findings → paths: args.email, args.key
Coverage

Secrets & PII, out of the box.

Data-driven rules with checksum validation (Luhn, IBAN mod-97) and an entropy fallback.
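For readers unfamiliar with these checks, here is a short sketch of the three named techniques: the Luhn checksum for card numbers, the mod-97 check for IBANs, and Shannon entropy as a fallback signal. The function names and the length limits are illustrative assumptions; how traceredact wires them into its rules is not shown here.

```python
import math
from collections import Counter


def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, mod 10."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 12:  # assumed minimum length for a card-like number
        return False
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def iban_valid(iban: str) -> bool:
    """IBAN mod-97: move the first four characters to the end, map letters to
    numbers (A=10 ... Z=35), and require the result mod 97 to equal 1."""
    compact = iban.replace(" ", "").upper()
    if not (compact.isascii() and compact.isalnum() and 15 <= len(compact) <= 34):
        return False
    rearranged = compact[4:] + compact[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


def shannon_entropy(text: str) -> float:
    """Bits per character; random-looking tokens score higher than prose."""
    if not text:
        return 0.0
    counts = Counter(text)
    return -sum((n / len(text)) * math.log2(n / len(text)) for n in counts.values())
```

Checksums cut false positives: a 16-digit order ID that fails Luhn is probably not a card number. Entropy works the other way, catching unfamiliar token formats that no named rule covers, at the cost of needing a tuned threshold.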

OpenAI, Anthropic, AWS, GitHub, Slack, Stripe, Google, SendGrid, Twilio, HuggingFace, npm, PyPI, Azure, JWT, PEM / PGP keys, Bearer tokens, webhooks, high-entropy, email, phone, credit card, IBAN, IP, US SSN
How it compares

Specialized where the others are generic.

Most masking is tied to one platform or only matches field names. traceredact reads values, covers secrets and PII, and works with any backend.

Capability | traceredact | Langfuse masking | Helicone omit-logs | Presidio / DLP
Matches the value, not just key names regex PII
Secrets (API keys, tokens, private keys) partial
Tool-call arguments by JSON path
Works with any logger / backend | yes | Langfuse only | Helicone only | varies
Granularity | per value | per field | drops whole log | per entity
Footprint | 3 deps, no models | platform feature | platform feature | heavy / ML

Your traces shouldn't be your next leak.

Install it, gate CI on it, wrap your SDK client. Free and open-source.

$ pip install traceredact
Get started →