PII Masking

AgentMark PII masking strips sensitive data from span attributes before traces are exported. Masking runs client-side in your application process, so no unmasked data ever leaves your environment. You can use a custom mask function, the built-in PII masker, environment variable suppression, or any combination of these approaches.

How it works

PII masking is implemented as an OpenTelemetry SpanProcessor that wraps the export pipeline. When a span finishes, the masking processor intercepts it before the exporter sends data over the network.
1. Span finishes. Your application completes an LLM call or tool invocation. The span is ready for export.
2. MaskingSpanProcessor intercepts. If masking is configured, the processor runs env var suppression first, then your mask function on each sensitive attribute.
3. Redacted span exported (or dropped). If masking succeeds, the redacted span is forwarded to the exporter. If the mask function throws, the span is dropped entirely (fail-closed) and a warning is logged.
This means:
  • No unmasked data ever leaves your application process. Masking runs in-memory before any network call.
  • Zero overhead when masking is disabled. The processor is only added to the pipeline when you configure a mask function or set env vars.
  • Standard OTel pattern. The MaskingSpanProcessor wraps your existing BatchSpanProcessor or SimpleSpanProcessor — no forking or patching required.
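The flow described above can be sketched in plain Python. This illustrates the fail-closed idea only and is not the SDK's MaskingSpanProcessor: the names MaskingProcessorSketch and ListExporter, the on_end signature, and the two-key SENSITIVE_KEYS set are assumptions made for this example, and spans are modelled as plain dictionaries.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger("masking_sketch")

# Hypothetical subset of attribute keys treated as sensitive in this sketch.
SENSITIVE_KEYS = ("gen_ai.request.input", "gen_ai.response.output")


class ListExporter:
    """Stand-in for a real exporter: collects spans in memory."""

    def __init__(self) -> None:
        self.exported: List[Dict[str, object]] = []

    def export(self, attributes: Dict[str, object]) -> None:
        self.exported.append(attributes)


class MaskingProcessorSketch:
    """Masks sensitive string attributes, then forwards or drops the span."""

    def __init__(self, mask: Callable[[str], str], exporter: ListExporter) -> None:
        self.mask = mask
        self.exporter = exporter

    def on_end(self, attributes: Dict[str, object]) -> bool:
        redacted = dict(attributes)  # work on a copy, never the caller's span
        try:
            for key in SENSITIVE_KEYS:
                value = redacted.get(key)
                if isinstance(value, str):
                    redacted[key] = self.mask(value)
        except Exception as exc:
            # Fail closed: drop the whole span rather than export raw data.
            # Only the exception type is logged, since the message may echo input.
            logger.warning("mask function raised %s; span dropped", type(exc).__name__)
            return False
        self.exporter.export(redacted)
        return True
```

In this sketch, on_end returns True when the redacted span reaches the exporter and False when it is dropped, which makes the two outcomes of the final step easy to assert in a unit test.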

Before and after

With createPiiMasker() enabled, PII tokens like [EMAIL], [SSN], and [PHONE] replace sensitive data in the trace viewer:
Figure: Trace with PII masking enabled, with sensitive data replaced by tokens like [EMAIL], [SSN], and [PHONE].
With AGENTMARK_HIDE_INPUTS=true, all input attributes show [REDACTED] while outputs remain visible:
Figure: Trace with input suppression, where all inputs show [REDACTED].
Here’s what the raw span attributes look like with each approach.

Without masking:
{
  "gen_ai.request.input": "My SSN is 123-45-6789 and email is user@example.com",
  "gen_ai.response.output": "I found your account linked to user@example.com",
  "gen_ai.request.model": "gpt-4o",
  "gen_ai.usage.total_tokens": 150
}
With createPiiMasker({ email: true, ssn: true }):
{
  "gen_ai.request.input": "My SSN is [SSN] and email is [EMAIL]",
  "gen_ai.response.output": "I found your account linked to [EMAIL]",
  "gen_ai.request.model": "gpt-4o",
  "gen_ai.usage.total_tokens": 150
}
With AGENTMARK_HIDE_INPUTS=true:
{
  "gen_ai.request.input": "[REDACTED]",
  "gen_ai.response.output": "I found your account linked to user@example.com",
  "gen_ai.request.model": "gpt-4o",
  "gen_ai.usage.total_tokens": 150
}
Notice that gen_ai.request.model and gen_ai.usage.total_tokens are never masked — these operational attributes contain no user data.

Basic usage

Pass a mask function to AgentMarkSDK. The function receives each string attribute value and returns the redacted version.
import { AgentMarkSDK } from '@agentmark-ai/sdk';

const sdk = new AgentMarkSDK({
  apiKey: 'am_...',
  appId: 'app-123',
  mask: (data) => data.replace(/\b\d{3}-\d{2}-\d{4}\b/g, '[SSN]'),
});
sdk.initTracing();
The mask function is called on every maskable span attribute before the span is handed to the exporter. The function must be synchronous. You have full control over the replacement logic.
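For Python services, the same SSN mask can be written with the standard re module. This is a minimal sketch, assuming the Python SDK accepts any synchronous str-to-str callable as mask (as the Presidio recipe on this page suggests); mask_ssn is a name chosen for this example.

```python
import re

# Same pattern as the TypeScript example: three digits, two digits, four digits.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_ssn(data: str) -> str:
    """Synchronous mask function: replaces every SSN-shaped substring."""
    return SSN_PATTERN.sub("[SSN]", data)


print(mask_ssn("My SSN is 123-45-6789"))  # My SSN is [SSN]
```

The word boundaries keep the pattern from matching inside longer digit runs, so a value like 1123-45-67890 is left untouched. Whether that is the behavior you want depends on how your data is formatted.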

Built-in PII masker

AgentMark ships a built-in PII masker that covers common patterns out of the box. Enable the patterns you need:
import { AgentMarkSDK, createPiiMasker } from '@agentmark-ai/sdk';

const sdk = new AgentMarkSDK({
  apiKey: 'am_...',
  appId: 'app-123',
  mask: createPiiMasker({
    email: true,
    phone: true,
    ssn: true,
    creditCard: true,
    ipAddress: true,
  }),
});
sdk.initTracing();

Built-in patterns

  • email: Matches email addresses like user@example.com. Replaced with [EMAIL].
  • phone: Matches phone numbers like (555) 123-4567. Replaced with [PHONE].
  • ssn: Matches Social Security numbers like 123-45-6789. Replaced with [SSN].
  • creditCard: Matches credit card numbers like 4111 1111 1111 1111. Replaced with [CREDIT_CARD].
  • ipAddress: Matches IP addresses like 192.168.1.100. Replaced with [IP_ADDRESS].
All patterns default to false. Only patterns you explicitly enable are applied.

Custom patterns

You can add custom patterns alongside the built-in ones. Each entry needs a regex pattern and a replacement string.
import { AgentMarkSDK, createPiiMasker } from '@agentmark-ai/sdk';

const sdk = new AgentMarkSDK({
  apiKey: 'am_...',
  appId: 'app-123',
  mask: createPiiMasker({
    email: true,
    custom: [
      { pattern: /MRN-\d+/g, replacement: '[MEDICAL_RECORD]' },
      { pattern: /ACCT-[A-Z0-9]+/g, replacement: '[ACCOUNT_ID]' },
    ],
  }),
});
sdk.initTracing();
Custom patterns run after built-in patterns. Custom patterns can be used on their own without enabling any built-in patterns.
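The ordering rule can be illustrated with a small stand-alone masker. This is a sketch under stated assumptions, not the SDK's implementation: make_masker, BUILT_IN, and the two regexes are invented for the example, and the SDK's real built-in patterns may be broader or stricter.

```python
import re
from typing import Callable, Dict, Sequence, Tuple

Rule = Tuple[re.Pattern, str]

# Illustrative regexes only; these are assumptions, not the SDK's patterns.
BUILT_IN: Dict[str, Rule] = {
    "email": (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[EMAIL]"),
    "ssn": (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
}


def make_masker(enabled: Sequence[str] = (), custom: Sequence[Rule] = ()) -> Callable[[str], str]:
    """Build a mask function: enabled built-in rules first, then custom rules."""
    rules = [BUILT_IN[name] for name in enabled] + list(custom)

    def mask(data: str) -> str:
        for pattern, replacement in rules:
            data = pattern.sub(replacement, data)
        return data

    return mask


custom_rules = [(re.compile(r"MRN-\d+"), "[MEDICAL_RECORD]")]
mask = make_masker(enabled=["email"], custom=custom_rules)
print(mask("Patient MRN-12345, contact user@example.com"))
# Patient [MEDICAL_RECORD], contact [EMAIL]
```

Because custom rules see the output of the built-in rules, a custom pattern should not rely on text that a built-in pattern has already replaced. Calling make_masker with an empty enabled list and only custom rules corresponds to using custom patterns on their own.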

Environment variable suppression

For a zero-code option, set environment variables to suppress all inputs, all outputs, or both:
AGENTMARK_HIDE_INPUTS=true
AGENTMARK_HIDE_OUTPUTS=true
When enabled, these replace ALL input or output attribute values with [REDACTED]. No code changes are needed.
If both environment variable suppression and a mask function are configured, suppression runs first. The mask function then receives the already-suppressed values.
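The interaction between suppression and a mask function can be sketched for a single input value. The helper names flag_enabled and process_input_value are hypothetical, and the assumption that only the value "true" enables a flag is made for this example; the SDK may accept other spellings.

```python
import os
from typing import Callable, Mapping, Optional


def flag_enabled(name: str, env: Mapping[str, str] = os.environ) -> bool:
    # Assumption: the flag counts as set only when its value is "true".
    return env.get(name, "").strip().lower() == "true"


def process_input_value(
    value: str,
    mask: Optional[Callable[[str], str]] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    if flag_enabled("AGENTMARK_HIDE_INPUTS", env):
        value = "[REDACTED]"  # suppression runs first
    if mask is not None:
        value = mask(value)  # the mask sees the already-suppressed value
    return value
```

With AGENTMARK_HIDE_INPUTS=true, a mask function passed to this sketch only ever receives the string [REDACTED] for inputs, so its patterns have nothing left to match there and continue to matter only for attributes that are not suppressed.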

Masked attributes reference

AgentMark masks specific span attributes depending on their category.

Input attributes (suppressed by AGENTMARK_HIDE_INPUTS):
  • gen_ai.request.input — The prompt or messages sent to the model
  • gen_ai.request.tool_calls — Tool call arguments included in the request
Output attributes (suppressed by AGENTMARK_HIDE_OUTPUTS):
  • gen_ai.response.output — The model’s text response
  • gen_ai.response.output_object — Structured output from the model
Metadata attributes (mask function only, not affected by env vars):
  • agentmark.metadata.* — Custom metadata attached to spans
Operational attributes such as trace IDs, model names, and token counts are never masked. These contain no user data and are required for observability to function.
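The categories above can be expressed as a small lookup, which is handy when unit-testing a mask function against realistic span payloads. This is a sketch only: category_of and mask_attributes are hypothetical helpers, and the key lists are copied from this reference rather than read from the SDK.

```python
from typing import Callable, Dict

INPUT_KEYS = frozenset({"gen_ai.request.input", "gen_ai.request.tool_calls"})
OUTPUT_KEYS = frozenset({"gen_ai.response.output", "gen_ai.response.output_object"})
METADATA_PREFIX = "agentmark.metadata."


def category_of(key: str) -> str:
    """Classify an attribute key using the lists in this reference."""
    if key in INPUT_KEYS:
        return "input"
    if key in OUTPUT_KEYS:
        return "output"
    if key.startswith(METADATA_PREFIX):
        return "metadata"
    return "operational"


def mask_attributes(attributes: Dict[str, object], mask: Callable[[str], str]) -> Dict[str, object]:
    """Apply mask to maskable string values; leave operational attributes alone."""
    masked: Dict[str, object] = {}
    for key, value in attributes.items():
        if category_of(key) != "operational" and isinstance(value, str):
            masked[key] = mask(value)
        else:
            masked[key] = value
    return masked
```

Passing a payload like the one in the Before and after section through mask_attributes leaves gen_ai.request.model and gen_ai.usage.total_tokens unchanged while the input, output, and metadata strings go through the mask.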

Error handling

PII masking uses fail-closed behavior. If your mask function throws an error, the span is dropped entirely and never exported. This ensures that unmasked data is never sent to the trace backend. Tracing continues normally for subsequent spans after a mask failure. The dropped span does not affect the rest of the trace pipeline.
Test your mask function thoroughly before deploying to production. A mask function that throws on unexpected input causes spans to be dropped, and the only signal is a logged warning, so gaps in your traces are easy to miss.
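One way to reduce dropped spans is to wrap a fragile mask function so that a failure yields a fully redacted value instead of an exception. This is a sketch of a design option, not SDK behavior: fail_safe and its fallback parameter are names invented for this example.

```python
import logging
from typing import Callable

logger = logging.getLogger("mask_sketch")


def fail_safe(mask: Callable[[str], str], fallback: str = "[REDACTED]") -> Callable[[str], str]:
    """Wrap a mask function so errors produce a fallback token, not an exception."""

    def wrapped(data: str) -> str:
        try:
            return mask(data)
        except Exception as exc:
            # Log only the exception type; the message could echo sensitive input.
            logger.warning("mask function failed with %s; using fallback", type(exc).__name__)
            return fallback

    return wrapped
```

The wrapped function still never returns unmasked data when the inner mask fails, so the fail-closed guarantee is preserved for that attribute. The trade-off is that the span is exported with a placeholder rather than dropped, which may or may not be preferable for your compliance requirements.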

Recipes

Microsoft Presidio (Python)

Microsoft Presidio uses NLP to detect unstructured PII like person names, addresses, and passport numbers that regex patterns miss. Since Presidio is a Python library, this recipe applies to the Python SDK.
pip install presidio-analyzer presidio-anonymizer
python -m spacy download en_core_web_lg
Python
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine
from agentmark_sdk import AgentMarkSDK

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

def presidio_mask(data: str) -> str:
    results = analyzer.analyze(text=data, language="en")
    anonymized = anonymizer.anonymize(text=data, analyzer_results=results)
    return anonymized.text

sdk = AgentMarkSDK(
    api_key="am_...",
    app_id="app-123",
    mask=presidio_mask,
)
sdk.init_tracing()
Presidio detects 15+ entity types including PERSON, LOCATION, US_PASSPORT, IBAN_CODE, and CRYPTO. See Presidio's supported entities documentation for the full list.
For TypeScript applications, use createPiiMasker() with custom regex patterns for common PII types. Presidio requires a Python runtime and NLP models (~500MB), which makes it better suited for Python services.

Healthcare (HIPAA)

Combine built-in patterns with custom patterns for Protected Health Information:
Python
import re
from agentmark_sdk import AgentMarkSDK, create_pii_masker, PiiMaskerConfig, CustomPattern

sdk = AgentMarkSDK(
    api_key="am_...",
    app_id="app-123",
    mask=create_pii_masker(PiiMaskerConfig(
        email=True,
        phone=True,
        ssn=True,
        ip_address=True,
        custom=[
            CustomPattern(pattern=re.compile(r"MRN[-\s]?\d{6,10}"), replacement="[MRN]"),
            # DEA numbers: a letter, then a letter or 9, then seven digits.
            CustomPattern(pattern=re.compile(r"\b[A-Z][A-Z9]\d{7}\b"), replacement="[DEA_NUMBER]"),
            CustomPattern(pattern=re.compile(r"\b(?:DOB|dob)[:\s]+\d{1,2}/\d{1,2}/\d{2,4}"), replacement="[DOB]"),
        ],
    )),
)
sdk.init_tracing()

Financial services

For PCI-DSS compliance, enable credit card masking and add patterns for financial identifiers:
TypeScript
import { AgentMarkSDK, createPiiMasker } from '@agentmark-ai/sdk';

const sdk = new AgentMarkSDK({
  apiKey: 'am_...',
  appId: 'app-123',
  mask: createPiiMasker({
    creditCard: true,
    ssn: true,
    custom: [
      { pattern: /\b\d{9}\b/g, replacement: '[ROUTING_NUMBER]' },
      { pattern: /\b[A-Z]{2}\d{2}[A-Z0-9]{4}\d{7}[A-Z0-9]{0,16}\b/g, replacement: '[IBAN]' },
    ],
  }),
});
sdk.initTracing();
For maximum compliance assurance, combine a mask function with AGENTMARK_HIDE_INPUTS=true as a defense-in-depth strategy. The env var acts as a safety net in case a new input attribute is added that the mask function doesn’t cover.
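The nine-digit routing pattern in the recipe above matches any nine-digit number, including order IDs or unformatted SSNs. If that produces too many false positives, a custom mask function can check the ABA routing checksum before redacting. This is a sketch written in Python to match the other recipes; is_aba_routing_number and mask_routing_numbers are names chosen for this example, and it is not part of the SDK.

```python
import re

# ASCII-only digits, bounded on both sides, as in the recipe's pattern.
NINE_DIGITS = re.compile(r"\b\d{9}\b", re.ASCII)


def is_aba_routing_number(digits: str) -> bool:
    """ABA checksum: 3*(d1+d4+d7) + 7*(d2+d5+d8) + (d3+d6+d9) must be a multiple of 10."""
    d = [int(c) for c in digits]
    total = 3 * (d[0] + d[3] + d[6]) + 7 * (d[1] + d[4] + d[7]) + (d[2] + d[5] + d[8])
    return total % 10 == 0


def mask_routing_numbers(data: str) -> str:
    """Redact only nine-digit numbers that pass the routing checksum."""

    def replace(match: "re.Match[str]") -> str:
        value = match.group()
        return "[ROUTING_NUMBER]" if is_aba_routing_number(value) else value

    return NINE_DIGITS.sub(replace, data)


print(mask_routing_numbers("Wire to 021000021, ref 123456789"))
# Wire to [ROUTING_NUMBER], ref 123456789
```

This trades recall for precision: nine-digit values that fail the checksum are left in place, so the stricter check is only appropriate when other rules (or input suppression) cover whatever else those numbers might be.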
