Noise & Secret Management in API Tests
Noise is any field in a captured API request or response that changes between runs for reasons unrelated to correctness: timestamps, UUIDs, session tokens, pagination cursors. Keploy automatically detects these patterns at capture time and normalizes them during replay so deterministic assertions do not produce flaky failures. Secrets and PII are masked at capture so the values never persist to disk, with built-in defaults intended to support GDPR, HIPAA, and PCI DSS requirements.
This page explains how Keploy's deterministic replay engine handles the two hardest problems in traffic-based testing: non-deterministic data that causes false failures, and secrets that should never appear in committed test fixtures.
What is noise in captured API traffic?
Every recorded HTTP interaction contains two kinds of fields: fields that are determined by your application logic (a user's name, a product's price, an order's line items) and fields that change every time a request is processed regardless of what your code is doing. The second category is noise.
Common sources of noise in HTTP APIs:
- Timestamps. Every response with a createdAt or updatedAt field, or a Date header, contains a new value on every replay. Comparing these literally guarantees a false failure on the second run.
- UUIDs and auto-incrementing IDs. Database-generated IDs are not stable across runs unless the database is reset to a known state between captures — and even then, any feature that assigns an ID from an external service (Stripe customer ID, auth provider user ID) breaks the contract.
- Session tokens and CSRF tokens. These are scoped to the current user session by design. A captured login response contains a token that will never be valid again.
- Pagination cursors. Cursors encode server-side position and often include a timestamp or hash, so they change on every run.
- Request-scoped trace IDs. Distributed tracing frameworks (OpenTelemetry, Jaeger) attach a unique trace ID to each request. That ID is logged and sometimes echoed back in response headers.
- Rate-limit metadata. X-RateLimit-Remaining and Retry-After headers change as the service processes traffic.
A replay-based testing tool that treats all these fields as part of the assertion surface will produce tests that fail on every run. Flakiness is the default state — which is why traffic-based testing did not take off until the noise problem was solved.
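To make the failure mode concrete, here is a minimal sketch of two captures of the same endpoint compared literally and then with noise fields set aside. The payloads, the `NOISE_FIELDS` set, and the `without_noise` helper are illustrative assumptions, not part of Keploy.

```python
# Two hypothetical captures of the same endpoint, taken seconds apart.
first_run = {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "name": "Ada",
    "createdAt": "2026-04-14T10:00:00Z",
}
second_run = {
    "id": "123e4567-e89b-42d3-a456-426614174000",
    "name": "Ada",
    "createdAt": "2026-04-14T10:00:07Z",
}

# Assumed to be already known as noise for this illustration.
NOISE_FIELDS = {"id", "createdAt"}


def without_noise(body: dict) -> dict:
    """Return a copy of the body with the noise fields left out."""
    return {k: v for k, v in body.items() if k not in NOISE_FIELDS}


# Literal comparison fails even though the application behaved identically.
literal_match = first_run == second_run
# Comparing only the logic-determined fields succeeds.
logical_match = without_noise(first_run) == without_noise(second_run)
```

Here `literal_match` is false while `logical_match` is true, which is the gap that noise handling has to close.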
How Keploy's deterministic replay engine handles noise
Keploy auto-detects the common noise patterns at capture time so most applications work out of the box with no configuration. Noise handling runs in two passes: automatic pattern detection, then user configuration for anything the patterns miss.
Pass 1: Pattern-based detection
Keploy scans every string field in the captured request and response bodies and headers for well-known formats:
- ISO-8601 timestamps (2026-04-14T10:00:00Z)
- Unix epoch seconds and milliseconds
- UUID v1 through v5 (550e8400-e29b-41d4-a716-446655440000)
- MongoDB ObjectId (24 hex characters)
- Auto-incrementing positive integers used as IDs
- JWTs (three dot-separated base64url segments)
- Bearer tokens in Authorization headers
Any field matching one of these patterns is marked as noise in the generated test YAML. During replay, the field is compared with a structural matcher: a timestamp field is valid as long as the replayed response also contains a valid timestamp in the same format; a UUID field is valid as long as the replayed response contains a valid UUID of the same version.
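The idea of pattern detection followed by structural matching can be sketched in a few lines. This is an illustration under stated assumptions, not Keploy's implementation: the regular expressions, the `classify` function, and the `structurally_equal` function are hypothetical, and only a subset of the formats listed above is covered.

```python
import re
from typing import Optional

# Illustrative patterns only; a production detector would be stricter.
PATTERNS = {
    "iso8601": re.compile(
        r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$"
    ),
    "uuid": re.compile(
        r"^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$",
        re.IGNORECASE,
    ),
    "objectid": re.compile(r"^[0-9a-f]{24}$", re.IGNORECASE),
    # JWT headers are base64url-encoded JSON objects, so they start with "eyJ".
    "jwt": re.compile(r"^eyJ[A-Za-z0-9_-]*\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+$"),
}


def classify(value: str) -> Optional[str]:
    """Return the name of the first noise pattern the value matches, if any."""
    for name, pattern in PATTERNS.items():
        if pattern.fullmatch(value):
            return name
    return None


def structurally_equal(recorded: str, replayed: str) -> bool:
    """Compare by format for noise values, literally for everything else."""
    kind = classify(recorded)
    if kind is None:
        return recorded == replayed
    if kind == "uuid":
        # Index 14 holds the UUID version digit; require the same version.
        return classify(replayed) == "uuid" and recorded[14] == replayed[14]
    return classify(replayed) == kind
```

With this sketch, two different ISO-8601 timestamps compare as equal, a timestamp replaced by arbitrary text does not, and ordinary strings such as a user name still require an exact match.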
Pass 2: User configuration
Application-specific noise (internal order IDs, request correlation tokens, domain-specific checksums) is not auto-detected. Those fields are marked via keploy.yaml:
    # keploy.yaml
    noise:
      global:
        body:
          - "$.data.orderId"
          - "$.data.correlationId"
          - "$.metadata.checksum"
        header:
          - "X-Request-Id"
          - "X-Trace-Id"
          - "X-RateLimit-Remaining"

Rules can also be scoped to a single test case by adding a noise block inside the generated YAML file. This is the right choice when a field is noise only in one specific flow (for example, an admin operation might return a request timestamp that matters for auditing and should be asserted, while the same field in a customer-facing flow is irrelevant).
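As a rough sketch of what path-based noise rules mean at comparison time, the following ignores the configured body paths before comparing two responses. The `drop_path` and `equal_ignoring` helpers are hypothetical, handle only simple dotted paths of the `$.a.b` form, and drop the field entirely rather than matching it structurally; they are not Keploy's matcher.

```python
import copy


def drop_path(doc: dict, path: str) -> None:
    """Remove the field at a simple '$.a.b.c' path in place, if it exists."""
    keys = (path[2:] if path.startswith("$.") else path).split(".")
    node = doc
    for key in keys[:-1]:
        node = node.get(key)
        if not isinstance(node, dict):
            return  # path does not exist in this document
    node.pop(keys[-1], None)


def equal_ignoring(recorded: dict, replayed: dict, noise_paths: list) -> bool:
    """Compare two bodies after removing every configured noise path."""
    a, b = copy.deepcopy(recorded), copy.deepcopy(replayed)
    for path in noise_paths:
        drop_path(a, path)
        drop_path(b, path)
    return a == b


GLOBAL_NOISE = ["$.data.orderId", "$.data.correlationId", "$.metadata.checksum"]

recorded = {
    "data": {"orderId": "A-1", "correlationId": "c1", "total": 42},
    "metadata": {"checksum": "abc"},
}
replayed = {
    "data": {"orderId": "A-2", "correlationId": "c2", "total": 42},
    "metadata": {"checksum": "xyz"},
}

passes_with_global_rules = equal_ignoring(recorded, replayed, GLOBAL_NOISE)
# A narrower, per-test rule set leaves the other fields on the assertion surface.
passes_with_one_rule = equal_ignoring(recorded, replayed, ["$.data.orderId"])
```

Scoping a rule to one test case corresponds to passing a different path list for that comparison: the same field can be ignored in one flow and asserted in another.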
Secret and PII masking
Noise detection determines what Keploy ignores during comparison. Masking determines what Keploy writes to disk in the first place. The two are different concerns: a timestamp is noise but does not need to be masked, while a Stripe secret key is noise and must also never persist to disk.
Keploy masks values matching known secret patterns at capture time, before any write to the keploy/ directory:
- Bearer tokens in Authorization headers
- Cookie values (stored as opaque placeholders scoped by cookie name)
- Stripe API keys (sk_live_..., sk_test_...)
- AWS access and secret keys (patterns from the IAM guidance)
- GitHub PATs (ghp_...) and OAuth tokens
- JWTs (full token redacted; header can optionally be preserved)
- PCI DSS: credit card numbers (Luhn-checked) and CVVs
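A minimal sketch of capture-time masking for header values follows, assuming deliberately loose patterns for three of the secret types above. The regular expressions and the `mask` and `mask_headers` helpers are illustrative assumptions; the real token formats are stricter and Keploy's own rules may differ.

```python
import re

REDACTED = "[REDACTED]"

# Keep the "Bearer " scheme, replace only the credential that follows it.
BEARER = re.compile(r"\b(Bearer )\S+", re.IGNORECASE)

# Loose, illustrative prefixes for Stripe keys and GitHub PATs.
KEY_PATTERNS = [
    re.compile(r"\bsk_(?:live|test)_[A-Za-z0-9]+"),
    re.compile(r"\bghp_[A-Za-z0-9]+"),
]


def mask(text: str) -> str:
    """Replace any recognised secret inside a string with a placeholder."""
    text = BEARER.sub(r"\1" + REDACTED, text)
    for pattern in KEY_PATTERNS:
        text = pattern.sub(REDACTED, text)
    return text


def mask_headers(headers: dict) -> dict:
    """Return a copy of the headers that is safe to serialize."""
    return {name: mask(value) for name, value in headers.items()}


captured = {
    "Authorization": "Bearer abc.def.ghi",
    "X-Upstream-Key": "sk_test_abc123",
    "Accept": "application/json",
}
safe = mask_headers(captured)
```

The point of the design is ordering: masking runs on the in-memory copy, and only the masked copy is ever handed to whatever writes the fixture.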
For application-specific PII, add fields to the noise.global.body section with a redact: true flag:
    # keploy.yaml
    noise:
      global:
        body:
          - path: "$.user.email"
            redact: true
          - path: "$.user.phoneNumber"
            redact: true
          - path: "$.patient.ssn"
            redact: true
          - path: "$.patient.diagnosis"
            redact: true

Masked fields appear in the committed test YAML as [REDACTED] placeholders. During replay, Keploy treats them as structural matchers (any value of the same type passes). The original value is replaced in memory before anything is written, so it never persists to disk, committed history, or CI logs.
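The redact-before-write ordering can be sketched as follows. The `redact` helper, its simple dotted-path handling, and the sample payload are assumptions made for illustration; this is not Keploy's serializer.

```python
import copy
import json

REDACTED = "[REDACTED]"


def _parent(doc, keys):
    """Walk to the dict that holds the final key, or return None if absent."""
    node = doc
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node if isinstance(node, dict) else None


def redact(doc: dict, paths: list) -> dict:
    """Return a copy with every configured '$.a.b' path replaced by a placeholder."""
    out = copy.deepcopy(doc)
    for path in paths:
        keys = (path[2:] if path.startswith("$.") else path).split(".")
        parent = _parent(out, keys[:-1])
        if parent is not None and keys[-1] in parent:
            parent[keys[-1]] = REDACTED
    return out


captured = {
    "user": {"email": "ada@example.com", "phoneNumber": "+1-555-0100", "plan": "pro"}
}
safe = redact(captured, ["$.user.email", "$.user.phoneNumber", "$.patient.ssn"])

# Only the redacted copy is serialized; the original never reaches the writer.
serialized = json.dumps(safe)
```

Paths that do not exist in a given payload, such as `$.patient.ssn` here, are skipped, so one global rule list can be applied to every captured interaction.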
Compliance posture: GDPR, HIPAA, PCI DSS
Keploy does not make a regulatory claim — compliance is your responsibility based on how you configure and operate the tool. What Keploy provides is the set of primitives that make compliant replay-based testing feasible.
GDPR
Email addresses, user names, IP addresses, and any field configured with redact: true are masked at capture time. Test fixtures committed to Git then contain no personal data from production users.
HIPAA
Protected health information fields (patient ID, diagnosis, prescription, insurance number) can be added to the redaction list and are masked before any disk write. Keploy can run inside a BAA-governed environment for capture.
PCI DSS
Card numbers that pass a Luhn check and CVVs are auto-redacted. For custom payment fields (bank account numbers, routing numbers, IBANs), add them to the redaction list in keploy.yaml.
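Luhn checking is what separates a probable card number from any other long run of digits. The sketch below shows the standard Luhn checksum plus a hypothetical `redact_cards` helper built on it; the candidate regex and the 13 to 19 digit length bound are assumptions for illustration, not Keploy's actual rule.

```python
import re

REDACTED = "[REDACTED]"

# Candidate: 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Standard Luhn checksum over the digits in the string."""
    digits = [int(c) for c in number if c.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Starting from the rightmost digit, double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_cards(text: str) -> str:
    """Redact digit runs that pass the Luhn check; leave other numbers alone."""
    return CARD_CANDIDATE.sub(
        lambda m: REDACTED if luhn_valid(m.group()) else m.group(), text
    )
```

For example, the widely used test number 4242 4242 4242 4242 passes the check and is redacted, while a 13-digit order number such as 1234567890123 fails it and is left intact, which keeps false positives on ordinary numeric IDs down.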
How Keploy compares to WireMock and Hoverfly
WireMock and Hoverfly are the two most widely used recording-proxy tools for API testing. Both let you capture HTTP interactions and replay them as mocks, and both support request matching via regex patterns. Neither auto-detects noise.
| Capability | Keploy | WireMock | Hoverfly |
|---|---|---|---|
| Auto-detect timestamps | Yes (ISO-8601, epoch) | Manual regex | Manual templates |
| Auto-detect UUIDs | Yes (v1-v5) | Manual regex | Manual templates |
| Auto-mask Bearer tokens | Yes | No | No |
| Auto-mask Stripe/AWS/GitHub keys | Yes | No | No |
| Luhn-checked card redaction | Yes | No | No |
| Capture at kernel level (no SDK) | Yes (eBPF) | Proxy or Java SDK | Proxy |
For simple recorded mocks where you can tolerate writing regex matchers by hand, WireMock and Hoverfly are sufficient. For full regression suites where every non-deterministic field would otherwise need manual handling, and for applications that must meet compliance requirements on committed test fixtures, Keploy eliminates the bulk of that work through auto-detection.