Record and Replay Testing
What Is Record and Replay Testing?
Record and replay testing captures real application traffic — API requests, responses, and all dependency interactions — during normal operation and converts recordings into automated, replayable test cases with auto-generated mocks.
Instead of imagining what to test and writing scripts, you observe what your application actually does and capture it. Every real API call becomes a test case. Every database query becomes a mock. Every downstream response becomes a stub.
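As a rough illustration of the idea, the sketch below records one request, its response, and the dependency call made along the way. It wraps the dependency at the application level purely for readability; the names `Recorder`, `real_db_lookup`, and `handle_get_user` are hypothetical and do not reflect how Keploy captures traffic.

```python
import json
from typing import Any, Callable


class Recorder:
    """Wraps a dependency and records every call as a replayable mock."""

    def __init__(self) -> None:
        self.mocks: list[dict[str, Any]] = []

    def wrap(self, name: str, fn: Callable[..., Any]) -> Callable[..., Any]:
        def recorded(*args: Any) -> Any:
            result = fn(*args)  # call the real dependency
            self.mocks.append({"dependency": name, "args": list(args), "result": result})
            return result

        return recorded


def real_db_lookup(user_id: int) -> dict[str, Any]:
    # Stand-in for a real database query (hypothetical).
    return {"id": user_id, "name": "Ada"}


def handle_get_user(user_id: int, db_lookup: Callable[[int], dict[str, Any]]) -> dict[str, Any]:
    # Hypothetical request handler under observation.
    user = db_lookup(user_id)
    return {"status": 200, "body": {"user": user["name"]}}


recorder = Recorder()
response = handle_get_user(7, recorder.wrap("db.lookup", real_db_lookup))

# One observed request becomes one test case plus the mocks it needs.
test_case = {"request": {"user_id": 7}, "response": response, "mocks": recorder.mocks}
print(json.dumps(test_case, indent=2))
```

The recorded `test_case` holds everything needed to rerun the request later without the real database.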
eBPF has removed two earlier constraints, HTTP-only capture and intrusive proxies, by enabling kernel-level capture of network traffic without any application changes. Keploy is a tool built on this approach.
How eBPF Enables Non-Intrusive Traffic Capture
Kernel-level capture means complete visibility into every byte of network traffic, regardless of protocol or language.
Kernel-Level Hooks
eBPF probes attach to the sendto(), recvfrom(), write(), and read() system calls on socket file descriptors, so they see all network traffic entering and leaving your application.
Multi-Protocol
HTTP, PostgreSQL wire protocol, MongoDB BSON, Redis commands, gRPC frames, Kafka protocol — all visible and decoded into human-readable test cases and mocks.
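Decoding starts with deciding which protocol a captured byte stream belongs to. The following is a minimal sketch of that step, assuming the first bytes of a connection are available; `guess_protocol` is a hypothetical helper, it covers only a few protocols, and real decoders are considerably more thorough.

```python
import struct

HTTP1_PREFIXES = (b"GET ", b"POST ", b"PUT ", b"PATCH ", b"DELETE ", b"HEAD ", b"OPTIONS ", b"HTTP/1.")
HTTP2_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"  # client connection preface
POSTGRES_V3 = 196608  # protocol version 3.0, encoded as 3 << 16


def guess_protocol(payload: bytes) -> str:
    """Classify the first bytes seen on a connection."""
    if payload.startswith(HTTP2_PREFACE):
        return "http2"  # gRPC runs over HTTP/2
    if payload.startswith(HTTP1_PREFIXES):
        return "http1"
    if len(payload) >= 8:
        # PostgreSQL startup message: int32 length, int32 protocol version.
        length, version = struct.unpack("!II", payload[:8])
        if version == POSTGRES_V3 and length >= 8:
            return "postgres"
    if payload[:1] == b"*" and b"\r\n" in payload:
        return "redis"  # RESP array, the form clients use to send commands
    return "unknown"


body = b"user\x00app\x00\x00"
startup = struct.pack("!II", 8 + len(body), POSTGRES_V3) + body

print(guess_protocol(b"GET /users HTTP/1.1\r\nHost: example\r\n\r\n"))
print(guess_protocol(startup))
print(guess_protocol(b"*2\r\n$3\r\nGET\r\n$3\r\nkey\r\n"))
```

Once the protocol is known, a protocol-specific parser can turn the raw bytes into a readable request and response pair.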
Low Overhead
No latency from routing through external processes. No SDK to import. No proxy to configure. Recording adds approximately 1-3% overhead, which is negligible for staging.
Fully Transparent
Your application does not know it is being recorded. No code changes. No instrumentation. Language and framework agnostic — works identically with Go, Java, Python, Node.js, Rust.
For a comprehensive technical deep dive, see the eBPF for testing guide.
Record-Replay vs Traditional Approaches
Why record-replay testing is gaining adoption compared to manual authoring and spec-based generation.
| Dimension | Manual Authoring | Spec-Based Generation | Record-Replay (Keploy) |
|---|---|---|---|
| Test creation | Engineer writes each test by hand | Generated from OpenAPI spec | Captured from real traffic automatically |
| Mock generation | Engineer writes mocks manually | None (tests against spec only) | Auto-generated from real dependency responses |
| Edge case coverage | Limited to engineer's imagination | Schema-level only | Reflects real production edge cases |
| Time to first test | Hours to days per endpoint | Minutes (if spec exists) | Minutes (capture one session) |
| Maintenance effort | High (update on every change) | Medium (update when spec changes) | Low (re-record to refresh) |
Key takeaway: Record-replay excels at regression and integration testing where you need comprehensive coverage quickly. Manual authoring remains valuable for negative testing, edge case exploration, and business logic validation.
Record-Replay Testing Use Cases
Where traffic-based testing delivers the most value.
Regression Testing
Record traffic, replay after every code change. Detect unintended behavior differences with precise diffs. Catches the regressions that cause the most production incidents.
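A "precise diff" here means reporting the exact field path where a replayed response departs from the recording. A minimal sketch, assuming responses are JSON-like dictionaries; the `diff` function and the sample data are illustrative only.

```python
from typing import Any


def diff(expected: Any, actual: Any, path: str = "$") -> list[str]:
    """Return one message per field that differs between two JSON-like values."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        out: list[str] = []
        for key in sorted(expected.keys() | actual.keys()):
            child = f"{path}.{key}"
            if key not in actual:
                out.append(f"{child}: missing in replay")
            elif key not in expected:
                out.append(f"{child}: unexpected in replay")
            else:
                out.extend(diff(expected[key], actual[key], child))
        return out
    if isinstance(expected, list) and isinstance(actual, list):
        out = []
        if len(expected) != len(actual):
            out.append(f"{path}: length {len(expected)} != {len(actual)}")
        for i, (e, a) in enumerate(zip(expected, actual)):
            out.extend(diff(e, a, f"{path}[{i}]"))
        return out
    if expected != actual:
        return [f"{path}: expected {expected!r}, got {actual!r}"]
    return []


recorded = {"status": 200, "body": {"total": 42, "items": [1, 2]}}
replayed = {"status": 200, "body": {"total": 41, "items": [1, 2], "debug": True}}

for line in diff(recorded, replayed):
    print(line)
# $.body.debug: unexpected in replay
# $.body.total: expected 42, got 41
```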
Migration & Rewrite Validation
Record traffic from the existing service, replay against the new implementation. The new service must produce identical responses. Far more reliable than manual comparison.
Legacy System Testing
No test framework, no dependency injection, no mocks? No problem. Keploy operates at the kernel level. The service does not need to be modified or even aware it is being tested.
Load Testing with Real Traffic
Replay recorded traffic at higher concurrency. Matches actual distribution of endpoints, payload sizes, and request sequences. More realistic than synthetic load generators.
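One way to picture this: repeat the recorded request list several times and send it through a pool of workers, so the endpoint mix stays the same while concurrency rises. The sketch below assumes a hypothetical `send` function standing in for a real HTTP client and a made-up list of recorded paths.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical recorded session: request paths in the order they were seen.
recorded = ["/users", "/users", "/orders", "/users", "/search"]


def send(path: str) -> str:
    # Stand-in for issuing the recorded request against the service under test.
    return path


def replay_under_load(paths: list[str], multiplier: int, workers: int) -> list[str]:
    """Replay the recorded sequence `multiplier` times across `workers` threads."""
    workload = paths * multiplier
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, workload))


results = replay_under_load(recorded, multiplier=20, workers=8)
mix = Counter(results)
print(len(results), mix["/users"] / len(results))
```

Because the workload is a multiple of the recording, the share of each endpoint matches what was observed.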
Developer Onboarding
Recorded test cases serve as living documentation. New team members read YAML files to understand what each endpoint does, what inputs it accepts, and what responses it produces.
How Keploy Implements Record-Replay
Keploy is purpose-built for record-replay testing. Here is what makes it different from HTTP-only recording tools.
Full Dependency Graph Capture
Not just HTTP. PostgreSQL queries, MongoDB operations, Redis commands, Kafka produce calls, gRPC requests, and third-party API calls. Every interaction becomes an auto-replayed mock.
AI-Powered Noise Detection
Statistical analysis across multiple recordings identifies varying fields. Timestamps, UUIDs, and session tokens are automatically excluded from strict assertions. Adapts as schemas evolve.
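The core of this idea can be shown with a much simpler heuristic: record the same request several times and flag every field whose value changes between recordings. This is a sketch under that assumption, not Keploy's actual algorithm; `flatten` and `noisy_fields` are hypothetical names.

```python
from typing import Any


def flatten(value: Any, path: str = "$") -> dict[str, Any]:
    """Map each leaf of a nested dict to a dotted path."""
    if isinstance(value, dict):
        out: dict[str, Any] = {}
        for key, child in value.items():
            out.update(flatten(child, f"{path}.{key}"))
        return out
    return {path: value}


def noisy_fields(recordings: list[dict[str, Any]]) -> set[str]:
    """Return paths whose values differ across recordings of the same request."""
    flat = [flatten(r) for r in recordings]
    paths = set().union(*(f.keys() for f in flat))
    noisy = set()
    for p in paths:
        distinct = {repr(f.get(p)) for f in flat}
        if len(distinct) > 1:
            noisy.add(p)
    return noisy


recordings = [
    {"id": 7, "name": "Ada", "meta": {"request_id": "a1f3", "created_at": "2024-01-01T10:00:00Z"}},
    {"id": 7, "name": "Ada", "meta": {"request_id": "9c2e", "created_at": "2024-01-01T10:00:05Z"}},
    {"id": 7, "name": "Ada", "meta": {"request_id": "77b0", "created_at": "2024-01-01T10:00:09Z"}},
]

print(sorted(noisy_fields(recordings)))
# ['$.meta.created_at', '$.meta.request_id']
```

Fields in the returned set would be skipped during strict comparison, while stable fields such as `$.id` stay asserted.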
Time-Freezing
During replay, the system clock is set to the original recording timestamp. Token expiration, cache TTLs, and date-based partitioning then all produce identical results. Deterministic by design.
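To see why this matters, consider a token that was valid when recorded but has long since expired. The sketch below freezes time inside a single Python process with `unittest.mock.patch`; the timestamp and `token_is_valid` are hypothetical, and this is only an in-process analogy for what a replay tool does around the whole application.

```python
import time
from unittest.mock import patch


def token_is_valid(issued_at: float, ttl_seconds: float) -> bool:
    """Application logic that depends on the current time."""
    return time.time() < issued_at + ttl_seconds


RECORDED_AT = 1_700_000_000.0  # hypothetical moment the traffic was recorded
issued_at = RECORDED_AT - 30   # token was 30 seconds old at recording time

# During replay, the clock reports the recording timestamp.
with patch("time.time", return_value=RECORDED_AT):
    valid_during_replay = token_is_valid(issued_at, ttl_seconds=60)

# Without freezing, the same check runs against today's clock.
valid_now = token_is_valid(issued_at, ttl_seconds=60)

print(valid_during_replay, valid_now)
```

With the clock frozen the check matches the recorded behavior; without it, the replay would fail for a reason unrelated to any code change.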
YAML-Based Test Storage
Human-readable YAML files in your project directory. Version-controlled alongside code, reviewed in PRs, diffed with standard tools. No proprietary format or database.
Figure: Record (eBPF captures all traffic and stores it as YAML test cases) → Keploy (noise detection and time-freezing produce deterministic tests) → Replay (mocked dependencies, runs in seconds in CI).
Implementing Record-Replay Testing
A step-by-step approach to adopting record-replay testing with Keploy.
Identify Your Recording Environment
Start with staging for realistic traffic without data sensitivity concerns. Once data governance is mature, use production shadow capture for the most realistic test cases.

Record a Traffic Session
Start Keploy in record mode alongside your application. Execute workflows: API calls, database operations, background jobs. A 10-minute session can produce dozens to hundreds of test cases.
Review and Curate Recordings
Remove health checks and monitoring noise. Filter sensitive data. Review the automatic noise detection exclusions to ensure important fields are not accidentally skipped.
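A curation pass can be partly automated. The sketch below assumes test cases are dictionaries with a `request` holding `path` and `headers`; that shape, the skip list, and the header list are all assumptions for illustration, not Keploy's file format.

```python
import copy
from typing import Any

SKIP_PATHS = {"/health", "/metrics", "/ready"}                 # assumed noise endpoints
SENSITIVE_HEADERS = {"authorization", "cookie", "x-api-key"}   # assumed secret-bearing headers


def curate(test_cases: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop monitoring traffic and redact sensitive headers, leaving inputs untouched."""
    kept = []
    for case in test_cases:
        if case["request"]["path"] in SKIP_PATHS:
            continue
        cleaned = copy.deepcopy(case)
        headers = cleaned["request"].get("headers", {})
        for name in headers:
            if name.lower() in SENSITIVE_HEADERS:
                headers[name] = "REDACTED"
        kept.append(cleaned)
    return kept


raw = [
    {"request": {"path": "/health", "headers": {}}},
    {"request": {"path": "/orders", "headers": {"Authorization": "Bearer abc", "Accept": "application/json"}}},
]

curated = curate(raw)
print(curated)
```

Automated filtering does not replace a manual review, since sensitive values can also appear in request and response bodies.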

Replay in CI/CD
On every PR, Keploy starts your app, replays recorded requests, intercepts outbound calls, and compares responses. Failures produce detailed diffs. Tests execute in seconds.
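The replay step can be sketched in a few lines: feed the recorded request to the application, answer its outbound calls from recorded mocks, and compare the result with the recorded response. Everything here (`MockStore`, `replay`, the handlers, and the test case layout) is hypothetical and works at the function level rather than the network level.

```python
from typing import Any, Callable

# Hypothetical recorded test case with the one dependency call it needs.
test_case = {
    "request": {"user_id": 7},
    "response": {"status": 200, "body": {"user": "Ada"}},
    "mocks": [{"dependency": "db.lookup", "args": [7], "result": {"id": 7, "name": "Ada"}}],
}


class MockStore:
    """Serves recorded results in place of real dependencies."""

    def __init__(self, mocks: list[dict[str, Any]]) -> None:
        self._mocks = list(mocks)

    def call(self, dependency: str, *args: Any) -> Any:
        for i, mock in enumerate(self._mocks):
            if mock["dependency"] == dependency and mock["args"] == list(args):
                return self._mocks.pop(i)["result"]  # each mock is consumed once
        raise LookupError(f"no recorded mock for {dependency}{args}")


def handle_get_user(user_id: int, db_lookup: Callable[..., dict]) -> dict[str, Any]:
    user = db_lookup(user_id)
    return {"status": 200, "body": {"user": user["name"]}}


def handle_get_user_regressed(user_id: int, db_lookup: Callable[..., dict]) -> dict[str, Any]:
    user = db_lookup(user_id)
    return {"status": 200, "body": {"user": user["name"].upper()}}  # unintended change


def replay(case: dict[str, Any], handler: Callable[..., dict]) -> tuple[bool, dict]:
    store = MockStore(case["mocks"])
    actual = handler(case["request"]["user_id"], lambda *a: store.call("db.lookup", *a))
    return actual == case["response"], actual


passed, _ = replay(test_case, handle_get_user)
regressed_passed, regressed_actual = replay(test_case, handle_get_user_regressed)
print(passed, regressed_passed, regressed_actual)
```

A CI job would fail the PR when any replay returns a mismatch, and no real database is needed to run it.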
Maintain and Refresh
Re-record traffic periodically to capture new endpoints. Set a refresh cadence (monthly or per major release). Track flakiness rate to detect recordings that need updating.

Record-Replay Best Practices
Maximize the value of traffic-based testing with these operational guidelines.
Curate recordings carefully
Remove health checks, monitoring endpoints, and noise. Ensure PII and credentials are filtered before committing test files.
Refresh on a cadence
Re-record monthly or per major release to capture new endpoints and updated behavior. Stale recordings lead to false failures.
Version control test files
Treat YAML test files like production code. Review in PRs, track changes over time, and maintain alongside the application.
Start with staging, not production
Staging carries realistic traffic without data sensitivity concerns. Graduate to production shadow capture only with mature data governance.
Run replays on every PR
With mocked dependencies, replay tests execute in seconds. No reason to defer to nightly runs. Gate PRs on replay results.
Track flakiness rate
Monitor which recordings produce non-deterministic results. Quarantine and re-record tests above 2% flake rate rather than ignoring failures.
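The 2% threshold can be checked mechanically. In this sketch, flake rate is taken to mean the share of runs that disagree with a test's majority outcome on unchanged code, so a test that fails every time counts as a real failure rather than a flaky one. That definition and the sample history are assumptions.

```python
def flake_rate(outcomes: list[bool]) -> float:
    """Fraction of runs that disagree with the majority outcome."""
    if not outcomes:
        return 0.0
    passes = sum(outcomes)
    return min(passes, len(outcomes) - passes) / len(outcomes)


def to_quarantine(history: dict[str, list[bool]], threshold: float = 0.02) -> list[str]:
    """Names of recordings whose flake rate exceeds the threshold."""
    return sorted(name for name, runs in history.items() if flake_rate(runs) > threshold)


# Hypothetical pass/fail history per recorded test, gathered on unchanged code.
history = {
    "get-user": [True] * 100,
    "list-orders": [True] * 95 + [False] * 5,
    "create-order": [False] * 10,  # fails every time: a real failure, not a flake
}

print(to_quarantine(history))
# ['list-orders']
```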