What is Integration Testing?
Integration testing verifies that a group of components works together correctly: a service with its database, an API endpoint with its downstream calls, a microservice with its message queue. Integration tests exercise real interactions and catch a class of bug that unit tests cannot: contract mismatches at component boundaries, where many production bugs originate.
Integration tests vs unit tests
Unit tests isolate a single function or class and mock every dependency. They run in milliseconds and catch logic bugs in the code you wrote. Integration tests exercise real dependencies — a real Postgres instance, a real Redis, a real downstream HTTP service — and catch bugs in the contract between your code and those dependencies. Both are necessary, for different reasons.
| Dimension | Unit test | Integration test |
|---|---|---|
| Scope | One function or class | Multiple components |
| Dependencies | Mocked | Real (or recorded) |
| Speed | Milliseconds | Seconds |
| Catches | Logic bugs | Contract and boundary bugs |
| Count in a typical service | Hundreds to thousands | Dozens to hundreds |
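The difference in what each kind of test catches can be sketched in a few lines of Python. This is a minimal illustration, not a recommended test layout: `save_user`, the `users` table, and both test functions are hypothetical names, and an in-memory SQLite database stands in for the real Postgres instance so the example runs without any services.

```python
import sqlite3
from unittest.mock import MagicMock


def save_user(conn, name):
    # Hypothetical code under test. It contains a contract bug: the real
    # table's column is called "full_name", not "name".
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


def unit_test_with_mock():
    # The mock accepts any SQL string, so the wrong column name goes unnoticed.
    conn = MagicMock()
    save_user(conn, "Ada")
    conn.execute.assert_called_once_with(
        "INSERT INTO users (name) VALUES (?)", ("Ada",)
    )
    return "passed"


def integration_test_with_real_db():
    # A real database engine enforces the schema, so the mismatch surfaces.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL)"
    )
    try:
        save_user(conn, "Ada")
    except sqlite3.OperationalError as exc:
        return f"caught: {exc}"
    finally:
        conn.close()
    return "passed"


if __name__ == "__main__":
    print("unit test:", unit_test_with_mock())
    print("integration test:", integration_test_with_real_db())
```

The mocked test passes because it only checks that the code issued the call it was written to issue. The test against a real engine fails at the boundary, which is the contract-level bug the table attributes to integration tests.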
Integration testing tools
Three generations of tools cover different trade-offs:
- Framework-native. pytest fixtures, JUnit, or Mocha with test containers: integration tests live alongside unit tests in the same codebase. Easy to start, but every assertion has to be written by hand.
- Containerization. Testcontainers spins up real Postgres, Redis, Kafka per test run so tests hit the real thing without sharing state. Realistic but slow and resource-intensive.
- Traffic capture. Keploy, WireMock (record and playback), and Hoverfly record real API interactions and replay them as deterministic tests, with auto-generated mocks for downstream dependencies. Typically the fastest to set up and, once interactions are recorded, the fastest to run.
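The traffic-capture idea in the last bullet can be illustrated with a toy record-and-replay wrapper. This is only a sketch of the general technique under simplifying assumptions; it is not how Keploy, WireMock, or Hoverfly are implemented, and `RecordReplay`, `fake_price_service`, and the `/prices/A1` path are hypothetical names invented for the example.

```python
import json


class RecordReplay:
    """Wraps a downstream call: stores responses in "record" mode and
    serves the stored responses in "replay" mode."""

    def __init__(self, downstream, mode, store=None):
        self.downstream = downstream  # callable(method, path, body) -> response
        self.mode = mode              # "record" or "replay"
        self.store = store if store is not None else {}

    def call(self, method, path, body=None):
        # The request itself is the lookup key for the recorded response.
        key = json.dumps([method, path, body], sort_keys=True)
        if self.mode == "record":
            response = self.downstream(method, path, body)
            self.store[key] = response
            return response
        if key not in self.store:
            raise KeyError(f"no recorded interaction for {method} {path}")
        return self.store[key]


calls = []


def fake_price_service(method, path, body):
    # Stand-in for a real downstream HTTP service.
    calls.append(path)
    return {"status": 200, "body": {"sku": "A1", "price": 42}}


# Record once against the "real" dependency...
recorder = RecordReplay(fake_price_service, mode="record")
recorded = recorder.call("GET", "/prices/A1")

# ...then replay without touching the dependency at all.
replayer = RecordReplay(None, mode="replay", store=recorder.store)
replayed = replayer.call("GET", "/prices/A1")
```

In replay mode the downstream service is never contacted, which is why recorded tests can run quickly and deterministically once the capture exists.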
The non-determinism problem
The hardest problem with integration testing is non-determinism. Tests that touch real databases, wall-clock timestamps, and live distributed-tracing contexts can produce different results on every run, so assertions fail when nothing is actually broken. Teams that hit this problem tend either to (a) adopt aggressive mocking, which defeats the point of integration testing, or (b) quarantine flaky tests until the whole suite loses credibility.
Keploy solves this with a deterministic replay engine that normalizes non-deterministic fields (timestamps, UUIDs, session tokens, pagination cursors) at comparison time. A test passes as long as the replayed response has the same structural shape and the same deterministic fields as the original capture. See Noise & Secret Management for the full mechanism.
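Normalizing at comparison time can be sketched as follows. This is a simplified illustration of the general approach, not Keploy's actual engine: the `NOISY_KEYS` set, the two regular expressions, and the `<noise>` placeholder are assumptions chosen for the example.

```python
import re

# Assumed patterns for values that change on every run.
UUID_RE = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)
ISO_TS_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")

# Assumed field names that are always treated as noise.
NOISY_KEYS = {"session_token", "next_cursor"}


def normalize(value, key=None):
    """Replace non-deterministic values with a fixed placeholder."""
    if key in NOISY_KEYS:
        return "<noise>"
    if isinstance(value, dict):
        return {k: normalize(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [normalize(v) for v in value]
    if isinstance(value, str) and (UUID_RE.match(value) or ISO_TS_RE.match(value)):
        return "<noise>"
    return value


def responses_match(recorded, replayed):
    # Same structure and same deterministic fields => the test passes.
    return normalize(recorded) == normalize(replayed)


recorded = {
    "id": "3f2b8c1e-9d4a-4c6b-8e2f-1a2b3c4d5e6f",
    "created_at": "2024-01-01T10:00:00Z",
    "session_token": "abc",
    "total": 3,
}
replayed = {
    "id": "a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
    "created_at": "2024-06-02T08:30:00Z",
    "session_token": "xyz",
    "total": 3,
}
regressed = dict(replayed, total=4)

same = responses_match(recorded, replayed)        # noisy fields differ only
different = responses_match(recorded, regressed)  # a deterministic field changed
```

Here `same` is `True` because only the ID, timestamp, and token differ, while `different` is `False` because `total` is a deterministic field and its value changed.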