Contract-first API testing: How your OpenAPI spec becomes the test suite

I built a Symfony API experiment to validate one idea: What if the OpenAPI spec is not just documentation, but the source of truth for your entire test suite?
The result is a testing concept where adding a new endpoint to the OpenAPI spec (plus a small test-metadata sidecar) is enough to get schema validation, auth checks, error format verification, and guardrail tests. No hand-written test boilerplate needed.
This post explains the testing architecture, the metadata system that drives it, and how everything fits together in CI.
The full experiment is open source: symfony-use-case-driven-api on GitHub
The problem: OpenAPI specs that lie
Most teams write OpenAPI specs after the code is done. Or they generate them from annotations. Both approaches share the same weakness: the spec drifts from reality over time.
A field gets renamed in the code but not in the spec. A new error response is added but never documented. A required parameter becomes optional, but the spec still says required.
Nobody notices until a client breaks in production. Or until a frontend team builds against the spec and discovers it does not match the actual API.
The root cause is simple. The spec and the tests are separate artifacts. They evolve independently. There is no mechanism to keep them in sync.
The idea: make the spec executable
The experiment flips this around. The OpenAPI spec is written first, by hand. Then the test infrastructure reads the spec and generates test scenarios from it.
This means:
- Every documented response is validated against its JSON schema
- Every secured endpoint is tested for missing and wrong-audience auth
- Every documented error case is exercised
- Every documented example is verified against its schema
- Every route in the Symfony router must exist in the OpenAPI spec (and vice versa)
If the spec says it, the tests prove it. If the code does not match, the tests fail.
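To make "the spec generates the scenarios" concrete, here is a minimal sketch that walks a decoded OpenAPI document and lists the checks it implies. It is not code from the repository: the function name `deriveScenarios`, the scenario labels, the `storeToken` scheme name, and the sample status codes are all assumptions for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Derive a flat list of test scenarios from a decoded OpenAPI document.
 * Hypothetical helper: names and labels are illustrative only.
 *
 * @param array<string, mixed> $spec
 * @return list<string>
 */
function deriveScenarios(array $spec): array
{
    $httpMethods = ['get', 'put', 'post', 'delete', 'patch', 'head', 'options'];
    $scenarios = [];

    foreach ($spec['paths'] ?? [] as $path => $pathItem) {
        foreach ($pathItem as $method => $operation) {
            // Path items may also carry non-operation keys such as "parameters".
            if (!in_array($method, $httpMethods, true)) {
                continue;
            }
            $label = strtoupper($method) . ' ' . $path;

            // One schema check per documented response status.
            foreach (array_keys($operation['responses'] ?? []) as $status) {
                $scenarios[] = sprintf('%s -> validate %s response schema', $label, $status);
            }

            // Operation-level security overrides the document-level default.
            $security = $operation['security'] ?? $spec['security'] ?? [];
            if ([] !== $security) {
                $scenarios[] = sprintf('%s -> reject missing auth', $label);
                $scenarios[] = sprintf('%s -> reject wrong-audience auth', $label);
            }
        }
    }

    return $scenarios;
}

// Assumed sample operation; the scheme name and status codes are made up.
$spec = [
    'paths' => [
        '/api/store/carts/{cart_id}/items' => [
            'post' => [
                'security' => [['storeToken' => []]],
                'responses' => ['201' => [], '401' => [], '404' => []],
            ],
        ],
    ],
];

foreach (deriveScenarios($spec) as $scenario) {
    echo $scenario, PHP_EOL;
}
```

The point of the sketch is the direction of the dependency: the list of checks is a function of the spec, so a new operation adds checks without anyone touching test code.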

```mermaid
flowchart LR
    subgraph spec [Public Spec]
        paths[paths + schemas]
    end
    subgraph sidecar [Sidecar Metadata]
        meta[x-test-fixture]
        neg[x-test-negative]
        ex[x-test-examples]
    end
    spec --> build[build-openapi-test-spec]
    sidecar --> build
    build --> testjson[openapi.test.json]
    subgraph infra [Test Infrastructure]
        reader[OpenApiDocument]
        factory[Request Factory]
        resolver[Token Resolver]
        validator[Schema Validator]
    end
    testjson --> reader
    reader --> factory
    reader --> validator
    factory --> resolver
    resolver -->|ContractRequest| tests[PHPUnit Tests]
    tests -->|response payload| validator
    validator --> result{Schema match?}
    result -->|Yes| pass[CI green]
    result -->|No| fail[CI red]
```

The x-test metadata system
The key innovation is keeping test instructions next to the OpenAPI operations they describe, but separate from the public contract. Three custom extensions carry everything the test infrastructure needs to build executable requests. They live in sidecar files under openapi/src/test-metadata/paths/, one per path file.
A build script merges them with the public spec into openapi.test.json at build time. The test harness reads this internal artifact. The published spec that consumers and documentation tools see stays clean.
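The real build step is a Node script (build-openapi-test-spec.mjs), and I am not reproducing it here. The following PHP sketch only illustrates the merge idea for a single path file, under the assumption that a sidecar may contribute nothing but `x-test-*` keys to operations that already exist. The function name `mergeTestMetadata` and the `operationId` value are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Merge x-test-* keys from a sidecar into a copy of the public path item.
 * Hypothetical helper, not the repository's build script.
 *
 * @param array<string, array<string, mixed>> $publicPathItem  method => operation
 * @param array<string, array<string, mixed>> $sidecarPathItem method => x-test-* keys
 * @return array<string, array<string, mixed>>
 */
function mergeTestMetadata(array $publicPathItem, array $sidecarPathItem): array
{
    $merged = $publicPathItem;

    foreach ($sidecarPathItem as $method => $extensions) {
        if (!isset($merged[$method])) {
            // A sidecar entry without a public operation is most likely a typo.
            throw new InvalidArgumentException(
                sprintf('Sidecar references undocumented operation "%s".', $method)
            );
        }
        foreach ($extensions as $key => $value) {
            if (!str_starts_with((string) $key, 'x-test-')) {
                throw new InvalidArgumentException(
                    sprintf('Sidecar key "%s" is not an x-test-* extension.', $key)
                );
            }
            $merged[$method][$key] = $value;
        }
    }

    return $merged;
}

// Assumed sample data; the operationId is invented for the example.
$public = ['post' => ['operationId' => 'addCartItem', 'responses' => ['201' => []]]];
$sidecar = ['post' => ['x-test-fixture' => ['path' => ['cart_id' => 'existing_cart']]]];

$testSpec = mergeTestMetadata($public, $sidecar);
// $public is untouched, so the published bundle stays free of x-test-* keys.
```

Rejecting anything that is not an `x-test-*` key is a design choice in this sketch: it prevents a sidecar from silently changing the public contract.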

```text
openapi/src/
  paths/
    api_store_carts_{cart_id}_items.json    # public contract
  test-metadata/paths/
    api_store_carts_{cart_id}_items.json    # sidecar with x-test-*
openapi/build/
  openapi.json                              # public bundle (no x-test-*)
  openapi.test.json                         # merged bundle (public + x-test-*)
```

x-test-fixture
Tells the test factory how to build a valid request for the happy path. Each value is a token that gets resolved at runtime. The sidecar file mirrors the operation key from the public path file.

```json
{
  "post": {
    "x-test-fixture": {
      "path": {
        "cart_id": "existing_cart"
      },
      "headers": {
        "Idempotency-Key": "generated_idempotency_key"
      },
      "body": {
        "product_id": "existing_product",
        "quantity": "valid_quantity"
      }
    }
  }
}
```

The token existing_cart does not contain a literal cart ID. It tells the test infrastructure: “Create a cart first, then use its ID here.” The token existing_product means: “Look up a real product from the fixtures.” The token generated_idempotency_key means: “Generate a random hex string.”
These tokens are baked into PHP provider classes. A registry of TestValueProvider implementations resolves them at runtime. Each provider knows a fixed set of tokens and how to produce real values for them.

```mermaid
flowchart LR
    sidecar["sidecar file\ncart_id: existing_cart"] --> build["build-openapi-test-spec.mjs"]
    build --> testjson["openapi.test.json"]
    testjson --> registry[TestValueResolverRegistry]
    registry --> cart[CartTestValueProvider]
    cart -->|POST /api/store/carts| real_cart["cart_abc123"]
    real_cart --> request[ContractRequest]
```

| Provider | Tokens |
|---|---|
| CatalogTestValueProvider | existing_product, category_with_products, unique_product_name |
| CartTestValueProvider | existing_cart, unknown_cart |
| CollectionPaginationTestValueProvider | products_sort_name_second_page_cursor |
| PrimitiveTestValueProvider | valid_quantity, below_minimum, generated_idempotency_key |
Adding a new token means writing PHP code. The sidecar file references the token by name, but the resolution logic lives in the provider. A possible next step would be to move the resolution instructions into the metadata itself, for example by describing which endpoint to call and which response field to extract. That would make the system fully self-describing without any token-specific PHP code. The current experiment does not go that far.
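For orientation, a provider registry of the kind described above could look roughly like this. The class names come from the post, but the method signatures and the concrete values are my simplification for the sketch: the real resolver also receives a request context, and providers such as CartTestValueProvider need an HTTP client, which is left out here.

```php
<?php
declare(strict_types=1);

// Assumed, simplified interface: the real one also receives a context object.
interface TestValueProvider
{
    /** @return list<string> tokens this provider can resolve */
    public function tokens(): array;

    public function resolve(string $token): mixed;
}

final class PrimitiveTestValueProvider implements TestValueProvider
{
    public function tokens(): array
    {
        return ['valid_quantity', 'below_minimum', 'generated_idempotency_key'];
    }

    public function resolve(string $token): mixed
    {
        return match ($token) {
            'valid_quantity' => 1, // assumption: the schema minimum is 1
            'below_minimum' => 0,
            'generated_idempotency_key' => bin2hex(random_bytes(16)),
            default => throw new InvalidArgumentException("Unknown token: $token"),
        };
    }
}

final class TestValueResolverRegistry
{
    /** @var array<string, TestValueProvider> token => provider */
    private array $byToken = [];

    public function __construct(TestValueProvider ...$providers)
    {
        foreach ($providers as $provider) {
            foreach ($provider->tokens() as $token) {
                // Two providers claiming one token would make resolution ambiguous.
                if (isset($this->byToken[$token])) {
                    throw new LogicException("Token \"$token\" is registered twice.");
                }
                $this->byToken[$token] = $provider;
            }
        }
    }

    public function resolve(string $token): mixed
    {
        $provider = $this->byToken[$token]
            ?? throw new InvalidArgumentException("No provider for token: $token");

        return $provider->resolve($token);
    }
}

$registry = new TestValueResolverRegistry(new PrimitiveTestValueProvider());
$key = $registry->resolve('generated_idempotency_key'); // 32 hex characters
```

Failing loudly on an unknown or duplicated token is deliberate in this sketch: a typo in a sidecar file should break the build rather than produce a request with a literal token string in it.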
x-test-negative
Defines how to build requests that should fail. Each scenario maps to a specific error case.

```json
{
  "x-test-negative": {
    "not_found": {
      "cart_id": "unknown_cart"
    },
    "invalid_body": {
      "quantity": "below_minimum"
    },
    "invalid_query": {
      "sort": "invalid_enum"
    }
  }
}
```

The test infrastructure takes the happy-path request and swaps individual values. not_found replaces the cart ID with a non-existent one. invalid_body sets the quantity below the documented minimum. invalid_query provides an enum value that is not in the allowed list.
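The swap can be pictured as a small overlay on the happy-path fixture. This is a sketch under one assumption that the post does not spell out: that the scenario name decides which part of the request is overridden (not_found touches the path, invalid_body the body, invalid_query the query). The constant and function names are hypothetical.

```php
<?php
declare(strict_types=1);

// Assumed mapping from scenario name to the request part it mutates.
const NEGATIVE_LOCATIONS = [
    'not_found' => 'path',
    'invalid_body' => 'body',
    'invalid_query' => 'query',
];

/**
 * Overlay one x-test-negative scenario on the happy-path fixture tokens.
 *
 * @param array<string, array<string, string>> $fixture   x-test-fixture content
 * @param array<string, string>                $overrides field => token
 * @return array<string, array<string, string>>
 */
function applyNegativeScenario(array $fixture, string $scenario, array $overrides): array
{
    $location = NEGATIVE_LOCATIONS[$scenario]
        ?? throw new InvalidArgumentException("Unknown negative scenario: $scenario");

    // array_replace keeps untouched happy-path tokens and overrides only the named fields.
    $fixture[$location] = array_replace($fixture[$location] ?? [], $overrides);

    return $fixture;
}

$fixture = [
    'path' => ['cart_id' => 'existing_cart'],
    'headers' => ['Idempotency-Key' => 'generated_idempotency_key'],
    'body' => ['product_id' => 'existing_product', 'quantity' => 'valid_quantity'],
];

$notFound = applyNegativeScenario($fixture, 'not_found', ['cart_id' => 'unknown_cart']);
$invalidBody = applyNegativeScenario($fixture, 'invalid_body', ['quantity' => 'below_minimum']);
```

Because only the named field changes, a failing negative test points at exactly one deviation from a request that is otherwise known to succeed.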
x-test-examples
Maps documented response examples to the request parameters that produce them. This is how the test infrastructure can execute each example through HTTP and verify the response still matches the schema.

```json
{
  "x-test-examples": {
    "200": {
      "default": {
        "query": {
          "sort": "sort_name",
          "limit": "limit_one"
        }
      },
      "secondPage": {
        "query": {
          "sort": "sort_name",
          "limit": "limit_one",
          "cursor": "products_sort_name_second_page_cursor"
        }
      }
    }
  }
}
```

All three extensions live in sidecar files next to the operations they describe. The public spec stays consumer-facing. The test harness reads the merged internal artifact.
Five layers of automated tests
The test suite is organized into distinct layers. Each layer tests a different aspect of the contract. All of them read from the same OpenApiDocument class.

```mermaid
flowchart TD
    subgraph contract [Contract Tests - fully automated]
        schema["Schema Validation\n401 auth + 2xx success + response shape"]
        guardrail["Guardrail Tests\n400 headers + 404 not found + enum + body + idempotency"]
        docs["Documentation Governance\nrouter sync + operationId + examples + versioned specs"]
        examples["Example Verification\nrequest examples + response examples + HTTP execution"]
    end
    subgraph behavior [Behavior Tests]
        auto["Automated Behavior\nsort order + filters + search + pagination + limits"]
        manual["Manual Behavior\nbusiness semantics like cart merge"]
    end
    schema --> guardrail --> docs --> examples --> auto --> manual
```

Layer 1: Schema Validation
OpenApiSchemaValidationTest is the core. It iterates over every operation in the spec and runs three checks.
Every secured operation must reject missing auth:

```php
public function testEverySecuredOperationRejectsMissingAuthenticationWithDocumentedProblemSchema(): void
{
    $openApi = $this->document();

    foreach ($openApi->operations() as $definition) {
        $path = $definition['path'];
        $method = $definition['method'];

        // Public operations have no 401 case to exercise.
        if (!$openApi->requiresSecurity($path, $method)) {
            continue;
        }

        $client = self::createClient();
        $request = $this->requestFactory()->withoutAuthentication($client, $path, $method);
        $this->requestFactory()->send($client, $request);

        self::assertResponseStatusCodeSame(401);

        // The 401 body must match the documented Problem Details schema.
        $this->schemaValidator()->assertResponseMatchesSchema(
            $path, $method, '401',
            $this->decode($client->getResponse()->getContent()),
            'application/problem+json',
        );
    }
}
```

Every secured operation must reject wrong-audience auth. A store token must not work on admin endpoints. An admin token must not work on store endpoints.
Every operation must have an executable success case that matches its schema:

```php
public function testEveryDocumentedOperationHasExecutableSuccessCaseThatMatchesSchema(): void
{
    $openApi = $this->document();

    foreach ($openApi->operations() as $definition) {
        $path = $definition['path'];
        $method = $definition['method'];
        $successStatus = $openApi->successStatus($path, $method);

        // Build the happy-path request from the spec and its x-test-fixture tokens.
        $client = self::createClient();
        $request = $this->requestFactory()->success($client, $path, $method);
        $this->requestFactory()->send($client, $request);

        self::assertResponseStatusCodeSame((int) $successStatus);

        $this->schemaValidator()->assertResponseMatchesSchema(
            $path, $method, $successStatus,
            $this->decode($client->getResponse()->getContent()),
        );
    }
}
```

This one test method validates every endpoint in the entire API. Add a new route to the OpenAPI spec with x-test-fixture metadata and this test covers it.
Layer 2: Guardrail Tests
OpenApiGuardrailTest tests edge cases that every well-behaved API must handle.
| Test | What it does |
|---|---|
| Required header enforcement | Drops each required header one by one, expects 400 |
| Not-found responses | Sends unknown resource IDs, expects 404 |
| Enum validation | Sends invalid enum values, expects 400 |
| Request body validation | Sends invalid payloads, expects 400 |
| Idempotency replay | Sends the same request twice, expects identical responses |
All error responses are validated against the application/problem+json schema. This does not prove full RFC 9457 compliance on its own, but it guarantees that every error response follows a consistent Problem Details payload shape across the entire API.
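The "drops each required header one by one" row is easy to picture as a generator of request variants. A minimal sketch, with a hypothetical function name and a made-up second header (`X-Example-Version`) to show that only one header is removed per variant:

```php
<?php
declare(strict_types=1);

/**
 * Build one header set per required header, each missing exactly that header.
 * Hypothetical helper, not the repository's implementation.
 *
 * @param array<string, string> $headers  complete happy-path headers
 * @param list<string>          $required names of required headers
 * @return array<string, array<string, string>> dropped header => remaining headers
 */
function headerSetsWithOneMissing(array $headers, array $required): array
{
    $variants = [];

    foreach ($required as $name) {
        $variant = $headers;
        unset($variant[$name]);
        $variants[$name] = $variant;
    }

    return $variants;
}

// Assumed sample headers; X-Example-Version is invented for the example.
$headers = [
    'Idempotency-Key' => 'abc123',
    'X-Example-Version' => '2024-01-01',
];

$variants = headerSetsWithOneMissing($headers, ['Idempotency-Key', 'X-Example-Version']);
// Each variant would then be sent and expected to produce a 400 problem response.
```

Dropping one header at a time matters: if all required headers were removed at once, a 400 would not tell you which header the API actually enforces.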
Layer 3: Documentation Governance
OpenApiDocumentationTest ensures the spec itself is complete and consistent.

```php
// Requires: use Symfony\Component\Routing\RouterInterface;
public function testEveryContractApiRouteIsDocumentedInOpenApi(): void
{
    $router = self::getContainer()->get(RouterInterface::class);

    // "METHOD /path" for every operation in the spec.
    $documented = [];
    foreach ($this->openApi()->operations() as $definition) {
        $documented[] = strtoupper($definition['method']) . ' ' . $definition['path'];
    }

    // "METHOD /path" for every contract route registered in the router.
    $actual = [];
    foreach ($router->getRouteCollection()->all() as $route) {
        $path = $route->getPath();
        if (!str_starts_with($path, '/api/store')
            && !str_starts_with($path, '/api/admin')) {
            continue;
        }
        // Note: a route without a method restriction returns an empty list here and is skipped.
        foreach ($route->getMethods() as $method) {
            $actual[] = strtoupper($method) . ' ' . $path;
        }
    }

    sort($documented);
    sort($actual);

    self::assertSame($documented, $actual);
}
```

Both arrays are sorted before comparison so that ordering differences do not cause false failures. The test only checks membership: if a route exists in code but not in the spec, it fails. If the spec documents a route that does not exist, it fails. They must be identical.
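When this check fails, it helps to know which direction the drift went. The following sketch is an optional refinement, not something the repository is known to do: it splits the difference into "in the router but not in the spec" and "in the spec but not in the router". The function name and the sample routes are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Split route drift into its two directions.
 *
 * @param list<string> $documented "METHOD /path" entries from the spec
 * @param list<string> $actual     "METHOD /path" entries from the router
 * @return array{undocumented: list<string>, unimplemented: list<string>}
 */
function describeRouteDrift(array $documented, array $actual): array
{
    return [
        // Exists in code, missing from the spec.
        'undocumented' => array_values(array_diff($actual, $documented)),
        // Promised by the spec, missing from code.
        'unimplemented' => array_values(array_diff($documented, $actual)),
    ];
}

// Assumed sample routes for illustration.
$drift = describeRouteDrift(
    ['GET /api/store/products', 'POST /api/store/carts'],
    ['GET /api/store/products', 'DELETE /api/admin/products/{product_id}'],
);
// $drift['undocumented']  contains the DELETE route
// $drift['unimplemented'] contains the POST route
```

Both lists being empty is equivalent to the assertSame check above as long as neither input contains duplicates.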
Additional governance checks:
- Every operation must have a summary and operationId
- Every operation must have a success response schema
- Every secured operation must document a 401 response
- Every parameter must have a schema
- Every request body must have at least one example
- Frozen versioned specs must exist on disk
Layer 4: Example Verification
OpenApiExamplesTest validates that documented examples stay schema-valid.
- Every request example must match its request schema
- Every response example must match its response schema
- Every GET success example must execute through HTTP and return a schema-valid response
This catches examples that fall out of sync with the schema. It does not compare the live response to the documented example value. A stale example can still pass if it has the right shape. But it guarantees that no example in your docs violates its own schema.
Layer 5: Automated Behavior Tests
AutomatedApiBehaviorTest tests collection behavior by reading query parameter definitions from the spec.
- Sortable collections must return data in the requested order
- Category filters must return only matching items
- Search queries must find relevant products
- Paginated collections must return consistent cursor metadata
- Invalid limits must return 400
These tests discover their scenarios from the OpenAPI spec. If a collection endpoint documents a sort enum parameter, the test automatically exercises every sort option.
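Discovery of sort scenarios could work along these lines. This is a sketch with two assumptions that the post does not confirm: the query parameter is literally named `sort`, and a leading `-` in an enum value means descending order. Both function names are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Read the documented sort options from an operation's parameter list.
 *
 * @param list<array<string, mixed>> $parameters OpenAPI parameter objects
 * @return list<string>
 */
function sortOptions(array $parameters): array
{
    foreach ($parameters as $parameter) {
        if ('query' === ($parameter['in'] ?? null) && 'sort' === ($parameter['name'] ?? null)) {
            return $parameter['schema']['enum'] ?? [];
        }
    }

    return [];
}

/**
 * Check that items are ordered by the given option (assumed: "-field" = descending).
 * The comparison is a plain case-sensitive string sort.
 *
 * @param list<array<string, mixed>> $items
 */
function isSortedBy(array $items, string $sortOption): bool
{
    $descending = str_starts_with($sortOption, '-');
    $field = ltrim($sortOption, '-');

    $actual = array_column($items, $field);
    $expected = $actual;
    sort($expected, SORT_STRING);
    if ($descending) {
        $expected = array_reverse($expected);
    }

    return $actual === $expected;
}

// Assumed parameter definition and response items.
$parameters = [
    ['name' => 'limit', 'in' => 'query', 'schema' => ['type' => 'integer']],
    ['name' => 'sort', 'in' => 'query', 'schema' => ['type' => 'string', 'enum' => ['name', '-name']]],
];
$items = [['name' => 'Apple'], ['name' => 'Banana']];

foreach (sortOptions($parameters) as $option) {
    // In a real test, each option would trigger its own HTTP request first.
    printf("%s: %s\n", $option, isSortedBy($items, $option) ? 'ordered' : 'not ordered');
}
```

In a real test the items would come from the response to a request with `?sort=<option>`, so every enum value added to the spec gets its own ordering check.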
How schema validation works under the hood
The OpenApiSchemaValidator class is intentionally simple. It uses opis/json-schema for the heavy lifting.

```php
use Opis\JsonSchema\Validator;

final readonly class OpenApiSchemaValidator
{
    private Validator $validator;

    public function __construct(private OpenApiDocument $document)
    {
        $this->validator = new Validator();
    }

    public function assertResponseMatchesSchema(
        string $path, string $method, string $status,
        array $payload, string $contentType = 'application/json',
    ): void {
        // Look up the documented schema for this operation, status, and media type.
        $schema = $this->document->responseSchema($path, $method, $status, $contentType);

        // toJsonValue() is omitted in this excerpt.
        $result = $this->validator->validate(
            $this->toJsonValue($payload),
            $this->toJsonValue($schema),
        );

        if (!$result->isValid()) {
            throw new \RuntimeException(sprintf(
                '%s %s response %s does not match the OpenAPI schema.',
                strtoupper($method), $path, $status,
            ));
        }
    }
}
```

The OpenApiDocument class handles all the complexity: reading the bundled JSON, resolving $ref references recursively, and extracting schemas, parameters, and metadata. The validator just compares payloads against schemas.
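Recursive `$ref` resolution is the part that tends to look harder than it is. Below is a minimal sketch for local references (`#/...`) in a bundled document. It is not the repository's OpenApiDocument code; the function name is hypothetical, and the sketch simply refuses circular references, which a production resolver for recursive schemas would need to handle differently.

```php
<?php
declare(strict_types=1);

/**
 * Inline every local $ref in $node, looking targets up in $root.
 * Hypothetical helper; throws on circular or non-local references.
 *
 * @param array<string, mixed> $root the full bundled document
 * @param list<string>         $seen refs on the current resolution path
 */
function resolveRefs(mixed $node, array $root, array $seen = []): mixed
{
    if (!is_array($node)) {
        return $node;
    }

    if (isset($node['$ref']) && is_string($node['$ref'])) {
        $ref = $node['$ref'];
        if (in_array($ref, $seen, true)) {
            throw new RuntimeException("Circular \$ref: $ref");
        }
        if (!str_starts_with($ref, '#/')) {
            throw new RuntimeException("Only local refs are supported: $ref");
        }

        $target = $root;
        foreach (explode('/', substr($ref, 2)) as $segment) {
            // JSON Pointer escapes: ~1 is "/", ~0 is "~" (decode ~1 first).
            $segment = str_replace(['~1', '~0'], ['/', '~'], $segment);
            if (!is_array($target) || !array_key_exists($segment, $target)) {
                throw new RuntimeException("Unresolvable \$ref: $ref");
            }
            $target = $target[$segment];
        }

        // The target may itself contain further refs.
        return resolveRefs($target, $root, [...$seen, $ref]);
    }

    return array_map(
        fn (mixed $child): mixed => resolveRefs($child, $root, $seen),
        $node,
    );
}

// Assumed sample document; schema names are invented for the example.
$document = [
    'components' => ['schemas' => [
        'Money' => ['type' => 'integer'],
        'Cart' => ['type' => 'object', 'properties' => [
            'total' => ['$ref' => '#/components/schemas/Money'],
        ]],
    ]],
];

$cart = resolveRefs(['$ref' => '#/components/schemas/Cart'], $document);
// $cart['properties']['total'] is now ['type' => 'integer']
```

Inlining everything up front keeps the validator call simple, at the cost of not supporting self-referencing schemas in this form.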
The request factory
The OpenApiOperationRequestFactory is the most complex piece. It builds executable HTTP requests from the OpenAPI spec and metadata.
For a success case, it:
- Reads the audience adapter for the endpoint (store or admin)
- Sets auth headers based on the audience
- Resolves path parameters using x-test-fixture tokens
- Resolves required query parameters
- Builds the request body from the request schema and fixture tokens
- Returns a ContractRequest ready to send

```php
public function success(KernelBrowser $client, string $path, string $method): ContractRequest
{
    $context = $this->context($client, $path, $method);

    // Start from the audience's base server variables (auth headers and defaults).
    $server = $context->audience->baseServer($context);

    // Add every required header: fixture token if present, audience default otherwise.
    foreach ($this->document->parameters($path, $method) as $parameter) {
        if ('header' !== ($parameter['in'] ?? null)
            || !((bool) ($parameter['required'] ?? false))) {
            continue;
        }

        $name = (string) $parameter['name'];
        $resolver = $this->fixtureResolver($path, $method, 'headers', $name);
        $server[$this->headerServerKey($name)] = null !== $resolver
            ? (string) $this->values->resolve($resolver, $context)
            : $context->audience->defaultHeaderValue($name, $context);
    }

    // Known identifiers for the happy path; negative cases pass unknown ones instead.
    $uri = $this->resolveUri($context, unknownIdentifiers: false);
    $json = $this->buildRequestBody($path, $method);

    return new ContractRequest(strtoupper($method), $uri, $server, $json);
}
```

For negative cases, it starts with the success request and mutates it. Drop auth headers for 401 tests. Swap path IDs for 404 tests. Break the body for 400 tests.
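The factory excerpt calls a headerServerKey() helper that is not shown. The Symfony test client takes request headers as PHP server variables, so a helper of that name plausibly maps a header name to its `HTTP_*` key. The following is my guess at such a mapping, written as a standalone function, not the repository's implementation:

```php
<?php
declare(strict_types=1);

/**
 * Map an HTTP header name to the server-variable key PHP uses for it.
 * Hypothetical reconstruction of the helper named in the excerpt above.
 */
function headerServerKey(string $name): string
{
    $key = strtoupper(str_replace('-', '_', $name));

    // By CGI convention these two are exposed without the HTTP_ prefix.
    if (in_array($key, ['CONTENT_TYPE', 'CONTENT_LENGTH'], true)) {
        return $key;
    }

    return 'HTTP_' . $key;
}

$server = [
    headerServerKey('Idempotency-Key') => 'abc123',
    headerServerKey('Content-Type') => 'application/json',
];
// $server now has the keys HTTP_IDEMPOTENCY_KEY and CONTENT_TYPE
```

With this mapping in place, the 400 guardrail for a required header only has to unset one key in the server array before sending the request.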
The CI pipeline
Everything runs in a single GitHub Actions workflow. The pipeline validates the spec, the code, and the contract in sequence.

```yaml
steps:
  # OpenAPI validation
  - name: Lint OpenAPI
    run: npm run openapi:lint
  - name: Check OpenAPI structure
    run: npm run openapi:check:structure
  - name: Bundle OpenAPI JSON
    run: npm run openapi:bundle:json
  - name: Detect breaking OpenAPI changes
    run: npm run openapi:diff
  - name: Prove machine-consumability with api-gen
    run: npm run openapi:proof:api-gen
  # PHP quality
  - name: Run PHPStan
    run: composer phpstan
  - name: Run PHPUnit
    run: php bin/phpunit
```

The key steps before PHPUnit even runs:
- openapi:lint validates the spec against OpenAPI rules using Redocly CLI
- openapi:check:structure prevents duplicate schemas and inline complex schemas
- openapi:diff detects breaking changes against the committed baseline using OASDiff
- openapi:proof:api-gen proves the spec is machine-consumable by generating TypeScript types with @shopware/api-gen
The OASDiff step also runs as a PR annotation on pull requests, so reviewers see breaking changes directly in the diff.
What you get for free
When you add a new endpoint to this system, the only things you write are:
- The OpenAPI path file, plus its sidecar with x-test-fixture and x-test-negative metadata
- The controller and use-case handler
- The response DTO
You do not write:
- Auth tests (the schema validation layer handles them)
- Schema compliance tests (automatic for every operation)
- Error format tests (the guardrail layer validates all error responses)
- Documentation completeness tests (the governance layer catches missing docs)
- Example validation tests (automatic for all documented examples)
Manual behavior tests are only needed for business semantics that cannot be inferred from the spec. For example: “Adding the same product to a cart twice should merge the quantity instead of creating a second line item.” That is business logic. The spec cannot express it.
The experiment also enforces architecture rules through PHPat in the PHPStan run. Controllers never access Doctrine repositories directly. Use-case handlers do not depend on HTTP layer classes. DTOs do not reference entities. Contract tests protect the external surface. Architecture tests protect the internal structure.
Takeaways
Treat OpenAPI as a real contract. Not as documentation you update when you remember. Write the spec first. Test against it automatically.
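Keep test metadata next to the spec. The x-test-* extensions keep test instructions close to the operation they describe, in sidecar files that mirror the path files and are merged into an internal test bundle. No hand-maintained mapping between tests and endpoints. No guessing.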
Automate the boring parts. Auth checks, schema validation, error format compliance, documentation completeness. None of these need hand-written tests per endpoint.
Keep manual tests for business logic. The contract tells you the shape. The behavior tests tell you the semantics. Both are needed. But only the semantics require hand-written tests.
Run it all in CI. Lint the spec, check for breaking changes, validate schemas, run the tests. Every push. Every pull request. No exceptions.
The API surface is intentionally kept small. The depth around those endpoints is not. The goal is to prove that contract-first, use-case-driven API development produces real confidence with minimal test boilerplate.
The bigger picture
This post focused on the testing concept. But the experiment covers more ground than that. The repository also includes:
- Date-based request versioning with frozen per-version specs and backward compatibility tests
- RFC 9457 Problem Details for all error responses with a full error catalog
- Cursor pagination inspired by RFC 9865 (SCIM cursor pagination) instead of offset/limit
- Idempotency on write endpoints with replay detection
- Deprecation signaling via Deprecation, Sunset, and Link headers
- k6 load tests for all major flows and phpbench benchmarks for hot paths
- Machine-consumability proofs generating TypeScript types with @shopware/api-gen
- Architecture Decision Records documenting every major design choice
- A Shopware transfer playbook with class-by-class mapping and a phase-by-phase adoption path
The goal was never to build a small demo. The goal was to prove that contract-first, use-case-driven API development works with the same discipline across testing, versioning, documentation, performance, and architecture. At the time of writing, the 120+ source files, 40+ documentation pages, and 28 test files are the actual experiment.
The testing concept described in this post is stable. The next steps are about breadth: more use cases to test if the module structure holds up when the surface grows, write-heavy flows to stress-test idempotency, a real authentication layer, and more audiences beyond store and admin.
The repository is open source. If the testing concept or any other part of the experiment is useful for your own API work, take what you need: symfony-use-case-driven-api on GitHub
References
Standards and RFCs: OpenAPI Specification | RFC 9457 - Problem Details | RFC 9865 - Cursor Pagination | RFC 7234 - HTTP Caching
API design inspiration: Stripe API | Stripe API Versioning | Zalando API Guidelines | Microsoft API Guidelines
Tools: opis/json-schema | Redocly CLI | OASDiff | PHPat | PHPStan