AI Quality Engineering

When AI writes the code, what's left for quality to do?

Fifteen years of test architecture in insurance, finance and government — now applied to the AI layer on top: agent orchestration, review gates for LLM output, and the audit trail your regulator still expects.

See how I work → Send a note →

release-reviewer

v0.4.2

✓ No secrets in commit diff · 12 files scanned

✓ Spec-drift: 0 uncovered ACs · 23 / 23 traceable

✓ No destructive migrations · schema diff clean

! Auth code touched · flagged for review

APPROVED FOR PROD · 2.3s

Example: a review gate I run at a client. Four checks, one verdict, in CI.

15+ years · QA & test automation

Clients · Enterprise & government

Agents · In production, licensed

Open source · Templates on GitHub

★ Ambassador · Cypress.io

The approach

Four pillars beneath a software development process where quality stays in hand.

Every engagement touches all four, in a different mix. I aim for the pragmatic option: the solution that fits your organisation for the long term.

Test architecture that holds up when the team turns over.

Software testing patterns, AC/TS traceability, per-feature coverage matrix, Language-First. The foundation for any quality architecture, with Cypress or Playwright. Built so AI coding agents and the surrounding quality gates keep safeguarding quality — including after you leave.

AI coding agents in your release pipeline.

Claude Code, Cursor, Copilot. In-repo subagents, AGENTS.md, slash commands, review gates. The agents run in your repo and your CI; your source never ends up anywhere else.

Audit trail by design.

DORA, GDPR, NIS2, ISO/IEC 25010, TMMi. Every change traces back to an acceptance criterion, every gate is documented. What you do, you can defend — to an inspector or to internal audit.

Quality as a financial story.

One report your CFO and your auditor read the same way. What does a production bug cost? What does a green pipeline save? I make the hidden costs in your development process visible, and steerable.

Example from a recent engagement¹: 40h regression research per release → 6h. First green pipeline on the agentic flow: week 3.

¹ Anonymized; figures TBD (placeholder).

For a deep dive into the financial side, see QualityProfit →
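To show how an hours figure becomes a financial one, here is a minimal Python sketch. It uses the placeholder hours from the example above; the hourly rate and the number of releases per year are hypothetical inputs chosen only for illustration, and the class and function names are my own, not part of QualityProfit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegressionEffort:
    """Hours spent on regression research per release, before and after."""
    hours_before: float
    hours_after: float

    def hours_saved(self) -> float:
        return self.hours_before - self.hours_after

    def reduction_pct(self) -> float:
        return 100.0 * self.hours_saved() / self.hours_before


def yearly_saving(effort: RegressionEffort, hourly_rate: float, releases_per_year: int) -> float:
    """Hours saved per release, priced and multiplied by the release count."""
    return effort.hours_saved() * hourly_rate * releases_per_year


# Placeholder hours taken from the example in the text.
effort = RegressionEffort(hours_before=40, hours_after=6)

# Hypothetical inputs: replace with your own rate and release cadence.
saving = yearly_saving(effort, hourly_rate=100.0, releases_per_year=12)

print(f"{effort.hours_saved():.0f}h saved per release ({effort.reduction_pct():.0f}% less)")
print(f"Illustrative yearly saving: {saving:,.0f}")
```

With these assumed inputs the sketch reports 34 hours saved per release, an 85% reduction. The yearly figure is only as reliable as the rate and cadence you feed it.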

Earlier work

Where the work has been applied.

A selection. The same discipline, in different contexts — from a worldwide government rollout to a solo SaaS I build myself.

QualityProfit · Solo SaaS · 2025 — present

Solo SaaS, four agents.

Founder · Full-stack with Claude Code

A dashboard the customer deploys themselves, turning Jira, Azure DevOps, GitHub and GitLab signals into financial ROI for QA. Four in-repo subagents: release-reviewer, deploy-monitor, onboarding-smoke-tester, requirements-guard.

Python · FastAPI · Pydantic · React · Cypress · Docker · Caddy · Stripe · Claude Code

New Orange Digital Agency · 2025 — present

Their AI test stack, productized.

Architecture · Framework · Claude Code skill

Built an AI-augmented Playwright architecture, framework and reusable Claude Code skill for a Dutch digital agency. Designed to plug into any current or future client engagement rather than a single project. Codifies project structure, Page Object pattern, AC/TS traceability and per-feature coverage matrix. The agency now ships AI-augmented test suites to clients through a single Claude Code skill: productized AI-testing assets at agency scale.

Playwright · TypeScript · Next.js 16 · Turborepo · Tailwind v4 · Claude Code · Cursor · Copilot

Evides · National Utility · 2024 — present

Quality Framework rollout.

Quality Assurance Manager

TMMi-aligned Quality Framework on top of ISO/IEC 25010, embedded in delivery pipelines for a national utility. Quality maturity expressed in financial impact, with the traceability a regulator expects.

TMMi · ISO/IEC 25010 · Quality Framework

RvO · NL Government · 2024 — 2025

Language-First in gov.

Cypress + Playwright architecture

Test architecture across multiple government departments where different testing tools, specifications, scenarios and tests share one continuous human-readable layer. Presented at CypressConf 2024 — "Beyond the Battle: Empowering Test Automation with a Language-First Approach." The same Language-First approach I now extend into AI-augmented delivery.

Cypress · Playwright · TypeScript · Lerna · Artillery · Gherkin · Blueriq · GitLab · SonarQube

VGZ · Insurance · 2022 — 2024

Architect for the long run.

Test Automation Architect

Cypress + Lit Elements test architecture with Cucumber traceability, integrated into Azure DevOps. Page Object discipline and spec-to-test traceability that let the team keep the suite maintainable after I left. Every change anchored to a spec, every spec traceable to an acceptance criterion. Built to keep working without me; handed back to the team.

Cypress · Lit Elements · Cucumber · Azure DevOps

Ministry of Foreign Affairs · Government · 2021 / 2022 — 2024

Global rollout, audited.

QA Architect & Test Manager

Test management for a worldwide rollout under ISO/IEC 25010 / TMap discipline. Every change traceable, every gate documented, every decision defensible to an inspector. Earlier engagement covered Cypress, Angular and Docker on Azure DevOps.

ISO/IEC 25010 · TMap · Cypress · Angular · Docker · Azure DevOps

Open source

What I share openly.

Most of what I do at clients can't be shared. This can: two orchestration templates for Playwright and Cypress, the pieces I publish on Medium, and a Playbook landing later this year. Fork what's useful; let me know if you improve something.

GitHub · Template

orchestration-playwright-agents

Drop-in Claude Code orchestration template for Playwright E2E: master prompt as a skill, 8 specialised sub-agents, slash commands, starter e2e/ folder. Adapt it to your repo in a day.

View on GitHub →

GitHub · Template

orchestration-cypress-agents

Sister template for Cypress: master prompt as a skill, 8 sub-agents, slash commands, starter cypress/ folder. Same pattern, framework-native.

View on GitHub →

Medium · Field notes

Field notes from AI-augmented engineering

Long-form on Language-First test design, in-repo subagents and the messy reality of shipping with LLMs. Picking up cadence in Q3 2026 — see the writing roadmap.

Read on Medium →

Playbook · Coming Q3 2026

The AI-Augmented E2E Playbook

15-page PDF: Language-First architecture, AGENTS.md scaffolds, Page Object pattern with AC/TS traceability, per-feature coverage matrix. Bundles three Medium pieces into one printable artifact.

Get notified →

Four agents in production

What turns out to work in CI pipelines.

Four review gates I distilled out of client work over the last two years. They run in my own codebase and at a handful of teams. Anyone who wants to try one knows where to find me.

In production · v0.4.2

release-reviewer

Reviews every push for risk patterns: secrets in the diff, coverage thresholds, destructive migrations, touched auth code. Posts a verdict on the PR with the failing rule IDs. Runs on every commit in my own codebase.
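A minimal Python sketch of how a gate of this shape could be put together, not the release-reviewer itself. The rule IDs (SEC-001, COV-001, MIG-001, AUTH-001), the regular expressions, the 80% default threshold and the choice to flag rather than fail on auth changes are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns: an AWS-style access key, or a quoted value assigned
# to something named api_key / secret / password.
SECRET_RE = re.compile(
    r"AKIA[0-9A-Z]{16}"
    r"|(?i:api[_-]?key|secret|password)\s*[:=]\s*['\"][^'\"]{8,}['\"]"
)
DESTRUCTIVE_RE = re.compile(r"\b(?:DROP\s+(?:TABLE|COLUMN)|TRUNCATE)\b", re.IGNORECASE)


@dataclass
class Verdict:
    failed: list[str] = field(default_factory=list)   # rule IDs that block
    flagged: list[str] = field(default_factory=list)  # rule IDs that ask for human review

    @property
    def approved(self) -> bool:
        return not self.failed


def added_lines(diff: str) -> list[str]:
    """Keep only lines the diff adds, skipping the '+++ b/file' headers."""
    return [
        line[1:]
        for line in diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]


def review(diff: str, changed_files: list[str], coverage_pct: float,
           min_coverage: float = 80.0) -> Verdict:
    verdict = Verdict()
    added = "\n".join(added_lines(diff))
    if SECRET_RE.search(added):
        verdict.failed.append("SEC-001")
    if coverage_pct < min_coverage:
        verdict.failed.append("COV-001")
    if DESTRUCTIVE_RE.search(added):
        verdict.failed.append("MIG-001")
    if any("auth" in path.lower() for path in changed_files):
        verdict.flagged.append("AUTH-001")  # flagged for review, does not block
    return verdict


diff = """\
+++ b/migrations/0042_cleanup.sql
+ALTER TABLE users DROP COLUMN legacy_token;
+++ b/app/auth/session.py
+TIMEOUT_SECONDS = 900
"""
verdict = review(
    diff,
    ["migrations/0042_cleanup.sql", "app/auth/session.py"],
    coverage_pct=91.5,
)
print(verdict.approved, verdict.failed, verdict.flagged)
```

In this example the dropped column fails the migration rule and the touched auth path is flagged, so the push is not approved. Regex rules like these are coarse by design; they are a first filter, and the list of rule IDs in the verdict is what makes the outcome explainable.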

Email me about it →

In production

deploy-monitor

Verifies the container digest in production matches the released artifact, exactly. Catches the silent drift between "CI was green" and "what's actually running."
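The core of such a check can be sketched in a few lines of Python. This is an illustration under assumptions, not the deploy-monitor's code: the function names and the example registry are hypothetical, and how the running image reference is obtained from your platform is left out. The point it shows is comparing content digests rather than tags, since a tag can be re-pointed while a digest identifies the image content.

```python
import re

# A sha256 image digest: the prefix followed by 64 lowercase hex characters.
DIGEST_RE = re.compile(r"sha256:[0-9a-f]{64}")


def extract_digest(image_ref: str) -> str:
    """Pull the sha256 digest out of an image reference such as repo@sha256:..."""
    match = DIGEST_RE.search(image_ref.strip().lower())
    if match is None:
        raise ValueError(f"no sha256 digest in {image_ref!r}")
    return match.group(0)


def digests_match(released_ref: str, running_ref: str) -> bool:
    """True only when both references point at the same image content."""
    return extract_digest(released_ref) == extract_digest(running_ref)


# Hypothetical references: what CI released versus what production reports.
released = "registry.example.com/app@sha256:" + "a" * 64
running_same = "registry.example.com/app:1.4.0@sha256:" + "a" * 64
running_drift = "registry.example.com/app:1.4.0@sha256:" + "b" * 64

print(digests_match(released, running_same))
print(digests_match(released, running_drift))
```

A reference without a digest raises an error instead of passing silently, so an image deployed by tag alone cannot be reported as verified.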

Email me about it →

In production

onboarding-smoke-tester

Walks the full onboarding flow end-to-end through the real API on every release. Catches the "registration is broken in prod" class of regression before a customer does. Runs independently; opens an issue on failure.

Email me about it →

In production

requirements-guard

Reconciles the written spec against the live code on every PR. Flags drift between what was promised and what was built before it reaches an auditor or a customer. Plays nicely with spec tools like OpenSpec. The discipline the other three agents lean on.
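One simple form of this reconciliation is an ID check in both directions: acceptance criteria that no test references, and test references to criteria the spec no longer contains. The Python sketch below assumes criteria are labelled AC-1, AC-2 and so on; that ID format, the function names and the file names are hypothetical, and the real agent may work quite differently.

```python
import re
from dataclasses import dataclass

# Assumed ID convention for acceptance criteria: "AC-" followed by a number.
AC_ID_RE = re.compile(r"\bAC-\d+\b")


@dataclass(frozen=True)
class DriftReport:
    uncovered: list[str]  # in the spec, referenced by no test
    orphaned: list[str]   # referenced by a test, absent from the spec

    @property
    def clean(self) -> bool:
        return not self.uncovered and not self.orphaned


def _by_number(ids: set[str]) -> list[str]:
    """Sort AC-2 before AC-10 by comparing the numeric part."""
    return sorted(ids, key=lambda ac: int(ac.split("-")[1]))


def find_drift(spec_text: str, test_sources: dict[str, str]) -> DriftReport:
    promised = set(AC_ID_RE.findall(spec_text))
    tested: set[str] = set()
    for source in test_sources.values():
        tested.update(AC_ID_RE.findall(source))
    return DriftReport(
        uncovered=_by_number(promised - tested),
        orphaned=_by_number(tested - promised),
    )


spec = """\
AC-1 A visitor can register an account
AC-2 The visitor receives a confirmation mail
AC-3 The visitor can reset a password
"""
tests = {
    "register.cy.ts": "it('AC-1 registers an account', () => {})",
    "mail.cy.ts": "it('AC-2 sends a confirmation mail', () => {})",
    "legacy.cy.ts": "it('AC-9 exports a CSV', () => {})",
}
report = find_drift(spec, tests)
print(report.uncovered, report.orphaned, report.clean)
```

Here AC-3 has no test and AC-9 points at a criterion the spec no longer has, so the report is not clean. Matching IDs proves a link exists, not that the test actually verifies the criterion; that judgment still needs a reviewer.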

Email me about it →

An agent isn't plug-and-play. First a short working session to see if it fits your repo — and if it clicks, a focused two-to-four-week integration. Every rule is tunable per repo; a gate never blocks without an audited override path. Wondering if one of these would suit your team? Just drop me a note; we'll look at it together.
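As an illustration of what an audited override path can mean in practice, here is a small Python sketch. It assumes a failed rule may only be waived by an override that names an approver and gives a reason, and that every decision, waived or blocked, lands in a log. The field names, rule ID and in-memory list used as the log are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Override:
    rule_id: str
    approver: str
    reason: str


def decide(failed_rules: list[str], overrides: list[Override], audit_log: list[dict]) -> bool:
    """Return True when the gate may pass; record every decision in audit_log."""
    # An override without an approver or a reason is ignored.
    valid = {o.rule_id: o for o in overrides if o.approver.strip() and o.reason.strip()}
    passed = True
    for rule_id in failed_rules:
        override = valid.get(rule_id)
        audit_log.append({
            "rule_id": rule_id,
            "at": datetime.now(timezone.utc).isoformat(),
            "outcome": "overridden" if override else "blocked",
            "approver": override.approver if override else None,
            "reason": override.reason if override else None,
        })
        passed = passed and override is not None
    return passed


log: list[dict] = []
allowed = decide(
    ["MIG-001"],
    [Override("MIG-001", approver="j.doe", reason="Column unused since the last release")],
    log,
)
print(allowed, log[0]["outcome"], log[0]["approver"])
```

In a real pipeline the log would go to append-only storage rather than a Python list, and the approver would be checked against who is actually allowed to sign off.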

The context

Three sectors, one recurring conversation.

The domains I've worked in for fifteen years: insurance, financial services and government. The common question — from auditor, regulator, internal audit — is how AI-augmented delivery stays explainable to someone who doesn't read code.

Insurance & financial services

DORA is here. So is DNB.

Insurers, banks, payment platforms, asset managers. DORA, GDPR, NIS2, internal audit and third-party ICT risk — plus the regulators behind them. For Dutch insurers: DNB and AFM oversight, Wft implications, Solvency II reporting systems, IFRS 17 reconciliation pipelines. The regulator isn't asking whether you use AI any more — they're about to ask how you control it.

Government & public sector

Auditable at delivery, by default.

Ministries, public-service implementers, government IT bodies. Algoritmeregister, AVG, BIO, NPR 5326, EU AI Act. AI-assisted delivery that stands up under both an inspector and a change of administration — privacy and data residency answered by architecture, not paperwork. The discipline I built at RvO and the Ministry of Foreign Affairs.

Engineering & QA leadership

Speed in, traceability out.

CTOs, VPs of Engineering, Heads of QA in regulated organisations. Claude Code, Cursor and Copilot deliver speed; your audit committee wants the evidence behind it. That bridge — from velocity to a defensible release pipeline — is what I build.

Where it doesn't fit

Honest here beats discovered in week six.

A generic AI vendor with no story toward regulation, a one-off Cypress audit with no architecture beneath it, or a pure consumer context where "move fast, break things" still leads — that's a different trade. Better to be honest here than to discover it in week six.

Trust · Continuity · Data residency

The three questions your CISO, DPO and auditor ask first.

Honest answers, named risks. The Trust & Data pack — sub-processor list, DPA, regional data-flow diagram, continuity arrangements, security questionnaire — goes to your procurement, DPO and internal audit before the first POC.

Where does your code go?

Inside your repo. Inside your CI.

The agents run inside your repository and your CI runners — no proprietary cloud holds your source. LLM access goes through your existing Claude Code, Cursor or Copilot enterprise tenant: your region, your DPA, your training opt-out. Sub-processor list, regional flow diagram and DPA highlights ship with the engagement pack.

Continuity

One architect, fully covered.

Runbooks, a named backup contractor and source-code escrow: scoped per engagement and signed before kick-off. This is not small print; it is arranged before the conversation starts. The Trust & Data pack carries the specifics for your contract shape.

For procurement, DPO & audit

DORA / GDPR / Wft — on one page.

KvK-registered company, standard DPA, sub-processor list, security questionnaire (CAIQ-lite), and — available separately — a Dutch-language one-pager covering DORA, GDPR and Wft implications. For procurement, your DPO and internal audit.

Request Trust & Data pack → Request the Dutch-language one-pager (DORA / GDPR / Wft) →

Tech stack

What I bring into your repo.

Pragmatic, opinionated, chosen for AI extension — not novelty.

AI / Agents

Claude Code · Custom subagents · Hooks · Prompt engineering · AGENTS.md / SKILL.md · Cursor · GitHub Copilot

Testing

Cypress.io · Playwright · Jest · Cucumber / Gherkin · Postman · Artillery · axe-core

Frontend

TypeScript · React · Next.js · Vue · Angular · Lit · Tailwind · Turborepo

Backend

Python · FastAPI · Pydantic · Java · Node · REST · GraphQL

DevOps

Docker · GitHub Actions · GitLab CI · Azure DevOps · SonarQube

Quality

TMMi · ISO/IEC 25010 · NPR 5326 · TMap · DTAP · CI/CD · Page Object pattern

Integrations

Jira · GitHub · GitLab · Azure DevOps · Blueriq · Sitecore · Stripe

Also experienced with

Selenium · TestNG · JMeter · Jenkins · TeamCity · Lerna · Hibernate / JPA · Caddy · AWS Cognito

Career timeline

15+ years across enterprise & government.

A selection — earlier roles span ING, SBB, Ministry of Foreign Affairs, ZLM, KPN and lecturing at The Hague University of Applied Sciences.

2024 — 2025

RvO (NL Government)

Quality Assurance Manager

2024 — present

Evides

Quality Assurance Manager

2025 — present

QualityProfit

Founder · Solo SaaS

2025 — present

New Orange Digital Agency

AI test stack productized

2022 — 2024

VGZ

Test Automation Architect

2022 — 2024

Ministry of Foreign Affairs

Test Manager

2022 — 2023

Aon

Quality Automation Architect

2021

Ministry of Foreign Affairs

QA Architect

2021

Test Automation Specialist

2020

Harlem Next · Nederlandse Transplantatie Stichting

Test Automation Specialist

2019 — 2020

Aon

Quality Assurance Manager

2018 — 2019

ING

Test Automation Specialist

Writing

Pragmatic guides on Cypress, testing & automation.

On Medium since 2020, with 10+ deep-dives on Cypress patterns, ROI for testing, and test strategy. New AI-augmented engineering pieces landing on this site through 2026.

Jul 12, 2023 · Cypress

How to test multiple tabs in Cypress

Pro-tip: stub the window object. A practical walkthrough for the multi-tab problem Cypress users hit constantly.

Feb 16, 2023 · Cypress

How to test multiple domains in Cypress

Cross-origin testing is finally here. What changed, what to watch for, and how to migrate your auth flows.

Jan 19, 2023 · Strategy

Unit vs. Component Testing — what's the difference?

And why you should care. Where the lines are, why teams confuse them, and how to pick the right tool for the assertion.

Read everything on Medium →

Speaking & community

On stage, on a podcast, in your team's Slack.

Cypress.io Ambassador, conference speaker, certified didactical trainer.

★

Cypress Ambassador

Active community work

◆

Conference Speaker

CypressConf 2024 + 2025 workshops

▲

Certified Trainer

Software testing & QA · Post-HBO didactical

●

Tech Writer

medium.com/@pdewitt

Talks on YouTube

Recorded talks and workshops.

2025 · Workshop · with Frits van der Sloot

Cypress: The Bad Practices Workshop

Hands-on tour of the Cypress anti-patterns we keep meeting in real codebases — and how to refactor out of them. Co-presented with Frits van der Sloot.

Watch on YouTube →

CypressConf 2024 · Conference talk

Beyond the Battle: Empowering Test Automation with a Language-First Approach

How specs, scenarios and tests can share one continuous human-readable layer — and why that shape makes AI extension tractable.

Watch on YouTube →

2024 · Conference talk

Effective Test Automation Design

The architecture decisions that make a test suite outlive the team that wrote it — Page Objects, traceability, and the discipline behind a pyramid that holds.

Watch on YouTube →

Also built · Solo SaaS

Invisible costs, now visible.

QualityProfit is my solo SaaS that hands the financial language back to QA. It tells the same story as the fourth pillar, in product form: Jira, Azure DevOps, GitHub and GitLab signals turned into numbers your CFO and your auditor can both work with. The four agents above run inside it today.

One report your CFO and your auditor read.
Correlation engine: bugs traced to their release fingerprint.
Customer-deployed: your data, your infra.

See qualityprofit.io →

Working session

An hour, your release pipeline, and honest questions.

No sales call. Write me a few lines about your team — that's enough — and I'll send a short agenda back. If it fits, we go further. If it doesn't, I'll say so honestly.

Half-day review

One pipeline, one verdict. Built to see fast whether the work lands.

Fixed track

Two to four weeks. Scope set up front, delivery includes runbooks.

Retainer

Fixed monthly capacity for teams evolving the architecture across quarters.

Send a note → paul@happytesting.io