How Much Do Flaky Tests Actually Cost?
Spoiler: it’s not just CI minutes. The real number will make your engineering manager wince.
When teams talk about the cost of flaky tests, they usually start with CI minutes. That’s the visible part: the line item on your GitHub bill. But CI compute is a tiny slice of the real cost, about 1% or less in the example below. The rest is human time, delayed shipping, and the slow erosion of engineering culture.
Let’s break it down with real numbers.
Layer 1: CI compute
This is the easy math. Every time a flaky test causes a re-run, you’re paying for the same CI job twice.
| Metric | Example team |
|---|---|
| Average CI run duration | 12 minutes |
| Flaky-caused re-runs per week | 40 |
| Wasted CI minutes per week | 480 minutes |
| GitHub Actions cost per minute | $0.008 |
| Monthly CI waste (480 min × $0.008 × ~4 weeks) | ~$15/month |
About $15 a month? That’s nothing, right? That’s the trap. CI compute is cheap enough that nobody escalates it. But it’s the tip of the iceberg.
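The arithmetic behind the table can be reproduced in a few lines. This is a minimal sketch, not a billing tool: the function name, the four-weeks-per-month convention, and the `jobs_per_run` parameter (how many separately billed jobs each re-run triggers) are assumptions added for illustration.

```python
def monthly_ci_waste(run_minutes, reruns_per_week, cost_per_minute,
                     jobs_per_run=1, weeks_per_month=4):
    """Return (wasted CI minutes per week, wasted dollars per month).

    jobs_per_run and weeks_per_month are illustrative assumptions,
    not values taken from a real invoice.
    """
    wasted_minutes = run_minutes * reruns_per_week * jobs_per_run
    monthly_cost = wasted_minutes * cost_per_minute * weeks_per_month
    return wasted_minutes, monthly_cost


# Inputs from the example team: 12-minute runs, 40 re-runs a week, $0.008/minute.
minutes, dollars = monthly_ci_waste(12, 40, 0.008)
print(f"{minutes} wasted minutes/week, ${dollars:.2f}/month")
```

The result scales linearly with billed minutes, so a workflow that fans out into several parallel jobs, or that runs on pricier runners, multiplies the figure. For instance, `monthly_ci_waste(12, 40, 0.008, jobs_per_run=4)` comes out near $61 a month. Either way, the compute bill stays small.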
Layer 2: Developer time
This is where the real money goes. Every flaky failure triggers a human response:
- Developer sees red CI badge on their PR
- Opens CI logs, scrolls through output
- Tries to figure out if the failure is real or flaky
- Decides to re-run (or asks a teammate)
- Waits for the re-run to finish
- Resumes their previous work — but the context switch already happened
An often-cited finding from research on workplace interruptions is that it takes around 23 minutes to regain deep focus afterward. So even if the investigation itself takes only 5 minutes, the true cost per interruption is closer to 30 minutes of productive time: about 5 minutes of digging plus roughly 23 minutes of recovery.
| Metric | Example team |
|---|---|
| Flaky interruptions per week | 40 |
| Context-switch cost per interruption | 30 min |
| Total developer hours lost per week | 20 hours |
| Average fully-loaded eng cost | $85/hour |
| Monthly developer time waste | ~$6,800/month |
That’s over 100x the CI compute cost. And this is for a modest team with a moderate flaky test problem. A team of 30 engineers with a bad flaky test culture can easily burn $20,000+/month in lost productivity.
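To plug in your own numbers, the developer-time table reduces to one short function. This is a sketch under the same assumptions as the table (30 minutes lost per interruption, four weeks per month); the function and parameter names are made up for illustration.

```python
def monthly_developer_waste(interruptions_per_week, minutes_per_interruption,
                            hourly_cost, weeks_per_month=4):
    """Return (developer hours lost per week, dollars lost per month)."""
    hours_per_week = interruptions_per_week * minutes_per_interruption / 60
    monthly_cost = hours_per_week * hourly_cost * weeks_per_month
    return hours_per_week, monthly_cost


# Example team: 40 interruptions a week, 30 minutes each, $85/hour fully loaded.
hours, dollars = monthly_developer_waste(40, 30, 85)
print(f"{hours:.0f} hours/week, ${dollars:,.0f}/month")
# prints "20 hours/week, $6,800/month"
```

The per-interruption minutes are the most sensitive input: halving the assumed context-switch cost halves the total, so it is worth sanity-checking that number against how your own team actually handles a red build.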
Layer 3: Shipping velocity
Flaky tests don’t just waste time — they slow down how fast you ship.
- PRs stay open longer. A PR that gets a flaky red build sits in review limbo. The author re-runs, waits, and the reviewer has moved on to something else. Round-trip time expands from hours to days.
- Merge conflicts compound. Longer PR lifetimes mean more merge conflicts. Each conflict is another context switch, another re-run, another delay.
- Deploys batch up. When teams can’t merge quickly, changes pile up into larger, riskier deploys. The opposite of continuous delivery.
This is the hardest cost to quantify but often the most painful. Your competitors ship daily while your team loses hours every week to CI noise.
Layer 4: Trust erosion
This is the most dangerous cost because it’s invisible until it’s catastrophic.
When tests are unreliable, developers develop a reflex: “It’s probably just flaky.” This is rational behavior given unreliable signals. But it means real failures get ignored too.
The progression looks like this:
The total picture
| Cost layer | Monthly cost | Visibility |
|---|---|---|
| CI compute | ~$15 | On your bill |
| Developer time | $6,800 | Hidden |
| Shipping velocity | $??? | Invisible |
| Trust erosion | $??? | Invisible until incident |
| Total | $7,000 – $25,000+/month | |
The irony: the cost that shows up on your bill (CI minutes) is the smallest component. The costs that don’t show up anywhere — developer time, delayed shipping, trust — are 100x larger.
What can you actually do about it?
Step one is visibility. You can’t fix what you can’t see. Most teams have no idea how many flaky tests they have, which ones are the worst, or what they cost.
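One common heuristic for getting that visibility is to flag any test that both passed and failed on the same commit, since the code did not change between attempts. The sketch below assumes you can export CI history as `(commit_sha, test_name, passed)` tuples; that record shape, the function name, and the sample data are all hypothetical, and this is not a description of how any particular tool works.

```python
from collections import defaultdict


def find_flaky_tests(results):
    """Rank tests by failures on commits where the same test also passed.

    results: iterable of (commit_sha, test_name, passed) tuples
    (a hypothetical export format, assumed for illustration).
    """
    outcomes = defaultdict(lambda: [0, 0])  # (sha, test) -> [passes, failures]
    for sha, test, passed in results:
        outcomes[(sha, test)][0 if passed else 1] += 1

    flaky_failures = defaultdict(int)
    for (sha, test), (passes, failures) in outcomes.items():
        # Mixed outcomes on identical code suggest flakiness, not a real bug.
        if passes and failures:
            flaky_failures[test] += failures

    # Worst offenders first; ties broken alphabetically for stable output.
    return sorted(flaky_failures.items(), key=lambda item: (-item[1], item[0]))


history = [
    ("abc123", "test_checkout", False),
    ("abc123", "test_checkout", True),
    ("abc123", "test_login", True),
    ("def456", "test_checkout", False),
    ("def456", "test_checkout", False),
    ("def456", "test_checkout", True),
    ("def456", "test_search", False),  # fails every time: likely a real bug
    ("def456", "test_search", False),
    ("789fed", "test_upload", False),
    ("789fed", "test_upload", True),
]
print(find_flaky_tests(history))
# prints [('test_checkout', 3), ('test_upload', 1)]
```

Multiplying each test’s flaky-failure count by your per-interruption cost gives a rough dollar ranking. The heuristic only catches flakiness that someone actually re-ran, so treat the counts as a lower bound.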
That’s the gap Kleore fills. It connects to your GitHub repos, analyzes your CI run history, and gives you a ranked list of every flaky test — with dollar costs attached. No configuration, no test framework changes, no new CLI tools. Just the data you need to start making decisions.
See your real CI waste in two minutes.
Install the Kleore GitHub App and get a dollar-cost breakdown of every flaky test in your repos. Free to start. No credit card required.
Further reading
- What Are Flaky Tests? — A primer on what causes test flakiness.
- How to Fix Flaky Tests in GitHub Actions — Practical fixes for the most common root causes.
- Flaky Test Cost Calculator — Plug in your numbers and see the real cost.