What permissions does the Kleore GitHub App need?

Kleore starts with the bare minimum: read-only access to Actions workflow runs. It never sees your source code. When you enable optional features like PR comments, it asks for just the additional permission needed — you approve each one.

Do I need to change my CI workflows?

Not for the initial report. The zero-config scan works from your existing GitHub Actions data. To see individual flaky tests, you add one step to upload JUnit XML results — a 2-line YAML change.

How is the CI cost number calculated?

Each flaky rerun costs approximately 30 minutes (20 min rerun wait + 10 min context switch) at $75/hr fully-loaded engineer rate. These are conservative defaults you can customize for your team.
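To make that arithmetic concrete, here is a minimal sketch of the per-rerun estimate. The function and parameter names are hypothetical (they are not Kleore's API); only the default values come from the answer above.

```python
def rerun_cost_usd(rerun_wait_min=20.0, context_switch_min=10.0, hourly_rate_usd=75.0):
    """Estimated human cost of one flaky rerun, in dollars.

    Defaults mirror the figures quoted above: 20 min rerun wait,
    10 min context switch, $75/hr fully-loaded engineer rate.
    """
    lost_hours = (rerun_wait_min + context_switch_min) / 60.0
    return lost_hours * hourly_rate_usd


print(rerun_cost_usd())                       # 37.5
print(rerun_cost_usd(hourly_rate_usd=100.0))  # 50.0
```

With the defaults, each flaky rerun comes out at $37.50. Changing any of the three inputs is how a team would adapt the estimate to its own numbers.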

Can I use Kleore on a private GitHub repo?

Yes. The free tier works on any repo you install the app on, public or private. Your data stays private and is never shared.

What makes Kleore different from other CI tools?

Most CI tools show you pass/fail. Kleore shows you cost. It translates flaky tests into dollar amounts so you can prioritize fixes and get budget approval. The shareable report makes the problem impossible to ignore.

How Much Do Flaky Tests Actually Cost?

Spoiler: it’s not just CI minutes. The real number will make your engineering manager wince.

March 21, 2026 · 10 min read

When teams talk about the cost of flaky tests, they usually start with CI minutes. That’s the visible part: the line item on your GitHub bill. But CI compute is a small fraction of the real cost, roughly 1% in the example below. The rest is human time, delayed shipping, and the slow erosion of engineering culture.

Let’s break it down with real numbers.

Layer 1: CI compute

This is the easy math. Every time a flaky test causes a re-run, you’re paying for the same CI job twice.

Metric	Example team
Average CI run duration	12 minutes
Flaky-caused re-runs per week	40
Wasted CI minutes per week	480 minutes
GitHub Actions cost per minute	$0.008
Monthly CI waste	~$60/month

$60 a month? That’s nothing, right? That’s the trap. CI compute is cheap enough that nobody escalates it. But it’s the tip of the iceberg.
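If you want to run the Layer 1 math on your own numbers, a rough sketch looks like this. The function name, the `jobs_per_run` parameter, and the four-weeks-per-month conversion are assumptions for illustration; the conversion is the one the developer-time table later in this article implies. Plugging in the table's inputs as-is gives about $15/month, so the ~$60 figure presumably reflects factors the table does not list (for example several billed jobs per run, or larger runners). That is a guess, not something the table states.

```python
def monthly_ci_waste_usd(run_minutes, reruns_per_week, usd_per_minute,
                         jobs_per_run=1, weeks_per_month=4):
    """CI dollars spent per month on re-runs caused by flaky tests.

    jobs_per_run and weeks_per_month are illustrative assumptions,
    not values taken from the table.
    """
    wasted_minutes_per_week = run_minutes * reruns_per_week * jobs_per_run
    return wasted_minutes_per_week * usd_per_minute * weeks_per_month


# Table inputs: 12-minute runs, 40 flaky re-runs per week, $0.008 per minute.
print(round(monthly_ci_waste_usd(12, 40, 0.008), 2))  # 15.36
```

Either way, the conclusion holds: the compute bill is small enough that nobody escalates it.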

Layer 2: Developer time

This is where the real money goes. Every flaky failure triggers a human response:

1. Developer sees red CI badge on their PR
2. Opens CI logs, scrolls through output
3. Tries to figure out if the failure is real or flaky
4. Decides to re-run (or asks a teammate)
5. Waits for the re-run to finish
6. Resumes their previous work, but the context switch has already happened

Research on context switching shows it takes an average of 23 minutes to regain deep focus after an interruption. Even if the investigation itself takes only 5 minutes, the true cost per interruption is closer to 30 minutes of productive time.

Metric	Example team
Flaky interruptions per week	40
Context-switch cost per interruption	30 min
Total developer hours lost per week	20 hours
Average fully-loaded eng cost	$85/hour
Monthly developer time waste	~$6,800/month

That’s over 100x the CI compute cost. And this is for a modest team with a moderate flaky test problem. A team of 30 engineers with a bad flaky test culture can easily burn $20,000+/month in lost productivity.
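The same calculation as a small sketch, so you can swap in your own team's numbers. The function name is hypothetical, and the four-weeks-per-month conversion is an assumption chosen because it reproduces the table's monthly figure.

```python
def monthly_developer_waste_usd(interruptions_per_week, minutes_per_interruption,
                                hourly_rate_usd, weeks_per_month=4):
    """Dollar value of developer time lost to flaky interruptions per month."""
    hours_per_week = interruptions_per_week * minutes_per_interruption / 60
    return hours_per_week * hourly_rate_usd * weeks_per_month


# Table inputs: 40 interruptions per week, 30 minutes each, $85/hour.
developer_waste = monthly_developer_waste_usd(40, 30, 85)
print(developer_waste)  # 6800.0

# Ratio against the ~$60/month CI compute figure from Layer 1.
print(round(developer_waste / 60))  # 113
```

The ratio of roughly 113 is where the "over 100x" comparison comes from.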

Layer 3: Shipping velocity

Flaky tests don’t just waste time — they slow down how fast you ship.

● PRs stay open longer. A PR that gets a flaky red build sits in review limbo. The author re-runs, waits, and the reviewer has moved on to something else. Round-trip time expands from hours to days.
● Merge conflicts compound. Longer PR lifetimes mean more merge conflicts. Each conflict is another context switch, another re-run, another delay.
● Deploys batch up. When teams can’t merge quickly, changes pile up into larger, riskier deploys. The opposite of continuous delivery.

This is the hardest cost to quantify but often the most painful. Your competitors ship daily while your team loses hours every week to CI noise.

Layer 4: Trust erosion

This is the most dangerous cost because it’s invisible until it’s catastrophic.

When tests are unreliable, developers develop a reflex: “It’s probably just flaky.” This is rational behavior given unreliable signals. But it means real failures get ignored too.

The progression looks like this:

Phase 1: Team re-runs flaky tests and reports them in Slack

Phase 2: Team re-runs without reporting; it's just background noise

Phase 3: Team merges with red CI, assuming flakiness

Phase 4: A real bug slips through. "We thought it was flaky."

Phase 5: Production incident. Post-mortem identifies eroded CI trust as root cause.

The total picture

Cost layer	Monthly cost	Visibility
CI compute	$60	On your bill
Developer time	$6,800	Hidden
Shipping velocity	$???	Invisible
Trust erosion	$???	Invisible until incident
Total	$7,000 – $25,000+/month

The irony: the cost that shows up on your bill (CI minutes) is the smallest component. The costs that don’t show up anywhere are far larger: developer time alone is over 100x the CI bill, and that is before counting delayed shipping and lost trust.

What can you actually do about it?

Step one is visibility. You can’t fix what you can’t see. Most teams have no idea how many flaky tests they have, which ones are the worst, or what they cost.

That’s the gap Kleore fills. It connects to your GitHub repos, analyzes your CI run history, and gives you a ranked list of every flaky test — with dollar costs attached. No configuration, no test framework changes, no new CLI tools. Just the data you need to start making decisions.
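To give a feel for what "ranked list with dollar costs" means, here is a minimal sketch of one common heuristic: a test counts as flaky on a commit if it both failed and passed on that same commit. This is an illustration under stated assumptions, not Kleore's actual method. The function name and the input shape (tuples of commit SHA, test name, pass/fail) are hypothetical, and the $37.50 default is the per-rerun estimate of 30 minutes at $75/hr.

```python
from collections import defaultdict


def rank_flaky_tests(results, cost_per_rerun_usd=37.5):
    """Rank tests by how often they flip between fail and pass on one commit.

    results: iterable of (commit_sha, test_name, passed) tuples.
    Returns a list of (test_name, flaky_commit_count, estimated_cost_usd),
    most expensive first.
    """
    # Collect every outcome seen for each (test, commit) pair.
    outcomes = defaultdict(set)
    for sha, test, passed in results:
        outcomes[(test, sha)].add(bool(passed))

    # Same code, different outcomes: count that commit as a flaky occurrence.
    flaky_commits = defaultdict(int)
    for (test, _sha), seen in outcomes.items():
        if seen == {True, False}:
            flaky_commits[test] += 1

    ranked = sorted(flaky_commits.items(), key=lambda item: (-item[1], item[0]))
    return [(test, count, count * cost_per_rerun_usd) for test, count in ranked]


history = [
    ("a1", "test_login", False), ("a1", "test_login", True),
    ("b2", "test_login", False), ("b2", "test_login", True),
    ("a1", "test_upload", False), ("a1", "test_upload", True),
    ("c3", "test_parse", False),  # failed and never passed: treated as a real failure
]
print(rank_flaky_tests(history))
# [('test_login', 2, 75.0), ('test_upload', 1, 37.5)]
```

A test that only ever fails on a commit is left out of the ranking, since that looks like a genuine failure rather than flakiness.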

See your real CI waste in two minutes.

Install the Kleore GitHub App and get a dollar-cost breakdown of every flaky test in your repos. Free to start. No credit card required.

Stop guessing.
Start measuring.
