QualityPilot
Free for open source

Failed tests get a proposed fix. As a PR.

QualityPilot watches your CI runs, infers the production code under test, and opens a GitHub PR with the fix + reasoning + confidence score. You merge — or you don't. No auto-merge, no fluff.

→ public repos only · no signup needed · ~10 seconds

Try:
qlens.dev/dashboard
B · 78/100
Pass Rate 96%
Stability 72%
Speed 88%

Watch it fix a test in 30 seconds

From red CI to merged PR. No code, no signup, no demo call.

  1. Test fails
    $ npm test
    RUNS auth.test.ts
    FAIL auth.test.ts
    ✗ rejects expired tokens
    Tests: 1 failed, 24 passed
  2. CI sends to QualityPilot
    GitHub Actions
    QualityPilot
  3. AI analyzes
    Thinking…
    • Reading test src/lib/auth.test.ts
    • Reading production src/lib/auth.ts
    • Drafting minimal patch
    • Opening pull request
  4. PR opens
    fix(auth): reject expired tokens
    confidence 92%
    auth.ts
    + if (token.exp < Date.now()) {
    +   return null;
    + }
  5. You merge
    Merged via squash
    CI passing — 25 of 25 tests
    You saved ~30 minutes of debugging.
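
For readers who want to see the patched check in context, here is a minimal sketch of how the function in step 4 might look after the fix. The function name `verifyToken`, the token shape, and the injectable `now` parameter are assumptions made for illustration; they are not taken from a real repository. The sketch also assumes `exp` is stored in epoch milliseconds so that it can be compared with `Date.now()` directly. If `exp` follows the JWT convention of epoch seconds, the comparison would need `Math.floor(Date.now() / 1000)` instead.

```javascript
// Hypothetical token shape: { sub: string, exp: number }, exp in epoch milliseconds.
function verifyToken(token, now = Date.now()) {
  // The patched guard: a token whose expiry is in the past is rejected.
  if (token.exp < now) {
    return null;
  }
  return { userId: token.sub };
}

// Passing `now` explicitly keeps the example deterministic.
const now = 1_700_000_000_000;
console.log(verifyToken({ sub: "u1", exp: now - 1 }, now));      // null
console.log(verifyToken({ sub: "u1", exp: now + 60_000 }, now)); // { userId: 'u1' }
```

Injecting the clock as a parameter is a design choice for the sketch: it lets a test like "rejects expired tokens" run without depending on the wall clock.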

Everything you need to fix your test suite

Scan any public repo for free. Connect a CI reporter for the AI Bug Detective + trends.

killer feature

AI Bug Detective

Failed test → LLM proposes a fix → opens a GitHub PR with reasoning + confidence score. Within an hour. You review and merge — or close. Never auto-merges.

Health Score A-F

Weighted scoring across pass rate, stability, speed, and coverage. Instant grade for your test suite.
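
To make "weighted scoring" concrete, here is a rough sketch of how four factors could be combined into a score and a letter grade. The weights and grade cutoffs below are assumptions chosen for illustration, not QualityPilot's actual values.

```javascript
// Hypothetical weights in percentage points; they sum to 100.
const WEIGHTS = { passRate: 40, stability: 30, speed: 20, coverage: 10 };

// Each factor is expected on a 0-100 scale.
function healthScore(factors) {
  let weighted = 0;
  for (const [name, weight] of Object.entries(WEIGHTS)) {
    weighted += weight * factors[name];
  }
  return Math.round(weighted / 100);
}

// Hypothetical cutoffs for the A-F scale.
function grade(score) {
  if (score >= 90) return "A";
  if (score >= 75) return "B";
  if (score >= 60) return "C";
  if (score >= 45) return "D";
  return "F";
}

const score = healthScore({ passRate: 90, stability: 80, speed: 70, coverage: 60 });
console.log(score, grade(score)); // 80 B
```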

Flaky Detection

11 risk patterns, each scored 1-10, including timing bombs, external dependencies, animation waits, and concurrency issues.
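
As an illustration of pattern-based flaky detection, the sketch below scans a test file's source for a few risky constructs and reports each hit with a risk score. The four patterns, their names, and their scores are hypothetical stand-ins, not the product's real rule set.

```javascript
// Hypothetical risk patterns, each with an assumed 1-10 risk score.
const RISK_PATTERNS = [
  { name: "fixed sleep", risk: 8, regex: /\bsetTimeout\s*\(|\bsleep\s*\(/ },
  { name: "wall-clock time", risk: 6, regex: /\bDate\.now\s*\(/ },
  { name: "network call", risk: 7, regex: /\bfetch\s*\(|\baxios\./ },
  { name: "randomness", risk: 5, regex: /\bMath\.random\s*\(/ },
];

// Returns every pattern that appears at least once in the source text.
function scanTestSource(source) {
  return RISK_PATTERNS
    .filter((pattern) => pattern.regex.test(source))
    .map(({ name, risk }) => ({ name, risk }));
}

const sample = `
test("loads profile", async () => {
  await new Promise((resolve) => setTimeout(resolve, 500));
  const res = await fetch("https://example.com/api");
  expect(res.ok).toBe(true);
});`;

console.log(scanTestSource(sample));
// [ { name: 'fixed sleep', risk: 8 }, { name: 'network call', risk: 7 } ]
```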

Failure Triage

Auto-categorize failures by root cause: timeouts, assertions, connections, permissions, resources.
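
One simple way to picture root-cause triage is matching the failure message against a list of known signatures. The category names below follow the list above; the regular expressions and the "unknown" fallback are assumptions for this sketch.

```javascript
// Hypothetical signatures; order matters, the first match wins.
const CATEGORIES = [
  { category: "timeout", regex: /timed? ?out/i },
  { category: "connection", regex: /ECONNREFUSED|ECONNRESET|ENOTFOUND|socket hang up/i },
  { category: "permission", regex: /EACCES|EPERM|permission denied/i },
  { category: "resource", regex: /ENOMEM|ENOSPC|EMFILE|out of memory/i },
  { category: "assertion", regex: /AssertionError|expect\(/i },
];

function triage(message) {
  const hit = CATEGORIES.find((entry) => entry.regex.test(message));
  return hit ? hit.category : "unknown";
}

console.log(triage("Error: connect ECONNREFUSED 127.0.0.1:5432")); // connection
console.log(triage("expect(received).toBe(expected)"));            // assertion
```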

CI Ingest Reporters

Drop-in npm/PyPI packages for Jest, Playwright, pytest. Three lines of CI config and your test runs flow into the dashboard automatically.

Slack Notifications

Per-repo incoming webhook. Every new auto-fix PR pings your channel with confidence + reasoning + Review button. Optional notification on merge.
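
For a sense of what such a notification could contain, here is a sketch that builds a Slack incoming-webhook payload with a Review button. The field names on the input object, the message layout, and the example PR URL are assumptions; only the general payload shape (`text` plus Block Kit `blocks`) follows Slack's webhook format.

```javascript
// Builds a Slack message for a new auto-fix PR (input shape is hypothetical).
function buildSlackMessage({ title, confidence, reasoning, prUrl }) {
  return {
    // Plain-text fallback shown in notifications.
    text: `${title} (confidence ${confidence}%)`,
    blocks: [
      {
        type: "section",
        text: {
          type: "mrkdwn",
          text: `*${title}*\nConfidence: ${confidence}%\n${reasoning}`,
        },
      },
      {
        type: "actions",
        elements: [
          {
            type: "button",
            text: { type: "plain_text", text: "Review" },
            url: prUrl,
          },
        ],
      },
    ],
  };
}

// Sends the payload to a per-repo webhook URL (not called in this sketch).
async function postToSlack(webhookUrl, payload) {
  return fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
}

const payload = buildSlackMessage({
  title: "fix(auth): reject expired tokens",
  confidence: 92,
  reasoning: "The expiry check was missing.",
  prUrl: "https://github.com/example/repo/pull/1", // placeholder URL
});
console.log(payload.text); // fix(auth): reject expired tokens (confidence 92%)
```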

Scan History + Trends

Every scan and CI run is saved. Track test health over time, compare scores, measure improvement.

Live demo · no sign-up

See it in action

Pick a failing test below and watch the AI propose a fix — exactly what QualityPilot does on every red CI run.

Subtraction operator used where addition was intended.

Failing test (javascript)
    test("adds two numbers", () => {
      expect(add(2, 3)).toBe(5);
    });
Source under test (javascript)
    function add(a, b) {
      return a - b;
    }
Proposed fix (javascript)
    function add(a, b) {
      return a + b;
    }

Reasoning

The test asserts add(2, 3) === 5, but the function subtracts. The fix swaps the - operator for +. No callers in the repo rely on the broken behavior.

github.com / pull · qualitypilot

fix: use + instead of - in add()

Auto-generated by QualityPilot · review before merging

🟢 high · 96% · +1 −1 · 1 file changed
Checks: 1 failing test will pass after this change.

This demo uses pre-computed fixes for instant playback. Real fixes are generated by GPT-4o-mini against your actual source — usually within an hour of the failed run.

Try it on YOUR code →

Free, no sign-up · same model, your snippet

How it works

Three steps to understand your test suite health

Step 1

Connect GitHub

Sign in with your GitHub account. We only request read access to your repositories.

Step 2

Select Repository

Pick any repository from your account. We search for test files (*.test.ts, *.spec.js, etc.).
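
The file search can be pictured as a filename filter. The sketch below is an assumption about how such matching might work: the source only names `*.test.ts` and `*.spec.js` as examples, so the wider extension list here is a guess.

```javascript
// Matches names like auth.test.ts or login.spec.js (extension list is assumed).
const TEST_FILE = /\.(test|spec)\.(ts|tsx|js|jsx)$/;

function findTestFiles(paths) {
  return paths.filter((path) => TEST_FILE.test(path));
}

console.log(findTestFiles([
  "src/lib/auth.ts",
  "src/lib/auth.test.ts",
  "e2e/login.spec.js",
  "README.md",
]));
// [ 'src/lib/auth.test.ts', 'e2e/login.spec.js' ]
```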

Step 3

Get Health Score

Receive an A-F grade with a factor breakdown, flaky-test detection, and improvement recommendations.

Why QualityPilot exists

Hi — I'm Ihor. I'm a senior QA automation engineer (C# + Playwright + NUnit daily). I built QualityPilot because every team I've worked on hit the same wall: tests fail, the CI report is unhelpful, finding the actual bug eats a developer's afternoon, and the fix is usually a 2-line change.

The AI Bug Detective is what I wished existed: it watches your CI runs, infers the production code under test, and opens a PR with a proposed fix + reasoning + a confidence score. You spend 30 seconds reviewing instead of 30 minutes investigating.

No auto-merge. No fluff. You merge it — or you don't. That's the deal.

— Ihor

Read more about how we handle your code →