A flaky test is a test that sometimes passes and sometimes fails with no change in the product or the test code. Retries and quarantine are short-term band-aids; the real fix is understanding why the test flakes.

Step 1: Look at the video

Open the failing run in the dashboard and watch the video. 80% of flakes have an obvious visual cause: a modal that wasn’t dismissed, a toast that covered a button, an animation that hadn’t settled.

Step 2: Compare to a passing run

Open a passing run of the same test and diff the step timings. Big spikes usually point at:
  • Network timing: a request is faster/slower than usual
  • Animation timing: a transition is now blocking interaction
  • Ordering: a race condition between two state updates
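As a rough sketch of this comparison, assume you have copied each run's step timings out of the dashboard into two plain-text files, one `step-name milliseconds` pair per line. The file names, the file format, and the `slow_steps` helper are all hypothetical and are not a Momentic export format. A small shell function can then flag the steps whose duration changed the most:

```shell
#!/usr/bin/env bash
# Hypothetical helper: flag steps whose duration differs by more than a threshold.
# Assumed input format: "<step-name> <milliseconds>" per line, no spaces in names.
slow_steps() {
  local passing="$1" failing="$2" threshold_ms="${3:-500}"
  awk -v limit="$threshold_ms" '
    NR == FNR { pass[$1] = $2; next }        # first file: remember passing timings
    ($1 in pass) {
      delta = $2 - pass[$1]                  # positive means slower in the failing run
      if (delta > limit || delta < -limit)   # faster and slower are both suspicious
        printf "%s %+d\n", $1, delta
    }
  ' "$passing" "$failing"
}

# Usage (hypothetical file names):
# slow_steps passing.txt failing.txt 500
```

Each output line names a step and its change in milliseconds, which gives you a short list of steps to inspect for the network, animation, or ordering causes listed above.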

Step 3: Check auto-heal history

If auto-heal fired in recent runs, something in the UI is drifting. Review the before/after locators to see what changed and whether the product change was intentional.

Step 4: Watch the trace

Each failing run has a full trace: DOM, network, console. Common signals:
  • Console errors from the product around the failed step
  • Pending network requests when the agent acted
  • Unhandled promise rejections that broke the page after a step

Step 5: Isolate the step

Pull the problematic step into a minimal test and run it 20 times locally:
for i in {1..20}; do npx momentic run my-test || break; done
If it still fails in isolation, the problem is in the step itself. If it only fails when run alongside other tests, you have test pollution: shared state left behind by a previous test.
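The loop above stops at the first failure, which is enough to confirm a flake. If you also want a rough failure rate, one possible variation is to keep going and count the results. In this sketch `count_failures` is a hypothetical helper, and the only Momentic command it assumes is the `npx momentic run my-test` invocation already shown:

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command N times and report how often it failed.
count_failures() {
  local runs="$1"
  shift
  local failures=0 i=0
  while [ "$i" -lt "$runs" ]; do
    # Discard the command's own output; only the exit status matters here.
    "$@" > /dev/null 2>&1 || failures=$((failures + 1))
    i=$((i + 1))
  done
  echo "$failures/$runs runs failed"
}

# Usage (command taken from the step above):
# count_failures 20 npx momentic run my-test
```

A result of 20/20 suggests the step is simply wrong, a result of 0/20 points back toward test pollution, and anything in between suggests a timing-dependent problem inside the step itself.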

Common fixes

  • Add an explicit wait for a specific element or URL before acting
  • Use an AI check to confirm intermediate state
  • Reset state between tests via a beforeAll module
  • Mock flaky upstream services: see Request mocking
  • Split the test if it’s doing too much

When to quarantine

Quarantine if:
  • You know the fix but can’t ship it today
  • The test is blocking a critical merge
  • You’re investigating and don’t want to distract the team
See Quarantine and Failure recovery.