Prompt like you’re giving directions
Aim for prompts a new teammate could follow without running the app.

- Good: Click the Save button in the header
- Worse: Click save
Assert after every meaningful action
A step that “worked” is not the same as the right thing happening. Pair destructive or state-changing steps with an explicit assertion.
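For example, a checkout test might pair the state-changing click with a check on the result. This is a hypothetical sketch: the step field names below are assumptions, not the exact Momentic schema.

```yaml
# checkout.test.yaml — illustrative sketch; "type" and "description"
# are assumed field names, not the exact Momentic format
name: Checkout completes an order
steps:
  - type: action
    description: Click the Place Order button in the cart summary
  - type: assertion
    description: The confirmation page shows an order number and a success message
```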
Don’t assert on time-sensitive UI
Tooltips, toasts, loading spinners, and other transient UI can appear and disappear between the moment Momentic captures page state and the moment the agent reasons about it. Asserting on them leads to race conditions and false failures. Assert on the durable side effect instead:

- Bad: A tooltip appears saying 'Saved'
- Better: The row shows a green checkmark next to the saved item
- Bad: A loading spinner is visible
- Better: The results table is populated
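Expressed as steps, the durable check replaces the transient one. A hypothetical sketch, with assumed field names rather than the exact Momentic schema:

```yaml
# Hypothetical step schema — field names are assumptions
steps:
  - type: action
    description: Click the Search button
  # Avoid asserting "A loading spinner is visible" — it races with transient UI
  - type: assertion
    description: The results table is populated
```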
Keep tests short
If a test has 40+ steps, it’s doing too much. Split by user intent: one test per outcome you want to guarantee. Short tests fail faster, retry cleaner, and point at a single problem.

Extract reusable flows into modules
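A shared flow might be factored out roughly like this. This is a hypothetical sketch: the module file layout and field names are assumptions, not the exact Momentic syntax.

```yaml
# login.module.yaml — the shared flow, defined once (hypothetical schema)
name: Log in
steps:
  - type: action
    description: Enter the test user's email and password, then click Sign In
  - type: assertion
    description: The dashboard header shows the user's avatar
```

Each test that needs an authenticated session would then reference the module by name instead of repeating those steps, so a change to the login flow is made in one place.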
Anything you do more than twice (login, seeding data, teardown) belongs in a module. Module changes apply to every test that uses them.

Use variables for environment drift
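One way variable use might look, assuming a `{{...}}` interpolation syntax (a sketch, not the exact Momentic format):

```yaml
# Hypothetical sketch — the {{...}} syntax and field names are assumptions
steps:
  - type: navigate
    url: "{{BASE_URL}}/login"
  - type: action
    description: Enter {{TEST_USER_EMAIL}} into the email field
```

The same test then runs against local, staging, or production by swapping the variable values rather than editing the test.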
Never hard-code URLs, emails, or credentials. Use variables for anything that differs between local, staging, and production.

Don’t chase flakes with retries
Retries mask instability. If a test is flaky, dig into why: missing waits, network timing, or a genuinely flaky product flow. Quarantine it while you investigate; don’t just bump retry counts. See Debugging flaky tests.

Keep the step cache fresh
Commit your .momentic/cache directory so deterministic replays are the norm. When tests
change, delete the relevant cache entries and let auto-heal regenerate them.
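The commit-and-invalidate workflow might look like the following. This sketch runs in a throwaway repo; the cache entry file name is an assumption, not a real Momentic artifact.

```shell
# Hypothetical workflow in a throwaway repo; the cache file name is an assumption.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p .momentic/cache
echo '{}' > .momentic/cache/checkout.test.json   # stands in for a generated cache entry

# Commit the cache so replays are deterministic for everyone:
git add .momentic/cache
git -c user.email=ci@example.com -c user.name=CI commit -qm "Update Momentic step cache"

# After editing the checkout test, drop its stale entry; auto-heal regenerates it on the next run:
git rm -q .momentic/cache/checkout.test.json
git -c user.email=ci@example.com -c user.name=CI commit -qm "Drop stale cache entry for checkout test"
```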
Review test diffs like you review code
Treat test changes as part of the same PR as the product change. This catches:

- Accidental locator changes that should have been coupled with code
- New tests that don’t assert anything meaningful
- Tests that quietly drift past a product regression