Prompt like you’re giving directions
Aim for prompts a new teammate could follow without running the app.

- Good: Click the Save button in the header
- Worse: Click save
Assert after every meaningful action
A step that “worked” is not the same as the right thing happening. Pair destructive or state-changing steps with an explicit assertion.
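For example, a checkout test might pair the state-changing click with a check on the result. This is a hypothetical sketch: the step field names below are assumptions, not the exact Momentic schema.

```yaml
# checkout.test.yaml — illustrative sketch; "type" and "description"
# are assumed field names, not the exact Momentic format
name: Checkout completes an order
steps:
  - type: action
    description: Click the Place Order button in the cart summary
  - type: assertion
    description: The confirmation page shows an order number and a success message
```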
Don’t assert on time-sensitive UI
Tooltips, toasts, loading spinners, and other transient UI can appear and disappear between the moment Momentic captures page state and the moment the agent reasons about it. Asserting on them leads to race conditions and false failures. Assert on the durable side effect instead:

- Bad: A tooltip appears saying 'Saved'
- Better: The row shows a green checkmark next to the saved item
- Bad: A loading spinner is visible
- Better: The results table is populated
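Expressed as steps, the durable check replaces the transient one. A hypothetical sketch, with assumed field names rather than the exact Momentic schema:

```yaml
# Hypothetical step schema — field names are assumptions
steps:
  - type: action
    description: Click the Search button
  # Avoid asserting "A loading spinner is visible" — it races with transient UI
  - type: assertion
    description: The results table is populated
```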
Keep tests short
If a test has 40+ steps, it’s doing too much. Split by user intent: one test per outcome you want to guarantee. Short tests fail faster, retry cleaner, and point at a single problem.

Extract reusable flows into modules
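A shared flow might be factored out roughly like this. This is a hypothetical sketch: the module file layout and field names are assumptions, not the exact Momentic syntax.

```yaml
# login.module.yaml — the shared flow, defined once (hypothetical schema)
name: Log in
steps:
  - type: action
    description: Enter the test user's email and password, then click Sign In
  - type: assertion
    description: The dashboard header shows the user's avatar
```

Each test that needs an authenticated session would then reference the module by name instead of repeating those steps, so a change to the login flow is made in one place.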
Anything you do more than twice (login, seeding data, teardown) belongs in a module. Module changes apply to every test that uses them.

Use variables for environment drift
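One way variable use might look, assuming a `{{...}}` interpolation syntax (a sketch, not the exact Momentic format):

```yaml
# Hypothetical sketch — the {{...}} syntax and field names are assumptions
steps:
  - type: navigate
    url: "{{BASE_URL}}/login"
  - type: action
    description: Enter {{TEST_USER_EMAIL}} into the email field
```

The same test then runs against local, staging, or production by swapping the variable values rather than editing the test.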
Never hard-code URLs, emails, or credentials. Use variables for anything that differs between local, staging, and production.

Don’t chase flakes with retries
Retries mask instability. If a test is flaky, dig into why: missing waits, network timing, or a genuinely flaky product flow. Quarantine it while you investigate; don’t just bump retry counts. See Debugging flaky tests.

Keep the step cache fresh
Commit your .momentic/cache directory so deterministic replays are the norm. When tests
change, delete the relevant cache entries and let auto-heal regenerate them.
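The commit-and-invalidate workflow might look like the following. This sketch runs in a throwaway repo; the cache entry file name is an assumption, not a real Momentic artifact.

```shell
# Hypothetical workflow in a throwaway repo; the cache file name is an assumption.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p .momentic/cache
echo '{}' > .momentic/cache/checkout.test.json   # stands in for a generated cache entry

# Commit the cache so replays are deterministic for everyone:
git add .momentic/cache
git -c user.email=ci@example.com -c user.name=CI commit -qm "Update Momentic step cache"

# After editing the checkout test, drop its stale entry; auto-heal regenerates it on the next run:
git rm -q .momentic/cache/checkout.test.json
git -c user.email=ci@example.com -c user.name=CI commit -qm "Drop stale cache entry for checkout test"
```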
Review test diffs like you review code
Treat test changes as part of the same PR as the product change. This catches:

- Accidental locator changes that should have been coupled with code
- New tests that don’t assert anything meaningful
- Tests that quietly drift past a product regression