Playwright

Momentic is a managed testing platform for the web. Tests are YAML, executed on a managed runner. A multi-modal step cache stores locator metadata per step and auto-heals in place when the UI changes. AI primitives cover action, assertion, visual diff, and typed extraction. AI providers route with cross-provider failover behind a single managed surface. A dashboard captures run videos, traces, network, heal events, and AI reasoning.

Playwright is Microsoft’s open-source browser automation framework. Tests are written in TypeScript, JavaScript, Python, Java, or .NET and run across Chromium, Firefox, and WebKit. It has built-in auto-waiting, web-first auto-retrying assertions (`expect`), locators (`getByRole`, `getByTestId`, CSS, XPath), a parallel test runner, and a trace viewer. It’s well-suited to teams that want OSS and full programmatic control, and that have the bandwidth to maintain a locator-based codebase.

This page compares Momentic with hand-written Playwright. For the coding-agent authoring workflow that drives a browser and generates Playwright code, see Playwright MCP.

Speed and caching

	Momentic	Playwright
What’s cached	Multi-modal locator data per step (docs).	Nothing. Locators re-resolve against the live DOM on every run.
Waiting	Built-in: navigation, `load`, screenshots, DOM mutations, same-origin requests. 3s default, configurable.	Built-in auto-waiting on actionability; web-first `expect` auto-retries. No network-quiescence signal.
Heal on miss	Re-resolves and updates the entry in place. Heal event on the run.	Not supported. A broken locator is a test failure.
Cost of a UI change	Auto-heal absorbs renamed IDs, restructured DOM, and reworded copy.	One code edit per broken locator or assertion.
Storage	Managed, git-aware.	N/A.

How the multi-modal cache works

A cached step stores more than one way to find the target: where it sits on screen, what it looks like, what text it contains, and the structural and accessibility attributes around it. Which of those signals matters for a given step is inferred from the natural-language description. “The red Cancel button below the Order Summary header” leans on visual and positional signals; “the Submit button in the form” leans on structure and role. When a step replays, the runner checks the stored signals against the live page and, when there’s a match, runs the action without invoking the LLM. On a miss, the locator agent (auto-heal) re-resolves the original description against the live page and updates the cache entry in place, and the run continues. A heal event is recorded against the run.
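
The hit-or-heal flow described above can be pictured with a small sketch. This is not Momentic's implementation: every type and function name here (`StoredSignals`, `CacheEntry`, `replayStep`, the `Resolver` callback standing in for the locator agent) is hypothetical, and the signals are reduced to three plain string fields for illustration.

```typescript
// Hypothetical shapes for illustration only; none of these names come from Momentic.
interface StoredSignals {
  text?: string;
  role?: string;
  testId?: string;
}

interface LiveElement extends StoredSignals {
  id: number;
}

interface CacheEntry {
  description: string; // the natural-language step description
  signals: StoredSignals; // what the last successful run recorded
}

// Stand-in for the locator agent: resolves a description against the live page.
type Resolver = (description: string, page: LiveElement[]) => LiveElement | undefined;

interface ReplayResult {
  element: LiveElement;
  healed: boolean;
}

// An element matches when every stored signal agrees with the live element.
function matches(signals: StoredSignals, el: LiveElement): boolean {
  return (Object.keys(signals) as (keyof StoredSignals)[]).every(
    (key) => signals[key] === undefined || signals[key] === el[key],
  );
}

function replayStep(entry: CacheEntry, page: LiveElement[], resolve: Resolver): ReplayResult {
  const hit = page.find((el) => matches(entry.signals, el));
  if (hit) {
    return { element: hit, healed: false }; // cache hit: the resolver is never called
  }
  // Cache miss: fall back to the description, then update the entry in place.
  const found = resolve(entry.description, page);
  if (!found) {
    throw new Error(`Could not resolve step: ${entry.description}`);
  }
  entry.signals = { text: found.text, role: found.role, testId: found.testId };
  return { element: found, healed: true };
}
```

In this sketch a renamed test id causes one resolver call, after which the updated entry matches directly on the next replay.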

What happens on replay

Take a passing Playwright spec. The next day, the team ships two changes: the welcome banner copy changes from “Welcome, Ada” to “Hi Ada, welcome back”, and the submit button’s `data-testid` is renamed from `submit` to `submit-btn`. Playwright replay:

`page.getByTestId("submit").click()` auto-waits for actionability, then times out because the test id no longer exists. The action throws.
`expect(page.getByText("Welcome, Ada")).toBeVisible()` auto-retries until the timeout, then fails. The copy no longer matches.
The CI job fails. A maintainer edits the locator and the assertion, opens a PR, gets it reviewed, and re-runs CI.

Momentic replay:

`click` steps hit the cache. On the renamed button the cached locator misses, so the locator agent re-resolves the original description “Sign in” against the live page, binds to the matching element, and updates the entry in place. A heal event is recorded.
`assert: The dashboard chart is visible and not cut off` is evaluated by the assertion agent against the current page state. The agent reasons over the intent of the assertion, not a literal string match, so the reworded banner doesn’t trip it.
The test passes. No code review needed.

Technical details

Momentic smart waiting

The default smart wait is 3000ms and configurable per test. The runner waits on a combination of navigation, `load`, screenshots, DOM mutations, and same-origin requests until the page is quiet or the timeout elapses.

Playwright waiting, for contrast

Auto-waiting checks actionability (visible, stable, enabled, receives events) before each action, so timing alone rarely causes flakes.
expect assertions auto-retry until they pass or time out.
There is no built-in network-quiescence wait; the docs discourage waitForLoadState("networkidle"). Hard waitForTimeout is discouraged but common in quick or generated code.
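
The "auto-retry until it passes or times out" behaviour can be illustrated with a plain polling loop. This is a minimal sketch, not Playwright's implementation of `expect`: the helper name `retryUntilPass` and its timing defaults are assumptions chosen for the example.

```typescript
// Hypothetical helper for illustration; Playwright's own retry logic is more involved.
async function retryUntilPass(
  check: () => void | Promise<void>, // throws while the condition is not yet met
  timeoutMs = 5000, // placeholder default
  intervalMs = 100, // placeholder default
): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  let attempts = 0;
  for (;;) {
    attempts += 1;
    try {
      await check();
      return attempts; // passed: report how many tries it took
    } catch (error) {
      if (Date.now() >= deadline) {
        throw error; // timed out: surface the last failure
      }
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
}
```

A check that would pass within the timeout never needs a hard sleep; a check that can never pass, such as a literal match on copy that has been reworded, simply fails once the deadline is reached.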

Locators and AI primitives

	Momentic	Playwright
Locator model	Natural-language descriptions resolved by an AI agent against DOM + a11y tree + screenshot. Cached, healed.	`getByRole`, `getByText`, `getByLabel`, `getByTestId`, CSS, XPath. Resolved literally each run.
Visual cues	Color, icon, relative size, position part of the locator.	Not available; locators are structural or text-based.
Agentic step	`act` accepts a multi-step goal; the agent plans and executes.	Not supported.
AI assert	`assert` is a first-class step type, fails by default.	`expect` matchers are literal (text, attribute, count). No semantic assertion.
Visual diff	`assertVisually`, agent-scored against a golden.	`toHaveScreenshot` pixel diff against a golden; no semantic scoring.
AI provider	Managed; cross-provider failover handled by the platform.	None.

Playwright locator trade-offs

getByRole / getByLabel are stable when ARIA roles and labels exist, but dynamic content often lacks them.
getByTestId requires developers to add and maintain data-testid attributes; they get renamed or removed under refactors.
CSS selectors break on restyle or restructure; XPath breaks on any structural change.
No locator carries semantic intent. A failing locator has no description to recover from.

Recovery, quarantine, and CI

	Momentic	Playwright
Failure recovery	Post-run heal agent (`momentic ai heal`) rewrites failing tests and opens a PR or patch in CI; `momentic ai classify` triages the failure.	Retries (`retries: n`) re-run the same code. Flaky-test detection flags failures but does not fix them.
Quarantine	First-class: tests run, results report, exit code unaffected unless `--only-quarantined`.	Not built-in. `test.fixme` / `test.skip` remove the test from the run.
Sharding	`--shard-index <i>` / `--shard-count <n>`, 1-indexed. Deterministic alphabetical partition.	`--shard=i/n`, plus parallel workers.
Reporters	`junit`, `allure`, `playwright-json`, `buildkite-json`.	Built-in `list`, `line`, `dot`, `html`, `json`, `junit`; third-party for others.
Execution	Managed runner, multi-region.	Bring-your-own: CI runners or a grid (e.g. Selenium Grid, vendor cloud).
Dashboard	Run videos, traces, network, heal events, AI reasoning.	Trace viewer + HTML report as local artifacts; no hosted dashboard.
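
To make the sharding row concrete, here is a sketch of a deterministic alphabetical partition with a 1-indexed shard number. The table only states that the partition is alphabetical and deterministic; the round-robin assignment below, and the helper name `selectShard`, are assumptions for illustration and may differ from what either tool actually does.

```typescript
// Hypothetical helper: sorts test files, then deals them out round-robin.
// The round-robin rule is an assumption; only "alphabetical and deterministic" comes from the text.
function selectShard(files: string[], shardIndex: number, shardCount: number): string[] {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  if (!Number.isInteger(shardIndex) || shardIndex < 1 || shardIndex > shardCount) {
    throw new RangeError("shardIndex is 1-indexed and must be between 1 and shardCount");
  }
  // Default sort compares by code unit, so the order does not depend on locale.
  const sorted = [...files].sort();
  return sorted.filter((_file, position) => position % shardCount === shardIndex - 1);
}
```

Because the input is sorted before it is split, every CI worker computes the same partition regardless of the order in which files were discovered.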

Authoring side-by-side

import { expect, test } from "@playwright/test";

test("sign in and verify", async ({ page }) => {
  await page.goto("https://app.example.com");
  await page.getByLabel("Email").fill("ada@example.com");
  await page.getByLabel("Password").fill("secret");
  await page.getByRole("button", { name: "Sign in" }).click();
  await expect(page.getByText("Welcome, Ada")).toBeVisible(); // literal text
  // "Chart visible and not cut off" needs a custom pixel or bounding-box check.
});

Agentic simplified format:

fileType: momentic/test/v2
id: sign-in-and-verify
url: https://app.example.com
steps:
  - act: Sign in with ada@example.com / secret
  - assert: The dashboard chart is visible and not cut off

Explicit simplified format (same flow, step-by-step):

fileType: momentic/test/v2
id: sign-in-and-verify
url: https://app.example.com
steps:
  - type:
      text: ada@example.com
      into: Email
  - type:
      text: secret
      into: Password
  - click: Sign in
  - assert: The dashboard chart is visible and not cut off

No literal selectors are materialized at authoring time, and there are no string-match assertions to maintain.

A more realistic test

The hello-world above doesn’t show the full simplified format surface. A representative checkout regression with module reuse, parameter inputs, typed extraction, and a conditional looks like this:

checkout.test.yaml

fileType: momentic/test/v2
id: checkout-with-promo
url: https://shop.example.com
steps:
  - module:
      path: ../modules/sign-in.module.yaml
      inputs:
        EMAIL: env.QA_EMAIL
        PASSWORD: env.QA_PASSWORD
  - act: Add the Tetris Eye Sweatshirt (size M) to the cart
  - navigate: https://shop.example.com/checkout
  - type:
      text: "{{ env.PROMO_CODE }}"
      into: Promo code field
  - click: Apply
  - if:
      assert: A success banner saying the promo was applied is visible
      then:
        - extract:
            goal: The discounted subtotal in the order summary
            schema:
              type: object
              properties:
                amount:
                  type: number
              required: [amount]
  - if:
      assert: An invalid-promo error is visible
      then:
        - assert: The subtotal is unchanged
  - assertVisually: The order summary section is fully visible and not cut off

The matching module:

../modules/sign-in.module.yaml

fileType: momentic/module/v2
id: sign-in
name: Sign in
parameters:
  - name: EMAIL
  - name: PASSWORD
steps:
  - type:
      text: "{{ env.EMAIL }}"
      into: Email
  - type:
      text: "{{ env.PASSWORD }}"
      into: Password
  - click: Sign in
  - assert: The dashboard chart is visible and not cut off
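
One way to read the parameter passing above: the test binds each module parameter to a value from its own environment, and the module's `{{ env.NAME }}` placeholders are then filled from that bound scope. The sketch below illustrates that reading only. It is an assumption about the semantics, not Momentic's resolver, and the names `interpolate` and `bindInputs` are hypothetical.

```typescript
type Scope = Record<string, string>;

// Looks up a name and fails loudly if it is missing.
function lookup(scope: Scope, name: string): string {
  const value = scope[name];
  if (value === undefined) {
    throw new Error(`Missing value for env.${name}`);
  }
  return value;
}

// Replaces each {{ env.NAME }} placeholder in a step field with the scoped value.
function interpolate(template: string, scope: Scope): string {
  return template.replace(
    /\{\{\s*env\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g,
    (_match: string, name: string) => lookup(scope, name),
  );
}

// Builds the module's scope from the caller's `inputs` map.
// Assumption: a value of the form env.NAME is a reference; anything else is a literal.
function bindInputs(inputs: Record<string, string>, callerScope: Scope): Scope {
  const moduleScope: Scope = {};
  for (const param of Object.keys(inputs)) {
    const ref = inputs[param] ?? "";
    const match = /^env\.([A-Za-z_][A-Za-z0-9_]*)$/.exec(ref);
    moduleScope[param] = match ? lookup(callerScope, match[1] ?? "") : ref;
  }
  return moduleScope;
}
```

Under this reading, `EMAIL: env.QA_EMAIL` in the test means that `{{ env.EMAIL }}` inside the module resolves to whatever `QA_EMAIL` holds in the calling test's environment.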

When to pick which

Playwright is the right call if you have an existing Playwright test suite the team wants to keep, you need cross-browser coverage including WebKit and Firefox, you have a hard OSS requirement with no SaaS, or you want full programmatic control in a general-purpose language.

Momentic is the right call if selector and assertion maintenance is a real recurring cost, you want locators that re-resolve and heal on UI changes instead of failing, you want AI assertions and visual checks that fail the test by default, you’d rather author in YAML than maintain a locator codebase, and you expect healing, recovery, quarantine, and a managed dashboard with run videos built in.

For the build-it-yourself version of this decision, see Build vs. buy.
