# Selenium

> How Momentic's multi-modal step cache, AI primitives, and managed runner compare against Selenium WebDriver.

Momentic is a managed testing platform for the web. Tests are YAML, executed on
a managed runner. A multi-modal step cache stores locator metadata per step and
auto-heals in place when the UI changes. AI primitives cover action, assertion,
visual diff, and typed extraction. Calls to AI providers are routed behind a
single managed surface with cross-provider failover. A dashboard captures run
videos, traces, network, heal events, and AI reasoning.

[Selenium](https://www.selenium.dev) is the original open-source browser
automation project. It drives browsers through the W3C WebDriver protocol with
bindings in Java, Python, C#, Ruby, and JavaScript, manages drivers with
Selenium Manager, and distributes runs across Selenium Grid. It's well-suited to
teams that want the broadest language and browser support, have a hard
requirement for an OSS WebDriver-protocol layer, or have a large existing
Selenium codebase to keep.

## Speed and caching

|                     | Momentic                                                                                                  | Selenium                                                                                        |
| ------------------- | --------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- |
| What's cached       | Multi-modal locator data per step ([docs](/reliability/step-cache)).                                      | Nothing. Every `findElement` re-queries the DOM.                                                |
| Waiting             | Built-in: navigation, `load`, screenshots, DOM mutations, same-origin requests. 3s default, configurable. | No auto-waiting. `implicitlyWait` or explicit `WebDriverWait` + `ExpectedConditions` per query. |
| Heal on miss        | Re-resolves and **updates the entry in place**. Heal event on the run.                                    | Not supported. A miss is a test failure after the wait timeout.                                 |
| Cost of a UI change | Auto-heal absorbs renamed IDs, restructured DOM, and reworded copy.                                       | One code edit per broken selector. XPath edits often require full re-derivation.                |
| Storage             | Managed, git-aware.                                                                                       | N/A.                                                                                            |

## How the multi-modal cache works

A cached step stores more than one way to find the target: where it sits on
screen, what it looks like, what text it contains, and the structural and
accessibility attributes around it. Which of those signals matters for a given
step is inferred from the natural-language description. "The red Cancel button
below the Order Summary header" leans on visual and positional signals; "the
Submit button in the form" leans on structure and role. When a step replays, the
runner checks the stored signals against the live page and runs the action
without invoking the LLM when there's a match.

On a miss, the locator agent ([auto-heal](/reliability/auto-heal)) re-resolves
the original description against the live page, updates the cache entry in
place, and the run continues. A heal event is recorded against the run.
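
The hit, miss, and heal flow described above can be sketched as a toy model. This is not Momentic's implementation: the `StepCache` and `CacheEntry` classes, the signal names, and the text-only `resolve_by_text` stand-in for the locator agent are all hypothetical, and the sketch checks every stored signal rather than weighting them by the description.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CacheEntry:
    description: str  # natural-language step description
    signals: dict     # signals recorded when the step last resolved


@dataclass
class StepCache:
    entries: dict = field(default_factory=dict)
    heal_events: list = field(default_factory=list)

    def replay(self, step_id: str, page: list,
               resolve: Callable[[str, list], dict]) -> dict:
        """Return the element to act on, healing the entry on a miss."""
        entry = self.entries[step_id]
        for element in page:
            # Cache hit: every stored signal still matches a live element,
            # so the resolver (the expensive part) is never called.
            if all(element.get(k) == v for k, v in entry.signals.items()):
                return element
        # Cache miss: re-resolve the description against the live page,
        # update the entry in place, and record a heal event.
        healed = resolve(entry.description, page)
        entry.signals = dict(healed)
        self.heal_events.append(step_id)
        return healed


def resolve_by_text(description: str, page: list) -> dict:
    """Toy stand-in for the locator agent: match on visible text only."""
    return next(el for el in page if el.get("text") == description)


cache = StepCache({
    "click-sign-in": CacheEntry(
        "Sign in", {"id": "submit", "role": "button", "text": "Sign in"}
    )
})
# The live page after the button's id was renamed to "submit-btn".
live_page = [
    {"id": "email", "role": "textbox", "text": ""},
    {"id": "submit-btn", "role": "button", "text": "Sign in"},
]
target = cache.replay("click-sign-in", live_page, resolve_by_text)
```

After the first replay the entry holds the new `id`, so a second replay against the same page is a plain cache hit and records no further heal event.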

## What happens on replay

Take a passing Selenium test suite. The next day, the team ships two changes:
the welcome banner copy changes from `Welcome, Ada` to `Hi Ada, welcome back`,
and the submit button's `id` is renamed from `submit` to `submit-btn`.

**Selenium replay:**

1. `driver.find_element(By.ID, "submit")` issues a lookup. With an implicit
   wait, the driver polls until the timeout, finds nothing because the id was
   renamed, and raises `NoSuchElementException`.
2. The explicit wait for the old welcome copy
   (`WebDriverWait(...).until(EC.text_to_be_present_in_element(...))`) never
   sees the expected text, so it times out and raises `TimeoutException`.
3. The CI job fails. A maintainer edits the selectors and the waits, opens a PR,
   gets it reviewed, and re-runs CI. If the broken locator was an XPath, the
   edit can cascade across steps.

**Momentic replay:**

1. `click` steps hit the cache. On the renamed button the cached locator misses,
   so the locator agent re-resolves the original description `Sign in` against
   the live page, binds, and updates the entry in place. A heal event is
   recorded.
2. `assert: The dashboard chart is visible and not cut off` is evaluated by the
   assertion agent against the current page state. The agent reasons over the
   intent of the assertion, not a literal string match, so the reworded banner
   doesn't trip it.
3. The test passes. No code review needed.

<Accordion title="Technical details">
  **Momentic smart waiting**

  The default smart wait is 3000 ms and is configurable per test. The runner waits on
  a combination of navigation, `load`, screenshots, DOM mutations, and same-origin
  requests until the page is quiet or the timeout elapses.

  **Selenium waiting, for contrast**

  * `implicitlyWait` is global per driver. Too low causes flakes; too high pads
    every run.
  * Explicit waits (`WebDriverWait` plus conditions from the Python
    `expected_conditions` module, such as `visibility_of_element_located`,
    `presence_of_element_located`, and `element_to_be_clickable`) are per query.
    It is common to layer several explicit waits per logical step.
  * There is no notion of network quiescence; teams poll the DOM or instrument
    their own request hooks.
</Accordion>
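
To make the "quiet or timeout" idea concrete, here is a minimal sketch of a quiescence wait in plain Python. It is an illustration under stated assumptions, not Momentic's runner: the 3-second timeout mirrors the default mentioned above, while the `wait_until_quiet` function, the 0.5-second quiet window, the poll interval, and the `last_activity` callback are all hypothetical. A team instrumenting its own request hooks in Selenium would need to supply something like `last_activity` itself.

```python
import time
from typing import Callable


def wait_until_quiet(
    last_activity: Callable[[], float],
    quiet_s: float = 0.5,     # assumed quiet window
    timeout_s: float = 3.0,   # mirrors the 3000 ms default above
    poll_s: float = 0.05,     # assumed poll interval
    now: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once nothing has happened for quiet_s, False on timeout."""
    deadline = now() + timeout_s
    while now() < deadline:
        if now() - last_activity() >= quiet_s:
            return True
        sleep(poll_s)
    return False


class FakeClock:
    """Deterministic clock so the example runs instantly."""

    def __init__(self, start: float = 0.0) -> None:
        self.t = start

    def now(self) -> float:
        return self.t

    def sleep(self, seconds: float) -> None:
        self.t += seconds


# The last DOM mutation or request finished at t=0.9; the wait starts at t=1.0.
clock = FakeClock(start=1.0)
settled = wait_until_quiet(lambda: 0.9, now=clock.now, sleep=clock.sleep)
```

A page that never goes quiet simply runs into the timeout and the function returns `False`, which is the point at which a step would proceed or fail depending on the caller.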

## Locators and AI primitives

|               | Momentic                                                                                                    | Selenium                                                                                       |
| ------------- | ----------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------- |
| Locator model | Natural-language descriptions resolved by an AI agent against DOM + a11y tree + screenshot. Cached, healed. | `By.id`, `By.name`, `By.className`, `By.cssSelector`, `By.xpath`, `By.linkText`, `By.tagName`. |
| Visual cues   | Color, icon, relative size, and position are part of the locator.                                           | Not available; locators are structural or text-based.                                          |
| Agentic step  | `act` accepts a multi-step goal; the agent plans and executes.                                              | Not supported.                                                                                 |
| AI assert     | `assert` is a first-class step type and fails the test by default.                                          | Not built-in. Teams build with `getText` / `getAttribute` + host-framework assertions.         |
| Visual diff   | `assertVisually`, agent-scored against a golden.                                                            | Not built-in. Third-party plugins (e.g. Applitools) bolt on.                                   |
| AI provider   | Managed; cross-provider failover handled by the platform.                                                   | None.                                                                                          |

<Accordion title="Selenium locator trade-offs">
  * `By.id` / `By.name` are stable when present, but break under refactors and A/B
    testing, and dynamic content frequently lacks them.
  * `By.cssSelector` breaks on restyle or restructure.
  * `By.xpath` is the catch-all but is slow on large DOMs and breaks on any
    structural change.
  * No locator carries semantic intent. A failing step has no description to
    recover from.
</Accordion>

## Recovery, quarantine, and CI

|                  | Momentic                                                                                                                                   | Selenium                                                              |
| ---------------- | ------------------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------- |
| Failure recovery | Post-run heal agent (`momentic ai heal`) rewrites failing tests and opens a PR or patch in CI; `momentic ai classify` triages the failure. | Not supported. Retries are whatever the host test framework provides. |
| Quarantine       | First-class: tests run, results report, exit code unaffected unless `--only-quarantined`.                                                  | Not built-in. Skips remove the test from the run.                     |
| Sharding         | `--shard-index <i>` / `--shard-count <n>`, 1-indexed. Deterministic alphabetical partition.                                                | Owned by the host runner (pytest, JUnit, Mocha) and Selenium Grid.    |
| Reporters        | `junit`, `allure`, `playwright-json`, `buildkite-json`.                                                                                    | Whatever the host runner emits (Allure, ExtentReports, JUnit XML).    |
| Execution        | Managed runner, multi-region.                                                                                                              | Bring-your-own: local drivers, a Selenium Grid, or a vendor cloud.    |
| Dashboard        | Run videos, traces, network, heal events, AI reasoning.                                                                                    | Third-party (Allure, vendor dashboards).                              |

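As an illustration of what a deterministic, 1-indexed alphabetical partition can look like, the sketch below sorts the test paths and deals them out round-robin. The `shard_tests` function and the file names are hypothetical, and the table above does not say whether Momentic assigns tests round-robin or in contiguous blocks, so treat the assignment scheme as an assumption.

```python
def shard_tests(test_paths: list[str], shard_index: int, shard_count: int) -> list[str]:
    """Return the tests for one shard; shard_index is 1-indexed."""
    if not 1 <= shard_index <= shard_count:
        raise ValueError("shard_index must be between 1 and shard_count")
    # Sorting first makes the partition independent of discovery order,
    # so every CI worker computes the same split.
    ordered = sorted(test_paths)
    return ordered[shard_index - 1 :: shard_count]


all_tests = [
    "login.test.yaml",
    "checkout.test.yaml",
    "search.test.yaml",
    "cart.test.yaml",
    "profile.test.yaml",
]
first = shard_tests(all_tests, 1, 2)   # cart, login, search
second = shard_tests(all_tests, 2, 2)  # checkout, profile
```

Because each shard is derived only from the sorted list and the two numbers, no coordination between workers is needed, and every test lands in exactly one shard.
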
## Authoring side-by-side

```py theme={null}
# Selenium with Python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
try:
    driver.get("https://app.example.com")

    # Explicit wait: poll for up to 10 seconds until the email field is visible.
    wait = WebDriverWait(driver, 10)
    wait.until(EC.visibility_of_element_located((By.ID, "email"))).send_keys(
        "ada@example.com"
    )
    driver.find_element(By.ID, "password").send_keys("secret")
    driver.find_element(By.ID, "submit").click()

    # Literal string match: fails as soon as the banner copy is reworded.
    wait.until(
        EC.text_to_be_present_in_element((By.CSS_SELECTOR, ".banner"), "Welcome, Ada")
    )
    # "Chart visible and not cut off" requires custom logic.
finally:
    # Always close the browser, even when a lookup or wait raises.
    driver.quit()
```

Agentic simplified format (the same sign-in flow, stated as a goal):

```yaml theme={null}
fileType: momentic/test/v2
id: sign-in-and-verify
url: https://app.example.com
steps:
  - act: Sign in with ada@example.com / secret
  - assert: The dashboard chart is visible and not cut off
```

Explicit simplified format (same flow, step-by-step):

```yaml theme={null}
fileType: momentic/test/v2
id: sign-in-and-verify
url: https://app.example.com
steps:
  - type:
      text: ada@example.com
      into: Email
  - type:
      text: secret
      into: Password
  - click: Sign in
  - assert: The dashboard chart is visible and not cut off
```

No explicit waits to tune, no literal selectors, and no string-match assertions
to maintain.

## A more realistic test

The hello-world above doesn't show the full simplified format surface. A
representative checkout regression with module reuse, parameter inputs, typed
extraction, and a conditional looks like this:

```yaml checkout.test.yaml theme={null}
fileType: momentic/test/v2
id: checkout-with-promo
url: https://shop.example.com
steps:
  - module:
      path: ../modules/sign-in.module.yaml
      inputs:
        EMAIL: env.QA_EMAIL
        PASSWORD: env.QA_PASSWORD
  - act: Add the Tetris Eye Sweatshirt (size M) to the cart
  - navigate: https://shop.example.com/checkout
  - type:
      text: "{{ env.PROMO_CODE }}"
      into: Promo code field
  - click: Apply
  - if:
      assert: A success banner saying the promo was applied is visible
      then:
        - extract:
            goal: The discounted subtotal in the order summary
            schema:
              type: object
              properties:
                amount:
                  type: number
              required: [amount]
  - if:
      assert: An invalid-promo error is visible
      then:
        - assert: The subtotal is unchanged
  - assertVisually: The order summary section is fully visible and not cut off
```

The matching module:

```yaml ../modules/sign-in.module.yaml theme={null}
fileType: momentic/module/v2
id: sign-in
name: Sign in
parameters:
  - name: EMAIL
  - name: PASSWORD
steps:
  - type:
      text: "{{ env.EMAIL }}"
      into: Email
  - type:
      text: "{{ env.PASSWORD }}"
      into: Password
  - click: Sign in
  - assert: The dashboard chart is visible and not cut off
```

There is no equivalent first-class surface in Selenium. Reuse means extracting
host-language helpers; extraction is whatever parsing code the team writes;
conditionals are `if` statements in the host language; assertions and visual
checks need the host framework or third-party plugins.
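
For a sense of what hand-written extraction tends to look like on the Selenium side, here is a small sketch that turns order-summary text into a typed value. The `Subtotal` type, the `parse_subtotal` helper, the `Subtotal:` label format, and the `.order-summary` selector in the comment are all assumptions for illustration; a real page would dictate its own parsing rules.

```python
import re
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Subtotal:
    amount: Decimal


def parse_subtotal(summary_text: str) -> Subtotal:
    """Pull the subtotal out of free-form order-summary text."""
    # Assumed format: the word "Subtotal", optional punctuation or a
    # currency symbol, then a number with optional thousands separators.
    match = re.search(r"Subtotal[^\d]*(\d[\d,]*(?:\.\d{1,2})?)", summary_text)
    if match is None:
        raise ValueError("no subtotal found in order summary")
    return Subtotal(Decimal(match.group(1).replace(",", "")))


# In a Selenium test the text would come from the page, for example:
#   text = driver.find_element(By.CSS_SELECTOR, ".order-summary").text
# (".order-summary" is a hypothetical selector.)
text = "Order summary\nSubtotal: $1,234.50\nShipping: $5.00"
subtotal = parse_subtotal(text)
```

The regular expression is tied to the current copy and number format, so it carries the same maintenance cost as a literal selector when the summary is reworded.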

## When to pick which

**Selenium is the right call if** you have a large existing Selenium test suite
the team wants to keep, you need the broadest language and browser support
including legacy targets, or you have a hard requirement for an OSS
WebDriver-protocol layer.

**Momentic is the right call if** selector and wait maintenance is a real
recurring cost, you want locators that re-resolve and heal on UI changes instead
of failing, you want AI assertions and visual checks that fail the test by
default, you'd rather author in YAML than maintain a WebDriver codebase, and you
expect healing, recovery, quarantine, and a managed dashboard with run videos
built in.

For the build-it-yourself version of this decision, see
[Build vs. buy](/comparisons/build-vs-buy).