Momentic is a managed testing platform for iOS and Android. Tests are written in YAML and executed on managed remote emulators and simulators. A multi-modal step cache stores locator metadata per step and auto-heals in place when the UI drifts. AI primitives cover action, assertion, visual diff, and typed extraction. AI provider calls are routed with cross-provider failover. A dashboard captures run videos, view hierarchies, heal events, and AI reasoning.

Appium is an open-source mobile automation framework that exposes the WebDriver / W3C protocol across iOS, Android, and other platforms via swappable drivers (UiAutomator2, XCUITest, Espresso, Flutter, Mac, Windows). Tests are written in TypeScript, Python, Java, Ruby, or .NET against the Appium client of choice. It’s well-suited to teams that want OSS, multi-language flexibility, physical-hardware support, and deep customization at the driver level, and that have the bandwidth to maintain a verbose, locator-heavy codebase.

Speed and caching

| | Momentic | Appium |
| --- | --- | --- |
| What’s cached | Multi-modal locator data per step. | Nothing. Every locator strategy re-queries the device. |
| Heal on miss | Re-resolves and updates the entry in place. Heal event on the run. | Not supported. A miss is a test failure after the WebDriverWait timeout. |
| Waiting | Built-in: navigation, load, screenshots, DOM / view-hierarchy mutations, same-origin requests. 3s default, configurable. | implicitlyWait (per-driver) or explicit WebDriverWait + ExpectedConditions per query. Boilerplate per step. |
| Storage | Managed, git-aware. | N/A. |
| Cost of a UI change | Auto-heal absorbs renamed IDs, localized strings, reordered hierarchies. | One client-code edit per broken selector. XPath edits often require full re-derivation. |

How the multi-modal cache works

A cached step stores more than one way to find the target: where it sits on screen, what it looks like, what text it contains, and the accessibility and structural attributes around it. Which of those signals matters for a given step is inferred from the natural-language description. “The red Cancel button below the Order Summary header” leans on visual and positional signals; “the Sign in button” leans on accessibility and text. When a step replays, the runner checks the stored signals against the live UI and runs the action without invoking the LLM when there’s a match.
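The replay check described above can be sketched as follows. This is an illustration only: the type names, the signal list, and the exact-equality comparison are all assumptions made for the example, not Momentic's cache format or matching logic.

```typescript
// Hypothetical shapes for illustration; none of these names come from Momentic.
type Signal = "position" | "appearance" | "text" | "accessibility" | "structure";

interface CachedStep {
  description: string;
  // Value stored for each signal on the last passing run.
  signals: Record<Signal, string>;
  // Signals that matter for this step (assumed to be inferred from the description).
  relevant: Signal[];
}

type LiveCandidate = Record<Signal, string>;

// A candidate matches when every relevant signal agrees with the cached value.
function matchesCache(step: CachedStep, candidate: LiveCandidate): boolean {
  return step.relevant.every((signal) => step.signals[signal] === candidate[signal]);
}

// Returns the first matching live element, or undefined on a cache miss.
function replayFromCache(
  step: CachedStep,
  live: LiveCandidate[],
): LiveCandidate | undefined {
  return live.find((candidate) => matchesCache(step, candidate));
}
```

Under these assumptions, a step described as “the Sign in button” would list only accessibility and text as relevant, so moving the button on screen would still be a cache hit, while changing its label would be a miss.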

What happens on a UI change

The following sequence exercises the headline difference. Take a sign-in screen whose Email field has accessibilityIdentifier = "email_input", and assume a passing baseline run has already warmed the cache. Then the app team refactors: email_input is renamed to email_field, and the XPath position of the field shifts because a container was added above it.
Appium replay:
  1. driver.$("~email_input") issues a findElement request against the device. The driver waits up to implicitlyWait (or the configured WebDriverWait timeout) for the element to appear.
  2. The timeout elapses with no match. The client reports a "no such element" error (NoSuchElementError or NoSuchElementException, depending on the client).
  3. The test stops; the CI job is red. Someone edits the client code to use the new accessibility ID (or rewrites the XPath against the new hierarchy), opens a PR, gets it merged, and re-runs CI. If the XPath path was deep, the edit can cascade across multiple steps.
Momentic replay:
  1. The cached locator for the Email step misses on the live device.
  2. The locator agent re-resolves the step's original natural-language description, “Email”.
  3. The new locator binds, the step runs, the test passes.
  4. The cache entry is updated in place. A heal event is attached to the run for review. Subsequent runs hit the cache normally.
Across a suite this is the difference between a renamed-ID incident and a no-op.
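The Momentic replay steps above amount to a heal-on-miss loop, which can be sketched as below. The function, the plain Map used as a cache, and the resolve callback standing in for the locator agent are all hypothetical; the sketch shows only the control flow, not how Momentic implements it.

```typescript
interface ReplayResult {
  locator: string;
  healed: boolean; // true when the cache entry had to be re-resolved
}

// cache: description -> last known element ID (a deliberately simplified cache).
// resolve: stand-in for the locator agent; returns a fresh ID or undefined.
function replayStep(
  cache: Map<string, string>,
  description: string,
  liveIds: ReadonlySet<string>,
  resolve: (description: string, liveIds: ReadonlySet<string>) => string | undefined,
): ReplayResult {
  const cached = cache.get(description);
  if (cached !== undefined && liveIds.has(cached)) {
    // Cache hit: run the action without invoking the resolver.
    return { locator: cached, healed: false };
  }
  // Cache miss: re-resolve from the natural-language description.
  const fresh = resolve(description, liveIds);
  if (fresh === undefined) {
    throw new Error(`No element matches "${description}"`);
  }
  // Update the entry in place so later runs hit the cache again.
  cache.set(description, fresh);
  return { locator: fresh, healed: true };
}
```

With a cache entry of Email -> email_input and a live UI that only contains email_field, the first call heals and rewrites the entry; the second call is an ordinary cache hit.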

Smart waiting

Momentic’s default smart wait is 3000ms and configurable per test. The runner waits on a combination of navigation, load, screenshots, DOM / view-hierarchy mutations, and same-origin requests until the UI is quiet or the timeout elapses.

Appium waiting, for contrast
  • implicitlyWait is per-driver. Too low -> flakes; too high -> padded runs.
  • Explicit waits (WebDriverWait + ExpectedConditions.visibilityOfElementLocated, presenceOfElementLocated, elementToBeClickable) are per query. It is common to layer 3-5 explicit waits per logical step.
  • No notion of network quiescence; teams instrument their own request interceptors or poll the UI.
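The “quiet or timeout” rule described for smart waiting can be modeled as a small function over a timeline of UI activity. The quietWindowMs parameter is an assumption introduced for this sketch (the source only states the 3000ms default timeout), and the function is a model of the idea, not Momentic's implementation.

```typescript
// activityMs: timestamps (ms since the wait began) of observed activity such as
// navigations, hierarchy mutations, or requests.
// quietWindowMs: hypothetical length of inactivity that counts as "quiet".
// Returns the time at which the wait would return.
function settleTime(
  activityMs: number[],
  quietWindowMs: number,
  timeoutMs = 3000,
): number {
  let lastActivity = 0;
  for (const t of [...activityMs].sort((a, b) => a - b)) {
    if (t - lastActivity >= quietWindowMs) break; // gap was long enough: UI went quiet
    if (t >= timeoutMs) break; // activity past the timeout is never observed
    lastActivity = t;
  }
  // Quiet is declared one window after the last activity, capped by the timeout.
  return Math.min(lastActivity + quietWindowMs, timeoutMs);
}
```

The contrast with a fixed implicit wait is that a short burst of activity returns early, while continuous activity is cut off at the timeout rather than padding every step.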

Locators and AI primitives

| | Momentic | Appium |
| --- | --- | --- |
| Locator model | Natural-language descriptions resolved by an AI agent against a11y tree + view hierarchy + screenshot. Cached, auto-healed. | accessibility id, id, xpath, class name, -android uiautomator (UiSelector), -ios predicate string, -ios class chain, -image (template match). |
| Visual cues | Color, icon, relative size, and position are part of the locator. | -image strategy is template matching only (no semantic context). |
| Agentic step | act accepts a multi-step goal; the agent plans and executes. | Not supported. |
| AI assert | assert is a first-class step type; a failed assertion fails the test by default. | Not built-in. Teams build with getText + manual checks. |
| Visual diff | assertVisual, agent-scored against a golden. | Not built-in. Third-party plugins (e.g. Applitools) bolt on. |
| AI provider | Managed; cross-provider failover handled by the platform. | None. Teams integrate LLMs themselves. |

Momentic mobile step types
  • Action: act, tap, doubleTap, longPress, type, swipe, scroll, back, dismissKeyboard, launchApp, terminateApp
  • Assert: assert, assertVisual, checkElement<...>
  • Extract: extract (typed via JSON schema)
  • Control flow: if/then/else, modules, parameter inputs
Appium locator trade-offs, for contrast
  • accessibility id is the most stable strategy but only exists when developers explicitly set contentDescription (Android) / accessibilityIdentifier (iOS). Production apps frequently miss them on dynamic content.
  • id (resource-id on Android) breaks under refactors and A/B testing.
  • xpath is the catch-all but is slow on large hierarchies and breaks on any structural change.
  • -image does template matching; works for static images, fails on themed UIs.
  • No locator strategy carries semantic intent. A failing step has no description to recover from.
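For reference, here is how several of the strategies above are typically spelled as selector strings in WebdriverIO, the client used in the authoring example. The helper and its Strategy type are hypothetical; the prefixes follow WebdriverIO's mobile selector conventions as I understand them and are worth checking against the docs for your client version.

```typescript
type Strategy =
  | "accessibility id"
  | "xpath"
  | "-android uiautomator"
  | "-ios predicate string"
  | "-ios class chain";

// Maps an Appium locator strategy and value to a WebdriverIO selector string.
function toWdioSelector(strategy: Strategy, value: string): string {
  switch (strategy) {
    case "accessibility id":
      return `~${value}`; // e.g. ~email_input
    case "xpath":
      return value; // XPath expressions are passed through unchanged
    case "-android uiautomator":
      return `android=${value}`; // value is a UiSelector expression
    case "-ios predicate string":
      return `-ios predicate string:${value}`;
    case "-ios class chain":
      return `-ios class chain:${value}`;
  }
}
```

Each resulting string identifies an element by attribute or structure only, which is the point of the last bullet: nothing in the selector records what the step was meant to do.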

Recovery, quarantine, and CI

| | Momentic | Appium |
| --- | --- | --- |
| Failure recovery | LLM agent proposes test edits in the dashboard. | Not supported. |
| Quarantine | First-class: tests run, results report, exit code unaffected unless --only-quarantined. | Not supported. |
| Sharding | --shard-index <i> / --shard-count <n>, 1-indexed. Deterministic alphabetical partition. | Owned by the host runner (pytest, Mocha, JUnit). |
| Reporters | junit, allure, playwright-json, buildkite-json. | Whatever the host runner emits (Allure, ExtentReports). |
| Device fleet | Remote Android 14/15 emulators and iOS 26 simulators with sub-1s provisioning, multi-region. Local AVDs / simulators supported. | Bring-your-own: local devices, Appium server cluster, or device cloud (BrowserStack, Sauce, LambdaTest, AWS Device Farm). |
| Dashboard | Run videos, traces, heal events, AI reasoning, screenshots, network. | Third-party (Allure, vendor dashboards). |

Sharding: --shard-index <i> / --shard-count <n>. Deterministic, contiguous partition of the suite.

Device provisioning: each test gets its own device session, so parallel runs don’t share device state. No per-test execution cap.

Appium grid, for contrast: Appium servers run locally or in a Selenium Grid. Device farms layer their own provisioning on top. Common pain points: stale UI hierarchies between sessions, dangling driver processes, capability drift across drivers and OS versions.
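The sharding rule above (1-indexed, alphabetical, contiguous, deterministic) can be illustrated with a short partition function. How the remainder is distributed when the suite does not divide evenly is an assumption here (the ceiling-sized chunks below are one plausible choice), and the function name is hypothetical.

```typescript
// Returns the test files belonging to shard `shardIndex` of `shardCount`.
// shardIndex is 1-indexed, matching --shard-index / --shard-count.
function selectShard(
  testFiles: string[],
  shardIndex: number,
  shardCount: number,
): string[] {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  if (!Number.isInteger(shardIndex) || shardIndex < 1 || shardIndex > shardCount) {
    throw new RangeError("shardIndex must be between 1 and shardCount");
  }
  // Sorting first makes the partition independent of discovery order.
  const sorted = [...testFiles].sort();
  // Assumed remainder policy: equal ceiling-sized chunks, last shard may be short.
  const chunkSize = Math.ceil(sorted.length / shardCount);
  const start = (shardIndex - 1) * chunkSize;
  return sorted.slice(start, start + chunkSize);
}
```

Because the input is sorted before slicing, every CI worker computes the same partition from the same file list, and the shards together cover each test exactly once.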

Authoring side-by-side

// Appium with WebdriverIO
import { remote } from "webdriverio";

const driver = await remote({
  hostname: "localhost",
  port: 4723,
  capabilities: {
    platformName: "Android",
    "appium:deviceName": "emulator-5554",
    "appium:automationName": "UiAutomator2",
    "appium:app": "/path/to/app.apk",
  },
});

const email = await driver.$("~email_input"); // accessibility id
await email.waitForDisplayed({ timeout: 10_000 });
await email.setValue("[email protected]");
const password = await driver.$("~password_input");
await password.setValue("secret");
const signIn = await driver.$('//*[@text="Sign in"]'); // XPath
await signIn.click();
await driver.waitUntil(
  async () => (await driver.$('//*[contains(@text, "Welcome")]')).isDisplayed(),
  { timeout: 10_000 },
);
// "Chart visible and not cut off" requires custom logic.
await driver.deleteSession();
Agentic v2:
fileType: momentic/test/v2
id: sign-in-and-verify
steps:
  - act: Sign in with [email protected] / secret
  - assert: The dashboard chart is visible and not cut off
Explicit v2 (same flow, step-by-step):
fileType: momentic/test/v2
id: sign-in-and-verify
steps:
  - type:
      text: [email protected]
      into: Email
  - type:
      text: secret
      into: Password
  - tap: Sign in
  - assert: The dashboard chart is visible and not cut off

A more realistic test

The hello-world above shows only a small part of the v2 surface. A representative onboarding regression with module reuse, parameter inputs, typed extraction, and a conditional looks like this:
onboarding.test.yaml
fileType: momentic/test/v2
id: onboarding-with-promo
steps:
  - launchApp
  - module:
      path: ../modules/sign-in.module.yaml
      inputs:
        email: "{{ env.QA_EMAIL }}"
        password: "{{ env.QA_PASSWORD }}"
  - act: Skip the onboarding tour and land on Home
  - tap: Account
  - type:
      text: "{{ env.PROMO_CODE }}"
      into: Promo code field
  - tap: Apply
  - if:
      condition:
        assert: A success banner saying the promo was applied is visible
      then:
        - extract:
            goal: The discounted monthly total shown on the plan card
            schema:
              type: object
              properties:
                amount:
                  type: number
              required: [amount]
      else:
        - assert: An invalid-promo error is visible
  - assertVisual:
      that: The plan card is fully visible and not cut off
The matching module:
../modules/sign-in.module.yaml
fileType: momentic/module/v2
moduleId: sign-in
name: Sign in
steps:
  - type:
      text: "{{ inputs.email }}"
      into: Email
  - type:
      text: "{{ inputs.password }}"
      into: Password
  - tap: Sign in
  - assert: The Home tab is visible
There is no equivalent first-class surface in Appium. Reuse means extracting host-language helpers; extraction is whatever the client code implements; conditionals are plain if statements in the host language; visual assertions need a third-party plugin.
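As a sketch of what “host-language helpers” means in practice, the sign-in module could be approximated in a WebdriverIO suite by a plain function. The DriverLike and ElementLike interfaces are minimal stand-ins defined here so the example is self-contained; in a real suite the argument would be the object returned by remote(). The selectors are the ones from the authoring example above.

```typescript
// Minimal stand-ins for the parts of the WebdriverIO API this helper touches.
interface ElementLike {
  waitForDisplayed(options: { timeout: number }): Promise<unknown>;
  setValue(value: string): Promise<unknown>;
  click(): Promise<unknown>;
}

interface DriverLike {
  $(selector: string): Promise<ElementLike>;
}

// Reusable sign-in flow: the Appium-side counterpart of a module with inputs.
async function signIn(
  driver: DriverLike,
  email: string,
  password: string,
): Promise<void> {
  const emailField = await driver.$("~email_input"); // accessibility id
  await emailField.waitForDisplayed({ timeout: 10_000 });
  await emailField.setValue(email);

  const passwordField = await driver.$("~password_input");
  await passwordField.setValue(password);

  const signInButton = await driver.$('//*[@text="Sign in"]'); // XPath
  await signInButton.click();
}
```

The helper gives reuse and parameters, but every selector inside it is still a hard-coded string that has to be edited by hand when the UI changes.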

When to pick which

Appium is the right call if you have an existing Appium suite the team wants to keep, you have a hard requirement for an OSS WebDriver-protocol layer, you need multi-language clients, or you do deep customization at the driver level (custom plugins, native command extensions).

Momentic is the right call if wall-clock run time matters at scale, selector maintenance is a real recurring cost, you want AI assertions that fail the test by default, you’d rather author in YAML than maintain a multi-language WebDriver codebase, and you expect healing, recovery, quarantine, sub-second emulator boots, and run videos out of the box.