Configure
Set the version of each agent under ai.agentConfig in
momentic.config.yaml.
v3 is recommended for new projects. Omit an agent to use the organization
default.
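As a rough sketch, a configuration could look like the block below. The ai.agentConfig nesting and the agent names come from this page; the exact value format (written here as v3 and v2) is an assumption, so verify it against your project's config schema. The text-extraction agent is left out on purpose, so it would fall back to the organization default.

```yaml
# momentic.config.yaml (illustrative sketch; value format is assumed)
ai:
  agentConfig:
    locator: v3            # Click, Type, and Element check steps
    assertion: v3          # AI check steps
    visual-assertion: v2   # pinned to an older version as an example
    # text-extraction omitted: uses the organization default
```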
Agents
| Agent | Powers |
|---|---|
| locator | Click, Type, and Element check steps |
| assertion | AI check steps |
| visual-assertion | Visual assertions against a screenshot |
| text-extraction | AI extract steps |
Locator
Locates elements on the page from a natural language description.
- v1: 2024 models. Lenient matching, picks candidates that fulfill the same function as the description.
- v2: early-2025 models. Strict about relative positioning (“the link in the header”) and single-quoted text (“the ‘Submit’ button”).
- v3 (recommended): latest 2025 models. Smarter caches that store referenced attributes and relative elements, so meaningful changes invalidate the cache. Writes clearer reasoning for chosen elements.
Assertion
Evaluates natural language statements against a snapshot of the page.
- v1: 2024 models. Lenient, evaluates statements as “overall true/false”.
- v2: early-2025 models. Stronger at sorted order, visual positioning, and relative criteria. Strict about single-quoted text.
- v3 (recommended): latest 2025 models.
Visual assertion
Evaluates natural language statements purely from a viewport screenshot.
- v1: 2024 models. Faster (~3s avg). Lenient on color assertions.
- v2: early-2025 models. Slower but sharper on small visual details, logical reasoning, negations, and container-scoped assertions. Strict about “exactly” and single-quoted text.
- v3 (recommended): latest 2025 models.
Text extraction
Extracts structured data from the page given a JSON schema.
- v1: 2024 models.
- v2: early-2025 models. Tighter schema adherence; throws if extracted data doesn’t conform. Understands the pattern JSON schema option.
- v3 (recommended): latest 2025 models. Handles nested objects and arrays. Better at following transformation instructions before returning data.