Finding elements
Guide to targeting elements reliably with Momentic’s AI
Modalities
Momentic’s AI is trained to understand HTML, the accessibility tree, and screenshots. Use a combination of these three modalities to describe the elements you want to interact with.
Accessibility attributes
Leveraging accessibility attributes such as aria-roles, aria-labels, and alt text is an existing best practice in testing. If your app is not accessible, we recommend adding these attributes to improve UX and comply with ADA standards.
The accessibility tree is how screen readers and other assistive technologies navigate web pages. As such, the best descriptions reference elements using their accessibility attributes.
We recommend mentioning the desired element’s accessibility role in the description, such as “the submit button”, “the homepage link”, “the password input”, “the ‘People’ tab”, or “the combobox for selecting a month”.
If you are unsure which attributes to use, you can inspect the element in Chrome DevTools to find out.
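To audit a page fragment for the attributes mentioned above, a short script can help. The following is a minimal sketch using only Python's standard `html.parser`; the attribute list `A11Y_ATTRS`, the class name, and the sample markup are assumptions made for illustration and do not describe how Momentic itself reads the page.

```python
from html.parser import HTMLParser

# Attributes treated as accessibility hints in this sketch.
# This list is an assumption for illustration, not Momentic's actual list.
A11Y_ATTRS = ("role", "aria-label", "alt")


class A11yAttrCollector(HTMLParser):
    """Collects (tag, {attribute: value}) for elements carrying a11y attributes."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        hints = {name: value for name, value in attrs if name in A11Y_ATTRS}
        if hints:
            self.found.append((tag, hints))


def collect_a11y_attrs(html):
    parser = A11yAttrCollector()
    parser.feed(html)
    parser.close()
    return parser.found


# Hypothetical markup used only for this example.
sample = """
<div role="tab" aria-label="People">People</div>
<img src="logo.png" alt="Company logo">
<span>plain text</span>
"""
found = collect_a11y_attrs(sample)
for tag, hints in found:
    print(tag, hints)
```

An element that shows up here with a role and a label, such as the `tab` labelled "People", is a good candidate for a role-based description like “the ‘People’ tab”.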
HTML content and attributes
If the desired element contains text content, you can also include a subset or summary of the text in your description. For example, “the ‘Welcome’ header” or “the text area containing ‘Lorem Ipsum’”.
Momentic’s AI also looks at certain high-signal HTML attributes. Values that are flagged as auto-generated, sensitive, or lengthy are ignored.
Exact matches
Momentic’s AI is trained to treat single quotation marks as an exact match. As such, the description “the ‘Submit’ button” will never match a button with content ‘Approve’.
If quotation marks are not provided, Momentic’s AI will try to use the context available on the page to infer the correct element.
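One way to picture the exact-match rule is as a filter over candidate elements. The sketch below is an illustration only, not Momentic's implementation: the helper names are hypothetical, and it assumes that "exact match" means the quoted text must appear verbatim (case-sensitive) in the element's text.

```python
import re

# Text wrapped in straight or curly single quotes. The lookarounds keep
# apostrophes inside words (e.g. "today's") from being read as quotes.
QUOTED = re.compile(r"(?<!\w)['‘]([^'‘’]+)['’](?!\w)")


def quoted_terms(description):
    """Return the quoted terms found in a description, in order."""
    return QUOTED.findall(description)


def filter_candidates(term, candidate_texts):
    """Keep candidates whose text contains the quoted term verbatim.

    Assumption: verbatim, case-sensitive containment stands in for the
    exact-match behaviour described above.
    """
    return [text for text in candidate_texts if term in text]


description = "the ‘Submit’ button"
buttons = ["Approve", "Submit", "Cancel"]  # hypothetical button labels

term = quoted_terms(description)[0]
matches = filter_candidates(term, buttons)
print(matches)
```

Under these assumptions the ‘Approve’ button is filtered out before any further inference happens, which mirrors the behaviour described for quoted text.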
Visual information
Momentic’s AI can access the HTML and accessibility tree for the entire page, not just the currently visible viewport. However, it can only access visual information from the visible viewport, because it relies on a screenshot of the page.
Momentic’s AI is trained to understand visual information like colors, shapes, sizes, and positions. For example, you can say “the red ‘Cancel’ button”, “the thumbs up icon”, or “the hamburger menu beside the logo”.
Due to image quality constraints, we recommend against relying on visual information alone to identify elements smaller than 12x12 pixels. Vision-based descriptions are also less reliable when the element is embedded among many visually similar elements (e.g. a toolbar full of small icons).
Positional cues
When there are multiple instances of the same element (e.g. in a data table or list), you will need to disambiguate which instance you want to interact with.
Absolute locations
You can provide absolute cues such as “…at the top of the page” or “…in the bottom right corner”.
Relative locations
You can provide relative cues such as “…below the ‘Introduction’ header” or “…beside the ‘Back’ button”.
Use the inside keyword to specify the parent element of another element. For example, “the ‘Submit’ button inside the user dialog”.
Use the closest keyword to specify the closest element to another element. For example, “the closest ‘Submit’ button to the ‘Username’ input”.
Ordering
Use the nth keyword to specify the position of an element in a list. For example, “the second ‘Submit’ button”.
Since the order of elements on the page can change, make sure to turn off caching when you use nth. This ensures that Momentic’s AI always targets the correct element.
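If you generate descriptions programmatically (for example, one per table row), a small helper can keep the positional keywords consistent. The sketch below is a hypothetical string builder, not part of Momentic; the function name `describe` and its parameters are assumptions, and it only assembles description text in the patterns shown in this section.

```python
# Ordinal words used for the nth keyword; extend as needed.
ORDINALS = ["first", "second", "third", "fourth", "fifth"]


def describe(role, text=None, nth=None, inside=None, closest_to=None):
    """Build a description such as "the second ‘Submit’ button".

    Hypothetical helper: every parameter name is an assumption made for
    this example.
    """
    parts = ["the"]
    if closest_to:
        parts.append("closest")
    if nth is not None:
        if not 1 <= nth <= len(ORDINALS):
            raise ValueError(f"nth must be between 1 and {len(ORDINALS)}")
        parts.append(ORDINALS[nth - 1])
    if text:
        # Single quotation marks request an exact match on the text.
        parts.append(f"‘{text}’")
    parts.append(role)
    description = " ".join(parts)
    if closest_to:
        description += f" to {closest_to}"
    if inside:
        description += f" inside {inside}"
    return description


print(describe("button", text="Submit", inside="the user dialog"))
print(describe("button", text="Submit", nth=2))
print(describe("button", text="Submit", closest_to="the ‘Username’ input"))
```

The three calls reproduce the example descriptions used in this section for the inside, nth, and closest keywords.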
Dynamic elements
You can target random and dynamic elements with natural language descriptions. For example, “a random card” or “today’s date in the calendar”.
Since the target element will change every time, make sure to turn off caching. This ensures that Momentic’s AI always targets the correct element.
Examples
Putting it all together, here are some examples of how to describe the labelled elements in the screenshot above:
- “the avatar picture in the top left corner”
- “the home link in the sidebar”
- “the ‘Amount’ filter menu”
- “the notification bell icon to the right of the green button”
- “the third outgoing transaction card in the list”
Troubleshooting
- Dynamic elements: If you use nth or dynamic elements in a description, make sure to turn off caching. This will ensure Momentic’s AI will always target the correct element.
- Hidden elements: Momentic’s AI will ignore elements that are explicitly marked as aria-hidden (or are part of sub-trees that are hidden), are not attached to the DOM, or have no bounding box.
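When a description unexpectedly fails to match, it can help to check whether the target sits inside a hidden sub-tree. The following is a minimal sketch using Python's standard `html.parser`; it covers only the `aria-hidden="true"` case (not detached nodes or missing bounding boxes), and the class name and sample markup are assumptions for illustration rather than Momentic's own logic.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed
# onto the open-element stack.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class VisibleIdCollector(HTMLParser):
    """Collects ids of elements that are not inside an aria-hidden sub-tree."""

    def __init__(self):
        super().__init__()
        self.hidden_stack = []  # one flag per open element
        self.visible_ids = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        parent_hidden = self.hidden_stack[-1] if self.hidden_stack else False
        hidden = parent_hidden or attr_map.get("aria-hidden") == "true"
        if not hidden and "id" in attr_map:
            self.visible_ids.append(attr_map["id"])
        if tag not in VOID_TAGS:
            self.hidden_stack.append(hidden)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.hidden_stack:
            self.hidden_stack.pop()


# Hypothetical markup used only for this example.
sample = """
<div id="menu" aria-hidden="true"><button id="close">Close</button></div>
<button id="submit">Submit</button>
<img id="logo" alt="Logo">
"""
collector = VisibleIdCollector()
collector.feed(sample)
collector.close()
visible_ids = collector.visible_ids
print(visible_ids)
```

In this sample the `close` button is skipped because its parent is marked `aria-hidden`, even though the button itself carries no such attribute.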