When to reach for agentic testing
- Dynamic flows where the UI changes based on feature flags, A/B tests, or user state
- High-level acceptance checks such as “confirm a new user can sign up and reach the welcome screen”, without prescribing each click
- Exploratory coverage: let the agent probe areas of your app that aren’t worth a dedicated deterministic test
- End-to-end smoke tests after deploys, such as “complete an order”, with the agent handling whatever the current UX looks like
AI action
AI action is the primitive behind agentic testing. It accepts a natural language goal and lets the agent drive the browser or app until the goal is complete.
Version: V2 vs V3
AI action ships in two versions, selectable per step from the Version dropdown in the editor.
- V3 (beta, recommended): a planner-style agent that drafts the full flow up front, caches the resolved steps after the first successful run, and self-heals when a cached step misses. Reruns are faster and more deterministic than V2. V3 also supports optional Pre-condition and Post-condition checks that run as protected guards around the generated flow and cannot be modified by the agent.
- V2: the previous generation. A fully dynamic agent that decides each step at runtime and does not cache the generated trajectory. Kept available as a fallback for flows where V3 does not yet work well.
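The V3 behavior described above (plan once, cache after a successful run, re-plan when a cached step misses, with guards outside the generated flow) can be pictured with a small toy model. This is only a conceptual sketch, not the product's implementation: `CachedPlanRunner`, `plan`, `execute`, and the step strings are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Step = str  # a resolved step, e.g. "click #signup" (hypothetical format)


@dataclass
class CachedPlanRunner:
    """Toy model of a plan-then-cache agent. Every name here is hypothetical."""

    plan: Callable[[str], List[Step]]  # stands in for the expensive planner
    execute: Callable[[Step], bool]    # returns False when a step misses
    cache: Dict[str, List[Step]] = field(default_factory=dict)
    planner_calls: int = 0

    def _draft(self, goal: str) -> List[Step]:
        self.planner_calls += 1
        return self.plan(goal)

    def _replay(self, steps: List[Step]) -> bool:
        # all() stops at the first step that misses.
        return all(self.execute(step) for step in steps)

    def run(
        self,
        goal: str,
        pre: Optional[Callable[[], bool]] = None,
        post: Optional[Callable[[], bool]] = None,
    ) -> bool:
        # Guards live outside the generated flow, so the planner cannot alter them.
        if pre is not None and not pre():
            return False
        steps = self.cache.get(goal)
        if steps is None or not self._replay(steps):
            # No cached plan, or a cached step missed: draft a fresh plan and retry.
            steps = self._draft(goal)
            if not self._replay(steps):
                self.cache.pop(goal, None)
                return False
        if post is not None and not post():
            return False
        self.cache[goal] = steps  # cache only after a fully successful run
        return True


# Usage: a fake UI whose signup button selector changes between runs.
ui = {"button": "#signup"}


def fake_plan(goal: str) -> List[Step]:
    return [f"click {ui['button']}", "type email"]


def fake_execute(step: Step) -> bool:
    # A click only lands if it targets the button that currently exists.
    return not step.startswith("click") or step == f"click {ui['button']}"


runner = CachedPlanRunner(fake_plan, fake_execute)
first = runner.run("sign up")    # drafts a plan and caches it
second = runner.run("sign up")   # replays the cache, no planner call
calls_before_change = runner.planner_calls
ui["button"] = "#join"           # the UI changes under the cached plan
third = runner.run("sign up")    # cached click misses, runner re-plans
blocked = runner.run("sign up", pre=lambda: False)  # guard stops the run
```

In this model the planner is called once for the first two runs and once more after the selector changes, which is the intuition behind faster, more deterministic reruns. A fully dynamic agent in the V2 style would correspond to calling the planner on every run.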
Pairing with assertions
Wrap agentic steps with explicit assertions (AI check, Page check, Element check) so you verify the outcome, not only that the agent “finished”. Agentic steps are flexible but non-deterministic; assertions check the result.
Reliability tips
- Keep goals short and specific. “Sign up a new user with a fresh email” is better than “Test the onboarding flow thoroughly.”
- Provide context the agent can’t infer. If there’s an invite code, pass it in via variables.
- Add a fallback assertion right after the agentic step so failures surface with a meaningful message.
- Combine with Auto-heal and Step cache: the agent’s successful traces are cached and replayed deterministically on subsequent runs.
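To illustrate why an explicit assertion belongs right after an agentic step, here is a minimal Python sketch. It assumes a hypothetical `run_agent` callable that reports whether the agent finished, and a hypothetical `check` callable that inspects the resulting state; neither is a real API of the tool, and the invite code value is made up.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class StepResult:
    passed: bool
    message: str


def agentic_step_with_check(
    goal: str,
    variables: Dict[str, str],
    run_agent: Callable[[str, Dict[str, str]], bool],
    check: Callable[[], bool],
    failure_message: str,
) -> StepResult:
    """Run a (hypothetical) agentic step, then verify the outcome explicitly."""
    finished = run_agent(goal, variables)
    if not finished:
        return StepResult(False, f"agent did not complete goal: {goal!r}")
    # "Finished" only means the agent stopped; the check decides pass or fail.
    if not check():
        return StepResult(False, failure_message)
    return StepResult(True, "ok")


# Usage: a fake app that needs an invite code the agent cannot infer.
page = {"screen": "signup"}


def fake_agent(goal: str, variables: Dict[str, str]) -> bool:
    # Reaches the welcome screen only when the invite code is supplied.
    if variables.get("invite_code") == "WELCOME-123":
        page["screen"] = "welcome"
    return True  # reports completion either way


def on_welcome_screen() -> bool:
    return page["screen"] == "welcome"


goal = "Sign up a new user with a fresh email"
message = "expected the welcome screen after signup"

without_code = agentic_step_with_check(goal, {}, fake_agent, on_welcome_screen, message)
with_code = agentic_step_with_check(
    goal, {"invite_code": "WELCOME-123"}, fake_agent, on_welcome_screen, message
)
```

In the first call the fake agent reports completion even though signup never reached the welcome screen, so only the trailing check catches the failure and attaches a meaningful message. The second call passes because the missing context was supplied through variables.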