Natural Language Actions

The page.act() method allows you to control the browser using natural language instructions. The AI interprets your instruction, determines what actions to take, and executes them automatically.

Signature

page.act(instruction: string): Promise<ActResult>

Parameters:

instruction - Natural language description of what you want to do

Returns: Promise<ActResult>

{
  success: boolean
  actions: Array<{
    type: string        // 'click', 'input', 'goto', etc.
    params: unknown[]
    result?: unknown
    error?: string
  }>
  reasoning?: string
}
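The result shape above can be written as TypeScript interfaces so that results are type-checked when passed around. The sketch below transcribes the fields from the listing. The names `ActAction` and `summarizeActions`, the selectors, and the sample value are illustrative assumptions, not SDK exports or real output.

```typescript
// Fields transcribed from the ActResult listing above.
interface ActAction {
  type: string        // 'click', 'input', 'goto', etc.
  params: unknown[]
  result?: unknown
  error?: string
}

interface ActResult {
  success: boolean
  actions: ActAction[]
  reasoning?: string
}

// Hypothetical helper (not part of the SDK): one readable line per action.
function summarizeActions(result: ActResult): string[] {
  return result.actions.map((action) => {
    const call = `${action.type}(${action.params.map(String).join(', ')})`
    return action.error ? `${call} failed: ${action.error}` : `${call} ok`
  })
}

// Hand-written sample value for illustration only.
const sample: ActResult = {
  success: false,
  actions: [
    { type: 'input', params: ['#email', 'a@b.c'] },
    { type: 'click', params: ['#submit'], error: 'element not found' },
  ],
}

console.log(summarizeActions(sample))
```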

How It Works

When you call page.act():

1. The AI captures the current page state (DOM, visible elements, form fields, etc.)
2. Your instruction is sent to the AI provider along with the page context
3. The AI determines which actions to perform (click, input, scroll, etc.)
4. The actions are executed sequentially on the page
5. A result is returned with details about what was done
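The steps above can be modelled as a small pipeline. This is a conceptual sketch only, not the SDK's implementation: `actSketch`, `ActDeps`, and its three callbacks are hypothetical names, and stopping at the first failing action is an assumption about behavior the page does not specify.

```typescript
// All names here are hypothetical; this models the described steps, not SDK internals.
type PlannedAction = { type: string; params: unknown[] }
type ExecutedAction = PlannedAction & { result?: unknown; error?: string }

interface ActDeps {
  captureState: () => Promise<string>                                     // step 1
  plan: (instruction: string, state: string) => Promise<PlannedAction[]>  // steps 2-3
  execute: (action: PlannedAction) => Promise<unknown>                    // step 4
}

async function actSketch(
  instruction: string,
  deps: ActDeps
): Promise<{ success: boolean; actions: ExecutedAction[] }> {
  const state = await deps.captureState()
  const planned = await deps.plan(instruction, state)

  const actions: ExecutedAction[] = []
  let success = true
  for (const action of planned) {
    try {
      // Actions run one at a time, in the order the planner returned them.
      actions.push({ ...action, result: await deps.execute(action) })
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err)
      actions.push({ ...action, error: message })
      success = false
      break // assumption: stop at the first failing action
    }
  }
  return { success, actions } // step 5
}
```

Passing the three stages in as callbacks keeps the sketch runnable with stubs, which is also a convenient way to unit-test code that consumes an `ActResult`-shaped value.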

Basic Examples

Simple Click Actions

await page.goto('https://example.com')

// Click a button
await page.act('Click the sign up button')

// Click a link
await page.act('Click the privacy policy link')

// Click by description
await page.act('Click the blue submit button at the bottom')

Form Interactions

// Fill a single field
await page.act('Fill in the email field with user@example.com')

// Fill multiple fields
await page.act('Enter username "johndoe" and password "secret123"')

// Select from dropdown
await page.act('Select "United States" from the country dropdown')

// Check a checkbox
await page.act('Check the "I agree to terms" checkbox')

Intermediate Examples

Multi-Step Workflows

await page.goto('https://example.com/login')

// AI performs multiple actions in sequence
await page.act('Fill in the login form with username "john" and password "pass123", then click login')

Conditional Actions

await page.goto('https://example.com/products')

// AI finds and interacts with dynamic elements
await page.act('If there is a cookie consent banner, click accept')

await page.act('Find the product with the lowest price and click on it')

Navigation and Scrolling

// Scroll to find elements
await page.act('Scroll down to the footer and click the contact link')

// Navigate through pages
await page.act('Click the next page button')

Advanced Examples

Complex Interactions

await page.goto('https://example.com/checkout')

// Multi-step form completion
const result = await page.act(`
  Fill out the shipping form with:
  - Name: John Doe
  - Address: 123 Main St
  - City: San Francisco
  - ZIP: 94102
  Then click continue
`)

if (result.success) {
  console.log('Form completed successfully')
  console.log('Actions taken:', result.actions)
}
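When the form values already live in your code, the multi-line instruction can be generated instead of typed by hand. The helper below is a hypothetical convenience, not part of the SDK. It only builds a string in the same layout as the example above.

```typescript
// Hypothetical helper: builds a multi-line instruction from labelled values.
function buildFormInstruction(
  formName: string,
  fields: Record<string, string>,
  finalStep: string
): string {
  const lines = Object.keys(fields).map((label) => `- ${label}: ${fields[label]}`)
  return [`Fill out the ${formName} with:`, ...lines, `Then ${finalStep}`].join('\n')
}

const shippingInstruction = buildFormInstruction(
  'shipping form',
  { Name: 'John Doe', Address: '123 Main St', City: 'San Francisco', ZIP: '94102' },
  'click continue'
)

console.log(shippingInstruction)
// Pass the string to the AI as usual:
// const result = await page.act(shippingInstruction)
```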

Dynamic Content Handling

// Wait and interact with loaded content
await page.act('Wait for the search results to load, then click the first result')

// Handle popups
await page.act('If a signup popup appears, close it')

// Interact with carousels
await page.act('Click the right arrow on the image carousel 3 times')

Search and Filter

await page.goto('https://example.com/products')

await page.act('Enter "laptop" in the search box and press enter')
await page.act('Filter results by price: $500-$1000')
await page.act('Sort by highest rating')

Best Practices

Write Clear Instructions

Be specific and descriptive. Instead of “click the button”, say “click the blue submit button at the bottom of the form”.

Good:

await page.act('Click the red "Add to Cart" button next to the product image')
await page.act('Fill in the email field in the login form with user@example.com')

Bad:

await page.act('Click it') // Too vague
await page.act('Do the thing') // Not specific

Break Down Complex Tasks

For very complex workflows, break them into steps:

// Instead of one giant instruction, issue one focused call per step
await page.act('Fill in the form')
await page.act('Select shipping method')
await page.act('Click continue to payment')
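Splitting a workflow into steps also makes it possible to stop as soon as one step fails. The sketch below assumes only that `page.act()` resolves to an object with a `success` flag, as documented above. `runSteps` and `StepPage` are hypothetical names.

```typescript
// Minimal view of the page object: only the documented act() method is assumed.
interface StepPage {
  act(instruction: string): Promise<{ success: boolean }>
}

// Hypothetical helper: runs instructions in order and stops at the first failure.
// Returns the index of the failed step, or -1 if every step succeeded.
async function runSteps(page: StepPage, steps: string[]): Promise<number> {
  let index = 0
  for (const step of steps) {
    const result = await page.act(step)
    if (!result.success) return index
    index++
  }
  return -1
}

// Usage, with the steps from the example above:
// const failedAt = await runSteps(page, [
//   'Fill in the form',
//   'Select shipping method',
//   'Click continue to payment',
// ])
```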

Handle Errors

Check the result and handle failures:

const result = await page.act('Click the checkout button')

if (!result.success) {
  console.error('Action failed:', result.actions)
  // Try alternative approach or manual automation
}
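One way to structure the fallback is a small wrapper that tries the AI action first and clicks a known selector only if it reports failure. This is a sketch under assumptions: `actOrClick` is a hypothetical helper, the return type of `page.click()` is not documented here and is typed as `Promise<unknown>`, and the wrapper handles only `success: false`, not a rejected promise.

```typescript
// Only the two methods used on this page are assumed to exist.
interface FallbackPage {
  act(instruction: string): Promise<{ success: boolean }>
  click(selector: string): Promise<unknown> // return type is an assumption
}

// Hypothetical helper: reports which path ended up handling the click.
async function actOrClick(
  page: FallbackPage,
  instruction: string,
  selector: string
): Promise<'ai' | 'selector'> {
  const result = await page.act(instruction)
  if (result.success) return 'ai'

  // The AI action failed, so fall back to selector-based automation.
  await page.click(selector)
  return 'selector'
}

// Usage ('#checkout' is a made-up selector):
// const handledBy = await actOrClick(page, 'Click the checkout button', '#checkout')
```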

Combine with Manual Automation

Mix AI actions with traditional automation for best results:

// Use AI for dynamic interactions
await page.act('Accept cookie consent if it appears')

// Use manual automation for precise control
await page.click('#precise-selector')
await page.input('#email', 'user@example.com')

// Use AI for interactions that are hard to express as a selector
await page.act('Find and click the product with best rating')

Common Use Cases

Authentication Flows

await page.goto('https://example.com/login')
await page.act('Click the "Sign in with Google" button')
// Handle OAuth popup...
await page.act('Click the authorize button')

Form Filling

await page.act(`
  Complete the contact form with:
  Name: John Doe
  Email: user@example.com
  Message: I would like more information
  Then submit the form
`)

Shopping and E-commerce

await page.goto('https://shop.example.com')
await page.act('Search for "wireless headphones"')
await page.act('Filter by price under $100')
await page.act('Click on the product with the highest rating')
await page.act('Select black color and add to cart')

Content Navigation

await page.goto('https://news.example.com')
await page.act('Find and click the article about technology')
await page.act('Scroll down and click "Load more comments"')

Error Handling

The ActResult includes details about what happened:

const result = await page.act('Click the submit button')

if (result.success) {
  console.log('✓ Action completed successfully')

  // See what actions were performed
  result.actions.forEach(action => {
    console.log(`${action.type}(${action.params.join(', ')})`)
  })

  // View AI reasoning (if available)
  if (result.reasoning) {
    console.log('AI reasoning:', result.reasoning)
  }
} else {
  console.error('✗ Action failed')

  // Check which step failed
  const failedAction = result.actions.find(a => a.error)
  if (failedAction) {
    console.error('Failed at:', failedAction.type, failedAction.error)
  }
}

Limitations

AI actions are powerful but have limitations:

- Require API calls (adds latency and cost)
- May not work on heavily obfuscated UIs
- Success rate varies by page complexity
- Not suitable for precise pixel-level interactions

When AI actions fail, fall back to traditional selector-based automation.

Performance Tips

- Use specific instructions - Reduces AI thinking time
- Combine actions - One instruction with several simple steps is faster than several calls, since each call adds an AI request
- Cache page state - Useful when performing multiple actions on the same page
- Use the Haiku model - Faster for simple actions
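To check whether a tip actually helps on your pages, it can be useful to measure how long each call takes. The wrapper below is a hypothetical sketch, not an SDK feature. It relies only on `page.act()` returning a promise and uses `Date.now()` for coarse wall-clock timing.

```typescript
// Only the documented act() method is assumed.
interface TimedPage {
  act(instruction: string): Promise<{ success: boolean }>
}

// Hypothetical helper: returns the act() result together with elapsed milliseconds.
async function timedAct(page: TimedPage, instruction: string) {
  const startedAt = Date.now()
  const result = await page.act(instruction)
  return { result, elapsedMs: Date.now() - startedAt }
}

// Usage: compare one combined instruction against several separate calls.
// const combined = await timedAct(page, 'Enter "laptop" in the search box and press enter')
// console.log(`act() took ${combined.elapsedMs} ms`)
```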

Related

- Data Extraction - Extract structured data with page.extract()
- AI Setup - Configure AI agents and providers
- Best Practices - Learn effective AI automation patterns
- Page Interaction - Manual interaction methods (click, input)
