The page.act() method allows you to control the browser using natural language instructions. The AI interprets your instruction, determines what actions to take, and executes them automatically.
Signature
page.act(instruction: string): Promise<ActResult>
Parameters:
instruction - Natural language description of what you want to do
Returns: Promise<ActResult>
{
  success: boolean
  actions: Array<{
    type: string // 'click', 'input', 'goto', etc.
    params: unknown[]
    result?: unknown
    error?: string
  }>
  reasoning?: string
}
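For type-checking your own helper code, the result shape above can be written as TypeScript interfaces. This is a minimal sketch: only the field names come from the shape above, while `ActAction` and `summarize` are illustrative names and are not claimed to be exports of the library.

```typescript
// Field names mirror the ActResult shape documented above.
// `ActAction` and `summarize` are hypothetical names used only in this sketch.
interface ActAction {
  type: string // 'click', 'input', 'goto', etc.
  params: unknown[]
  result?: unknown
  error?: string
}

interface ActResult {
  success: boolean
  actions: ActAction[]
  reasoning?: string
}

// Build a one-line summary of a result for logging.
function summarize(result: ActResult): string {
  if (result.success) {
    return `ok: ${result.actions.map(a => a.type).join(', ')}`
  }
  // On failure, report the first action that carries an error message.
  const failed = result.actions.find(a => a.error !== undefined)
  return failed
    ? `failed at ${failed.type}: ${failed.error}`
    : 'failed: no action reported an error'
}
```

Calling `summarize(await page.act('Click the sign up button'))` would then give a compact log line for each call.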
How It Works
When you call page.act():
- The AI captures the current page state (DOM, visible elements, form fields, etc.)
- Your instruction is sent to the AI provider along with that page context
- The AI determines which actions to perform (click, input, scroll, etc.)
- The actions are executed sequentially on the page
- An ActResult is returned with details about what was done
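The flow described above can be sketched as a small function. This illustrates the described steps and is not the library's actual implementation: `PageDriver`, `Planner`, `PlannedAction`, `ActionRecord`, and `actSketch` are hypothetical names, and stopping at the first failing action is an assumption of the sketch.

```typescript
// Hypothetical types for illustration only; not part of the library's API.
interface PlannedAction { type: string; params: unknown[] }
interface ActionRecord extends PlannedAction { result?: unknown; error?: string }
interface ActResult { success: boolean; actions: ActionRecord[]; reasoning?: string }

interface PageDriver {
  captureState(): Promise<string> // snapshot of the current page
  execute(action: PlannedAction): Promise<unknown> // run a single action
}

interface Planner {
  // Receives the instruction plus page context and returns a plan.
  plan(
    instruction: string,
    state: string,
  ): Promise<{ actions: PlannedAction[]; reasoning?: string }>
}

async function actSketch(
  page: PageDriver,
  planner: Planner,
  instruction: string,
): Promise<ActResult> {
  const state = await page.captureState()
  const { actions, reasoning } = await planner.plan(instruction, state)
  const records: ActionRecord[] = []
  // Execute the planned actions one after another.
  for (const action of actions) {
    try {
      records.push({ ...action, result: await page.execute(action) })
    } catch (err) {
      // Assumption: stop at the first failing action and report it.
      const message = err instanceof Error ? err.message : String(err)
      records.push({ ...action, error: message })
      return { success: false, actions: records, reasoning }
    }
  }
  return { success: true, actions: records, reasoning }
}
```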
Basic Examples
Simple Click Actions
await page.goto('https://example.com')
// Click a button
await page.act('Click the sign up button')
// Click a link
await page.act('Click the privacy policy link')
// Click by description
await page.act('Click the blue submit button at the bottom')
Form Filling
// Fill a single field
await page.act('Fill in the email field with [email protected]')
// Fill multiple fields
await page.act('Enter username "johndoe" and password "secret123"')
// Select from dropdown
await page.act('Select "United States" from the country dropdown')
// Check a checkbox
await page.act('Check the "I agree to terms" checkbox')
Multi-Step Workflows
await page.goto('https://example.com/login')
// AI performs multiple actions in sequence
await page.act('Fill in the login form with username "john" and password "pass123", then click login')
Conditional Actions
await page.goto('https://example.com/products')
// AI finds and interacts with dynamic elements
await page.act('If there is a cookie consent banner, click accept')
await page.act('Find the product with the lowest price and click on it')
Navigation
// Scroll to find elements
await page.act('Scroll down to the footer and click the contact link')
// Navigate through pages
await page.act('Click the next page button')
Advanced Examples
Complex Interactions
await page.goto('https://example.com/checkout')
// Multi-step form completion
const result = await page.act(`
Fill out the shipping form with:
- Name: John Doe
- Address: 123 Main St
- City: San Francisco
- ZIP: 94102
Then click continue
`)
if (result.success) {
  console.log('Form completed successfully')
  console.log('Actions taken:', result.actions)
}
Dynamic Content Handling
// Wait and interact with loaded content
await page.act('Wait for the search results to load, then click the first result')
// Handle popups
await page.act('If a signup popup appears, close it')
// Interact with carousels
await page.act('Click the right arrow on the image carousel 3 times')
Search and Filter
await page.goto('https://example.com/products')
await page.act('Enter "laptop" in the search box and press enter')
await page.act('Filter results by price: $500-$1000')
await page.act('Sort by highest rating')
Best Practices
Write Clear Instructions
Be specific and descriptive. Instead of “click the button”, say “click the blue submit button at the bottom of the form”.
Good:
await page.act('Click the red "Add to Cart" button next to the product image')
await page.act('Fill in the email field in the login form with [email protected]')
Bad:
await page.act('Click it') // Too vague
await page.act('Do the thing') // Not specific
Break Down Complex Tasks
For very complex workflows, break them into steps:
// Instead of one giant instruction, issue one focused instruction per step
await page.act('Fill in the form')
await page.act('Select shipping method')
await page.act('Click continue to payment')
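When a workflow is split into steps, it helps to stop as soon as one step fails instead of continuing on a page that is in an unexpected state. A minimal sketch, assuming only the `page.act()` signature documented above; `runSteps` and `ActCapable` are hypothetical names:

```typescript
interface ActResult {
  success: boolean
  actions: Array<{ type: string; params: unknown[]; result?: unknown; error?: string }>
  reasoning?: string
}

// Anything that exposes act() with the documented signature.
interface ActCapable {
  act(instruction: string): Promise<ActResult>
}

// Run instructions in order and stop at the first one that fails.
// `failedAt` is the index of the failed step, or -1 if every step succeeded.
async function runSteps(
  page: ActCapable,
  instructions: string[],
): Promise<{ failedAt: number; results: ActResult[] }> {
  const results: ActResult[] = []
  let index = 0
  for (const instruction of instructions) {
    const result = await page.act(instruction)
    results.push(result)
    if (!result.success) return { failedAt: index, results }
    index++
  }
  return { failedAt: -1, results }
}
```

For the three steps above, `runSteps(page, ['Fill in the form', 'Select shipping method', 'Click continue to payment'])` would report which step, if any, needs a different approach.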
Handle Errors
Check the result and handle failures:
const result = await page.act('Click the checkout button')
if (!result.success) {
  console.error('Action failed:', result.actions)
  // Try alternative approach or manual automation
}
Combine with Manual Automation
Mix AI actions with traditional automation for best results:
// Use AI for dynamic interactions
await page.act('Accept cookie consent if it appears')
// Use manual automation for precise control
await page.click('#precise-selector')
await page.input('#email', '[email protected]')
// Use AI for selections that are hard to express as a selector
await page.act('Find and click the product with best rating')
Common Use Cases
Authentication Flows
await page.goto('https://example.com/login')
await page.act('Click the "Sign in with Google" button')
// Handle OAuth popup...
await page.act('Click the authorize button')
Form Submission
await page.act(`
Complete the contact form with:
Name: John Doe
Email: [email protected]
Message: I would like more information
Then submit the form
`)
E-commerce
await page.goto('https://shop.example.com')
await page.act('Search for "wireless headphones"')
await page.act('Filter by price under $100')
await page.act('Click on the product with the highest rating')
await page.act('Select black color and add to cart')
Content Navigation
await page.goto('https://news.example.com')
await page.act('Find and click the article about technology')
await page.act('Scroll down and click "Load more comments"')
Error Handling
The ActResult includes details about what happened:
const result = await page.act('Click the submit button')
if (result.success) {
  console.log('✓ Action completed successfully')
  // See what actions were performed
  result.actions.forEach(action => {
    console.log(`${action.type}(${action.params.join(', ')})`)
  })
  // View AI reasoning (if available)
  if (result.reasoning) {
    console.log('AI reasoning:', result.reasoning)
  }
} else {
  console.error('✗ Action failed')
  // Check which step failed
  const failedAction = result.actions.find(a => a.error)
  if (failedAction) {
    console.error('Failed at:', failedAction.type, failedAction.error)
  }
}
Limitations
AI actions are powerful but have limitations:
- Require API calls, which add latency and cost
- May not work on heavily obfuscated UIs
- Success rate varies by page complexity
- Not suitable for precise pixel-level interactions
When AI actions fail, fall back to traditional selector-based automation.
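One way to structure that fallback is a small wrapper that tries the AI action first and runs a selector-based routine only if it fails. This is a sketch under assumptions: `actWithFallback` and `ActCapable` are hypothetical names, the `fallback` callback stands in for whatever selector-based calls you would write yourself, and the `try/catch` is purely defensive because this page does not say whether `page.act()` can throw.

```typescript
interface ActResult {
  success: boolean
  actions: Array<{ type: string; params: unknown[]; result?: unknown; error?: string }>
  reasoning?: string
}

// Anything that exposes act() with the documented signature.
interface ActCapable {
  act(instruction: string): Promise<ActResult>
}

// Try the AI action; if it reports failure (or throws), run the fallback.
// Returns which path completed the step.
async function actWithFallback(
  page: ActCapable,
  instruction: string,
  fallback: () => Promise<void>,
): Promise<'ai' | 'fallback'> {
  try {
    const result = await page.act(instruction)
    if (result.success) return 'ai'
  } catch {
    // Defensive: treat a thrown error the same as an unsuccessful result.
  }
  await fallback()
  return 'fallback'
}
```

A call might look like `actWithFallback(page, 'Click the checkout button', () => page.click('#checkout'))`, where `#checkout` is a placeholder selector.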
Performance Tips
- Use specific instructions - Reduces the time the AI spends interpreting the request
- Combine actions - One instruction with multiple steps is faster than multiple calls, because each call is a separate request to the AI provider
- Cache page state - If performing multiple actions on the same page
- Use a Haiku model - Faster for simple actions
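To see whether these tips pay off on your own pages, it can help to time the calls. A minimal sketch using a generic wrapper; `timed` is a hypothetical helper, not part of the library, and it works with any promise-returning function:

```typescript
// Measure how long an async operation takes, log it, and pass its value through.
async function timed<T>(
  label: string,
  fn: () => Promise<T>,
): Promise<{ value: T; ms: number }> {
  const start = Date.now()
  const value = await fn()
  const ms = Date.now() - start
  console.log(`${label}: ${ms} ms`)
  return { value, ms }
}
```

For example, wrapping one combined instruction as `timed('combined', () => page.act('Enter "laptop" in the search box and press enter'))` and comparing it with two separately timed calls shows whether combining actually helps for that page.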