```
Is the UI stable with reliable selectors?
├─ Yes → Use manual automation (click, input, etc.)
└─ No → Is performance critical?
   ├─ Yes → Use manual automation with resilient selectors
   └─ No → Is cost a concern?
      ├─ Yes → Use manual for bulk, AI for complex parts
      └─ No → Use AI automation
```
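The decision tree can also be written as a small function, which is handy when the choice has to be made per task at runtime. This is a minimal sketch: the `TaskProfile` type, the `Strategy` labels, and `chooseStrategy` are hypothetical names introduced for illustration and are not part of any library API.

```typescript
// Hypothetical names: illustrative only, not a library API.
type Strategy =
  | "manual"
  | "manual-resilient-selectors"
  | "hybrid"
  | "ai";

interface TaskProfile {
  stableSelectors: boolean;     // Is the UI stable with reliable selectors?
  performanceCritical: boolean; // Is performance critical?
  costSensitive: boolean;       // Is cost a concern?
}

// Walks the questions in the same order as the decision tree.
function chooseStrategy(task: TaskProfile): Strategy {
  if (task.stableSelectors) return "manual";
  if (task.performanceCritical) return "manual-resilient-selectors";
  if (task.costSensitive) return "hybrid"; // manual for bulk, AI for complex parts
  return "ai";
}
```

The order of the checks matters: a stable UI short-circuits everything else, so the performance and cost questions are only asked when selectors are unreliable.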
Good prompts:

```typescript
// Specific location and description
await page.act('Click the blue "Sign In" button in the top-right corner')

// Include context
await page.act('In the shipping form, select "Express Delivery" from the shipping method dropdown')

// Describe visual characteristics
await page.act('Click the red "Delete" button next to the item named "Widget Pro"')
```
Bad prompts:
```typescript
// Too vague
await page.act('Click the button')

// Ambiguous
await page.act('Do the login thing')

// Missing context
await page.act('Select the option')
```
```typescript
// Bad: Too complex for one instruction
await page.act('Login with username admin and password secret, then navigate to settings, update the email to newemail@example.com, and save')

// Good: Break into logical steps
await page.act('Fill in username "admin" and password "secret", then click login')
await page.act('Click the settings link in the navigation')
await page.act('Change the email field to newemail@example.com')
await page.act('Click the save button')
```
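Once a task is split into steps, it helps to know which step failed. The sketch below runs instructions in order and stops at the first failure. It assumes that `act` resolves to an object with a boolean `success` field, which the examples in this guide do not confirm for `page.act`; `ActLike`, `StepReport`, and `runSteps` are hypothetical names.

```typescript
// Assumed minimal shape: an `act` method whose result carries a `success` flag.
// The real return type of page.act may differ; adapt this to your library.
interface ActLike {
  act(instruction: string): Promise<{ success: boolean }>;
}

interface StepReport {
  completed: string[];     // instructions that succeeded, in order
  failedAt: string | null; // first instruction that failed, or null if none did
}

// Runs each instruction sequentially and stops at the first failure.
async function runSteps(page: ActLike, steps: string[]): Promise<StepReport> {
  const completed: string[] = [];
  for (const step of steps) {
    const result = await page.act(step);
    if (!result.success) return { completed, failedAt: step };
    completed.push(step);
  }
  return { completed, failedAt: null };
}
```

Passing the four "good" instructions above as the `steps` array would report exactly where a run stopped, instead of leaving one long instruction that fails opaquely.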
```typescript
// Without context
await page.extract('Get the price', priceSchema)

// With context (better results)
await page.extract(
  'Extract the current price shown in the product details section',
  priceSchema
)
```
```typescript
let result = await page.extract('Get the price', priceSchema)

if (!result.success) {
  // Try a different instruction
  result = await page.extract(
    'Extract the product price shown in USD',
    priceSchema
  )
}

if (!result.success) {
  // Fall back to manual extraction
  const priceText = await page.evaluate(() =>
    document.querySelector('.price')?.textContent
  )
}
```
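The retry-then-fall-back pattern above can be generalized into a helper that tries a list of attempts in order. This is a sketch under assumptions: `AttemptResult` mirrors the `success`, `data`, and `error` fields used in this guide's examples, and `firstSuccessful` is a hypothetical helper, not a library function.

```typescript
// Assumed result shape, modeled on the fields used in this guide's examples.
interface AttemptResult<T> {
  success: boolean;
  data?: T;
  error?: string;
}

// Hypothetical helper: runs each attempt in order and returns the first
// successful result. If none succeeds, the last failure is returned.
async function firstSuccessful<T>(
  attempts: Array<() => Promise<AttemptResult<T>>>
): Promise<AttemptResult<T>> {
  let last: AttemptResult<T> = { success: false, error: "no attempts provided" };
  for (const attempt of attempts) {
    try {
      last = await attempt();
    } catch (err) {
      // Treat a thrown error like a failed attempt and keep going.
      last = { success: false, error: err instanceof Error ? err.message : String(err) };
    }
    if (last.success) return last;
  }
  return last;
}
```

Each attempt is a function rather than a promise, so later (and possibly more expensive) attempts only run when the earlier ones have failed.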
```typescript
// Bad: Multiple separate calls
await page.act('Click button A')
await page.act('Click button B')
await page.act('Click button C')

// Good: Combine when possible
await page.act('Click buttons A, B, and C in sequence')
```
```typescript
// Extract once
const productData = await page.extract(
  'Extract all product details',
  productSchema
)

// Reuse the data
if (productData.success && productData.data) {
  console.log(productData.data.name)
  console.log(productData.data.price)
  // Don't call extract again for the same data
}
```
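When the same data is needed in several places, a small cache makes "extract once, reuse" automatic. The sketch below is library-agnostic: `memoizeByKey` is a hypothetical helper, and the choice of cache key (for example, the page URL plus the instruction) is an assumption left to the caller.

```typescript
// Hypothetical helper: wraps an async loader so repeated calls with the same
// key reuse the first result instead of triggering another AI call.
function memoizeByKey<T>(
  load: (key: string) => Promise<T>
): (key: string) => Promise<T> {
  const cache = new Map<string, Promise<T>>();
  return (key: string) => {
    let pending = cache.get(key);
    if (!pending) {
      pending = load(key).catch((err) => {
        cache.delete(key); // do not cache failures, so a later call can retry
        throw err;
      });
      cache.set(key, pending);
    }
    return pending;
  };
}
```

Caching the promise rather than the resolved value means that two concurrent requests for the same key share a single in-flight call.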
```typescript
// Bad: Extract each product separately (5 AI calls)
for (let i = 0; i < 5; i++) {
  await page.goto(`/product/${i}`)
  await page.extract('Get product details', productSchema)
}

// Good: Extract all at once (1 AI call)
await page.goto('/products')
const result = await page.extract(
  'Extract details for the first 5 products',
  z.object({ products: z.array(productSchema) })
)
```
```typescript
// Test extraction against a real page
const testUrl = 'https://example.com/product/test-item'
await page.goto(testUrl) // load the test page before extracting

const result = await page.extract(
  'Extract product details',
  productSchema
)

if (!result.success) {
  console.error('Extraction failed on test page:', result.error)
  // Adjust schema or instruction
}
```
```typescript
// Check for variations
const isLoggedIn = await page.evaluate(() =>
  !!document.querySelector('.user-profile')
)

if (!isLoggedIn) {
  await page.act('Click the login button')
  // Handle login...
}

// Now proceed with main task
await page.act('Navigate to dashboard')
```
The AI analyzes the page content, so simpler pages are processed faster:
```typescript
// Hide unnecessary elements before extraction
await page.evaluate(() => {
  // Hide ads, footers, etc. to reduce context
  document.querySelectorAll<HTMLElement>('.ad, footer, .sidebar').forEach(el => {
    el.style.display = 'none'
  })
})

// Now extract (faster with less content)
const result = await page.extract('Extract article', schema)
```
Log actions during development to verify behavior:
```typescript
const result = await page.act('Perform action')

// Review what the AI did
console.log('Actions taken:')
result.actions.forEach(action => {
  console.log(`- ${action.type}(${JSON.stringify(action.params)})`)
})
```