The Uplink AI package enables you to control browsers using natural language instructions powered by large language models (LLMs). Instead of writing selectors and detailed automation scripts, you can describe what you want to do in plain English.

Installation

The AI package is optional and must be installed separately:
npm install @uplink-code/ai
You’ll also need an API key from a supported AI provider. Anthropic is fully supported today; OpenAI support is coming soon.
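Because the agent depends on a provider key, it can help to fail fast when the key is missing. The following is a minimal sketch: `requireApiKey` is a hypothetical helper, not part of the Uplink packages, and it takes the environment as a parameter so it can be exercised without real credentials.

```typescript
// Hypothetical helper (not part of @uplink-code/ai): read a provider key
// from an environment-like object and fail early if it is missing or blank.
type Env = Record<string, string | undefined>

function requireApiKey(env: Env, name: string): string {
  const raw = env[name]
  const value = raw === undefined ? '' : raw.trim()
  if (value === '') {
    throw new Error(`Missing required environment variable: ${name}`)
  }
  return value
}

// Usage sketch in a Node.js script:
//   const apiKey = requireApiKey(process.env, 'ANTHROPIC_API_KEY')
```

The returned string could then be passed as `options.apiKey` when creating the agent.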

When to Use AI Automation

AI automation is ideal when:
  • Dynamic UIs: The page structure changes frequently or unpredictably
  • Complex interactions: Multi-step workflows that are hard to script manually
  • Data extraction: Extracting structured information from unstructured content
  • Natural workflows: Actions that are easier to describe than to code
Traditional selector-based automation works better when:
  • Performance is critical: AI calls add latency and cost
  • Exact control needed: You need pixel-perfect precision
  • Stable selectors: The page structure is predictable and stable
  • Offline automation: No internet connection or API access
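The two approaches can also be combined. One possible pattern is to try a cheap selector-based step first and fall back to a natural-language instruction only when it fails. A minimal sketch, assuming nothing about the Uplink API: `withFallback` is a hypothetical helper, and both steps are passed in as plain async functions.

```typescript
// Hypothetical helper: run `primary`; if it throws, run `fallback` instead.
// The result records which path was taken so fallbacks can be logged.
type Step<T> = () => Promise<T>

async function withFallback<T>(
  primary: Step<T>,
  fallback: Step<T>
): Promise<{ value: T; usedFallback: boolean }> {
  try {
    return { value: await primary(), usedFallback: false }
  } catch (primaryError) {
    // The primary error is deliberately discarded here; log it if needed.
    return { value: await fallback(), usedFallback: true }
  }
}
```

In practice, `primary` could wrap whatever selector-based call your client exposes, and `fallback` could wrap a call such as `page.act('Click the sign in button')`. This keeps AI latency and cost off the common path.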

Quick Example

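The following example creates an AI agent, attaches it to a client, and then uses natural language to act on a page and extract data from it: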
import uplink from '@uplink-code/uplink'
import ai from '@uplink-code/ai'

// Create an AI agent
const agent = ai.createAgent({
  provider: 'anthropic',
  options: {
    apiKey: process.env.ANTHROPIC_API_KEY
  }
})

// Connect with the agent
const session = await uplink.session('<organization-api-key>', {
  projectId: '<project-id>',
  include: { ecdsa: true, ecdh: true }
})
const client = await uplink.client.fromSession(session, { agent })
const browser = await client.launch()
const page = await browser.newPage()

await page.goto('https://example.com/products')

// Use natural language to perform actions
await page.act('Find the cheapest laptop and add it to cart')

// Extract structured data with type safety
import { z } from 'zod'

const productSchema = z.object({
  name: z.string(),
  price: z.number(),
  inStock: z.boolean()
})

const result = await page.extract(
  'Get the product details from this page',
  productSchema
)

console.log(result.data)
// { name: "ThinkPad X1", price: 1299.99, inStock: true }

Key Features

Natural Language Actions

The page.act() method interprets your instruction and performs the necessary browser actions automatically:
await page.act('Click the sign in button')
await page.act('Fill in the email field with user@example.com')
await page.act('Select United States from the country dropdown')
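Because each instruction is a separate call, a multi-step flow can be written as a list of instructions run in order. A minimal sketch: `runSteps` and the `Actor` interface are hypothetical, and the only assumption about the page object is the `act(instruction)` method shown above, whose return value is treated as opaque.

```typescript
// Hypothetical structural type: anything with the act() method shown above.
interface Actor {
  act(instruction: string): Promise<unknown>
}

// Run instructions strictly in order. Stop at the first failure and
// report which step failed so the flow is easier to debug.
async function runSteps(page: Actor, instructions: string[]): Promise<void> {
  for (let i = 0; i < instructions.length; i++) {
    const instruction = instructions[i]
    try {
      await page.act(instruction)
    } catch (cause) {
      throw new Error(`Step ${i + 1} failed: "${instruction}" (${String(cause)})`)
    }
  }
}

// Usage sketch:
//   await runSteps(page, [
//     'Click the sign in button',
//     'Select United States from the country dropdown'
//   ])
```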

Type-Safe Data Extraction

The page.extract() method extracts structured data from pages using AI, with optional Zod schema validation for type safety:
const userSchema = z.object({
  name: z.string(),
  email: z.string().email(),
  posts: z.array(z.object({
    title: z.string(),
    date: z.string()
  }))
})

const result = await page.extract(
  'Extract user profile information including their recent posts',
  userSchema
)
// result.data is fully typed!
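To make the validation step concrete, the sketch below shows how raw model output could be checked against a schema before it is trusted. It is not the library's implementation: `SchemaLike` and `checkExtracted` are hypothetical names, and `SchemaLike` is only intended to mirror the result shape of Zod's `safeParse()` so that a Zod schema such as `userSchema` could be passed in.

```typescript
// Hypothetical minimal interface mirroring the result shape of Zod's safeParse().
interface SchemaLike<T> {
  safeParse(input: unknown):
    | { success: true; data: T }
    | { success: false; error: unknown }
}

// Return typed data when validation passes; otherwise throw with context.
function checkExtracted<T>(schema: SchemaLike<T>, raw: unknown): T {
  const result = schema.safeParse(raw)
  if (result.success === true) {
    return result.data
  }
  throw new Error(`Extracted data did not match schema: ${String(result.error)}`)
}
```

Validating at this boundary means malformed model output surfaces as an error at the call site instead of as a confusing failure later in the program.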

Supported Providers

  • Anthropic - Claude models (fully supported)
  • OpenAI - GPT models (coming soon)

How It Works

When you call page.act() or page.extract():
  1. The AI agent captures a snapshot of the current page state
  2. Your instruction is sent to the AI provider along with page context
  3. The AI analyzes the page and determines what actions to take
  4. For act(): Actions are executed automatically on the page
  5. For extract(): Data is extracted and validated against your schema
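The steps above can be sketched as a small pipeline for the act() path. This is a conceptual illustration only, not the package's internals: every name here (`PageSnapshot`, `PlannedAction`, `Provider`, `actOnce`) is hypothetical, and the provider is a stub function rather than a real LLM call.

```typescript
// Conceptual sketch only; all names are hypothetical.
interface PageSnapshot { url: string; text: string }
interface PlannedAction { kind: 'click' | 'type'; target: string; value?: string }

// Stand-in for the AI provider: instruction plus page context in, actions out.
type Provider = (instruction: string, snapshot: PageSnapshot) => Promise<PlannedAction[]>

async function actOnce(
  instruction: string,
  capture: () => Promise<PageSnapshot>,              // step 1: snapshot the page
  provider: Provider,                                // steps 2-3: send context, get a plan
  execute: (action: PlannedAction) => Promise<void>  // step 4: apply each action
): Promise<number> {
  const snapshot = await capture()
  const actions = await provider(instruction, snapshot)
  for (const action of actions) {
    await execute(action)
  }
  // Report how many actions were applied.
  return actions.length
}
```

Passing the capture, provider, and execute steps in as functions keeps the sketch testable with stubs; the extract() path would replace step 4 with schema validation of the returned data.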

Next Steps