Computer Use

What it does

Controls your Mac directly: it observes the screen through accessibility APIs and screenshots, then clicks, types, scrolls, drags, opens apps, and runs AppleScript. It acts as your assistant's hands and eyes on your desktop.

Setup required

None; the skill is built into the macOS app. It does require the macOS system permissions listed below.

Permissions

  • Accessibility (mouse/keyboard control)
  • Screen Recording (seeing screen content)
  • You are prompted to approve each action individually, unless an “Always Allow” rule covers it

Common prompts

You say...                                 |  What happens
“Open Safari and go to my bank's website”  |  Opens app and navigates
“Click the Submit button”                  |  Clicks a specific UI element
“Fill out this form with my info”          |  Types into form fields
“Take a screenshot of what's on screen”    |  Captures current screen state
“Switch to Slack and check my DMs”         |  Navigates between apps
“Scroll down to the pricing section”       |  Scrolls within an app

Configuration

  • No configuration needed
  • Step limit of 50 actions per session
  • Each action requires approval unless you create “Always Allow” rules
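
The approval behavior described above can be pictured with a small sketch. This is not the app's actual implementation: the `Action` and `ApprovalPolicy` names are hypothetical, and storing an “Always Allow” rule as an (action kind, app) pair is an assumption made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Action:
    kind: str  # e.g. "click", "type", "scroll"
    app: str   # e.g. "Safari"


@dataclass
class ApprovalPolicy:
    # Assumption: an "Always Allow" rule is stored as a (kind, app) pair.
    always_allow: set = field(default_factory=set)

    def needs_prompt(self, action: Action) -> bool:
        """Return True when the user must approve this action by hand."""
        return (action.kind, action.app) not in self.always_allow

    def allow_always(self, action: Action) -> None:
        """Record an "Always Allow" rule for this kind of action in this app."""
        self.always_allow.add((action.kind, action.app))


policy = ApprovalPolicy()
click = Action("click", "Safari")
print(policy.needs_prompt(click))   # True: no rule exists yet
policy.allow_always(click)
print(policy.needs_prompt(click))   # False: the rule now covers it
```

In this sketch a rule only covers the exact kind of action in the exact app it was created for, so typing into Safari would still trigger a prompt. How broadly real rules apply may differ.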

Tips & gotchas

  • Accessibility tree + screenshots. The assistant reads the accessibility tree (the same API screen readers use) and also takes screenshots, which together give it a more complete picture of the screen.
  • Element-based clicking. It prefers clicking by element name rather than coordinates for reliability.
  • Session caps. Sessions are capped at 50 steps with loop detection.
  • macOS only. Computer use is not available on iOS or other channels.
  • Screen visibility. Be mindful of what's visible on screen, because screenshots are sent to the AI model.
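
To make the session cap concrete, here is a minimal sketch of a loop that stops at 50 steps or when it sees the same action repeated. The 50-step limit comes from this page; everything else is an assumption, including the `run_session` name, the stop-reason strings, and the rule that three identical actions in a row count as a loop. The app's real loop detection is not documented here and may work differently.

```python
from collections import deque

MAX_STEPS = 50    # documented session cap
LOOP_WINDOW = 3   # assumption: this many identical actions in a row is a loop


def run_session(actions, max_steps=MAX_STEPS, loop_window=LOOP_WINDOW):
    """Walk through actions until done, the step cap, or a detected loop.

    Returns (steps_executed, stop_reason).
    """
    recent = deque(maxlen=loop_window)  # sliding window of the latest actions
    steps = 0
    for action in actions:
        if steps >= max_steps:
            return steps, "step_limit"
        recent.append(action)
        # A full window containing one distinct action means we are stuck.
        if len(recent) == loop_window and len(set(recent)) == 1:
            return steps, "loop_detected"
        steps += 1  # a real session would perform the action here
    return steps, "completed"


print(run_session(["open Safari", "click Submit"]))    # (2, 'completed')
print(run_session(["click Submit"] * 10))              # (2, 'loop_detected')
print(run_session([f"step {i}" for i in range(60)]))   # (50, 'step_limit')
```

In the second call the third identical action is rejected before it runs, so only two steps are counted.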