AugmentClaude

Playwright

Automate browser tasks like scraping, form-filling, and screenshots on JavaScript-heavy sites.

Installation

  1. Make sure Claude Code is installed and available in your terminal.

    Skills load from ~/.claude/skills/ when Claude Code starts up — so you need it on your machine first. If you don't have it yet, install it once with the command below, then run claude in any terminal to verify.

    One-time setup
    npm i -g @anthropic-ai/claude-code

    Already have it? Skip ahead.

  2. Paste into Claude Code or into your terminal.

    This copies the whole skill folder into ~/.claude/skills/playwright-edison7009/ — the SKILL.md plus any scripts, reference docs, or templates the skill ships with. Safe default: works for every skill.

    Faster alternative (instruction-only skills)

    Skips the clone and grabs only the SKILL.md file. Don't use this if the skill ships Python scripts, reference markdowns, or asset templates — they won't be downloaded and the skill will fail when it tries to load them.

  3. Restart Claude Code.

    Quit and reopen Claude Code (or any other agent that loads from ~/.claude/skills/). New skills are picked up on startup.

  4. Just ask Claude.

    Skills auto-activate when your request matches the skill's description, so no slash command is needed. Trigger phrases live in the skill's own frontmatter; you can read them in the “What this skill does” section below.

Prefer to read the source first? Open on GitHub.

When Claude uses it

Drive a real browser from the terminal to scrape pages, extract structured data (reviews, comments, listings, tables), capture batch screenshots, fill forms, save pages as PDF, and walk JavaScript-heavy or login-gated sites that plain HTTP scrapers can't reach. Built on `playwright-cli` with a snapshot-then-ref interaction model that survives DOM changes far better than CSS selectors.

What this skill does

Playwright — Browser Automation

Drive a real Chromium/Firefox/WebKit browser from the terminal. The page is rendered like a human would see it — JavaScript executes, lazy content loads, login state persists across calls — and you walk through it via a stable numeric reference (e1, e2, ...) that doesn't break when the site re-themes its CSS.

This skill is for content / data tasks the user can already see in their browser but doesn't want to scrape one-by-one: review extraction, batch screenshots, login-gated reading, form filling, structured table dumps, page-to-PDF.

When to Activate

  • User mentions: "scrape", "抓取" (Chinese for "scrape" or "crawl"), "扒" (colloquial Chinese for "rip" or "scrape"), "extract reviews/comments/listings", "batch screenshot N pages", "save these URLs as PDF", "fill this form on N items", "log in and download my data".
  • The target site uses heavy JavaScript or login walls — curl / requests / BeautifulSoup would only return an empty shell.
  • The user wants the visible rendered content, not just raw HTML.

Do not activate for: pure HTML / static page reading (use curl + a parser instead — faster, no browser overhead), or for testing a webapp the user is developing locally (write Playwright Python scripts directly — this skill is CLI-flavoured for ad-hoc work).

Prerequisites

npx (ships with Node.js ≥ 18). Check before proposing commands:

command -v npx >/dev/null 2>&1 || { echo "Need Node.js — install from https://nodejs.org/"; exit 1; }
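
The check above only confirms that npx exists; it does not confirm the Node.js ≥ 18 requirement. A minimal sketch of a version test follows. The function name `node_at_least` is hypothetical and not part of this skill; it only parses a version string of the form that `node --version` prints.

```bash
#!/usr/bin/env bash
# node_at_least VERSION MIN_MAJOR
# Succeeds when VERSION (for example "v18.17.0") has a major version >= MIN_MAJOR.
# Hypothetical helper for illustration; not shipped with the skill.
node_at_least() {
  local version=${1#v}          # strip the leading "v" that `node --version` prints
  local major=${version%%.*}    # keep everything before the first dot
  case $major in
    ''|*[!0-9]*) return 2 ;;    # not a plain number: report an error
  esac
  [ "$major" -ge "$2" ]
}

# Usage (needs node on PATH):
#   node_at_least "$(node --version)" 18 || echo "Need Node.js 18 or newer"
```

Taking the version string as an argument, rather than calling node inside the function, keeps the comparison easy to check on its own.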

The bundled wrapper script scripts/run.sh calls playwright-cli via npx --yes, so no global install is required. First run downloads the package (~30 MB) plus a Chromium build (~200 MB) one time, then reuses the cache.

Core Loop

Every browser session follows the same four steps:

  1. Open the page.
  2. Snapshot to get numeric refs.
  3. Act using refs.
  4. Re-snapshot after any navigation or major DOM change.

SKILL_DIR="$(dirname "$0")"            # or wherever this skill is installed
PW="$SKILL_DIR/scripts/run.sh"

"$PW" open https://example.com
"$PW" snapshot                          # outputs <e1>Login</e1> <e2>Search</e2> ...
"$PW" click e2
"$PW" type "playwright"
"$PW" press Enter
"$PW" snapshot                          # refs may have shifted — get fresh ones
"$PW" screenshot --path out.png

Why snapshot-ref over CSS selectors: CSS selectors break when sites change class names. The snapshot lists every interactable element with a numeric handle that is valid only for that snapshot; when you re-snapshot, you get fresh refs. That makes refs less brittle than selectors and more honest about what's actually on screen.
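
Because refs change between snapshots, it can help to look a ref up by its visible label instead of hard-coding `e2`. The sketch below assumes the snapshot text looks like the `<e1>Login</e1> <e2>Search</e2>` comment in the core loop; the real `snapshot` output may differ, and `ref_for_label` is a hypothetical helper, not part of the skill.

```bash
#!/usr/bin/env bash
# ref_for_label LABEL
# Reads snapshot text on stdin and prints the ref (e.g. "e2") of the first
# element whose text is exactly LABEL. Prints nothing when nothing matches.
# Assumes elements are written as <eN>text</eN> (an assumption, see above).
ref_for_label() {
  local label=$1
  # Split the input into records at every "<", so "<e2>Search</e2>" yields the
  # record "e2>Search"; then split that record at ">" into ref and text.
  awk -v RS='<' -F '>' -v label="$label" \
    '!seen && $1 ~ /^e[0-9]+$/ && $2 == label { print $1; seen = 1 }'
}

# Usage with the wrapper:
#   ref=$("$PW" snapshot | ref_for_label "Search")
#   [ -n "$ref" ] && "$PW" click "$ref"
```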

Common Recipes

Extract reviews / comments / listings

"$PW" open "$URL"
"$PW" snapshot --json > snap.json     # machine-readable; parse with jq
# Walk the snapshot, grab elements matching a role/text pattern, dump rows.

For multi-page lists (pagination): snapshot the next-page button's ref → click it → re-snapshot → repeat. Set a sane max-page guard (~50) so a runaway loop doesn't burn through Chromium memory.
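
The pagination loop with its page cap could be sketched as follows. It assumes `$PW` points at the wrapper script, that `snapshot` prints elements as `<eN>text</eN>` (taken from the core-loop comment, so unverified), and that the next-page button has a fixed label. `walk_pages` is a hypothetical name.

```bash
#!/usr/bin/env bash
# walk_pages NEXT_LABEL [MAX_PAGES] [OUT_DIR]
# Saves one snapshot per page into OUT_DIR and prints how many pages were captured.
# Sketch under stated assumptions: $PW is the wrapper, snapshot format is <eN>text</eN>.
walk_pages() {
  local next_label=$1 max_pages=${2:-50} out_dir=${3:-.}
  local captured=0 snap ref
  while [ "$captured" -lt "$max_pages" ]; do
    captured=$((captured + 1))
    snap="$out_dir/page-$captured.snap"
    "$PW" snapshot > "$snap"
    # Fresh lookup on every page: refs from the previous snapshot are stale.
    ref=$(awk -v RS='<' -F '>' -v label="$next_label" \
      '!seen && $1 ~ /^e[0-9]+$/ && $2 == label { print $1; seen = 1 }' "$snap")
    [ -n "$ref" ] || break                     # no next-page button: last page
    [ "$captured" -lt "$max_pages" ] || break  # guard reached: stop clicking
    "$PW" click "$ref"
  done
  echo "$captured"
}

# Usage, after "$PW" open "$URL":
#   pages=$(walk_pages "Next" 50 out)
```

The cap is checked before each click, so a misidentified ref can cost at most MAX_PAGES snapshots.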

Batch screenshots

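mkdir -p out
# Read URLs on fd 3 so a browser command that reads stdin cannot swallow the list.
while IFS= read -r url <&3; do
  [ -n "$url" ] || continue                # skip blank lines
  slug=$(printf '%s' "$url" | sed 's|[^a-zA-Z0-9]|_|g' | head -c 60)
  "$PW" open "$url"
  "$PW" wait-for-load-state networkidle    # let JS finish
  "$PW" screenshot --path "out/$slug.png" --full-page
done 3< urls.txt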
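
One caveat with the slug in the loop above: truncating to 60 characters means two long URLs that share a prefix map to the same filename, and the later screenshot overwrites the earlier one. A possible workaround, sketched here with a hypothetical `url_slug` helper, is to append a checksum of the full URL.

```bash
#!/usr/bin/env bash
# url_slug URL
# Prints a filename-safe slug: the first 60 sanitized characters of URL plus a
# CRC checksum of the whole URL, so long URLs with a common prefix stay distinct.
# Hypothetical helper; the skill itself only uses the truncated form.
url_slug() {
  local url=$1 base sum
  base=$(printf '%s' "$url" | sed 's|[^a-zA-Z0-9]|_|g' | head -c 60)
  sum=$(printf '%s' "$url" | cksum | cut -d ' ' -f 1)   # cksum prints "<crc> <bytes>"
  printf '%s_%s\n' "$base" "$sum"
}

# Usage inside the screenshot loop:
#   "$PW" screenshot --path "out/$(url_slug "$url").png" --full-page
```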

Page → PDF

"$PW" open "$URL"
"$PW" wait-for-load-state networkidle
"$PW" pdf --path "out.pdf"

Login-gated data (persistent session)

playwright-cli accepts --session <name> to persist cookies + local storage across invocations:

"$PW" --session=mywork open https://app.example.com/login
"$PW" --session=mywork snapshot
"$PW" --session=mywork type --ref e3 "user@example.com"
"$PW" --session=mywork type --ref e4 "$PASSWORD"
"$PW" --session=mywork click e5         # submit
# Subsequent calls with the same --session are still logged in:
"$PW" --session=mywork open https://app.example.com/account
"$PW" --session=mywork snapshot --json > account.json

Or set PLAYWRIGHT_CLI_SESSION=mywork once in the shell and the wrapper auto-injects it.
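
How the wrapper injects the flag is not shown here. As a rough illustration only, the logic could resemble the dry-run sketch below, which prints the argument list it would pass on. `with_session` is a hypothetical name, and the real scripts/run.sh may work differently.

```bash
#!/usr/bin/env bash
# with_session ARGS...
# Prints, one per line, the arguments that would be passed to playwright-cli:
# ARGS with --session=<name> prepended when PLAYWRIGHT_CLI_SESSION is set and
# the caller has not already supplied a --session flag.
# Illustration only; not the actual wrapper code.
with_session() {
  local arg
  for arg in "$@"; do
    case $arg in
      --session|--session=*) printf '%s\n' "$@"; return 0 ;;  # explicit flag wins
    esac
  done
  if [ -n "${PLAYWRIGHT_CLI_SESSION:-}" ]; then
    set -- "--session=$PLAYWRIGHT_CLI_SESSION" "$@"
  fi
  printf '%s\n' "$@"
}

# Usage:
#   PLAYWRIGHT_CLI_SESSION=mywork
#   with_session open https://app.example.com/account
```

Letting an explicit `--session` take precedence over the environment variable is a design choice in this sketch, not documented behavior.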

Fill the same form on N items

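# Read items on fd 3 so a browser command that reads stdin cannot swallow the list.
while IFS= read -r item <&3; do
  [ -n "$item" ] || continue    # skip blank lines
  "$PW" open "https://forms.example.com/new"
  "$PW" snapshot                # refresh refs; e3 and e7 assume the form layout is unchanged
  "$PW" type --ref e3 "$item"
  "$PW" click e7                # submit
  "$PW" wait-for-load-state networkidle
done 3< items.txt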

Pitfalls

  • Always wait for networkidle on dynamic apps before snapshot or screenshot. Skipping this returns a half-rendered page.
  • Re-snapshot after every navigation. Old refs are stale and clicking them either no-ops or hits the wrong element.
  • Don't loop without a max-page / max-iteration guard. A runaway click loop on a misidentified ref will keep going until Chromium dies.
  • Single-purpose Coffee CLI tasks: launch the browser, do the job, exit. Don't keep a session alive forever — that's playwright-interactive territory and Coffee CLI doesn't bundle that.
  • Site ToS: the tool will scrape anything technically accessible. Whether the user is allowed to scrape it is the user's call, not ours. Don't refuse, but don't fabricate consent either.

Output Conventions

When the task produces files (CSV/JSON/PNG/PDF), save them under:

  • ~/.coffee-cli/playwright/<run-id>/ for ad-hoc one-shots
  • the user's specified --out <dir> if they asked for a path
  • /tmp/ only for visual previews the user will discard

End with a one-line summary of what was produced and where: Extracted 47 reviews → ~/.coffee-cli/playwright/2026-05-10-1430/reviews.json. The user wants paths they can cat immediately.
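
A small sketch of these conventions follows. The run-id format (`YYYY-MM-DD-HHMM`) is inferred from the example path above and is an assumption; `new_run_dir` and `summarize` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# new_run_dir [ROOT]
# Creates ROOT/<run-id>/ and prints its path. ROOT defaults to the ad-hoc
# location named above; the run-id format is inferred from the example path.
new_run_dir() {
  local root=${1:-"$HOME/.coffee-cli/playwright"}
  local run_id
  run_id=$(date +%Y-%m-%d-%H%M)
  mkdir -p "$root/$run_id"
  printf '%s\n' "$root/$run_id"
}

# summarize COUNT NOUN PATH
# Prints the closing one-line summary in the style of the example above.
summarize() {
  printf 'Extracted %d %s → %s\n' "$1" "$2" "$3"
}

# Usage:
#   dir=$(new_run_dir)
#   ... write "$dir/reviews.json" ...
#   summarize 47 reviews "$dir/reviews.json"
```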
