Zapier+DiffHook

Zapier web scraping — scrape a site, route into a Zap

DiffHook handles the scraping and the diff. Your Zap receives a signed webhook with the extracted text and the raw HTML fragment every time the page changes — no Code step, no third-party scraping add-on.

Teams normally scrape into Zapier with Schedule by Zapier + Webhooks GET + Code by Zapier + some Formatter gymnastics. That stack is fragile: Zapier truncates response bodies above 6 MB, regex in a Code step ages badly, and every poll burns a task. DiffHook moves the fetch, rendering, and diff outside Zapier, so the Zap only receives real changes and already-extracted fields.

Workflow

Scrape into Zapier in 5 steps

Standard Catch Hook trigger. No Code step, no premium app, no response-size surprises.

01

Start a Zap with a Catch Hook

Pick Webhooks by Zapier → Catch Hook. Zapier hands you a unique URL — this becomes the delivery destination in DiffHook.

02

Describe the scrape in a DiffHook monitor

Set type to html_css, then supply the URL and the CSS selector that isolates the element you care about. Add include_html: true when the Zap needs the raw markup as well as the extracted text.

03

Register the monitor with the Catch Hook

POST to /v1/monitors with the scrape config, an interval, and the Zapier URL as a webhook delivery. DiffHook takes over polling, caching, and diff detection immediately.

04

Verify and filter inside Zapier

Add a Filter step that checks the X-DiffHook-Signature header against your signing secret. Optionally filter on the extracted_text or url so one Zap can cover several monitors.
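If you ever need to reproduce the signature check in code (a test harness, or a receiver outside Zapier), an HMAC check generally looks like the sketch below. It assumes an HMAC-SHA256 hex digest computed over the raw request body; this page does not state DiffHook's exact algorithm or header encoding, so treat both as assumptions and confirm them before relying on this.

```python
import hashlib
import hmac


def sign(raw_body: bytes, secret: str) -> str:
    """Compute a hex HMAC-SHA256 of the raw body (assumed scheme, not confirmed)."""
    return hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()


def is_valid_signature(raw_body: bytes, header_value: str, secret: str) -> bool:
    """Compare the X-DiffHook-Signature value with a locally computed digest."""
    expected = sign(raw_body, secret)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))


# Hypothetical body and secret, used only to exercise the functions.
body = b'{"url": "https://directory.example.com/listings"}'
secret = "example-signing-secret"
header = sign(body, secret)

print(is_valid_signature(body, header, secret))         # genuine delivery
print(is_valid_signature(body + b" ", header, secret))  # tampered body
```

The check must run against the exact bytes that were received. Re-serializing the parsed JSON can change whitespace or key order and make a valid signature fail.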

05

Map fields into downstream actions

Zapier auto-parses the JSON body, so extracted_text, previous_value, current_value, and url are available as named fields. Drop them into a Slack message, a Google Sheets row, or an Airtable record.
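To make the mapping concrete, here is a small sketch of turning those named fields into a Slack-style message. The sample payload values are hypothetical, and format_change_message is an illustrative helper, not part of DiffHook or Zapier.

```python
def format_change_message(payload: dict) -> str:
    """Build a short notification from the fields named in the step above."""
    url = payload.get("url", "(unknown URL)")
    previous = payload.get("previous_value") or "(empty)"
    current = payload.get("current_value") or "(empty)"
    return f"Change detected on {url}\nBefore: {previous}\nAfter: {current}"


# Hypothetical delivery body, shaped after the field names listed above.
sample_payload = {
    "url": "https://directory.example.com/listings",
    "extracted_text": "Starter plan: $29/mo",
    "previous_value": "Starter plan: $19/mo",
    "current_value": "Starter plan: $29/mo",
}

print(format_change_message(sample_payload))
```

In a real Zap the same result comes from dropping the fields into the message template of the Slack action; the function only shows which field lands where.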

API example

Scrape into a Zap, one POST

include_html: true sends the raw HTML fragment alongside the extracted text — useful when the Zap has to parse a list of items.

POST /v1/monitors
POST https://api.diffhook.com/v1/monitors
Authorization: Bearer $DIFFHOOK_API_KEY
Content-Type: application/json

{
  "type": "html_css",
  "url": "https://directory.example.com/listings",
  "css_selector": "article.listing",
  "include_html": true,
  "interval_seconds": 900,
  "deliveries": [
    {
      "type": "webhook",
      "url": "https://hooks.zapier.com/hooks/catch/000000/scrape123/"
    }
  ]
}
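
The same request can be assembled from a script. The sketch below uses only the Python standard library and mirrors the JSON above; the helper names are illustrative, and the actual network call is left commented out because it needs a real API key.

```python
import json
import os
import urllib.request

API_URL = "https://api.diffhook.com/v1/monitors"


def build_monitor_payload(page_url, css_selector, zap_url,
                          interval_seconds=900, include_html=True):
    """Return the monitor config as a plain dict, mirroring the JSON above."""
    return {
        "type": "html_css",
        "url": page_url,
        "css_selector": css_selector,
        "include_html": include_html,
        "interval_seconds": interval_seconds,
        "deliveries": [{"type": "webhook", "url": zap_url}],
    }


def build_request(api_key, payload):
    """Wrap the payload in a POST request; nothing is sent until urlopen() runs."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = build_monitor_payload(
    "https://directory.example.com/listings",
    "article.listing",
    "https://hooks.zapier.com/hooks/catch/000000/scrape123/",
)
# "test-key" is a placeholder fallback so the script runs without the variable set.
request = build_request(os.environ.get("DIFFHOOK_API_KEY", "test-key"), payload)

# To actually register the monitor (requires a real key and network access):
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```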

Importable workflow

Copy a ready-made Zap

The template Zap includes the Catch Hook, the signature-check Filter, a Formatter step that cleans up the text, and a Slack destination. Import it, paste your keys, and publish.

FAQ

Zapier web scraping — common questions

Do I need Code by Zapier to handle the scraped payload?
No. DiffHook sends a flat JSON body with extracted_text and, if you enabled it, current_html. Zapier auto-parses the JSON, so every field is available as a named variable in downstream steps. Keep a Filter step for the signature check and you're done — the premium Code step is optional.
Can DiffHook scrape JavaScript-rendered pages for my Zap?
Yes. Set type to html_rendered and render.engine to playwright or puppeteer. DiffHook runs the browser, waits for the page to finish loading, then runs your CSS selector against the rendered DOM. The Zap only sees the post-render diff — the rendering cost lives in DiffHook, not your Zap plan.
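A rendered monitor config might look like the sketch below. The nesting of render.engine as a "render" object is inferred from the dotted name and is an assumption; the URL and selector are hypothetical, and the remaining fields are carried over from the html_css example on this page.

```python
import json

SUPPORTED_ENGINES = ("playwright", "puppeteer")  # the two engines named above


def build_rendered_monitor(page_url, css_selector, zap_url, engine="playwright"):
    """Build a config for a JavaScript-rendered page (field layout is assumed)."""
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"engine must be one of {SUPPORTED_ENGINES}, got {engine!r}")
    return {
        "type": "html_rendered",
        "render": {"engine": engine},  # assumed nesting for "render.engine"
        "url": page_url,
        "css_selector": css_selector,
        "deliveries": [{"type": "webhook", "url": zap_url}],
    }


rendered_monitor = build_rendered_monitor(
    "https://app.example.com/pricing",  # hypothetical JavaScript-rendered page
    ".pricing-card",                    # hypothetical selector
    "https://hooks.zapier.com/hooks/catch/000000/scrape123/",
)
print(json.dumps(rendered_monitor, indent=2))
```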
What about pagination and infinite scroll?
You have two clean patterns. Pattern A — one monitor per paginated page (p=1, p=2, p=3), each firing into the same Catch Hook. Pattern B — enable scroll_to_bottom on a rendered monitor so Playwright loads the full list before the snapshot. Both keep the Zap body small and predictable.
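Pattern A is easy to script. The sketch below generates one monitor config per page, all pointing at the same Catch Hook. It assumes the site paginates with a p query parameter, as in the example above; the helper names are illustrative.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def page_url(base_url: str, page: int) -> str:
    """Return base_url with its p query parameter set to the given page."""
    parts = urlsplit(base_url)
    query = dict(parse_qsl(parts.query))
    query["p"] = str(page)
    return urlunsplit(parts._replace(query=urlencode(query)))


def monitors_for_pages(base_url, css_selector, zap_url, pages):
    """Build one html_css monitor config per paginated page."""
    return [
        {
            "type": "html_css",
            "url": page_url(base_url, page),
            "css_selector": css_selector,
            "interval_seconds": 900,
            "deliveries": [{"type": "webhook", "url": zap_url}],
        }
        for page in pages
    ]


configs = monitors_for_pages(
    "https://directory.example.com/listings",
    "article.listing",
    "https://hooks.zapier.com/hooks/catch/000000/scrape123/",
    pages=range(1, 4),
)
for config in configs:
    print(config["url"])
```

Because each delivery carries its own url field, a Filter or Paths step in the Zap can still tell the pages apart.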
How do I avoid duplicate Zap runs when the diff is tiny?
Tighten the CSS selector so DiffHook watches only the meaningful block (pricing card, product row, changelog line) rather than the whole page. For noisy sources add a Filter step in the Zap that only advances when the diff length exceeds a threshold.
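The threshold idea can be sketched as follows. This version measures the change from previous_value and current_value with Python's difflib; whether the delivery also carries a ready-made diff length is not stated on this page, so the computation and the default threshold of 5 characters are assumptions.

```python
import difflib


def changed_chars(previous: str, current: str) -> int:
    """Count the characters that differ between the two values."""
    matcher = difflib.SequenceMatcher(a=previous, b=current, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # For a replacement, count the longer of the two sides.
            changed += max(i2 - i1, j2 - j1)
    return changed


def should_advance(previous: str, current: str, min_changed_chars: int = 5) -> bool:
    """Return True only when the change is large enough to be worth a Zap run."""
    return changed_chars(previous, current) >= min_changed_chars


print(should_advance("$19/mo", "$29/mo"))                   # one character changed
print(should_advance("", "New changelog entry: v2.4 released"))  # large addition
```

A character threshold is blunt: a one-character price change can matter more than a long cosmetic edit, so for pricing-style monitors a tight selector is usually the better tool than a length filter.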
Is there a limit on the scraped payload size?
DiffHook caps the HTML payload at 256 KB per delivery, which covers the vast majority of scrape-into-Zap use cases. For larger pages, keep the CSS selector tight — you rarely need the whole document, just the section that moves.

Scrape into Zapier without touching Code by Zapier

Managed scraper, extracted text + raw HTML, HMAC-signed Catch Hooks, free tier. One POST to get going.