POST /crawler/jobs
curl --request POST \
  --url https://api.usescraper.com/crawler/jobs \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "urls": [
    "<string>"
  ],
  "exclude_globs": [],
  "exclude_elements": "nav, header, footer, script, style, noscript, svg, [role=\"alert\"], [role=\"banner\"], [role=\"dialog\"], [role=\"alertdialog\"], [role=\"region\"][aria-label*=\"skip\" i], [aria-modal=\"true\"]",
  "output_format": "text",
  "output_expiry": 604800,
  "min_length": 50,
  "webhook_url": "<string>",
  "page_limit": 10000,
  "force_crawling_mode": "sitemap",
  "block_resources": true,
  "include_linked_files": false
}'
{
    "id": "7YEGS3M8Q2JD6TNMEJB8B6EKVS",
    "urls": [
        "https://example.com"
    ],
    "createdAt": 1699964378397,
    "status": "starting",
    "sitemapPageCount": 0,
    "progress": {
        "scraped": 0,
        "discarded": 0,
        "failed": 0
    },
    "costCents": 0,
    "webhookFails": []
}

Crawler jobs may take several minutes to complete. Use the Get job endpoint to check the status of a job, and fetch the results from the Get job data endpoint when the job is complete.
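
For example, a minimal polling loop might look like the sketch below. It assumes the Get job endpoint is GET /crawler/jobs/{id} and the Get job data endpoint is GET /crawler/jobs/{id}/data (check those endpoints' pages for the exact paths), and that "starting" and "running" are the in-progress statuses:

import time
import requests

API_BASE = "https://api.usescraper.com"
HEADERS = {"Authorization": "Bearer <token>"}

def wait_for_job(job_id, poll_seconds=15):
    """Poll a crawler job until it finishes, then fetch its output."""
    while True:
        job = requests.get(f"{API_BASE}/crawler/jobs/{job_id}", headers=HEADERS).json()
        print(f"status={job['status']} progress={job['progress']}")
        if job["status"] not in ("starting", "running"):  # assumed in-progress statuses
            break
        time.sleep(poll_seconds)
    # Fetch the crawled page content once the job is complete
    data = requests.get(f"{API_BASE}/crawler/jobs/{job_id}/data", headers=HEADERS)
    return data.json()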

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
Crawler parameters
urls
string[]
required

Array of one or more website URLs to crawl

exclude_globs
string[]
required

Globs (URL patterns; see https://developer.mozilla.org/en-US/docs/Web/API/URL_Pattern_API) matching page URLs that should be excluded from the crawl
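
For illustration, two hypothetical patterns (not defaults) in URL Pattern syntax:

# Hypothetical exclude_globs values; adjust to your own site structure.
exclude_globs = [
    "https://example.com/blog/*",   # skip everything under /blog/
    "https://*.example.com/*",      # skip all subdomains
]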

exclude_elements
string
default:nav, header, footer, script, style, noscript, svg, [role="alert"], [role="banner"], [role="dialog"], [role="alertdialog"], [role="region"][aria-label*="skip" i], [aria-modal="true"]
required

CSS selectors for content to exclude from the page HTML before it is converted to the output format (separate multiple selectors with commas)

output_format
enum<string>
default:text
required

Format to save all crawled page content to

Available options: text, html, markdown

output_expiry
number
default:604800
required

Time in seconds to store crawler output for, after which it is automatically deleted (the default and maximum value is 604800 seconds, i.e. 7 days)

Required range: x <= 604800

min_length
number
default:50
required

Skip any page whose output has fewer than this minimum number of characters (default 50)

page_limit
number
default:10000
required

Maximum number of pages to crawl (limited to 10,000 pages on the free plan)

Required range: 0 < x <= 500000

block_resources
boolean
default:true
required

Block loading of images, stylesheets, and scripts to speed up crawling

include_linked_files
boolean
default:false
required

Include linked files (e.g. PDFs, images) in the output as URLs

webhook_url
string

Webhook URL to call with updates about job progress
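
As a sketch, a bare-bones receiver for these callbacks could look like the following. The exact payload shape is not documented on this page, so this example just logs whatever JSON body is posted:

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class CrawlerWebhook(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            update = json.loads(body)  # payload is assumed to be JSON
        except json.JSONDecodeError:
            update = body.decode("utf-8", errors="replace")
        print("job update:", update)  # persist or trigger downstream work here
        self.send_response(200)       # acknowledge receipt
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8000), CrawlerWebhook).serve_forever()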

force_crawling_mode
enum<string>

Force the crawler to use either sitemap crawling or link crawling

Available options: sitemap, link

Response

201

The created job object. The job URL is provided in the Location header.
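
For example, a minimal sketch of creating a job and reading the Location header with Python's requests library (all values besides urls are the documented defaults; exclude_elements is omitted here for brevity):

import requests

resp = requests.post(
    "https://api.usescraper.com/crawler/jobs",
    headers={"Authorization": "Bearer <token>"},
    json={
        "urls": ["https://example.com"],
        "exclude_globs": [],
        "output_format": "text",
        "output_expiry": 604800,
        "min_length": 50,
        "page_limit": 10000,
        "block_resources": True,
        "include_linked_files": False,
    },
)
resp.raise_for_status()                  # expect 201 Created
job = resp.json()                        # the created job object
print(job["id"], job["status"])          # e.g. "7YEGS3M8Q2JD6TNMEJB8B6EKVS" "starting"
print(resp.headers.get("Location"))      # URL of the created job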