GET /crawler/jobs/{id}
curl --request GET \
  --url https://api.usescraper.com/crawler/jobs/{id} \
  --header 'Authorization: Bearer <token>'
{
    "id": "7YEGS3M8Q2JD6TNMEJB8B6EKVS",
    "urls": [
        "https://example.com"
    ],
    "createdAt": 1699964378397,
    "status": "running",
    "sitemapPageCount": 12,
    "progress": {
        "scraped": 6,
        "discarded": 0,
        "failed": 0
    },
    "costCents": 0,
    "webhookFails": []
}

Crawler jobs may take several minutes to complete. Use this endpoint to check a job's status, then fetch the results from the Get job data endpoint once the job has finished.
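As a sketch of how a client might interpret the response above: the `status` field tells you whether the job is still running, and the `progress` object breaks down how many pages have been handled so far. The snippet below parses the sample body from this page (in practice the JSON would come from the GET request shown above); the helper names `is_done` and `pages_processed` are illustrative, not part of the API.

```python
import json

# Sample response body from this page; in practice it comes from
# GET https://api.usescraper.com/crawler/jobs/{id}.
body = json.loads("""
{
    "id": "7YEGS3M8Q2JD6TNMEJB8B6EKVS",
    "urls": ["https://example.com"],
    "createdAt": 1699964378397,
    "status": "running",
    "sitemapPageCount": 12,
    "progress": {"scraped": 6, "discarded": 0, "failed": 0},
    "costCents": 0,
    "webhookFails": []
}
""")

# Terminal values from the status enum: the job will not change again.
TERMINAL = {"succeeded", "failed", "cancelled"}

def is_done(job: dict) -> bool:
    """True once the job has reached a terminal status."""
    return job["status"] in TERMINAL

def pages_processed(job: dict) -> int:
    """Total pages handled so far, across all progress buckets."""
    p = job.get("progress", {})
    return p.get("scraped", 0) + p.get("discarded", 0) + p.get("failed", 0)

print(is_done(body))          # a "running" job is not terminal -> False
print(pages_processed(body))  # 6 scraped + 0 discarded + 0 failed -> 6
```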


Authorizations

Authorization
string
header, required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Path Parameters

id
string, required

Job ID

Response

200 - application/json

id
string, required

org
string, required

urls
string[], required

exclude_globs
string[]

exclude_elements
string

output_format
enum<string>, required
Available options: text, html, markdown

output_expiry
integer, required

min_length
number, required

webhook_url
string

use_browser
boolean, required

link_crawling
boolean, required

page_limit
number

force_crawling_mode
enum<string>
Available options: link, sitemap

block_resources
boolean, default: true

include_linked_files
boolean, default: false

createdAt
number, required

UNIX timestamp

finishedAt
number

UNIX timestamp

costMillicents
number

costCents
number

status
enum<string>, required
Available options: starting, running, succeeded, failed, cancelled

notices
object[], required

sitemapPageCount
number, required

progress
object

webhookFails
any