Stop Maintaining ATS Scrapers: One API for Greenhouse, Lever, Workday & Ashby
Building and maintaining scrapers for Greenhouse, Lever, Workday, and Ashby is painful and expensive. Here's why a job data API is the better path — and how to migrate.
The ATS Scraping Trap
It starts simple enough. You need job data from a few companies. You notice that Greenhouse has a public API at /boards/{company}/jobs, Lever exposes listings at /v0/postings/{company}, and Ashby has a GraphQL endpoint. You write a few fetch functions and it works. You ship.
Six months later, you're maintaining scrapers for 12 different ATS platforms. Workday changes its URL structure for the third time. Greenhouse starts rate-limiting from certain IP ranges. A company migrates from Lever to Rippling and your pipeline silently drops all their jobs. Your weekend gets eaten fixing a data issue nobody noticed for three weeks.
This is the ATS scraping trap — and it's where many job board builders, sales intelligence tools, and HR tech products get stuck. This article is about how to get out.
The ATS Landscape in 2026
There are over 40 major Applicant Tracking Systems in active use. The top ones by market share:
- Greenhouse: Dominant in Series B–D tech companies. Has a public jobs board API, but no salary data, inconsistent skill tags, and documentation that lags behind actual behavior.
- Lever: Popular with growth-stage startups. Has a public postings endpoint, but it returns raw description HTML with no structured enrichment.
- Workday: Enterprise standard. There is no standard public jobs API, and job data is notoriously difficult to collect: a different subdomain per customer, non-standard URL patterns, and heavy JavaScript rendering on many implementations.
- Ashby: Fast-growing among technical teams. Has a GraphQL endpoint, but the schema changes without warning.
- Rippling, Jobvite, iCIMS, SmartRecruiters, BambooHR, Taleo: Each with its own API quirks, authentication schemes, and data shapes.
Each ATS is a separate integration. Each integration breaks independently. Each break requires diagnosis, a fix, a deploy, and validation that you didn't break anything else.
What Building Your Own Scraper Actually Costs
The direct cost of building and maintaining ATS scrapers is often underestimated. A realistic breakdown:
- Initial build: 2–4 weeks per ATS for a production-quality scraper with error handling, rate limiting, and schema validation
- Ongoing maintenance: 4–8 hours per month per ATS for schema changes, rate limit adjustments, and new company URL patterns
- Silent failures: The hidden cost — jobs dropped because a company changed their Workday subdomain, discovered only when a user reports missing data
- Normalization: Each ATS has different field names, value formats, and conventions. Building a unified data model on top requires significant work across all platforms simultaneously
For 10 ATS platforms, maintenance alone comes to 40–80 hours per month, roughly a part-time engineer's entire workload, just to keep the data pipeline running. That's before you've built any product on top of it.
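The silent-failure cost is the hardest to see, because nothing errors: a board simply starts returning fewer jobs. One way to surface it earlier is to compare per-company job counts between pipeline runs. The sketch below is only an illustration; `detectDrops`, the `CountSnapshot` shape, and the 50% threshold are hypothetical choices, not part of any ATS API.

```typescript
// Hypothetical shape: job counts per company board from one pipeline run.
type CountSnapshot = Record<string, number>;

interface DropAlert {
  company: string;
  previous: number;
  current: number;
}

// Flag companies whose job count fell by more than `maxDropRatio`
// (assumed default: 0.5) since the previous run, including companies
// that are missing from the current run entirely.
function detectDrops(
  previous: CountSnapshot,
  current: CountSnapshot,
  maxDropRatio = 0.5,
): DropAlert[] {
  const alerts: DropAlert[] = [];
  for (const [company, prevCount] of Object.entries(previous)) {
    if (prevCount === 0) continue; // nothing to lose
    const currCount = current[company] ?? 0;
    const dropRatio = (prevCount - currCount) / prevCount;
    if (dropRatio > maxDropRatio) {
      alerts.push({ company, previous: prevCount, current: currCount });
    }
  }
  return alerts;
}

const dropAlerts = detectDrops(
  { acme: 40, globex: 12, initech: 8 },
  { acme: 38, globex: 0 }, // globex returned nothing, initech is missing
);
console.log(dropAlerts.map((a) => a.company)); // companies worth investigating
```

A check like this does not explain why a board went quiet (a migration to another ATS, a changed subdomain, or a real hiring freeze), but it turns a three-week blind spot into an alert on the next run.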
The Public ATS APIs: What They Actually Give You
Let's be concrete about what each major ATS's public API returns — and what it doesn't.
Greenhouse
// Greenhouse public job board API
GET https://boards-api.greenhouse.io/v1/boards/{company}/jobs?content=true
// What you get:
{
  "jobs": [{
    "id": 1234567,
    "title": "Senior Software Engineer",
    "location": { "name": "San Francisco, CA" },
    "content": "<div>...raw HTML description...</div>",
    "absolute_url": "https://boards.greenhouse.io/company/jobs/1234567"
    // No salary. No skills array. No seniority field.
  }]
}
Lever
// Lever public postings API
GET https://api.lever.co/v0/postings/{company}?mode=json
// What you get (a JSON array of postings):
[
  {
    "id": "uuid-here",
    "text": "Senior Software Engineer",
    "categories": { "location": "San Francisco, CA", "team": "Engineering" },
    "descriptionBody": "<div>...more raw HTML...</div>"
    // No salary. No skills array. No seniority.
  }
]
These APIs return raw, unstructured data. To get salary, skills, and seniority — the fields that make job data actually useful — you'd have to parse the HTML description for every listing. That's an NLP problem you now own, multiplied across every ATS.
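To make that burden concrete, here is a rough sketch of what the `normalizeGreenhouseJob` and `normalizeLeverJob` helpers used in the migration section might look like. The input types cover only the fields shown in the samples above; `UnifiedJob` and `guessSalaryRange` are hypothetical, and the salary regex is deliberately naive to show how quickly this turns into a parsing problem.

```typescript
// Field subsets taken from the sample responses above; real payloads carry more.
interface GreenhouseJob {
  id: number;
  title: string;
  location: { name: string };
  content: string; // raw HTML
  absolute_url: string;
}

interface LeverPosting {
  id: string;
  text: string;
  categories: { location?: string; team?: string };
  descriptionBody: string; // raw HTML
}

// Hypothetical unified model; the field names are illustrative.
interface UnifiedJob {
  source: "greenhouse" | "lever";
  sourceId: string;
  title: string;
  locationText: string | null;
  descriptionHtml: string;
  salaryRangeUsd: [number, number] | null;
}

// Naive extraction: matches "$160,000 - $220,000" or "$160k to $220k".
// It misses hourly rates, other currencies, single figures, and more.
function guessSalaryRange(html: string): [number, number] | null {
  const text = html.replace(/<[^>]*>/g, " "); // crude tag stripping
  const m = text.match(
    /\$\s*(\d{2,3})(?:,(\d{3})|k)\s*(?:-|–|to)\s*\$\s*(\d{2,3})(?:,(\d{3})|k)/i,
  );
  if (!m) return null;
  // "160" + "000" -> 160000; "160" followed by "k" -> 160 * 1000
  const toDollars = (head = "0", tail?: string) =>
    tail !== undefined ? Number(head + tail) : Number(head) * 1000;
  return [toDollars(m[1], m[2]), toDollars(m[3], m[4])];
}

function normalizeGreenhouseJob(job: GreenhouseJob): UnifiedJob {
  return {
    source: "greenhouse",
    sourceId: String(job.id),
    title: job.title,
    locationText: job.location.name,
    descriptionHtml: job.content,
    salaryRangeUsd: guessSalaryRange(job.content),
  };
}

function normalizeLeverJob(posting: LeverPosting): UnifiedJob {
  return {
    source: "lever",
    sourceId: posting.id,
    title: posting.text,
    locationText: posting.categories.location ?? null,
    descriptionHtml: posting.descriptionBody,
    salaryRangeUsd: guessSalaryRange(posting.descriptionBody),
  };
}

const fromGreenhouse = normalizeGreenhouseJob({
  id: 1234567,
  title: "Senior Software Engineer",
  location: { name: "San Francisco, CA" },
  content: "<div><p>Salary: $160,000 - $220,000</p></div>",
  absolute_url: "https://boards.greenhouse.io/company/jobs/1234567",
});
const fromLever = normalizeLeverJob({
  id: "uuid-here",
  text: "Senior Software Engineer",
  categories: { location: "San Francisco, CA", team: "Engineering" },
  descriptionBody: "<div>Competitive compensation and equity.</div>",
});
console.log(fromGreenhouse.salaryRangeUsd, fromLever.salaryRangeUsd);
```

Even this small version needs two mappers and a heuristic that fails on any posting that phrases compensation differently. Skills and seniority would each need their own extraction logic on top.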
What a Job Data API Handles For You
A purpose-built job data API like JobDataLake maintains integrations with 40+ ATS platforms and handles all of the normalization and enrichment. For each listing, you get:
// JobDataLake — one API, all ATSs
GET https://api.jobdatalake.com/v1/jobs?title=senior+software+engineer&location=San+Francisco
{
  "jobs": [{
    "id": "jdl_abc123",
    "title": "Senior Software Engineer",
    "company_name": "Acme Corp",
    "locations": ["San Francisco, CA"],
    "salary_min_usd": 160, // in thousands ($160k)
    "salary_max_usd": 220, // in thousands ($220k)
    "required_skills": ["Go", "Kubernetes", "PostgreSQL"],
    "seniority": ["senior"],
    "remote_type": "hybrid",
    "employment_type": "full_time",
    "posted_at": 1748380800000, // Unix milliseconds
    "url": "https://boards.greenhouse.io/acme/jobs/1234567"
  }]
}
The same query returns jobs from Greenhouse, Lever, Workday, and every other ATS in the index — normalized into a single schema, with salary and skills already extracted.
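Two details in the sample response are easy to misread: salaries are expressed in thousands of USD, and `posted_at` is in Unix milliseconds. A minimal sketch of consuming those fields follows. The `NormalizedJob` type is inferred from the sample only, treating salary as optional is an assumption, and `salaryLabel` and `daysSincePosted` are hypothetical helpers.

```typescript
// Shape inferred from the sample response; optionality is an assumption.
interface NormalizedJob {
  id: string;
  title: string;
  company_name: string;
  salary_min_usd?: number; // thousands of USD, per the sample's comments
  salary_max_usd?: number; // thousands of USD
  posted_at: number; // Unix milliseconds
}

// Render the salary range for display, without multiplying by 1000 twice.
function salaryLabel(job: NormalizedJob): string {
  if (job.salary_min_usd === undefined || job.salary_max_usd === undefined) {
    return "salary not listed";
  }
  return `$${job.salary_min_usd}k-$${job.salary_max_usd}k`;
}

// Whole days elapsed since the job was posted. `now` is injectable for testing.
function daysSincePosted(job: NormalizedJob, now: number = Date.now()): number {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor((now - job.posted_at) / msPerDay);
}

const sampleJob: NormalizedJob = {
  id: "jdl_abc123",
  title: "Senior Software Engineer",
  company_name: "Acme Corp",
  salary_min_usd: 160,
  salary_max_usd: 220,
  posted_at: 1748380800000,
};

console.log(salaryLabel(sampleJob));
// posted_at is already in milliseconds, so it can be passed to Date directly.
console.log(new Date(sampleJob.posted_at).toISOString());
```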
Handling the Long Tail: 40+ ATSs
The major ATSs get the most attention, but there's a significant long tail. Rippling, Jobvite, iCIMS, SmartRecruiters, BambooHR, Taleo, Oracle HCM, SAP SuccessFactors, Workable, JazzHR — each used by thousands of companies, each with different data shapes.
If you're targeting enterprise companies, you'll encounter Workday. If you're targeting mid-market, you'll hit iCIMS and Jobvite. Each requires its own scraper, its own normalization, its own maintenance window when the schema changes.
The economics of maintaining 40+ integrations only make sense at very large scale. For everyone else, the build-vs-buy calculation strongly favors a job data API.
Migration: From Custom Scrapers to the API
If you have existing ATS scraper code, migrating to a job data API typically takes a day or two. The key change is replacing per-ATS fetches with a single parameterized API call:
// Before: per-ATS scrapers
async function fetchGreenhouse(company: string) {
  const res = await fetch(`https://boards-api.greenhouse.io/v1/boards/${company}/jobs?content=true`);
  const data = await res.json();
  return data.jobs.map(normalizeGreenhouseJob); // Your normalization code
}
async function fetchLever(company: string) {
  const res = await fetch(`https://api.lever.co/v0/postings/${company}?mode=json`);
  const data = await res.json();
  return data.map(normalizeLeverJob); // Different normalization
}
// After: one call, all ATSs, pre-normalized
async function fetchJobs(params: { title?: string; location?: string; company?: string }) {
  const query = new URLSearchParams();
  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined) query.set(key, value); // Skip unset filters rather than sending "undefined"
  }
  const res = await fetch(`https://api.jobdatalake.com/v1/jobs?${query}`, {
    headers: { 'X-API-Key': process.env.JDL_API_KEY! },
  });
  if (!res.ok) throw new Error(`Job API request failed with status ${res.status}`);
  const { jobs } = await res.json();
  return jobs; // Already normalized, already enriched
}
You delete thousands of lines of normalization code. You stop getting paged for scraper failures. You start getting data from ATSs you hadn't even considered building scrapers for.
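Deleting the normalization code does not have to mean deleting every safety check. A small runtime guard on the response keeps upstream schema drift from becoming a new kind of silent failure. This is a sketch under assumptions: `JobRecord` lists a subset of the fields from the sample response, treating them as always present is a guess rather than documented behavior, and `isJobRecord` and `validateJobs` are hypothetical names.

```typescript
// Minimal subset of the normalized schema shown earlier; which fields are
// always present is an assumption, not documented behavior.
interface JobRecord {
  id: string;
  title: string;
  company_name: string;
  url: string;
}

// Type guard: true only when every expected field is a string.
function isJobRecord(value: unknown): value is JobRecord {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record["id"] === "string" &&
    typeof record["title"] === "string" &&
    typeof record["company_name"] === "string" &&
    typeof record["url"] === "string"
  );
}

// Split a parsed response body into usable jobs and a count of rejects,
// so malformed entries are reported instead of silently dropped.
function validateJobs(body: unknown): { jobs: JobRecord[]; rejected: number } {
  const raw = (body as { jobs?: unknown } | null)?.jobs;
  if (!Array.isArray(raw)) throw new Error("response has no jobs array");
  const jobs = raw.filter(isJobRecord);
  return { jobs, rejected: raw.length - jobs.length };
}

const checkedResponse = validateJobs({
  jobs: [
    {
      id: "jdl_abc123",
      title: "Senior Software Engineer",
      company_name: "Acme Corp",
      url: "https://boards.greenhouse.io/acme/jobs/1234567",
    },
    { id: 42, title: "Malformed entry" }, // wrong id type, missing fields
  ],
});
console.log(checkedResponse.jobs.length, checkedResponse.rejected);
```

In practice the result of `res.json()` in `fetchJobs` could be passed through a check like this, with a non-zero `rejected` count logged or alerted on.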
When Building Your Own Makes Sense
There are legitimate reasons to maintain custom ATS scrapers:
- You're targeting a very specific, niche ATS not covered by any API provider
- You need company-internal data not exposed via the public jobs board (proprietary ATS fields, internal-only postings)
- You're operating at a scale where the economics of a data API are genuinely worse than in-house engineering
For most products — especially those early in development — none of these apply. The job data problem is solved infrastructure. Use the API and spend your engineering time on the product.
Frequently Asked Questions
Does Greenhouse have a public job listing API?
Yes. Greenhouse exposes a public job board API at boards-api.greenhouse.io/v1/boards/{company}/jobs. However, it returns raw HTML descriptions with no salary, skills, or seniority fields. A job data API like JobDataLake aggregates Greenhouse along with the rest of its 40+ ATS integrations into a single normalized endpoint.
How do I get job data from Workday?
Workday has no standard public API. Each company has a custom subdomain and URL structure. JobDataLake crawls Workday job boards across thousands of companies and normalizes the data into a consistent REST API — no Workday integration required.
Is it legal to scrape Greenhouse or Lever job boards?
Public job boards are generally legal to crawl, but terms of service vary. Using a job data API that has established relationships with ATS providers is the lower-risk path, and eliminates all the infrastructure maintenance overhead.
Try JobDataLake
1M+ enriched job listings from 20,000+ companies. Free API key with 1,000 credits — no credit card required.