Data Engineering · 2026-04-18 · 10 min read

Extracting Tech Stack Intelligence from Job Postings

How to parse job descriptions to extract technology signals, normalize tech names, and build accurate firmographic data from job posting analysis.

Job Descriptions as Tech Stack Confessions

Every time a company writes a job description, they inadvertently publish their technology roadmap. "Experience with Snowflake and dbt required" tells you their data stack. "Must have Kubernetes and Terraform experience" reveals their infrastructure approach. "3+ years with React and GraphQL" signals their frontend and API patterns.

Aggregated across thousands of postings from a single company, this data becomes a surprisingly detailed picture of their engineering environment. Aggregated across thousands of companies, it becomes a market map: who's adopting which technologies, how fast, and in which industries.

This article covers how to extract, normalize, and operationalize tech stack intelligence from job postings.

The Extraction Challenge

Technology names appear in job descriptions in several forms:

  • Explicit skill requirements: "Required skills: Python, PostgreSQL, Redis" — easiest to parse, especially when an API provides a structured skills array
  • Inline mentions: "You'll work primarily in our Go microservices environment, with some Python scripting" — requires NLP or pattern matching
  • Implied by context: "We use the AWS ML stack" doesn't name specific services but implies SageMaker, S3, Lambda involvement
  • Version-qualified mentions: "Python 3.10+", "React 18", "PostgreSQL 15" — need to extract both the technology and version signal

A job data API with pre-extracted skills (like the skills array in JobDataLake) gets you most of the way there for standard technologies. For deeper extraction, you still need to parse the description text.
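Version-qualified mentions are worth handling explicitly, since the version itself is a signal (a posting asking for "Python 3.10+" tells you the company is not stuck on legacy 2.x). A minimal sketch, assuming a small illustrative list of version-tracked technologies:

```typescript
// Extract technology names with trailing version numbers,
// e.g. "Python 3.10+", "React 18", "PostgreSQL 15".
interface VersionedMention {
  tech: string;
  version: string;
}

function extractVersionedMentions(text: string): VersionedMention[] {
  // Illustrative subset of technologies worth version-tracking
  const versioned = /\b(python|react|postgresql|node\.js|java)\s+v?(\d+(?:\.\d+)*\+?)/gi;
  const mentions: VersionedMention[] = [];
  for (const match of text.matchAll(versioned)) {
    mentions.push({ tech: match[1].toLowerCase(), version: match[2] });
  }
  return mentions;
}
```

A real pattern list would cover far more technologies; the point is that one regex pass recovers both the name and the version floor.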

Building a Technology Extraction Pipeline

Step 1: Define Your Tech Dictionary

Start with a comprehensive dictionary of technologies to detect, organized by category:

type TechCategory =
  | 'language' | 'database' | 'cloud' | 'infrastructure'
  | 'frontend' | 'backend' | 'data';

const TECH_DICTIONARY: Record<string, TechCategory> = {
  // Languages
  'python': 'language', 'javascript': 'language', 'typescript': 'language',
  'go': 'language', 'golang': 'language', 'rust': 'language',
  'java': 'language', 'kotlin': 'language', 'scala': 'language',
  'ruby': 'language', 'php': 'language', 'c#': 'language', '.net': 'language',

  // Databases
  'postgresql': 'database', 'postgres': 'database', 'mysql': 'database',
  'mongodb': 'database', 'redis': 'database', 'elasticsearch': 'database',
  'cassandra': 'database', 'dynamodb': 'database', 'snowflake': 'database',
  'bigquery': 'database', 'redshift': 'database', 'clickhouse': 'database',

  // Cloud
  'aws': 'cloud', 'amazon web services': 'cloud', 'gcp': 'cloud',
  'google cloud': 'cloud', 'azure': 'cloud',

  // Infrastructure
  'kubernetes': 'infrastructure', 'k8s': 'infrastructure', 'docker': 'infrastructure',
  'terraform': 'infrastructure', 'ansible': 'infrastructure',

  // Frameworks
  'react': 'frontend', 'vue': 'frontend', 'angular': 'frontend', 'nextjs': 'frontend',
  'django': 'backend', 'fastapi': 'backend', 'rails': 'backend', 'express': 'backend',
  'node.js': 'backend',

  // Data
  'spark': 'data', 'kafka': 'data', 'airflow': 'data', 'dbt': 'data',
  'flink': 'data', 'beam': 'data', 'databricks': 'data',
};

Step 2: Normalize Aliases

Technologies have many alternate names. Normalize before counting:

const TECH_ALIASES: Record<string, string> = {
  'golang': 'go',
  'postgres': 'postgresql',
  'k8s': 'kubernetes',
  // Alias targets must be keys in TECH_DICTIONARY, or lookups will miss
  'google cloud': 'gcp',
  'amazon web services': 'aws',
  'node': 'node.js',
  'node.js': 'node.js',
  'nodejs': 'node.js',
  'react.js': 'react',
  'reactjs': 'react',
  'vue.js': 'vue',
  'vuejs': 'vue',
  'next.js': 'nextjs',
  'angular.js': 'angular',
  'angularjs': 'angular',
};

function normalizeTechName(raw: string): string {
  const lower = raw.toLowerCase().trim();
  return TECH_ALIASES[lower] ?? lower;
}

Step 3: Multi-Strategy Extraction

Use both the structured skills array and description parsing for maximum coverage:

function extractTechStack(job: Job): TechMention[] {
  const found = new Map<string, TechMention>();

  // Strategy 1: Structured skills array (highest confidence)
  for (const skill of (job.skills ?? [])) {
    const normalized = normalizeTechName(skill);
    if (TECH_DICTIONARY[normalized]) {
      found.set(normalized, {
        name: normalized,
        category: TECH_DICTIONARY[normalized],
        confidence: 'high',
        source: 'skills_field',
      });
    }
  }

  // Strategy 2: Description text parsing (lower confidence, but catches unlisted techs)
  const words = tokenize(job.description);
  for (const word of words) {
    const normalized = normalizeTechName(word);
    if (TECH_DICTIONARY[normalized] && !found.has(normalized)) {
      found.set(normalized, {
        name: normalized,
        category: TECH_DICTIONARY[normalized],
        confidence: 'medium',
        source: 'description_parse',
      });
    }
  }

  return Array.from(found.values());
}

// Tokenize respecting common tech punctuation
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[\s,;()[\]{}|]+/)
    .map(t => t.replace(/[.!?]+$/, ''))
    .filter(t => t.length >= 2);
}
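To see the description-parsing strategy end to end, here is a self-contained demo with a cut-down dictionary (the entries and sample posting are illustrative; the full pipeline above adds the higher-confidence skills-array pass first):

```typescript
// Minimal self-contained demo of tokenize + normalize + dictionary lookup
const DEMO_DICT: Record<string, string> = {
  python: 'language',
  postgresql: 'database',
  kubernetes: 'infrastructure',
};
const DEMO_ALIASES: Record<string, string> = { postgres: 'postgresql', k8s: 'kubernetes' };

function demoTokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[\s,;()[\]{}|]+/)
    .map(t => t.replace(/[.!?]+$/, ''))
    .filter(t => t.length >= 2);
}

function demoExtract(description: string): string[] {
  const found = new Set<string>();
  for (const word of demoTokenize(description)) {
    const norm = DEMO_ALIASES[word] ?? word; // normalize aliases to canonical names
    if (DEMO_DICT[norm]) found.add(norm);
  }
  return Array.from(found);
}
```

Running `demoExtract('We run Python services on k8s, backed by Postgres.')` surfaces python, kubernetes, and postgresql, with the aliases collapsed to their canonical names.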

Company-Level Tech Profile Aggregation

Individual job postings are noisy — one posting might mention a technology as a "nice to have" rather than a core requirement. The signal becomes more reliable when aggregated across multiple postings from the same company:

async function buildCompanyTechProfile(companyName: string): Promise<CompanyTechProfile> {
  // Fetch recent postings for this company (last ~6 months)
  const postedAfter = new Date(Date.now() - 180 * 24 * 60 * 60 * 1000).toISOString();
  const res = await fetch(
    `https://api.jobdatalake.com/v1/jobs?company=${encodeURIComponent(companyName)}&posted_after=${postedAfter}&limit=200`,
    { headers: { 'X-API-Key': process.env.JDL_API_KEY! } }
  );
  const { jobs } = await res.json();

  if (jobs.length === 0) return { company: companyName, techs: [], confidence: 'low' };

  // Extract and aggregate tech mentions
  const techCounts = new Map<string, number>();
  const techCategories = new Map<string, string>();

  for (const job of jobs) {
    const techs = extractTechStack(job);
    for (const tech of techs) {
      techCounts.set(tech.name, (techCounts.get(tech.name) ?? 0) + 1);
      techCategories.set(tech.name, tech.category);
    }
  }

  // Convert to ranked list with frequency percentage
  const ranked = Array.from(techCounts.entries())
    .map(([name, count]) => ({
      name,
      category: techCategories.get(name)!,
      mentionCount: count,
      frequency: Math.round((count / jobs.length) * 100),
    }))
    .sort((a, b) => b.mentionCount - a.mentionCount);

  return {
    company: companyName,
    jobsSampled: jobs.length,
    techs: ranked,
    // Core stack: mentioned in at least 30% of postings
    coreStack: ranked.filter(t => t.frequency >= 30).map(t => t.name),
    confidence: jobs.length >= 10 ? 'high' : jobs.length >= 5 ? 'medium' : 'low',
  };
}

Handling False Positives

Technology extraction has a false positive problem — "Java" appears in "JavaScript", "Go" appears in many English words, "Ruby" might refer to a person named Ruby. A few mitigation strategies:

  • Word boundary matching: Use regex with word boundaries (e.g., \bgo\b) rather than substring matching
  • Context validation: "Go programming" or "Go microservices" is more confident than standalone "Go"
  • Minimum frequency threshold: If a "technology" appears in only 1 of 100 postings, treat it with skepticism
  • Structured fields first: The skills array from a job data API has already been extracted by a system specifically designed for this — trust it over your own description parsing

// Use word boundaries to avoid substring false positives
function extractWithBoundaries(text: string, tech: string): boolean {
  const escapedTech = tech.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp(`\\b${escapedTech}\\b`, 'i');
  return pattern.test(text);
}
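Context validation can be a lightweight heuristic on top of boundary matching: accept an ambiguous token like "go" only when a tech-sounding word appears nearby. A sketch, where the ambiguous-name set and context word list are illustrative starting points:

```typescript
// Names that collide with ordinary English words or first names
const AMBIGUOUS = new Set(['go', 'ruby', 'rust', 'swift', 'r']);
const CONTEXT_WORDS = /\b(programming|language|developer|engineer|microservices?|backend|services?|code|stack)\b/i;

function isLikelyTechMention(text: string, tech: string): boolean {
  if (!AMBIGUOUS.has(tech.toLowerCase())) return true; // unambiguous names pass through
  const pattern = new RegExp(`\\b${tech}\\b`, 'i');
  const match = pattern.exec(text);
  if (!match) return false;
  // Look for a tech-context word within ~40 characters of the mention
  const windowText = text.slice(Math.max(0, match.index - 40), match.index + tech.length + 40);
  return CONTEXT_WORDS.test(windowText);
}
```

So "Our Go microservices environment" passes (the nearby "microservices" validates it) while "You will go to conferences" is rejected.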

Trend Analysis: Tracking Technology Adoption Over Time

One of the most valuable outputs of a tech stack intelligence system is trend data: which technologies are growing, which are declining, and where the market is heading.

async function analyzeTechTrend(tech: string, months = 12) {
  // Date helpers (startOfMonth, subMonths, endOfMonth) from date-fns
  const results = [];

  for (let i = 0; i < months; i++) {
    const monthStart = startOfMonth(subMonths(new Date(), i));
    const monthEnd = endOfMonth(monthStart);

    const res = await fetch(
      `https://api.jobdatalake.com/v1/jobs?skills=${encodeURIComponent(tech)}&posted_after=${monthStart.toISOString()}&posted_before=${monthEnd.toISOString()}&limit=1`,
      { headers: { 'X-API-Key': process.env.JDL_API_KEY! } }
    );
    const { total } = await res.json();

    results.push({ month: monthStart, count: total });
  }

  return results.reverse(); // Chronological order
}
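The monthly counts can then be reduced to a growth summary. A minimal sketch operating on the chronological array shape returned above:

```typescript
interface MonthCount {
  month: Date;
  count: number;
}

// Compute overall growth and average month-over-month growth
// from a chronological series of monthly posting counts
function summarizeTrend(series: MonthCount[]): { totalGrowthPct: number; avgMoMPct: number } {
  const first = series[0].count;
  const last = series[series.length - 1].count;
  const totalGrowthPct = ((last - first) / first) * 100;

  let momSum = 0;
  for (let i = 1; i < series.length; i++) {
    momSum += (series[i].count - series[i - 1].count) / series[i - 1].count;
  }
  const avgMoMPct = (momSum / (series.length - 1)) * 100;

  return { totalGrowthPct, avgMoMPct };
}
```

A series of 100, 110, 121 postings works out to 21% total growth and 10% average month-over-month growth.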

This kind of trend data is genuinely valuable: developer tools companies use it to size markets, VCs use it to track category growth, and hiring managers use it to anticipate salary pressure as a technology becomes more competitive.

Building a Firmographic Database

Combine tech stack profiles across a universe of companies and you have a powerful firmographic database — one that can answer questions like:

  • Which companies in the Fortune 1000 are using Kafka?
  • What percentage of Series B startups have adopted Rust?
  • Which companies recently switched from MySQL to PostgreSQL (based on shifting mentions in job postings)?

This is the data that powers the best B2B prospecting tools, developer advocacy programs, and competitive intelligence products. And unlike proprietary surveys or web scraping, job posting data is continuously refreshed — companies update their stack requirements as their technology evolves.
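Once company profiles are materialized, questions like the ones above reduce to simple filters. A sketch against a pared-down profile shape mirroring buildCompanyTechProfile (the sample companies are hypothetical):

```typescript
interface CompanyProfile {
  company: string;
  coreStack: string[];
  confidence: 'high' | 'medium' | 'low';
}

// Find companies whose core stack includes a given technology,
// dropping low-confidence profiles (too few postings sampled)
function companiesUsing(profiles: CompanyProfile[], tech: string): string[] {
  return profiles
    .filter(p => p.confidence !== 'low' && p.coreStack.includes(tech))
    .map(p => p.company);
}
```

The same filter-and-map pattern extends to segment cuts (industry, funding stage) once those attributes are joined onto the profile.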

Practical Applications

Who uses tech stack intelligence extracted from job postings?

  • Developer tools companies: Identifying potential customers using complementary or competitor tools
  • Technical recruiters: Understanding what technologies a company actually uses before pitching candidates
  • VC analysts: Tracking technology adoption trends to identify emerging categories early
  • Market research firms: Building credible technology adoption reports
  • Product managers: Understanding competitive landscape through technology co-occurrence patterns

The raw material is the same in each case — job postings flowing through an API — but the application layer transforms it into intelligence worth paying for.

Frequently Asked Questions

How do you extract tech stack from job postings?

Use a multi-strategy approach: first check structured skills arrays from enriched job data APIs, then parse descriptions for technology keywords using a curated dictionary with alias handling (e.g., k8s = Kubernetes).

What is technographic data?

Technographic data describes what technologies a company uses. Job postings are one of the best sources — when a company lists AWS, Terraform, Datadog in job requirements, those are confirmed technology investments.

How accurate is tech stack data from job postings?

Generally accurate for current adoption, since companies list tools they actively use. The main challenges are false positives from generic terms and nice-to-have skills inflating the signal; aggregating across postings and using enriched APIs that pre-extract skills improves reliability.

Try JobDataLake

1M+ enriched job listings from 20,000+ companies. Free API key with 1,000 credits — no credit card required.