Guide

How to scrape LinkedIn safely in 2026

The honest guide to scraping LinkedIn profiles and company pages without burning your session or violating ToS boundaries that matter.

Stekpad Team · 7 min read

LinkedIn is the hardest surface on the public web to scrape. The anti-bot stack is aggressive, the public HTML is nearly empty, and the logged-out experience redirects most of what you care about behind a wall. Every "LinkedIn scraper" you see advertised on the open market is doing one of two things: running on a pool of throwaway accounts that get banned weekly, or fetching pages through your own session while pretending that is a safe default.

This post explains the only approach we trust. You scrape LinkedIn from your own logged-in browser, through Stekpad's cookie bridge, on the lists you already own. Your session cookies never touch a Stekpad server. The fetches happen in your Chrome, one page at a time, at a rate you control. The only people you scrape are people you were already going to visit manually.

That last sentence is the one that keeps you out of trouble. If your workflow starts with "I have a list of 200 leads my sales team already added to Sales Navigator and I want to enrich their profiles", this is the right tool. If it starts with "I want to scrape 50,000 random profiles", stop reading. That is neither safe nor legal in most jurisdictions, and no tool will make it safe.

Why every other LinkedIn scraper is a trap

Three common patterns, three honest problems.

Pattern 1: server-side scraping with shared accounts. Some providers run a pool of real LinkedIn accounts on their servers and rent them out. Your request hits their backend, their backend logs into LinkedIn from a datacenter IP, scrapes the page, and returns HTML. Problem: LinkedIn bans those accounts in waves. Your data goes stale the moment the account dies, and LinkedIn's detection improves every quarter. You are paying to watch accounts burn.

Pattern 2: server-side scraping with your cookies. Some providers ask you to paste your li_at cookie into a form, then use it server-side. Problem: your session is now on a third-party server. If that server gets breached, a stranger has full write access to your LinkedIn. No password prompt required. No 2FA. This is the worst pattern on the market and it is surprisingly common.

Pattern 3: Chrome extension that reads the DOM. Some tools run as an extension that reads the profile in your open tab. This is fine for one profile at a time, but it forces a human into the loop for each fetch, and most extensions sell your scraped data back to their customer base as a "people database". Check the privacy policy before you install anything.

The Stekpad cookie bridge is a fourth pattern and it is the one we build around: fetches run in your browser, scheduled by your backend code, with nothing stored on our servers.

You install the Stekpad Chrome extension once. It signs in with your API key. It opens a WebSocket to our backend. When you call `/v1/scrape` with `"use_session": "linkedin.com"`, the backend pushes a fetch job to your extension. Your browser opens the target URL in a background fetch, attaches your existing LinkedIn cookies (the browser does this for you, just like when you click a link), runs any actions you asked for (click, scroll, wait), and returns the rendered HTML over the WebSocket. The backend converts it to markdown or JSON and stores the result in a dataset.

Your cookies never leave your browser. We do not store them, log them, proxy them, or look at them. The extension's fetch journal records every URL we asked for, so you can audit exactly what happened. Read the cookie bridge architecture doc for the full sequence diagram.

Your session cookies never leave your browser. That is an architectural rule, not a feature. Every request logged, every fetch auditable, nothing persisted backend-side.

Step 1: install the extension and enable linkedin.com

Install the extension from its Chrome Web Store listing, click the Stekpad icon, and sign in with your workspace API key. In the Sessions panel, click Add domain, type linkedin.com, and confirm. The panel shows a green pill when the bridge is connected, and the extension badge turns on.

You can revoke linkedin.com from the extension at any time. You can also revoke it from the Stekpad web app — the kill switch pushes to the extension instantly and every pending fetch is cancelled. That two-sided revocation is one of the reasons we built this as an in-house extension instead of recommending a generic proxy tool.

Step 2: scrape a single profile

Here is a curl for one profile. Note `"use_session": "linkedin.com"` — that is the flag that routes the fetch through your browser instead of through our backend's public fetch path.

```bash
curl -X POST https://api.stekpad.com/v1/scrape \
  -H "Authorization: Bearer stkpd_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.linkedin.com/in/williamhgates/",
    "formats": ["markdown", "json"],
    "use_session": "linkedin.com",
    "schema": {
      "type": "object",
      "properties": {
        "name": {"type": "string"},
        "headline": {"type": "string"},
        "location": {"type": "string"},
        "current_company": {"type": "string"},
        "current_role": {"type": "string"},
        "about": {"type": "string"}
      }
    },
    "persist": true,
    "dataset": { "type": "table", "name": "Profiles to enrich" }
  }'
```
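The same call works from Python with the standard library alone. A minimal sketch assuming the endpoint, headers, and body from the curl above; `build_profile_payload` and `scrape_profile` are illustrative names, not a Stekpad SDK:

```python
import json
import urllib.request

API_URL = "https://api.stekpad.com/v1/scrape"  # endpoint from the curl example


def build_profile_payload(profile_url: str) -> dict:
    """Build the same request body as the curl example above."""
    return {
        "url": profile_url,
        "formats": ["markdown", "json"],
        "use_session": "linkedin.com",  # route the fetch through your own browser
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "headline": {"type": "string"},
                "location": {"type": "string"},
                "current_company": {"type": "string"},
                "current_role": {"type": "string"},
                "about": {"type": "string"},
            },
        },
        "persist": True,
        "dataset": {"type": "table", "name": "Profiles to enrich"},
    }


def scrape_profile(profile_url: str, api_key: str) -> dict:
    """POST one profile scrape and return the parsed JSON response."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_profile_payload(profile_url)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    # Session-backed fetches round-trip through your Chrome, so allow time.
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)
```

Keeping payload construction separate from the HTTP call makes it easy to loop over a lead list later without duplicating the schema.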

The response looks like any other `scrape` response: a `run_id`, the markdown, the JSON object that matches your schema, and the `row_id` in the target dataset. The only difference is that `session_used` in the metadata reads `linkedin.com`, and the fetch took a little longer because it round-tripped through your Chrome.
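For orientation, a trimmed response might look like the following. Only `run_id`, the markdown/JSON data, `row_id`, and `session_used` are described above; the exact envelope around them is illustrative:

```json
{
  "run_id": "run_...",
  "data": {
    "markdown": "...",
    "json": { "name": "Bill Gates", "headline": "..." }
  },
  "row_id": "row_...",
  "metadata": { "session_used": "linkedin.com" }
}
```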

If the extension is not connected when you make the call, you get a structured error back:

```json
{
  "error": {
    "code": "session_unavailable",
    "domain": "linkedin.com",
    "guidance": "Open Chrome with the Stekpad extension active, or remove use_session from this request."
  }
}
```

That is the contract. The backend never silently falls back to a public fetch when you asked for a session. If the bridge is down, you get told.
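In client code, that contract means `session_unavailable` should be a hard stop rather than something to retry in a tight loop, because the fetch will keep failing until the extension reconnects. A sketch, assuming only the error shape shown above (`check_scrape_response` and the exception class are our names):

```python
class SessionBridgeDown(RuntimeError):
    """Raised when a session-backed fetch was requested but the bridge is offline."""


def check_scrape_response(body: dict) -> dict:
    """Return the parsed /v1/scrape response body, or raise on error.

    A session_unavailable error is surfaced as its own exception type so
    the caller can pause the queue and alert an operator instead of retrying.
    """
    error = body.get("error")
    if error is None:
        return body
    if error.get("code") == "session_unavailable":
        raise SessionBridgeDown(
            f"{error.get('domain', '?')}: {error.get('guidance', 'bridge offline')}"
        )
    raise RuntimeError(f"scrape failed: {error}")
```

The separate exception type lets a job runner distinguish "stop everything, the bridge is down" from per-URL failures that are safe to skip.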

Step 3: scrape a company page

Company pages are public enough that you can sometimes scrape them without a session. You will get a thin result. With a session you get the full about section, employee count, industry, headquarters, specialties, and the activity feed. Same call, different URL.

```bash
curl -X POST https://api.stekpad.com/v1/scrape \
  -H "Authorization: Bearer stkpd_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.linkedin.com/company/microsoft/",
    "formats": ["json"],
    "use_session": "linkedin.com",
    "schema": {
      "type": "object",
      "properties": {
        "name": {"type": "string"},
        "tagline": {"type": "string"},
        "industry": {"type": "string"},
        "employee_count_range": {"type": "string"},
        "headquarters": {"type": "string"},
        "website": {"type": "string", "format": "uri"},
        "specialties": {"type": "array", "items": {"type": "string"}}
      }
    }
  }'
```

Each URL costs 5 credits on the extract path. The row lands in your dataset, keyed by canonical URL. Re-running the same URL next quarter updates the row and bumps `_scraped_version` so you can track changes over time.
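Because re-runs update the row in place and bump `_scraped_version`, tracking quarter-over-quarter changes reduces to diffing two row snapshots. A sketch, assuming rows come back as flat dicts with internal fields prefixed by an underscore (`changed_fields` is our helper, not a Stekpad API):

```python
def changed_fields(old: dict, new: dict) -> dict:
    """Return {field: (old_value, new_value)} for fields that differ
    between two scraped versions of the same row. Internal fields
    (prefixed with "_", like _scraped_version) are skipped."""
    keys = {k for k in (*old, *new) if not k.startswith("_")}
    return {
        k: (old.get(k), new.get(k))
        for k in keys
        if old.get(k) != new.get(k)
    }
```

Running this over the previous and current snapshot of a company row gives you a change log for free, with no extra credits spent.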

Step 4: rate limits and the only throttle that matters

LinkedIn's unwritten rate limit is your own usage pattern. A human opens maybe 30 profiles an hour on a busy day. If your scraper opens 300 profiles in 60 seconds through your session, LinkedIn notices, and your account gets a soft ban — no feed for a day, sometimes longer. After a few of those, you get a hard suspension.

The Stekpad cookie bridge ships with three defaults that keep you human-paced:

  • One concurrent fetch per domain, per workspace. Two scrape calls on linkedin.com queue, they do not run in parallel through your session.
  • A minimum spacing of 1.5 seconds between fetches on any session-backed domain. Configurable upward, not downward.
  • A cap of 200 session-backed fetches per domain per hour on the default bridge policy. Raisable only by explicit confirmation in the extension popup, per-session.

These are not the limits that protect Stekpad. They are the limits that protect your LinkedIn account. You can turn them off, but the extension makes you click through a warning that explains what happens if you do.
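If you schedule jobs from your own backend, you can mirror those defaults client-side so your queue never submits faster than the bridge will run. A sketch of the spacing-plus-rolling-hour logic (the class is ours; the 1.5-second spacing and 200-per-hour cap are the default numbers listed above):

```python
import time
from collections import deque


class HumanPacedThrottle:
    """Client-side mirror of the bridge defaults: a minimum spacing between
    fetches and a cap on fetches per rolling hour."""

    def __init__(self, min_spacing=1.5, hourly_cap=200, clock=time.monotonic):
        self.min_spacing = min_spacing
        self.hourly_cap = hourly_cap
        self.clock = clock
        self.history = deque()  # timestamps of past fetches, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before the next fetch is allowed (0.0 if ready)."""
        now = self.clock()
        # Drop fetches older than one hour from the rolling window.
        while self.history and now - self.history[0] >= 3600:
            self.history.popleft()
        wait = 0.0
        if self.history:
            # Respect the minimum spacing since the most recent fetch.
            wait = max(wait, self.history[-1] + self.min_spacing - now)
        if len(self.history) >= self.hourly_cap:
            # Window is full: wait until the oldest fetch ages out.
            wait = max(wait, self.history[0] + 3600 - now)
        return wait

    def record_fetch(self):
        self.history.append(self.clock())
```

A worker loop then becomes: sleep for `wait_time()`, fire the scrape, call `record_fetch()`, repeat. The bridge enforces its own limits regardless; pacing on your side just keeps jobs from piling up in the queue.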

Step 5: the list you already own

The rule we will repeat because it matters: scrape profiles you already have a reason to visit. That means:

  • Leads already in your Sales Navigator lists
  • People who accepted your connection request
  • Authors of posts you commented on
  • Employees of companies in your CRM
  • Attendees of an event you registered for

Stekpad does not publish a "LinkedIn people database". We do not want one. The Terms of Service on LinkedIn are explicit that data is accessible to logged-in members for personal use, and resale or mass-scraping of member data is not allowed. Court rulings are still in flux (the hiQ v. LinkedIn saga is worth reading), but "scraping data you were already going to look at as part of your job" is both the safest legal posture and the only one we can build a product around without embarrassing ourselves.

How the alternatives compare, honestly

PhantomBuster runs on their servers, uses your cookie (or a rented account), costs 69 € to 900 € per month, and charges per "slot" of concurrent execution. The trade-off: you get scheduling and a catalog of pre-wired flows. The cost: your session cookie sits on their infra. See the full PhantomBuster comparison for the side-by-side.

Clay integrates LinkedIn enrichment as part of its waterfall, backed by third-party providers whose data freshness varies. It is integrated, it is powerful in the right hands, and it is expensive (around 350 € a month for a usable plan). If you live in Clay already and you want LinkedIn as one data source among many, it works. If you want direct control of your session and your dataset, it is the wrong shape.

Proxycurl is an API that sells LinkedIn person and company data without any session on your side. The data comes from their own scraping pipeline. It is useful when you do not have a session at all. We treat it as a premium enricher in the `linkedin_enrich` option on Cloud plans, but it is explicitly opt-in and labeled — your data does leave our stack in that one call, and we say so.

Homebrew scripts. You can absolutely write a Puppeteer script that drives your own Chrome and scrapes profiles. For 20 profiles a month it is fine. By profile 200 you will have built a worse version of the cookie bridge, minus the audit log and the kill switch.

Next steps

Stekpad Team
We build Stekpad. We scrape the web, store it, and enrich it — from an API, from an app, or from Claude.

Try the API. Free to start.

3 free runs a day on the playground. No credit card. Install MCP for Claude in 60 seconds.
