Why Raw Puppeteer Gets Caught Every Time
If you've ever fired up Puppeteer against a protected site and watched your request get blocked within seconds, you've hit the same wall every scraping engineer meets: automated browsers leak detectable signals. It doesn't matter how clever your selectors are if the site knows you're a bot before the page even loads.
The core problem is that Chromium — the engine Puppeteer drives — was designed for testing, not for blending in. When you launch it via the DevTools Protocol, it leaves a trail of artifacts that anti-bot systems are specifically trained to find.
The Big Three Detection Signals
Here are the most reliable tells that expose a raw Puppeteer session:
- `navigator.webdriver` — Set to `true` in any Chromium instance launched via WebDriver or CDP. Cloudflare, DataDome, and Akamai all check this property first.
- Inconsistent plugins and mimeTypes arrays — Headless Chromium reports an empty `navigator.plugins` array, while a real Chrome browser lists PDF Viewer, Chrome PDF Viewer, and others. This mismatch is trivially detectable.
- iframe and chromedriver artifacts — Automation tooling injects `__nightmare`, `cdc_`-prefixed variables, and internal iframe references that have no equivalent in a human-driven browser.
But those three are just the start. Modern anti-bot systems also check WebGL renderer strings, canvas fingerprint consistency, navigator.languages ordering, window.chrome object presence, User-Agent vs. navigator.platform mismatches, and timing-based behavioral signals. Raw Puppeteer fails on most of these out of the box.
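You can watch these checks fail (or pass) yourself. A minimal probe, using only standard browser APIs, run in the page context of any Puppeteer session:

// Dump the classic automation tells the way an anti-bot script reads them.
const tells = await page.evaluate(() => ({
  webdriver: navigator.webdriver,                   // true in raw automation
  pluginCount: navigator.plugins.length,            // 0 in raw headless
  languages: navigator.languages,                   // odd ordering is a flag
  hasChrome: typeof window.chrome !== 'undefined',  // missing in headless
  platform: navigator.platform,                     // must match the UA
}));
console.log(tells);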
Puppeteer-Extra with Stealth Plugin: What It Actually Patches
The `puppeteer-extra-plugin-stealth` package is a collection of evasion modules — each one targeting a specific detection vector. It's not magic; it's a stack of carefully ordered interceptors that run before any page script executes.
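The modules are individually toggleable, which is useful when a specific patch clashes with a target site. A short sketch; the evasion names follow the plugin's documentation:

const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');

const stealth = StealthPlugin();
// Each evasion module targets one detection vector; the set can be
// inspected and pruned if a particular patch causes problems.
console.log(stealth.availableEvasions);                 // all bundled modules
stealth.enabledEvasions.delete('user-agent-override');  // example: drop one
puppeteer.use(stealth);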
Here's what the stealth plugin covers and what it doesn't:
| Detection Signal | Stealth Patch | Notes |
|---|---|---|
| `navigator.webdriver` | Yes — set to `undefined` | Most critical patch |
| `navigator.plugins` | Yes — populated with realistic entries | `mimeTypes` also aligned |
| `window.chrome` object | Yes — added with expected properties | Missing in headless by default |
| WebGL vendor/renderer | Partial — spoofed to common values | May need custom override for niche sites |
| Canvas fingerprint | No — not randomized by default | Requires custom evaluator (see below) |
| CDP artifacts / `cdc_` vars | Yes — removed from iframe `contentWindow` | Also strips `__nightmare` |
| Permissions API | Yes — overrides `navigator.permissions.query` | Prevents headless detection via permissions |
| Iframe `contentWindow` consistency | Yes — patches cross-origin discrepancies | Prevents iframe-based detection |
| User-Agent consistency | Partial — depends on your UA string | You must set a realistic UA yourself |
The stealth plugin handles the structural signals well, but it doesn't touch fingerprint entropy. Two stealth-enabled browsers on the same machine will produce identical canvas and WebGL fingerprints — a dead giveaway if a site correlates sessions. That's why you need custom evaluators and proxy rotation to build a truly robust stack.
Setting Up Puppeteer-Extra Stealth with Proxies
Let's build the foundation: a stealth-enabled browser that routes traffic through ProxyHat residential proxies with geo-targeting.
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');

puppeteer.use(StealthPlugin());

async function createStealthBrowser(proxyCountry = 'US') {
  const browser = await puppeteer.launch({
    headless: 'new',
    args: [
      // Host and port only: Chromium ignores credentials embedded
      // in --proxy-server, so auth happens per page below.
      '--proxy-server=http://gate.proxyhat.com:8080',
      '--disable-blink-features=AutomationControlled',
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage',
      '--disable-gpu',
    ],
  });

  // The proxy credentials carry the geo-target in the username.
  const proxyCredentials = {
    username: `user-country-${proxyCountry}`,
    password: 'YOUR_PASSWORD',
  };

  return { browser, proxyCredentials };
}

(async () => {
  const { browser, proxyCredentials } = await createStealthBrowser('DE');
  const page = await browser.newPage();

  // Supply proxy auth via CDP; credentials can't ride in the launch flag.
  await page.authenticate(proxyCredentials);

  await page.setUserAgent(
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
    '(KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36'
  );

  await page.goto('https://bot.sannysoft.com/');
  await page.screenshot({ path: 'stealth-check.png' });
  await browser.close();
})();

Key points in this setup:
- Proxy host goes in the `--proxy-server` launch arg, credentials in `page.authenticate()`. The launch arg ensures all connections — including subresource requests — go through the proxy; Chromium ignores credentials embedded in a proxy URL, so auth must happen at the page level.
- `--disable-blink-features=AutomationControlled` disables the `navigator.webdriver` flag at the Chromium level as a first line of defense, before the stealth plugin applies its runtime patches.
- Geo-targeting in the username — ProxyHat uses the `user-country-XX` format to route your traffic through a residential IP in the specified country. This is critical for sites that serve different content by region.
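Before pointing this at a real target, it's worth confirming the proxy actually took effect. A quick sanity check, here using httpbin.org/ip as an example echo service:

// Verify the exit IP the target will see; any IP-echo service works.
await page.goto('https://httpbin.org/ip');
const exitIp = await page.evaluate(() => document.body.innerText);
console.log('Exit IP:', exitIp); // should be a German residential IP for 'DE'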
Combining Stealth with Residential Proxies: The Anti-Detection Stack
Stealth patches browser signals. Residential proxies patch network signals. You need both because anti-bot systems check both layers:
- Network layer — Is the IP in a datacenter ASN? Does it match the claimed geo? Has it been flagged for bot activity? Residential proxies from ProxyHat's location pool solve this by providing IPs from real ISPs.
- Browser layer — Does the browser look automated? Are fingerprints consistent with a real user? The stealth plugin handles the structural signals; custom evaluators handle the entropy.
The combination is powerful because each layer covers the other's blind spots. A residential IP with a detectable browser still gets blocked. A stealth browser on a datacenter IP still gets flagged by IP reputation checks. Together, they present a consistent profile: a real residential user with a normal browser.
Sticky Sessions for Stateful Scraping
Some sites require login or multi-page flows. You need the same IP across multiple requests. ProxyHat supports sticky sessions via the username format:
// Sticky session: same IP for the session duration.
// The session ID in the username pins the exit IP across requests.
await page.authenticate({
  username: 'user-country-US-session-orderFlow42',
  password: 'YOUR_PASSWORD',
});

Without sticky sessions, each request may exit through a different residential IP — fine for SERP scraping, catastrophic for e-commerce checkout flows.
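That one authenticate() call pins every navigation in the tab to the same identity. A sketch of a stateful flow under that session; the URLs and selectors are placeholders:

// Login and checkout share one residential IP, so the session cookie
// never appears to hop between networks mid-flow.
await page.goto('https://shop.example.com/login');   // placeholder URL
await page.type('#email', 'user@example.com');       // placeholder selectors
await page.type('#password', 'SECRET');
await Promise.all([
  page.waitForNavigation(),
  page.click('button[type=submit]'),
]);
await page.goto('https://shop.example.com/cart');    // same exit IP as login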
Custom Evaluators: Canvas and WebGL Fingerprint Randomization
This is where most Puppeteer anti-detection guides stop — and where production crawlers fail. The stealth plugin doesn't randomize canvas or WebGL fingerprints. If you launch 50 browser instances on the same machine, they all produce the same hash. Sophisticated anti-bot systems detect this correlation.
The solution: inject per-session noise into canvas rendering and WebGL parameters before any page script runs.
Canvas Fingerprint Randomization
Canvas fingerprinting works by drawing hidden text and shapes, then reading the pixel data via toDataURL(). Tiny differences in rendering — caused by GPU drivers, font rasterizers, and OS-level anti-aliasing — produce a unique hash. We simulate those differences by injecting deterministic noise per session.
function generateCanvasNoise(seed) {
// Simple seeded PRNG for deterministic per-session noise
let s = seed;
const rand = () => {
s = (s * 16807) % 2147483647;
return (s - 1) / 2147483646;
};
// Generate a small offset table for RGBA channels
const offsets = [];
for (let i = 0; i < 16; i++) {
offsets.push(Math.floor(rand() * 3) - 1); // -1, 0, or +1
}
return offsets;
}
async function injectCanvasRandomization(page, sessionId) {
const seed = hashCode(sessionId); // Convert session ID to numeric seed
const offsets = generateCanvasNoise(seed);
await page.evaluateOnNewDocument((noise) => {
const origToDataURL = HTMLCanvasElement.prototype.toDataURL;
HTMLCanvasElement.prototype.toDataURL = function (...args) {
const ctx = this.getContext('2d');
if (ctx) {
const imgData = ctx.getImageData(0, 0, this.width, this.height);
for (let i = 0; i < imgData.data.length && i < noise.length * 4; i += 4) {
imgData.data[i] = Math.max(0, Math.min(255, imgData.data[i] + noise[i/4 % noise.length]));
imgData.data[i + 1] = Math.max(0, Math.min(255, imgData.data[i + 1] + noise[(i/4+1) % noise.length]));
imgData.data[i + 2] = Math.max(0, Math.min(255, imgData.data[i + 2] + noise[(i/4+2) % noise.length]));
}
ctx.putImageData(imgData, 0, 0);
}
return origToDataURL.apply(this, args);
};
}, offsets);
}
function hashCode(str) {
let hash = 0;
for (let i = 0; i < str.length; i++) {
hash = ((hash << 5) - hash) + str.charCodeAt(i);
hash |= 0;
}
return Math.abs(hash);
}

WebGL Fingerprint Randomization
WebGL fingerprinting reads the vendor and renderer strings from the GPU. We override these to match common consumer hardware profiles:
async function injectWebGLRandomization(page, profile) {
const profiles = {
nvidia: { vendor: 'Google Inc. (NVIDIA)', renderer: 'ANGLE (NVIDIA, NVIDIA GeForce GTX 1060, OpenGL 4.5)' },
amd: { vendor: 'Google Inc. (AMD)', renderer: 'ANGLE (AMD, AMD Radeon RX 580, OpenGL 4.5)' },
intel: { vendor: 'Google Inc. (Intel)', renderer: 'ANGLE (Intel, Intel(R) UHD Graphics 630, OpenGL 4.5)' },
};
const p = profiles[profile] || profiles.nvidia;
await page.evaluateOnNewDocument((webglProfile) => {
const getParameter = WebGLRenderingContext.prototype.getParameter;
WebGLRenderingContext.prototype.getParameter = function (param) {
if (param === 37445) return webglProfile.vendor; // UNMASKED_VENDOR_WEBGL
if (param === 37446) return webglProfile.renderer; // UNMASKED_RENDERER_WEBGL
return getParameter.call(this, param);
};
// Same for WebGL2
if (typeof WebGL2RenderingContext !== 'undefined') {
const getParam2 = WebGL2RenderingContext.prototype.getParameter;
WebGL2RenderingContext.prototype.getParameter = function (param) {
if (param === 37445) return webglProfile.vendor;
if (param === 37446) return webglProfile.renderer;
return getParam2.call(this, param);
};
}
}, p);
}

Assign a random profile per session to avoid correlation. Pair each profile with a matching User-Agent and viewport — an Intel GPU profile should come with a laptop-like viewport (1366×768 or 1920×1080), not a 4K ultrawide.
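One way to enforce that pairing is to define complete profiles up front and never mix attributes across them. A sketch; the specific UA strings and pairings are illustrative, not a vetted dataset:

// Each profile bundles GPU, viewport, and UA so no attribute
// contradicts another within a session.
const FINGERPRINT_PROFILES = [
  {
    gpuProfile: 'intel',                     // integrated GPU →
    viewport: { width: 1366, height: 768 },  // laptop-class screen
    userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36',
  },
  {
    gpuProfile: 'nvidia',                    // discrete GPU →
    viewport: { width: 1920, height: 1080 }, // desktop-class screen
    userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36',
  },
];

// Pick one random, internally consistent profile per session.
const profile = FINGERPRINT_PROFILES[Math.floor(Math.random() * FINGERPRINT_PROFILES.length)];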
Per-Browser-Context Proxy Rotation
Puppeteer's browser.newPage() creates pages that share the browser's proxy settings. But for true per-session isolation — different IPs, different fingerprints, different cookies — you need per-context proxy assignment.
Chromium can do per-context proxies through CDP's Target.createBrowserContext, but the cleanest approaches are to launch a separate browser instance per proxy when you need full isolation, or to use browser.createIncognitoBrowserContext() with page-level proxy auth for lighter-weight rotation.
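For the lighter-weight path, a context can carry its own proxy. A sketch, assuming a Puppeteer release that supports the proxyServer option on createIncognitoBrowserContext:

// Each incognito context gets its own exit IP while sharing one
// Chromium process; cookies and cache die with the context.
const context = await browser.createIncognitoBrowserContext({
  proxyServer: 'http://gate.proxyhat.com:8080',
});
const page = await context.newPage();
await page.authenticate({
  username: 'user-country-GB-session-ctx1',
  password: 'YOUR_PASSWORD',
});
await page.goto('https://example.com');
await context.close();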
For production crawlers, the recommended pattern is a browser pool where each worker gets its own browser with a dedicated proxy:
class StealthBrowserPool {
constructor({ size, proxyConfig, fingerprintProfiles }) {
this.size = size;
this.proxyConfig = proxyConfig;
this.profiles = fingerprintProfiles;
this.pool = [];
this.available = [];
}
async init() {
for (let i = 0; i < this.size; i++) {
const worker = await this._createWorker(i);
this.pool.push(worker);
this.available.push(i);
}
}
async _createWorker(index) {
  const country = this.proxyConfig.countries[index % this.proxyConfig.countries.length];
  const profile = this.profiles[index % this.profiles.length];
  const sessionId = `worker-${index}-${Date.now()}`;
  const browser = await puppeteer.launch({
    headless: 'new',
    args: [
      // Host and port only; Chromium ignores credentials embedded in
      // --proxy-server, so auth goes through page.authenticate() below
      '--proxy-server=http://gate.proxyhat.com:8080',
      '--disable-blink-features=AutomationControlled',
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage',
    ],
  });
  const page = await browser.newPage();
  // Geo-target and sticky session ride in the proxy username
  await page.authenticate({
    username: `user-country-${country}-session-${sessionId}`,
    password: this.proxyConfig.password,
  });
  await page.setUserAgent(profile.userAgent);
  await page.setViewport(profile.viewport);
  await injectCanvasRandomization(page, sessionId);
  await injectWebGLRandomization(page, profile.gpuProfile);
  return { browser, page, sessionId, country, requestCount: 0 };
}
async acquire() {
  if (this.available.length === 0) {
    throw new Error('Pool exhausted — increase size or implement queuing');
  }
  const index = this.available.shift();
  this.pool[index].requestCount++; // tracked by the health check (see below)
  return { index, ...this.pool[index] };
}
async release(index) {
// Reset the page state for reuse
const worker = this.pool[index];
try {
const client = await worker.page.target().createCDPSession();
await client.send('Network.clearBrowserCache');
await client.send('Network.clearBrowserCookies');
} catch (e) {
// Context may have been closed; recreate
this.pool[index] = await this._createWorker(index);
}
this.available.push(index);
}
async close() {
await Promise.all(this.pool.map(w => w.browser.close()));
}
}
// Usage
const pool = new StealthBrowserPool({
size: 10,
proxyConfig: {
countries: ['US', 'DE', 'GB', 'FR'],
password: 'YOUR_PASSWORD',
},
fingerprintProfiles: [
{ userAgent: '...', viewport: { width: 1920, height: 1080 }, gpuProfile: 'nvidia' },
{ userAgent: '...', viewport: { width: 1366, height: 768 }, gpuProfile: 'intel' },
{ userAgent: '...', viewport: { width: 1536, height: 864 }, gpuProfile: 'amd' },
],
});
await pool.init();
const worker = await pool.acquire();
try {
await worker.page.goto('https://example.com');
// ... scrape logic ...
} finally {
await pool.release(worker.index);
}

This pool pattern gives you full isolation: each worker has its own IP, its own fingerprint, and its own cookie jar. When a worker is released, cookies and cache are cleared so the next task starts clean.
Scaling: Containerized Fleets and Resource Management
Running 10 browsers on one machine is manageable. Running 500 requires infrastructure. Here's how to think about scaling a Puppeteer stealth fleet.
Resource Budgeting
Each Chromium instance consumes roughly:
- 150–300 MB RAM per page (more for heavy SPAs)
- 0.5–1.0 CPU cores under active load
- Network bandwidth depends on page weight — budget 2–5 MB per page load
On a 16-core, 64 GB machine with headless Chromium, you can realistically run 40–80 concurrent browsers; that's more than the raw per-core math suggests because each browser spends most of its wall-clock time idle between navigations. Beyond that, you need horizontal scaling.
Container Architecture
Use a worker container pattern where each container runs a small pool of browsers and exposes a job API:
# docker-compose.yml — scaled worker fleet
version: '3.8'

services:
  worker:
    build: ./worker
    deploy:
      replicas: 10
      resources:
        limits:
          cpus: '2.0'
          memory: 4G
        reservations:
          cpus: '1.0'
          memory: 2G
    environment:
      - PROXYHAT_USER=user-country-US
      - PROXYHAT_PASS=YOUR_PASSWORD
      - POOL_SIZE=8
      - REDIS_URL=redis://queue:6379
    depends_on:
      - queue

  queue:
    image: redis:7-alpine
    ports:
      - '6379:6379'

  orchestrator:
    build: ./orchestrator
    environment:
      - REDIS_URL=redis://queue:6379
      - WORKER_COUNT=10
    depends_on:
      - queue

The orchestrator pushes URLs to a Redis queue. Each worker pulls jobs, acquires a browser from its local pool, executes the crawl, and pushes results to an output queue. This architecture is stateless at the worker level — if a container crashes, the orchestrator simply re-queues the job.
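The worker side of that loop can stay very small. A sketch using the StealthBrowserPool from earlier; the queue names (jobs, results) and payload shape are assumptions, not a fixed contract:

const { createClient } = require('redis');

// Minimal worker loop: block on the job queue, crawl with a pooled
// browser, push the outcome for the orchestrator to collect.
async function runWorker(pool) {
  const redis = createClient({ url: process.env.REDIS_URL });
  await redis.connect();

  for (;;) {
    const job = await redis.blPop('jobs', 0); // blocks until a job arrives
    const url = job.element;
    const worker = await pool.acquire();
    try {
      await worker.page.goto(url, { waitUntil: 'networkidle2', timeout: 30000 });
      const title = await worker.page.evaluate(() => document.title);
      await redis.rPush('results', JSON.stringify({ url, title }));
    } catch (e) {
      // Report the failure so the orchestrator can re-queue the URL
      await redis.rPush('results', JSON.stringify({ url, error: e.message }));
    } finally {
      await pool.release(worker.index);
    }
  }
}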
Browser Lifecycle Management
Browsers are not long-lived. Memory leaks in Chromium accumulate, and anti-bot systems may start flagging an IP after too many requests. Implement a rotation policy:
- Max requests per browser: 50–100 before restarting
- Max lifetime: 10–15 minutes before restarting
- On detection: immediately rotate the proxy (new session ID) and restart the browser
Build a health check into your pool that monitors memory usage and restarts workers proactively:
async function healthCheck(pool) {
  for (const worker of pool.pool) {
    try {
      // page.metrics() reports the page's own JS heap; checking the Node
      // process heap here would measure the wrong thing.
      const metrics = await worker.page.metrics();
      if (metrics.JSHeapUsedSize > 500 * 1024 * 1024 || worker.requestCount > 80) {
        console.log(`Recycling worker ${worker.sessionId}`);
        await worker.browser.close();
        // Recreate with fresh proxy session
        const idx = pool.pool.indexOf(worker);
        pool.pool[idx] = await pool._createWorker(idx);
      }
    } catch (e) {
      // Browser already dead — recreate
      const idx = pool.pool.indexOf(worker);
      pool.pool[idx] = await pool._createWorker(idx);
    }
  }
}

setInterval(() => healthCheck(pool), 60_000);

Concurrency vs. Politeness
More browsers ≠ more data. Aggressive concurrency triggers rate limits and CAPTCHAs. A practical rule of thumb: 1 request per second per domain per IP. If you have 50 residential IPs targeting one domain, you can sustain ~50 requests/second. Push beyond that and you'll hit behavioral detection regardless of your stealth setup.
For SERP tracking at scale, stagger your requests across the proxy pool and add jitter:
async function staggeredCrawl(urls, pool) {
const results = [];
const concurrency = pool.size;
for (let i = 0; i < urls.length; i += concurrency) {
const batch = urls.slice(i, i + concurrency);
const promises = batch.map((url) => (async () => {
// Add random jitter: 0–2000ms
await new Promise(r => setTimeout(r, Math.random() * 2000));
const worker = await pool.acquire();
try {
await worker.page.goto(url, { waitUntil: 'networkidle2', timeout: 30000 });
return await worker.page.evaluate(() => document.title);
} finally {
await pool.release(worker.index);
}
})());
const batchResults = await Promise.allSettled(promises);
results.push(...batchResults);
// Cooldown between batches
await new Promise(r => setTimeout(r, 1000));
}
return results;
}

When Stealth Isn't Enough: CAPTCHAs and Behavioral Checks
Even with the full stack — stealth plugin, residential proxies, fingerprint randomization — some sites will still challenge you. This typically happens when:
- The site uses advanced behavioral analysis (mouse movement patterns, scroll depth, typing cadence)
- You're hitting the same endpoint at superhuman speed
- Your proxy IP has been burned by other users
Mitigation strategies:
- Use residential proxies with city-level targeting — `user-country-US-city-newyork` — for local-service sites that validate IP geolocation against the claimed location
- Add realistic interaction delays — don't just `goto()` and scrape; scroll, hover, wait for images to load (see the sketch after this list)
- Rotate proxy sessions proactively — don't wait for a block; change your session ID every 50 requests
- Use mobile proxies for mobile-optimized sites — mobile user agents on residential mobile IPs are less scrutinized than desktop datacenter traffic. ProxyHat's mobile proxies are available via `socks5://USERNAME:PASSWORD@gate.proxyhat.com:1080`
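For the interaction-delay point above, a minimal humanization helper might look like this; the scroll distances and timings are arbitrary assumptions, not values calibrated to any particular detector:

// Rough humanization pass: uneven scrolling, pauses, and a hover
// before extraction. Tune the ranges per target.
async function actHuman(page) {
  const steps = 3 + Math.floor(Math.random() * 4); // 3–6 scroll steps
  for (let i = 0; i < steps; i++) {
    await page.mouse.wheel({ deltaY: 200 + Math.random() * 400 });
    await new Promise(r => setTimeout(r, 300 + Math.random() * 900));
  }
  const link = await page.$('a'); // hover a link if the page has one
  if (link) await link.hover().catch(() => {});
  await new Promise(r => setTimeout(r, 500 + Math.random() * 1500));
}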
Ethical Boundaries: Stealth for Legitimate Scraping
Stealth technology is a tool, not a license to bypass every gate. There are clear lines between legitimate scraping and abuse:
Legitimate use: collecting publicly available data at reasonable rates, monitoring your own brand's SERP positions, aggregating pricing data from e-commerce sites for comparison tools, academic research on public web content.
Abuse: bypassing authentication to access private data, circumventing rate limits to DDoS a service, creating fake accounts at scale, committing ad fraud or credential stuffing.
Practical guidelines for ethical stealth scraping:
- Respect `robots.txt` — if a page is disallowed, don't scrape it
- Honor rate limits — if a site returns 429, back off instead of rotating IPs to hammer harder
- Comply with GDPR and CCPA — don't collect personal data without a legal basis
- Check terms of service — some sites explicitly prohibit scraping; violating ToS can have legal consequences
- Be transparent when possible — if you can identify your bot in the User-Agent without getting blocked, do so
Stealth is a defensive measure against overly aggressive bot detection that blocks legitimate automated access. It's not a tool to access things you shouldn't.
Key Takeaways
- Raw Puppeteer is trivially detectable — `navigator.webdriver`, empty plugins, and CDP artifacts expose automation immediately.
- puppeteer-extra-plugin-stealth patches structural signals but doesn't randomize fingerprints — you need custom evaluators for canvas and WebGL.
- Residential proxies + stealth is the strongest stack — network-layer and browser-layer detection are independent problems that require independent solutions.
- Per-session fingerprint isolation prevents correlation — each browser instance should have a unique canvas noise seed, WebGL profile, viewport, and User-Agent.
- Browser pools with dedicated proxies enable safe concurrency — don't share IPs or cookies across sessions.
- Scale with containers and queues — stateless workers pulling from Redis, with health checks that recycle browsers before they leak or get flagged.
- Stealth is for legitimate scraping — respect robots.txt, rate limits, and privacy regulations.
Ready to build your anti-detection stack? Check out ProxyHat's residential proxy plans for geo-targeted IPs that pair perfectly with puppeteer-extra stealth, or explore our web scraping use case for more implementation patterns.