Why Headless Browsers Need Proxies
Headless browsers — browser instances running without a visible GUI — are essential for scraping JavaScript-heavy websites. However, running multiple headless browser sessions from a single IP address is an obvious automation signal. Combining headless browsers with proxy rotation solves this by distributing requests across thousands of residential IPs.
This guide covers Puppeteer and Playwright proxy configuration, stealth plugins, and rotation strategies. For background on how detection works, see our guide to anti-bot detection systems.
Puppeteer Proxy Setup
Basic Configuration
// Puppeteer: Basic proxy configuration
const puppeteer = require('puppeteer');
const browser = await puppeteer.launch({
headless: 'new',
args: [
'--proxy-server=http://gate.proxyhat.com:8080',
'--no-sandbox',
'--disable-setuid-sandbox'
]
});
const page = await browser.newPage();
// Authenticate with the proxy
await page.authenticate({
username: 'USERNAME',
password: 'PASSWORD'
});
await page.goto('https://example.com', {
waitUntil: 'networkidle2',
timeout: 30000
});
const content = await page.content();
console.log(content.substring(0, 200));
await browser.close();
SOCKS5 with Puppeteer
// Puppeteer: SOCKS5 proxy configuration
const browser = await puppeteer.launch({
headless: 'new',
args: [
'--proxy-server=socks5://gate.proxyhat.com:1080'
]
});
const page = await browser.newPage();
await page.authenticate({
username: 'USERNAME',
password: 'PASSWORD'
});
Puppeteer Stealth Plugin
The default Puppeteer configuration exposes dozens of automation markers that anti-bot systems detect. The puppeteer-extra-plugin-stealth plugin patches these markers automatically.
// Install: npm install puppeteer-extra puppeteer-extra-plugin-stealth
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
// Apply stealth plugin
puppeteer.use(StealthPlugin());
const browser = await puppeteer.launch({
headless: 'new',
args: [
'--proxy-server=http://gate.proxyhat.com:8080',
'--disable-blink-features=AutomationControlled',
'--window-size=1920,1080',
'--disable-dev-shm-usage'
]
});
const page = await browser.newPage();
await page.authenticate({
username: 'USERNAME',
password: 'PASSWORD'
});
// Set realistic viewport
await page.setViewport({ width: 1920, height: 1080 });
// Set extra headers for consistency
await page.setExtraHTTPHeaders({
'Accept-Language': 'en-US,en;q=0.9'
});
await page.goto('https://example.com');
The stealth plugin patches:
navigator.webdriver— set to undefined instead of truechrome.runtime— adds missing Chrome-specific objects- WebGL vendor/renderer — realistic GPU strings
- Plugin and permission arrays — match real Chrome
- Language and platform consistency
For deeper understanding of what these patches address, see our article on browser fingerprinting.
Playwright Proxy Setup
Playwright provides more elegant proxy configuration than Puppeteer, supporting per-context proxies and built-in device emulation.
Basic Configuration
// Playwright: Basic proxy setup
const { chromium } = require('playwright');
const browser = await chromium.launch({
proxy: {
server: 'http://gate.proxyhat.com:8080',
username: 'USERNAME',
password: 'PASSWORD'
}
});
const context = await browser.newContext();
const page = await context.newPage();
await page.goto('https://example.com');
const title = await page.title();
console.log(`Page title: ${title}`);
await browser.close();
Per-Context Proxy (Different IPs per Tab)
// Playwright: Different proxy per context
const { chromium } = require('playwright');
const browser = await chromium.launch();
// Context 1: US proxy session
const ctx1 = await browser.newContext({
proxy: {
server: 'http://gate.proxyhat.com:8080',
username: 'USERNAME-country-us-session-abc1',
password: 'PASSWORD'
},
locale: 'en-US',
timezoneId: 'America/New_York'
});
// Context 2: UK proxy session
const ctx2 = await browser.newContext({
proxy: {
server: 'http://gate.proxyhat.com:8080',
username: 'USERNAME-country-gb-session-abc2',
password: 'PASSWORD'
},
locale: 'en-GB',
timezoneId: 'Europe/London'
});
const page1 = await ctx1.newPage();
const page2 = await ctx2.newPage();
// Each page uses a different IP and locale
await page1.goto('https://example.com');
await page2.goto('https://example.com');
Device Emulation with Proxy
// Playwright: Realistic device emulation + proxy
const { chromium, devices } = require('playwright');
const browser = await chromium.launch({
proxy: {
server: 'http://gate.proxyhat.com:8080',
username: 'USERNAME',
password: 'PASSWORD'
}
});
// Emulate a specific device with matching settings
const context = await browser.newContext({
...devices['Desktop Chrome'],
locale: 'en-US',
timezoneId: 'America/Chicago',
geolocation: { latitude: 41.8781, longitude: -87.6298 },
permissions: ['geolocation'],
colorScheme: 'light'
});
const page = await context.newPage();
await page.goto('https://example.com');
Proxy Rotation with Headless Browsers
Strategy 1: New Context per Request
Create a new browser context for each URL. This gives a fresh IP and clean cookies per request.
// Playwright: Rotate proxy per request via new contexts
async function scrapeWithRotation(urls) {
const browser = await chromium.launch();
const results = [];
for (const url of urls) {
const sessionId = `sess-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
const context = await browser.newContext({
proxy: {
server: 'http://gate.proxyhat.com:8080',
username: `USERNAME-session-${sessionId}`,
password: 'PASSWORD'
}
});
const page = await context.newPage();
try {
await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 30000 });
const data = await page.evaluate(() => document.title);
results.push({ url, data });
} catch (error) {
console.error(`Failed: ${url} — ${error.message}`);
} finally {
await context.close();
}
// Natural delay between requests
await new Promise(r => setTimeout(r, 1000 + Math.random() * 2000));
}
await browser.close();
return results;
}
Strategy 2: Sticky Sessions for Multi-Page Flows
Use sticky sessions when you need to maintain the same IP across multiple pages (pagination, login flows).
// Playwright: Sticky session for multi-page scraping
async function scrapeWithStickySession(baseUrl, pageCount) {
const sessionId = `sticky-${Date.now()}`;
const browser = await chromium.launch({
proxy: {
server: 'http://gate.proxyhat.com:8080',
username: `USERNAME-session-${sessionId}`,
password: 'PASSWORD'
}
});
const context = await browser.newContext();
const page = await context.newPage();
const results = [];
for (let i = 1; i <= pageCount; i++) {
await page.goto(`${baseUrl}?page=${i}`, { waitUntil: 'networkidle' });
const items = await page.$$eval('.item', els =>
els.map(el => el.textContent.trim())
);
results.push(...items);
// Natural delay between pages
await new Promise(r => setTimeout(r, 1500 + Math.random() * 1500));
}
await browser.close();
return results;
}
Strategy 3: Concurrent Scraping with Pool
// Playwright: Concurrent scraping with proxy pool
async function concurrentScrape(urls, concurrency = 5) {
const browser = await chromium.launch();
const results = [];
// Process URLs in batches
for (let i = 0; i < urls.length; i += concurrency) {
const batch = urls.slice(i, i + concurrency);
const promises = batch.map(async (url) => {
const sessionId = `conc-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
const context = await browser.newContext({
proxy: {
server: 'http://gate.proxyhat.com:8080',
username: `USERNAME-session-${sessionId}`,
password: 'PASSWORD'
}
});
const page = await context.newPage();
try {
await page.goto(url, { timeout: 30000, waitUntil: 'domcontentloaded' });
return { url, title: await page.title(), status: 'ok' };
} catch (e) {
return { url, error: e.message, status: 'error' };
} finally {
await context.close();
}
});
const batchResults = await Promise.all(promises);
results.push(...batchResults);
// Delay between batches
await new Promise(r => setTimeout(r, 2000));
}
await browser.close();
return results;
}
Puppeteer with Python (Pyppeteer Alternative)
For Python developers, playwright for Python provides the same capabilities with cleaner syntax.
# Python: Playwright with proxy and stealth settings
# pip install playwright
# playwright install chromium
from playwright.async_api import async_playwright
import asyncio
async def scrape_with_proxy():
async with async_playwright() as p:
browser = await p.chromium.launch(
proxy={
"server": "http://gate.proxyhat.com:8080",
"username": "USERNAME",
"password": "PASSWORD"
}
)
context = await browser.new_context(
viewport={"width": 1920, "height": 1080},
locale="en-US",
timezone_id="America/New_York",
user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
)
page = await context.new_page()
await page.goto("https://example.com")
title = await page.title()
print(f"Title: {title}")
await browser.close()
asyncio.run(scrape_with_proxy())
Resource Optimization
Headless browsers consume significant memory and CPU. Optimize for production workloads:
Block Unnecessary Resources
// Playwright: Block images, fonts, and CSS to save bandwidth
const context = await browser.newContext({
proxy: {
server: 'http://gate.proxyhat.com:8080',
username: 'USERNAME',
password: 'PASSWORD'
}
});
const page = await context.newPage();
// Block non-essential resources
await page.route('**/*.{png,jpg,jpeg,gif,svg,webp,woff,woff2,ttf,css}', route =>
route.abort()
);
// Block tracking and analytics
await page.route('**/{google-analytics,gtag,facebook}**', route =>
route.abort()
);
await page.goto('https://example.com');
Memory Management
- Close contexts after use: Each open context consumes 50-150MB. Always close contexts when done.
- Limit concurrent contexts: Keep 3-10 contexts per browser instance based on available RAM.
- Restart browser periodically: After 100-200 context cycles, restart the browser to prevent memory leaks.
- Use headless: 'new' (Puppeteer): The new headless mode uses less memory than the old one.
Handling Common Issues
Proxy Authentication Failures
// Handle proxy auth errors gracefully
try {
const response = await page.goto(url, { timeout: 30000 });
if (response.status() === 407) {
console.error('Proxy authentication failed — check credentials');
}
} catch (error) {
if (error.message.includes('net::ERR_PROXY_CONNECTION_FAILED')) {
console.error('Proxy connection failed — check proxy server availability');
}
}
Timeout Handling
// Retry with exponential backoff
async function gotoWithRetry(page, url, maxRetries = 3) {
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
return await page.goto(url, {
waitUntil: 'domcontentloaded',
timeout: 30000
});
} catch (error) {
if (attempt === maxRetries) throw error;
const delay = 1000 * Math.pow(2, attempt) + Math.random() * 1000;
console.log(`Retry ${attempt}/${maxRetries} after ${delay}ms`);
await new Promise(r => setTimeout(r, delay));
}
}
}
Best Practices Checklist
| Practice | Puppeteer | Playwright |
|---|---|---|
| Per-context proxy | Via launch args only | Per-context proxy support |
| Stealth patches | puppeteer-extra-plugin-stealth | Built-in device emulation |
| Resource blocking | page.setRequestInterception | page.route |
| Multi-browser | Chromium only | Chromium, Firefox, WebKit |
| Session isolation | New browser per session | New context per session |
For most scraping tasks, Playwright is the recommended choice over Puppeteer due to its superior proxy support (per-context), built-in device emulation, and multi-browser support. Combine it with ProxyHat's residential proxies for the best results.
For language-specific proxy setup without headless browsers, see our guides for Python, Node.js, and Go. For comprehensive anti-detection strategies, read our detection reduction guide. Always follow ethical scraping practices and respect website access policies.






