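How do I configure a proxy in Playwright?

Pass the proxy option to launch() on a browser type (for example p.chromium.launch()) or to browser.new_context(). The option takes a server URL plus an optional username and password. Because new_context() accepts its own proxy option, each context can use a different proxy.

How do I rotate proxies in Playwright?

Assign a different proxy to each browser context. Playwright sets proxies at the browser or context level, not per page, so pages that need different IPs belong in different contexts. With the ProxyHat gateway, rotation happens automatically. Alternatively, change the session ID to switch to a different IP.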

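Playwright Rotating Proxy Guide

Playwright and Proxies

Playwright is a browser automation framework that drives Chromium, Firefox, and WebKit. It is well suited to scraping sites that only render their content through JavaScript, and pairing it with proxies lets each browser or context send its traffic through a different IP address.

Basic Proxy Setup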

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # Route all traffic from this browser through the gateway
    browser = p.chromium.launch(
        proxy={
            "server": "http://gate.proxyhat.com:8080",
            "username": "your_username",
            "password": "your_password",
        }
    )
    page = browser.new_page()
    page.goto("https://example.com")
    content = page.content()  # serialized HTML of the loaded page
    browser.close()
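
Hard-coding credentials as in the snippet above is fine for a first test, but they are easy to leak. The following is a minimal sketch that builds the same proxy dictionary from environment variables. The variable names PROXY_SERVER, PROXY_USERNAME, and PROXY_PASSWORD and the helper proxy_from_env are assumptions made for this example, not something Playwright or ProxyHat defines.

```python
import os


def proxy_from_env(env=None):
    """Build a Playwright-style proxy dict from environment variables.

    The variable names are hypothetical and chosen only for this sketch.
    """
    env = os.environ if env is None else env
    server = env.get("PROXY_SERVER", "http://gate.proxyhat.com:8080")
    username = env.get("PROXY_USERNAME")
    password = env.get("PROXY_PASSWORD")
    if not username or not password:
        # Fail early instead of launching a browser that cannot authenticate
        raise RuntimeError("Set PROXY_USERNAME and PROXY_PASSWORD first")
    return {"server": server, "username": username, "password": password}
```

The returned dictionary can be passed directly as the proxy argument, for example p.chromium.launch(proxy=proxy_from_env()).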

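Per-Context Proxies

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()

    # Assign a different proxy to each context
    context1 = browser.new_context(proxy={
        "server": "http://gate.proxyhat.com:8080",
        "username": "user_session1",
        "password": "pass",
    })

    context2 = browser.new_context(proxy={
        "server": "http://gate.proxyhat.com:8080",
        "username": "user_session2",
        "password": "pass",
    })

    # Pages use the proxy of the context that created them
    page1 = context1.new_page()
    page2 = context2.new_page()

    browser.close()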
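
The two contexts above differ only in the session part of the username. A small helper can generate such configurations on demand, so every new context gets a fresh session. This is a sketch only: the username pattern simply mirrors the placeholder values user_session1 and user_session2 used above, and the real session syntax should be checked against the ProxyHat documentation. The names GATEWAY and session_proxy are hypothetical.

```python
import uuid

GATEWAY = "http://gate.proxyhat.com:8080"


def session_proxy(base_user, password, session_id=None):
    """Return a proxy dict whose username carries a session ID.

    Assumption: the "<user>_session<id>" format copies the placeholder
    usernames in this guide; it is not a confirmed provider format.
    """
    if session_id is None:
        # A random ID gives each call its own session
        session_id = uuid.uuid4().hex[:8]
    return {
        "server": GATEWAY,
        "username": f"{base_user}_session{session_id}",
        "password": password,
    }
```

A context created with browser.new_context(proxy=session_proxy("user", "pass")) would then use a new session each time the helper is called.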

Concurrent Scraping

import asyncio
from playwright.async_api import async_playwright

async def scrape_page(browser, url, proxy_config):
    # One isolated context (cookies, cache, proxy) per URL
    context = await browser.new_context(proxy=proxy_config)
    try:
        page = await context.new_page()
        await page.goto(url)
        return await page.content()
    finally:
        # Close the context even if navigation fails
        await context.close()

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        urls = [f"https://example.com/product/{i}" for i in range(1, 11)]

        tasks = [scrape_page(browser, url, {
            "server": "http://gate.proxyhat.com:8080",
            "username": "user", "password": "pass"
        }) for url in urls]

        results = await asyncio.gather(*tasks)
        await browser.close()

asyncio.run(main())
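
The example above starts every page at once. With longer URL lists it usually helps to cap how many contexts are open at the same time. The sketch below shows one way to do that with asyncio.Semaphore; the helper name run_limited and the default limit of 3 are assumptions for illustration, not recommended values.

```python
import asyncio


async def run_limited(items, worker, limit=3):
    """Run worker(item) for every item, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item):
        # Wait for a free slot before starting the worker
        async with semaphore:
            return await worker(item)

    # return_exceptions=True returns failures as values in the result list
    # instead of raising the first one, so results line up with the inputs
    return await asyncio.gather(
        *(guarded(item) for item in items), return_exceptions=True
    )
```

Inside main(), the worker could be a small wrapper around scrape_page, for example lambda url: scrape_page(browser, url, proxy_config). Entries in the result list that are exceptions can then be retried or logged.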

Stealth Settings

Set a realistic viewport size
Set the locale and timezone
Hide the navigator.webdriver property
Use a real browser User-Agent
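
The four settings above can be sketched as context options plus one init script. The concrete values (viewport size, locale, timezone, User-Agent string) are example placeholders and should be chosen to match the proxy location and the browser actually in use. The helper names are hypothetical. Masking navigator.webdriver covers only one automation signal and does not guarantee that a site will not detect automation.

```python
# Runs in every page before the site's own scripts
STEALTH_INIT_SCRIPT = (
    "Object.defineProperty(navigator, 'webdriver', {get: () => undefined});"
)


def stealth_context_options(proxy=None):
    """Return keyword arguments for browser.new_context() (example values)."""
    options = {
        "viewport": {"width": 1366, "height": 768},
        "locale": "en-US",
        "timezone_id": "America/New_York",
        # Placeholder User-Agent; replace with one matching the real browser
        "user_agent": (
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
        ),
    }
    if proxy is not None:
        options["proxy"] = proxy
    return options


def new_stealth_context(browser, proxy=None):
    """Create a context with the options above and hide navigator.webdriver."""
    context = browser.new_context(**stealth_context_options(proxy))
    context.add_init_script(STEALTH_INIT_SCRIPT)
    return context
```

With the sync API, new_stealth_context(browser, proxy={...}) returns a context whose pages all carry these settings. With the async API, the same two calls would need to be awaited.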

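Key Takeaways

Playwright is well suited to scraping sites that require JavaScript rendering.

Per-context proxies let you assign a different IP to each session.

The async API makes concurrent scraping possible.

Combine the ProxyHat gateway (gate.proxyhat.com:8080) with Playwright.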
