Complete Guide to HTTP Proxies in PHP: cURL, Guzzle, Symfony, and Laravel

Learn how to configure HTTP proxies in PHP using native cURL, Guzzle, Symfony HTTP Client, and Laravel. Includes code examples for IP rotation, concurrent requests, and TLS/SSL handling.

If you build scrapers, integrate third-party APIs, or automate processes in PHP, sooner or later you will run into rate limits, IP blocks, or the need for geo-targeting. HTTP proxies address these problems by letting you route requests through different IPs. This guide shows how to use HTTP proxies in PHP with native cURL, Guzzle, Symfony HTTP Client, and Laravel, with code examples you can adapt for production.

Why Use HTTP Proxies in PHP

When you send repeated HTTP requests to the same endpoint, the target server may block your IP for exceeding its request limits. Routing the traffic through intermediate IPs avoids this. For PHP developers, it matters most for:

  • Web scraping: avoid IP blocks while extracting data
  • API integrations: stay within rate limits by spreading requests across several IPs
  • Geographic testing: check localized content from different countries
  • Automation: run jobs in parallel without IP conflicts

The right proxy type (residential, datacenter, or mobile) depends on the use case. Residential proxies are the most reliable option for scraping because they use the IPs of real devices. Datacenter proxies are faster and cheaper but easier to detect. Mobile proxies suit platforms with advanced anti-bot protection.

Native cURL: Basic Proxy Configuration

cURL is the most direct way to send HTTP requests through a proxy in PHP. The key options are CURLOPT_PROXY for the proxy hostname, CURLOPT_PROXYPORT for the port, and CURLOPT_PROXYUSERPWD for the authentication credentials.

<?php

class CurlProxyClient
{
    private string $proxyHost = 'gate.proxyhat.com';
    private int $proxyPort = 8080;
    private string $username;
    private string $password;

    public function __construct(string $username, string $password)
    {
        $this->username = $username;
        $this->password = $password;
    }

    public function get(string $url, array $options = []): array
    {
        $ch = curl_init();

        // Basic configuration
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
        curl_setopt($ch, CURLOPT_TIMEOUT, $options['timeout'] ?? 30);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $options['connect_timeout'] ?? 10);

        // HTTP proxy configuration
        curl_setopt($ch, CURLOPT_PROXY, $this->proxyHost);
        curl_setopt($ch, CURLOPT_PROXYPORT, $this->proxyPort);
        curl_setopt($ch, CURLOPT_PROXYTYPE, CURLPROXY_HTTP);

        // Proxy authentication
        $proxyAuth = "{$this->username}:{$this->password}";
        curl_setopt($ch, CURLOPT_PROXYUSERPWD, $proxyAuth);

        // TLS/SSL: verify the certificate and the host name
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
        
        // Custom CA bundle (optional)
        if (isset($options['ca_bundle'])) {
            curl_setopt($ch, CURLOPT_CAINFO, $options['ca_bundle']);
        }

        // Custom headers
        if (!empty($options['headers'])) {
            curl_setopt($ch, CURLOPT_HTTPHEADER, $options['headers']);
        }

        // Execute the request
        $response = curl_exec($ch);
        $httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $error = curl_error($ch);
        $errno = curl_errno($ch);
        
        curl_close($ch);

        return [
            'status' => $httpCode,
            'body' => $response,
            'error' => $error,
            'errno' => $errno,
            'success' => $errno === 0 && $httpCode >= 200 && $httpCode < 300
        ];
    }

    public function post(string $url, array $data, array $options = []): array
    {
        $options['headers'] = $options['headers'] ?? [];
        $options['headers'][] = 'Content-Type: application/json';
        $options['body'] = json_encode($data);
        
        return $this->request('POST', $url, $options);
    }

    private function request(string $method, string $url, array $options): array
    {
        $ch = curl_init();

        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_CUSTOMREQUEST, $method);

        if (!empty($options['body'])) {
            curl_setopt($ch, CURLOPT_POSTFIELDS, $options['body']);
        }

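        // Same timeouts and proxy settings as get()
        curl_setopt($ch, CURLOPT_TIMEOUT, $options['timeout'] ?? 30);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $options['connect_timeout'] ?? 10);
        curl_setopt($ch, CURLOPT_PROXY, $this->proxyHost);
        curl_setopt($ch, CURLOPT_PROXYPORT, $this->proxyPort);
        curl_setopt($ch, CURLOPT_PROXYUSERPWD, "{$this->username}:{$this->password}");

        if (!empty($options['headers'])) {
            curl_setopt($ch, CURLOPT_HTTPHEADER, $options['headers']);
        }

        $response = curl_exec($ch);
        $httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $error = curl_error($ch);
        $errno = curl_errno($ch);

        curl_close($ch);

        // Return the same shape as get() so callers can check 'success'
        return [
            'status' => $httpCode,
            'body' => $response,
            'error' => $error,
            'errno' => $errno,
            'success' => $errno === 0 && $httpCode >= 200 && $httpCode < 300
        ];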
    }
}

// Usage example with geo-targeting
$client = new CurlProxyClient('user-country-US-session-abc123', 'password');

$response = $client->get('https://httpbin.org/ip', [
    'timeout' => 20,
    'headers' => [
        'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
        'Accept: application/json'
    ]
]);

if ($response['success']) {
    echo "IP used: " . $response['body'];
} else {
    echo "Error: " . $response['error'];
}

This example shows a wrapper class that handles authentication, timeouts, custom headers, and geo-targeting. The username format user-country-US-session-abc123 encodes the country and a session ID, which keeps the same IP across multiple requests.
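
The username convention described above can be isolated in a small helper. This is a minimal sketch: the function name buildProxyUsername is hypothetical, and the "-country-" and "-session-" suffixes are taken from the examples in this guide, so check your provider's documentation before relying on the exact format.

```php
<?php

declare(strict_types=1);

/**
 * Builds a proxy username in the "<base>-country-<CC>-session-<id>" format
 * used in this guide. The format itself is provider-specific (assumption).
 */
function buildProxyUsername(string $base, ?string $country = null, ?string $sessionId = null): string
{
    $username = $base;

    if ($country !== null && $country !== '') {
        // Upper-case the country code to match the "US" / "DE" style used here
        $username .= '-country-' . strtoupper($country);
    }

    if ($sessionId !== null && $sessionId !== '') {
        // Reusing the same session ID is what keeps the exit IP sticky
        $username .= '-session-' . $sessionId;
    }

    return $username;
}

echo buildProxyUsername('user', 'us', 'abc123'), "\n"; // user-country-US-session-abc123
```

Keeping the format in one place means a change on the provider side only touches this function.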

Per-Request IP Rotation with cURL

To avoid blocks during intensive scraping, rotate IPs between requests. With residential proxies that support sticky sessions, you can control the rotation by changing the session ID in the username.

<?php

class RotatingProxyClient
{
    private string $proxyHost = 'gate.proxyhat.com';
    private int $proxyPort = 8080;
    private string $baseUsername;
    private string $password;
    private int $requestCount = 0;
    private ?string $sessionId = null;

    public function __construct(string $baseUsername, string $password)
    {
        $this->baseUsername = $baseUsername;
        $this->password = $password;
    }

    private function generateSessionId(): string
    {
        // Generate a unique session ID for each rotation
        return 'sess_' . bin2hex(random_bytes(8));
    }

    public function requestWithRotation(
        string $url,
        int $rotateEvery = 1,
        ?string $country = null
    ): array {
        // Start a new session (and therefore a new IP) every $rotateEvery requests
        if ($this->sessionId === null || $this->requestCount % max(1, $rotateEvery) === 0) {
            $this->sessionId = $this->generateSessionId();
        }
        $this->requestCount++;
        $sessionId = $this->sessionId;

        // Build the username with geo-targeting and session
        $username = $this->baseUsername;
        
        if ($country !== null) {
            $username .= "-country-{$country}";
        }
        
        $username .= "-session-{$sessionId}";

        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_PROXY, $this->proxyHost);
        curl_setopt($ch, CURLOPT_PROXYPORT, $this->proxyPort);
        curl_setopt($ch, CURLOPT_PROXYUSERPWD, "{$username}:{$this->password}");
        curl_setopt($ch, CURLOPT_TIMEOUT, 30);
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);

        $response = curl_exec($ch);
        $info = curl_getinfo($ch);
        $error = curl_error($ch);
        curl_close($ch);

        return [
            'status' => $info['http_code'],
            'body' => $response,
            'total_time' => $info['total_time'],
            'connect_time' => $info['connect_time'],
            'session_id' => $sessionId,
            'error' => $error
        ];
    }

    public function scrapeMultiple(array $urls, ?string $country = null): array
    {
        $results = [];
        
        foreach ($urls as $index => $url) {
            $results[] = $this->requestWithRotation($url, 1, $country);
            
            // Pause between requests to avoid rate limiting
            if ($index < count($urls) - 1) {
                usleep(500000); // 500ms
            }
        }
        
        return $results;
    }
}

// Usage with automatic IP rotation
$client = new RotatingProxyClient('user', 'your_password');

$urls = [
    'https://httpbin.org/ip',
    'https://httpbin.org/headers',
    'https://httpbin.org/user-agent'
];

$results = $client->scrapeMultiple($urls, 'DE'); // Geo-targeting: Germany

foreach ($results as $result) {
    echo "Session: {$result['session_id']}\n";
    echo "Status: {$result['status']}\n";
    echo "Time: {$result['total_time']}s\n\n";
}
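
Rotating every N requests rather than on every request can be separated from the HTTP code. The sketch below assumes a hypothetical SessionRotator class that is not part of the listing above; it only hands out session IDs and replaces the current one after a fixed number of uses.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns a session ID and replaces it
 * after every $rotateEvery calls to next().
 */
final class SessionRotator
{
    private ?string $sessionId = null;
    private int $used = 0;

    public function __construct(private int $rotateEvery = 1)
    {
        if ($rotateEvery < 1) {
            throw new InvalidArgumentException('rotateEvery must be at least 1');
        }
    }

    public function next(): string
    {
        // Create a new session on first use or once the current one is used up
        if ($this->sessionId === null || $this->used >= $this->rotateEvery) {
            $this->sessionId = 'sess_' . bin2hex(random_bytes(8));
            $this->used = 0;
        }

        $this->used++;

        return $this->sessionId;
    }
}

$rotator = new SessionRotator(3);
$ids = [];
for ($i = 0; $i < 6; $i++) {
    // Requests 1-3 share one session, requests 4-6 share the next one
    $ids[] = $rotator->next();
}
```

The returned ID can then be appended to the proxy username, so the rotation policy stays independent of the client that sends the request.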

Guzzle HTTP Client: Advanced Proxy Configuration

Guzzle is the most popular HTTP client in the PHP ecosystem. It offers an object-oriented interface, middleware for automatic retries, and built-in exception handling.

<?php

require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\RetryMiddleware;
use Psr\Http\Message\RequestInterface;
use Psr\Http\Message\ResponseInterface;

class GuzzleProxyClient
{
    private Client $client;
    private string $proxyHost = 'gate.proxyhat.com';
    private int $proxyPort = 8080;

    public function __construct(
        string $username,
        string $password,
        array $config = []
    ) {
        // Proxy configuration as a full URL
        $proxyUrl = "http://{$username}:{$password}@{$this->proxyHost}:{$this->proxyPort}";

        // Handler stack for custom middleware
        $handlerStack = HandlerStack::create();

        // Middleware for automatic retries with exponential backoff
        $handlerStack->push(Middleware::retry(
            function (int $retries, RequestInterface $request, ?ResponseInterface $response, ?\Throwable $exception) use ($config) {
                $maxRetries = $config['max_retries'] ?? 3;

                // Retry on 5xx errors or network exceptions
                if ($retries >= $maxRetries) {
                    return false;
                }
                
                if ($exception instanceof \GuzzleHttp\Exception\ConnectException) {
                    return true;
                }
                
                if ($response && $response->getStatusCode() >= 500) {
                    return true;
                }
                
                return false;
            },
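            function (int $retries) {
                // Exponential backoff in milliseconds: 1s, 2s, 4s...
                // Guzzle passes the retry number starting at 1
                return 1000 * (2 ** max(0, $retries - 1));
            }
        ));

        // Logging middleware; Guzzle expects its own MessageFormatter here,
        // not a Monolog formatter
        $handlerStack->push(Middleware::log(
            new \Monolog\Logger('proxy_client'),
            new \GuzzleHttp\MessageFormatter('{method} {uri} {code} {error}')
        ));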

        $this->client = new Client([
            'proxy' => [
                'http' => $proxyUrl,
                'https' => $proxyUrl,
            ],
            'timeout' => $config['timeout'] ?? 30,
            'connect_timeout' => $config['connect_timeout'] ?? 10,
            'handler' => $handlerStack,
            'verify' => $config['verify_ssl'] ?? true,
            'headers' => [
                'User-Agent' => $config['user_agent'] ?? 'ProxyHat-PHP-Client/1.0',
                'Accept' => 'application/json',
            ]
        ]);
    }

    public function get(string $url, array $options = []): array
    {
        try {
            $response = $this->client->get($url, $options);
            
            return [
                'success' => true,
                'status' => $response->getStatusCode(),
                'body' => $response->getBody()->getContents(),
                'headers' => $response->getHeaders()
            ];
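        } catch (\GuzzleHttp\Exception\TransferException $e) {
            // In Guzzle 7 a ConnectException is not a RequestException and carries no response
            $response = $e instanceof \GuzzleHttp\Exception\RequestException ? $e->getResponse() : null;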
            
            return [
                'success' => false,
                'status' => $response ? $response->getStatusCode() : 0,
                'error' => $e->getMessage(),
                'body' => $response ? $response->getBody()->getContents() : ''
            ];
        }
    }

    public function post(string $url, array $data, array $options = []): array
    {
        try {
            $options['json'] = $data;
            $response = $this->client->post($url, $options);
            
            return [
                'success' => true,
                'status' => $response->getStatusCode(),
                'body' => $response->getBody()->getContents()
            ];
        } catch (\GuzzleHttp\Exception\TransferException $e) {
            // TransferException also covers connection failures in Guzzle 7
            return [
                'success' => false,
                'error' => $e->getMessage()
            ];
        }
    }

    // Override the proxy for a single request (manual rotation)
    public function requestWithProxy(
        string $method,
        string $url,
        string $proxyUsername,
        string $proxyPassword,
        array $options = []
    ): array {
        $proxyUrl = "http://{$proxyUsername}:{$proxyPassword}@{$this->proxyHost}:{$this->proxyPort}";
        
        $options['proxy'] = [
            'http' => $proxyUrl,
            'https' => $proxyUrl
        ];

        try {
            $response = $this->client->request($method, $url, $options);
            
            return [
                'success' => true,
                'status' => $response->getStatusCode(),
                'body' => $response->getBody()->getContents()
            ];
        } catch (\Exception $e) {
            return [
                'success' => false,
                'error' => $e->getMessage()
            ];
        }
    }
}

// Example: client with geo-targeting for the US
$client = new GuzzleProxyClient(
    'user-country-US',
    'your_password',
    [
        'timeout' => 45,
        'max_retries' => 5,
        'user_agent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
    ]
);

// GET request through the proxy
$result = $client->get('https://api.ipify.org?format=json');

if ($result['success']) {
    $data = json_decode($result['body'], true);
    echo "Outbound IP: " . $data['ip'];
}

// Per-request IP rotation
for ($i = 0; $i < 5; $i++) {
    $sessionId = 'rot_' . bin2hex(random_bytes(4));
    $result = $client->requestWithProxy(
        'GET',
        'https://httpbin.org/ip',
        "user-country-DE-session-{$sessionId}",
        'your_password'
    );
    
    // Failed requests have no 'body' key, so fall back to the error message
    echo "Request {$i}: " . ($result['body'] ?? $result['error']) . "\n";
}
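
The clients above interpolate the username and password directly into the proxy URL, which breaks if a credential contains characters such as "@", ":" or "/". A minimal sketch of a safer builder follows; the function name buildProxyUrl is hypothetical, and it assumes the HTTP handler URL-decodes the credentials (libcurl does this for proxy URLs).

```php
<?php

declare(strict_types=1);

/**
 * Builds an authenticated HTTP proxy URL, percent-encoding the credentials
 * so reserved characters cannot be mistaken for URL delimiters.
 */
function buildProxyUrl(string $host, int $port, string $username, string $password): string
{
    return sprintf(
        'http://%s:%s@%s:%d',
        rawurlencode($username),
        rawurlencode($password),
        $host,
        $port
    );
}

echo buildProxyUrl('gate.proxyhat.com', 8080, 'user-country-US', 'p@ss:word/1'), "\n";
// http://user-country-US:p%40ss%3Aword%2F1@gate.proxyhat.com:8080
```

The resulting string can be passed to Guzzle's 'proxy' option in place of the hand-built URL.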

Proxy Pools with Guzzle

For applications that need to balance load across several proxy pools, you can implement a round-robin or weight-based selector.

<?php

require 'vendor/autoload.php';

use GuzzleHttp\Client;

class ProxyPool
{
    private array $proxies = [];
    private int $currentIndex = 0;

    public function __construct(array $proxyConfigs)
    {
        foreach ($proxyConfigs as $config) {
            $this->proxies[] = [
                'host' => $config['host'] ?? 'gate.proxyhat.com',
                'port' => $config['port'] ?? 8080,
                'username' => $config['username'],
                'password' => $config['password'],
                'weight' => $config['weight'] ?? 1,
                'failures' => 0,
                'last_used' => 0
            ];
        }
    }

    public function getNext(): array
    {
        // Weighted random selection that skips proxies with 3 or more failures
        $totalWeight = array_sum(array_column($this->proxies, 'weight'));
        $random = mt_rand(1, $totalWeight);
        
        $current = 0;
        foreach ($this->proxies as &$proxy) {
            $current += $proxy['weight'];
            if ($random <= $current && $proxy['failures'] < 3) {
                $proxy['last_used'] = time();
                return $proxy;
            }
        }
        
        // Fallback: return the first proxy, even if it has failed before
        return reset($this->proxies);
    }

    public function markFailure(string $username): void
    {
        foreach ($this->proxies as &$proxy) {
            if ($proxy['username'] === $username) {
                $proxy['failures']++;
                break;
            }
        }
    }

    public function markSuccess(string $username): void
    {
        foreach ($this->proxies as &$proxy) {
            if ($proxy['username'] === $username) {
                $proxy['failures'] = max(0, $proxy['failures'] - 1);
                break;
            }
        }
    }

    public function getProxyUrl(array $proxy): string
    {
        return "http://{$proxy['username']}:{$proxy['password']}@{$proxy['host']}:{$proxy['port']}";
    }
}

class PooledGuzzleClient
{
    private ProxyPool $pool;
    private Client $client;

    public function __construct(ProxyPool $pool)
    {
        $this->pool = $pool;
        $this->client = new Client([
            'timeout' => 30,
            'connect_timeout' => 10,
            'verify' => true
        ]);
    }

    public function request(string $method, string $url, array $options = []): array
    {
        $maxAttempts = 3;
        $lastError = null;

        for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
            $proxy = $this->pool->getNext();
            $proxyUrl = $this->pool->getProxyUrl($proxy);

            $options['proxy'] = [
                'http' => $proxyUrl,
                'https' => $proxyUrl
            ];

            try {
                $response = $this->client->request($method, $url, $options);
                $this->pool->markSuccess($proxy['username']);
                
                return [
                    'success' => true,
                    'status' => $response->getStatusCode(),
                    'body' => $response->getBody()->getContents()
                ];
            } catch (\Exception $e) {
                $this->pool->markFailure($proxy['username']);
                $lastError = $e->getMessage();
                
                // Pause before retrying
                usleep(1000000 * ($attempt + 1));
            }
        }

        return [
            'success' => false,
            'error' => $lastError
        ];
    }
}

// Pool configuration with different locations
$pool = new ProxyPool([
    ['username' => 'user-country-US', 'password' => 'pass', 'weight' => 3],
    ['username' => 'user-country-DE', 'password' => 'pass', 'weight' => 2],
    ['username' => 'user-country-GB', 'password' => 'pass', 'weight' => 1],
]);

$client = new PooledGuzzleClient($pool);

// Requests are distributed according to the weights
for ($i = 0; $i < 10; $i++) {
    $result = $client->request('GET', 'https://httpbin.org/ip');
    echo "Request {$i}: " . ($result['success'] ? 'OK' : 'FAIL') . "\n";
}
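
The pool above picks proxies at random in proportion to their weights. If you want a deterministic order instead, one option is smooth weighted round-robin. The sketch below is an alternative technique, not the selector used in ProxyPool; the class name WeightedRoundRobin is hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Smooth weighted round-robin: each call raises every entry's running score
 * by its weight, picks the highest score, then lowers the winner's score
 * by the total weight.
 */
final class WeightedRoundRobin
{
    /** @var array<string, int> */
    private array $current = [];

    /** @param array<string, int> $weights name => positive weight */
    public function __construct(private array $weights)
    {
        if ($weights === []) {
            throw new InvalidArgumentException('At least one entry is required');
        }
        foreach ($weights as $name => $weight) {
            if ($weight < 1) {
                throw new InvalidArgumentException("Weight for {$name} must be at least 1");
            }
            $this->current[$name] = 0;
        }
    }

    public function next(): string
    {
        $total = array_sum($this->weights);
        $best = null;

        foreach ($this->weights as $name => $weight) {
            $this->current[$name] += $weight;
            if ($best === null || $this->current[$name] > $this->current[$best]) {
                $best = $name;
            }
        }

        $this->current[$best] -= $total;

        return (string) $best;
    }
}

$selector = new WeightedRoundRobin(['US' => 3, 'DE' => 2, 'GB' => 1]);
$order = [];
for ($i = 0; $i < 6; $i++) {
    $order[] = $selector->next();
}
echo implode(', ', $order), "\n"; // US, DE, US, GB, DE, US
```

Over each cycle of six picks, every entry is chosen exactly as often as its weight, and the heavier entries are spread out rather than clustered.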

Symfony HTTP Client: Asynchronous Requests and Proxies

Symfony HTTP Client offers a modern API with native support for asynchronous requests, multiplexing, and advanced response handling. It is particularly well suited to high-performance scraping.

<?php

require 'vendor/autoload.php';

use Symfony\Component\HttpClient\HttpClient;
use Symfony\Contracts\HttpClient\HttpClientInterface;
use Symfony\Contracts\HttpClient\ResponseInterface;

class SymfonyProxyClient
{
    private HttpClientInterface $client;
    private string $proxyHost = 'gate.proxyhat.com';
    private int $proxyPort = 8080;

    public function __construct(
        string $username,
        string $password,
        array $options = []
    ) {
        $proxyUrl = "http://{$username}:{$password}@{$this->proxyHost}:{$this->proxyPort}";

        $this->client = HttpClient::create([
            'proxy' => $proxyUrl,
            'timeout' => $options['timeout'] ?? 30,
            'max_duration' => $options['max_duration'] ?? 60,
            'verify_peer' => $options['verify_ssl'] ?? true,
            'verify_host' => true,
            'headers' => [
                'User-Agent' => $options['user_agent'] ?? 'Symfony-Proxy-Client/1.0',
            ],
            'max_redirects' => 5,
        ]);
    }

    // Synchronous request
    public function get(string $url, array $options = []): array
    {
        $response = $this->client->request('GET', $url, $options);
        
        try {
            $statusCode = $response->getStatusCode();
            $content = $response->getContent();
            
            return [
                'success' => $statusCode >= 200 && $statusCode < 300,
                'status' => $statusCode,
                'body' => $content,
                'headers' => $response->getHeaders()
            ];
        } catch (\Symfony\Contracts\HttpClient\Exception\TransportExceptionInterface $e) {
            return [
                'success' => false,
                'error' => $e->getMessage()
            ];
        } catch (\Symfony\Contracts\HttpClient\Exception\HttpExceptionInterface $e) {
            return [
                'success' => false,
                'status' => $e->getResponse()->getStatusCode(),
                'error' => $e->getMessage()
            ];
        }
    }

    // Concurrent asynchronous requests
    public function fetchConcurrent(array $urls, ?callable $onProgress = null): array
    {
        $responses = [];
        
        // Start all requests in parallel
        foreach ($urls as $key => $url) {
            $responses[$key] = $this->client->request('GET', $url);
        }

        $results = [];
        
        // Iterate over the responses as chunks arrive
        foreach ($this->client->stream($responses) as $response => $chunk) {
            if ($chunk->isTimeout()) {
                // Treat an idle timeout as a failure and cancel the request
                // so a late chunk cannot overwrite this result
                $key = array_search($response, $responses, true);
                $results[$key] = [
                    'success' => false,
                    'error' => 'Timeout'
                ];
                $response->cancel();
                continue;
            }

            if ($chunk->isFirst()) {
                // Headers received; pass false so 4xx/5xx responses do not throw here
                if ($onProgress) {
                    $onProgress('headers', $response->getHeaders(false));
                }
            }

            if ($chunk->isLast()) {
                // Full content received
                try {
                    $key = array_search($response, $responses, true);
                    $results[$key] = [
                        'success' => true,
                        'status' => $response->getStatusCode(),
                        'body' => $response->getContent(),
                        'headers' => $response->getHeaders()
                    ];
                } catch (\Exception $e) {
                    $key = array_search($response, $responses, true);
                    $results[$key] = [
                        'success' => false,
                        'error' => $e->getMessage()
                    ];
                }
            }
        }

        return $results;
    }

    // Streaming for large responses
    public function streamContent(string $url, callable $onChunk): void
    {
        $response = $this->client->request('GET', $url);

        foreach ($this->client->stream($response) as $chunk) {
            if ($chunk->isTimeout()) {
                continue;
            }
            
            $onChunk($chunk->getContent());
        }
    }
}

// Example: concurrent scraping with Symfony
$client = new SymfonyProxyClient('user-country-US', 'your_password');

$urls = [
    'page1' => 'https://httpbin.org/delay/1',
    'page2' => 'https://httpbin.org/delay/2',
    'page3' => 'https://httpbin.org/delay/1',
];

$start = microtime(true);

$results = $client->fetchConcurrent($urls, function ($event, $data) {
    echo "Event: {$event}\n";
});

$elapsed = microtime(true) - $start;

echo "Total time: {$elapsed}s (vs " . count($urls) . " serial requests)\n";

foreach ($results as $key => $result) {
    echo "{$key}: " . ($result['success'] ? 'OK' : 'FAIL') . "\n";
}
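
fetchConcurrent() starts every request at once, which may be too aggressive for long URL lists. One simple way to cap concurrency is to split the list into batches and process them one after another. This is a sketch with a hypothetical helper name, batchUrls; the batch size is an arbitrary value to tune against your proxy plan and the target site.

```php
<?php

declare(strict_types=1);

/**
 * Splits a keyed list of URLs into batches of at most $batchSize entries,
 * preserving the original keys so results can still be matched to their URLs.
 *
 * @param array<string|int, string> $urls
 * @return array<int, array<string|int, string>>
 */
function batchUrls(array $urls, int $batchSize): array
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('batchSize must be at least 1');
    }

    // The third argument keeps the original keys inside each batch
    return array_chunk($urls, $batchSize, true);
}

$urls = [
    'page1' => 'https://httpbin.org/delay/1',
    'page2' => 'https://httpbin.org/delay/2',
    'page3' => 'https://httpbin.org/delay/1',
];

foreach (batchUrls($urls, 2) as $i => $batch) {
    // Each $batch could be handed to fetchConcurrent() in turn
    echo "Batch {$i}: " . implode(', ', array_keys($batch)) . "\n";
}
```

Because the keys are preserved, the per-batch results can be merged back into a single array without losing track of which URL produced which response.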

Laravel Integration: Service Class for a Proxy Pool

In a Laravel application, it is good practice to encapsulate proxy logic in a dedicated service class registered as a singleton in the container. This makes it easy to use from jobs, controllers, and other services.

<?php

// app/Services/ResidentialProxyService.php

namespace App\Services;

use Illuminate\Support\Facades\Log;
use Illuminate\Support\Facades\Cache;
use GuzzleHttp\Client;
use GuzzleHttp\Exception\RequestException;

class ResidentialProxyService
{
    private Client $client;
    private string $proxyHost = 'gate.proxyhat.com';
    private int $proxyPort = 8080;
    private string $username;
    private string $password;
    private array $config;

    public function __construct(array $config = [])
    {
        $this->config = array_merge([
            'timeout' => 30,
            'connect_timeout' => 10,
            'max_retries' => 3,
            'retry_delay' => 1000,
            'verify_ssl' => true,
            'default_country' => null,
        ], $config);

        $this->username = config('proxy.username');
        $this->password = config('proxy.password');

        $this->initializeClient();
    }

    private function initializeClient(): void
    {
        $this->client = new Client([
            'timeout' => $this->config['timeout'],
            'connect_timeout' => $this->config['connect_timeout'],
            'verify' => $this->config['verify_ssl'],
        ]);
    }

    private function buildProxyUrl(?string $country = null, ?string $sessionId = null): string
    {
        $username = $this->username;

        if ($country) {
            $username .= "-country-{$country}";
        } elseif ($this->config['default_country']) {
            $username .= "-country-{$this->config['default_country']}";
        }

        if ($sessionId) {
            $username .= "-session-{$sessionId}";
        }

        return "http://{$username}:{$this->password}@{$this->proxyHost}:{$this->proxyPort}";
    }

    public function request(
        string $method,
        string $url,
        array $options = [],
        ?string $country = null,
        ?string $sessionId = null
    ): array {
        $proxyUrl = $this->buildProxyUrl($country, $sessionId);
        
        $options['proxy'] = $proxyUrl;
        
        $attempt = 0;
        $lastException = null;

        while ($attempt < $this->config['max_retries']) {
            $attempt++;

            try {
                $response = $this->client->request($method, $url, $options);
                
                $this->logSuccess($url, $attempt);
                
                return [
                    'success' => true,
                    'status' => $response->getStatusCode(),
                    'body' => $response->getBody()->getContents(),
                    'headers' => $this->formatHeaders($response->getHeaders()),
                ];
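            } catch (\GuzzleHttp\Exception\ConnectException $e) {
                // Guzzle 7: connection failures (proxy unreachable, timeout) are not
                // RequestException instances and carry no response, so retry them here
                $lastException = $e;
                Log::warning("Proxy connection failed (attempt {$attempt})", [
                    'url' => $url,
                    'error' => $e->getMessage(),
                ]);
                usleep($this->config['retry_delay'] * 1000 * $attempt);
            } catch (RequestException $e) {
                $lastException = $e;
                $this->logFailure($url, $e, $attempt);

                if ($this->shouldRetry($e)) {
                    usleep($this->config['retry_delay'] * 1000 * $attempt);
                    continue;
                }

                break;
            }
        }

        return [
            'success' => false,
            'error' => $lastException?->getMessage(),
            'status' => $lastException instanceof RequestException
                ? $lastException->getResponse()?->getStatusCode()
                : null,
        ];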
    }

    public function get(string $url, array $options = [], ?string $country = null): array
    {
        return $this->request('GET', $url, $options, $country);
    }

    public function post(string $url, array $data, array $options = [], ?string $country = null): array
    {
        $options['json'] = $data;
        return $this->request('POST', $url, $options, $country);
    }

    // Sticky session: keeps the same IP across multiple requests
    public function createSession(?string $country = null): ProxySession
    {
        $sessionId = 'laravel_' . bin2hex(random_bytes(8));
        
        Cache::put("proxy_session:{$sessionId}", [
            'country' => $country ?? $this->config['default_country'],
            'created_at' => now(),
            'request_count' => 0,
        ], now()->addMinutes(30));

        return new ProxySession($sessionId, $this, $country);
    }

    public function getSession(string $sessionId): ?array
    {
        return Cache::get("proxy_session:{$sessionId}");
    }

    public function incrementSessionCount(string $sessionId): void
    {
        $session = $this->getSession($sessionId);
        if ($session) {
            $session['request_count']++;
            Cache::put("proxy_session:{$sessionId}", $session, now()->addMinutes(30));
        }
    }

    private function shouldRetry(RequestException $e): bool
    {
        $statusCode = $e->getResponse()?->getStatusCode();
        
        // Retry on server errors or rate limiting
        return $statusCode === null || $statusCode >= 500 || $statusCode === 429;
    }

    private function logSuccess(string $url, int $attempts): void
    {
        if ($attempts > 1) {
            Log::info("Proxy request succeeded after {$attempts} attempts", ['url' => $url]);
        }
    }

    private function logFailure(string $url, \Exception $e, int $attempt): void
    {
        Log::warning("Proxy request failed (attempt {$attempt})", [
            'url' => $url,
            'error' => $e->getMessage(),
            'status' => $e->getResponse()?->getStatusCode(),
        ]);
    }

    private function formatHeaders(array $headers): array
    {
        $formatted = [];
        foreach ($headers as $name => $values) {
            $formatted[$name] = implode(', ', $values);
        }
        return $formatted;
    }
}

// app/Services/ProxySession.php

namespace App\Services;

class ProxySession
{
    private string $sessionId;
    private ResidentialProxyService $service;
    private ?string $country;

    public function __construct(
        string $sessionId,
        ResidentialProxyService $service,
        ?string $country
    ) {
        $this->sessionId = $sessionId;
        $this->service = $service;
        $this->country = $country;
    }

    public function request(string $method, string $url, array $options = []): array
    {
        $this->service->incrementSessionCount($this->sessionId);
        return $this->service->request($method, $url, $options, $this->country, $this->sessionId);
    }

    public function get(string $url, array $options = []): array
    {
        return $this->request('GET', $url, $options);
    }

    public function getId(): string
    {
        return $this->sessionId;
    }
}

// app/Providers/AppServiceProvider.php

namespace App\Providers;

use App\Services\ResidentialProxyService;
use Illuminate\Support\ServiceProvider;

class AppServiceProvider extends ServiceProvider
{
    public function register(): void
    {
        $this->app->singleton(ResidentialProxyService::class, function ($app) {
            return new ResidentialProxyService([
                'timeout' => config('proxy.timeout', 30),
                'max_retries' => config('proxy.max_retries', 3),
                'default_country' => config('proxy.default_country'),
            ]);
        });
    }
}

// config/proxy.php

return [
    'username' => env('PROXY_USERNAME', 'user'),
    'password' => env('PROXY_PASSWORD', 'password'),
    'timeout' => env('PROXY_TIMEOUT', 30),
    'max_retries' => env('PROXY_MAX_RETRIES', 3),
    'default_country' => env('PROXY_DEFAULT_COUNTRY', 'US'),
];

Using the Service in Laravel Jobs

<?php

// app/Jobs/ScrapeProductPrices.php

namespace App\Jobs;

use App\Services\ResidentialProxyService;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\Log;

class ScrapeProductPrices implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public int $tries = 3;
    public int $backoff = 60;

    private array $products;
    private ?string $country;

    public function __construct(array $products, ?string $country = null)
    {
        $this->products = $products;
        $this->country = $country;
    }

    public function handle(ResidentialProxyService $proxy): void
    {
        // Create a sticky session so every request in this job keeps the same IP
        $session = $proxy->createSession($this->country);
        
        Log::info("Starting scrape job", [
            'session_id' => $session->getId(),
            'products_count' => count($this->products),
        ]);

        foreach ($this->products as $product) {
            $result = $session->get($product['url'], [
                'headers' => [
                    'User-Agent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
                    'Accept' => 'text/html,application/xhtml+xml',
                ],
            ]);

            if ($result['success']) {
                $price = $this->extractPrice($result['body'], $product['selector']);
                $this->savePrice($product['id'], $price);
            } else {
                Log::warning("Failed to scrape product", [
                    'product_id' => $product['id'],
                    'error' => $result['error'],
                ]);
            }

            // Rate limiting between requests
            usleep(200000); // 200ms
        }

        Log::info("Scrape job completed", ['session_id' => $session->getId()]);
    }

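    private function extractPrice(string $html, string $selector): ?float
    {
        if ($html === '') {
            return null; // DOMDocument::loadHTML() rejects an empty string
        }

        // Parse the HTML with DOMDocument; $selector is an XPath expression
        $previous = libxml_use_internal_errors(true); // collect markup warnings instead of emitting them
        $dom = new \DOMDocument();
        $dom->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $nodes = (new \DOMXPath($dom))->query($selector);
        if ($nodes === false || $nodes->length === 0) {
            return null; // invalid XPath or no matching node
        }

        // Keep digits and dots only; this assumes "." is the decimal separator
        $number = preg_replace('/[^0-9.]/', '', trim($nodes->item(0)->textContent));

        return is_numeric($number) ? (float) $number : null;
    }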

    private function savePrice(int $productId, ?float $price): void
    {
        if ($price !== null) {
            \App\Models\ProductPrice::updateOrCreate(
                ['product_id' => $productId],
                ['price' => $price, 'scraped_at' => now()]
            );
        }
    }
}

// Dispatching the job
use App\Jobs\ScrapeProductPrices;

ScrapeProductPrices::dispatch(
    [
        ['id' => 1, 'url' => 'https://example.com/product/1', 'selector' => '//span[@class="price"]'],
        ['id' => 2, 'url' => 'https://example.com/product/2', 'selector' => '//span[@class="price"]'],
    ],
    'DE' // Geo-targeting: Germany
)->onQueue('scraping');
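
The extractPrice() method above strips everything except digits and dots. That works for prices such as "$1,299.99", but it misreads formats that use a comma as the decimal separator, which is common on German sites like the 'DE' target in the dispatch example. Below is a minimal sketch of a more tolerant parser. parsePrice() is a hypothetical helper, not part of the original service, and its heuristic (the last separator followed by one or two digits is the decimal mark) is an assumption that will not fit every site.

```php
<?php

/**
 * Hypothetical helper: converts a scraped price string to a float.
 * Heuristic (an assumption): the last "." or "," followed by exactly
 * one or two trailing digits is the decimal separator; every other
 * separator is treated as a thousands separator.
 */
function parsePrice(string $text): ?float
{
    // Keep only digits, dots and commas, then drop stray leading/trailing separators
    $clean = trim((string) preg_replace('/[^0-9.,]/', '', $text), '.,');
    if (!preg_match('/\d/', $clean)) {
        return null; // nothing that looks like a number
    }

    if (preg_match('/^(.*)[.,](\d{1,2})$/', $clean, $m)) {
        // Integer part: remove the remaining (thousands) separators
        $integer = preg_replace('/[.,]/', '', $m[1]);
        return (float) (($integer === '' ? '0' : $integer) . '.' . $m[2]);
    }

    // No decimal part: every separator is a thousands separator
    return (float) preg_replace('/[.,]/', '', $clean);
}

var_dump(parsePrice('1.299,99 €'));
var_dump(parsePrice('$1,299.99'));
var_dump(parsePrice('n/a'));
```

Inside the job, the helper would replace the preg_replace() line of extractPrice(). A price with three decimal places would be read as a thousands group, so sites with unusual formats need their own rule.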

Multi-cURL for Concurrent Requests

For high-throughput scraping, the curl_multi_* functions let you run dozens of requests in parallel, so a batch takes roughly as long as its slowest request instead of the sum of all of them.

<?php

class ConcurrentCurlProxyClient
{
    private string $proxyHost = 'gate.proxyhat.com';
    private int $proxyPort = 8080;
    private string $username;
    private string $password;
    private int $maxConcurrent;

    public function __construct(
        string $username,
        string $password,
        int $maxConcurrent = 10
    ) {
        $this->username = $username;
        $this->password = $password;
        $this->maxConcurrent = $maxConcurrent;
    }

    /**
     * Runs concurrent requests with automatic IP rotation
     *
     * @param array $urls URLs to fetch
     * @param array $options Options applied to every request
     * @return array Results keyed like the input array
     */
    public function fetchAll(array $urls, array $options = []): array
    {
        $results = [];
        $handles = [];
        $multiHandle = curl_multi_init();
        
        // Options shared by every handle
        $commonOptions = [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_MAXREDIRS => 5,
            CURLOPT_TIMEOUT => $options['timeout'] ?? 30,
            CURLOPT_CONNECTTIMEOUT => $options['connect_timeout'] ?? 10,
            CURLOPT_SSL_VERIFYPEER => true,
            CURLOPT_SSL_VERIFYHOST => 2,
            CURLOPT_PROXY => $this->proxyHost,
            CURLOPT_PROXYPORT => $this->proxyPort,
        ];

        // Initialize one handle per URL
        foreach ($urls as $key => $url) {
            $ch = curl_init($url);
            
            // Generate a unique session ID per request (IP rotation)
            $sessionId = 'multi_' . bin2hex(random_bytes(6));
            $country = $options['country'] ?? null;
            
            $username = $this->username;
            if ($country) {
                $username .= "-country-{$country}";
            }
            $username .= "-session-{$sessionId}";
            
            $proxyAuth = "{$username}:{$this->password}";
            
            $handleOptions = $commonOptions + [
                CURLOPT_PROXYUSERPWD => $proxyAuth,
            ];
            
            // Custom headers
            if (!empty($options['headers'])) {
                $handleOptions[CURLOPT_HTTPHEADER] = $options['headers'];
            }
            
            curl_setopt_array($ch, $handleOptions);
            
            $handles[$key] = $ch;
        }

        // Add every handle to the multi handle
        $active = null;
        foreach ($handles as $ch) {
            curl_multi_add_handle($multiHandle, $ch);
        }

        // Run the transfers
        do {
            $status = curl_multi_exec($multiHandle, $active);

            if ($status === CURLM_CALL_MULTI_PERFORM) {
                continue;
            }

            if ($status !== CURLM_OK) {
                break;
            }

            // Wait for activity on at least one connection
            curl_multi_select($multiHandle, 1.0);

        } while ($active > 0);

        // Drain the message queue: curl_multi_info_read() records each transfer's
        // result code on its handle, so curl_errno()/curl_error() below report real failures
        while (curl_multi_info_read($multiHandle) !== false) {
            // Reading the message is enough; nothing else to do here
        }

        // Collect the results
        foreach ($handles as $key => $ch) {
            $results[$key] = [
                'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
                'body' => curl_multi_getcontent($ch),
                'error' => curl_error($ch),
                'total_time' => curl_getinfo($ch, CURLINFO_TOTAL_TIME),
                'connect_time' => curl_getinfo($ch, CURLINFO_CONNECT_TIME),
                'size_download' => curl_getinfo($ch, CURLINFO_SIZE_DOWNLOAD),
                'success' => curl_errno($ch) === 0,
            ];
            
            curl_multi_remove_handle($multiHandle, $ch);
            curl_close($ch);
        }

        curl_multi_close($multiHandle);

        return $results;
    }

    /**
     * Processes URLs in batches of $maxConcurrent and invokes a callback for each result
     */
    public function processBatch(
        array $urls,
        callable $onSuccess,
        ?callable $onError = null,
        array $options = []
    ): array {
        $batchSize = $this->maxConcurrent;
        $batches = array_chunk($urls, $batchSize, true);
        
        $stats = [
            'total' => count($urls),
            'success' => 0,
            'failed' => 0,
            'total_time' => 0,
        ];

        foreach ($batches as $batch) {
            $results = $this->fetchAll($batch, $options);
            
            foreach ($results as $key => $result) {
                if ($result['success'] && $result['status'] >= 200 && $result['status'] < 300) {
                    $onSuccess($key, $result);
                    $stats['success']++;
                } else {
                    // Count the failure even when no error callback was supplied
                    $stats['failed']++;
                    if ($onError) {
                        $onError($key, $result);
                    }
                }
                $stats['total_time'] += $result['total_time'];
            }
        }

        return $stats;
    }
}

// Example: scraping 50 URLs concurrently
// Pass the bare username: fetchAll() appends "-country-XX" from $options['country'] itself
$client = new ConcurrentCurlProxyClient('user', 'your_password', 20);

$urls = [];
for ($i = 1; $i <= 50; $i++) {
    $urls["page_{$i}"] = "https://httpbin.org/delay/1?page={$i}";
}

$start = microtime(true);

$stats = $client->processBatch(
    $urls,
    function ($key, $result) {
        echo "OK: {$key} ({$result['total_time']}s)\n";
    },
    function ($key, $result) {
        echo "FAIL: {$key} - {$result['error']}\n";
    },
    ['country' => 'US', 'timeout' => 15]
);

$elapsed = microtime(true) - $start;

echo "\n=== Statistics ===\n";
echo "Total: {$stats['total']}\n";
echo "Succeeded: {$stats['success']}\n";
echo "Failed: {$stats['failed']}\n";
echo "Wall-clock time: {$elapsed}s\n";
echo "Average time per request: " . ($stats['total_time'] / $stats['total']) . "s\n";
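
The username convention that fetchAll() assembles inline (base username, an optional "-country-XX" suffix, then "-session-ID") can be pulled into one small function so that every client builds it the same way. The sketch below is illustrative only: buildProxyUsername() and buildProxyUrl() are hypothetical names, and the suffix format is simply the one used in this article's examples, so check your provider's documentation for the exact syntax.

```php
<?php

/**
 * Hypothetical helper: builds "user[-country-XX][-session-ID]".
 * The suffix format mirrors this article's examples (an assumption).
 */
function buildProxyUsername(string $baseUser, ?string $country = null, ?string $sessionId = null): string
{
    $username = $baseUser;
    if ($country !== null && $country !== '') {
        $username .= '-country-' . strtoupper($country);
    }
    if ($sessionId !== null && $sessionId !== '') {
        $username .= '-session-' . $sessionId;
    }
    return $username;
}

/**
 * Hypothetical helper: returns a proxy URL for CURLOPT_PROXY or Guzzle's 'proxy' option.
 * Credentials are percent-encoded so characters like "@" or ":" cannot break the URL.
 */
function buildProxyUrl(
    string $username,
    string $password,
    string $host = 'gate.proxyhat.com',
    int $port = 8080
): string {
    return sprintf('http://%s:%s@%s:%d', rawurlencode($username), rawurlencode($password), $host, $port);
}

// A fresh session ID per request rotates the IP; reusing one keeps the session sticky
$rotating = buildProxyUsername('user', 'de', bin2hex(random_bytes(6)));
$sticky   = buildProxyUsername('user', 'DE', 'abc123');

echo buildProxyUrl($sticky, 'your_password'), "\n";
```

Keeping the base username free of suffixes and adding them in one place avoids accidentally appending the country twice when both the constructor argument and the per-request options carry it.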

TLS/SSL and CA Bundle Management

When you route HTTPS traffic through a proxy, certificate verification must be configured correctly to prevent man-in-the-middle attacks. PHP and cURL need an up-to-date CA bundle to perform that verification.

<?php

class SecureProxyClient
{
    private string $proxyHost = 'gate.proxyhat.com';
    private int $proxyPort = 8080;
    private ?string $caBundlePath;
    private array $tlsOptions;

    public function __construct(
        string $username,
        string $password,
        array $tlsConfig = []
    ) {
        $this->caBundlePath = $this->resolveCaBundle($tlsConfig['ca_bundle'] ?? null);
        
        $this->tlsOptions = [
            'verify_peer' => $tlsConfig['verify_peer'] ?? true,
            'verify_peer_name' => $tlsConfig['verify_peer_name'] ?? true,
            'verify_host' => $tlsConfig['verify_host'] ?? 2,
            'ssl_version' => $tlsConfig['ssl_version'] ?? CURL_SSLVERSION_TLSv1_2,
        ];
    }

    /**
     * Resolves the path to the CA bundle
     */
    private function resolveCaBundle(?string $customPath): ?string
    {
        if ($customPath && file_exists($customPath)) {
            return $customPath;
        }

        // Common CA bundle locations
        $commonPaths = [
            // Composer CA bundle (recommended)
            __DIR__ . '/vendor/composer/ca-bundle/res/cacert.pem',
            // System paths
            '/etc/ssl/certs/ca-certificates.crt', // Debian/Ubuntu
            '/etc/pki/tls/certs/ca-bundle.crt', // RHEL/CentOS
            '/usr/local/etc/openssl/cert.pem', // macOS Homebrew
            '/usr/share/curl/curl-ca-bundle.crt', // legacy cURL location on some Unix systems
        ];

        foreach ($commonPaths as $path) {
            if (file_exists($path)) {
                return $path;
            }
        }

        // Fallback: download the Mozilla bundle
        return $this->downloadCaBundle();
    }

    /**
     * Downloads the Mozilla CA bundle as a fallback and caches it for one week
     */
    private function downloadCaBundle(): ?string
    {
        $cachePath = sys_get_temp_dir() . '/proxy_client_ca_bundle.pem';
        
        if (file_exists($cachePath) && filemtime($cachePath) > strtotime('-1 week')) {
            return $cachePath;
        }

        $bundle = file_get_contents('https://curl.se/ca/cacert.pem');
        if ($bundle) {
            file_put_contents($cachePath, $bundle);
            return $cachePath;
        }

        return null;
    }

    /**
     * Configures cURL with strict TLS settings
     */
    private function configureTls($ch, string $proxyUsername, string $proxyPassword): void
    {
        // Proxy settings
        curl_setopt($ch, CURLOPT_PROXY, $this->proxyHost);
        curl_setopt($ch, CURLOPT_PROXYPORT, $this->proxyPort);
        curl_setopt($ch, CURLOPT_PROXYUSERPWD, "{$proxyUsername}:{$proxyPassword}");
        
        // TLS/SSL verification
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, $this->tlsOptions['verify_peer']);
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, $this->tlsOptions['verify_host']);
        curl_setopt($ch, CURLOPT_SSLVERSION, $this->tlsOptions['ssl_version']);
        
        // Custom CA bundle. CURLOPT_CAPATH is deliberately not set: it expects a
        // directory of hashed certificates, not the folder that contains the bundle file
        if ($this->caBundlePath) {
            curl_setopt($ch, CURLOPT_CAINFO, $this->caBundlePath);
        }
        
        // Client certificate (if required)
        // curl_setopt($ch, CURLOPT_SSLCERT, '/path/to/client.crt');
        // curl_setopt($ch, CURLOPT_SSLKEY, '/path/to/client.key');
        
        // Cipher suites (optional). TLS 1.3 suites are set with CURLOPT_TLS13_CIPHERS;
        // with OpenSSL, CURLOPT_SSL_CIPHER_LIST only covers TLS 1.2 and earlier
        // curl_setopt($ch, CURLOPT_TLS13_CIPHERS, 'TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256');
    }

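    /**
     * Checks that a TLS connection can be established and verified.
     * Note: this check connects directly, not through the proxy.
     */
    public function testTlsConnection(string $url = 'https://www.google.com'): array
    {
        $ch = curl_init($url);
        
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_HEADER, true);
        curl_setopt($ch, CURLOPT_NOBODY, true);
        curl_setopt($ch, CURLOPT_CERTINFO, true); // needed for curl_getinfo() to populate 'certinfo'
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);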
        
        if ($this->caBundlePath) {
            curl_setopt($ch, CURLOPT_CAINFO, $this->caBundlePath);
        }

        $result = curl_exec($ch);
        $info = curl_getinfo($ch);
        $error = curl_error($ch);
        
        curl_close($ch);

        return [
            'success' => $error === '',
            'ssl_verify_result' => $info['ssl_verify_result'],
            'certinfo' => $info['certinfo'] ?? [],
            'error' => $error,
            'ca_bundle' => $this->caBundlePath,
        ];
    }

    /**
     * Sends an HTTPS request through the proxy with full certificate verification
     */
    public function secureGet(
        string $url,
        string $proxyUsername,
        string $proxyPassword,
        array $options = []
    ): array {
        $ch = curl_init($url);
        
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $options['timeout'] ?? 30);
        
        $this->configureTls($ch, $proxyUsername, $proxyPassword);
        
        if (!empty($options['headers'])) {
            curl_setopt($ch, CURLOPT_HTTPHEADER, $options['headers']);
        }

        $response = curl_exec($ch);
        $info = curl_getinfo($ch);
        $error = curl_error($ch);
        
        curl_close($ch);

        return [
            'success' => $error === '' && $info['http_code'] >= 200 && $info['http_code'] < 300,
            'status' => $info['http_code'],
            'body' => $response,
            'ssl_verify_result' => $info['ssl_verify_result'],
            'error' => $error,
        ];
    }
}

// Install the CA bundle with Composer
// composer require composer/ca-bundle

// Usage
$client = new SecureProxyClient('user-country-US', 'your_password');

// Test the TLS connection
$test = $client->testTlsConnection();
if (!$test['success']) {
    echo "TLS error: " . $test['error'] . "\n";
    echo "CA bundle: " . $test['ca_bundle'] . "\n";
}

// Secure request
$result = $client->secureGet(
    'https://api.example.com/data',
    'user-country-DE-session-abc123',
    'your_password',
    ['timeout' => 20]
);

if ($result['success']) {
    echo "Response received: " . strlen($result['body']) . " bytes\n";
} else {
    echo "Error: " . $result['error'] . "\n";
}
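
Before falling back to hard-coded paths or a download, it can be worth asking PHP itself where its trust store lives. The sketch below checks the curl.cainfo and openssl.cafile ini settings, then the default file reported by openssl_get_cert_locations(). findConfiguredCaBundle() is a hypothetical helper name, and the OpenSSL lookup only runs when that extension is loaded. It returns null when nothing readable is found.

```php
<?php

/**
 * Hypothetical helper: returns the first readable CA bundle that PHP is
 * already configured to use, or null if none is found.
 */
function findConfiguredCaBundle(): ?string
{
    $candidates = [
        ini_get('curl.cainfo'),    // php.ini setting read by the cURL extension
        ini_get('openssl.cafile'), // php.ini setting read by PHP's SSL streams
    ];

    // OpenSSL's compiled-in default location (requires ext-openssl)
    if (function_exists('openssl_get_cert_locations')) {
        $locations = openssl_get_cert_locations();
        $candidates[] = $locations['default_cert_file'] ?? null;
    }

    foreach ($candidates as $path) {
        // ini_get() returns false or '' when the setting is missing or empty
        if (is_string($path) && $path !== '' && is_file($path) && is_readable($path)) {
            return $path;
        }
    }

    return null;
}

$bundle = findConfiguredCaBundle();
echo $bundle !== null
    ? "Using CA bundle: {$bundle}\n"
    : "No configured CA bundle found\n";
```

A resolver like resolveCaBundle() could try this lookup first and only then walk its list of common paths.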

Comparing PHP HTTP Clients for Proxy Use

| Feature | Native cURL | Guzzle | Symfony HTTP Client | Laravel HTTP |
|---|---|---|---|---|
| Proxy configuration | CURLOPT_PROXY | 'proxy' request option | 'proxy' option | withOptions(['proxy' => ...]) |
| Asynchronous requests | curl_multi_* | Promises / middleware | Native streaming | Promises (async) |
| Automatic retry | Manual | Middleware::retry() | RetryableHttpClient | retry() method |
| Logging | Manual | PSR-3 middleware | Built-in PSR-3 logger | Log facade |
| Learning curve | Low | Medium | Medium | Low |
| Performance | Highest | High | High | High |
| Laravel integration | None | Facade available | Bridge available | Native |

Key Takeaways

Proxy configuration: Use the format http://username:password@gate.proxyhat.com:8080 for HTTP and socks5://username:password@gate.proxyhat.com:1080 for SOCKS5. Geo-targeting and sessions are set in the username, not the password.

IP rotation: For intensive scraping, rotate per request by giving each request a unique session ID. ProxyHat residential proxies also support sticky sessions when you need to keep the same IP.

TLS/SSL: Never disable verify_peer in production. Use an up-to-date CA bundle, either from composer/ca-bundle or from the operating system.

Concurrency: For high-volume scraping, use curl_multi_* or the streaming API of Symfony HTTP Client to maximize throughput.

Error handling: Always implement retries with exponential backoff, plus a circuit breaker, to handle temporary failures of the proxy or the target server.
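
The retry advice above can be sketched in a few lines. retryWithBackoff() and CircuitBreaker below are hypothetical names written for this illustration, not part of any library. The delays, the thresholds, and the injected $sleep callable are assumptions chosen to keep the example small and testable.

```php
<?php

/** Hypothetical minimal circuit breaker: opens after N consecutive failures. */
final class CircuitBreaker
{
    private int $failures = 0;
    private float $openedAt = 0.0;

    public function __construct(
        private int $threshold = 5,
        private float $cooldownSeconds = 30.0
    ) {
    }

    public function allowsRequest(): bool
    {
        if ($this->failures < $this->threshold) {
            return true; // circuit closed
        }
        // Circuit open: allow a trial request once the cooldown has elapsed
        return (microtime(true) - $this->openedAt) >= $this->cooldownSeconds;
    }

    public function recordSuccess(): void
    {
        $this->failures = 0;
    }

    public function recordFailure(): void
    {
        $this->failures++;
        if ($this->failures >= $this->threshold) {
            $this->openedAt = microtime(true);
        }
    }
}

/**
 * Calls $operation until it succeeds or $maxAttempts is reached, doubling the
 * delay after each failure (base, 2x, 4x, ...) and adding up to 50% random jitter.
 */
function retryWithBackoff(
    callable $operation,
    CircuitBreaker $breaker,
    int $maxAttempts = 3,
    int $baseDelayMs = 200,
    ?callable $sleep = null
): mixed {
    $sleep ??= static fn (int $ms) => usleep($ms * 1000);

    for ($attempt = 1; ; $attempt++) {
        if (!$breaker->allowsRequest()) {
            throw new RuntimeException('Circuit open: skipping request');
        }
        try {
            $result = $operation($attempt);
            $breaker->recordSuccess();
            return $result;
        } catch (Throwable $e) {
            $breaker->recordFailure();
            if ($attempt >= $maxAttempts) {
                throw $e; // out of attempts: surface the last error
            }
            $delay = $baseDelayMs * (2 ** ($attempt - 1));
            $sleep($delay + random_int(0, intdiv($delay, 2)));
        }
    }
}
```

In a proxy client, $operation would wrap the actual request and throw on connection errors or on retryable status codes such as 429 or 503. Sharing one CircuitBreaker instance per proxy gateway lets repeated failures pause all callers, not just the current one.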

Conclusion

Configuring HTTP proxies in PHP takes attention to detail, but with the right tools (native cURL, Guzzle, or Symfony HTTP Client) you can build robust, fast scraping systems. In Laravel projects, a dedicated service class such as ResidentialProxyService centralizes the logic and makes it easy to use from jobs and controllers.

To get started with ProxyHat residential proxies, put your credentials in Laravel's .env file or in your PHP script, and send your requests through the gateway gate.proxyhat.com:8080. Geo-targeting and sticky sessions are configured directly in the username, which gives you fine-grained control without code changes.

For more advanced use cases such as large-scale web scraping or SERP tracking, see our pricing page to choose the plan that fits your request volume.
