
Proxy Integration with Puppeteer and Playwright: Complete Guide with Code Examples

A detailed guide on setting up HTTP, HTTPS, and SOCKS5 proxies in Puppeteer and Playwright with code examples in JavaScript and TypeScript.

📅 February 13, 2026

Puppeteer and Playwright are popular tools for browser automation and web scraping. When working with large volumes of requests or scraping protected sites, using proxies becomes critically important to avoid IP blocks. In this guide, we will explore all the ways to integrate proxies into both tools, from basic setup to advanced scenarios involving rotation and error handling.

Basics of Proxy Usage in Headless Browsers

Puppeteer and Playwright control real browsers (Chromium, Firefox, WebKit) through low-level automation protocols — the Chrome DevTools Protocol for Chromium, and browser-specific protocols in Playwright's case for Firefox and WebKit. In practice this means that proxies are set at browser (or, in Playwright, browser-context) launch, not for individual requests. Both tools support HTTP, HTTPS, and SOCKS5 proxies but expose different APIs for configuring them.

Key differences from regular HTTP libraries (axios, fetch):

  • Proxy is set at launch — you cannot change the proxy on the fly for an already-running browser (Playwright mitigates this by allowing a different proxy per browser context)
  • JavaScript support and rendering — the proxy applies to all resources on the page (images, scripts, XHR)
  • Automatic handling of redirects and cookies — the browser behaves like a real user
  • Fingerprinting — even with proxies, sites can detect automation based on browser characteristics

Important: For tasks requiring frequent IP changes (scraping thousands of pages, mass registration), it is more effective to use residential proxies with rotation — they allow changing the IP for each request without restarting the browser.

Setting Up Proxies in Puppeteer

Puppeteer is a library from Google for controlling Chrome/Chromium. Proxies are configured via browser launch arguments using the --proxy-server flag.

Basic Setup of HTTP/HTTPS Proxies

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    args: [
      '--proxy-server=http://proxy.example.com:8080'
    ]
  });

  const page = await browser.newPage();
  
  // Check IP address
  await page.goto('https://api.ipify.org?format=json');
  const content = await page.content();
  console.log('Current IP:', content);

  await browser.close();
})();

This code launches a browser through a proxy server. All HTTP and HTTPS requests will go through the specified proxy. The ipify service is used to check functionality, returning the external IP address.

Setting Up SOCKS5 Proxies

const browser = await puppeteer.launch({
  headless: true,
  args: [
    '--proxy-server=socks5://proxy.example.com:1080'
  ]
});

// Remaining code is similar

SOCKS5 proxies operate at a lower level of the network stack, and the protocol itself also supports UDP. The syntax is identical to HTTP proxies — only the scheme in the URL changes. Note that Chromium does not support username/password authentication for SOCKS5 proxies, so for SOCKS5 you typically need IP whitelisting or an unauthenticated endpoint.

Using Proxies for Specific Domains

const browser = await puppeteer.launch({
  args: [
    '--proxy-server=http://proxy1.example.com:8080',
    '--proxy-bypass-list=localhost;127.0.0.1;*.internal.com'
  ]
});

// Local requests and requests to *.internal.com will go directly
// All others will go through the proxy

The --proxy-bypass-list flag allows you to exclude certain domains from proxying. This is useful when you need to combine direct and proxied requests.
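To build intuition for how the bypass list is matched, here is a simplified illustration in plain JavaScript. This is a sketch, not Chromium's actual matcher — the real implementation also understands ports, IP ranges, and the special `<local>` rule; this version only covers exact hosts and leading-wildcard patterns:

```javascript
// Simplified model of --proxy-bypass-list matching:
// semicolon-separated patterns, '*' prefix matches any subdomain suffix.
function shouldBypass(hostname, bypassList) {
  return bypassList.split(';').some(pattern => {
    if (pattern.startsWith('*')) {
      // '*.internal.com' matches any host ending in '.internal.com'
      return hostname.endsWith(pattern.slice(1));
    }
    return hostname === pattern;
  });
}

console.log(shouldBypass('api.internal.com', 'localhost;127.0.0.1;*.internal.com')); // true
console.log(shouldBypass('example.com', 'localhost;127.0.0.1;*.internal.com'));      // false
```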

Setting Up Proxies in Playwright

Playwright is a more modern library from Microsoft that supports Chromium, Firefox, and WebKit. Proxies are configured through a configuration object, making the API clearer and more typed.

Basic Proxy Setup

const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch({
    proxy: {
      server: 'http://proxy.example.com:8080'
    }
  });

  const context = await browser.newContext();
  const page = await context.newPage();
  
  await page.goto('https://api.ipify.org?format=json');
  const ip = await page.textContent('body');
  console.log('IP through proxy:', ip);

  await browser.close();
})();

Playwright uses the proxy object instead of command-line arguments. This provides better typing in TypeScript and cleaner code.

Setting Up Proxies at the Context Level

const browser = await chromium.launch();

// Context 1 - with proxy
const context1 = await browser.newContext({
  proxy: {
    server: 'http://proxy1.example.com:8080'
  }
});

// Context 2 - with another proxy
const context2 = await browser.newContext({
  proxy: {
    server: 'http://proxy2.example.com:8080'
  }
});

const page1 = await context1.newPage();
const page2 = await context2.newPage();

// Each page uses its own proxy!

One of the key advantages of Playwright is the ability to create multiple browser contexts with different proxies within a single process. This saves resources when working with multiple IP addresses.

Excluding Domains from Proxying

const browser = await chromium.launch({
  proxy: {
    server: 'http://proxy.example.com:8080',
    bypass: 'localhost,127.0.0.1,*.internal.com'
  }
});

// Requests to localhost and *.internal.com will go directly

Authentication with Username and Password

Most commercial proxies require authentication. Both tools support authorization but implement it differently.

Authentication in Puppeteer

const browser = await puppeteer.launch({
  args: ['--proxy-server=http://proxy.example.com:8080']
});

const page = await browser.newPage();

// Set credentials for the proxy
await page.authenticate({
  username: 'your_username',
  password: 'your_password'
});

await page.goto('https://httpbin.org/ip');
const content = await page.content();
console.log(content);

The page.authenticate() method sets the credentials for all subsequent requests on that page. It is important to call it BEFORE the first page.goto().

Authentication in Playwright

const browser = await chromium.launch({
  proxy: {
    server: 'http://proxy.example.com:8080',
    username: 'your_username',
    password: 'your_password'
  }
});

const page = await browser.newPage();
await page.goto('https://httpbin.org/ip');

Playwright allows you to specify credentials directly in the proxy configuration object — this is more convenient, as it does not require an additional method call before navigation.

Alternative Method: Credentials in URL

// Embedding credentials in the proxy URL
const proxyUrl = 'http://username:password@proxy.example.com:8080';

// Puppeteer
const browser = await puppeteer.launch({
  args: [`--proxy-server=${proxyUrl}`]
});

// Playwright
const browser = await chromium.launch({
  proxy: { server: proxyUrl }
});

This method is not recommended for production: credentials may end up in logs, and Chromium ignores credentials embedded in --proxy-server, so in Puppeteer you still need to call page.authenticate(). Use environment variables to store sensitive data.
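As a sketch of the environment-variable approach — the variable names PROXY_SERVER, PROXY_USERNAME, and PROXY_PASSWORD are placeholders, not a standard; use whatever your deployment defines:

```javascript
// Build a Playwright-style proxy config from environment variables
// instead of hard-coding credentials in source files.
function proxyFromEnv(env = process.env) {
  if (!env.PROXY_SERVER) {
    throw new Error('PROXY_SERVER is not set');
  }
  return {
    server: env.PROXY_SERVER,       // e.g. 'http://proxy.example.com:8080'
    username: env.PROXY_USERNAME,
    password: env.PROXY_PASSWORD
  };
}

// Usage: const browser = await chromium.launch({ proxy: proxyFromEnv() });
```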

Tip: When scraping commercial sites, we recommend using residential proxies — they have real IPs of home users and are less frequently blocked by anti-bot systems.

Proxy Rotation and IP Pool Management

For large-scale scraping, it is necessary to regularly change IP addresses. Since proxies are set at browser launch, rotation requires restarting the browser session.

Simple Rotation with an Array of Proxies (Puppeteer)

const puppeteer = require('puppeteer');

const proxyList = [
  'http://user1:pass1@proxy1.example.com:8080',
  'http://user2:pass2@proxy2.example.com:8080',
  'http://user3:pass3@proxy3.example.com:8080'
];

async function scrapeWithRotation(urls) {
  for (let i = 0; i < urls.length; i++) {
    const proxyUrl = proxyList[i % proxyList.length];
    
    const browser = await puppeteer.launch({
      args: [`--proxy-server=${proxyUrl}`]
    });

    try {
      const page = await browser.newPage();
      await page.goto(urls[i], { waitUntil: 'networkidle0' });
      
      const data = await page.evaluate(() => {
        return {
          title: document.title,
          url: window.location.href
        };
      });
      
      console.log(`URL ${i + 1}:`, data);
    } catch (error) {
      console.error(`Error on ${urls[i]}:`, error.message);
    } finally {
      await browser.close();
    }
  }
}

const urlsToScrape = [
  'https://example.com/page1',
  'https://example.com/page2',
  'https://example.com/page3',
  'https://example.com/page4'
];

scrapeWithRotation(urlsToScrape);

This code cycles through the proxies in the list and launches a new browser for each URL. The expression i % proxyList.length wraps the index back to the start of the list, producing round-robin rotation.
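The rotation pattern in isolation — a minimal round-robin picker you can reuse for any resource list, not just proxies:

```javascript
// Returns a function that yields list items in round-robin order.
function roundRobin(list) {
  let i = 0;
  return () => list[i++ % list.length];
}

const next = roundRobin(['p1', 'p2', 'p3']);
console.log(next(), next(), next(), next()); // p1 p2 p3 p1
```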

Rotation with a Pool of Contexts (Playwright)

const { chromium } = require('playwright');

const proxyList = [
  { server: 'http://proxy1.example.com:8080', username: 'user1', password: 'pass1' },
  { server: 'http://proxy2.example.com:8080', username: 'user2', password: 'pass2' },
  { server: 'http://proxy3.example.com:8080', username: 'user3', password: 'pass3' }
];

async function scrapeWithContextPool(urls) {
  const browser = await chromium.launch();
  
  // Create a pool of contexts with different proxies
  const contexts = await Promise.all(
    proxyList.map(proxy => browser.newContext({ proxy }))
  );

  for (let i = 0; i < urls.length; i++) {
    const context = contexts[i % contexts.length];
    const page = await context.newPage();
    
    try {
      await page.goto(urls[i], { waitUntil: 'networkidle' });
      const title = await page.title();
      console.log(`URL ${i + 1} (proxy ${i % contexts.length}):`, title);
    } catch (error) {
      console.error(`Error on ${urls[i]}:`, error.message);
    } finally {
      await page.close();
    }
  }

  await browser.close();
}

const urls = [
  'https://example.com/page1',
  'https://example.com/page2',
  'https://example.com/page3'
];

scrapeWithContextPool(urls);

Playwright allows you to create multiple contexts in one browser, each with its own proxy. This saves memory and startup time compared to fully restarting the browser.

Smart Rotation with Error Tracking

class ProxyRotator {
  constructor(proxyList) {
    this.proxyList = proxyList;
    this.currentIndex = 0;
    this.failedProxies = new Set();
  }

  getNext() {
    if (this.failedProxies.size >= this.proxyList.length) {
      throw new Error('All proxies are unavailable');
    }

    // Skip over proxies previously marked as failed
    let index = this.currentIndex;
    while (this.failedProxies.has(index)) {
      index = (index + 1) % this.proxyList.length;
    }

    this.currentIndex = (index + 1) % this.proxyList.length;
    return { proxy: this.proxyList[index], index };
  }

  markFailed(index) {
    this.failedProxies.add(index);
    console.log(`Proxy ${index} marked as unavailable`);
  }

  resetFailed() {
    this.failedProxies.clear();
  }
}

// Usage
const rotator = new ProxyRotator(proxyList);

async function scrapeWithSmartRotation(url, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const { proxy, index } = rotator.getNext();
    
    const browser = await chromium.launch({ proxy });
    const page = await browser.newPage();

    try {
      await page.goto(url, { timeout: 30000 });
      const data = await page.content();
      await browser.close();
      return data;
    } catch (error) {
      console.error(`Error with proxy ${index}:`, error.message);
      rotator.markFailed(index);
      await browser.close();
      
      if (attempt === maxRetries - 1) {
        throw new Error(`Failed to load ${url} after ${maxRetries} attempts`);
      }
    }
  }
}

This class tracks non-working proxies and excludes them from rotation. On error, it automatically switches to the next proxy in the list.

Error Handling and Functionality Checks

When working with proxies, specific errors can occur: connection timeouts, authorization failures, IP blocks by the target site. Proper error handling is critical for the stable operation of the scraper.

Common Errors and Their Handling

// Puppeteer version: 'networkidle0' and ERR_* codes are Chromium-specific
// (for Playwright, use waitUntil: 'networkidle')
async function safePageLoad(page, url, options = {}) {
  const defaultOptions = {
    timeout: 30000,
    waitUntil: 'networkidle0',
    maxRetries: 3,
    retryDelay: 2000
  };

  const config = { ...defaultOptions, ...options };

  for (let attempt = 1; attempt <= config.maxRetries; attempt++) {
    try {
      await page.goto(url, {
        timeout: config.timeout,
        waitUntil: config.waitUntil
      });
      
      return { success: true, attempt };
    } catch (error) {
      console.error(`Attempt ${attempt} failed:`, error.message);

      // Analyze the type of error
      if (error.message.includes('ERR_PROXY_CONNECTION_FAILED')) {
        throw new Error('Proxy is unavailable');
      }
      
      if (error.message.includes('ERR_TUNNEL_CONNECTION_FAILED')) {
        throw new Error('Proxy tunneling error');
      }

      if (error.message.includes('407')) {
        throw new Error('Proxy authorization error (407)');
      }

      if (error.message.includes('Navigation timeout')) {
        console.log(`Loading timeout, retrying in ${config.retryDelay}ms`);
        await new Promise(resolve => setTimeout(resolve, config.retryDelay));
        continue;
      }

      // If this is the last attempt - throw the error
      if (attempt === config.maxRetries) {
        throw error;
      }
    }
  }
}

Checking Proxy Functionality

async function testProxy(proxyConfig) {
  const browser = await chromium.launch({ proxy: proxyConfig });
  const page = await browser.newPage();

  const result = {
    working: false,
    ip: null,
    responseTime: null,
    error: null
  };

  const startTime = Date.now();

  try {
    await page.goto('https://api.ipify.org?format=json', { timeout: 10000 });
    const content = await page.textContent('body');
    const data = JSON.parse(content);
    
    result.working = true;
    result.ip = data.ip;
    result.responseTime = Date.now() - startTime;
  } catch (error) {
    result.error = error.message;
  } finally {
    await browser.close();
  }

  return result;
}

// Testing the proxy list
async function validateProxyList(proxyList) {
  console.log('Checking proxies...');
  
  const results = await Promise.all(
    proxyList.map(async (proxy, index) => {
      const result = await testProxy(proxy);
      console.log(`Proxy ${index + 1}:`, result.working ? `✓ ${result.ip} (${result.responseTime}ms)` : `✗ ${result.error}`);
      return { proxy, ...result };
    })
  );

  const workingProxies = results.filter(r => r.working);
  console.log(`\nWorking proxies: ${workingProxies.length}/${proxyList.length}`);
  
  return workingProxies.map(r => r.proxy);
}

This function checks each proxy before use, measures response time, and returns only working proxies. It is recommended to run validation at the start of the application.

Monitoring and Logging

class ProxyMonitor {
  constructor() {
    this.stats = {
      totalRequests: 0,
      successfulRequests: 0,
      failedRequests: 0,
      proxyErrors: 0,
      averageResponseTime: 0,
      requestsByProxy: new Map()
    };
  }

  recordRequest(proxyIndex, success, responseTime, error = null) {
    this.stats.totalRequests++;
    
    if (success) {
      this.stats.successfulRequests++;
      this.stats.averageResponseTime = 
        (this.stats.averageResponseTime * (this.stats.successfulRequests - 1) + responseTime) / 
        this.stats.successfulRequests;
    } else {
      this.stats.failedRequests++;
      if (error && error.includes('proxy')) {
        this.stats.proxyErrors++;
      }
    }

    // Statistics for each proxy
    if (!this.stats.requestsByProxy.has(proxyIndex)) {
      this.stats.requestsByProxy.set(proxyIndex, { success: 0, failed: 0 });
    }
    
    const proxyStats = this.stats.requestsByProxy.get(proxyIndex);
    success ? proxyStats.success++ : proxyStats.failed++;
  }

  getReport() {
    const total = this.stats.totalRequests;
    const successRate = total > 0
      ? (this.stats.successfulRequests / total * 100).toFixed(2)
      : '0.00';
    
    return {
      ...this.stats,
      successRate: `${successRate}%`,
      averageResponseTime: `${this.stats.averageResponseTime.toFixed(0)}ms`
    };
  }
}

// Usage
const monitor = new ProxyMonitor();

async function monitoredScrape(url, proxyIndex) {
  const startTime = Date.now();
  try {
    // ... scraping code ...
    const responseTime = Date.now() - startTime;
    monitor.recordRequest(proxyIndex, true, responseTime);
  } catch (error) {
    monitor.recordRequest(proxyIndex, false, 0, error.message);
    throw error;
  }
}

Advanced Scenarios: Geolocation and Fingerprinting

Modern anti-bot systems check not only the IP address but also the compliance with geolocation, time zone, browser language, and other parameters. When using proxies from another country, it is important to configure all these parameters correctly.

Setting Geolocation and Language for Proxies

const { chromium } = require('playwright');

async function createContextWithGeo(proxy, geoData) {
  const browser = await chromium.launch({ proxy });
  
  const context = await browser.newContext({
    locale: geoData.locale,           // 'en-US', 'de-DE', 'fr-FR'
    timezoneId: geoData.timezone,     // 'America/New_York', 'Europe/Berlin'
    geolocation: {
      latitude: geoData.latitude,
      longitude: geoData.longitude
    },
    permissions: ['geolocation']
  });

  return { browser, context };
}

// Example: proxy from Germany
const germanyProxy = {
  server: 'http://de-proxy.example.com:8080',
  username: 'user',
  password: 'pass'
};

const germanyGeo = {
  locale: 'de-DE',
  timezone: 'Europe/Berlin',
  latitude: 52.520008,
  longitude: 13.404954
};

const { browser, context } = await createContextWithGeo(germanyProxy, germanyGeo);
const page = await context.newPage();

await page.goto('https://www.google.com');
// Google will show the German version with results for Berlin

This code sets up the browser to appear as a real user from Germany: German interface language, Berlin time zone, and coordinates of central Berlin.

Complete Fingerprint Setup

async function createStealthContext(proxy, profile) {
  // Keep a reference to the browser so the caller can close it later
  const browser = await chromium.launch({ proxy });

  const context = await browser.newContext({
    locale: profile.locale,
    timezoneId: profile.timezone,
    userAgent: profile.userAgent,
    viewport: profile.viewport,
    deviceScaleFactor: profile.deviceScaleFactor,
    isMobile: profile.isMobile,
    hasTouch: profile.hasTouch,
    colorScheme: profile.colorScheme,
    geolocation: profile.geolocation,
    permissions: ['geolocation']
  });

  return { browser, context };
}

// Profile for Windows 10 + Chrome from the USA
const desktopUSProfile = {
  locale: 'en-US',
  timezone: 'America/New_York',
  userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
  viewport: { width: 1920, height: 1080 },
  deviceScaleFactor: 1,
  isMobile: false,
  hasTouch: false,
  colorScheme: 'light',
  geolocation: { latitude: 40.7128, longitude: -74.0060 }
};

// Profile for iPhone from the UK
const mobileUKProfile = {
  locale: 'en-GB',
  timezone: 'Europe/London',
  userAgent: 'Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1',
  viewport: { width: 390, height: 844 },
  deviceScaleFactor: 3,
  isMobile: true,
  hasTouch: true,
  colorScheme: 'light',
  geolocation: { latitude: 51.5074, longitude: -0.1278 }
};

Comprehensive fingerprint setup reduces the likelihood of automation detection. It is important that all parameters match each other: for example, a mobile User-Agent should go along with a mobile screen resolution.

Important: When working with advertising platforms (Google Ads, Facebook Ads) or financial services, we recommend using mobile proxies — they have a trust score of real mobile operators and are almost never blocked.

Bypassing WebRTC Leak

// Puppeteer: force WebRTC to use only proxied connections
const browser = await puppeteer.launch({
  args: [
    '--proxy-server=http://proxy.example.com:8080',
    '--force-webrtc-ip-handling-policy=disable_non_proxied_udp'
  ]
});

// Playwright: overriding WebRTC API
const context = await browser.newContext({ proxy: proxyConfig });

await context.addInitScript(() => {
  // Block RTCPeerConnection
  window.RTCPeerConnection = undefined;
  window.RTCDataChannel = undefined;
  window.RTCSessionDescription = undefined;
  
  // Override getUserMedia
  navigator.mediaDevices.getUserMedia = undefined;
  navigator.getUserMedia = undefined;
});

WebRTC can reveal your real IP address even when using a proxy. The snippets above mitigate this by preventing WebRTC from opening non-proxied connections or by disabling its APIs in the page entirely.

Comparison of Puppeteer and Playwright for Proxy Work

  • Proxy setup: Puppeteer — command-line arguments; Playwright — configuration object
  • Authentication: Puppeteer — page.authenticate() after launch; Playwright — in the proxy object at creation
  • Multiple proxies in one browser: Puppeteer — no, a separate browser is needed; Playwright — yes, through different contexts
  • Browser support: Puppeteer — only Chromium/Chrome; Playwright — Chromium, Firefox, WebKit
  • Performance: Puppeteer — fast launch of a single browser; Playwright — more efficient with multiple contexts
  • TypeScript: Puppeteer — bundled type definitions (older versions used @types/puppeteer); Playwright — built-in TypeScript support
  • Documentation: Puppeteer — good, many examples; Playwright — excellent, more structured
  • Ecosystem: Puppeteer — more plugins and extensions; Playwright — developing faster, new features

Recommendations for Choice

Choose Puppeteer if:

  • You are only working with Chrome/Chromium
  • You use one proxy at a time
  • You need maximum compatibility with existing tools
  • You already have a large codebase on Puppeteer

Choose Playwright if:

  • You need to work with Firefox or Safari (WebKit)
  • You require simultaneous use of multiple proxies
  • Performance is important when scaling
  • You are writing in TypeScript
  • You are starting a new project from scratch

Performance in Proxy Rotation

Test: scraping 100 pages with rotation of 10 proxies on a MacBook Pro M1:

  • Puppeteer (browser restart): 8 minutes 23 seconds, ~1.2 GB peak RAM
  • Playwright (browser restart): 7 minutes 54 seconds, ~1.1 GB peak RAM
  • Playwright (context pool): 4 minutes 12 seconds, ~800 MB stable RAM

Playwright with a context pool is almost twice as fast due to the absence of overhead from launching the browser. This is critical when scraping thousands of pages.

Conclusion

Integrating proxies with Puppeteer and Playwright is standard practice for web scraping, testing, and automation. Puppeteer offers simplicity and a wide ecosystem, while Playwright provides a modern API and better performance when working with multiple proxies through browser contexts.

Key points we covered:

  • Basic setup of HTTP, HTTPS, and SOCKS5 proxies in both frameworks
  • Authentication with username and password
  • Proxy rotation for large-scale scraping
  • Error handling and validation of proxies before use
  • Setting geolocation and fingerprinting to bypass anti-bot systems
  • Performance comparison of different approaches

For production solutions, we recommend combining quality proxies with proper fingerprinting setup, error handling, and monitoring. This will ensure stable operation of the scraper even on protected sites.

If you plan to scrape commercial sites, marketplaces, or work with advertising platforms, we recommend using residential proxies — they provide maximum anonymity and minimal risk of blocks due to real home user IP addresses.