
How to Set Up Proxies in Python Requests for Web Scraping, API, and Automation: Complete Guide with Code Examples

A complete guide to connecting proxies in the Python requests library — from basic setup to IP rotation and bypassing blocks during scraping and automation.

📅 April 3, 2026

If your Python script is getting 403 errors, CAPTCHAs, or IP bans, the target site has already noticed you. Connecting a proxy to the requests library solves this problem: you change your IP address, bypass geographic restrictions, and distribute the load across multiple addresses. This guide covers everything from a basic connection to advanced rotation, with working code examples.

Why use proxies in Python scripts

Most websites and APIs track the IP addresses of incoming requests. If one address makes 100+ requests per minute — it gets blocked. This is standard bot protection used by Wildberries, Ozon, Avito, Google, Instagram, and hundreds of other platforms. A proxy allows you to route requests through an intermediate server with a different IP address, making you invisible to protection systems.

Here are the main tasks where proxies in Python are critically necessary:

  • Scraping marketplaces — collecting prices from Wildberries, Ozon, Yandex.Market without IP blocks
  • Competitor monitoring — regular requests to competitors' websites every 5–15 minutes
  • Working with rate-limited APIs — distributing requests across multiple IPs to avoid exceeding the rate limit
  • Geolocation testing — checking how a website looks from different countries and regions
  • Form and registration automation — creating accounts or filling out forms from different IPs
  • SEO monitoring — tracking rankings from different regions in Russia and other countries

Without proxies, even a well-written scraper will hit a block within a few hours of operation. With properly configured IP rotation, the same script can run for weeks without interruption.

Basic proxy setup in requests

The requests library supports proxies natively; no additional packages are needed for HTTP and HTTPS. A proxy is passed via the proxies dictionary argument of any request method.

The simplest example is an HTTP proxy for a single request:

import requests

proxies = {
    "http": "http://123.45.67.89:8080",
    "https": "http://123.45.67.89:8080",
}

response = requests.get("https://httpbin.org/ip", proxies=proxies)
print(response.json())
# {'origin': '123.45.67.89'}
  

Note: specify both keys in the proxies dictionary, http and https. If you set only one, requests over the other protocol will go out directly, without a proxy. This is a common beginner mistake that leaks your real IP anyway.

To verify that the proxy works, use the service httpbin.org/ip: it returns the IP address the request came from. If the response shows the proxy server's IP instead of your own, everything is set up correctly.
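To make the both-keys rule hard to forget, you can wrap it in a tiny helper. A minimal sketch (the function name is my own, not part of requests):

```python
def build_proxies(proxy_url: str) -> dict:
    """Return a proxies dict that covers BOTH protocols,
    so neither http:// nor https:// requests bypass the proxy."""
    return {"http": proxy_url, "https": proxy_url}

proxies = build_proxies("http://123.45.67.89:8080")
print(proxies)
# {'http': 'http://123.45.67.89:8080', 'https': 'http://123.45.67.89:8080'}
```

Passing every proxy URL through one helper means a forgotten https key can never silently reintroduce a direct connection.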

HTTP, HTTPS, and SOCKS5 proxies: differences and code examples

Proxies come in different types, and each is suitable for its own tasks. In the context of Python requests, it is important to understand the difference between the three main protocols:

| Type | Protocol in URL | Speed | UDP support | Best scenario |
|---|---|---|---|---|
| HTTP | http:// | High | No | Scraping HTTP sites |
| HTTPS | https:// | High | No | Scraping secure sites |
| SOCKS5 | socks5:// | Medium | Yes | Full anonymity, any protocol |

To work with SOCKS5 in Python, you need to install an additional package:

pip install "requests[socks]"
# or separately:
pip install PySocks
  

After installation, connecting a SOCKS5 proxy looks like this:

import requests

# SOCKS5 proxy
proxies = {
    "http": "socks5://123.45.67.89:1080",
    "https": "socks5://123.45.67.89:1080",
}

response = requests.get("https://httpbin.org/ip", proxies=proxies)
print(response.json())
  

SOCKS5 is the preferred protocol for tasks where anonymity is important. Unlike HTTP proxies, SOCKS5 does not add X-Forwarded-For headers, which can reveal your real IP. If you also want DNS lookups to happen on the proxy side (avoiding DNS leaks), use the socks5h:// scheme instead of socks5://.

Proxies with username and password authentication

Most paid proxy services use username and password authentication. This is standard practice — without authorization, the proxy simply won't allow your request. In the requests library, the authentication data is passed directly in the proxy URL.

import requests

# Format: protocol://username:password@host:port
proxy_url = "http://myuser:[email protected]:8080"

proxies = {
    "http": proxy_url,
    "https": proxy_url,
}

response = requests.get("https://httpbin.org/ip", proxies=proxies)
print(response.status_code)
print(response.json())
  

If your password or username contains special characters (e.g., @, #, %), you need to URL-encode them. For this, use the urllib.parse module:

import requests
from urllib.parse import quote

username = "myuser"
password = "p@ss#word!"  # Special characters

# URL-encode username and password
encoded_user = quote(username, safe="")
encoded_pass = quote(password, safe="")

proxy_url = f"http://{encoded_user}:{encoded_pass}@123.45.67.89:8080"

proxies = {
    "http": proxy_url,
    "https": proxy_url,
}

response = requests.get("https://httpbin.org/ip", proxies=proxies)
print(response.json())
  

💡 Security Tip

Never hardcode your username and password directly in the script code. Use environment variables or a .env file with the python-dotenv library. This way, you can avoid accidental credential leaks when publishing code on GitHub.

Proxy rotation: automatic IP switching for scraping

One proxy is still just one IP address, which can be blocked. The real protection against bans is rotation: each request (or every N requests) goes out with a new IP. Below are several approaches to implementing rotation.

Method 1: Random selection from a list

import requests
import random

# List of proxies
proxy_list = [
    "http://user:[email protected]:8080",
    "http://user:[email protected]:8080",
    "http://user:[email protected]:8080",
    "http://user:[email protected]:8080",
]

def get_random_proxy():
    proxy = random.choice(proxy_list)
    return {"http": proxy, "https": proxy}

# Scraping 10 pages with IP rotation
urls = [f"https://example.com/page/{i}" for i in range(1, 11)]

for url in urls:
    proxies = get_random_proxy()
    try:
        response = requests.get(url, proxies=proxies, timeout=10)
        print(f"URL: {url} | IP: {proxies['http'].split('@')[1]} | Status: {response.status_code}")
    except requests.RequestException as e:
        print(f"Error: {e}")
  

Method 2: Cyclic rotation using itertools

import requests
import itertools

proxy_list = [
    "http://user:[email protected]:8080",
    "http://user:[email protected]:8080",
    "http://user:[email protected]:8080",
]

# Create an infinite cycle through the list of proxies
proxy_cycle = itertools.cycle(proxy_list)

def get_next_proxy():
    proxy = next(proxy_cycle)
    return {"http": proxy, "https": proxy}

# Each request uses the next proxy in the cycle
for i in range(9):
    proxies = get_next_proxy()
    response = requests.get("https://httpbin.org/ip", proxies=proxies, timeout=10)
    print(f"Request {i+1}: {response.json()['origin']}")
  

For industrial tasks with thousands of requests per hour, it is recommended to use residential proxies with built-in automatic rotation — the provider changes the IP for each request through a single endpoint, and you don't need to manage the list of addresses manually.
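With a provider-side rotating endpoint, your code points at one fixed host and port, and the provider swaps the exit IP behind it. A minimal sketch; the hostname, port, and credentials below are placeholders, not a real service:

```python
# One fixed endpoint; the provider rotates the exit IP behind it.
# Hostname, port, and credentials here are hypothetical placeholders.
ROTATING_ENDPOINT = "http://user:[email protected]:8000"

proxies = {"http": ROTATING_ENDPOINT, "https": ROTATING_ENDPOINT}

# Each request can then leave through a different IP even though
# the proxies dict never changes, e.g.:
# requests.get("https://httpbin.org/ip", proxies=proxies, timeout=10)
```

The rotation logic from Methods 1 and 2 disappears entirely: the script holds a single URL, and IP management becomes the provider's job.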

Proxies through Session: persistent connections and cookies

When you need to make several requests within one session (for example, logging in and then making requests), use the requests.Session() object. It saves cookies, headers, and proxy settings between requests — you don't need to pass the proxy in each call separately.

import requests

# Create a session with a proxy
session = requests.Session()
session.proxies = {
    "http": "http://user:[email protected]:8080",
    "https": "http://user:[email protected]:8080",
}

# Add headers to simulate a browser
session.headers.update({
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
})

# Step 1: Authorization
login_data = {"username": "myuser", "password": "mypass"}
session.post("https://example.com/login", data=login_data)

# Step 2: Requests already with cookies and through the proxy
response = session.get("https://example.com/dashboard")
print(response.status_code)

# Step 3: Close the session
session.close()
  

Using Session is also more efficient in terms of performance: the TCP connection is reused rather than opened anew for each request. When scraping 1000+ pages, this provides a noticeable speed boost.
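The size of that connection pool can be tuned explicitly by mounting an HTTPAdapter on the session. A sketch with illustrative numbers:

```python
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
session.proxies = {
    "http": "http://user:[email protected]:8080",
    "https": "http://user:[email protected]:8080",
}

# Keep up to 20 pooled connections per host instead of the default 10,
# so concurrent scraping reuses TCP connections rather than reopening them.
adapter = HTTPAdapter(pool_connections=20, pool_maxsize=20)
session.mount("http://", adapter)
session.mount("https://", adapter)
```

Larger pools mainly help when you parallelize requests with threads; for a sequential script the defaults are fine.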

Error handling, timeouts, and automatic retries

Proxy servers may be unavailable, respond slowly, or return connection errors. A reliable scraping script should be able to handle these situations and automatically switch to another proxy upon failure.

import requests
import random
import time

proxy_list = [
    "http://user:[email protected]:8080",
    "http://user:[email protected]:8080",
    "http://user:[email protected]:8080",
]

def fetch_with_retry(url, max_retries=3, timeout=10):
    """
    Makes a request with automatic proxy switching on error.
    Returns a Response object or None if attempts are exhausted.
    """
    available_proxies = proxy_list.copy()
    random.shuffle(available_proxies)

    for attempt, proxy_url in enumerate(available_proxies[:max_retries], 1):
        proxies = {"http": proxy_url, "https": proxy_url}
        try:
            response = requests.get(
                url,
                proxies=proxies,
                timeout=timeout,
                headers={"User-Agent": "Mozilla/5.0"}
            )
            response.raise_for_status()  # Raises an exception for 4xx/5xx
            print(f"✓ Success on attempt {attempt}")
            return response

        except requests.exceptions.ProxyError:
            print(f"✗ Attempt {attempt}: proxy unavailable — {proxy_url}")
        except requests.exceptions.Timeout:
            print(f"✗ Attempt {attempt}: timeout — {proxy_url}")
        except requests.exceptions.HTTPError as e:
            print(f"✗ Attempt {attempt}: HTTP error {e.response.status_code}")
            if e.response.status_code == 403:
                print("  → Received a ban, trying the next proxy...")
        except requests.exceptions.RequestException as e:
            print(f"✗ Attempt {attempt}: general error — {e}")

        time.sleep(1)  # Pause between attempts

    print(f"✗ All {max_retries} attempts exhausted for {url}")
    return None

# Usage
result = fetch_with_retry("https://httpbin.org/ip")
if result:
    print(result.json())
  

Note the raise_for_status() — this method automatically raises an exception for HTTP statuses 4xx and 5xx. Without it, the script will consider even a response with a 403 (ban) or 429 (rate limit exceeded) as successful.
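As an alternative to a hand-written loop, transport-level retries can be delegated to urllib3's Retry class mounted via an HTTPAdapter. This sketch retries the connection itself with exponential backoff, but it does not switch proxies between attempts, so it complements rather than replaces the function above:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

retry = Retry(
    total=3,                                      # up to 3 retries per request
    backoff_factor=1,                             # exponential backoff between attempts
    status_forcelist=[429, 500, 502, 503, 504],   # also retry on these statuses
)

session = requests.Session()
adapter = HTTPAdapter(max_retries=retry)
session.mount("http://", adapter)
session.mount("https://", adapter)
session.proxies = {
    "http": "http://user:[email protected]:8080",
    "https": "http://user:[email protected]:8080",
}
```

With this in place, every session.get() transparently retries failed connections before raising an exception to your code.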

Proxies through environment variables: safe data storage

The requests library automatically reads the HTTP_PROXY and HTTPS_PROXY environment variables (the lowercase http_proxy and https_proxy work as well). This lets you keep credentials out of your code and switch proxies without changing the script.

Setting variables in the terminal (Linux/macOS):

export HTTP_PROXY="http://user:[email protected]:8080"
export HTTPS_PROXY="http://user:[email protected]:8080"
export NO_PROXY="localhost,127.0.0.1"
  

Or through a .env file with the python-dotenv library:

# .env file (add to .gitignore!)
HTTP_PROXY=http://user:[email protected]:8080
HTTPS_PROXY=http://user:[email protected]:8080
  
# Python script
from dotenv import load_dotenv
import requests
import os

load_dotenv()  # Load variables from .env

# requests automatically uses HTTP_PROXY and HTTPS_PROXY
response = requests.get("https://httpbin.org/ip")
print(response.json())

# Or explicitly from environment variables:
proxies = {
    "http": os.getenv("HTTP_PROXY"),
    "https": os.getenv("HTTPS_PROXY"),
}
response = requests.get("https://httpbin.org/ip", proxies=proxies)
print(response.json())
  

⚠️ Important: the NO_PROXY variable

The NO_PROXY variable allows you to exclude certain addresses from proxying. Be sure to add localhost and 127.0.0.1 so that local requests do not go through the proxy.
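You can check which proxies requests would pick up from the environment, including the NO_PROXY exclusion, without sending any traffic, using requests.utils.get_environ_proxies. A small sketch:

```python
import os
import requests.utils

# Simulate the environment from the shell example above
os.environ["http_proxy"] = "http://user:[email protected]:8080"
os.environ["https_proxy"] = "http://user:[email protected]:8080"
os.environ["no_proxy"] = "localhost,127.0.0.1"

# External hosts would go through the proxy...
print(requests.utils.get_environ_proxies("http://example.com/"))

# ...but localhost is excluded, so an empty dict is returned
print(requests.utils.get_environ_proxies("http://localhost:8000/"))
# {}
```

This is a convenient sanity check to run before a long scraping job: if the first call prints an empty dict, your environment variables are not being picked up at all.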

Real scenarios: scraping marketplaces, working with APIs, and automation

Let's consider three practical scenarios that developers often encounter.

Scenario 1: Scraping prices from a marketplace

When monitoring prices on Wildberries or Ozon, it is important to simulate the behavior of a real user: sending the correct browser headers, adding delays between requests, and rotating IPs. For this task, data center proxies are well-suited — they are fast and cheap when working with large volumes of data.

import requests
import time
import random

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Accept": "application/json, text/plain, */*",
    "Accept-Language": "en-US,en;q=0.9",
    "Referer": "https://www.wildberries.ru/",
}

PROXIES = [
    {"http": "http://user:[email protected]:8080",
     "https": "http://user:[email protected]:8080"},
    {"http": "http://user:[email protected]:8080",
     "https": "http://user:[email protected]:8080"},
]

def get_product_price(article_id: int) -> dict:
    """Gets the price of a product by article from Wildberries."""
    url = f"https://card.wb.ru/cards/v1/detail?appType=1&curr=rub&nm={article_id}"
    proxies = random.choice(PROXIES)

    try:
        resp = requests.get(url, headers=HEADERS, proxies=proxies, timeout=15)
        resp.raise_for_status()
        data = resp.json()
        product = data["data"]["products"][0]
        return {
            "id": product["id"],
            "name": product["name"],
            "price": product["salePriceU"] / 100,  # price in kopecks
        }
    except (requests.RequestException, KeyError, IndexError) as e:
        return {"error": str(e)}

# Scraping several articles with a delay
articles = [12345678, 87654321, 11223344]
for article in articles:
    result = get_product_price(article)
    print(result)
    time.sleep(random.uniform(1.5, 3.0))  # Random delay of 1.5-3 seconds
  

Scenario 2: Working with an API through a proxy

Some APIs limit the number of requests from one IP (rate limiting). Distributing requests across multiple proxies allows you to bypass this limitation:

import requests
import itertools
from typing import Optional

class ProxyAPIClient:
    """Client for working with APIs through proxy rotation."""

    def __init__(self, api_key: str, proxy_list: list):
        self.api_key = api_key
        self.proxy_cycle = itertools.cycle(proxy_list)
        self.session = requests.Session()
        self.session.headers.update({
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        })

    def _get_proxy(self) -> dict:
        proxy = next(self.proxy_cycle)
        return {"http": proxy, "https": proxy}

    def get(self, url: str, **kwargs) -> Optional[dict]:
        proxies = self._get_proxy()
        try:
            resp = self.session.get(url, proxies=proxies, timeout=10, **kwargs)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException as e:
            print(f"API request failed: {e}")
            return None

# Usage
proxy_list = [
    "http://user:[email protected]:8080",
    "http://user:[email protected]:8080",
]

client = ProxyAPIClient(api_key="your_api_key", proxy_list=proxy_list)
data = client.get("https://api.example.com/products")
  

Scenario 3: Geolocation testing

Marketers and SEO specialists often check how a website looks from different regions. Proxies from the desired locations let you automate this process:

import requests

# Proxies from different regions
regional_proxies = {
    "Moscow":        "http://user:[email protected]:8080",
    "Saint Petersburg": "http://user:[email protected]:8080",
    "Novosibirsk":   "http://user:[email protected]:8080",
    "USA":           "http://user:[email protected]:8080",
}

url = "https://example.com/prices"

for region, proxy_url in regional_proxies.items():
    proxies = {"http": proxy_url, "https": proxy_url}
    try:
        resp = requests.get(url, proxies=proxies, timeout=15)
        print(f"[{region}] Status: {resp.status_code} | "
              f"Size: {len(resp.content)} bytes")
    except requests.RequestException as e:
        print(f"[{region}] Error: {e}")
  

Which type of proxy to choose for your task

The choice of proxy type directly affects the success of your project. A cheap data center proxy may work excellently for some tasks and completely fail for others. Here is a practical guide for selection:

| Task | Proxy type | Why |
|---|---|---|
| Scraping marketplaces (Wildberries, Ozon) | Residential | Look like regular users, banned less often |
| Scraping open data, news | Data center | Fast, cheap, sufficiently anonymous |
| Working with Facebook API, Instagram | Mobile | Social networks trust mobile IPs the most |
| Geolocation testing | Residential with geotargeting | Accurate geolocation, real IPs from the target region |
| High-load scraping (10k+ requests/hour) | Data center (pool) | Speed and cost at large volumes |
| Authorization and working with accounts | Residential or mobile | Fewer triggers for anti-fraud systems |

For tasks where maximum reliability and minimal risk of blocking when working with secure sites are important, developers most often choose mobile proxies — they use IP addresses from real mobile operators (MTS, Beeline, Megafon), which are extremely rarely blacklisted.

Proxy checklist before use

  • ✅ Check the IP through httpbin.org/ip: the response must show the proxy's IP, not your real address
  • ✅ Check the speed — response time should not exceed 2-3 seconds
  • ✅ Ensure the proxy is not on blacklists through blocklist.de or ipqualityscore.com
  • ✅ Check geolocation through ipinfo.io — does it match the expected region?
  • ✅ Test on the target site with one request before running the full script
  • ✅ Ensure that HTTPS traffic also goes through the proxy (both keys in the dictionary)
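The last checklist item can be enforced in code. A small validation helper (my own sketch, not part of requests) that rejects a proxies dict missing either key or using an unrecognized scheme:

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https", "socks5", "socks5h"}

def validate_proxies(proxies: dict) -> list:
    """Return a list of problems with a proxies dict (empty list = OK)."""
    problems = []
    for key in ("http", "https"):
        if key not in proxies:
            problems.append(f"missing '{key}' key: that traffic will bypass the proxy")
    for key, url in proxies.items():
        scheme = urlparse(url).scheme
        if scheme not in ALLOWED_SCHEMES:
            problems.append(f"'{key}' uses unknown scheme '{scheme}'")
    return problems

print(validate_proxies({"http": "http://1.2.3.4:8080"}))
# ["missing 'https' key: that traffic will bypass the proxy"]
print(validate_proxies({"http": "http://1.2.3.4:8080",
                        "https": "http://1.2.3.4:8080"}))
# []
```

Calling this once at script startup turns the most common configuration mistake into an immediate, readable error instead of a silent IP leak.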

Conclusion

Setting up proxies in Python requests is not difficult, but it requires attention to detail. The main principles to remember are: always specify both keys (http and https) in the proxy dictionary, use Session for multi-step scenarios, always handle errors and timeouts, and store credentials in environment variables, not in code.

For industrial scraping with thousands of requests per day, a manual list of proxies is not enough — rotation is needed. If you are scraping protected marketplaces like Wildberries or Ozon, working with social networks, or testing geolocation, we recommend trying residential proxies — they provide a high level of trust from anti-bot systems and support automatic IP rotation through a single endpoint, which significantly simplifies your script's code.