What to do if proxies are constantly banned: complete guide to diagnosing and solving the problem
Constant proxy blocks are one of the most common problems in scraping, automation, and multi-account work. In this article, we'll look at why this happens and how to solve the problem systematically, instead of endlessly switching providers and hoping for a miracle.
Why proxies are actually banned
Before looking for a solution, you need to understand the mechanics of blocks. Modern anti-fraud systems use multi-level protection, and proxy bans are just a consequence, not a cause. Understanding how these systems work allows you to build an effective bypass strategy.
IP reputation and blacklists
Each IP address has a reputation formed by its usage history. If an address was previously used for spam, DDoS attacks, or mass scraping, it ends up in databases like Spamhaus, SORBS, or the proprietary lists of specific services. When you connect through such an IP, the system treats you with suspicion from the first request.
Data center proxies are particularly susceptible to this problem. Entire subnets can be marked as "hosting," and any traffic from them automatically receives an elevated level of scrutiny. Amazon AWS, Google Cloud, DigitalOcean — their IP ranges are well-known and often blocked preemptively.
You can check IP reputation through services like IPQualityScore, Scamalytics, or AbuseIPDB. If your proxy shows a fraud score above 75, this is your problem: change your provider or proxy type.
Request patterns
A human doesn't make 100 requests per second. A human doesn't navigate pages with perfect 2-second periodicity. A human doesn't ignore images, CSS, and JavaScript, requesting only HTML. Anti-fraud systems analyze exactly these patterns, and any deviation from "human" behavior increases the risk of blocking.
The timing between requests is particularly telling: a stable interval is a clear sign of automation. Adding random delays (for example, from 1 to 5 seconds) significantly reduces the likelihood of detection.
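A minimal sketch of the idea (the 1-5 second bounds are arbitrary and should be tuned per target; a fuller version appears in the behavioral-patterns section below):

```python
import random
import time

def jittered_sleep(min_sec=1.0, max_sec=5.0):
    """Pause for a random interval so requests don't arrive on a fixed beat."""
    time.sleep(random.uniform(min_sec, max_sec))
```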
Metadata mismatch
When your User-Agent says you're using Chrome on Windows, but HTTP headers reveal Python requests characteristics — that's a red flag. When an IP address is geolocated to Germany, but browser language settings indicate Russian — another red flag. When the timezone in JavaScript doesn't match the IP geography — a third flag.
The accumulation of such mismatches leads to the system classifying the connection as suspicious and applying protective measures: from CAPTCHAs to complete IP blocking.
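For illustration, here is what internally consistent metadata looks like at the HTTP level. The header values are examples assuming a German exit IP, not a recipe:

```python
import requests

proxies = {"http": "http://user:pass@proxy-server:port",
           "https": "http://user:pass@proxy-server:port"}

# With a German exit IP, the declared identity should agree with it:
# a plausible desktop UA, a full browser-like header set, and an
# Accept-Language that matches the geography.
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "de-DE,de;q=0.9,en-US;q=0.8",
}
response = requests.get("https://example.com", headers=headers, proxies=proxies)
```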
Browser fingerprint
Modern protection systems collect dozens of browser parameters: screen resolution, installed fonts, plugins, WebGL rendering, audio context, and much more. The combination of these parameters creates a unique "fingerprint" that remains constant even when changing IPs.
If you change proxies but the fingerprint remains the same, the system understands that it's the same user. And if one fingerprint appears from hundreds of different IPs in a short time — this is a clear sign of automation.
Diagnostics: how to understand the cause of blocking
Before changing settings blindly, conduct diagnostics. This will save hours of experimentation and help you find the real cause of the problem. A systematic approach to diagnostics is the key to effective solutions.
Step 1: Check the proxy itself
Start with basic proxy functionality checks independent of your main script:
```python
import requests

proxy = {
    "http": "http://user:pass@proxy-server:port",
    "https": "http://user:pass@proxy-server:port"
}

# Check basic functionality
try:
    response = requests.get("https://httpbin.org/ip", proxies=proxy, timeout=10)
    proxy_ip = response.json()["origin"]
    print(f"IP through proxy: {proxy_ip}")
except Exception as e:
    print(f"Connection error: {e}")
else:
    # Check for real IP leaks: fetch the same endpoint without the proxy
    # and make sure the two addresses differ
    # (for a manual check you can also open https://browserleaks.com/ip)
    real_ip = requests.get("https://httpbin.org/ip", timeout=10).json()["origin"]
    if real_ip == proxy_ip:
        print("Warning: the proxy is not masking your real IP")
```
If the proxy doesn't work even on simple requests, the problem is with the proxy itself or the credentials. Check the connection string format, your account balance, and the provider's limits.
Step 2: Check IP reputation
Use multiple services for comprehensive assessment:
```python
# Get the proxy's external IP
proxy_ip = requests.get("https://api.ipify.org", proxies=proxy).text

# Check it on these services (the first one takes the IP via its web form):
# https://www.ipqualityscore.com/free-ip-lookup-proxy-vpn-test
print(f"https://scamalytics.com/ip/{proxy_ip}")
print(f"https://www.abuseipdb.com/check/{proxy_ip}")
print(f"https://whatismyipaddress.com/ip/{proxy_ip}")
```
Pay attention to the following indicators: fraud score (should be below 50), IP type (residential is better than datacenter), and presence in blacklists. If the IP is flagged as VPN/Proxy, many sites will be suspicious of it from the start.
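If you want this check inside a pipeline rather than in a browser tab, IPQualityScore also offers a JSON API. A hedged sketch: the endpoint format and field names below follow their public documentation as I understand it, so verify before relying on it:

```python
import requests

API_KEY = "your-ipqs-api-key"

def ip_looks_clean(ip):
    """Query IPQualityScore's proxy-detection API for an IP's reputation."""
    url = f"https://ipqualityscore.com/api/json/ip/{API_KEY}/{ip}"
    data = requests.get(url, timeout=10).json()
    print(f"fraud_score={data.get('fraud_score')}, "
          f"proxy={data.get('proxy')}, vpn={data.get('vpn')}")
    return data.get("fraud_score", 100) < 50  # threshold from the text above
```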
Step 3: Isolate the problem
Try the same proxy on different target sites. If blocking occurs everywhere — the problem is with the proxy or your configuration. If only on a specific site — the problem is with that site's protection or your behavior on it.
Also try different proxies on one site. If all of them get blocked, the problem is not the proxies but your script, fingerprint, or behavior pattern. This is a critically important test that many skip; the sketch below automates both checks.
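A minimal sketch, with placeholder proxies and URLs:

```python
import requests

proxies_to_test = ["http://user:pass@proxy1:port", "http://user:pass@proxy2:port"]
sites_to_test = ["https://site-a.example", "https://site-b.example"]

# A column of failures points at the site, a row of failures points at
# that proxy, and failures everywhere point at your own configuration.
for p in proxies_to_test:
    for url in sites_to_test:
        try:
            r = requests.get(url, proxies={"http": p, "https": p}, timeout=10)
            print(f"{p} -> {url}: {r.status_code}")
        except requests.RequestException as e:
            print(f"{p} -> {url}: error ({e})")
```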
Step 4: Analyze server responses
Different types of blocks manifest differently. Learn to distinguish them:
```python
def analyze_response(response):
    status = response.status_code
    if status == 403:
        print("Access denied: the IP may be blacklisted")
    elif status == 429:
        print("Too many requests: reduce frequency")
    elif status == 503:
        print("Service unavailable: possible DDoS protection")
    elif status == 407:
        print("Proxy authentication required: check credentials")
    elif "captcha" in response.text.lower():
        print("CAPTCHA detected: bot suspicion")
    elif "blocked" in response.text.lower():
        print("Explicit blocking: change IP and reconsider approach")
    elif len(response.text) < 1000:
        print("Suspiciously short response: possibly a stub page")
    else:
        print(f"Status {status}, response length: {len(response.text)}")
```
Proper rotation: frequency, logic, implementation
Proxy rotation is not just "change the IP more often." Improper rotation can do more harm than no rotation at all. Let's consider the different strategies and when to apply each.
Strategy 1: Rotation by request count
The simplest approach is to change the IP after a certain number of requests. It suits scraping tasks where session persistence is not needed:
```python
import random
import requests

class ProxyRotator:
    def __init__(self, proxy_list, requests_per_proxy=50):
        self.proxies = proxy_list
        self.requests_per_proxy = requests_per_proxy
        self.current_proxy = None
        self.request_count = 0

    def get_proxy(self):
        if self.current_proxy is None or self.request_count >= self.requests_per_proxy:
            # Randomize the per-proxy request budget
            self.requests_per_proxy = random.randint(30, 70)
            self.current_proxy = random.choice(self.proxies)
            self.request_count = 0
        self.request_count += 1
        return self.current_proxy

# Usage (proxy_list and urls_to_scrape come from your own code)
rotator = ProxyRotator(proxy_list)
for url in urls_to_scrape:
    proxy = rotator.get_proxy()
    response = requests.get(url, proxies={"http": proxy, "https": proxy})
```
Note the randomness in the number of requests per proxy. A fixed number (for example, exactly 50) is a pattern that can be detected. A random range makes behavior less predictable.
Strategy 2: Time-based rotation
For tasks where session persistence is important (for example, working with accounts), it's better to tie IP to time:
```python
import time
import random

class TimeBasedRotator:
    def __init__(self, proxy_list, min_minutes=10, max_minutes=30):
        self.proxies = proxy_list
        self.min_seconds = min_minutes * 60
        self.max_seconds = max_minutes * 60
        self.current_proxy = None
        self.rotation_time = 0

    def get_proxy(self):
        current_time = time.time()
        if self.current_proxy is None or current_time >= self.rotation_time:
            self.current_proxy = random.choice(self.proxies)
            # Random interval until the next rotation
            interval = random.randint(self.min_seconds, self.max_seconds)
            self.rotation_time = current_time + interval
            print(f"New proxy, next rotation in {interval // 60} minutes")
        return self.current_proxy
```
Strategy 3: Sticky sessions for accounts
When working with multiple accounts, it's critical that each account uses a permanent IP. Changing IP for a logged-in account is a sure path to a ban:
```python
import random

class AccountProxyManager:
    def __init__(self, proxy_pool):
        self.proxy_pool = proxy_pool
        self.account_proxies = {}  # account_id -> proxy
        self.used_proxies = set()

    def get_proxy_for_account(self, account_id):
        # If the account already has a proxy, return it
        if account_id in self.account_proxies:
            return self.account_proxies[account_id]
        # Find a free proxy
        available = [p for p in self.proxy_pool if p not in self.used_proxies]
        if not available:
            raise Exception("No free proxies for new accounts")
        proxy = random.choice(available)
        self.account_proxies[account_id] = proxy
        self.used_proxies.add(proxy)
        return proxy

    def release_account(self, account_id):
        """Releases the proxy when an account is retired"""
        if account_id in self.account_proxies:
            proxy = self.account_proxies.pop(account_id)
            self.used_proxies.discard(proxy)

# Usage (residential_proxy_list and accounts come from your own code)
manager = AccountProxyManager(residential_proxy_list)
for account in accounts:
    proxy = manager.get_proxy_for_account(account.id)
    # All of this account's actions go through the same IP
```
Strategy 4: Adaptive rotation
The most advanced approach is to change proxies in response to signals from the target site:
```python
import random

class AdaptiveRotator:
    def __init__(self, proxy_list):
        self.proxies = proxy_list
        self.current_proxy = random.choice(proxy_list)
        self.proxy_scores = {p: 100 for p in proxy_list}  # Initial proxy "health"

    def get_proxy(self):
        return self.current_proxy

    def report_result(self, success, response_code=200):
        """Called after each request"""
        if success and response_code == 200:
            # Successful request: slightly increase the score
            self.proxy_scores[self.current_proxy] = min(
                100, self.proxy_scores[self.current_proxy] + 1)
        elif response_code == 429:
            # Rate limit: decrease significantly and rotate
            self.proxy_scores[self.current_proxy] -= 30
            self._rotate()
        elif response_code == 403:
            # Ban: zero the score and rotate
            self.proxy_scores[self.current_proxy] = 0
            self._rotate()
        elif response_code == 503:
            # Possible protection: decrease and rotate
            self.proxy_scores[self.current_proxy] -= 20
            self._rotate()

    def _rotate(self):
        # Keep only proxies that still look healthy
        available = [(p, s) for p, s in self.proxy_scores.items() if s > 20]
        if not available:
            # All proxies are "dead": reset the scores
            self.proxy_scores = {p: 50 for p in self.proxies}
            available = list(self.proxy_scores.items())
        # Pick the proxy with the best score
        self.current_proxy = max(available, key=lambda x: x[1])[0]
        print(f"Rotated to a proxy with score {self.proxy_scores[self.current_proxy]}")
```
Browser fingerprint and its role in blocks
A fingerprint is a set of browser characteristics that allows a site to identify a user even without cookies. If you change the IP but the fingerprint stays the same, the protection system easily links all your sessions together.
What fingerprint consists of
A modern fingerprint includes dozens of parameters. Here are the main categories:
| Category | Parameters | Weight in identification |
|---|---|---|
| User-Agent | Browser, version, OS | Medium |
| Screen | Resolution, color depth, pixel ratio | Medium |
| Fonts | List of installed fonts | High |
| WebGL | Renderer, vendor, rendering hash | Very high |
| Canvas | Hash of rendered image | Very high |
| Audio | AudioContext fingerprint | High |
| Timezone | Timezone, offset | Medium |
| Languages | navigator.languages | Medium |
| Plugins | navigator.plugins | Low (in modern browsers) |
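To see what a site can collect, you can dump several of these parameters yourself. A minimal sketch with Playwright (the properties queried are standard web APIs):

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com")
    # A few of the parameters from the table above, as a site sees them
    fp = page.evaluate("""() => ({
        userAgent: navigator.userAgent,
        languages: navigator.languages,
        screen: [screen.width, screen.height, screen.colorDepth],
        timezone: Intl.DateTimeFormat().resolvedOptions().timeZone,
    })""")
    print(fp)
    browser.close()
```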
Fingerprint and IP consistency
It's critical that the fingerprint matches the IP geography. If the proxy is in Germany, the fingerprint should look like a German user:
```javascript
// Example of an inconsistency (BAD):
// IP: Germany
// Timezone: America/New_York
// Languages: ["ru-RU", "ru"]
// This will raise suspicion

// Consistent fingerprint (GOOD):
// IP: Germany
// Timezone: Europe/Berlin
// Languages: ["de-DE", "de", "en-US", "en"]
```
Tools for fingerprint management
For serious work, use specialized tools:
Playwright with Stealth:
```python
from playwright.sync_api import sync_playwright
from playwright_stealth import stealth_sync

with sync_playwright() as p:
    browser = p.chromium.launch(
        proxy={"server": "http://proxy:port", "username": "user", "password": "pass"}
    )
    context = browser.new_context(
        viewport={"width": 1920, "height": 1080},
        locale="de-DE",
        timezone_id="Europe/Berlin",
        geolocation={"latitude": 52.52, "longitude": 13.405},
        permissions=["geolocation"]
    )
    page = context.new_page()
    stealth_sync(page)  # Apply stealth patches
    page.goto("https://target-site.com")
```
Puppeteer with puppeteer-extra:
```javascript
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());

(async () => {
  const browser = await puppeteer.launch({
    args: ['--proxy-server=http://proxy:port']
  });
  const page = await browser.newPage();

  // Override the timezone
  await page.evaluateOnNewDocument(() => {
    Object.defineProperty(Intl.DateTimeFormat.prototype, 'resolvedOptions', {
      value: function() {
        return { timeZone: 'Europe/Berlin' };
      }
    });
  });
})();
```
Anti-detect browsers
For account work, anti-detect browsers are often used (Multilogin, GoLogin, Dolphin Anty, and others). They allow creating isolated profiles with unique fingerprints. Each profile has its own set of parameters, cookies, localStorage — a completely isolated environment.
The advantage of anti-detect browsers is that they solve the fingerprint problem "out of the box." The disadvantage is cost and automation complexity (although many have APIs).
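Automating an anti-detect browser usually means asking its local API to start a profile and return a DevTools websocket, then attaching your framework to it. A sketch of the pattern: the port, path, and response field below are placeholders, since each product's API differs; Playwright's `connect_over_cdp` is the real attachment call:

```python
import requests
from playwright.sync_api import sync_playwright

# Hypothetical local-API call: port, path, and response shape differ
# between products (Multilogin, GoLogin, Dolphin Anty), so check your docs.
resp = requests.get("http://127.0.0.1:35000/start-profile?id=PROFILE_ID").json()
ws_endpoint = resp["wsEndpoint"]  # placeholder field name

with sync_playwright() as p:
    # Attach to the already-running, fingerprinted profile over CDP
    browser = p.chromium.connect_over_cdp(ws_endpoint)
    context = browser.contexts[0]
    page = context.pages[0] if context.pages else context.new_page()
    page.goto("https://target-site.com")
```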
Behavioral patterns: how not to look like a bot
Even with perfect fingerprint and clean IP, you can get banned due to non-human behavior. Modern systems analyze not only technical parameters but also patterns of interaction with the site.
Time delays
A human doesn't make requests at a constant interval. Add random delays; a long-tailed distribution (such as lognormal) looks more natural than a uniform one:
```python
import random
import time

import numpy as np
import requests

def human_delay(min_sec=1, max_sec=5, mean=2.5):
    """
    Generates a delay that looks human-like.
    Uses a lognormal distribution: most delays are short,
    but occasionally there are long ones.
    """
    delay = np.random.lognormal(mean=np.log(mean), sigma=0.5)
    delay = max(min_sec, min(max_sec, delay))
    return delay

def human_typing_delay():
    """Delay between key presses when typing text"""
    return random.uniform(0.05, 0.25)

# Usage (urls, proxy, and process() come from your own code)
for url in urls:
    response = requests.get(url, proxies=proxy)
    process(response)
    time.sleep(human_delay())  # Random pause between requests
```
Navigation imitation
A human rarely lands on a product page via a direct link. They start at the homepage, use search, browse categories. Imitate this path:
```python
import random

async def human_like_navigation(page, target_url):
    """Imitates a human-like path to the target page"""
    # 1. Go to the homepage
    await page.goto("https://example.com")
    await page.wait_for_timeout(random.randint(2000, 4000))

    # 2. Sometimes scroll the homepage
    if random.random() > 0.5:
        await page.evaluate("window.scrollBy(0, 300)")
        await page.wait_for_timeout(random.randint(1000, 2000))

    # 3. Use search or navigation
    if random.random() > 0.3:
        search_box = await page.query_selector('input[type="search"]')
        if search_box:
            await search_box.type("search query", delay=100)
            await page.keyboard.press("Enter")
            await page.wait_for_timeout(random.randint(2000, 4000))

    # 4. Go to the target page
    await page.goto(target_url)

    # 5. Scroll the page like a human
    await human_scroll(page)

async def human_scroll(page):
    """Imitates human-like scrolling"""
    scroll_height = await page.evaluate("document.body.scrollHeight")
    current_position = 0
    while current_position < scroll_height * 0.7:  # Not all the way down
        scroll_amount = random.randint(200, 500)
        await page.evaluate(f"window.scrollBy(0, {scroll_amount})")
        current_position += scroll_amount
        await page.wait_for_timeout(random.randint(500, 1500))
```
Mouse movements
Some systems track mouse movements. Straight-line movement from point A to point B is a sign of a bot. A human moves the mouse in a curve with micro-corrections:
```python
import random

import bezier  # pip install bezier
import numpy as np

def generate_human_mouse_path(start, end, num_points=50):
    """
    Generates a mouse path that looks human-like,
    using a Bezier curve with slight noise.
    """
    # Control points for the Bezier curve
    control1 = (
        start[0] + (end[0] - start[0]) * random.uniform(0.2, 0.4) + random.randint(-50, 50),
        start[1] + (end[1] - start[1]) * random.uniform(0.2, 0.4) + random.randint(-50, 50)
    )
    control2 = (
        start[0] + (end[0] - start[0]) * random.uniform(0.6, 0.8) + random.randint(-50, 50),
        start[1] + (end[1] - start[1]) * random.uniform(0.6, 0.8) + random.randint(-50, 50)
    )

    # Build the cubic Bezier curve
    nodes = np.asfortranarray([
        [start[0], control1[0], control2[0], end[0]],
        [start[1], control1[1], control2[1], end[1]]
    ])
    curve = bezier.Curve(nodes, degree=3)

    # Sample points along the curve
    points = []
    for t in np.linspace(0, 1, num_points):
        point = curve.evaluate(t)
        # Add micro-noise
        x = point[0][0] + random.uniform(-2, 2)
        y = point[1][0] + random.uniform(-2, 2)
        points.append((x, y))
    return points

async def human_click(page, selector):
    """Clicks an element with a human-like mouse movement"""
    element = await page.query_selector(selector)
    box = await element.bounding_box()

    # Target not the center but a random point inside the element
    target_x = box['x'] + random.uniform(box['width'] * 0.2, box['width'] * 0.8)
    target_y = box['y'] + random.uniform(box['height'] * 0.2, box['height'] * 0.8)

    # Current mouse position (or a random starting position)
    start_x = random.randint(0, 1920)
    start_y = random.randint(0, 1080)

    # Generate the path and move the mouse along it
    path = generate_human_mouse_path((start_x, start_y), (target_x, target_y))
    for x, y in path:
        await page.mouse.move(x, y)
        await page.wait_for_timeout(random.randint(5, 20))

    # Small pause before the click
    await page.wait_for_timeout(random.randint(50, 150))
    await page.mouse.click(target_x, target_y)
```
Resource loading
A real browser loads not only the HTML but also CSS, JavaScript, images, and fonts. If you use requests and fetch only the HTML, that is suspicious. Headless browsers solve this automatically, but with plain HTTP clients you need to account for it.
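If you stay with a plain HTTP client, one mitigation is to fetch at least a few of the page's sub-resources, the way a browser would. A rough sketch; the asset extraction is deliberately naive:

```python
import re
from urllib.parse import urljoin

import requests

def fetch_like_browser(session, url):
    """Fetch a page, then pull a handful of its assets as a browser would."""
    html = session.get(url).text
    # Naive extraction of CSS/JS/image references from the HTML
    assets = re.findall(r'(?:src|href)="([^"]+\.(?:css|js|png|jpg|svg))"', html)
    for asset in assets[:10]:  # a few is enough; don't hammer the server
        try:
            session.get(urljoin(url, asset), timeout=10)
        except requests.RequestException:
            pass
    return html
```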
Choosing a proxy type for your task
Different proxy types have different characteristics and suit different tasks. Choosing the wrong type is a common cause of blocks.
Data center proxies
Data center proxies are IP addresses belonging to hosting providers. They are easy to identify because their AS (autonomous system) belongs to a well-known data center operator.
Pros:
- High speed and stability
- Low cost
- Large IP pools
Cons:
- Easy to detect
- Often in blacklists
- Not suitable for sites with serious protection
Suitable for: SEO tools, availability checks, working with unprotected APIs, testing.
Residential proxies
Residential proxies are IP addresses of real users provided through partnership programs or SDKs in applications. They belong to regular internet service providers (ISPs).
Pros:
- Look like regular users
- Low fraud score
- Wide geography
- Hard to detect
Cons:
- Higher cost (traffic-based payment)
- Speed depends on end user
- IPs can "go away" (user turned off device)
Suitable for: scraping protected sites, working with social networks, e-commerce, any task where it's important not to be detected.
Mobile proxies
Mobile proxies are IP addresses from mobile operators (MTS, Beeline, Megafon, and their analogues in other countries). They have a special status due to CGNAT, under which many subscribers share one public IP.
Pros:
- Maximum trust from sites
- One IP is used by thousands of real users — hard to ban
- Ideal for account work
- IP change on request (network reconnection)
Cons:
- Highest cost
- Limited speed
- Fewer geography options
Suitable for: multi-accounting, working with Instagram/Facebook/TikTok, account registration, any task with high ban risk.
Comparison table
| Parameter | Data center | Residential | Mobile |
|---|---|---|---|
| Detectability | High | Low | Very low |
| Speed | High | Medium | Low-medium |
| Cost | $ | $$ | $$$ |
| For social networks | Not suitable | Suitable | Ideal |
| For scraping | Simple sites | Any sites | Overkill |
Advanced protection bypass techniques
When basic methods don't work, you need to use more complex techniques. Let's consider several advanced approaches.
Working with Cloudflare and similar protections
Cloudflare, Akamai, PerimeterX — these systems use JavaScript challenges to verify the browser. A simple HTTP request won't pass. Solution options:
1. Using a real browser:
```python
from playwright.sync_api import sync_playwright

def bypass_cloudflare(url, proxy):
    with sync_playwright() as p:
        browser = p.chromium.launch(
            headless=False,  # Headless mode is sometimes detected
            proxy={"server": proxy}
        )
        page = browser.new_page()
        page.goto(url)
        # Wait for the challenge to pass (usually 5-10 seconds)
        page.wait_for_timeout(10000)
        cookies = None
        if "challenge" not in page.url:
            # Save cookies for subsequent requests
            cookies = page.context.cookies()
        browser.close()
        return cookies
```
2. Using ready-made solutions:
```python
# cloudscraper: a library for bypassing Cloudflare
import cloudscraper

scraper = cloudscraper.create_scraper(
    browser={
        'browser': 'chrome',
        'platform': 'windows',
        'desktop': True
    }
)
# proxy is your proxy URL string, e.g. "http://user:pass@host:port"
scraper.proxies = {"http": proxy, "https": proxy}
response = scraper.get("https://protected-site.com")
```
CAPTCHA solving
If a site shows a CAPTCHA, there are several approaches:
Recognition services: 2Captcha, Anti-Captcha, CapMonster. They solve CAPTCHAs for you: you submit the task through an API and receive a solution token back.
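As an illustration of the typical integration, here is a sketch against 2Captcha's classic `in.php`/`res.php` endpoints for reCAPTCHA; the parameter names follow their public API, but verify against current documentation:

```python
import time

import requests

API_KEY = "your-2captcha-key"

def solve_recaptcha(site_key, page_url):
    """Submit a reCAPTCHA to 2Captcha and poll until a token comes back."""
    sent = requests.post("http://2captcha.com/in.php", data={
        "key": API_KEY, "method": "userrecaptcha",
        "googlekey": site_key, "pageurl": page_url, "json": 1,
    }).json()
    if sent.get("status") != 1:
        raise RuntimeError(f"Submit failed: {sent}")
    captcha_id = sent["request"]

    # Solving usually takes 15-60 seconds; poll politely
    for _ in range(24):
        time.sleep(5)
        res = requests.get("http://2captcha.com/res.php", params={
            "key": API_KEY, "action": "get", "id": captcha_id, "json": 1,
        }).json()
        if res.get("status") == 1:
            return res["request"]  # the g-recaptcha-response token
        if res.get("request") != "CAPCHA_NOT_READY":
            raise RuntimeError(f"Solve failed: {res}")
    raise TimeoutError("CAPTCHA not solved in time")
```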