If you are engaged in monitoring competitor prices, scraping product stock, or automatically posting ads on marketplaces, you have likely encountered blocks. The APIs of Wildberries, Ozon, Yandex.Market, and Avito are actively protected against automation: they limit the number of requests, ban IP addresses, and require CAPTCHA. In this guide, we will analyze why blocks occur and how to configure your scraper to work stably for months without bans.
Why Marketplaces Block Frequent API Requests
Marketplaces spend heavily on infrastructure: servers, databases, CDNs. When you make thousands of requests per minute to scrape prices, you create additional load on their systems. However, the main reason for blocks is not technical but business-oriented.
Main Reasons for Blocks:
- Protection of Competitive Data. Wildberries and Ozon do not want competitors to easily access information about prices, stock, and popular products. This data is a trade secret.
- Reducing Server Load. One scraper can generate as many requests as 10,000 regular buyers. This increases hosting costs.
- Combating Manipulation and Spam. Automated systems are used to inflate views, reviews, and mass posting of ads on Avito.
- Monetization of API. Some marketplaces offer official paid APIs with limits. By blocking free scraping, they encourage the purchase of access.
For example, if you monitor prices for 5,000 products every hour, that's 120,000 requests per day. From a single IP address, this looks suspicious, and the marketplace's protection system will quickly block your access.
What Protection Methods Are Used by Wildberries, Ozon, and Avito
Modern marketplaces use multi-layered protection against scraping. Understanding these mechanisms will help you properly configure your bypass methods.
| Protection Method | How It Works | How to Bypass |
|---|---|---|
| Rate Limiting | Request limits from one IP: 100-500 per hour | Delays between requests + IP rotation |
| IP Blacklist | Blocking known data center proxies | Using residential proxies |
| User-Agent Check | Blocking requests without a browser User-Agent | Setting realistic headers |
| JavaScript Checks | Requiring JS code execution to obtain data | Using headless browsers |
| Captcha | Forced verification during suspicious activity | Reducing request frequency, CAPTCHA solving services |
| TLS Fingerprinting | Identifying automation based on TLS parameters | Using libraries with the correct fingerprint |
| Behavioral Analysis | Analyzing patterns: click speed, mouse movements | Randomizing delays, mimicking human behavior |
Wildberries employs aggressive protection: a limit of about 200-300 requests per hour from one IP, User-Agent checks, and JavaScript challenges. If you exceed the limit, you will receive HTTP 429 (Too Many Requests) or 403 (Forbidden).
Ozon is more lenient towards scraping through the API but actively bans data center IPs. They use services to determine the type of IP (DataCenter vs Residential), so regular proxies often do not work.
Avito protects its API from mass ad postings and contact scraping. Here, geographical relevance is crucial: if you post an ad in Kazan, the IP must be from Kazan; otherwise, moderation will block the publication.
Rate Limiting: How to Properly Configure Delays Between Requests
Rate limiting is an artificial restriction on the speed of requests to make your activity appear like that of a regular user. The main rule: it's better to be slow but steady than fast and banned.
Recommended Settings for Popular Marketplaces:
Wildberries:
- Delay between requests: 2-5 seconds (randomized)
- Maximum 150-200 requests per hour from one IP
- Pause 10-15 minutes after every 100 requests
- Rotate IP after 200 requests
Ozon:
- Delay between requests: 1-3 seconds
- Maximum 300-400 requests per hour from one IP
- Using residential proxies is mandatory
- Rotate IP after 300 requests
Avito:
- Delay between requests: 3-7 seconds
- Maximum 50-100 requests per hour (strict limits)
- IP must correspond to the city of the ad
- One IP = one account (do not mix)
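Avito's city-matching rule can be enforced with a small lookup helper. This is a minimal sketch assuming you maintain per-city proxy pools; the pool structure, hostnames, and credentials below are hypothetical placeholders, not a real provider's API:

```python
import random

# Hypothetical per-city pools of residential proxies (placeholder hosts)
CITY_PROXIES = {
    'moscow': ['http://user:pass@msk-proxy1.example.com:8000'],
    'kazan': ['http://user:pass@kzn-proxy1.example.com:8000'],
}

def proxy_for_city(city):
    """Pick a proxy whose exit IP matches the city of the ad,
    so the publication passes Avito's geographic check."""
    pool = CITY_PROXIES.get(city.lower())
    if not pool:
        raise KeyError(f'No proxies configured for city: {city}')
    chosen = random.choice(pool)
    return {'http': chosen, 'https': chosen}
```

Combined with the one-IP-one-account rule, in practice you would also pin each account to a fixed entry in its city pool rather than picking randomly on every request.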
How to Implement Randomized Delays: Do not use fixed intervals like "exactly 3 seconds": this looks like a bot. Add randomness, for example from 2 to 5 seconds. Most scrapers support this through settings.
For example, in Python with the requests library, it looks like this:
```python
import time
import random
import requests

def make_request(url, proxies):
    response = requests.get(url, proxies=proxies)
    # Random delay from 2 to 5 seconds
    delay = random.uniform(2.0, 5.0)
    time.sleep(delay)
    return response

# Example usage
proxy = {
    'http': 'http://username:password@proxy.example.com:8000',
    'https': 'http://username:password@proxy.example.com:8000'
}

for product_id in product_list:  # product_list: your WB article numbers
    url = f'https://card.wb.ru/cards/detail?nm={product_id}'
    response = make_request(url, proxy)
    # Process data...
```
Important Note: After every 100-200 requests, take a long pause (10-20 minutes) or change the IP. This mimics the behavior of a person who browses products and then gets distracted by other tasks.
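The batch-pause rule from the note above can be folded into a small helper. The batch size and pause window below are the illustrative values from this section, not hard limits, and the injectable `sleep` argument exists only to make the logic testable:

```python
import random
import time

def pause_if_needed(request_count, batch_size=150,
                    pause_range=(600, 1200), sleep=time.sleep):
    """After every `batch_size` requests, sleep for 10-20 minutes
    (600-1200 seconds) to mimic a user who got distracted.
    Returns True when a long pause was taken."""
    if request_count > 0 and request_count % batch_size == 0:
        sleep(random.uniform(*pause_range))
        return True
    return False
```

Call it with your running request counter after each response; when it returns True, you can also rotate the IP at the same time.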
Proxy Rotation for Load Distribution
Even with the right delays, one IP cannot handle long-term load. The solution is proxy rotation: distributing requests among multiple IP addresses. This is the foundation of stable scraping on marketplaces.
Types of Proxies for Marketplace Scraping:
| Proxy Type | Advantages | Disadvantages | For Which Tasks |
|---|---|---|---|
| Data Center | Fast, cheap, stable | Easily identified, often on ban lists | Yandex.Market, small marketplaces |
| Residential | Real IPs of home users, low risk of bans | More expensive, slower than data centers | Wildberries, Ozon, Avito |
| Mobile | IPs of mobile operators, maximum anonymity | Most expensive, variable speed | Bypassing strict Avito blocks |
For scraping Wildberries and Ozon, we recommend residential proxies: they carry the IPs of real home users, so marketplaces cannot distinguish them from regular buyers. Data center proxies perform poorly here, since Ozon and Wildberries maintain blacklists of such IPs.
Proxy Rotation Strategies:
- Rotation after N requests. Change IP after every 100-300 requests. This is the optimal balance between efficiency and safety.
- Time-based rotation. Change IP every 30-60 minutes. Suitable for long scraping sessions.
- Sticky sessions. Use one IP for all requests to one product/category, then change. This reduces suspicion.
- Geographical relevance. Mandatory for Avito: scrape Moscow ads through Moscow IPs, Kazan ads through Kazan IPs.
Most residential proxy providers offer automatic rotation: you get one endpoint, and the IP changes automatically at a specified frequency or after each request. This simplifies scraper configuration.
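The time-based strategy above can be sketched as a tiny rotator that swaps the active proxy once a configurable interval elapses. The 45-minute default is an illustrative midpoint of the 30-60 minute range, and the injectable `clock` exists only for testability:

```python
import itertools
import time

class TimedRotator:
    """Serve one proxy at a time, switching to the next pool entry
    after `interval` seconds (the time-based rotation strategy)."""
    def __init__(self, proxy_list, interval=45 * 60, clock=time.monotonic):
        self._proxies = itertools.cycle(proxy_list)
        self._interval = interval
        self._clock = clock
        self._current = next(self._proxies)
        self._switched_at = clock()

    def current(self):
        # Advance to the next proxy once the interval has elapsed
        if self._clock() - self._switched_at >= self._interval:
            self._current = next(self._proxies)
            self._switched_at = self._clock()
        return {'http': self._current, 'https': self._current}
```

`itertools.cycle` loops back to the first proxy when the pool is exhausted, so a small pool keeps rotating indefinitely.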
Example of Setting Up a Proxy Pool in Python:
```python
import requests
import random

# List of proxies (can be loaded from a file)
proxy_list = [
    'http://user:pass@proxy1.example.com:8000',
    'http://user:pass@proxy2.example.com:8000',
    'http://user:pass@proxy3.example.com:8000',
    # ... additional 50-100 proxies
]

def get_random_proxy():
    proxy = random.choice(proxy_list)
    return {
        'http': proxy,
        'https': proxy
    }

# Usage
for product_id in product_list:
    url = f'https://card.wb.ru/cards/detail?nm={product_id}'
    proxy = get_random_proxy()  # Random proxy for each request
    response = requests.get(url, proxies=proxy)
    # Process...
```
Configuring Headers and Fingerprint to Mimic a Browser
Marketplaces analyze not only IPs and request frequency but also HTTP headers. If your scraper sends requests with the default headers of the library (for example, python-requests/2.28.0), it is instantly identified as a bot.
Mandatory Headers to Mimic a Browser:
```python
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Accept-Language': 'ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7',
    'Accept-Encoding': 'gzip, deflate, br',
    'DNT': '1',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'none',
    'Cache-Control': 'max-age=0',
    'Referer': 'https://www.google.com/'
}
```
Important Points:
- User-Agent must match a real browser. Use up-to-date versions of Chrome, Firefox, Safari. Change User-Agent every 100-200 requests.
- Accept-Language must match the geography of the proxy: ru-RU for Russian IPs, uk-UA for Ukrainian ones.
- Referer indicates where the user came from. Use Google/Yandex for the first request and internal marketplace pages for subsequent ones.
- Sec-Fetch-* headers add realism. Modern browsers send them automatically.
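These rules can be combined into one header-building helper. A minimal sketch, assuming a small illustrative User-Agent pool that you refresh periodically (the strings here are examples, not a maintained list):

```python
import random

# Illustrative pool of desktop User-Agents; keep browser versions
# current and switch the choice every 100-200 requests
USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
]

def build_headers(referer='https://www.google.com/',
                  accept_language='ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7'):
    """Browser-like headers: random User-Agent, Accept-Language tied to
    the proxy geography, Referer set by the caller per navigation step."""
    return {
        'User-Agent': random.choice(USER_AGENTS),
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language': accept_language,
        'Accept-Encoding': 'gzip, deflate, br',
        'Connection': 'keep-alive',
        'Referer': referer,
    }
```

Pass the result as `headers=build_headers(...)` to each request, switching the `referer` to an internal marketplace page after the first navigation.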
TLS Fingerprinting: Advanced protection systems (Ozon, Wildberries) analyze the parameters of the TLS connection: the order of cipher suites, extensions, protocol version. Standard Python/Node.js libraries have a different fingerprint than browsers.
The solution is to use specialized libraries:
- curl-impersonate (with the curl_cffi Python bindings) – mimics the TLS fingerprint of Chrome/Firefox
- tls-client (Go, with Python bindings) – customizable TLS fingerprint
- Playwright / Puppeteer – headless browsers with a real browser TLS stack
For most marketplace scraping tasks, correct HTTP headers and residential proxies are sufficient. TLS fingerprinting is critical only when working with the most secure APIs.
API vs Web Scraping: Which is Safer for Scraping
Marketplaces have two ways to obtain data: official API and scraping HTML pages (web scraping). Which one to choose for stable operation?
| Parameter | Official API | Web Scraping |
|---|---|---|
| Legality | ✅ Allowed, documentation available | ⚠️ Gray area, may violate ToS |
| Stability | ✅ Stable data structure | ❌ Breaks with website redesign |
| Limits | ⚠️ Strict official limits | ⚠️ Unofficial, but protection exists |
| Data Access | ⚠️ Not all data is available | ✅ All public data |
| Speed | ✅ Fast JSON responses | ❌ Slower due to HTML parsing |
| Cost | ⚠️ Often paid | ✅ Free (only proxy costs) |
Recommendations for Choosing:
- Use the official API if: You need small volumes of data (up to 10,000 products per day), you are willing to pay for access, and legality and stability are important.
- Use web scraping if: You need large volumes of data, the official API does not provide the required information (e.g., competitor prices), and the budget is limited.
Hybrid Approach: Many professional scrapers combine both methods. For example, they obtain a list of products through the API (quickly and legally) and scrape detailed information about prices and stock from HTML pages (more data).
Internal Marketplace APIs: Besides the official API, marketplaces use internal APIs to run their own websites. For instance, Wildberries loads product data through https://card.wb.ru/cards/detail. These endpoints are undocumented but work faster than HTML scraping. The downside: they can change without warning.
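A sketch of consuming such an undocumented endpoint. The response layout assumed here (`data.products[*].salePriceU`, a price in kopecks) matches one historically observed version of the WB card response; it is not a documented contract and can change without notice:

```python
def extract_prices(payload):
    """Pull (product id, price in rubles) pairs out of a
    card.wb.ru-style JSON payload. All field names are assumptions
    based on observed responses, not documented guarantees."""
    products = payload.get('data', {}).get('products', [])
    # salePriceU is assumed to be an integer price in kopecks
    return [(p['id'], p.get('salePriceU', 0) / 100) for p in products]
```

Wrapping access behind one function like this localizes the breakage to a single place when the marketplace silently renames a field.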
Setting Up Popular Scrapers and Tools
Most sellers and marketers use ready-made tools for scraping marketplaces. Let's look at how to properly configure proxies and limits in popular solutions.
Setting Up Scrapy (Python Framework)
Scrapy is a popular framework for web scraping. To work with marketplaces, add the following to settings.py:
```python
# Delays between requests
DOWNLOAD_DELAY = 3  # 3 seconds
RANDOMIZE_DOWNLOAD_DELAY = True  # Waits from 0.5*DELAY to 1.5*DELAY

# Limits on concurrent requests
CONCURRENT_REQUESTS = 8
CONCURRENT_REQUESTS_PER_DOMAIN = 2

# Proxy settings (via the scrapy-rotating-proxies middleware)
ROTATING_PROXY_LIST = [
    'http://user:pass@proxy1.example.com:8000',
    'http://user:pass@proxy2.example.com:8000',
    # ... list of proxies
]

# User-Agent rotation (consumed by a UA-rotation middleware;
# not a built-in Scrapy setting)
USER_AGENT_LIST = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0.0.0',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) Safari/537.36',
    # ... list of User-Agents
]

# Retry attempts on errors
RETRY_TIMES = 3
RETRY_HTTP_CODES = [429, 500, 502, 503, 504]
```
Setting Up Octoparse (Visual No-Code Parser)
Octoparse is a popular tool for scraping without programming. Proxy and limit setup:
- Open Task Settings → Advanced Options
- In the "Network" section, enable "Use Proxy Server"
- Add the proxy list in the format IP:PORT:USER:PASS
- Enable "Rotate IP for each request" for automatic rotation
- In the "Speed" section, choose "Slow" or "Custom" with a delay of 3-5 seconds
- Enable "Random delay" to mimic human behavior
Setting Up Selenium (Browser Automation)
Selenium controls a real browser, so it bypasses many protections. Here's an example setup with a proxy:
```python
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import time
import random

# Setting up Chrome with a proxy
chrome_options = Options()
# Note: Chrome ignores user:pass credentials in --proxy-server;
# for authenticated proxies use selenium-wire or a proxy-auth extension
chrome_options.add_argument('--proxy-server=http://proxy.example.com:8000')
chrome_options.add_argument('--disable-blink-features=AutomationControlled')
chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
chrome_options.add_experimental_option('useAutomationExtension', False)

driver = webdriver.Chrome(options=chrome_options)

# Hiding the WebDriver flag
driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")

# Scraping with delays
urls = ['https://www.wildberries.ru/catalog/...']  # your product URLs
for url in urls:
    driver.get(url)
    # Random delay of 3-7 seconds
    time.sleep(random.uniform(3, 7))
    # Scroll to mimic reading
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight/2);")
    time.sleep(random.uniform(1, 3))
    # Data extraction
    # ...
```
Ready-Made Marketplace Scraping Services
If you do not want to set up a scraper yourself, use specialized services:
- Mpstats.io – analytics for Wildberries and Ozon, automatic price and sales monitoring
- SellerFox – competitor monitoring on marketplaces, stock tracking
- Moneyplace – scraping Avito, automatic ad posting
- Parsehub – visual scraper for any websites, including marketplaces
These services come with pre-configured proxies, limits, and protection bypasses; you only need to specify what to scrape. The downside is a monthly subscription starting from 2,000 ₽.
Monitoring Blocks and Automatic Response
Even with the right settings, blocks are possible: marketplaces update protection, proxies get on ban lists, limits change. It is important to track issues and respond automatically.
Signs of Blocking to Monitor:
- HTTP 429 (Too Many Requests) – request limit exceeded; pause or change IP
- HTTP 403 (Forbidden) – IP blocked; rotate the proxy immediately
- HTTP 503 (Service Unavailable) – temporary overload or DDoS protection
- Captcha in the response – automation detected; reduce activity
- Empty responses or a redirect to the homepage – soft block
- Sharp increase in response time – possible rate limiting on the server side
Automatic Response to Blocks (Example in Python):
```python
import requests
import time
from datetime import datetime

class SmartParser:
    def __init__(self, proxy_list):
        self.proxy_list = proxy_list
        self.current_proxy_index = 0
        self.request_count = 0
        self.blocked_proxies = set()

    def get_next_proxy(self):
        # Guard against an exhausted pool: every proxy is banned
        if len(self.blocked_proxies) >= len(self.proxy_list):
            raise RuntimeError('All proxies in the pool are blocked')
        # Skip blocked proxies
        while self.current_proxy_index in self.blocked_proxies:
            self.current_proxy_index = (self.current_proxy_index + 1) % len(self.proxy_list)
        proxy = self.proxy_list[self.current_proxy_index]
        return {'http': proxy, 'https': proxy}

    def rotate_proxy(self):
        self.current_proxy_index = (self.current_proxy_index + 1) % len(self.proxy_list)
        self.request_count = 0

    def make_request(self, url):
        max_retries = 3
        for attempt in range(max_retries):
            try:
                proxy = self.get_next_proxy()
                response = requests.get(url, proxies=proxy, timeout=10)

                # Check for blocking
                if response.status_code == 429:
                    print(f"[{datetime.now()}] Rate limit! Pausing for 60 seconds...")
                    time.sleep(60)
                    self.rotate_proxy()
                    continue
                elif response.status_code == 403:
                    print(f"[{datetime.now()}] IP blocked! Rotating proxy...")
                    self.blocked_proxies.add(self.current_proxy_index)
                    self.rotate_proxy()
                    continue
                elif response.status_code == 503:
                    print(f"[{datetime.now()}] Server overloaded. Pausing for 120 seconds...")
                    time.sleep(120)
                    continue

                # Successful request
                self.request_count += 1

                # Rotate after 200 requests
                if self.request_count >= 200:
                    self.rotate_proxy()
                    time.sleep(10)  # Pause after rotation

                return response
            except requests.exceptions.Timeout:
                print(f"[{datetime.now()}] Timeout. Attempt {attempt + 1}/{max_retries}")
                time.sleep(5)

        return None  # All attempts exhausted
```
Logging and Alerts: Set up notifications for critical events. For example, send a message to Telegram when:
- More than 30% of proxies from the pool are blocked
- The percentage of successful requests drops below 80%
- The scraper has not received data for more than 30 minutes
- Captcha detected in responses
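The alert conditions above can be checked by a small predicate, with delivery through the Telegram Bot API `sendMessage` method. The token and chat id below are placeholders, and the 30%/80% thresholds are the ones from this checklist:

```python
import requests

BOT_TOKEN = '123456:ABC-DEF'    # placeholder: your bot token
CHAT_ID = '-1001234567890'      # placeholder: your chat/channel id

def should_alert(total_requests, successful, blocked_proxies, pool_size):
    """True when >30% of the proxy pool is blocked or the success
    rate has dropped below 80%."""
    if pool_size and blocked_proxies / pool_size > 0.30:
        return True
    if total_requests and successful / total_requests < 0.80:
        return True
    return False

def send_alert(text, token=BOT_TOKEN, chat_id=CHAT_ID):
    """Deliver a monitoring alert via the Telegram Bot API."""
    url = f'https://api.telegram.org/bot{token}/sendMessage'
    return requests.post(url, data={'chat_id': chat_id, 'text': text}, timeout=10)
```

Call `should_alert(...)` on a schedule (e.g. every few minutes) and fire `send_alert(...)` only on a False-to-True transition, so you are not flooded with duplicate messages.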
Metrics to Monitor:
- Success rate – percentage of successful requests (should be above 90%)
- Average response time – a rising average may signal server-side throttling
- Requests per hour – request volume per hour per proxy
- Proxy health – percentage of working proxies in the pool
- Block rate – frequency of blocks (should be below 5%)
Use dashboards for visualizing metrics: Grafana, Datadog, or simple Google Sheets with automatic updates via API.
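A minimal in-process counterpart to such dashboards, tracking the success, block, and latency metrics listed above (the 403/429 classification mirrors the blocking signs from this section):

```python
class ScraperMetrics:
    """Rolling counters for scraper health metrics."""
    def __init__(self):
        self.total = 0
        self.success = 0
        self.blocked = 0
        self.response_times = []

    def record(self, status_code, elapsed_seconds):
        """Record one completed request."""
        self.total += 1
        self.response_times.append(elapsed_seconds)
        if status_code == 200:
            self.success += 1
        elif status_code in (403, 429):  # block indicators
            self.blocked += 1

    def success_rate(self):
        return self.success / self.total if self.total else 0.0

    def block_rate(self):
        return self.blocked / self.total if self.total else 0.0

    def avg_response_time(self):
        times = self.response_times
        return sum(times) / len(times) if times else 0.0
```

Feed each `(response.status_code, response.elapsed.total_seconds())` pair into `record()` and export the three rates to your dashboard of choice.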
Conclusion
Blocks when scraping marketplaces are not an obstacle but a task that can be solved with the right tool configuration. Key points for stable operation without bans:
- Use residential proxies for Wildberries, Ozon, and Avito; data center proxies do not work here
- Configure randomized delays of 2-5 seconds between requests
- Rotate IP after every 150-300 requests or every 30-60 minutes
- Use realistic HTTP headers with up-to-date User-Agent
- Monitor blocks and respond automatically
- For Avito, geographical relevance of IP to the city of the ad is mandatory
A properly configured scraper with quality proxies can operate for months without a single block, collecting tens of thousands of products daily. The key is not to chase speed but to mimic the behavior of an ordinary user.
If you plan to regularly scrape Wildberries, Ozon, or Avito, we recommend using residential proxies with automatic rotation β they provide maximum stability and minimal risk of blocks. For tasks requiring mobile IPs (e.g., bypassing strict Avito blocks), mobile proxies with IPs from Russian operators will suffice.