If you are engaged in scraping marketplaces, monitoring competitors' prices, or automating social media tasks, you have likely encountered the 429 Too Many Requests error. The website blocks your requests, considering them suspicious, and all automation comes to a halt. In this article, we will explore why this problem occurs and how to solve it through proper proxy configuration, IP rotation, and effective load distribution.
We will provide specific solutions for various tasks: scraping Wildberries and Ozon, competitor monitoring, working with social media APIs, and mass data collection. All recommendations are based on practical experience and work in real projects.
What is the 429 Too Many Requests error and why does it occur
The 429 Too Many Requests error is an HTTP response status code that the server returns when you exceed the allowed number of requests within a certain period. This is a protective mechanism for websites against overload and automated data collection.
Typical situations when 429 occurs:
- Scraping marketplaces: you are collecting prices from Wildberries, Ozon, or Avito, making hundreds of requests per minute. The website detects abnormal activity from a single IP and blocks it.
- Competitor monitoring: automatic collection of data about products, prices, and availability. Frequent checks trigger the limit.
- Working with APIs: many APIs have strict limits; for example, the Instagram API allows 200 requests per hour, and Twitter allows 300 requests every 15 minutes.
- Mass registration or actions: creating accounts, sending messages, liking posts. Platforms quickly identify automation and block the IP.
It is important to understand: the 429 error is not just a technical limitation. It is a signal that the website has flagged your activity as suspicious. If you keep hammering the site from the same IP, you risk a permanent ban.
Important: Some websites return 403 Forbidden instead of 429, or simply show a captcha. The essence is the same: you have exceeded the limits and have been blocked.
How websites detect suspicious activity
To effectively bypass blocks, you need to understand how exactly websites identify you. Modern protection systems analyze numerous parameters:
1. IP address and request frequency
The most obvious parameter. If 100 requests per minute come from one IP while an ordinary user makes 5-10, that is clearly automation. Websites set limits:
- Wildberries: about 60 requests per minute from one IP
- Ozon: about 30-40 requests per minute
- Avito: strict limits, especially for search queries
- Instagram API: 200 requests per hour per application
2. User-Agent and browser headers
If you send requests through a script without the correct User-Agent, the website immediately understands that this is not a real browser. Headers are also analyzed: Accept, Accept-Language, Referer. The absence of these headers or atypical values is a red flag.
3. Behavioral patterns
A real user does not make requests with perfect periodicity every 2 seconds. They scroll, click, and take pauses. If your parser works like a metronome, that is suspicious.
4. Type of IP address
Many platforms maintain blacklists of data center IPs. If you use cheap proxies from AWS or Google Cloud, the likelihood of blocking is higher. Residential IPs from real providers raise fewer suspicions.
Proxy rotation: the main way to bypass limits
The main solution to the 429 problem is IP rotation. Instead of making all requests from one IP, you distribute the load among many addresses. Each IP makes a small number of requests and does not exceed the limits.
Types of proxy rotation
| Type of rotation | How it works | When to use |
|---|---|---|
| Request-based rotation | Each request comes from a new IP. The proxy provider automatically changes the address. | Mass scraping when you need to collect a lot of data quickly |
| Timer-based rotation | IP changes every 5-30 minutes. You use one address for a series of requests. | Working with websites that require sessions (cart, authorization) |
| Pool of static proxies | You have a list of 100-1000 IPs. The script randomly selects an address for each request. | When full control over rotation and load distribution is needed |
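To make the pool-based approach from the last row concrete, here is a minimal Python sketch using the requests library; the proxy addresses, credentials, and pool size are placeholders you would replace with your provider's details.

```python
import random
import requests

# Placeholder pool; substitute the addresses and credentials from your provider.
PROXY_POOL = [
    "http://user:pass@203.0.113.10:8000",
    "http://user:pass@203.0.113.11:8000",
    "http://user:pass@203.0.113.12:8000",
]

HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    ),
    "Accept-Language": "ru-RU,ru;q=0.9,en-US;q=0.8",
}

def fetch(url: str) -> requests.Response:
    """Send one request through a randomly chosen proxy from the pool."""
    proxy = random.choice(PROXY_POOL)
    return requests.get(
        url,
        headers=HEADERS,
        proxies={"http": proxy, "https": proxy},
        timeout=15,
    )
```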
Practical example: scraping Wildberries
Suppose you need to scrape prices for 10,000 products. Wildberries blocks after 60 requests per minute from one IP. How to solve this:
- Use request-based rotation: each request comes from a new IP. With a single IP at 60 requests per minute, 10,000 requests would take about 167 minutes; spread across a large rotating pool, the same job takes 10-15 minutes.
- Set delays: even with rotation, you shouldn't make 1,000 requests per second. Optimal: 5-10 requests per second across different IPs.
- Add randomization: delays should be random, from 0.5 to 2 seconds between requests.
For such tasks, residential proxies with automatic rotation are ideal: they have pools of millions of IPs and change addresses for each request without your involvement.
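As a rough sketch of this workflow, the snippet below assumes a provider that exposes a single rotating gateway (a hypothetical endpoint) which assigns a new exit IP to every request, and adds the randomized 0.5-2 second pauses described above.

```python
import random
import time
import requests

# Hypothetical rotating gateway: one endpoint, a new exit IP per request.
GATEWAY = "http://user:pass@rotating.proxy-provider.example:9000"
PROXIES = {"http": GATEWAY, "https": GATEWAY}

def scrape_prices(product_urls):
    """Collect pages one by one with randomized pauses between requests."""
    pages = {}
    for url in product_urls:
        resp = requests.get(url, proxies=PROXIES, timeout=15)
        if resp.status_code == 200:
            pages[url] = resp.text
        # Random pause so the request stream does not look machine-perfect.
        time.sleep(random.uniform(0.5, 2.0))
    return pages
```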
Setting delays between requests
Even with proxy rotation, you cannot bombard the site with requests at maximum speed. Modern protection systems analyze the overall load on the server and may block the entire range of IPs if they detect DDoS-like activity.
Rules for setting delays
Basic rule: mimic a real user
- Minimum delay: 0.5-1 second between requests
- Recommended: 1-3 seconds with random variation
- For complex sites (marketplaces, social networks): 2-5 seconds
- Use exponential delay on errors
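A tiny helper for the randomized pauses described in this list; the default 1-3 second bounds follow the recommendation above and can be widened for stricter sites.

```python
import random
import time

def polite_sleep(min_s: float = 1.0, max_s: float = 3.0) -> None:
    """Wait a random interval so requests do not arrive with metronome-like regularity."""
    time.sleep(random.uniform(min_s, max_s))
```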
Exponential delay (exponential backoff)
If you still receive a 429 error, do not keep hammering the site. Use the exponential delay strategy:
- First attempt fails: wait 1 second
- Second attempt fails: wait 2 seconds
- Third attempt fails: wait 4 seconds
- Fourth attempt fails: wait 8 seconds
- And so on, up to a maximum (for example, 60 seconds)
This strategy gives the server time to "cool down" and reduces the likelihood of a permanent ban. Many APIs (Google, Twitter) recommend this approach in their documentation.
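Below is one possible Python implementation of this retry logic; the function and parameter names are illustrative. It also respects the standard Retry-After header when the server sends it alongside 429.

```python
import time
import requests

def get_with_backoff(url, session=None, max_retries=6, base_delay=1.0, max_delay=60.0):
    """Retry on 429, doubling the wait after each failure, capped at max_delay."""
    session = session or requests.Session()
    delay = base_delay
    resp = None
    for _ in range(max_retries):
        resp = session.get(url, timeout=15)
        if resp.status_code != 429:
            return resp
        retry_after = resp.headers.get("Retry-After")
        # Use the server's hint if it is given in seconds; otherwise back off exponentially.
        wait = int(retry_after) if retry_after and retry_after.isdigit() else delay
        time.sleep(min(wait, max_delay))
        delay = min(delay * 2, max_delay)
    return resp  # last response (still 429) after exhausting retries
```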
Example settings for different tasks
| Task | Delay between requests | Comment |
|---|---|---|
| Scraping Wildberries | 1-3 seconds | With proxy rotation, can speed up to 0.5-1 sec |
| Scraping Ozon | 2-4 seconds | Ozon is more sensitive to automation |
| Instagram API | 18 seconds | Limit 200 requests/hour = 1 request every 18 sec |
| Google Search scraping | 5-10 seconds | Google quickly bans, long pauses are needed |
| Avito monitoring | 3-6 seconds | Strict protection, especially for search |
User-Agent and headers: mimicking a real browser
Proxy rotation and delays solve the request frequency problem, but that's not enough. Websites analyze how you send requests. If the headers look suspicious, blocking is inevitable.
Mandatory headers to mimic a browser
The minimum set of headers that should be in every request:
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
Cache-Control: max-age=0
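In Python's requests library this header set can be passed as a plain dictionary (example.com stands in for the target site; note that decoding `br` responses requires the optional brotli package):

```python
import requests

BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    "Accept-Language": "ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7",
    "Accept-Encoding": "gzip, deflate, br",
    "Connection": "keep-alive",
    "Upgrade-Insecure-Requests": "1",
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "none",
    "Sec-Fetch-User": "?1",
    "Cache-Control": "max-age=0",
}

response = requests.get("https://example.com/", headers=BROWSER_HEADERS, timeout=15)
```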
User-Agent rotation
Do not use the same User-Agent for all requests. Create a list of 10-20 current browser versions and change them randomly:
- Chrome (Windows, macOS, Linux)
- Firefox (various versions)
- Safari (macOS, iOS)
- Edge (Windows)
Common mistake: Using outdated User-Agents (for example, Chrome 90 in 2024) or mobile User-Agents for desktop sites. This instantly reveals automation.
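A minimal rotation sketch: the list below contains a few example desktop User-Agent strings (extend it to 10-20 current ones as recommended above), and the helper swaps one into the base headers for each request.

```python
import random

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15",
]

def random_headers(base_headers: dict) -> dict:
    """Return a copy of the base headers with a randomly chosen User-Agent."""
    headers = dict(base_headers)
    headers["User-Agent"] = random.choice(USER_AGENTS)
    return headers
```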
Referer and Origin
Many websites check where the request came from. If you are scraping a product page, the Referer header should link to the catalog or search. If you are scraping an API, the correct Origin must be present.
Example for scraping Wildberries:
Referer: https://www.wildberries.ru/catalog/0/search.aspx?search=ноутбук
Origin: https://www.wildberries.ru
Which proxies to choose to bypass 429
Choosing the type of proxy is critically important. Cheap data center proxies are often already on blacklists, and you will receive 429 even with a low request frequency.
Comparison of proxy types for bypassing limits
| Proxy type | Advantages | Disadvantages | For which tasks |
|---|---|---|---|
| Data center | High speed, low price | Often banned, easily detectable | Simple websites without protection |
| Residential | Real IPs from providers, hard to detect, large pool of addresses | More expensive, sometimes slower | Marketplaces, social networks, complex sites |
| Mobile | IPs from mobile operators, maximum trust | Expensive, limited pool | Instagram, TikTok, Facebook Ads |
Recommendations for selection
For scraping marketplaces (Wildberries, Ozon, Avito): Use residential proxies with request-based rotation. The pool should be large, at least 10,000 IPs. This ensures that each IP makes few requests and does not hit the limits.
For working with social media APIs: Mobile proxies are the optimal choice. Instagram and TikTok trust IPs from mobile operators more than residential ones. One mobile IP can serve 5-10 accounts without issues.
For competitor price monitoring: Residential proxies with timer-based rotation (every 10-15 minutes). This allows for a series of requests from one IP while maintaining the session, but not exceeding limits.
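One way to sketch timer-based rotation in Python: keep a single requests.Session (which preserves cookies) tied to one proxy and replace it after a fixed interval. The pool addresses and the 10-minute interval are placeholders.

```python
import random
import time
import requests

PROXY_POOL = [  # placeholders for your provider's addresses
    "http://user:pass@203.0.113.10:8000",
    "http://user:pass@203.0.113.11:8000",
]
ROTATION_INTERVAL = 10 * 60  # seconds; switch to a new IP every 10 minutes

_session = None
_session_started = 0.0

def get_session() -> requests.Session:
    """Reuse one session (cookies + one IP) for a series of requests, then rotate."""
    global _session, _session_started
    if _session is None or time.time() - _session_started > ROTATION_INTERVAL:
        proxy = random.choice(PROXY_POOL)
        _session = requests.Session()
        _session.proxies = {"http": proxy, "https": proxy}
        _session_started = time.time()
    return _session
```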
For simple tasks (scraping news, blogs): Data center proxies may be suitable if the site does not have serious protection. But be prepared for periodic blocks.
Real cases: scraping marketplaces and APIs
Case 1: Price monitoring on Wildberries (10,000 products daily)
Task: A marketplace seller tracks competitors' prices on 10,000 items. Data needs to be collected twice a day.
Problem: From a single IP, the scraper was banned after 50-60 requests. Scraping 10,000 products took several hours with constant blocks.
Solution:
- Connected residential proxies with a pool of 50,000 IPs and request-based rotation
- Set random delays from 0.5 to 2 seconds between requests
- Added User-Agent rotation (20 variants of Chrome and Firefox)
- Configured correct Referer and Accept headers
Result: Scraping 10,000 products takes 15-20 minutes without a single block. Each IP makes at most 1-2 requests, which is practically impossible to detect as automation.
Case 2: Instagram automation (50 client accounts)
Task: An SMM agency manages 50 client accounts on Instagram. Content needs to be published, comments answered, and statistics collected.
Problem: The Instagram API has a limit of 200 requests per hour per application. Working with 50 accounts exhausted the limits in 10 minutes.
Solution:
- Created 10 different Instagram API applications (5 accounts per application)
- Each application uses a separate mobile proxy
- Set a delay of 18 seconds between requests (200 requests/hour = 1 request every 18 sec)
- Added exponential delay upon receiving 429
Result: All 50 accounts operate stably. 429 errors occur very rarely (1-2 times a week) and are handled automatically through retries.
Case 3: Scraping Avito (ads across Russia)
Task: A real estate aggregator collects ads from Avito across all cities in Russia for its database.
Problem: Avito has one of the strictest protections among Russian websites. Blocks started after 10-15 requests even from different data center IPs.
Solution:
- Switched to residential proxies with geographical targeting (IPs from the same city as the listings being scraped)
- Increased delays to 3-5 seconds between requests
- Used a headless browser (Puppeteer) instead of simple HTTP requests
- Emulated user actions: scrolling, clicking, mouse movements
Result: Successful scraping of 50,000+ ads per day. Blocks decreased by 95%. The remaining 5% are handled through retries with a new IP.
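The case above used Puppeteer; to keep all examples in one language, here is a comparable sketch with Playwright for Python. The proxy address and listing URL are placeholders, and the emulated actions mirror the list above: mouse movement, scrolling, irregular pauses.

```python
import random
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(
        headless=True,
        proxy={
            "server": "http://203.0.113.10:8000",  # residential proxy placeholder
            "username": "user",
            "password": "pass",
        },
    )
    page = browser.new_page()
    page.goto("https://www.avito.ru/moskva/kvartiry")  # illustrative listing URL
    # Emulate user behavior: mouse movement, scrolling, irregular pauses.
    page.mouse.move(random.randint(100, 600), random.randint(100, 400))
    page.mouse.wheel(0, random.randint(400, 900))
    page.wait_for_timeout(random.randint(3000, 5000))
    html = page.content()
    browser.close()
```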
Case 4: Monitoring competitors' APIs (e-commerce)
Task: An online store tracks product availability and prices from 20 competitors through their APIs.
Problem: Most competitors' APIs have public limits (100-500 requests per hour). Exceeding this returns a 429 error.
Solution:
- Created a queue of requests with priorities (the most important products are checked more frequently)
- Monitored limits through response headers (X-RateLimit-Remaining)
- Automatically paused upon reaching 80% of the limit
- Used multiple API keys for each competitor (where possible)
Result: The system automatically distributes requests to never exceed limits. Data is updated with the maximum possible frequency without blocks.
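A sketch of the limit monitoring described in this case, assuming the API exposes the common X-RateLimit-* headers (names vary between providers) and a bearer token for authentication:

```python
import time
import requests

def fetch_with_quota_check(url: str, api_key: str, safety_margin: float = 0.2):
    """Pause automatically once less than 20% of the rate-limit window remains."""
    resp = requests.get(url, headers={"Authorization": f"Bearer {api_key}"}, timeout=15)
    remaining = resp.headers.get("X-RateLimit-Remaining")
    limit = resp.headers.get("X-RateLimit-Limit")
    reset = resp.headers.get("X-RateLimit-Reset")  # often a Unix timestamp
    if remaining and limit and int(remaining) < int(limit) * safety_margin:
        if reset and reset.isdigit():
            time.sleep(max(0.0, int(reset) - time.time()))
        else:
            time.sleep(60)  # fallback pause when the reset time is unknown
    return resp
```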
The overall lesson from all cases:
The 429 error is solved comprehensively: proxy rotation + correct delays + mimicking real behavior. You cannot rely on just one method. Even with a million IPs, you will be blocked if you make 1,000 requests per second with suspicious headers.
Conclusion
The 429 Too Many Requests error is a protective mechanism for websites that can be bypassed with the right approach. The main principles for solving the problem:
- IP rotation: distribute the load among many proxies so that each address makes a minimum number of requests
- Correct delays: mimic a real user with random pauses from 1 to 5 seconds
- Correct headers: use a current User-Agent and a full set of browser headers
- Choosing the type of proxy: for complex sites (marketplaces, social networks), use residential or mobile proxies
- Error handling: apply exponential delay upon receiving 429 instead of immediately retrying at full speed
Remember: the goal is not to deceive the protection at any cost, but to make your automation look as natural as possible. Modern protection systems are becoming increasingly sophisticated, and brute force is no longer effective.
If you plan to work with scraping marketplaces, monitoring competitors, or automation in social media, we recommend trying residential proxies β they provide a large pool of IP addresses, automatic rotation, and minimal risk of blocks. For working with Instagram, TikTok, and other mobile platforms, mobile proxies with IPs from real telecom operators are a better fit.