Proxies for bypassing DataDome: which types work in 2024

You have set up a scraper, launched data collection — and within minutes you receive a page with a CAPTCHA or an empty response. Most likely, the site is protected by DataDome. This is one of the most aggressive anti-bot systems on the market, and regular data center proxies won't help here. In this article, we will examine how exactly DataDome detects bots and which types of proxies yield results.

What is DataDome and where is it used

DataDome is a commercial SaaS bot protection platform used by large online stores, news portals, marketplaces, and booking services worldwide. The company was founded in 2015 and currently protects thousands of sites, analyzing billions of requests per day in total.

Among DataDome's clients are platforms such as Reddit, Foot Locker, Rakuten, AngelList, and many other large resources. If you are engaged in competitor price monitoring, scraping product cards, collecting data from foreign marketplaces, or aggregating news — there is a high probability that you have already encountered this system.

Characteristic signs that a site is protected by DataDome:

A CAPTCHA page appears after several consecutive requests
The server response contains the header x-datadome-cid
Redirect to the domain geo.captcha-delivery.com
HTTP response 403 or 429 for frequent requests from a single IP
JavaScript challenge on the first visit (the "browser check" page)
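
The signs listed above can be checked programmatically. Below is a minimal sketch, assuming you already have the status code, headers, and body of a response; the function name `datadome_signs` and the label strings are illustrative, not part of any library.

```python
from typing import List, Mapping


def datadome_signs(status: int, headers: Mapping[str, str], body: str = "") -> List[str]:
    """Return the DataDome indicators found in one HTTP response."""
    # Header names are case-insensitive, so normalise them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    signs = []
    if "x-datadome-cid" in lowered:
        signs.append("x-datadome-cid header")
    # The CAPTCHA domain can show up in a redirect target or in the page body.
    if "captcha-delivery.com" in lowered.get("location", "") or "captcha-delivery.com" in body:
        signs.append("captcha-delivery.com reference")
    if status in (403, 429):
        signs.append(f"HTTP {status}")
    return signs
```

An empty list does not prove the site is unprotected; it only means none of these particular markers appeared in this response.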

DataDome operates in real time: each incoming request is analyzed in milliseconds. The system decides whether to allow the user, show a CAPTCHA, or block them before the server delivers the main content of the page. This is why bypassing it is more difficult than getting around simple IP blocks.

How DataDome identifies bots: protection mechanisms

To understand which proxies work, it is necessary to figure out what exactly DataDome analyzes. The system uses a multi-layered approach — no single factor is the sole criterion for blocking. The decision is made based on a combination of signals.

1. IP address reputation

The first thing DataDome checks is the reputation of the IP address against external and internal databases. The system instantly determines whether the IP belongs to a data center (AWS, Google Cloud, Hetzner, DigitalOcean), a VPN provider, or is a real residential/mobile address. IPs from data centers automatically receive a high "suspicion score" even before behavior analysis.

2. Behavioral analysis

DataDome tracks behavior patterns: request speed, sequence of page visits, time between clicks, mouse movement (if JavaScript is present). A real user takes breaks, follows logical paths, and sometimes goes back. A bot typically makes requests at constant intervals, to strictly defined URLs, with no "random" deviations.

3. JavaScript fingerprint

If the request is made through a browser (or a headless browser like Puppeteer/Playwright), DataDome runs a script that collects the "fingerprint" of the environment: browser version, installed fonts, screen resolution, WebGL support, canvas fingerprint, presence of plugins. Headless browsers without additional masking are easily identified by their characteristic parameters.

4. HTTP headers

The request headers are analyzed: User-Agent, Accept-Language, Accept-Encoding, Referer, sec-ch-ua, and others. A mismatch between the declared User-Agent and the actual request parameters is a strong bot signal.

5. Real-time machine learning

All collected signals are processed by an ML model trained on a vast dataset of real users and bots. The model is constantly updated — what worked a month ago may not work today. This is why static solutions quickly become obsolete.

Why data center proxies fail against DataDome

This is the most common question from those who are just starting to work with protected sites. Data center proxies are cheap, fast, and have high uptime. It seems like the perfect choice for scraping. But against DataDome, they are practically useless.

The reason is simple: DataDome maintains and uses ASN (autonomous system) databases of all major hosting providers. When a request comes from an IP address belonging to, for example, an Amazon Web Services or OVH subnet, the system immediately assigns it a "suspicious" status. Even if your scraper perfectly mimics human behavior — an IP from a data center already puts you at risk.
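
To make the idea concrete, here is a toy sketch of such a lookup using Python's standard `ipaddress` module. The network list is a placeholder built from documentation-only address ranges, not real hosting ranges, and DataDome's actual databases and logic are not public; the point is only that classifying an IP by its owner is a cheap check that happens before any behavior is observed.

```python
import ipaddress
from typing import Optional

# Placeholder networks (reserved documentation ranges), NOT real hosting ranges.
HOSTING_NETWORKS = {
    "example-cloud": ipaddress.ip_network("192.0.2.0/24"),
    "example-hosting": ipaddress.ip_network("198.51.100.0/24"),
}


def hosting_owner(ip: str) -> Optional[str]:
    """Return the hosting provider that owns this IP, or None if no match."""
    address = ipaddress.ip_address(ip)
    for owner, network in HOSTING_NETWORKS.items():
        if address in network:
            return owner
    return None
```

With this table, `hosting_owner("192.0.2.15")` matches the first placeholder network, while an address outside both ranges returns `None`.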

⚠️ Important to understand

Data center proxies are great for tasks where protection is weak or absent: scraping open data, working with APIs without anti-bot systems, speed testing. But on sites with DataDome, they are blocked in 90%+ of cases within the first few dozen requests.

Another problem is "burned" IPs. If thousands of users before you have used the same IP address for bot activity (and this is normal in pools of cheap data centers), DataDome already has a negative history for that address. Even the first request from such an IP may get blocked.

Residential proxies: the main tool for bypassing DataDome

Residential proxies are IP addresses that belong to real home internet users. They are issued by internet service providers (Rostelecom, Comcast, Deutsche Telekom, etc.), and from DataDome's perspective they look like ordinary people sitting at home at a computer.

This is why residential proxies are the primary working tool for scraping sites with DataDome. They pass the initial reputation check, giving you a "credit of trust" for further work.

What to consider when choosing residential proxies for DataDome

Parameter	What is important	Why this is critical
Rotation type	Rotation on each request or by session (5-30 minutes)	DataDome tracks IP history, so IP changes that are too frequent are also suspicious
Geolocation	IP from the country of the target site	Requests from another country are an additional signal of suspicion
Pool size	Millions of IPs, not thousands	A small pool burns out quickly — DataDome remembers active addresses
Sticky sessions	Ability to hold one IP for 10-30 minutes	For multi-page scraping, one session must look like one user
Speed	At least 5-10 Mbps per connection	Slow proxies increase request time, affecting timing

An important point: residential proxies do not guarantee 100% bypass of DataDome by themselves. They solve the IP reputation problem, but if your scraper makes 100 requests per minute from one address or sends incorrect headers — DataDome will still block you. The IP is just one layer of protection.

Mobile proxies: when maximum trust is needed

Mobile proxies are IP addresses from mobile operators (4G/5G networks). They have a unique property: one mobile operator's IP address can be used by thousands of real users simultaneously through NAT. DataDome knows this — and therefore treats mobile IPs with maximum trust.

Blocking a mobile IP means blocking potentially thousands of real customers of the operator — no normal site would do that. This is why mobile proxies provide the highest percentage of successful requests to sites with DataDome.

When to choose mobile proxies over residential ones:

The site is very aggressively protected — residential proxies result in blocks even at low request frequencies
You are scraping the mobile version of the site — mobile IP + mobile User-Agent look organic
Need to work with applications — if scraping a mobile API, mobile IP logically corresponds to the request
Long-term sessions — mobile proxies maintain sessions well without changing IP

The downside of mobile proxies is that they are more expensive than residential ones and usually have a smaller pool of IPs. For large-scale scraping with thousands of requests per hour, this can become a limitation. In such cases, the optimal strategy is to use mobile proxies for "reconnaissance" and complex pages, and residential ones for mass data collection.

Rotation and delay strategy: how not to get caught even with good proxies

Even with residential or mobile proxies, you can get blocked if you do not build your request strategy correctly. DataDome analyzes behavior at the session level — and anomalous patterns raise suspicion regardless of the quality of the IP.

Rules for safely scraping DataDome-protected sites

✅ Safe scraping checklist

Delays between requests: from 3 to 15 seconds (random, not fixed)
No more than 20-30 requests from one IP per session
Sticky session: keep one IP for one "user path"
Start with the homepage, then move to target URLs
Imitate real navigation: homepage → category → product
Use a proxy geolocation that matches the site's country and language
Change IP after each session or after a block
Do not launch parallel requests from one IP
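
Several of these rules (random delays, a request cap per session, a homepage-first path) can be captured in a small planning helper. This is a sketch under the numbers from the checklist; `plan_session` and its parameters are hypothetical names, and the actual sending of requests is left out.

```python
import random


def plan_session(urls, min_delay=3.0, max_delay=15.0, max_requests=30, rng=None):
    """Return (url, pause_before_request) pairs for one sticky session."""
    rng = rng or random.Random()
    plan = []
    # Requests beyond the per-session cap are left for the next session and IP.
    for index, url in enumerate(urls[:max_requests]):
        # No wait before the first request, a random pause before each later one.
        pause = 0.0 if index == 0 else rng.uniform(min_delay, max_delay)
        plan.append((url, pause))
    return plan


# One "user path": homepage, then category, then product (placeholder paths).
path = ["/", "/category/shoes", "/product/12345"]
session_plan = plan_session(path)
```

The caller would sleep for each pause and then send the request sequentially through the same proxy session, which also satisfies the rule about not running parallel requests from one IP.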

Rotation: when to change IP

There is no universal answer here — it all depends on the specific site. But the general logic is this: DataDome remembers the activity of an IP in a sliding window (usually 10-60 minutes). If a suspiciously high number of requests come from one address during that time — the IP receives a temporary ban.

The optimal strategy is to rotate IPs not by timer, but by the number of requests. For example: 15-25 requests → change IP → pause 30-60 seconds → new session. This approach imitates the behavior of different users, each of whom visited several pages and left.
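
Count-based rotation can be sketched as a small policy object. The class name, the random session identifier, and the idea that a new identifier maps to a new IP are assumptions for illustration; how a fresh IP is actually requested depends on the proxy provider.

```python
import random
import uuid


class RotationPolicy:
    """Track a per-session request budget and signal when to rotate."""

    def __init__(self, min_requests=15, max_requests=25, rng=None):
        self._rng = rng or random.Random()
        self._min = min_requests
        self._max = max_requests
        self.new_session()

    def new_session(self):
        # A fresh identifier stands in for "ask the provider for a new IP".
        self.session_id = uuid.uuid4().hex[:8]
        self.budget = self._rng.randint(self._min, self._max)
        self.used = 0

    def record_request(self):
        """Count one request; return the pause in seconds if a rotation happened."""
        self.used += 1
        if self.used >= self.budget:
            pause = self._rng.uniform(30, 60)
            self.new_session()
            return pause
        return 0.0
```

Drawing the budget at random within 15-25 avoids a fixed session length, which would itself be a regular pattern.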

Headers and fingerprint: what else DataDome checks besides IP

Good proxies are a necessary but not sufficient condition for bypassing DataDome. The system analyzes the entire request as a whole. If the IP is residential, but the headers indicate a bot — blocking will still occur.

Critically important headers

Here’s what DataDome checks in HTTP headers and what to pay attention to:

Header	What is checked	Typical mistake
`User-Agent`	Current browser version	Outdated UA or UA from Python libraries
`Accept-Language`	Language matches the proxy geo	Proxy from the USA, but language is ru-RU
`sec-ch-ua`	Matches User-Agent	Missing header when Chrome is declared
`Referer`	Logical chain of transitions	Direct request to a deep page without Referer
`Accept-Encoding`	Standard browser set	Absence or non-standard set
`Cookie`	Saving DataDome session cookies	Ignoring Set-Cookie from DataDome
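
Some of the typical mistakes from the table can be caught before a request is sent. The sketch below is a simple self-check with hypothetical names; it covers only a few rows of the table and does not reproduce DataDome's own checks, which are not public.

```python
def header_problems(headers, proxy_language):
    """Return a list of inconsistencies in a set of request headers."""
    h = {name.lower(): value for name, value in headers.items()}
    problems = []
    user_agent = h.get("user-agent", "").lower()
    # Default User-Agents of common HTTP clients give the scraper away at once.
    library_tags = ("python-requests", "python-urllib", "curl/")
    if not user_agent or any(tag in user_agent for tag in library_tags):
        problems.append("User-Agent missing or from an HTTP library")
    if "chrome/" in user_agent and "sec-ch-ua" not in h:
        problems.append("Chrome declared but sec-ch-ua missing")
    if not h.get("accept-language", "").lower().startswith(proxy_language.lower()):
        problems.append("Accept-Language does not match proxy geo")
    if "accept-encoding" not in h:
        problems.append("Accept-Encoding missing")
    return problems
```

For example, a request with the `python-requests` default User-Agent and `Accept-Language: ru-RU` sent through a US proxy would be flagged on three counts.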

Special attention should be paid to DataDome cookies. Upon the first request, the system sets its cookie (usually called datadome). If your scraper does not save and send this cookie in subsequent requests — DataDome perceives each request as the first visit of a new user, which is itself suspicious at a high frequency.
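
Here is a minimal sketch of that save-and-resend cycle using only the standard library. The class name is hypothetical and the cookie value in the usage lines is a placeholder. In practice, an HTTP client with a cookie jar (for example, a `requests.Session`) does this automatically, as long as the same session object is reused for the whole "user path".

```python
from http.cookies import SimpleCookie


class DataDomeCookieStore:
    """Remember the datadome cookie and replay it on later requests."""

    def __init__(self):
        self.value = None

    def update(self, set_cookie_header):
        # Parse one Set-Cookie header and keep only the datadome cookie.
        cookie = SimpleCookie()
        cookie.load(set_cookie_header)
        if "datadome" in cookie:
            self.value = cookie["datadome"].value

    def request_headers(self):
        # Nothing to send until the site has issued the cookie.
        return {"Cookie": f"datadome={self.value}"} if self.value else {}


store = DataDomeCookieStore()
store.update("datadome=abc123; Max-Age=31536000; Path=/; Secure; SameSite=Lax")
next_headers = store.request_headers()
```

The store should live exactly as long as one session: when the IP is rotated, start with a fresh store so that the cookie and the address always belong to the same "user".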

TLS fingerprint

DataDome's advanced protection also analyzes the TLS fingerprint — characteristics of the SSL/TLS handshake. Different HTTP libraries (requests, curl, axios) have characteristic sets of cipher suites and TLS extensions that differ from those of browsers. If you use the standard Python requests library — its TLS fingerprint is easily identifiable. The solution is to use libraries that simulate browser TLS (for example, curl-impersonate or specialized solutions).
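
To show why a library is so easy to tell apart from a browser, the sketch below computes a JA3-style fingerprint, one publicly documented way of hashing TLS handshake parameters. This article does not establish which scheme DataDome actually uses, and the numeric values in the example are illustrative, not captured from real clients.

```python
import hashlib


def ja3_string(tls_version, ciphers, extensions, curves, point_formats):
    """Join ClientHello parameters in JA3 order: fields by ',', values by '-'."""
    fields = [
        str(tls_version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return ",".join(fields)


def ja3_hash(tls_version, ciphers, extensions, curves, point_formats):
    """MD5 hex digest of the JA3 string."""
    text = ja3_string(tls_version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()


# Two made-up clients that differ only in cipher order get different hashes.
client_a = ja3_hash(771, [4865, 4866, 4867], [0, 23, 65281], [29, 23, 24], [0])
client_b = ja3_hash(771, [4866, 4865, 4867], [0, 23, 65281], [29, 23, 24], [0])
```

Because the hash depends on the exact list and order of cipher suites and extensions, changing the User-Agent header alone does nothing to it; the TLS stack itself has to behave like a browser.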

Tools for working with DataDome-protected sites

Choosing the right tool for scraping is just as important as choosing proxies. Different tasks require different approaches. Let's consider the main options in terms of compatibility with DataDome.

Browser automation (Puppeteer, Playwright)

Headless browsers should theoretically work well with DataDome, as they execute JavaScript and form a "real" fingerprint. In practice, stock Puppeteer or Playwright is easily identified by characteristic parameters: navigator.webdriver = true, absence of plugins, non-standard WebGL values. Getting past these checks requires additional masking, for example with plugins such as puppeteer-extra-plugin-stealth.

Anti-detect browsers

For tasks that require full interaction with the site (performing actions in a session, not just fetching pages), anti-detect browsers are the optimal choice. Dolphin Anty, AdsPower, GoLogin, and Multilogin create complete browser profiles with realistic fingerprints. In conjunction with residential or mobile proxies, they give the highest success rate against DataDome among the approaches covered here.

The connection scheme in an anti-detect browser is standard: create a profile → specify the proxy type (HTTP/SOCKS5), host, port, username, and password from the proxy service in the settings → launch the profile. Each profile operates in an isolated environment with a unique fingerprint.
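
Script-based tools usually take the same five fields as a single proxy URL. A minimal sketch of assembling one follows; the function name is hypothetical and the host and credentials in the usage note are placeholders.

```python
from urllib.parse import quote


def proxy_url(scheme, host, port, username, password):
    """Build scheme://user:pass@host:port from proxy service credentials."""
    if scheme not in ("http", "socks5"):
        raise ValueError(f"unsupported proxy type: {scheme}")
    # Percent-encode credentials so characters like '@' or ':' do not break the URL.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"{scheme}://{user}:{secret}@{host}:{port}"
```

For example, `proxy_url("socks5", "proxy.example.com", 1080, "user", "p@ss:word")` produces a URL in which the password is encoded as `p%40ss%3Aword`.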

Specialized scraping services

There are ready-made services (ScrapingBee, Apify, Bright Data Scraping Browser) that handle all the work of bypassing protections — you simply provide the URL and receive HTML. They use their own pools of residential proxies and automatically solve CAPTCHAs. The downside is the high cost for large volumes and less control over the process.

Comparison of approaches

Tool	Effectiveness against DataDome	Setup complexity	Scalability
HTTP parser + residential proxies	Average	Low	High
Puppeteer/Playwright + stealth + proxies	High	Medium	Medium
Anti-detect browser + mobile proxies	Very high	Low	Low
Ready-made scraping services	High	Very low	High (expensive)
Data center proxies (any tool)	Very low	—	—

Practical scenario: price monitoring on a protected site

Suppose you are monitoring competitor prices on a foreign marketplace protected by DataDome. You need to collect data on 5000 products every 6 hours. Here’s the optimal scheme:

Tool: Playwright with a stealth plugin (the browser engine executes the JS challenge, the plugin masks automation markers)
Proxies: Residential with rotation, geolocation — country of the target site
Session: Sticky for 15 minutes, 20 requests per IP
Headers: Current Chrome User-Agent, correct Accept-Language
Cookies: Saving and transmitting DataDome cookies between requests of one session
Delays: Random from 4 to 12 seconds between requests
Session start: Always start from the homepage, then move to products

With this setup, the success rate of requests is typically 85-95%, which is sufficient for regular monitoring. Retry the remaining 5-15% through a different IP.
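
It is worth checking that these numbers fit into the 6-hour window. The sketch below does the arithmetic under simplifying assumptions: one request per product, the average of the delay range as the pause, and no extra requests for the homepage or category pages. The function and its parameter names are illustrative.

```python
import math


def plan_capacity(products, window_hours, min_delay, max_delay,
                  requests_per_ip, retry_percent=0):
    """Estimate sessions and parallelism needed to finish inside the window."""
    total_requests = products + math.ceil(products * retry_percent / 100)
    average_delay = (min_delay + max_delay) / 2
    sequential_seconds = total_requests * average_delay
    return {
        "total_requests": total_requests,
        # Each session uses its own IP and a fixed request budget.
        "sessions": math.ceil(total_requests / requests_per_ip),
        "sequential_hours": sequential_seconds / 3600,
        # Sessions that must run side by side, each on a different IP.
        "parallel_sessions": math.ceil(sequential_seconds / (window_hours * 3600)),
    }


baseline = plan_capacity(5000, 6, 4, 12, 20)
with_retries = plan_capacity(5000, 6, 4, 12, 20, retry_percent=15)
```

With an 8-second average delay, 5000 requests take about 11.1 hours if sent one after another, so a single session chain cannot finish in 6 hours. Under these assumptions, the job needs 250 sessions of 20 requests each, with at least 2 running in parallel on different IPs, or 3 if 15% of requests have to be retried.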

Conclusion and recommendations

DataDome is a serious protection system, but not insurmountable. The key to successful work with sites under its protection is a comprehensive approach: the right type of proxy, correct headers, realistic behavior, and a well-thought-out rotation strategy.

The main conclusions of the article:

Data center proxies do not work against DataDome — they are blocked at the IP reputation level
Residential proxies are the basic tool for most scraping tasks
Mobile proxies provide maximum trust and are suitable for aggressively protected sites
Good proxies are only part of the solution: headers, cookies, and behavior are equally important
Anti-detect browsers in conjunction with quality proxies yield the best results
The rotation and delay strategy is critically important — even with residential proxies, you can get banned with aggressive scraping

If you are engaged in price monitoring, scraping product cards, or collecting data from sites protected by DataDome, we recommend starting with residential proxies — they provide the optimal balance between the quality of bypassing protection and cost. For tasks requiring the highest level of trust from anti-bot systems, consider mobile proxies — especially if you are working with mobile versions of sites or mobile application APIs.