Modern websites have learned to recognize automated browsers (Selenium, Puppeteer, Playwright) and block their requests. Headless browsers leave dozens of digital traces that anti-bot systems can use to detect automation in milliseconds. In this guide, we will explore all methods for masking headless browsers with code examples in Python and JavaScript, so your scrapers can operate reliably without being blocked.
This article is intended for developers engaged in web scraping, test automation, or data collection from protected websites. We will discuss the technical details of detection and practical solutions for bypassing protection.
How Websites Detect Headless Browsers: Main Methods
Anti-bot systems use multi-layered browser checks, analyzing hundreds of parameters. Headless browsers differ from regular ones in many ways that cannot be hidden by simply changing the User-Agent. Understanding detection methods is the first step to effective masking.
JavaScript Automation Markers
The most common method is checking JavaScript properties that only appear in automated browsers:
- navigator.webdriver: returns true in Selenium and Puppeteer
- window.chrome: absent in headless Chrome
- navigator.plugins.length: equals 0 in headless mode
- navigator.languages: often an empty array, or contains only "en-US"
- navigator.permissions: the API behaves differently in headless mode
Analyzing Chrome DevTools Protocol
Puppeteer and Playwright control the browser through the Chrome DevTools Protocol (CDP). The presence of a CDP connection can be detected through specific JavaScript checks that analyze window.cdc_ objects or check for anomalies in mouse and keyboard event behavior.
Canvas and WebGL Fingerprinting
Headless browsers generate identical Canvas and WebGL fingerprints as they use software rendering instead of hardware. Anti-bot systems create an invisible Canvas element, draw text or shapes on it, and compute the image hash. If thousands of users have the same hash, it indicates bots.
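To illustrate why identical rendering is a problem, here is a minimal Python sketch of the hashing step a fingerprinting script performs over the rendered pixels (the pixel data here is fabricated for the example):

```python
import hashlib

def canvas_fingerprint(pixels: bytes) -> str:
    """Hash raw RGBA pixel bytes, the way a fingerprinting script
    hashes a rendered canvas (illustrative sketch)."""
    return hashlib.sha256(pixels).hexdigest()

# Two browsers with identical software rendering produce identical pixels...
render_a = bytes([120, 64, 200, 255] * 100)
render_b = bytes([120, 64, 200, 255] * 100)
assert canvas_fingerprint(render_a) == canvas_fingerprint(render_b)

# ...while a single altered pixel channel changes the hash completely.
render_c = bytearray(render_a)
render_c[0] ^= 1
assert canvas_fingerprint(bytes(render_c)) != canvas_fingerprint(render_a)
```

This is why the noise-injection technique shown later in this guide works: perturbing even a few pixel values gives every session a distinct hash.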
Behavioral Analysis
Modern systems (DataDome, PerimeterX, Cloudflare Bot Management) analyze mouse movements, scrolling speed, and click patterns. Headless browsers perform actions instantly and without natural delays, which reveals automation. Events are also analyzed: in a regular browser, a mousemove event always occurs before a click, while bots often click without prior mouse movement.
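The timing and cursor behavior described above can be approximated in any driver by generating human-like pauses and curved mouse paths before clicking. A minimal Python sketch (the distributions and offsets are illustrative assumptions, not measured human data):

```python
import random

def human_delay(base_ms: float = 150.0, jitter_ms: float = 100.0) -> float:
    """Random pause in milliseconds mimicking human reaction time."""
    return max(30.0, random.gauss(base_ms, jitter_ms / 3))

def mouse_path(start, end, steps=20):
    """Points along a quadratic Bezier curve from start to end,
    approximating a curved (non-linear) human mouse movement."""
    (x0, y0), (x2, y2) = start, end
    # A randomly offset control point makes the path curved, not a straight line
    x1 = (x0 + x2) / 2 + random.uniform(-80, 80)
    y1 = (y0 + y2) / 2 + random.uniform(-80, 80)
    points = []
    for i in range(steps + 1):
        t = i / steps
        x = (1 - t) ** 2 * x0 + 2 * (1 - t) * t * x1 + t ** 2 * x2
        y = (1 - t) ** 2 * y0 + 2 * (1 - t) * t * y1 + t ** 2 * y2
        points.append((x, y))
    return points

path = mouse_path((0, 0), (400, 300))
```

Feeding these points to the driver's mouse-move API (with a `human_delay()` pause between steps) produces a mousemove trail before every click, which is exactly what behavioral analyzers look for.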
Important: Modern anti-bot systems use machine learning to analyze hundreds of parameters simultaneously. Masking just one parameter (e.g., User-Agent) will not protect against blocking — a comprehensive approach is needed.
Removing navigator.webdriver and Other JavaScript Markers
The navigator.webdriver property is the simplest way to detect Selenium and other WebDriver-based tools. In a regular browser this property returns false (or undefined in older versions), while in an automated one it returns true. It can be hidden by executing JavaScript code before the page loads.
Selenium (Python): Removing WebDriver via CDP
For Selenium, you need to use the Chrome DevTools Protocol to execute JavaScript before loading any pages:
```python
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument('--disable-blink-features=AutomationControlled')

driver = webdriver.Chrome(options=options)

# Remove navigator.webdriver via CDP before any page script runs
driver.execute_cdp_cmd('Page.addScriptToEvaluateOnNewDocument', {
    'source': '''
        Object.defineProperty(navigator, 'webdriver', {
            get: () => undefined
        });
    '''
})

driver.get('https://example.com')
```
The --disable-blink-features=AutomationControlled option disables the Blink feature that marks the browser as automated (the flag responsible for exposing navigator.webdriver as true). This is basic protection that should be combined with other methods.
Puppeteer (Node.js): Masking via Page.evaluateOnNewDocument
In Puppeteer, the page.evaluateOnNewDocument() method is used to execute code before the page loads:
```javascript
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    args: ['--disable-blink-features=AutomationControlled']
  });
  const page = await browser.newPage();

  // Remove webdriver and other markers
  await page.evaluateOnNewDocument(() => {
    Object.defineProperty(navigator, 'webdriver', {
      get: () => undefined
    });
    // Add a chrome object
    window.chrome = {
      runtime: {}
    };
    // Emulate plugins
    Object.defineProperty(navigator, 'plugins', {
      get: () => [1, 2, 3, 4, 5]
    });
  });

  await page.goto('https://example.com');
})();
```
Playwright: Built-in Masking Options
Playwright has more advanced masking out of the box, but additional configuration is still necessary:
```javascript
const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch({
    headless: true,
    args: ['--disable-blink-features=AutomationControlled']
  });
  const context = await browser.newContext({
    userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
  });
  const page = await context.newPage();

  await page.addInitScript(() => {
    Object.defineProperty(navigator, 'webdriver', {
      get: () => undefined
    });
  });

  await page.goto('https://example.com');
})();
```
Masking Chrome DevTools Protocol
Puppeteer and Playwright leave traces of the CDP connection that can be detected through the analysis of window objects. Anti-bot systems look for variables prefixed with cdc_, $cdc_, or __webdriver_, which are created by Chrome when connected via the DevTools Protocol.
Removing CDP Variables
The following script removes all automation-related variables from the window object:
```javascript
await page.evaluateOnNewDocument(() => {
  // Remove all variables with automation-related prefixes
  const cdcProps = Object.keys(window).filter(prop =>
    prop.includes('cdc_') ||
    prop.includes('$cdc_') ||
    prop.includes('__webdriver_')
  );
  cdcProps.forEach(prop => {
    delete window[prop];
  });

  // Override document.querySelector to hide CDP artifacts
  const originalQuery = document.querySelector;
  document.querySelector = function (selector) {
    if (selector.includes('cdc_')) return null;
    return originalQuery.call(this, selector);
  };
});
```
Using Patched Versions of Chromium
There are modified builds of Chromium that do not leave CDP traces. For example, the puppeteer-extra library with the puppeteer-extra-plugin-stealth plugin automatically applies dozens of patches to mask CDP.
Configuring User-Agent and HTTP Headers
Headless browsers use outdated or unrealistic User-Agent strings. For example, Puppeteer by default adds the word "HeadlessChrome" to the User-Agent. Additionally, you need to configure extra headers that are present in requests from regular browsers.
Relevant User-Agent for Masking
Use fresh User-Agent strings from real browsers. Here are examples for 2024:
```text
# Chrome 120 on Windows 10
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36

# Chrome 120 on macOS
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36

# Firefox 121 on Windows
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0
```
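Whichever string you pick, keep the User-Agent, platform, and Accept-Language mutually consistent, since a mismatch between them is itself a bot signal. A small Python sketch of a coherent profile pool (the entries are illustrative):

```python
import random

# Each profile keeps userAgent, platform, and Accept-Language consistent;
# mixing values from different profiles is a detectable anomaly.
PROFILES = [
    {
        "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "platform": "Win32",
        "accept_language": "en-US,en;q=0.9",
    },
    {
        "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "platform": "MacIntel",
        "accept_language": "en-US,en;q=0.9",
    },
]

def pick_profile() -> dict:
    """Pick one coherent profile instead of randomizing each header separately."""
    return random.choice(PROFILES)

profile = pick_profile()
```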
Configuring Headers in Selenium
```python
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

ua = ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
      '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36')

options = Options()
options.add_argument(f'user-agent={ua}')
driver = webdriver.Chrome(options=options)

# Additional overrides via CDP (keep the UA identical to the one above)
driver.execute_cdp_cmd('Network.setUserAgentOverride', {
    'userAgent': ua,
    'platform': 'Win32',
    'acceptLanguage': 'en-US,en;q=0.9'
})
```
Configuring Headers in Puppeteer
```javascript
await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36');

await page.setExtraHTTPHeaders({
  'Accept-Language': 'en-US,en;q=0.9',
  'Accept-Encoding': 'gzip, deflate, br',
  'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
  'Sec-Fetch-Site': 'none',
  'Sec-Fetch-Mode': 'navigate',
  'Sec-Fetch-User': '?1',
  'Sec-Fetch-Dest': 'document'
});
```
The Sec-Fetch-* headers are critically important — they appeared in Chrome 76+ and their absence indicates old browser versions or bots.
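One way to keep these headers consistent is to build the full navigation header set in one place rather than scattering individual overrides. A Python sketch (the values mirror what Chrome sends for a top-level, address-bar navigation):

```python
def navigation_headers(user_agent: str,
                       accept_language: str = "en-US,en;q=0.9") -> dict:
    """Header set resembling a Chrome top-level navigation.
    Sec-Fetch-* values match a user-initiated address-bar navigation."""
    return {
        "User-Agent": user_agent,
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,"
                  "image/webp,*/*;q=0.8",
        "Accept-Language": accept_language,
        "Accept-Encoding": "gzip, deflate, br",
        "Sec-Fetch-Site": "none",       # navigation not triggered by any site
        "Sec-Fetch-Mode": "navigate",
        "Sec-Fetch-User": "?1",         # user-activated (e.g. typed URL)
        "Sec-Fetch-Dest": "document",
    }

headers = navigation_headers(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)
```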
Emulating Canvas and WebGL Fingerprint
Canvas and WebGL fingerprinting is a powerful detection method, as headless browsers generate identical fingerprints. Anti-bot systems create an invisible Canvas, draw text on it, and compute the pixel hash. If thousands of requests have the same hash, they are bots.
Adding Noise to Canvas
The following script adds random noise to the Canvas fingerprint, making each request unique:
```javascript
await page.evaluateOnNewDocument(() => {
  const originalToDataURL = HTMLCanvasElement.prototype.toDataURL;
  const originalGetImageData = CanvasRenderingContext2D.prototype.getImageData;

  // Add slight random noise to each pixel's RGB channels
  const addNoise = (canvas, context) => {
    const imageData = originalGetImageData.call(context, 0, 0, canvas.width, canvas.height);
    for (let i = 0; i < imageData.data.length; i += 4) {
      imageData.data[i] += Math.floor(Math.random() * 10) - 5;     // R
      imageData.data[i + 1] += Math.floor(Math.random() * 10) - 5; // G
      imageData.data[i + 2] += Math.floor(Math.random() * 10) - 5; // B
    }
    context.putImageData(imageData, 0, 0);
  };

  HTMLCanvasElement.prototype.toDataURL = function () {
    const context = this.getContext('2d');
    if (context) addNoise(this, context); // skip non-2D (e.g. WebGL) canvases
    return originalToDataURL.apply(this, arguments);
  };
});
```

For complete coverage, the same wrapping should also be applied to toBlob and getImageData, since fingerprinting scripts may read pixels through any of the three.
Emulating WebGL Parameters
WebGL reveals information about the graphics card and drivers. In headless mode, these parameters reveal software rendering:
```javascript
await page.evaluateOnNewDocument(() => {
  const getParameter = WebGLRenderingContext.prototype.getParameter;
  WebGLRenderingContext.prototype.getParameter = function (parameter) {
    // Emulate a real graphics card
    if (parameter === 37445) {      // UNMASKED_VENDOR_WEBGL
      return 'Intel Inc.';
    }
    if (parameter === 37446) {      // UNMASKED_RENDERER_WEBGL
      return 'Intel Iris OpenGL Engine';
    }
    return getParameter.call(this, parameter);
  };
});
```
Parameter 37445 is UNMASKED_VENDOR_WEBGL and 37446 is UNMASKED_RENDERER_WEBGL, both exposed by the WEBGL_debug_renderer_info extension. In headless mode they return values such as "Google SwiftShader," which reveals software rendering and, by extension, automation.
Selenium Stealth: Ready-made Solutions for Python
The selenium-stealth library automatically applies dozens of patches to mask Selenium. This is the simplest solution for Python developers that does not require manual configuration of each parameter.
Installation and Basic Setup
```shell
pip install selenium-stealth
```

```python
from selenium import webdriver
from selenium_stealth import stealth
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")

driver = webdriver.Chrome(options=options)

# Apply stealth patches
stealth(driver,
    languages=["en-US", "en"],
    vendor="Google Inc.",
    platform="Win32",
    webgl_vendor="Intel Inc.",
    renderer="Intel Iris OpenGL Engine",
    fix_hairline=True,
)

driver.get("https://bot.sannysoft.com")
driver.save_screenshot("test.png")
```
The library automatically removes navigator.webdriver, adds window.chrome, emulates plugins, masks WebGL, and applies over 20 additional patches. This covers 80% of detection cases.
Advanced Configuration with Proxies
For complete masking, combine selenium-stealth with residential proxies — they provide real IP addresses of home users, which is critical for bypassing advanced anti-bot systems:
```python
from selenium import webdriver
from selenium.webdriver.common.proxy import Proxy, ProxyType
from selenium_stealth import stealth

proxy = Proxy()
proxy.proxy_type = ProxyType.MANUAL
proxy.http_proxy = "ip:port"
proxy.ssl_proxy = "ip:port"

# In Selenium 4, desired_capabilities was removed;
# attach the proxy to Options instead
options = webdriver.ChromeOptions()
options.proxy = proxy

driver = webdriver.Chrome(options=options)
stealth(driver, languages=["en-US", "en"], vendor="Google Inc.", platform="Win32")
```
Puppeteer Extra Stealth Plugin for Node.js
For Puppeteer, there is the puppeteer-extra-plugin-stealth, which is the most advanced solution for masking headless browsers in the JavaScript ecosystem. It contains 23 independent modules, each masking a specific aspect of automation.
Installation and Basic Usage
```shell
npm install puppeteer-extra puppeteer-extra-plugin-stealth
```

```javascript
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    args: ['--no-sandbox', '--disable-setuid-sandbox']
  });
  const page = await browser.newPage();
  await page.goto('https://bot.sannysoft.com');
  await page.screenshot({ path: 'test.png' });
  await browser.close();
})();
```
What the Stealth Plugin Masks
The plugin automatically applies the following patches:
- Removing navigator.webdriver
- Adding the window.chrome object
- Emulating the navigator.permissions API
- Masking navigator.plugins and navigator.mimeTypes
- Emulating Canvas and WebGL fingerprints
- Masking User-Agent and languages
- Fixing anomalies in iframe contentWindow
- Emulating the Battery API, Media Devices, and WebRTC
Configuration with Proxies and Additional Parameters
```javascript
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
const AdblockerPlugin = require('puppeteer-extra-plugin-adblocker');

puppeteer.use(StealthPlugin());
puppeteer.use(AdblockerPlugin({ blockTrackers: true }));

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    args: [
      '--proxy-server=http://your-proxy:port',
      '--disable-web-security',
      '--disable-features=IsolateOrigins,site-per-process'
    ]
  });
  const page = await browser.newPage();

  // Proxy authentication
  await page.authenticate({
    username: 'your-username',
    password: 'your-password'
  });

  await page.setViewport({ width: 1920, height: 1080 });
  await page.goto('https://example.com', { waitUntil: 'networkidle2' });
})();
```
Playwright: Configuring Anti-detection
Playwright has a more sophisticated architecture than Puppeteer and leaves fewer automation traces. However, additional configuration is still needed to bypass advanced anti-bot systems. The playwright-extra wrapper brings the plugin system to Playwright and reuses the stealth plugin from the Puppeteer ecosystem.
Installing playwright-extra
```shell
npm install playwright-extra puppeteer-extra-plugin-stealth playwright
```

```javascript
const { chromium } = require('playwright-extra');
// playwright-extra reuses the stealth plugin from the Puppeteer ecosystem
const stealth = require('puppeteer-extra-plugin-stealth')();
chromium.use(stealth);

(async () => {
  const browser = await chromium.launch({ headless: true });
  const context = await browser.newContext({
    userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    viewport: { width: 1920, height: 1080 },
    locale: 'en-US',
    timezoneId: 'America/New_York'
  });
  const page = await context.newPage();
  await page.goto('https://bot.sannysoft.com');
  await page.screenshot({ path: 'test.png' });
  await browser.close();
})();
```
Configuring Browser Context for Maximum Masking
Playwright allows creating isolated browser contexts with individual settings. This is useful for multi-accounting or parallel scraping:
```javascript
const context = await browser.newContext({
  userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
  viewport: { width: 1440, height: 900 },
  locale: 'en-US',
  timezoneId: 'America/Los_Angeles',
  permissions: ['geolocation'],
  geolocation: { latitude: 37.7749, longitude: -122.4194 },
  colorScheme: 'light',
  deviceScaleFactor: 2,
  isMobile: false,
  hasTouch: false,
  // Proxy configuration for this context
  proxy: {
    server: 'http://your-proxy:port',
    username: 'user',
    password: 'pass'
  }
});
```
The geolocation and timezoneId parameters must match the proxy's IP address; otherwise anti-bot systems will detect the mismatch (for example, a California IP combined with a New York timezone).
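A simple way to enforce this consistency is to derive the context settings from the proxy's region instead of setting them independently. A Python sketch (the region labels and mapping are assumptions for the example):

```python
# Illustrative mapping: region of the proxy's exit IP -> browser timezone.
# The region keys here are hypothetical labels, not a provider API.
REGION_TIMEZONES = {
    "us-west": "America/Los_Angeles",
    "us-east": "America/New_York",
    "de": "Europe/Berlin",
}

def context_settings(proxy_region: str) -> dict:
    """Build timezoneId and locale consistent with the proxy's geography."""
    tz = REGION_TIMEZONES.get(proxy_region)
    if tz is None:
        raise ValueError(f"no timezone configured for region {proxy_region!r}")
    locale = "en-US" if proxy_region.startswith("us") else "de-DE"
    return {"timezoneId": tz, "locale": locale}

settings = context_settings("us-west")
```

Passing `settings` into `browser.newContext(...)` guarantees the timezone can never drift out of sync with the proxy you chose.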
Proxy Rotation to Reduce Blocking Risk
Even a perfectly masked headless browser can be blocked if it uses one IP address for hundreds of requests. Modern anti-bot systems analyze the request frequency from a single IP and block suspicious activity. Proxy rotation is an essential element of protection when scraping.
Types of Proxies for Scraping: Comparison
| Proxy Type | Speed | Trust Score | Best for |
|---|---|---|---|
| Datacenter | Very High (50-200 ms) | Low | Simple sites, bulk scraping |
| Residential | Medium (300-1000 ms) | High | Protected sites, social networks |
| Mobile | Low (500-2000 ms) | Very High | Mobile apps, Instagram, TikTok |
For scraping protected sites (marketplaces, social networks, advertising platforms), residential proxies are recommended, as they have real home user IPs and do not get blacklisted by anti-bot systems.
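The table above can be encoded as a simple lookup that picks a proxy type per target, defaulting to the higher-trust option when in doubt. A Python sketch with illustrative category labels:

```python
def recommend_proxy(target: str) -> str:
    """Map a target category to a proxy type, following the comparison table.
    The category labels are illustrative, not an exhaustive taxonomy."""
    rules = {
        "simple-site": "datacenter",
        "bulk-scraping": "datacenter",
        "protected-site": "residential",
        "social-network": "residential",
        "mobile-app": "mobile",
    }
    # Default to residential: slower than datacenter, but safest on trust score
    return rules.get(target, "residential")

choice = recommend_proxy("protected-site")
```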
Implementing Proxy Rotation in Puppeteer
```javascript
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());

const proxyList = [
  'http://user1:pass1@proxy1:port',
  'http://user2:pass2@proxy2:port',
  'http://user3:pass3@proxy3:port'
];

async function scrapeWithRotation(urls) {
  for (let i = 0; i < urls.length; i++) {
    const proxy = proxyList[i % proxyList.length];
    const browser = await puppeteer.launch({
      headless: true,
      args: [`--proxy-server=${proxy}`]
    });
    const page = await browser.newPage();
    try {
      await page.goto(urls[i], { waitUntil: 'networkidle2' });
      const data = await page.evaluate(() => document.body.innerText);
      console.log(data);
    } catch (error) {
      console.error(`Error on ${urls[i]}:`, error);
    } finally {
      await browser.close();
    }
    // Delay between requests
    await new Promise(resolve => setTimeout(resolve, 2000));
  }
}

scrapeWithRotation([
  'https://example1.com',
  'https://example2.com',
  'https://example3.com'
]);
```
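The same round-robin selection works identically in Python, and itertools.cycle expresses it compactly. A sketch with hypothetical proxy endpoints:

```python
from itertools import cycle

# Hypothetical proxy endpoints; same round-robin idea as the rotation script above
PROXIES = [
    "http://user1:pass1@proxy1:8000",
    "http://user2:pass2@proxy2:8000",
    "http://user3:pass3@proxy3:8000",
]

rotation = cycle(PROXIES)

def next_proxy() -> str:
    """Return the next proxy in round-robin order, wrapping around the list."""
    return next(rotation)

# Each scraped URL gets the next proxy in the cycle
assigned = [next_proxy() for _ in range(5)]
```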
Rotation via Session-based Proxies
Some proxy providers (including ProxyCove) offer session-based rotation — each request automatically receives a new IP without needing to restart the browser. This is implemented through a special proxy URL format:
```javascript
// Format: username-session-RANDOM:password@gateway:port
const generateSessionProxy = () => {
  const sessionId = Math.random().toString(36).substring(7);
  return `http://username-session-${sessionId}:password@gateway.proxycove.com:12321`;
};

const browser = await puppeteer.launch({
  args: [`--proxy-server=${generateSessionProxy()}`]
});
```
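The same session-ID scheme is easy to generate in Python. A sketch (the username-session-ID format, gateway host, and port follow the provider-specific pattern shown above and will differ between providers):

```python
import random
import string

def session_proxy(username: str, password: str,
                  gateway: str = "gateway.proxycove.com",
                  port: int = 12321) -> str:
    """Build a session-based proxy URL; a fresh session ID requests a new IP."""
    session_id = "".join(
        random.choices(string.ascii_lowercase + string.digits, k=8)
    )
    return f"http://{username}-session-{session_id}:{password}@{gateway}:{port}"

url = session_proxy("user", "pass")
```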
How to Test Masking Quality: Testing Tools
After configuring masking, it is necessary to check how well your headless browser mimics a regular user. There are several specialized websites that analyze dozens of browser parameters and show what automation traces remain.
Main Testing Tools
- bot.sannysoft.com — checks 15+ detection parameters, including webdriver, Chrome object, plugins, Canvas
- arh.antoinevastel.com/bots/areyouheadless — specializes in detecting headless Chrome
- pixelscan.net — advanced fingerprint analysis with visualization of all parameters
- abrahamjuliot.github.io/creepjs — the most detailed analysis (200+ parameters), shows browser trust level
- iphey.com — checks IP address for proxy and VPN affiliation
Automating Testing
Create a script for automatic masking checks after each configuration change:
```javascript
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());

async function testStealth() {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();

  // Test 1: Sannysoft
  await page.goto('https://bot.sannysoft.com');
  await page.screenshot({ path: 'test-sannysoft.png', fullPage: true });

  // Test 2: Are you headless
  await page.goto('https://arh.antoinevastel.com/bots/areyouheadless');
  const passedHeadlessCheck = await page.evaluate(() =>
    document.body.innerText.includes('You are not Chrome headless')
  );
  console.log('Headless detected:', !passedHeadlessCheck);

  // Test 3: webdriver property
  const webdriverPresent = await page.evaluate(() => navigator.webdriver);
  console.log('navigator.webdriver:', webdriverPresent);

  // Test 4: chrome object
  const chromePresent = await page.evaluate(() => !!window.chrome);
  console.log('window.chrome present:', chromePresent);

  await browser.close();
}

testStealth();
```
Successful Masking Checklist
Your headless browser is correctly masked if:
- navigator.webdriver returns undefined
- window.chrome exists and contains the runtime object
- navigator.plugins.length is greater than 0
- WebGL vendor and renderer show a real graphics card, not SwiftShader
- Canvas fingerprint is unique for each session
- User-Agent matches the current version of Chrome/Firefox
- Proxy IP address is not blacklisted (check via iphey.com)
- Timezone and locale match the geolocation of the IP address
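The checklist above lends itself to automation: collect each check's boolean result and report what failed after every configuration change. A minimal Python sketch with hypothetical check names:

```python
def masking_report(checks: dict) -> list:
    """Return the names of failed checks from a checklist run.
    `checks` maps check name -> bool (True = passed)."""
    return [name for name, passed in checks.items() if not passed]

# Hypothetical results gathered from the testing tools listed earlier
results = {
    "webdriver_hidden": True,
    "chrome_object_present": True,
    "plugins_nonempty": False,       # e.g. navigator.plugins.length == 0
    "webgl_not_swiftshader": True,
    "timezone_matches_ip": True,
}

failed = masking_report(results)  # checks that still expose automation
```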
Conclusion
Masking headless browsers is a complex task that requires attention to dozens of parameters. Modern anti-bot systems use machine learning and analyze hundreds of browser characteristics simultaneously, so simply changing the User-Agent no longer works. For successful scraping, it is necessary to combine several protection methods.
Key elements of effective masking include removing JavaScript automation markers (navigator.webdriver, CDP variables), emulating Canvas and WebGL fingerprints, configuring realistic HTTP headers, and using quality proxies. Ready-made solutions (selenium-stealth for Python, puppeteer-extra-plugin-stealth for Node.js) cover 80% of cases, but additional configuration is required to bypass advanced protections.
A critically important point is the choice of proxies. Even a perfectly masked browser will be blocked if it uses IP addresses from blacklists or makes too many requests from a single IP. For scraping protected sites, we recommend using residential proxies with automatic rotation — they provide a high trust score and minimal risk of blocking, as they use real home user IPs instead of server addresses.
Regularly test the quality of masking through specialized services (bot.sannysoft.com, pixelscan.net) and adapt settings to changes in anti-bot systems. Scraping is a constant arms race between bot developers and protection creators, so a configuration that works today may require updating in a few months.