
Proxies for Airline Ticket Price Scraping: How to Parse Aviasales, Skyscanner, and Google Flights Without Getting Blocked

Aviation websites instantly block automated requests. Find out which proxies can help gather current ticket prices without bans and CAPTCHAs.

📅 March 15, 2026

Flight websites are among the most aggressively protected resources on the internet. Stale prices, captchas, and instant IP blocks all make collecting fare data a real challenge. If you are building an aggregator, monitoring prices for clients, or searching for cheap routes automatically, you won't last an hour without properly configured proxies. In this article, we will discuss which proxies work, how to configure them, and why some types fail where others succeed.

Why flight websites block scraping so quickly

The aviation industry operates with dynamic pricing: fares change dozens of times a day depending on demand, time of day, browser history, and even the user's geolocation. That's why major aggregators such as Aviasales, Skyscanner, Kayak, and Google Flights invest huge resources in protection against automated requests.

Here's what happens when you try to collect data without proxies or with cheap datacenter IPs:

  • Instant IP blocking: most flight websites maintain databases of datacenter IP ranges, keyed by ASN (autonomous system number). A request from a hosting IP is blocked even before the page loads.
  • Captchas and Cloudflare: even if the first request goes through, after 5–10 requests from one address, a captcha or a verification redirect appears.
  • Fake prices: some sites (especially OTA aggregators) deliberately show bots inflated or outdated fares to spoil competitors' data.
  • Fingerprinting: in addition to the IP, systems analyze HTTP headers, the order of TLS extensions, mouse movement, and scrolling speed.
  • Rate limiting: a cap on the number of requests from one IP in a given time. Usually, the threshold is 20–50 requests per minute, after which the connection is terminated.

The bottom line: without quality proxies with real IPs, you won't collect current data. Datacenter proxies perform poorly here because flight websites recognize them within seconds. You need either residential or mobile IPs.

Which types of proxies are suitable for flight tickets

Let's break down the three main types of proxies and their applicability to the task of collecting flight ticket prices:

Proxy Type | IP Source | Bypassing flight website protection | Speed | Cost
Residential Proxies | Home providers (Rostelecom, Beeline, AT&T) | ⭐⭐⭐⭐⭐ Excellent | Average | Average
Mobile Proxies | Carrier networks (MTS, MegaFon, T-Mobile) | ⭐⭐⭐⭐⭐ Excellent | High | High
Datacenter Proxies | Server farms (AWS, OVH, Hetzner) | ⭐⭐ Poor | Very high | Low

The conclusion is obvious: for flight websites, datacenter proxies are practically useless. Aviasales, Skyscanner, and Google Flights instantly identify IPs from hosting providers' ASNs and either block them or show a captcha. The real choice is between residential and mobile proxies, and each has its niche.

Residential vs Mobile Proxies: What to Choose for Flight Tickets

Both types work, but in different scenarios, one wins over the other. Let's break it down specifically.

Residential Proxies for Large-Scale Data Collection

Residential proxies use IP addresses of real home users around the world. For scraping flight tickets, this means:

  • The ability to select a specific country and even city, which is critical if you are checking prices for different markets (e.g., the price from Moscow vs. from London for the same flight).
  • A large pool of IPs β€” thousands of addresses for rotation, allowing hundreds of requests without repetition.
  • Good price/quality ratio for high traffic volumes.
  • Support for session and rotating modes β€” you can maintain one session to simulate a real user.

Ideal scenario: you are building an aggregator or monitoring service and need to collect prices from 10–20 websites simultaneously, making thousands of requests per hour. Residential proxies with rotation are your choice.

Mobile Proxies for the Most Protected Websites

Mobile proxies operate through real SIM cards from mobile network operators. Their distinguishing feature is that their IP addresses belong to mobile networks (3G/4G/5G), which flight websites almost never block. The reason is simple: carriers place many subscribers behind one public IP using NAT, so a single mobile address can represent thousands of real users. Blocking such an address means losing thousands of live customers.

  • Maximum level of trust from anti-bot systems.
  • Practically zero risk of blocking even with aggressive scraping.
  • Ability to change IP by changing sessions (without physically changing devices).
  • Higher cost, justified for critically important data or complex sites.

Ideal scenario: you need to collect data from one particularly difficult site (e.g., an airline's own website behind Cloudflare Enterprise), where residential proxies periodically trigger captchas. Mobile proxies will solve this problem.

💡 Practical Advice

For most tasks related to monitoring flight prices, the optimal strategy is residential proxies for mass collection + mobile proxies for complex sites. This allows you to optimize your budget without compromising data quality.

Features of Protection on Aviasales, Skyscanner, Google Flights, and Kayak

Each platform has its own protection features. Understanding these differences will help you properly configure proxies and request behavior.

Aviasales

The Russian aggregator uses a combination of rate limiting and behavior analysis. The limit is about 30–40 requests per minute from one IP. When it is exceeded, the site redirects to a Yandex SmartCaptcha challenge. The site is relatively tolerant of residential proxies with Russian IPs. Important: prices on Aviasales depend on geolocation, so for accurate data collection, use proxies with IPs from the country for which you need fares.

Skyscanner

One of the most protected aggregators. Uses Cloudflare with "Under Attack Mode" settings for suspicious IPs, as well as its own anti-bot system. Datacenter proxies do not work here at all. Residential proxies pass but require a slow request rate (no more than 15–20 per minute) and correct browser headers. For Skyscanner, it is recommended to simulate a real browser session through Playwright or Puppeteer with the proxy connected.

Google Flights

Google uses its own bot detection algorithms: reCAPTCHA v3 and behavioral pattern analysis. Direct HTML scraping does not work here, as data is loaded via JavaScript. A headless browser (Playwright/Puppeteer) with residential or mobile proxies is needed. Google is also sensitive to the match between the IP geolocation and the browser language; discrepancies increase the risk of blocking.

Kayak

An American aggregator with aggressive bot protection based on PerimeterX (now HUMAN Security). It recognizes not only IPs but also TLS fingerprints, the order of HTTP/2 headers, and the time between requests. For Kayak, three things are mandatory: residential or mobile proxies, real browser simulation, and random delays between requests (2–8 seconds).

Platform | Protection System | Do datacenter proxies work? | Is headless needed? | Recommended Proxy Type
Aviasales | Rate limit + Yandex Captcha | ❌ No | Desirable | Residential (RU)
Skyscanner | Cloudflare + own system | ❌ No | ✅ Yes | Residential / Mobile
Google Flights | reCAPTCHA v3 + behavioral analysis | ❌ No | ✅ Mandatory | Residential / Mobile
Kayak | HUMAN Security (PerimeterX) | ❌ No | ✅ Yes | Mobile
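The per-IP request ceilings quoted for each platform lend themselves to a small client-side limiter. The sketch below is one possible sliding-window approach, not something the sites document: the `SlidingWindowLimiter` class and the `SITE_LIMITS` table are hypothetical names, and the values are simply the lower ends of the ranges mentioned above.

```python
import time
from collections import deque

# Hypothetical per-site ceilings (requests per minute per IP), taken from the
# lower end of the ranges quoted in this article; tune them to what you observe.
SITE_LIMITS = {"aviasales": 30, "skyscanner": 15}


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls per `window` seconds."""

    def __init__(self, max_requests, window=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock
        self._stamps = deque()

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0 if none)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.max_requests:
            return 0.0
        return self.window - (now - self._stamps[0])

    def record(self):
        """Register that a request was just sent."""
        self._stamps.append(self.clock())
```

A typical pattern would be one limiter per proxy IP: call `time.sleep(limiter.wait_time())` before each request and `limiter.record()` right after sending it.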

How to Set Up Proxies for Collecting Flight Price Data

The setup depends on the tool you are using. Let's consider the most common scenarios.

Option 1: Ready-made Scrapers and No-Code Tools

If you are not coding, use ready-made solutions: Octoparse, ParseHub, Apify. All of them support connecting external proxies. Here's the order of actions:

  1. Obtain proxy data: host (IP or domain), port, username, password.
  2. Open your tool's settings → "Proxy" or "Network" section.
  3. Select the protocol type: HTTPS (for most tasks) or SOCKS5 (if lower-level operation is needed).
  4. Insert the connection data. The format is usually: login:password@host:port
  5. Enable proxy rotation. Most tools do this automatically if there is a pool of addresses.
  6. Run a test request to the target site and check that the IP has changed.
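Step 4 is where hand-typed connection strings tend to break, usually because the password contains `@` or `:`. As a minimal sketch, the helpers below assemble and re-parse a proxy URL in the login:password@host:port format; `build_proxy_url` and `describe_proxy` are hypothetical names, and the host used in the usage note is a placeholder.

```python
from urllib.parse import quote, urlsplit


def build_proxy_url(host, port, login, password, scheme="http"):
    """Assemble scheme://login:password@host:port, escaping special characters."""
    # Percent-encode credentials so characters like '@' or ':' do not break the URL.
    user = quote(login, safe="")
    secret = quote(password, safe="")
    return f"{scheme}://{user}:{secret}@{host}:{port}"


def describe_proxy(url):
    """Split a proxy URL back into its parts for a quick sanity check."""
    parts = urlsplit(url)
    return {"scheme": parts.scheme, "host": parts.hostname, "port": parts.port}
```

For example, `build_proxy_url("proxy.example.com", 8080, "user", "p@ss:word")` produces a URL in which the password is percent-encoded, so the tool parsing it still sees the correct host and port.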

Option 2: Playwright / Puppeteer with Proxies

For complex sites (Google Flights, Skyscanner), a headless browser is needed. Here's how to connect proxies in Playwright:

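const { chromium } = require('playwright');

// `await` is not allowed at the top level of a CommonJS script,
// so the logic runs inside an async function.
(async () => {
  // Route all browser traffic through the proxy (placeholder values)
  const browser = await chromium.launch({
    proxy: {
      server: 'http://your-proxy-host:port',
      username: 'your_login',
      password: 'your_password'
    }
  });

  try {
    const page = await browser.newPage();
    await page.goto('https://www.skyscanner.com/...');
    // Your data extraction logic goes here
  } finally {
    // Close the browser even if navigation or extraction fails
    await browser.close();
  }
})();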

To rotate proxies with each new request, create a new browser context with a new proxy from your pool. This simulates the behavior of different users.
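The rotation itself can be kept separate from the browser code. The sketch below, written in Python, cycles through a pool and converts each proxy URL into the same server/username/password shape used in the launch example above; `to_playwright_proxy` and `proxy_rotation` are hypothetical helper names and the hosts in the test data are placeholders.

```python
from itertools import cycle
from urllib.parse import unquote, urlsplit


def to_playwright_proxy(url):
    """Convert scheme://login:password@host:port into server/username/password."""
    parts = urlsplit(url)
    return {
        "server": f"{parts.scheme}://{parts.hostname}:{parts.port}",
        "username": unquote(parts.username or ""),
        "password": unquote(parts.password or ""),
    }


def proxy_rotation(urls):
    """Yield proxy settings round-robin, one per new browser context."""
    for url in cycle(urls):
        yield to_playwright_proxy(url)
```

With Playwright for Python, each dictionary from `next(rotation)` should be usable as the `proxy` argument of `browser.new_context(...)`, so every context leaves through a different IP.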

Option 3: Python + requests/httpx

For sites without JavaScript rendering (or for working with flight website APIs), Python is suitable:

import random

import requests

# Placeholder proxy URLs in the login:password@host:port format;
# substitute the credentials and hosts issued by your provider.
proxies_pool = [
    "http://login:password@host1:port",
    "http://login:password@host2:port",
    "http://login:password@host3:port",
]

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
}

# Pick one proxy and use it for both schemes so the request keeps a single exit IP
proxy_url = random.choice(proxies_pool)
proxy = {"http": proxy_url, "https": proxy_url}

response = requests.get(
    "https://www.aviasales.ru/search/...",
    proxies=proxy,
    headers=headers,
    timeout=15,
)

print(response.status_code)
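The example above sends a single request through a single proxy. In practice a blocked response usually means retrying through another address. The following is a minimal sketch of that idea, kept independent of any HTTP library: `fetch_with_rotation` is a hypothetical helper, and `fetch` stands for any callable you supply that takes a proxy URL and returns a `(status_code, body)` pair.

```python
import random

BLOCKED = {403, 429}


def fetch_with_rotation(fetch, proxies, max_attempts=3, rng=random):
    """Call fetch(proxy) with a different proxy per attempt until one is not blocked.

    `fetch` is any callable that takes a proxy URL and returns
    (status_code, body).
    """
    # Sample without replacement so no proxy is tried twice.
    candidates = rng.sample(list(proxies), k=min(max_attempts, len(proxies)))
    last_status = None
    for proxy in candidates:
        status, body = fetch(proxy)
        if status not in BLOCKED:
            return proxy, status, body
        last_status = status
    raise RuntimeError(f"all attempts blocked (last status: {last_status})")
```

With requests, `fetch` could be a small wrapper around `requests.get(url, proxies={"http": p, "https": p}, headers=headers, timeout=15)` that returns `response.status_code` and `response.text`.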

IP Rotation and Session Management: Key Rules

Proper IP rotation is half the success in scraping flight tickets. Simply changing IPs is not enough: it needs to be done smartly.

Rule 1: One IP, One Session

Do not use one IP for multiple parallel requests. Anti-bot systems see abnormally high loads from one address and block it. Each request stream should operate through a separate proxy.
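One way to enforce this in a multi-threaded scraper is to have each worker check a proxy out of a shared pool and return it when done. The sketch below assumes a thread-based design; `ProxyLeasePool` is a hypothetical class name, not part of any library.

```python
import queue
from contextlib import contextmanager


class ProxyLeasePool:
    """Hands each proxy to at most one worker at a time."""

    def __init__(self, proxies):
        self._free = queue.Queue()
        for proxy in proxies:
            self._free.put(proxy)

    @contextmanager
    def lease(self, timeout=None):
        # Blocks until a proxy is free; raises queue.Empty on timeout.
        proxy = self._free.get(timeout=timeout)
        try:
            yield proxy
        finally:
            # Return the proxy so another stream can use it later.
            self._free.put(proxy)
```

Each worker would wrap its requests in `with pool.lease() as proxy:`, which guarantees that no two streams share an address at the same moment.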

Rule 2: Random Delays Between Requests

A real user does not make requests at equal intervals. Add a random delay of 2 to 8 seconds between requests. This makes detection as a bot roughly 3–4 times less likely than with evenly spaced requests.
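A minimal sketch of such pacing, using the 2 to 8 second range from the rule as defaults; the function names `human_delay` and `paced` are illustrative.

```python
import random
import time


def human_delay(low=2.0, high=8.0, rng=random):
    """Return a random pause length in seconds, uniform between low and high."""
    return rng.uniform(low, high)


def paced(items, low=2.0, high=8.0, sleep=time.sleep):
    """Yield items one by one, sleeping a random interval between them."""
    for index, item in enumerate(items):
        if index:  # no pause before the very first request
            sleep(human_delay(low, high))
        yield item
```

Looping with `for url in paced(urls):` then spaces out the requests without any extra bookkeeping in the scraping code.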

Rule 3: Geolocation and Language Matching

If you are using proxies with German IPs, the browser headers should be in German (Accept-Language: de-DE). Mismatches are a clear signal for anti-bot systems. This is especially important for Google Flights.
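One way to keep the two in sync is to derive the header from the proxy's country instead of hard-coding it. The mapping below is an illustrative assumption covering only a few markets, and `headers_for` is a hypothetical helper name.

```python
# Illustrative mapping from proxy country code to Accept-Language value;
# extend it with the markets you actually target.
ACCEPT_LANGUAGE = {
    "DE": "de-DE,de;q=0.9,en;q=0.8",
    "RU": "ru-RU,ru;q=0.9,en;q=0.8",
    "US": "en-US,en;q=0.9",
    "GB": "en-GB,en;q=0.9",
}


def headers_for(country, base_headers=None):
    """Return request headers whose language matches the proxy country."""
    code = country.upper()
    if code not in ACCEPT_LANGUAGE:
        # Fail loudly rather than send a mismatched language by accident.
        raise KeyError(f"no Accept-Language configured for {code}")
    headers = dict(base_headers or {})
    headers["Accept-Language"] = ACCEPT_LANGUAGE[code]
    return headers
```

Raising on an unknown country is a deliberate choice here: a missing entry is easier to fix than a silent mismatch that gets the IP flagged.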

Rule 4: Session Proxies for Multi-Step Requests

Some flight websites require multiple steps: search → select flight → view details. All these steps must be performed from one IP. Use sticky sessions, a mode where one IP is assigned to your stream for a certain time (usually 10–30 minutes).
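Many providers offer sticky sessions on their side, and the way to request them is provider-specific. If you manage the pool yourself, the same effect can be approximated on the client. The sketch below pins one proxy for a fixed time; `StickyProxy` is a hypothetical class and the 600-second default is simply the lower end of the 10–30 minute range.

```python
import random
import time


class StickyProxy:
    """Keeps returning the same proxy until `ttl` seconds have passed."""

    def __init__(self, pool, ttl=600.0, clock=time.monotonic, rng=random):
        self.pool = list(pool)
        self.ttl = ttl
        self.clock = clock
        self.rng = rng
        self._current = None
        self._expires = 0.0

    def get(self):
        now = self.clock()
        if self._current is None or now >= self._expires:
            # Session expired (or first call): pin a fresh proxy.
            self._current = self.rng.choice(self.pool)
            self._expires = now + self.ttl
        return self._current
```

Calling `sticky.get()` before each step of the search, select, and details flow returns the same address until the session window runs out.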

Rule 5: Monitoring Proxy Quality

Regularly check which IPs from the pool are blocked. Automatically exclude addresses that return a 403 or 429 code or redirect to a captcha. Some scraping tools, such as Scrapy with the scrapy-rotating-proxies plugin, can do this automatically.
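If your tooling does not track this for you, a small strike counter is enough to start with. In the sketch below, `ProxyHealth` is a hypothetical class, and retiring a proxy after three consecutive block signals is an assumed threshold rather than a recommendation from any of the sites.

```python
BLOCK_CODES = {403, 429}


class ProxyHealth:
    """Tracks block signals per proxy and retires repeat offenders."""

    def __init__(self, proxies, max_strikes=3):
        self.strikes = {proxy: 0 for proxy in proxies}
        self.max_strikes = max_strikes

    def report(self, proxy, status_code, captcha=False):
        """Record the outcome of one request made through `proxy`."""
        if status_code in BLOCK_CODES or captcha:
            self.strikes[proxy] += 1
        else:
            # A clean response resets the counter.
            self.strikes[proxy] = 0

    def healthy(self):
        """Proxies that have not yet reached the strike limit."""
        return [p for p, n in self.strikes.items() if n < self.max_strikes]
```

The `captcha` flag is meant to be set by your own detection logic, for example when the final URL of a response points at a challenge page.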

Ready-Made Tools for Scraping Flight Prices

If you do not want to write a scraper from scratch, here are tools that already support working with proxies and are suitable for monitoring flight prices:

Apify

A cloud platform for web scraping. It has ready-made actors (bots) for Skyscanner and Google Flights. Supports connecting external proxies through settings. To connect your proxies: go to actor settings → "Proxy and browser configuration" tab → select "Custom proxies" → insert your proxy URLs in the format http://user:pass@host:port.

Octoparse

A no-code scraper with a visual interface. Suitable for those who do not write code. Supports proxy rotation: Settings → Cloud Extraction → Proxy Settings → Add Custom Proxy. You can add a list of proxies, and Octoparse will automatically alternate them.

Scrapy + Scrapy-Rotating-Proxies

A Python framework for professional scraping. The scrapy-rotating-proxies plugin automatically rotates IPs from your list and excludes blocked addresses. Suitable for high-load tasks of hundreds of thousands of requests per day.

ParseHub

Another no-code tool that supports JavaScript rendering. It handles Aviasales well. Proxies are connected in the Settings → Advanced → Proxy section.

⚠️ Important about Geotargeting Prices

Flight websites show different prices depending on the user's country. This is not just a marketing strategy; it's a technical reality. If you are monitoring prices for the Russian market, use proxies with Russian IPs. To compare prices across markets (e.g., how much the same flight costs for a user from Germany), you need proxies with IPs from the respective countries.
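For cross-market comparison, it helps to index the pool by country so each check is explicitly paired with a proxy from the right market. This is only a sketch of the bookkeeping: `group_by_country` and `plan_market_checks` are hypothetical helpers, and the route label and proxy names in the usage note are made up.

```python
import random
from collections import defaultdict


def group_by_country(proxies):
    """Index (country_code, proxy_url) pairs by country."""
    by_country = defaultdict(list)
    for country, url in proxies:
        by_country[country.upper()].append(url)
    return dict(by_country)


def plan_market_checks(route, markets, by_country, rng=random):
    """Pair one route with a proxy from each market to be compared."""
    plan = []
    for market in markets:
        candidates = by_country.get(market.upper())
        if not candidates:
            raise LookupError(f"no proxy available for market {market}")
        plan.append({"route": route, "market": market.upper(),
                     "proxy": rng.choice(candidates)})
    return plan
```

For instance, `plan_market_checks("MOW-BER", ["ru", "de"], index)` returns one entry per market, and fails early if the pool has no address for a requested country instead of silently falling back to the wrong one.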

Checklist: How to Avoid Bans While Collecting Flight Prices

Save this list. It will help you avoid most problems when setting up scraping:

✅ Before Starting the Scraper

  • Residential or mobile proxies selected (not datacenter)
  • Proxy IP matches the target market (country/city)
  • Browser language matches the proxy geolocation
  • IP rotation configured (at least 1 IP per stream)
  • User-Agent headers simulate a real browser
  • A headless browser (Playwright/Puppeteer) is used for JS sites

✅ During the Scraper's Operation

  • Delays between requests: 2–8 seconds (random)
  • No more than 20–30 requests per minute from one IP
  • Multi-step sessions use one IP (sticky session)
  • Codes 403/429 automatically exclude IPs from the pool
  • Logging all errors for analysis

✅ Additionally for Complex Sites

  • Correct Referer and Accept headers
  • Simulating mouse movement and scrolling (for Playwright)
  • Randomly changing User-Agent from a real pool of browsers
  • Using cookie sessions to simulate repeat visits

Common Mistakes That Lead to Bans

  • Using free proxies. Their IPs have long been blacklisted by all major flight websites. You will get blocked on the first request.
  • Too high request frequency. Even with good proxies, 100 requests per minute from one IP is a sure way to get banned.
  • Same User-Agent for all requests. Real users use different browsers and versions, and your scraper should simulate this.
  • Ignoring cookies. Many sites track sessions via cookies. If you do not save and pass cookies between requests, the behavior looks abnormal.
  • Mismatch between geolocation and request content. Requesting the Russian version of the site through an American IP is a red flag for anti-bot systems.

Conclusion

Collecting data on flight prices is one of the most technically challenging tasks in scraping. Flight websites invest significant resources in protection against bots, and bypassing it without the right tools is impossible. The main takeaways from this article:

  • Datacenter proxies do not work for flight websites because they are blocked instantly.
  • Residential proxies are the optimal choice for large-scale price monitoring from different markets.
  • Mobile proxies are needed for the most protected platforms (Kayak, Skyscanner) and critically important data.
  • IP rotation, random delays, and simulating a real browser are essential conditions for stable operation.
  • The geolocation of proxies must match the target market; otherwise, prices will be incorrect.

If you plan to build a flight price monitoring system or collect data for an aggregator, start with residential proxies. They provide the right balance between bypassing protection, geographical coverage, and cost. For the most complex sites with aggressive anti-bot protection, consider mobile proxies. They offer the highest level of trust from anti-bot systems and virtually eliminate blocks with proper configuration.