Proxy Market News - Proxyway
Your Trusted Guide to All Things Proxy
https://proxyway.com/news

Building a Stealth Browser: Interview with Rayobyte
https://proxyway.com/news/building-a-stealth-browser-interview-with-rayobyte
Mon, 02 Mar 2026


Neil Emeigh explains why his company built an in-house Chromium fork and how it differs from available open-source tools.

Neil Emeigh

Web scraping is becoming all but impossible without a web browser. We have great open-source implementations like Camoufox, while AI needs have spawned a segment of cloud browsers made for agentic tasks.

Rayobyte, however, believes that the market lacks great tools that are purpose-built for web scraping. The provider has made its own implementation of Chromium to address this gap.

We sit down with Rayobyte’s CEO Neil Emeigh to discuss how rayobrowse works, the ways it differs from other available tools, and its place in the fast-growing market of cloud web browsers. The interview also includes a browser fingerprinting benchmark done in collaboration with ScrapeOps. 

You recently announced a new tool called rayobrowse. What is it, exactly?

“rayobrowse is a self-hosted, Chromium-based stealth browser built for web scraping and automation. It runs inside Docker, requires no GPU, and works on headless Linux servers. Your existing Playwright, Selenium, or Puppeteer scripts connect to it over standard CDP – no code changes needed.

Each session gets a realistic, real-world device fingerprint (user agent, screen, canvas, WebGL, fonts, timezone, WebRTC, etc.) drawn from a database of thousands of profiles. We use it in production on Rayobyte’s scraping API to scrape millions of pages per day across some of the most difficult sites on the web.”
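Emeigh's claim of "no code changes" rests on the browser speaking plain CDP: existing automation scripts attach to a running browser instead of launching one. A minimal sketch of that attach flow, assuming a container listening on a hypothetical local port 9222 (the port and endpoint shape are illustrative, not taken from Rayobyte's documentation):

```python
def cdp_endpoint(host: str, port: int) -> str:
    """Build the DevTools Protocol URL an automation client attaches to."""
    return f"http://{host}:{port}"

def attach(playwright, host: str = "localhost", port: int = 9222):
    # connect_over_cdp() attaches Playwright to an already-running
    # Chromium-based browser instead of launching a bundled one,
    # so the rest of the script stays unchanged.
    return playwright.chromium.connect_over_cdp(cdp_endpoint(host, port))

# Usage (requires `pip install playwright` and a browser listening on the port):
# from playwright.sync_api import sync_playwright
# with sync_playwright() as p:
#     browser = attach(p)
#     page = browser.contexts[0].pages[0]
#     page.goto("https://example.com")
```

The same endpoint works for Puppeteer (`puppeteer.connect`) and Selenium's CDP support, which is why no vendor-specific client library is needed.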

Why build your own web browser instead of using open-source tools like Patchright or Camoufox?

“I described the journey that ultimately led us to build rayobrowse in an earlier blog post. But here’s the short version.

1) Camoufox

Camoufox is based on Firefox, which represents only about 2% of global browser market share. At the scale we operate at Rayobyte (hundreds of millions of requests per month), we knew that sending massive volumes of Firefox traffic would make us stand out immediately. Target websites simply don’t see that many real Firefox users.

Because Camoufox is open-source, we were also able to see how some anti-bot companies had reverse engineered subtle signals that made it relatively easy to detect. Over time, those detection vectors became more obvious. On top of that, the project was largely unmaintained. We’ve seen signs that a new owner may be taking it over, but time will tell and that still doesn’t solve the first two issues.

2) Patchright

Patchright works well for basic websites, but it doesn’t mask fingerprints at a deeper level. You still show up with the same underlying OS, GPU, and system-level characteristics.

The bigger issue is this: most scraping at scale doesn’t happen on someone’s local laptop…it happens on Linux servers. So when using Patchright in cloud environments, the fingerprint effectively tells websites, “Hi, I’m a Linux server.” In other words: “I’m probably a bot.”

At our scale, we needed to run on scalable cloud infrastructure while ensuring our fingerprints looked like real user environments (like a real Windows user). That level of control simply wasn’t possible with existing tools.”

Could you explain how rayobrowse works in more detail?

“It has three layers. At the bottom is a Chromium fork — we track upstream releases and apply a focused set of patches (similar to how Brave maintains its fork) that normalize exposed APIs, reduce fingerprint entropy leaks, and improve automation compatibility while preserving native Chromium behavior.

On top of that is a fingerprint engine: at session startup, each browser gets a complete real-world device profile — OS metadata, screen resolution, Canvas/WebGL rendering attributes, fonts matched to the target OS, locale, timezone, and WebRTC config. These profiles come from a database of thousands of fingerprints collected using the same techniques anti-bot companies use.

Finally, an automation layer exposes only standard Chromium CDP interfaces. Your scripts connect through native CDP and operate on unmodified page contexts. It all runs inside a single Docker container, so there are zero host dependencies beyond Docker and Python.”
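The "complete real-world device profile per session" idea is easiest to see in miniature. A toy sketch of coherent profile assignment follows; all profile data here is invented for illustration, since rayobrowse's actual database and schema are not public:

```python
import random

# Invented example profiles. The key property is that attributes are drawn
# together, so the OS, user agent, fonts, and timezone stay mutually
# consistent rather than being randomized independently.
PROFILES = [
    {
        "os": "Windows",
        "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ...",
        "fonts": ["Segoe UI", "Calibri", "Consolas"],
        "timezone": "America/Chicago",
    },
    {
        "os": "macOS",
        "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ...",
        "fonts": ["Helvetica Neue", "Menlo", "Monaco"],
        "timezone": "America/Los_Angeles",
    },
]

def new_session_profile(rng: random.Random) -> dict:
    """Pick one coherent device profile for a fresh browser session."""
    return rng.choice(PROFILES)
```

Independently randomizing each attribute is a classic antidetect mistake: a "Windows" user agent with macOS fonts is itself a fingerprinting signal, which is why profiles are selected as a unit.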

How effective have you found it at unblocking major anti-bot systems? How does it fare in synthetic benchmarks, like the one run by ScrapeOps?

“We’re not familiar with their benchmarks, but we were desperate to find a working browser solution to keep a major Fortune 500 customer of ours online that used our scraping API. We “got by” for a little while on Camoufox and Patchright until they detected that, and we tested every cloud browser + “antidetect” browser on the market.

We found a couple antidetect browsers that worked… but were Windows-based and very difficult to work with on the code side for large scale scraping (their business model is people managing accounts, etc… not web scrapers).

We are currently scraping many millions of pages with our browser per day on SERP, and millions more on a popular Asian ecommerce site. We also regularly test and get feedback from users that our system works great on: Cloudflare, DataDome, Akamai, and PerimeterX.

One motivation behind our release is to get feedback from real users about any website that it doesn’t work on so we can continually improve.”

You currently distribute rayobrowse as a restricted-access beta tool. What are your future plans for this product?

“Right now it’s distributed as a restricted-access beta with a free tier (one concurrent browser, no registration needed) and an unlimited-concurrency tier for Rayobyte proxy customers. A third option is buying concurrent browsers so that you can self-host and bring your own proxy. Lastly, we’re also building out a cloud browser mode for those who don’t want to self-host.

Longer term, we want rayobrowse to be the default browser for anyone doing serious web scraping. It’s commercially maintained, always current against the latest anti-bot techniques, and available at whatever abstraction level you need: self-hosted, cloud, or integrated into our scraping API.”

We had the chance to collaborate with ScrapeOps and test rayobrowse using their evaluation framework. The benchmark relies on Antoine Vastel’s fingerprinting tool to check for giveaways in user agents, hardware parameters, CDP automation, and other signals. 

rayobrowse did very well, passing the majority of checks. In ScrapeOps’ framework, it got a score of 88.42, nearly matching the performance of established competitors like Bright Data, Scrapeless, and Zenrows.

Here’s a summary of the report:

In our tests, Rayobrowse provided a robust defense against browser fingerprinting by maintaining high internal consistency across various layers. The browser presented a modern Chrome 144 environment on Windows, with HTTP headers that included advanced compression methods like `zstd` and properly localized `Accept-Language` values for countries such as Japan, Germany, and Russia.

The provider’s hardware emulation was particularly diverse. Rather than using a static “one-size-fits-all” hardware profile, Rayobrowse cycled through various CPU core counts and genuine consumer GPU renderers, including both NVIDIA RTX series and Intel HD/UHD integrated graphics. This diversity, combined with unique fingerprint hashes for every session, ensures that automated traffic does not form a recognizable cluster. Despite these strengths, the browser’s internal geometry and font list remained static and somewhat unrealistic, which may alert more advanced anti-bot systems.

Unblocking browsers have exploded as a category in recent years. Established players like Bright Data and Oxylabs now compete alongside newer entrants such as Browser Use, Kernel, and Browserbase. So where do you fit in?

“We see a clear void in the market.

On one side, you have cloud browsers built by companies that deeply understand scraping such as Bright Data and Oxylabs. But they’re expensive, and they don’t offer self-hosting. You’re locked into their infrastructure and pricing.

On the other side, you have cloud browser companies that don’t really come from a scraping background. The ScrapeOps benchmarks make this pretty clear: platforms like Browserbase, and others in that category, simply don’t have 10+ years of scraping and proxy experience behind them like we do. No disrespect intended – it’s just not their core business model. As a result, they’re not naturally tuned to the subtle signals and edge cases that matter in stealth. And again, they’re not self-hosted.

Then you have self-hosted solutions. Most anti-detect browsers don’t work well in Linux server environments, which makes them extremely hard to scale. When you’re running thousands of browsers like we are, managing fleets of Windows servers becomes a nightmare.

And then there’s Camoufox and Patchright (covered above) with their own structural limitations. So where do we fit?

We offer a browser you can self-host if you want to, and do so affordably. In my opinion, one of the reasons Camoufox became so popular was that it was the first true open-source stealth browser. Before that, your only options were expensive cloud solutions or patching something together that “mostly worked.”

We believe our browser fills the gap Camoufox left behind, with the added benefit of being Chromium-based. Stealth is not a side feature for us, it really is the core focus. It has to be. Our business model as a scraping company depends on remaining undetected.

And for those who don’t want to self-host, we’ll offer an affordable cloud option as well. The key word is “affordable”. At Rayobyte we consume over 200,000 GB of bandwidth every month on our browsers. If we applied that to Bright Data’s $5/GB model, well, we would be out of business!”

Do you agree with Zyte’s notion that it’s becoming irrational for web scrapers to manage proxies in-house, and that they’ll increasingly need to use higher-level services like APIs to succeed?

“I share the same opinion that 10 years from now it’ll be impossible for hobbyists or startups to install a bunch of repos and ‘tada’, they’re able to scrape at scale.

The anti-bot companies continue to grow (look at Cloudflare’s dominance as a CDN), and their business model affords them the resources to build great anti-bot solutions that can block many users. So, they will incrementally get better, while individuals won’t have the budgets or knowledge to keep up.

That being said, that’s not today. Our browser is plug-and-play, and in a couple of minutes you can use it to load most major websites, including sites protected by anti-bot systems. So, until that future reality becomes true, we hope to help people stay ahead of the curve as much as possible.”

Rayobyte itself has gone through quite a few changes these past few years. Can you share what’s next for you?

“As I wrote in my blog post, we discovered a year ago how good we really are at scraping. Previously we stayed in our lane of proxies, but we came to find out that we were able to scrape some high-value targets in Asia that the top scraping companies were unable to scrape.

From that knowhow, we’ve continued building our scraping services and our scale grows MoM. Within that effort, needs arose, such as a scalable browser, and hence rayobrowse was born. As we look forward, we’ll be leaning further into this skillset of ours to service high-value scraping endpoints, and provide tools that can help others scrape more easily.

Check out our GitHub and try it for free today: https://github.com/rayobyte-data/rayobrowse”

Nimble Raises $47M to Build Web Search for AI
https://proxyway.com/news/nimble-raises-47m-to-build-web-search-for-ai
Wed, 25 Feb 2026


The web data provider’s Series B follows the hype.

Adam Dubois

Nimble, the US- and Israel-based provider of web data tools and services, has announced a new Series B investment round.

The Series B amounts to $47M. It was led by Norwest, with participation from Databricks Ventures, among others.

Nimble’s announcement marks a shift in direction. The company will now focus on delivering web search for AI agents, joining the race with Exa, Tavily, Parallel, and a number of similar competitors. Perhaps incidentally, it has even changed its logo to a race flag. 

Nimble has also repackaged its suite of products, which now comprises: 

  • A real-time API with endpoints for searching, extracting, mapping, crawling, and agentically interacting with websites. 
  • A list of pre-made agents (or pipelines, or scrapers) that return structured data from dozens of websites. It includes an agent builder that creates new agents based on plain language queries. 
  • Managed web scraping services.
  • A residential proxy network.
Image: Nimble platform capabilities (source: Nimble)

The pricing model has changed as well. Nimble has moved away from fixed pricing plans to a usage-based format with per-product rates. The search API costs $1 per 1,000 requests, search-and-extract functionality adds up to $1.35 on top, and an LLM-generated answer costs $4 per 1,000 results.

According to multiple sources, this Series B brings Nimble’s funding tally to $75M in total. 

Overall, agentic web search is currently one of the hottest areas of web scraping. Just weeks ago, Nimble’s competitor Tavily was acquired by Nebius in a deal worth up to $400M.

ScrapingBee Launches Fast Search API
https://proxyway.com/news/scrapingbee-launches-fast-search-api
Tue, 17 Feb 2026


The scraper fetches real-time data in under a second and leaves no logs. 

Adam Dubois

ScrapingBee, the French provider of web scraping tools, has launched Fast Search API. It’s optimized for AI use cases that require real-time Google Search results in under one second.

To achieve this, ScrapingBee has stripped the output from all unnecessary search elements, keeping only organic results and top news. 

Fast Search API supports several customization options, including the location and search page number:

Image: Fast Search API parameters (source: ScrapingBee's documentation)

The output is structured JSON:

Image: Fast Search API output (source: ScrapingBee's documentation)

In addition, the API applies a zero data retention policy, which may appeal to privacy-minded customers.

One thousand requests start from $1.96 (10 API credits) and drop to $0.75 with the largest public plan. 

Fast Search API complements ScrapingBee’s two other search engine scrapers: the full API which returns all SERP elements and the light API which tries to do the same without rendering JavaScript. 

ScrapingBee offers a card-free trial with 100 requests to Fast Search API.

Rayobyte Slashes Mobile Proxy Rates by up to 98%
https://proxyway.com/news/rayobyte-slashes-mobile-proxy-rates-by-up-to-98
Tue, 17 Feb 2026


The price reduction sounds absurd but seems legitimate.

Adam Dubois

Rayobyte, the US-based provider of proxy servers and web scraping tools, has revised the pricing of its mobile proxies. 

Their plan sizes and rates have both changed; the latter fell by up to 98%. While this may be hard to believe, the baseline used to be very expensive at $25/GB. 

Here are the previous pricing plans:

Plan     | Price | Traffic | Price/GB
Starter  | $50   | 2 GB    | $25
Personal | $300  | 15 GB   | $20
Consumer | $600  | 40 GB   | $15

And these are the new ones:

Plan         | Price  | Traffic  | Price/GB
Starter      | $250   | 500 GB   | $0.50
Professional | $250   | 200 GB   | $1.25
Business     | $1,000 | 909 GB   | $1.10
Corporate    | $2,500 | 2,632 GB | $0.95
Enterprise   | Custom | 5 TB+    | Up to $0.50
Note how the Starter plan costs less per gigabyte than everything up to Enterprise. From what we know, that rate applies only to the first 500 gigabytes.
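The listed per-GB rates follow directly from the plan totals, which makes them easy to sanity-check:

```python
def price_per_gb(price_usd: float, traffic_gb: float) -> float:
    """Effective per-gigabyte rate of a traffic plan, rounded to cents."""
    return round(price_usd / traffic_gb, 2)

# The published rates match the plan totals:
assert price_per_gb(250, 500) == 0.50    # Starter
assert price_per_gb(250, 200) == 1.25    # Professional
assert price_per_gb(1000, 909) == 1.10   # Business
assert price_per_gb(2500, 2632) == 0.95  # Corporate
```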

Rayobyte’s mobile proxy network isn’t peer-to-peer. According to the provider, most IPs are sourced via B2B contracts with mobile carriers and are hosted on mobile device farms. 

As such, the proxy pool covers fewer locations (predominantly the US) and has fewer IPs online at once. However, it can accrue a relatively large number of IP addresses over time, depending on how many are assigned to the nearby cell tower. 

Rayobyte’s mobile proxies had been seemingly left for dead for several years before the provider found value in them and decided to reboot the product. You can learn more about the reasoning and results in this LinkedIn webinar (an account is necessary to view it).

Oxylabs Launches Dedicated ISP Proxies in Self-Service
https://proxyway.com/news/oxylabs-launches-dedicated-isp-proxies-in-self-service
Tue, 03 Feb 2026


The product offers nearly unlimited access to proxy IPs in 14 countries.

Adam Dubois

Oxylabs, the Lithuanian provider of web data extraction tools and services, has launched self-serve dedicated ISP proxies. Previously, this product was only available to enterprise customers through a sales process. 

The new product gives access to a list of IPs associated with consumer internet service providers. These IPs are static (with optional rotation functionality) and not shared with anyone else. 

The dedicated ISP proxies currently cover 14 countries across four continents. Each country includes between one and five ASNs. You’re free to choose any combination of countries and ASNs during purchase.

Image: Oxylabs' dedicated ISP proxy country list

The proxies support HTTP, HTTPS, and SOCKS5 protocols, including UDP. They’re also nearly unlimited: the fair usage policy kicks in after hitting the 100 GB/IP mark; past this point, Oxylabs cuts concurrent sessions to 10/IP until the new billing cycle.
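Because the IPs are static and dedicated, clients typically pin a single proxy endpoint rather than a rotating gateway. A minimal stdlib sketch of routing traffic through one such endpoint; the address below is a placeholder in documentation IP space, and the real host, port, and credentials would come from the provider's dashboard:

```python
import urllib.request

def proxied_opener(proxy_host: str, proxy_port: int) -> urllib.request.OpenerDirector:
    """Route both HTTP and HTTPS requests through one static proxy endpoint."""
    proxy = f"http://{proxy_host}:{proxy_port}"
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)

# Usage (placeholder endpoint):
# opener = proxied_opener("203.0.113.10", 8080)
# html = opener.open("https://example.com", timeout=10).read()
```

SOCKS5 (and its UDP mode) isn't covered by `urllib`; for that, a SOCKS-aware client library would be needed.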

The dedicated ISP proxies are charged per IP address. Plans start from $16 for 5 IPs ($3.20/IP) and reach $750 for 300 IPs ($2.50/IP) with the largest public option.

The product is already available for purchase. There is no free trial, but Oxylabs offers a 3-day limited refund policy.

Google Disrupts 10+ IPIDEA-Related Chinese Proxy Providers
https://proxyway.com/news/google-disrupts-10-ipidea-related-chinese-proxy-providers
Thu, 29 Jan 2026


LunaProxy, ABCProxy, and more have their websites down and their proxy networks decimated.

Adam Dubois

On January 28, 2026, Google announced its actions to disrupt IPIDEA – one of the largest residential proxy networks in the world. According to The Wall Street Journal, these measures are expected to knock over nine million devices off IPIDEA’s network.  

IPIDEA is a Chinese company that controls, or is related to, more than a dozen Hong Kong-incorporated brands selling proxy servers. The non-exhaustive list includes: 360Proxy, 922Proxy, ABC Proxy, Cherry Proxy, IP2World, IPidea.io, LunaProxy, PIA S5 Proxy, PyProxy, and TabProxy.

Google Threat Intelligence Group’s efforts involved three main measures against IPIDEA:

  1. Filing legal action to take down both the public-facing domains of its brands and the domains used to control the proxy network. 
  2. Sharing technical intelligence on discovered IPIDEA SDKs and proxy software with platform providers, law enforcement, and research firms. (SDKs are components installed into apps like free VPNs to recruit devices into the proxy network.) 
  3. Ensuring that Google Play Protect automatically warns users and removes applications known to incorporate IPIDEA SDKs, also blocking future install attempts.

Google writes that IPIDEA’s network has facilitated several huge botnets like BADBOX2.0, Kimwolf, and Aisuru. For example, recent research surfaced vulnerabilities in IPIDEA’s infrastructure that allowed botnet owners to access and recruit other devices on local networks. Over 550 threat groups have used IPIDEA’s proxies directly to hide their malicious activities.  

IPIDEA-related brands have had a big impact on the proxy server market. Between 2022 and 2024, it was flooded with over 20 Hong Kong-based entrants. They undercut competitors and, more importantly, popularized unlimited plans that offered unmetered usage for a fixed fee.

To quote IPIDEA’s spokeswoman in the WSJ article: the company and its partners had engaged in “relatively aggressive market expansion strategies” and “conducted promotional activities in inappropriate venues (e.g., hacker forums)”.

She claimed that they have since improved their business practices. While we’ve seen measures to appear legitimate (such as professing to source proxies ethically and screen customers), we found them to lack grounding and actual enforcement. 

At the time of writing, the websites of several potentially IPIDEA-related providers remain unaffected. However, the real impact on their proxy networks is yet to be measured.

Zyte Publishes 2026 Web Scraping Industry Report
https://proxyway.com/news/zyte-publishes-2026-web-scraping-industry-report
Fri, 23 Jan 2026


The report outlines six trends shaping web data collection.

Adam Dubois

Zyte, the Ireland-based provider of web scraping tools and services, has published its 2026 Web Scraping Industry Report.

Zyte’s whitepaper describes six key trends shaping web data collection; it overviews their impact and provides actionable recommendations for keeping pace with a fast-moving industry.

To briefly summarize them:

  1. The web scraping stack is increasingly being bundled into unified tools (APIs), making individual components like proxy servers less capable on their own and less rational to manage in-house. 
  2. AI now sits across the entire scraping lifecycle, whether as LLM-based data parsers, machine learning unblocking algorithms, or code generators. 
  3. End-to-end automation will become the default for web scraping pipelines: we’ll have an agent orchestrating specialized sub-agents, with humans designing rather than implementing the process.
  4. Manual access strategies will become unsustainable at scale, giving way to machine learning on both sides of the equation: bots and bot detection tools. 
  5. Web traffic will divide into access lanes, establishing a hostile, negotiated, or invited relationship with websites. New standards and authentication protocols will grant preferential access to select entities. 
  6. Legal clarity is arriving with compliance demands in the shape of California’s Assembly Bill 2013, the EU AI Act, and other legislation. Enterprises are treating data provenance and compliance systems as increasingly important. 

Frankly, if you’re building your own scrapers, the overarching message is concerning: AI productivity gains are being overshadowed by access restrictions and growing complexity.

At the same time, commercial tools have become better than ever. Coupled with more legal clarity, they may actually make web scraping easier for companies that have budgets to outsource their web scraping operations. 

The report is free to access after entering your email address. We recommend reading it.

New Research: Internet Sharing SDKs as an Emerging App Monetization Model
https://proxyway.com/news/internet-sharing-sdks-app-monetization
Wed, 31 Dec 2025


Unused bandwidth is becoming a new revenue layer for app developers. We chimed in to explore the ins and outs of the emerging model.

Adam Dubois

We published new research exploring internet sharing SDKs as an emerging app monetization method. In the report, we look at how developers can earn passive revenue by allowing apps to share users' unused bandwidth, with their consent, and how this model can sit alongside ads, subscriptions, and in-app purchases rather than replace them.

Our new piece focuses on a few key areas:

  • how internet sharing SDKs work in practice and where the model comes from;
  • what developers should consider before integration, including transparency, compliance, and user trust;
  • who the key providers in the market are;
  • why these SDKs are increasingly seen as a supplementary revenue layer, particularly for apps with large passive user bases.

As monetization pressure continues to grow across the app economy, internet sharing SDKs are becoming a more visible option for developers looking to diversify revenue without relying solely on engagement-driven methods.

Google Sues SerpApi
https://proxyway.com/news/google-sues-serpapi
Tue, 30 Dec 2025


The lawsuit involves avoidance of Google’s anti-bot measures and copyright infringement.  

Adam Dubois

On December 19, 2025, Google announced that it was suing SerpApi, a service for scraping Google Search results.

SerpApi allows AI agents, search engine optimization tools, and other companies to programmatically access the search engine at scale. According to Google, the defendant has been making hundreds of millions of search requests per day. 

Google’s claim hinges on two arguments:

  1. Circumvention of SearchGuard, a protection measure implemented in January 2025. This measure reinforces Google’s prohibition of web scraping in its terms of service and the robots.txt file. Circumventing it imposes a so-called “deadweight loss” which the platform is unable to offset with ads. 
  2. Copyright infringement which allegedly stems from taking licensed content from Google’s Knowledge Panel, Maps, Shopping, and other properties without authorization or payment. 

Google’s lawsuit relies on SerpApi violating the Digital Millennium Copyright Act (DMCA), which prohibits the creation of tools to bypass technological measures protecting copyrighted works. The DMCA is no joke, as violations may incur criminal liability; however, the legal doctrine is still underdeveloped when it comes to web scraping. 

The tech giant claims that these infringements make it eligible for damages ranging between $200 and $2,500 for each count (in other words: every scraped page) – or at least the profits SerpApi has made from providing its services. 

In addition, Google wants SerpApi to stop any further violations, and to destroy the circumvention technology which reportedly was purpose-built for the task. 

SerpApi has drawn Google’s attention only now despite being in business since 2017. One of the reasons is likely scale: Google claims that the company’s search requests have increased by as much as 25 times over the past two years. 

Another reason is the emergence of AI search agents that threaten to replace the dominant search engine. For example, ChatGPT’s own search product has been shown to use Google for answers.

This is the second major lawsuit in 2025 to feature SerpApi. In October, Reddit sued the service (and several other companies) for extracting Reddit snippets from Google Search.

New Report: Web Scraping APIs in 2025
https://proxyway.com/news/new-report-web-scraping-apis-in-2025
Wed, 03 Dec 2025


We benchmarked 11 companies and overviewed AI’s impact on the market.

Adam Dubois
Our annual report on web data collection infrastructure is now available. This edition covers web scraping APIs, and it can be divided into two major sections:

1. The Unblocking Benchmark

Here, we tested 11 APIs from companies like Oxylabs, Zyte, ScraperAPI, and Apify. 

We accessed 15 protected e-commerce websites, search engines, and other targets, extracting six thousand pages at 2 and 10 requests per second. Sustained, these rates would translate to 5 and 26 million monthly requests, respectively.
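The rate-to-volume conversion is simple arithmetic. A quick sanity check (our own, assuming a constant rate over a 30-day month):

```python
# Back-of-the-envelope check of the benchmark's scale: sustained
# requests per second -> requests per month (30-day month assumed).

def monthly_requests(rps: float, days: int = 30) -> int:
    """Requests per month at a constant request rate."""
    return int(rps * 60 * 60 * 24 * days)

for rps in (2, 10):
    print(f"{rps} rps -> {monthly_requests(rps) / 1e6:.1f}M requests/month")
# 2 rps -> 5.2M requests/month
# 10 rps -> 25.9M requests/month
```

So the two test rates do land at roughly 5 and 26 million monthly requests.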

We discovered that even commercial APIs struggle with websites like Shein and G2. Some, such as Zyte or Decodo, were very effective with most targets, while others could be considered specialist providers.

2. The Impact of AI on the Web Scraping Market

This part is less technical – we put on an analyst’s hat and tried to document how AI has affected our industry. The impact has been huge and multifaceted:

  • There’s been an unprecedented level of venture capital activity (to the tune of several hundred million dollars), which has birthed a generation of US-based providers. 
  • They, in turn, have brought a share of new features and a sense of urgency to entrenched market participants. 
  • And, of course, AI crawling hasn’t gone unnoticed by major web scraping targets like Google or by bot protection companies.

You’ll find the full report here: https://proxyway.com/research/web-scraping-api-report-2025

The post New Report: Web Scraping APIs in 2025 appeared first on Proxyway.

Zyte Extract Summit 2025 (Dublin): A Recap
https://proxyway.com/news/zyte-extract-summit-2025-dublin-recap
Mon, 17 Nov 2025

News

Our virtual impressions from the second edition of Zyte’s annual web scraping conference.

Adam Dubois

Six weeks after running Extract Summit in Austin (we cover it here), Zyte brought the web scraping conference to its home turf of Ireland. 

It would be wrong to call the Dublin edition a reprise of the first event; for the most part, it had a new line-up and focused on different areas of web data collection. As such, both events complement each other.

As always, our aim is to give you a brief (and very human) summary of the conference’s talks. Zyte has made them available on demand, so you can do some opinionated window shopping before committing to the full videos.

Organizational Matters

Zyte didn’t change much in the way it organized the event, so there’s no need to waste metaphorical ink describing it. The conference spanned two days (the first being dedicated to workshops), and there was an option to watch everything online. You had a Slido form for questions, and that’s about it. 

The execution wasn’t flawless: for the better part of the event, online viewers had to choose between mono audio on their headphones and cranking up the volume to the max on speakers. But other than that, Zyte’s organizing committee did a solid job.

Main Themes

AI? AI. It’s unavoidable, really. But if Austin was largely about LLM parsing and agent-assisted code generation, Dublin gave much more attention to the unblocking side of things. We had a panel discussion with none other than Antoine Vastel, and Kieron Spearing gave a structured drill-down into how websites construct their requests. We loved that. 

The panel of lawyers focused on intellectual property in particular, which is the hot-button topic of the day. And, of course, team Zyte once again tried to sell their internal-project-made-public (VS Code extension), which is proper etiquette for the host in these kinds of events. 

The last brief presentation, which for some reason was missing from the official agenda, tried to outright dissuade viewers from building their own scrapers, claiming that outsourcing is the more rational choice. Though structured as a personal story, the talk was delivered by Zyte’s developer advocate, so it was hard to take it at face value.

The Talks

Talk 1. How to Make AI Coding Work for Enterprise Web Scraping

This is the one presentation that does repeat Austin. Zytans Ian Lennon (CPO) and John Rooney (Dev Engagement Manager) introduced their new VS Code agentic spider builder to the audience in Dublin. 

John first gave a tech demo where he wrote a quick spider to scrape and structure some e-commerce pages. Ian then took over and addressed the bigger picture from the business point of view. The extension is free for now, so we recommend giving it a go to see if it works for you. We were told that it’s already saved a lot of dev resources in Zyte’s internal use. 

As for the talk, we suggest watching the Austin version. John executed the demonstration live, which unfortunately resulted in the LLM failing mid-process. But even companies like Meta don’t always get live demos 100% right, so we respect John for his bravery.

Guess what Zyte’s Web Scraping Copilot is trying to solve.

Talk 2. Scraping a Synthetic Web: Dead Internet Theory Meets Web Data Extraction

If you thought the dead internet theory was fringe – or you didn’t know about it at all – this is the talk for you. Domagoj Maric, AI Customer Delivery Manager at Pontis Tech, described the many ways bots have infiltrated our browsing lives, manipulating facts and impacting our decisions. 

It’s a sprawling talk filled with examples, personal experiences, and even an overview of relevant legislation. Domagoj went as far as to build his own social media bot, proving how cheap and fast this process is. To spoil it a little, 10k comments cost just $2, and this is with current token prices. 

While there was less to do with web scraping than the title led us to believe, this truly was a fascinating presentation that we recommend without reservations.

When they’re not sowing discontent in the West, Russian bots are busy making delicious cupcakes.

Panel 1. Antiban Panel

This is probably the only panel we’ve seen that brought bot makers and bot breakers to the same stage. It was hosted by Zyte’s CEO Shane Evans and comprised Antoine Vastel (Head of Research at Castle), Fabien Vauchelles (Scrapoxy), and Kenny Aires (Team Lead at Zyte). Antoine is a bit of a mythical figure in our niche, and he was able to participate because his current role doesn’t deal with web scraping that much. 

The panel addressed a range of topics, such as how anti-bot companies distinguish between good and bad bots, or how the busy month of November impacts the data extraction and protection industries. However, it mostly dealt with change: in detection techniques, the role of proxies, and the cost of web scraping in general. 

We learned a lot. One of the main findings for us was that proxies are becoming less important in the big picture, to the point where they’re now considered a weak signal. Even the consistency of a fingerprint is no longer the ultimate giveaway due to improving botting tools and edge cases from regular users. 

Anti-bots face the constraint of retaining a good user experience, bots are constrained by scraping costs, and no one knows what exactly to do with AI agents yet. A great discussion overall.

Can you find the impostor?

Talk 3. AI and the Web: What 2025 Changed and What Comes Next

Zyte’s Senior Data Scientist Ivan Sanchez returned to talk about LLMs. Compared to Austin, this presentation gave a more high-level outlook; it overviewed the prevailing trends and allowed itself to speculate a little. 

Ivan spent a lot of time talking about reasoning models. He believes that GPT-4o and beyond caused a revolution of sorts that not only improved answers but unlocked new capabilities. The paradigm shifted from guessing the next word to solving problems. Reasoning models become even more powerful when made into AI agents, which is where we currently stand. 

The next part dealt with broader market movements, such as more foundational models (including Google’s turnaround with Gemini and Meta’s setbacks), China leading the open source, concerns about a potential bubble, and agents as the new consumers of web data. The presentation is worth watching, especially if you’re not well acquainted with the developments in AI.

What an inspiring time to be a small publisher.

Talk 4. The Anatomy of a Request: Bypassing Protections and Scaling Data Extraction

An ex-Michelin-star chef, Kieron Spearing from CentricSoftware now runs 5,000 scrapers that make 130M requests per day. That’s a pretty huge scale, if you ask us! Kieron shared his process for scaling web scraping operations without going insane over maintenance. It was a practical and highly actionable talk. 

According to the speaker, building resilient scrapers starts with the methodology. This requires experimenting with the request through cookies, headers, proxies, and other identifiers, until you’re left with the leanest working configuration. 

As a chef, Kieron is a big proponent of preparation. If there’s one thing we took away, it’s that every minute spent on investigation will save ten in implementation. But there was much more: for example, that the browser’s dev tools may not honor the original header order, or that going through a website’s API is always worth it, even if it requires much more upfront unblocking.
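The “leanest working configuration” idea can be sketched as a greedy minimization loop. Everything below is our own illustration, not Kieron’s code: `fetch()` is a stand-in for a real HTTP request, and `REQUIRED` simulates whichever identifiers the target actually checks.

```python
# Strip a request down to the minimum the target actually requires:
# try removing each component, and keep the removal if it still works.

REQUIRED = {"user-agent", "accept-language"}  # hypothetical target checks

def fetch(headers: dict) -> bool:
    """Stand-in for a real request; True if the response looks valid."""
    return REQUIRED.issubset(headers)

def minimize(headers: dict) -> dict:
    """Greedily drop headers the target does not actually require."""
    lean = dict(headers)
    for name in list(lean):
        candidate = {k: v for k, v in lean.items() if k != name}
        if fetch(candidate):   # still works without this header?
            lean = candidate   # keep the removal
    return lean

full = {"user-agent": "Mozilla/5.0", "accept-language": "en-US",
        "referer": "https://example.com", "x-debug": "1"}
print(sorted(minimize(full)))  # ['accept-language', 'user-agent']
```

The same loop applies to cookies, proxies, and TLS settings; what survives is the configuration you actually need to maintain.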

Kieron’s awesome prep list for building a web scraper.

Panel 2. The Future of Data Laws: AI, Web Data, and Intellectual Property

The inimitable Sanaea Daruwalla, Zyte’s Chief Legal Officer, invited three more lawyers to talk about intellectual property in the age of AI. Panelist Nikos Callum came from the Fortune 500 company Wesco, Dr. Bernd Justin Jutte of University College Dublin represented academia, and Callum Henry works alongside Sanaea at Zyte. 

The discussion revolved around relevant legislation and legal concepts. It explored the EU’s AI Act and its concept of risk tiers. We found it baffling that the level of risk is self-assessed, and that this doesn’t apply to personal AI use. According to the panelists, the EU’s opt-out requirement may also cause challenges, as there’s no set format for the procedure. 

We also had the chance to learn about US law, in particular its concept of fair use. Finally, the participants discussed some recent high-profile cases, namely the Anthropic book lawsuit and Getty vs Stability AI. It seems like so far judges have tended to favor AI companies in interpreting transformative use, but nothing has been set in stone yet. 

The panel discussion ended on a funny note: when it comes to giving legal advice on web scraping, large language models are much more cautious than even lawyers! Go figure. All in all, this one is highly recommended.

This was the only way to get all four panelists on one screen.

Talk 5. The New Era of AI Data Collection: A Deep Dive into Modern Web Scraping

Fabien Vauchelles, the man behind Scrapoxy, brought his famed slides to talk about the race between bots and anti-bots. Together with his collection of monochrome ducks, Fabien covered the main developments in bot protection. Then, he demonstrated how to build a self-healing scraper. 

Fabien’s anti-bot part talked about several threats. The network fingerprint, for one, is something that’s hard to create and easy to detect. The browser scene gave little relief, too, as our current champion Camoufox is open source and thus has been studied to death, and serious scraping requires expensive custom solutions. The presenter further identified new signals, such as the audio fingerprint. At least CAPTCHAs seem to be reaching a dead end for anti-bot tech. 

In the second part, Fabien showed several ways to maintain scrapers with large language models. He wrote an MCP server that injects middleware into Scrapy scrapers. Upon failure, an LLM generates new code until the spider works again. All a human needs to do is verify the pull request. 
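The loop can be sketched roughly as follows. This is our own illustration, not Fabien’s MCP server: `regenerate_spider()` stands in for the LLM call, and `run_spider()` simulates a spider that fails until it has been patched once.

```python
# A self-healing scraper loop: run the spider, and on failure feed the
# error back to a code generator until the spider works again.

def regenerate_spider(code: str, error: str) -> str:
    """Stand-in for asking an LLM to rewrite broken spider code."""
    return code + f"\n# patched after: {error}"

def run_spider(code: str) -> list:
    """Simulated spider run: fails until the code has been patched."""
    if "patched" not in code:
        raise RuntimeError("selector returned no items")
    return ["item"]

def self_heal(code: str, max_attempts: int = 3) -> list:
    for _ in range(max_attempts):
        try:
            return run_spider(code)
        except RuntimeError as exc:
            code = regenerate_spider(code, str(exc))
    raise RuntimeError("spider still failing; human review needed")

print(self_heal("spider v1"))  # ['item']
```

In the real setup, the “verify the pull request” step is where the human stays in the loop before the regenerated code ships.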

Fabien’s conclusions weren’t very inspiring. In-house scraping is becoming too resource demanding for many new players; and at the same time, the internet is closing off. But hey: we’re still here, so it’s not all doom and gloom.

A hero isn’t someone who doesn’t fall but rather someone who gets back up.

Talk 6. IPv6-Powered Web Scraping: Design Patterns, Pitfalls & Practical Checklists

Yuli Azarch, CEO of Rapidseedbox, explained why IPv6 proxies should be used in web scraping and how to do that effectively. The why part basically boiled down to IPv6 adoption and the costs associated with getting IPv4 IPs; the how part had few slides but made up the meat of the presentation. 

It turns out that websites don’t treat IPv6 addresses as individual IPs – rather, they evaluate them in blocks of /48 (a septillion addresses each). That’s why it’s best to have multiple /48 subnets or, for serious web scraping jobs, to go as far as /29. Yuli found that setting up reverse DNS delegation also helps prevent blocks.
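The /48 grouping is easy to see with Python’s `ipaddress` module (our own illustration, using the documentation prefix `2001:db8::/32`): two addresses from different /64s still collapse into the same /48, so rotating within a single allocation buys you nothing.

```python
# Why one /48 behaves like a single "IP" to many targets: addresses
# from the same /48 collapse to one network when grouped by prefix.
import ipaddress

def block_48(addr: str) -> str:
    """The /48 network an IPv6 address falls into."""
    return str(ipaddress.ip_network(addr + "/48", strict=False))

a = block_48("2001:db8:1:aaaa::1")
b = block_48("2001:db8:1:bbbb::2")   # different /64, same /48
c = block_48("2001:db8:2::1")        # different /48
print(a == b, a == c)  # True False
```

A rate limiter keyed on this prefix will count `a` and `b` as one client, which is exactly why rotation only helps across distinct /48 (or larger) allocations.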

Frankly, we had big expectations for this talk. Can you use IPv6 to scrape Google? Amazon? How many requests can you realistically make per /48 subnet? What about IPv6-only residential proxy pools, which are now emerging as a new product? Alas! But even if we ended up a little disappointed, we didn’t feel like our time was wasted watching the talk. 1.5x speed and skimming through the first half will give you good bang for your buck.

The blueprint “it’s cheaper if you get in early” sounds vaguely familiar.

Ending Remarks

Thanks to Zyte for organizing yet another great conference. If you’re human and managed to get this far down the page – you have our sincerest admiration and respect. Otherwise, please share your best cupcake recipe in the comments!

The post Zyte Extract Summit 2025 (Dublin): A Recap appeared first on Proxyway.

Finding Direction and Connecting Dots: My Journey with GeeLark
https://proxyway.com/news/dominic-li-journey-with-geelark
Mon, 17 Nov 2025

News

By Dominic Li, the founder of GeeLark.

Dominic Li
Every dreamer has that moment when random experiences suddenly align and make sense. For me, GeeLark was that spark – an “aha” moment that tied together ten years of hard‑won lessons, wrong turns, and late‑night epiphanies.

The Formative Years

Wind the clock back to 2011. After three years of wrestling with C++, I joined Tencent and officially entered the big leagues of IT. I bounced around Tencent News, Tencent Weibo, and eventually WeChat, watching up close as China jumped from desktop to mobile. Those days were more than just coding – they were a crash course in human behavior and product design. 

Allen Zhang, the founder of WeChat, was an unlikely mentor from afar. His belief that simplicity isn’t just an aesthetic, but a sign of respect for users’ time and attention, still anchors my approach. By 2015, I swapped the stability of Tencent for the chaos of my first start‑up, trying to connect traditional insurance companies with new O2O businesses. It was an early lesson in how the internet can rewire old industries – and how hard true innovation really is. 

A year later, I landed at Alibaba to manage user growth for UC Browser. There I learned what real growth looks like and got a front‑row seat to the explosion of user‑generated content and short videos that would later change the game.

Searching for Purpose

By 2018, the comfort of working for industry giants started to feel… well, boring. So I teamed up with several former Tencent colleagues to build a start‑up that focused on the WeChat ecosystem. 

We created hundreds of accounts, which we called an “account matrix”, that drove crazy traffic to our social commerce projects and built a loyal audience. It was here that I first encountered cloud phone technology. We used it to build a SaaS platform for in‑house marketing, helping online education providers manage their users more efficiently on WeChat and Rednote. 

Then came Covid-19. In 2021, I co-founded a DTC furniture business targeting the U.S. market with a former Alibaba colleague. Navigating supply chain nightmares felt like trying to row a boat in a typhoon, but we got through it.

The Moment of Clarity

Throughout 2022, a question kept nudging me: why were there plenty of marketing tools for desktop, but nothing truly built for mobile social media? It wasn’t just a random thought. 

Across several ventures, I’d watched brands struggle to manage social channels efficiently. Social media apps like Facebook and Instagram have both desktop and mobile versions. Their importance is undeniable, but globally, more people use them on mobile devices. More importantly, social media platforms like TikTok are becoming even more popular, especially among younger users – and TikTok is mobile-native. 

Existing tools were fragmented and clunky. Sometimes I wondered if I was imagining the problem, but every conversation with marketers echoed the same frustration. 

By 2023, I was in serious talks with my soon-to-be cofounder, Dufei – an old friend from my early WeChat days who helped build the app’s Android version. We refined our vision and met potential investors. By late 2023, everything clicked. GeeLark was born.

From Vision to Reality

During those early days, you could really see my drive. I’d tell the team, “I know exactly what this thing should be and how we’re going to build it.” I even mapped out our weekly goals in spreadsheets – running things like a “benevolent dictator.” 

Before we had the payment system set up – when all we had were the core features – I was already inviting friends to try it out. They were excited, but they also pointed out where we needed to improve. We hit up industry events, soaked up every bit of feedback, and kept tweaking the product.

I’d picture how people would use GeeLark, what roadblocks they’d hit, and how we could help them. On March 14, 2024, GeeLark went live as the first SaaS platform built on cloud‑phone technology for mobile social media marketing. Even though I felt we were on the right track, I still wondered how folks would react. Would others see the value we did?

The Growth Journey

Throughout 2024, we zeroed in on building automation tools for TikTok, working with everyone from big public companies to solo freelancers. Our clients not only validated what we were doing but also offered fresh ideas to make the platform better. 

Seeing customers use GeeLark to tackle real challenges gave me a deep sense of satisfaction. That’s when I knew we were creating something that really mattered. By late 2024, after listening closely to our users, we rolled out automation support for YouTube, Instagram, Twitter, Facebook, Reddit, and Pinterest – turning GeeLark into a true all-in-one social media marketing platform, especially for UGC campaigns. 

Then in early 2025, we had another big realization: while GeeLark could boost efficiency tenfold (or even a hundredfold), many teams struggled to keep up with the creative demands that came with that speed. So, we integrated generative AI tools like Seedance, KLING 2.1 Master, Hailuo 02, and Google Veo3 – not to flood social channels with more content, but to help brands create, manage, and optimize their social media presence with greater creativity, consistency, and impact. 

We face new challenges every day. After all, GeeLark is the first antidetect phone product, and we still need to explain to customers and the market what a cloud phone is and how it works. But I firmly believe the market needs a product like GeeLark.

The Path Forward

Today, GeeLark works with clients all over the world who rely on us to power their social media marketing. Our team runs on a tight two-week release cycle, constantly tweaking and improving the product based on real feedback and new challenges that pop up.

And honestly, this is just the beginning. GeeLark is still young, still evolving. As designer Charles Eames once said: “Toys are not really as innocent as they look. Toys and games are preludes to serious ideas.” GeeLark might look like just another tool, but behind it is a big idea: changing the way businesses connect with people in the mobile social era.

One day or another, when GeeLark is in your area, you will love her. (Editor’s note: This is actually a reference to the Korean girl group BLΛƆKPIИK!)

The post Finding Direction and Connecting Dots: My Journey with GeeLark appeared first on Proxyway.
