
Building a Stealth Browser: Interview with Rayobyte

Neil Emeigh explains why his company built an in-house Chromium fork and how it differs from available open-source tools.


Web scraping is becoming all but impossible without a web browser. Great open-source implementations like Camoufox exist, while AI needs have spawned a segment of cloud browsers built for agentic tasks.

Rayobyte, however, believes that the market lacks great tools that are purpose-built for web scraping. The provider has made its own implementation of Chromium to address this gap.

We sit down with Rayobyte’s CEO Neil Emeigh to discuss how rayobrowse works, the ways it differs from other available tools, and its place in the fast-growing market of cloud web browsers. The interview also includes a browser fingerprinting benchmark done in collaboration with ScrapeOps. 

You recently announced a new tool called rayobrowse. What is it, exactly?

“rayobrowse is a self-hosted, Chromium-based stealth browser built for web scraping and automation. It runs inside Docker, requires no GPU, and works on headless Linux servers. Your existing Playwright, Selenium, or Puppeteer scripts connect to it over standard CDP – no code changes needed.

Each session gets a realistic, real-world device fingerprint (user agent, screen, canvas, WebGL, fonts, timezone, WebRTC, etc.) drawn from a database of thousands of profiles. We use it in production on Rayobyte’s scraping API to scrape millions of pages per day across some of the most difficult sites on the web.”
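The per-session fingerprinting Emeigh describes could be sketched, in heavily simplified form, as below. The profile records and attribute names here are illustrative stand-ins, not rayobrowse's actual schema; the point is that every attribute of a session comes from one coherent real-world record.

```python
import random

# A tiny stand-in for rayobrowse's profile database. In production this
# would hold thousands of real-world fingerprints; these two records are
# purely illustrative.
PROFILES = [
    {
        "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ...",
        "os": "Windows",
        "screen": (1920, 1080),
        "timezone": "America/Chicago",
        "fonts": ["Segoe UI", "Calibri", "Arial"],
    },
    {
        "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ...",
        "os": "macOS",
        "screen": (2560, 1600),
        "timezone": "Europe/Berlin",
        "fonts": ["Helvetica Neue", "SF Pro", "Arial"],
    },
]

def new_session_profile(rng=random):
    """Draw one coherent device profile for a new browser session."""
    # All attributes come from the same record, so the OS, fonts,
    # timezone, and screen resolution stay mutually consistent.
    return dict(rng.choice(PROFILES))

profile = new_session_profile()
print(profile["os"], profile["screen"])
```

Drawing the whole profile from a single record, rather than mixing attributes independently, is what keeps the fingerprint internally consistent (e.g. macOS fonts never appear alongside a Windows user agent).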

Why build your own web browser instead of using open-source tools like Patchright or Camoufox?

“I described the journey that ultimately led us to build rayobrowse in an earlier blog post. But here’s the short version.

1) Camoufox

Camoufox is based on Firefox, which represents only about 2% of global browser market share. At the scale we operate at Rayobyte (hundreds of millions of requests per month), we knew that sending massive volumes of Firefox traffic would make us stand out immediately. Target websites simply don’t see that many real Firefox users.

Because Camoufox is open-source, we were also able to see how some anti-bot companies had reverse engineered subtle signals that made it relatively easy to detect. Over time, those detection vectors became more obvious. On top of that, the project was largely unmaintained. We’ve seen signs that a new owner may be taking it over, but time will tell and that still doesn’t solve the first two issues.

2) Patchright

Patchright works well for basic websites, but it doesn’t mask fingerprints at a deeper level. You still show up with the same underlying OS, GPU, and system-level characteristics.

The bigger issue is this: most scraping at scale doesn’t happen on someone’s local laptop…it happens on Linux servers. So when using Patchright in cloud environments, the fingerprint effectively tells websites, “Hi, I’m a Linux server.” In other words: “I’m probably a bot.”

At our scale, we needed to run on scalable cloud infrastructure while ensuring our fingerprints looked like real user environments (like a real Windows user). That level of control simply wasn’t possible with existing tools.”

Could you explain how rayobrowse works in more detail?

“It has three layers. At the bottom is a Chromium fork — we track upstream releases and apply a focused set of patches (similar to how Brave maintains its fork) that normalize exposed APIs, reduce fingerprint entropy leaks, and improve automation compatibility while preserving native Chromium behavior.

On top of that is a fingerprint engine: at session startup, each browser gets a complete real-world device profile — OS metadata, screen resolution, Canvas/WebGL rendering attributes, fonts matched to the target OS, locale, timezone, and WebRTC config. These profiles come from a database of thousands of fingerprints collected using the same techniques anti-bot companies use.

Finally, an automation layer exposes only standard Chromium CDP interfaces. Your scripts connect through native CDP and operate on unmodified page contexts. It all runs inside a single Docker container, so there are zero host dependencies beyond Docker and Python.”
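Since rayobrowse exposes only standard CDP, existing tooling can discover its WebSocket endpoint the same way it would for any Chromium instance, via the DevTools `/json/version` endpoint. A minimal sketch follows; the host and port are assumptions for illustration, not rayobrowse's documented defaults.

```python
import json
import urllib.request

def version_url(host="localhost", port=9222):
    """Build the standard Chromium DevTools discovery URL."""
    return f"http://{host}:{port}/json/version"

def websocket_endpoint(host="localhost", port=9222):
    """Fetch the CDP WebSocket URL a Playwright/Puppeteer script would
    connect to. Requires the browser container to be running."""
    with urllib.request.urlopen(version_url(host, port)) as resp:
        return json.load(resp)["webSocketDebuggerUrl"]

# Usage (with the container running):
#   ws = websocket_endpoint()
#   browser = chromium.connect_over_cdp(ws)   # Playwright
print(version_url())  # http://localhost:9222/json/version
```

Because discovery and connection both go through unmodified CDP, this is the sense in which existing Playwright, Selenium, or Puppeteer scripts need no code changes beyond pointing at a different endpoint.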

How effective have you found it at unblocking major anti-bot systems? How does it fare in synthetic benchmarks, like the one run by ScrapeOps?

“We’re not familiar with their benchmarks, but we were desperate to find a working browser solution to keep a major Fortune 500 customer of ours, which used our scraping API, online. We “got by” for a little while on Camoufox and Patchright until those were detected, and we tested every cloud browser and “antidetect” browser on the market.

We found a couple of antidetect browsers that worked… but they were Windows-based and very difficult to work with on the code side for large-scale scraping (their business model is people managing accounts, and so on – not web scrapers).

We are currently scraping many millions of pages per day with our browser on SERPs, and millions more on a popular Asian e-commerce site. We also regularly test, and get feedback from users, that our system works well against Cloudflare, DataDome, Akamai, and PerimeterX.

One motivation behind our release is to get feedback from real users about any websites it doesn’t work on, so we can continually improve.”

You currently distribute rayobrowse as a restricted-access beta tool. What are your future plans for this product?

“Right now, it’s distributed as a restricted-access beta with a free tier (one concurrent browser, no registration needed) and an unlimited-concurrency tier for Rayobyte proxy customers. A third option is buying concurrent browsers, so you can self-host and bring your own proxies. Lastly, we’re also building out a cloud browser mode for those who don’t want to self-host.

Longer term, we want rayobrowse to be the default browser for anyone doing serious web scraping. It’s commercially maintained, always current against the latest anti-bot techniques, and available at whatever abstraction level you need: self-hosted, cloud, or integrated into our scraping API.”

We had the chance to collaborate with ScrapeOps and test rayobrowse using their evaluation framework. The benchmark relies on Antoine Vastel’s fingerprinting tool to check for giveaways in user agents, hardware parameters, CDP automation, and other signals. 

rayobrowse did very well, passing the majority of checks. In ScrapeOps’ framework, it got a score of 88.42, nearly matching the performance of established competitors like Bright Data, Scrapeless, and Zenrows.

Here’s a summary of the report:

In our tests, Rayobrowse provided a robust defense against browser fingerprinting by maintaining high internal consistency across various layers. The browser presented a modern Chrome 144 environment on Windows, with HTTP headers that included advanced compression methods like `zstd` and properly localized `Accept-Language` values for countries such as Japan, Germany, and Russia.

The provider’s hardware emulation was particularly diverse. Rather than using a static “one-size-fits-all” hardware profile, Rayobrowse cycled through various CPU core counts and genuine consumer GPU renderers, including both NVIDIA RTX series and Intel HD/UHD integrated graphics. This diversity, combined with unique fingerprint hashes for every session, ensures that automated traffic does not form a recognizable cluster. Despite these strengths, the browser’s internal geometry and font list remained static and somewhat unrealistic, which may alert more advanced anti-bot systems.
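The clustering concern the report raises can be illustrated with a small sketch: an anti-bot system might hash each session's exposed attributes and flag large groups of identical hashes. The attribute names below are hypothetical, chosen only to mirror the report's examples.

```python
import hashlib
import json

def fingerprint_hash(attrs: dict) -> str:
    """Compute a stable digest over a session's exposed attributes,
    roughly as an anti-bot system might do to cluster traffic."""
    canonical = json.dumps(attrs, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

base = {"ua": "Chrome/144", "cores": 8, "gpu": "NVIDIA GeForce RTX 3060"}
varied = {**base, "cores": 12, "gpu": "Intel(R) UHD Graphics 630"}

# Cycling hardware attributes per session yields distinct hashes, so
# automated traffic does not collapse into one recognizable cluster.
print(fingerprint_hash(base) != fingerprint_hash(varied))
```

This also shows the flip side the report notes: attributes that never vary (like a static geometry or font list) contribute nothing to this diversity and remain a clusterable signal.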

Unblocking browsers have exploded as a category in recent years. Established players like Bright Data and Oxylabs now compete alongside newer entrants such as Browse-Use, Kernel, and Browserbase. So where do you fit in?

“We see a clear void in the market.

On one side, you have cloud browsers built by companies that deeply understand scraping, such as Bright Data and Oxylabs. But they’re expensive, and they don’t offer self-hosting. You’re locked into their infrastructure and pricing.

On the other side, you have cloud browser companies that don’t really come from a scraping background. The ScrapeOps benchmarks make this pretty clear: platforms like Browserbase, and others in that category, simply don’t have 10+ years of scraping and proxy experience behind them like we do. No disrespect intended – it’s just not their core business model. As a result, they’re not naturally tuned to the subtle signals and edge cases that matter in stealth. And again, they’re not self-hosted.

Then you have self-hosted solutions. Most anti-detect browsers don’t work well in Linux server environments, which makes them extremely hard to scale. When you’re running thousands of browsers like we are, managing fleets of Windows servers becomes a nightmare.

And then there’s Camoufox and Patchright (covered above) with their own structural limitations. So where do we fit?

We offer a browser you can self-host if you want to, and do so affordably. In my opinion, one of the reasons Camoufox became so popular was that it was the first true open-source stealth browser. Before that, your only options were expensive cloud solutions or patching something together that “mostly worked.”

We believe our browser fills the gap Camoufox left behind, with the added benefit of being Chromium-based. Stealth is not a side feature for us, it really is the core focus. It has to be. Our business model as a scraping company depends on remaining undetected.

And for those who don’t want to self-host, we’ll offer an affordable cloud option as well. The key word is “affordable”. At Rayobyte we consume over 200,000 GB of bandwidth every month on our browsers. If we applied that to Bright Data’s $5/GB model, well, we would be out of business!”

Do you agree with Zyte’s notion that it’s becoming irrational for web scrapers to manage proxies in-house, and that they’ll increasingly need to use higher-level services like APIs to succeed?

“I share the same opinion that 10 years from now it’ll be impossible for hobbyists or startups to install a bunch of repos and ‘tada’, they’re able to scrape at scale.

The anti-bot companies continue to grow (look at Cloudflare’s dominance as a CDN), and their business model gives them the resources to build great anti-bot solutions that can block many users. So, they will incrementally get better, while individuals won’t have the budget or knowledge to keep up.

That being said, that’s not today. Our browser is plug-and-play: in a couple of minutes, you can use it to load most major websites, including sites protected by anti-bot systems. So, until that future becomes reality, we hope to help people stay ahead of the curve as much as possible.”

Rayobyte as a company itself has gone through quite a few changes these past few years. Can you share what’s next for you?

“As I wrote in my blog post, we discovered a year ago how good we really are at scraping. Previously we stayed in our lane of proxies, but we came to find out that we were able to scrape some high-value targets in Asia that the top scraping companies were unable to scrape.

From that know-how, we’ve continued building our scraping services, and our scale grows month over month. Within that effort, needs arose, such as a scalable browser, and hence rayobrowse was born. As we look forward, we’ll lean further into this skillset of ours to service high-value scraping endpoints and provide tools that help others scrape more easily.

Check out our GitHub and try it for free today: https://github.com/rayobyte-data/rayobrowse”