Oxylabs Archives - Proxyway
Your Trusted Guide to All Things Proxy

Oxylabs Launches Dedicated ISP Proxies in Self-Service
https://proxyway.com/news/oxylabs-launches-dedicated-isp-proxies-in-self-service
Tue, 03 Feb 2026 09:25:10 +0000

The post Oxylabs Launches Dedicated ISP Proxies in Self-Service appeared first on Proxyway.

Oxylabs

The product offers nearly unlimited access to proxy IPs in 14 countries.

Adam Dubois

Oxylabs, the Lithuanian provider of web data extraction tools and services, has launched self-serve dedicated ISP proxies. Previously, this product was only available to enterprise customers through a sales process. 

The new product gives access to a list of IPs associated with consumer internet service providers. These IPs are static (with optional rotation functionality) and not shared with anyone else. 

The dedicated ISP proxies currently cover 14 countries across four continents. Each country includes between one and five ASNs. You’re free to choose any combination of countries and ASNs during purchase.

The proxies support HTTP, HTTPS, and SOCKS5 protocols, including UDP. They’re also nearly unlimited: the fair usage policy kicks in after hitting the 100 GB/IP mark; past this point, Oxylabs cuts concurrent sessions to 10/IP until the new billing cycle.

The dedicated ISP proxies are priced per IP address. Plans start from $16 for 5 IPs ($3.20/IP) and reach $750 for 300 IPs ($2.50/IP) with the largest public plan.
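The per-IP figures come from dividing the plan price by the number of IPs. A quick check in Python, using only the two plans quoted above (the helper name is ours, purely for illustration):

```python
def price_per_ip(plan_price_usd: float, ip_count: int) -> float:
    """Return the effective price of a single IP in a plan, rounded to cents."""
    if ip_count <= 0:
        raise ValueError("ip_count must be positive")
    return round(plan_price_usd / ip_count, 2)


# The two public plans quoted in the article.
smallest = price_per_ip(16, 5)     # entry plan: $16 for 5 IPs
largest = price_per_ip(750, 300)   # largest public plan: $750 for 300 IPs

# Relative saving per IP when moving from the smallest to the largest plan.
discount_pct = round((1 - largest / smallest) * 100, 1)
```

In other words, the largest public plan is roughly 22% cheaper per IP than the entry plan.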

The product is already available for purchase. There is no free trial, but Oxylabs offers a 3-day limited refund policy.

OxyCon 2025: A Recap
https://proxyway.com/news/oxycon-2025-recap
Thu, 09 Oct 2025 09:31:52 +0000

The post OxyCon 2025: A Recap appeared first on Proxyway.

Oxylabs

Our virtual impressions from Oxylabs’ sixth annual conference on web scraping.

Adam Dubois
OxyCon, one of the two major conferences in our field, flew by in an instant. If you didn’t manage to get on board, don’t worry – we watched it all and documented our impressions here. Oxylabs will make the talks available on demand, so you can quickly get acquainted with them before tuning in.

You'll find our coverage of earlier OxyCons and other major industry events here.

Organizational Matters

Oxylabs stayed true to its tested formula and made the conference online only. Anyone was free to attend, as long as they registered beforehand. On the day of the event, the organizer sent an email with a link and a code. It led to a lobby that included a video stream, a Slido widget for questions, and the agenda – a very standard affair.

The platform for online viewers.

As a European company, Oxylabs catered mainly to this continent, in particular the British Isles. The schedule used BST as the reference timezone and occupied a timeslot between 12 and 5:30 PM. East Coast Americans were realistically able to watch it, but it was too early for the West Coast and too late for most of Asia. 

We always find this fascinating, and the year 2025 was no exception: despite opting for online-only attendance, the organizer still had a venue with hosts and a real audience. We never saw the live attendees, but they could be heard cheering and clapping. We presume these were mostly Oxylabs’ employees. 

To make attendance more exciting, Oxylabs ran several quizzes with prizes on its Discord. The server also had a conference chat where presenters could tackle the questions that didn’t make the cut on stage due to time constraints. Believe us – that was necessary, as each talk prompted a surprising number of questions. 

All in all, the event went smoothly, and it was clear that the organizers have more or less perfected this format. Our only observation is that it was short – including all the talks, panel discussions, and breaks, OxyCon took only five and a half hours in total.

Main Themes

No surprises here: the hero of this narrative was large language models. We saw them in all shapes and sizes: as parsing assistants, agents, and code generators. Zia Ahmad brought the theoretical chops, the famous Pierluigi from The Web Scraping Club shared some practical applications, while team Oxylabs demonstrated AI in their products. 

We’re sure this topic will remain on top of everyone’s minds for the foreseeable future (or until the impending burst of the AI bubble and subsequent collapse of the financial system, if you haven’t had your morning coffee yet). But who can blame them, really?

We loved that Oxylabs managed to fit in two panel discussions. The lawyers discussed large language models from their own perspective, which is always fascinating to follow. The second panel addressed another elephant in the room, which tends to be overshadowed by AI – unblocking. Both are highly recommended, but we’ll talk about them later in this recap. 

Our final note here is that OxyCon had not one, but two introductory speeches. The first was given by co-CEO of Tesonet (the company behind NordVPN) Tomas Okmanas. The second, which also took no longer than five minutes, warned about the dangers of gatekeeping and monopolizing data. But we shouldn’t let that put a cloud(flare) over our skies. Sorry, we couldn’t resist it.

The Talks

Talk 1. From Chaos to Clarity: Data Structuring in Large-Scale Scraping

Aleksandras Šulženko, Product Owner at Oxylabs, kicked off the presentations with a walk through history and a feature reveal. He recounted all of his company’s approaches to data parsing, culminating in AI-made parsers that heal themselves. 

The company’s road has been long and winding, with seven steps leading to the current implementation. They started with dedicated scrapers, dabbled in machine learning models, and even accepted manual parsing instructions, before arriving at an LLM-based approach. Aleksandras narrated the process very well, highlighting the strengths and weaknesses of each step. 

The apex approach generates selectors from plain language prompts, with an optional schema to ensure better accuracy. However, its main breakthrough is that the system can automatically notice once these static parsers break, regenerating them without manual intervention. At this point, the flow of the presentation collapsed a little (because how the heck do you demonstrate self-healing parsers?), but we still consider it a worthy watch.
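To make the self-healing idea more tangible, here is a minimal sketch of breakage detection. It assumes a parser is a mapping of field names to regular expressions and that `regenerate` stands in for the LLM call. The talk did not show implementation details, so every name here is hypothetical and not Oxylabs' API.

```python
import re
from typing import Callable, Dict, Optional, Set, Tuple

# Hypothetical parser format: field name -> regex with one capture group.
Parser = Dict[str, str]
Record = Dict[str, Optional[str]]


def run_parser(parser: Parser, html: str) -> Record:
    """Apply every field pattern to the page and keep the first match."""
    record: Record = {}
    for field, pattern in parser.items():
        match = re.search(pattern, html)
        record[field] = match.group(1) if match else None
    return record


def is_broken(record: Record, required: Set[str]) -> bool:
    """Treat the parser as broken when any required field came back empty."""
    return any(not record.get(field) for field in required)


def parse_with_healing(
    parser: Parser,
    html: str,
    required: Set[str],
    regenerate: Callable[[str], Parser],
) -> Tuple[Parser, Record]:
    """Parse once; if required fields are missing, regenerate and retry."""
    record = run_parser(parser, html)
    if is_broken(record, required):
        parser = regenerate(html)  # placeholder for the LLM-backed step
        record = run_parser(parser, html)
    return parser, record


# The site changed its markup, so the old selector no longer matches.
old = {"price": r'<span class="price">([^<]+)</span>'}
page = '<div data-price="19.99">Bike</div>'
fake_llm = lambda html: {"price": r'data-price="([^"]+)"'}  # stand-in generator

parser, record = parse_with_healing(old, page, {"price"}, fake_llm)
```

A production system would likely validate against a full schema and sample many pages before declaring a parser broken; a single empty field is the simplest possible trigger.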

Aleksandras showing the vices and virtues of the parsing methods Oxylabs has tried.

Talk 2. Scaling E-Commerce Data Extraction: From Zero to 10 Billion Products a Day

A good one. Fred de Villamil, the self-proclaimed CTO of scale-ups, explained how his company NielsenIQ manages to run over 10,000 precisely geolocated spiders for digital shelf analytics. In a nutshell, Fred’s team helps brands like Walmart to understand how their stores perform online. 

The speaker outlined the three main challenges he faces, namely coverage, resource management, and anti-bots. He then introduced Nielsen’s strategy for building a process that scales. It involved custom anti-bot tooling, a centralized control center, robust monitoring tools, and even an academy for onboarding new people to his team of 50 web scraping specialists. 

Some facts: it takes between six and eight days to build a spider, and the hardest bot protection system to overcome is PerimeterX. You’ll find plenty more where that came from.

Fred's employer is hoovering up data at an industrial scale.

Talk 3. Creating an AI-Powered Price Comparison Tool With Cursor and Oxylabs’ AI Studio

Another product demonstration. This time, Oxylabs’ Head of Data Rytis Ulys took the wheel to showcase his company’s new AI Studio. It includes endpoints for scraping and crawling websites, searching Google, and controlling cloud browsers – they’re meant for AI startups and bear a strong resemblance to Firecrawl. 

Rytis introduced a hypothetical scenario, where he wanted to open a bike store and needed competitive intelligence. He used Cursor, as well as AI Studio’s crawling and browser endpoints to create a scraper and build two sets of product data from competitor websites within minutes. 

The demo was pre-recorded, but it showed what the presenter wanted viewers to witness: that it’s now possible to quickly get data without building parsers, fighting with blocking mechanisms, or even knowing how to code well. The current iteration of AI Studio feels a little like a playground, removed from Oxylabs’ other services. But its utility is evident, and we’re sure the provider will figure out a way to incorporate it into the main product line-up.

AI Studio includes AI Crawler, AI Scraper, AI Map, AI Search... and a Browser Agent.

Talk 4. The AI-Scraper Loop: How Machine Learning Improves Web Scraping (and Vice Versa)

Zia Ahmad, Data Scientist at Turing, explored how AI (in a broader sense than LLMs alone) and web scraping feed off one another, creating a virtuous cycle of improvement.

The talk started off by showing how web scraping complements ML, which boiled down to stating that language models need a lot of data to work. In the second part, the speaker tried exploring web scraping through an LLM interface, with various results. He then moved on to data parsing techniques, which included computer vision, sequence models for selectors, and using multiple models at once to reach a consensus. 
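The consensus idea is easy to picture as a majority vote over the values several extractors return for the same field. The sketch below is our own illustration rather than anything shown in the talk; the function name and the threshold are assumptions.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus(values: Sequence[Optional[str]], min_share: float = 0.5) -> Optional[str]:
    """Return the value most extractors agree on, or None without a clear majority.

    A missing value (None) counts as an abstention, but it still lowers
    the share of the winning answer.
    """
    votes = Counter(v for v in values if v is not None)
    if not votes:
        return None
    winner, count = votes.most_common(1)[0]
    return winner if count / len(values) > min_share else None


# Two of three hypothetical models agree on the price.
agreed = consensus(["$19.99", "$19.99", "$1999"])

# No answer gets more than half of the votes, so nothing is accepted.
split = consensus(["$19.99", "$1999", None])
```

Rejecting a split vote instead of picking a winner keeps doubtful values out of the dataset, at the cost of leaving some fields empty.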

Zia is an educator with many courses under his belt, so we enjoyed learning about the possible machine learning techniques for data parsing and validation. But when it came to data access, we found his arguments somewhat lacking.

Turns out, democracy has a role even in data parsing!

Panel Discussion 1. Web Scraping and AI: Legal Touchpoints and Ways Forward

The first panel discussion had three lawyers (Mindaugas Civilka from Tegos Law Firm, Alex Reese from Farella Braun + Martel, and Kieran McCarthy from McCarthy Law Group), one VP of Engineering (Chase Richards from Corsearch), and Denas Grybauskas – also a lawyer from Oxylabs – as the moderator. The panelists have worked on some high-profile cases, such as HiQ vs. LinkedIn, so the line-up here was very strong. 

The discussion touched upon quite a few topics. For example, we learned about the main legal questions raised in web scraping, legislation involving AI and the changes it’s brought to the legal world. Much attention was given to the topic of copyright, raising the concept of copyright preemption. The panelists also spoke about how to balance the interests of AI companies and the rest of the world in general. The efforts include Cloudflare’s gatekeeping, remaking the robots.txt file, and more.

It was a brilliant choice to include lawyers representing both American and European legal systems. All in all, we highly recommend watching this panel.

The three plus two panelists.

Talk 5. How AI Reshaped My Workflow As a Scraper Developer and Content Creator

The final solo presentation involved Pierluigi Vinciguerra from DataBoutique and The Web Scraping Club. He shared how LLMs helped him to automate time-consuming tasks both as a content creator and a web scraping professional. 

In particular, Pierluigi built several helper tools. One of them automatically manages the access level and permissions of paying newsletter users. The second aggregates relevant articles from sources like Reddit and Hacker News, compiling a summarized reading list. After this, Pierluigi showed his LLM-assisted scraping setup which included a blueprint with detailed instructions to ensure that the model will always adhere (to the best of its abilities) to best practices. 

Practical examples aside, Pierluigi shared some nuggets of wisdom. The main takeaway is becoming common knowledge, but it’s still worth repeating: language models are amazing for horizontal scaling. But the most striking statement was that AI wrote over 90% of his code last year. We enjoyed watching and recommend this talk.

When LLMs dream of electric sheep, Anthropic's CEO dreams of Pierluigi.

Panel Discussion 2. Advanced Web Scraping: Techniques to Stay Unblocked

The second panel included Ieva Šataitė from Oxylabs, Juan Riaza Montes from Idealista, Hocine Amrane from NielsenIQ, and Tadas Gedgaudas, a former Oxylabs employee who left to found topYappers. The discussion was moderated by Juras Juršėnas, COO at Oxylabs. We’ll say outright that it’s one of the must-watches of the conference.

The panelists started by sharing what changed in a year. Of course, the big topic was Google cracking down on web scraping. But in general, unblocking has become harder and now requires understanding deep tech. Anti-bot solutions have become a big business and, as the guys from Nielsen love to say, what took two days to unblock now can take two weeks. 

On the upside, there’s a lot of activity in open source tools, which are good for up to 90% of use cases. The key is to have a system where you can quickly plug in and test a tool. However, most agreed that it makes no sense to bang your head against the wall – at some point, the better option is to outsource. 

As in the previous panel, Cloudflare was on top of everyone’s minds, and it was evident that the incentive system of the web was changing. The panelists shared their other fears, such as new fingerprinting methods like JA4, the increasing resources required to find unblocking techniques, and the possible need to use real devices to scrape.

The discussion addressed many smaller questions: for example, whether DataDome is the hardest anti-bot to defeat, or whether Asian e-commerce stores really serve more fake data than those on other continents. All in all, despite their concerns, the panelists remained optimistic about the future.

Unblocking websites is no joke, but there's no need to take things too seriously.

Bottom Line

That was 2025’s OxyCon. We learned a lot, and hopefully, so have you! Go watch the talks while we wait for the second edition of Zyte’s Extract Summit.

Oxylabs Invalidates Four of Bright Data’s Patents in Court
https://proxyway.com/news/oxylabs-invalidates-four-bright-data-patents-in-court
Thu, 14 Aug 2025 10:17:05 +0000

The post Oxylabs Invalidates Four of Bright Data’s Patents in Court appeared first on Proxyway.

Oxylabs

Bright Data’s grip on residential proxy technology is slipping.

Adam Dubois

On August 1, the U.S. Court of Appeals confirmed the lack of validity for four of Bright Data’s patents: no. 10,257,319; 10,484,510; 11,044,342 and 11,044,344. 

The Israeli company used these patents as a basis for its residential proxy server technology – and as a bludgeon to challenge other providers in court.

The Court’s decision upholds an earlier decision made by the U.S. Patent Office. It came as part of a sprawling dispute between Oxylabs and Bright Data and invalidated Bright Data’s patents based on their obviousness and appearance in prior art. 

Throughout the years, Bright Data had sued multiple companies, such as NetNut, Oxylabs, and BiScience (GeoSurf), over matters relating directly to or involving its patents. It had also intimidated competitors like SOAX against renting proxy servers in Texas, which Bright Data uses as the venue for litigation.

Oxylabs had the following comment:

We welcome the Court’s recent decision to invalidate Bright Data patents, including the two patents previously used in litigation against Oxylabs over our residential proxy technology. In our view, this outcome strongly supports the position we’ve maintained for years.

Although the process was lengthy, it reaffirms our belief that the legal system ultimately ensures fair decisions – benefiting not only Oxylabs but the entire market.

We’re grateful to our legal team, advisors, and colleagues for their dedication. At Oxylabs, we remain committed to fair competition, technological innovation, and defending both.

Bright Data preferred not to comment. 

While significant, this development doesn’t conclude the legal battles in the proxy server market. However, it does allow providers to breathe a little easier. 

We believe that ideas should be rewarded – as the proxy market grows mature, patents may serve that purpose. However, patents must be carefully crafted to protect genuinely innovative ideas rather than be weaponized to stifle competition or exclude legitimate market participants.

Oxylabs Reveals 2025 OxyCon’s Agenda
https://proxyway.com/news/oxylabs-reveals-2025-oxycon-agenda
Wed, 06 Aug 2025 10:20:16 +0000

The post Oxylabs Reveals 2025 OxyCon’s Agenda appeared first on Proxyway.

Oxylabs

The conference will feature five presentations and two panel discussions.

Adam Dubois

Oxylabs, the Lithuanian provider of web scraping infrastructure, has announced the agenda for OxyCon 2025, its annual conference on web data collection. 

The conference will take place online on October 1st. Participation is free of charge. 

This year, OxyCon’s line-up comprises five presentations and two panel discussions. 

  • The presentations will teach how to structure data at scale, grow e-commerce data extraction to billions of products per day, build a price comparison tool with Oxylabs’ new AI studio, improve web scraping with machine learning, and more. 
  • The panels will discuss the legal aspects surrounding web scraping and AI, as well as advanced web scraping techniques to stay unblocked.

Aside from Oxylabs’ in-house team, the list of participants includes companies like NielsenIQ, Google, Idealista, and leading law firms in the field.

You can register for OxyCon on its designated page. If you want to learn more before committing, we covered the last three conferences in detail. 

OxyCon will be one of the two major web scraping-related conferences this year. The second, Zyte’s Extract Summit, will take place in late September (Austin) and early November (Dublin).

Oxylabs Group Acquires ScrapingBee
https://proxyway.com/news/oxylabs-acquires-scrapingbee
Thu, 19 Jun 2025 08:40:26 +0000

The post Oxylabs Group Acquires ScrapingBee appeared first on Proxyway.

Oxylabs

The web scraping API provider will continue functioning as a separate company.

Adam Dubois

Oxylabs, the Lithuanian provider of web data extraction tools and services, has announced the acquisition of ScrapingBee, a web scraping API service based in France. 

The acquisition involved an eight-figure sum, financed from the acquirer’s internal resources.

According to the Chief Financial Officer of Oxylabs’ company group:

ScrapingBee has built a stellar reputation for its high-quality, easy-to-use web scraping product. With our expertise and technical capabilities, we are confident that ScrapingBee will scale even further. This acquisition also enables us to strengthen our position within the web scraping industry, as we are adding a leading direct-to-consumer product to our portfolio.

To quote ScrapingBee’s founders:

From a passion project to a fast-growing company, ScrapingBee has exceeded our expectations. Joining forces with an industry leader with deep expertise and a shared vision opens up exciting opportunities for our team and customers. Together, we're set to push the boundaries of what's possible in the web scraping industry.

ScrapingBee will continue functioning as a separate entity. According to the announcement, Oxylabs plans to gradually integrate ScrapingBee into its broader ecosystem. 

ScrapingBee was started in 2019 by two founders, Kevin Sahin and Pierre de Wulf. The company was mostly bootstrapped, scaling to over 2,500 customers and $5 million in annual recurring revenue with a team of six.

Oxylabs Announces OxyCon 2025
https://proxyway.com/news/oxylabs-announces-oxycon-2025
Wed, 18 Jun 2025 10:51:29 +0000

The post Oxylabs Announces OxyCon 2025 appeared first on Proxyway.

Oxylabs

The annual web scraping conference will be held online on October 1.

Adam Dubois

Oxylabs, the Lithuanian provider of web data extraction tools and services, has announced 2025’s OxyCon, an annual conference on web scraping. 

The one-day event will take place on October 1. As usual, the talks will be delivered online, with a Discord channel for discussion. Attendance is free of charge.

This year’s edition will revolve around three topics; the full agenda is yet to be revealed:

  1. Building efficient and resilient web data workflows, which will talk about scaling operations, beating anti-bot systems, ensuring compliance, and more.
  2. Discovering the role of AI in web scraping, where viewers will learn about adaptive scraping, AI agents, and machine-powered data structuring.
  3. Exploring how data turns into a business advantage, where businesses will share practical strategies and case studies to unlock the full potential of web scraping. 

The previous OxyCon drew over 2,250 attendees. You can read our recap of the event here.  

OxyCon is one of three major conferences dedicated to web data collection. The other two are Zyte’s Extract Summit and Bright Data’s ScrapeCon. 

For more information and registration, visit the dedicated OxyCon page.

Oxylabs Launches a Product Suite for Video Data
https://proxyway.com/news/oxylabs-launches-a-product-suite-for-video-data
Fri, 16 May 2025 12:19:20 +0000

The post Oxylabs Launches a Product Suite for Video Data appeared first on Proxyway.

Oxylabs

High bandwidth proxies, YouTube API & video datasets aim to satiate AI’s demand for multimodal data.

Adam Dubois

Oxylabs, the Lithuanian provider of web scraping infrastructure and services, has launched a suite of products tailored specifically for extracting videos. 

It includes three options: 1) high bandwidth proxies, 2) YouTube scraping API, and 3) pre-scraped YouTube datasets. 

So far, none of the three has public pricing; all are sold through the sales team.

High Bandwidth Proxies

This is a proxy network built specifically for high-volume data collection. According to Oxylabs, its infrastructure can handle over 200 Gbps of bandwidth at once. 

The network includes “millions” of stable IPs from diverse subnets. The provider doesn’t even mention which proxy types the pool consists of – the full focus is on capacity and scraping success. 

High bandwidth proxies are designed to be a plug-and-play solution: they integrate using one dedicated endpoint, handling rotation and cooldown mechanisms automatically. 

High bandwidth proxies are fully compatible with yt-dlp.

YouTube API

Oxylabs’ YouTube API returns videos and related data in a structured format. The product includes five endpoints:

  1. Search for discovery; it can fetch up to 700 results per query. 
  2. Trainability to verify if a video is eligible for AI training purposes.
  3. Metadata to scrape the tags, supported formats, and other information related to the video. 
  4. Downloader to get the actual video file – or files using batch downloads.
  5. Transcript to extract transcribed text from a video file.

The endpoints interplay with one another to cover all stages of YouTube data extraction:

stages of youtube data extraction

YouTube Datasets

Oxylabs’ YouTube datasets cover over 4M videos from 1M channels, including their transcripts (JSON), metadata, video (mp4) and audio (m4a) files. All videos reportedly have consent for AI training. 

The output can be delivered via webhook or to major cloud storage platforms. It’s also possible to request custom datasets.

Bottom Line

The video product suite has prominently appeared as a separate category on Oxylabs’ website. This shows how important – and data hungry – the use case of training multimodal AI models currently is for web data collection companies. 

OxyCon 2024: A Recap
https://proxyway.com/news/oxycon-2024-recap
Wed, 02 Oct 2024 11:18:02 +0000

The post OxyCon 2024: A Recap appeared first on Proxyway.

Oxylabs

Our impressions from Oxylabs’ fifth annual web scraping conference. 

Adam Dubois

The anniversary edition of OxyCon is behind us. If you didn’t have the chance to participate, or simply want to read our detailed summary, these are Proxyway’s impressions from the event. The presentations are available on demand, so you can always watch the ones that caught your eye. 

General Information about 2024’s OxyCon

Like last year (and, as far as I remember, most years before), OxyCon took place online. However, there was one big change: all presentations were delivered in real time. There was also a live audience in the background, I suppose primarily the company’s employees, who would sometimes react or cheer. 

Some of the presenters were obviously dealing with nerves, and hiccups did occur. But this setup made the conference more human and less like Apple’s eerily robotic keynotes. 

Otherwise, the logistics changed little compared to previous iterations: you registered (for free), received an invitation email, and logged in to Oxylabs’ platform with an embedded video player and a Slido widget for questions. Those who wanted to discuss more, or more in-depth, could visit the provider’s Discord server.

Introduction aside, 2024’s OxyCon featured six talks and a panel discussion to conclude it. All in all, the event proceeded smoothly and according to schedule. 

The Talks

These are 2024’s presentations and panel discussions. You can jump to the talks you’re interested in using the quick links below.

  1. Introduction: Web Scraping Trends
  2. Ensuring Scalability in Data Collection: Key Components, Challenges, and Advancements
  3. Human-Centered Approach to Streamlined Data Gathering
  4. Imitating Real User Behavior With Mouse Movements
  5. Harnessing Gen AI for Data-Driven Answers
  6. AI-Powered Public Web Data Collection at Scale
  7. Legal Compliance in the Age of AI
  8. Panel Discussion: Advanced Unblocking Strategies

Introduction: Web Scraping Trends

A fast one. Gabriele Montvile, CCO at Oxylabs, outlined three major trends impacting web data collection. They’re well known to industry insiders, so there’s no harm in spoiling them for you: AI, ethics, and advancing anti-bots. The interesting part was the supporting material, which included survey data, AI use cases and challenges. Ten minutes well spent.

The three major trends in today’s web data collection.

Talk 1: Ensuring Scalability in Data Collection: Key Components, Challenges, and Advancements

Zydrunas Tamasauskas, another C-level face at Oxylabs, spoke about web scraping pipelines, implementation strategies of proxy servers, headless scraping, and beyond. The title doesn’t make it clear, but this presentation is primarily about proxies. You’ll learn how to choose the appropriate type and implement several load balancing approaches. Some takeaways: desktop residential IPs are the best, and managing sessions between proxies and headless browsers is its own circle of hell. 
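The talk's exact load balancing approaches aren't reproduced here, but the simplest one, round-robin, fits in a few lines of Python. The proxy addresses below are placeholders and the function name is our own.

```python
import itertools
from typing import Iterator, List


def round_robin(proxies: List[str]) -> Iterator[str]:
    """Cycle through the proxy list forever, one proxy per request."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    return itertools.cycle(proxies)


# Placeholder endpoints; substitute the addresses from your own provider.
pool = round_robin([
    "http://proxy-a.example:8000",
    "http://proxy-b.example:8000",
])

# Each outgoing request takes the next proxy in turn.
picks = [next(pool) for _ in range(3)]
```

Round-robin spreads requests evenly but ignores proxy health and session state, which is where the session management pain mentioned above comes in.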

All in all, a useful talk. We were mentioned as well, so of course you have to watch it now!

No particular reason why we chose this slide. Truly.

Talk 2: Human-Centered Approach to Streamlined Data Gathering

Vilius Visockas from CityNow, a Lithuanian real estate intelligence website, disclosed (let’s sensationalize this a little) how he’s able to scrape nearly a thousand local sources with a small team of 3-4 people. In a wonderful synergy of capitalism and engineering, Vilius chose the only reasonable approach: he built a pipeline management platform, implemented some fail-safes, and hired code school grads to mine experience and earn some cash.

Vilius talks about the challenges of keeping the system abuzz. Among other things, this involves maintaining and optimizing schemas, as well as ensuring satisfactory results from contributors with different backgrounds and generally little programming experience. But to me, the real beauty lies in the idea itself and the self-interested social value it provides.

It’s free affordable real estate.

Talk 3: Imitating Real User Behavior With Mouse Movements

A practical boots-on-the-ground presentation that, according to feedback, made at least several viewers’ days. Tadas Gedgaudas from Oxylabs shared his know-how on dealing with mouse-based detection methods. 

The presenter dedicated the first part to establishing whether websites actually track mouse movements. (Examples from the wild and his own personal weeks-long goose chase to unblock a website prove that they do.) He then showed how to verify this with the browser’s dev tools and went through the pros and cons of three major mouse algorithms: Bezier, Gaussian, and Perlin. Finally, Tadas introduced an open source library made by Oxylabs that can implement any algorithm with a few lines of code.  
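For readers who haven't met these algorithms, here is a minimal sketch of the Bezier variant: a cubic curve between two screen positions with randomly placed control points. It is our own illustration using only the standard library, not the API of the Oxylabs library from the talk; the function name, the `spread` parameter, and the defaults are assumptions.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def bezier_path(
    start: Point,
    end: Point,
    steps: int = 50,
    spread: float = 100.0,
    rng: Optional[random.Random] = None,
) -> List[Point]:
    """Sample a cubic Bezier curve from start to end.

    The two control points are offset randomly from the endpoints,
    so every generated path bends a little differently.
    """
    rng = rng or random.Random()
    c1 = (start[0] + rng.uniform(-spread, spread), start[1] + rng.uniform(-spread, spread))
    c2 = (end[0] + rng.uniform(-spread, spread), end[1] + rng.uniform(-spread, spread))

    path: List[Point] = []
    for i in range(steps + 1):
        t = i / steps
        u = 1 - t
        # Standard cubic Bezier formula, evaluated per coordinate.
        x = u**3 * start[0] + 3 * u**2 * t * c1[0] + 3 * u * t**2 * c2[0] + t**3 * end[0]
        y = u**3 * start[1] + 3 * u**2 * t * c1[1] + 3 * u * t**2 * c2[1] + t**3 * end[1]
        path.append((x, y))
    return path


# Seeded generator so the example is reproducible.
path = bezier_path((0, 0), (400, 300), steps=20, rng=random.Random(42))
```

The resulting points would then be fed to a browser automation tool as successive mouse positions. Evenly spaced `t` values give a constant pace along the curve parameter, so a more convincing imitation would also vary the timing between points.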

My biggest gripe is that due to time constraints we were all left hanging: why use anything other than Perlin? But that was probably answered on Discord…

This Python library is actually open source.

Talk 4: Harnessing Gen AI for Data-Driven Answers

Brace yourselves – we’re entering the AI zone. When Paul Felby (Adthena) started by demonstrating a chatbot, my first thought was, “Oh no!..”. But it turned out there was more to the talk than met the eye: in particular, how to ensure accurate answers and not make LLMs implode when working with a database that ingests hundreds of millions of SERPs per day.

Paul had multiple tricks up his sleeve. Some involved getting the LLM to generate proper queries, either directly in SQL or by adding a semantic layer. Others dealt with creating a team of agents, each performing their own task – even QA. There were layers and layers of AI, which all somehow worked together. The result: a chatbot, but not quite what we’ve grown to expect. Everyone’s working on AI now, so I’m sure you’ll find something to take away from this.

the fourth talk of oxycon
The elaborate multi-agent backend behind Adthena’s chatbot.

Talk 5: AI-Powered Public Web Data Collection at Scale

The commercial break of the day. Aleksandras Sulzenko from Oxylabs laid out the web data acquisition pipeline, then proceeded to talk about the challenges of each step and how Oxylabs’ tools can make it hurt less. That would be pretty much it, but Aleksandras also made a product announcement: the web scraping API was getting AI functionality called Copilot (how original). 

Alright, so it’s something to work with. And the implementation really did prove fascinating: the feature generates API queries from natural language instructions. The real utility here is that Copilot can also create custom parsers with a modifiable schema and visual interface for fine-tuning. Many competitors use AI to directly interact with the page; by comparison, Oxylabs’ approach is highly practical, albeit somewhat more manual and less resilient to changes.

In brief, watch the talk if you’re in the market for scrapers – or you’re trying to create a competing service of your own.

the fifth talk of oxycon
Oxylabs’ AI parser in the flesh (or rather dashboard).

Talk 6: Legal Compliance in the Age of AI

Nerijus Sveistys, senior legal counsel at Oxylabs, went through the risks, regulations, and relevant lawsuits pertaining to AI. It was more of an overview than directly applicable guidelines. Sorry, AI startup founder – you’ll still have to hire a lawyer. 

Not keeping track of the legal environment too closely, I learned that the EU already has a regulatory framework, China enacts laws for particular issues, and the US lacks a uniform approach for now. I also saw how many lawsuits are taking place, mostly over copyright issues. My favorite example was the bathroom-invading Roomba surveillance system. A solid talk overall.

the sixth talk of oxycon
Beware of bathroom-invading roombas.

Panel Discussion. Advanced Unblocking Strategies

The discussion included Hocine Amrane from Dataimpact, Paulius Gerve from Oxylabs, Jonny Smyth from Ceartas, Brecht Stamper from Lighthouse Intelligence, and Carl Erkof from Wiser Solutions. The host was Juras Jursenas, COO at Oxylabs. Quite a crowd.

It took over 40 minutes, so I’m not sure I’ll be able to recount everything. I suggest you just go and watch the discussion – it’s worth it. Some of my notes to give you a taste:

  • One of the biggest concerns of the participants was the commercialization of anti-bot software. Specialist tools are more robust, and these companies have marketing people to help them proliferate.
  • We’re finally starting to see detection methods like Canvas fingerprinting put to use. There are more techniques waiting to be exploited, such as local storage.
  • Anti-bot research, and much of unblocking, is still done manually, and success relies on human error (which is surprisingly frequent). 
  • To be successful in this game, you need patience and willingness to butt your head against the wall until it gives.


Fascinating stuff.

the panelists of the discussion
The panelists on stage.

Conclusion

That was it for this year’s OxyCon. Did you find anything interesting? The videos are available on demand. And now, we’ll be waiting for another major industry event which is right around the corner – Zyte’s Extract Summit. 

The post OxyCon 2024: A Recap appeared first on Proxyway.

Oxylabs Unifies Scraper Line-Up, Introduces AI Copilot
https://proxyway.com/news/oxylabs-unifies-scraper-line-up-introduces-ai-copilot
Fri, 27 Sep 2024 09:42:53 +0000

Inspired by a certain dev platform, the feature can structure any website using natural language instructions.

Adam Dubois

Oxylabs, the Lithuanian provider of web scraping infrastructure and services, has revised its line-up of web scraping APIs, fitting them with more AI functionality and reducing the price. 

Formerly segmented by website category into SERP, e-commerce, and general-purpose APIs, the scrapers now fall under one Web Scraper API. This is great news for customers with diverse web scraping needs, as they’ll no longer need to buy separate subscriptions. 

But the biggest highlight is the newly added OxyCopilot feature on Oxylabs’ dashboard. First announced during this year’s OxyCon, it takes in natural language instructions, sends them to a large language model, and puts out API request code, which you can then paste into a Python, Node.js, or other script. 

OxyCopilot not only simplifies onboarding, but it also includes a powerful ability to generate parser code for any target. For example, it’s possible to feed it a product URL from an e-commerce website and specify the data points to retrieve, such as: “Parse the item’s name, rating, and price.” 

The playground scrapes the page, creates a parser schema, and shows both the structured output and underlying selectors it generates. You can play around with the schema and generate new parsing code or start scraping if you’re happy with the result.

The AI parser sets out to address a large need: according to a recent survey conducted by the provider, 50% of developers identify parsing as one of the biggest web scraping challenges, and they spend between 10 and 40 hours every week on parsing processes.

By choosing to generate selectors rather than feed every page to AI, Oxylabs avoids the current cost pitfalls of large language models. It also reduces reliance on a middleman by only invoking AI in the preparatory stage. On the downside, once the parser code breaks, it won’t self-adjust without manual intervention. 
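The trade-off can be illustrated with a small sketch. It assumes a made-up, well-formed product page and a hypothetical schema that maps field names to XPath selectors; this is not OxyCopilot’s actual schema format. The point is that selectors are produced once and then applied to each page without calling a model again.

```python
import xml.etree.ElementTree as ET

# Hypothetical product page, kept well-formed so the standard library can parse it.
PAGE = """
<html><body>
  <h1 class="title">Mechanical Keyboard</h1>
  <span class="rating">4.6</span>
  <span class="price">$89.99</span>
</body></html>
"""

# Hypothetical parser schema: field name -> selector.
# In the approach described above, an LLM would generate this once, up front.
SCHEMA = {
    "name": ".//h1[@class='title']",
    "rating": ".//span[@class='rating']",
    "price": ".//span[@class='price']",
}


def parse(page, schema):
    """Apply every selector in the schema to the page and collect the text."""
    root = ET.fromstring(page)
    result = {}
    for field, selector in schema.items():
        node = root.find(selector)
        # A missing node means the selector no longer matches,
        # for example after a site redesign.
        result[field] = node.text.strip() if node is not None and node.text else None
    return result


print(parse(PAGE, SCHEMA))

# Simulate a layout change: the price selector silently stops matching.
print(parse(PAGE.replace('class="price"', 'class="cost"'), SCHEMA))
```

The second call returns `None` for the price, which is the downside noted above: nothing adjusts the selector until someone regenerates or edits the schema.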

OxyCopilot comes at no additional cost. To further sweeten the deal, Oxylabs has significantly decreased the rates of all public plans:

Plan | New CPM* | Old CPM | Difference
Micro ($49) | $2 | $2.80 | -30%
Starter ($99) | $1.8 | $2.60 | -30%
Advanced ($249) | $1.65 | $2.40 | -30%
Venture ($499) | $1.5 | $2.20 | -30%
Business ($999) | $1.35 | $1.90 | -30%
Corporate ($2,000) | $1.2 | $1.60 | -25%

* Rate for 1,000 requests
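To see what the new rates mean for a given workload, the figures can be plugged into a short calculation. This is a sketch using only the CPM values listed above; the helper names `cost` and `change_percent` are made up for illustration. The exact percentages it prints differ slightly from the Difference column, which appears to be rounded.

```python
# Plan -> (new CPM, old CPM) in USD per 1,000 requests, copied from the table above.
RATES = {
    "Micro": (2.00, 2.80),
    "Starter": (1.80, 2.60),
    "Advanced": (1.65, 2.40),
    "Venture": (1.50, 2.20),
    "Business": (1.35, 1.90),
    "Corporate": (1.20, 1.60),
}


def cost(requests, cpm):
    """Cost in USD of a number of requests at a given rate per 1,000 requests."""
    return requests / 1000 * cpm


def change_percent(new, old):
    """Relative change from the old rate to the new one, in percent."""
    return (new - old) / old * 100


for plan, (new, old) in RATES.items():
    print(
        f"{plan:<10} {change_percent(new, old):6.1f}%  "
        f"100k requests: ${cost(100_000, old):.2f} -> ${cost(100_000, new):.2f}"
    )
```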

The feature is still in beta. Despite some quirks and inefficiencies that inevitably pop up at this stage, it shows Oxylabs’ commitment to innovation in the web data extraction space. You can read more about the research process and story behind OxyCopilot here.

Even without this new AI functionality, the whole package is now more appealing than it’s ever been.

The post Oxylabs Unifies Scraper Line-Up, Introduces AI Copilot appeared first on Proxyway.

Oxylabs Reveals 2024 OxyCon’s Agenda
https://proxyway.com/news/oxylabs-reveals-2024-oxycons-agenda
Wed, 07 Aug 2024 08:06:01 +0000

The annual web scraping conference will cover generative AI, large-scale data collection, and more.

Adam Dubois

Oxylabs, the Lithuanian provider of web scraping infrastructure and services, has announced the agenda of its annual conference on web data collection. 

The event is set to take place on September 25, 11 AM UTC, online. Attendance is free of charge. 

This year’s line-up includes six presentations and one panel discussion. They revolve around three themes:

  1. Fueling businesses with public web data, where companies will explain how they scrape long-tail datasets at scale, successfully delegate tasks to novice coders, and use large language models for business needs while navigating their shortcomings.  
  2. Mastering AI & advanced web scraping techniques, which will discuss unblocking strategies and present methods for imitating realistic mouse movements. 
  3. Optimizing web data extraction operations, where attendees will learn how to build a scalable web data collection infrastructure, ensure legal compliance in the age of AI, and see Oxylabs’ AI-powered web scraping platform in action.  


Aside from in-house speakers, the list includes representatives from companies like Adthena (AI search intelligence), City Now (real estate data), Data Impact (e-commerce analytics), Ceartas (brand protection), and Lighthouse Intelligence (travel & hospitality insights). 

You can register for OxyCon on its designated page. If you’d like to get a taste of what the conference is about, we covered it last year and the year before.

OxyCon is one of the three major conferences on web scraping, the other two being Bright Data’s ScrapeCon (which happened this April), and Zyte’s Extract Summit (which will take place in Texas on October 9-10).

The post Oxylabs Reveals 2024 OxyCon’s Agenda appeared first on Proxyway.

Oxylabs Announces 2024’s OxyCon
https://proxyway.com/news/oxylabs-announces-2024-oxycon
Wed, 19 Jun 2024 12:33:45 +0000

The fifth annual web scraping conference will take place online on September 25.

Adam Dubois

Oxylabs, the Lithuanian provider of data extraction tools and services, has announced 2024’s OxyCon, an annual conference on web scraping. 

This year, OxyCon is set to take place on September 25. The event will be held online, accessible after registering on the designated webpage. Participation is free of charge. 

The conference will focus on three areas:

  1. Fueling business with public web data
  2. Mastering AI & advanced web scraping techniques
  3. Optimizing web scraping operations


The list of speakers is yet to be announced.

Last year’s OxyCon featured 15 speakers and over 2,200 registered attendees. We covered it in our blog post here.

Registration for 2024’s event is already open. All participants will get live and on-demand access to the talks. There’s also a Discord community for discussing OxyCon-related topics, as well as broader ones surrounding public web data collection. 

OxyCon is one of the three major conferences dedicated to web scraping. The second one, Zyte’s Web Data Extraction Summit, will take place in October in Austin, while Bright Data’s ScrapeCon already happened this April (you can find our recap here).

The post Oxylabs Announces 2024’s OxyCon appeared first on Proxyway.

Oxylabs Launches ISP Proxies
https://proxyway.com/news/oxylabs-launches-isp-proxies
Tue, 18 Jun 2024 09:20:17 +0000

The service offers IPs registered under American and French internet service providers.

Adam Dubois
Oxylabs, the Lithuanian provider of web scraping tools and services, has launched a new ISP proxy product. It offers a list of semi-dedicated (shared with up to 3 users) IPs associated with major consumer internet service providers.

For now, Oxylabs sells ISP proxies in the US and France. It advertises five premium ASNs: AT&T, Comcast, Lumen, Frontier, and Orange. The provider plans to add more in the near future.

Oxylabs’ ISP proxies are accessible via a gateway server. Customers can create sessions that don’t expire, and there’s also the ability to rotate IPs as needed, making proxy management easier. The service supports HTTP, HTTPS, and SOCKS5 protocols. 

Technically, the product comes with unlimited bandwidth, but there’s a fair usage policy that limits the number of concurrent connections to 100 per IP.

Oxylabs uses a pay-per-IP pricing model. The cheapest plan starts at $21 for 10 IPs ($2.10/IP). It’s possible to get up to 1,000 IPs through self-service with the unit price at $1.50 per IP address. Compared to other ISP proxy services, Oxylabs undercuts some competitors, such as Rayobyte, across various pricing tiers.

Plan | 10 IPs | 20 IPs | 50 IPs | 100 IPs | 200 IPs | 500 IPs | 1000 IPs
Price | $21 | $40 | $95 | $180 | $340 | $800 | $1500
Price/IP | $2.10 | $2.00 | $1.90 | $1.80 | $1.70 | $1.60 | $1.50
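
The per-IP figures follow directly from the plan prices. A minimal sketch, using only the numbers in the pricing table above (the names `PLANS` and `price_per_ip` are hypothetical), shows the unit price for each tier and the discount relative to the smallest plan:

```python
# Number of IPs -> plan price in USD, copied from the pricing table above.
PLANS = {10: 21, 20: 40, 50: 95, 100: 180, 200: 340, 500: 800, 1000: 1500}


def price_per_ip(ips):
    """Unit price in USD for a plan with the given number of IPs."""
    return PLANS[ips] / ips


base = price_per_ip(10)
for ips in PLANS:
    unit = price_per_ip(ips)
    discount = (base - unit) / base * 100
    print(f"{ips:>5} IPs: ${unit:.2f}/IP ({discount:.0f}% below the 10-IP rate)")
```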

We had the chance to test Oxylabs’ ISP proxies beforehand. The provider stays true to its claims – the proxies are performant and very fast. The infrastructure let us down very few times, so its customers can expect an almost perfect success rate and a response time in milliseconds.

Gateway | Avg. success rate | Avg. response time
US | 99.97% | 0.06 s

The ISP proxies were able to access challenging targets smoothly as well. We tested the product on Google and Amazon.

Website | Avg. success rate | Avg. response time
Google | 100% | 1.1 s
Amazon | 96.36% | 2.9 s

Oxylabs’ semi-dedicated ISP proxies are already available for purchase.

The post Oxylabs Launches ISP Proxies appeared first on Proxyway.
