How to Use Proxies with Python Requests
This is a step-by-step guide on how to set up and rotate a proxy in Python using Requests.
The Requests library is one of the most popular ways to send HTTP requests in Python. It's easy to use and, compared to other Python alternatives, often requires less code to extract data.
Web scraping enthusiasts know that you won't get far without quality proxies. Websites these days use advanced anti-bot measures to protect themselves from automation, so building and maintaining your own scraper includes setting up a proxy server to avoid IP address bans and other web scraping obstacles.
In this article, you’ll learn how to set up and rotate a proxy using Python Requests.
How to Use a Proxy Server with Python Requests
Before you start, you’ll need the following prerequisites:
- Python 3. You’ll need the latest Python version installed.
- Requests. You can add it by running pip install requests.
- Code editor. Use any editor of your choice.
How to Set Up Proxies with Requests: Basic Configuration
Step 1. To set up a proxy with Python Requests, start by importing the library.
import requests
Step 2. Then, add a proxies argument with your proxy information.
HTTP proxy:
proxies = {
'http': 'http://host:PORT',
'https': 'http://host:PORT',
}
SOCKS5 proxy (requires the optional SOCKS dependencies, installed with pip install "requests[socks]"):
proxies = {
'http': 'socks5://host:PORT',
'https': 'socks5://host:PORT',
}
Step 3. Now, let’s create a response variable and pass the proxies parameter.
response = requests.get('URL', proxies=proxies)
Note: You can use any of the request methods, such as get(), post(), or put().
How to Authenticate a Proxy
To authenticate your proxy, pass the username and password along with the proxy configuration.
proxies = {
'http': 'http://user:password@host:PORT',
'https': 'http://user:password@host:PORT',
}
response = requests.get('URL', proxies=proxies)
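Credentials that contain reserved characters such as @ or : can break the proxy URL unless they are percent-encoded. The sketch below shows one way to build the URL with the standard library. Note that build_proxy_url is a hypothetical helper name, not part of Requests, and the host, port, and credentials are placeholders.

```python
from urllib.parse import quote


def build_proxy_url(user, password, host, port, scheme='http'):
    """Return a proxy URL with percent-encoded credentials.

    Hypothetical helper for illustration; not part of Requests.
    """
    # safe='' forces characters like '@', ':' and '/' to be encoded
    user_enc = quote(user, safe='')
    password_enc = quote(password, safe='')
    return f'{scheme}://{user_enc}:{password_enc}@{host}:{port}'


# Placeholder credentials; the password contains '@' and ':'
proxy_url = build_proxy_url('user', 'p@ss:word', 'host', 8080)
proxies = {'http': proxy_url, 'https': proxy_url}
print(proxies['http'])
```

The resulting dictionary can be passed to requests.get() through the proxies argument in the same way as the hand-written one above.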
How to Set Up Proxy Sessions
If you want to make multiple requests with the same proxy configuration, create a session and attach your proxies to it. Every request sent through the session object then reuses that configuration.
session = requests.Session()
session.proxies = proxies
response = session.get('URL')
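As a minimal sketch of the same idea, the session setup can be wrapped in a small function so that every part of a scraper gets an identically configured session. The name make_proxy_session and the proxy address are assumptions made for illustration only.

```python
import requests


def make_proxy_session(proxy_url):
    """Create a session that routes HTTP and HTTPS traffic through one proxy.

    Hypothetical helper for illustration.
    """
    session = requests.Session()
    session.proxies.update({'http': proxy_url, 'https': proxy_url})
    return session


# Placeholder proxy address; replace it with a real one before sending requests
with make_proxy_session('http://host:8080') as session:
    # session.get('URL') would use the proxy settings stored on the session
    print(session.proxies)
```

Using the session as a context manager closes its pooled connections once the block ends.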
How to Set Up Environment Variables
If you want to store your proxy configuration for future use, you’ll need to set environment variables. This way you can easily switch between different proxy settings without modifying your code.
Step 1. Depending on your operating system, you can set/export environment variables to the proxy address and port.
For Windows users (Command Prompt):
set http_proxy=http://username:password@host:PORT
set https_proxy=http://username:password@host:PORT
For Linux and macOS users:
export http_proxy=http://username:password@host:PORT
export https_proxy=http://username:password@host:PORT
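Step 2. Then, import the os library and set the proxies dictionary to use the environment variables. Requests also reads http_proxy and https_proxy from the environment on its own, so the explicit dictionary is only needed when you want to control the values in code.
import os
import requests
proxies = {
    'http': os.environ['http_proxy'],
    'https': os.environ['https_proxy'],
}
response = requests.get('URL', proxies=proxies)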
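Looking up os.environ['http_proxy'] directly raises a KeyError when the variable is not set. One possible way to handle that is sketched below: a hypothetical proxies_from_env helper (not part of Requests) that includes only the variables that exist and also accepts the uppercase spellings.

```python
import os


def proxies_from_env(environ=None):
    """Build a proxies dict from whichever proxy variables are set.

    Hypothetical helper for illustration.
    """
    if environ is None:
        environ = os.environ
    proxies = {}
    for scheme in ('http', 'https'):
        # Check the lowercase name first, then the uppercase variant
        value = environ.get(f'{scheme}_proxy') or environ.get(f'{scheme.upper()}_PROXY')
        if value:
            proxies[scheme] = value
    return proxies


# A plain dict stands in for the real environment here
print(proxies_from_env({'http_proxy': 'http://host:8080'}))
```

Calling proxies_from_env() without arguments reads the real environment, and an empty result simply means no proxy is configured.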
How to Rotate Proxies with Python Requests
If you don’t want to be blacklisted or rate limited by websites, you’ll need to rotate your proxies. Otherwise, you’ll make too many connection requests from one IP, and your target will notice that.
To rotate your proxies, you'll first need a pool of IP addresses. You can get free lists, but we highly recommend using paid proxy services. Free IPs aren't reliable, may inject ads, and can expose your data. Paid providers, on the other hand, maintain their infrastructure, so you'll be less likely to get blocked.
Step 1. First, import the following libraries:
import requests
import random
Step 2. Then, define a list of proxies you want to use.
proxy_pool = ['http://user:password@host:3001', 'http://user:password@host:3002', 'http://user:password@host:3003']
Step 3. Now, let's send 10 requests in a loop.
for i in range(10):
1) Select a random proxy from your pool and use it for both HTTP and HTTPS traffic.
    proxy_url = random.choice(proxy_pool)
    proxy = {'http': proxy_url, 'https': proxy_url}
2) Send the request using the selected proxy.
    response = requests.get('URL', proxies=proxy)
3) Print the response.
    print(response.text)
Here’s the full script:
import requests
import random
# Define your proxies
proxy_pool = [
    'http://user:password@host:3001',
    'http://user:password@host:3002',
    'http://user:password@host:3003',
]
# Send 10 requests
for i in range(10):
    # Select a random proxy from the pool
    proxy_url = random.choice(proxy_pool)
    # Route both HTTP and HTTPS traffic through it
    proxy = {'http': proxy_url, 'https': proxy_url}
    # Send the request using the selected proxy
    response = requests.get('URL', proxies=proxy)
    # Print the response
    print(response.text)
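Random selection can pick the same proxy several times in a row. As an alternative to the script above, not a replacement for it, the sketch below rotates through the pool in round-robin order with itertools.cycle. The rotating_proxies name is hypothetical, and the proxy addresses are the same placeholders used throughout this guide.

```python
from itertools import cycle


def rotating_proxies(proxy_pool):
    """Yield a proxies dict for each pool entry in turn, forever.

    Hypothetical helper for illustration.
    """
    for proxy_url in cycle(proxy_pool):
        yield {'http': proxy_url, 'https': proxy_url}


# Placeholder proxies; replace them with real ones
proxy_pool = [
    'http://user:password@host:3001',
    'http://user:password@host:3002',
    'http://user:password@host:3003',
]
rotation = rotating_proxies(proxy_pool)

for _ in range(4):
    proxy = next(rotation)
    # In a real script: requests.get('URL', proxies=proxy, timeout=10)
    print(proxy['http'])
```

Round-robin order spreads requests evenly across the pool, while random.choice is less predictable. Which one suits you depends on how your target detects automated traffic.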