How to get the ‘src’ attribute from an ‘img’ tag using BeautifulSoup

A step-by-step guide on how to find an image’s source URL using BeautifulSoup.

The majority of data were collected in February and March of 2023.

Step 1. Let’s start by importing the BeautifulSoup library.

				
					from bs4 import BeautifulSoup

Step 2. Then, import the requests library.

				
					import requests

Step 3. Get the source code of your target page.

				
					r = requests.get("https://books.toscrape.com/")

Step 4. Convert the HTML code into a BeautifulSoup object named soup.

				
					soup = BeautifulSoup(r.content, "html.parser")

Step 5. Inspect the page to find the image object you would like to extract.

Once you know how the images are marked up, you can select them like this:

				
					thumbnail_elements = soup.find_all("img", class_ = "thumbnail")

NOTE: For this website, you can find images by looking for img tags that have a thumbnail class.
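
To see how the class filter behaves without touching the live site, here is a minimal sketch that runs on a small, made-up HTML snippet. The markup, file names, and variable names are assumptions for illustration only. It also shows the CSS selector form, select("img.thumbnail"), which should match the same tags.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on a product listing (not the real page)
sample_html = """
<div>
  <img class="thumbnail" src="media/one.jpg" alt="Book one">
  <img class="logo" src="static/logo.png" alt="Site logo">
  <img class="thumbnail" src="media/two.jpg" alt="Book two">
</div>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")

# Filter img tags by class name; class_ is used because "class" is a Python keyword
by_class = sample_soup.find_all("img", class_="thumbnail")

# The same filter written as a CSS selector
by_selector = sample_soup.select("img.thumbnail")

thumbnail_srcs = [img["src"] for img in by_class]
selector_srcs = [img["src"] for img in by_selector]
print(thumbnail_srcs)
print(selector_srcs)
```

The img tag with the logo class is skipped by both approaches, so only the two thumbnail sources are collected.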

Step 6. Let’s check that the code works by printing the matched elements.

				
					print(thumbnail_elements)

Step 7. Now you need to get the src attribute from each element.

				
for element in thumbnail_elements:
    print(element['src'])
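
One caveat: element['src'] raises a KeyError if a tag has no src attribute. If that could happen on your target page, a possible variation is to use the tag’s get() method, which returns None for a missing attribute. The sketch below uses made-up markup in which the second image has no src; the names sample_html and found_srcs are illustrative only.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second image deliberately has no src attribute
sample_html = '<img class="thumbnail" src="media/one.jpg"><img class="thumbnail">'
sample_soup = BeautifulSoup(sample_html, "html.parser")

found_srcs = []
for element in sample_soup.find_all("img", class_="thumbnail"):
    src = element.get("src")  # None when the attribute is missing, no KeyError
    if src is not None:
        found_srcs.append(src)

print(found_srcs)
```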

Result: the Step 7 loop prints the src value of each thumbnail, one per line.

Congratulations, you’ve found and extracted the image sources using BeautifulSoup. Here’s the full script:

				
from bs4 import BeautifulSoup
import requests

# Download the page and parse its HTML
r = requests.get("https://books.toscrape.com/")
soup = BeautifulSoup(r.content, "html.parser")

# Find every img tag that has the "thumbnail" class
thumbnail_elements = soup.find_all("img", class_="thumbnail")
print(thumbnail_elements)

# Print the src attribute of each image
for element in thumbnail_elements:
    print(element['src'])

# Optional: print the full image URL instead
#for element in thumbnail_elements:
#    print("https://books.toscrape.com/" + element['src'])

The src values are relative paths. If you rebuild the full URL by prepending the site address, as the commented-out loop does, you can access the image.
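
Plain string concatenation works when the page sits at the site root, but it can produce a wrong address when the src starts with ../ or the page lives in a subfolder. As a hedged alternative, the sketch below uses urljoin from Python’s standard library to resolve each src against the URL of the page it came from. The helper name absolute_image_urls, the image paths, and the subfolder page URL are assumptions made up for illustration.

```python
from urllib.parse import urljoin


def absolute_image_urls(page_url, src_values):
    """Resolve each src value against the URL of the page it was found on."""
    return [urljoin(page_url, src) for src in src_values]


# Hypothetical src value found on the home page
home_urls = absolute_image_urls(
    "https://books.toscrape.com/",
    ["media/cache/aa/bb/cover-1.jpg"],
)

# Hypothetical src value found on a page inside a subfolder
subfolder_urls = absolute_image_urls(
    "https://books.toscrape.com/catalogue/page-2.html",
    ["../media/cache/cc/dd/cover-2.jpg"],
)

print(home_urls)
print(subfolder_urls)
```

In the script above, you could pass r.url as page_url and the list of element['src'] values as src_values.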
