Python has become the go-to language for web scraping due to its powerful libraries, clean syntax, and strong community support.
In this section, you’ll learn how to build a simple yet effective web scraper in Python that extracts data from a real website. No prior scraping experience is required!
We’ll use:
- requests to fetch HTML content
- BeautifulSoup to parse and extract data
- pandas to organize and export the results
Let’s get started!
🔧 Step 1: Set Up Your Environment
Before writing any code, make sure your system has the necessary tools installed.
✅ Requirements:
- Python 3.x (download from python.org)
- A code editor (like VS Code or Sublime Text)
- Terminal or Command Prompt
📦 Install Required Libraries:
Open your terminal and run:
pip install requests beautifulsoup4 pandas
📌 Tip: You can create a virtual environment first to keep dependencies isolated:
python -m venv env
source env/bin/activate # On Windows: env\Scripts\activate
pip install requests beautifulsoup4 pandas
🌐 Step 2: Send an HTTP Request
Use the requests library to send an HTTP GET request to the target URL and retrieve the page’s HTML content.
import requests
url = 'https://books.toscrape.com/'
response = requests.get(url)
# Check if the request was successful
if response.status_code == 200:
    html_content = response.text
else:
    print(f"Failed to retrieve page. Status code: {response.status_code}")
📌 Tip: Always check the status code before using the response. A status of 200 means the request succeeded!
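The status check above can also be wrapped in a small helper. The following is a minimal sketch, not the only way to do it: the names make_session and fetch_html, the USER_AGENT string, and the 10-second timeout are all assumptions chosen for illustration. It uses raise_for_status() so that a failed request raises an exception instead of leaving html_content undefined.

```python
import requests

# Hypothetical identifier; replace with a name and contact for your own scraper.
USER_AGENT = "my-first-scraper/0.1 (contact: you@example.com)"


def make_session(user_agent=USER_AGENT):
    """Return a requests.Session that sends a custom User-Agent header."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def fetch_html(session, url, timeout=10):
    """Fetch a page and return its HTML, raising on network or HTTP errors."""
    # The timeout (an assumed value) stops the script from hanging forever.
    response = session.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses.
    response.raise_for_status()
    return response.text
```

With these helpers, fetching the page would look like html_content = fetch_html(make_session(), 'https://books.toscrape.com/'), wrapped in a try/except for requests.RequestException if you want to handle failures yourself.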
🧾 Step 3: Parse the HTML Content
Now that we have the HTML, let’s parse it using BeautifulSoup.
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_content, 'html.parser')
print(soup.prettify()) # Optional: view nicely formatted HTML
With BeautifulSoup, we can now search for specific elements like <h1>, <div class="price">, or <a href="#">.
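To make those searches concrete, here is a small sketch that runs against a made-up HTML snippet rather than a live page. The snippet and the variable names are assumptions for illustration; find() and find_all() are the standard BeautifulSoup search methods.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet, written here only to illustrate the search methods.
sample_html = """
<html><body>
  <h1>All products</h1>
  <div class="price">£10.00</div>
  <div class="price">£12.50</div>
  <a href="#">Back to top</a>
</body></html>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# find() returns the first matching tag (or None if nothing matches).
heading = soup.find("h1").get_text(strip=True)

# find_all() returns a list of every matching tag.
prices = [div.get_text(strip=True) for div in soup.find_all("div", class_="price")]

# Attributes are read with dictionary-style access.
link_target = soup.find("a")["href"]

print(heading, prices, link_target)
```

Note that class_ has a trailing underscore because class is a reserved word in Python.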
🧹 Step 4: Extract Data from the Page
Let’s extract all book titles and prices from the page.
books = soup.find_all('article', class_='product_pod')
for book in books:
    title = book.h3.a['title']
    price = book.find('p', class_='price_color').text
    print(f"{title} – {price}")
📌 This loop finds each book item (<article class="product_pod">) and extracts its title and price.
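The loop above assumes every item contains both a title link and a price. On less tidy pages a missing element makes find() return None, and calling .text on it raises an AttributeError. Below is a hedged sketch of a more defensive version; the helper name extract_book and the sample markup are assumptions, modelled on the product_pod structure.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the product_pod structure;
# the second item deliberately lacks a price.
sample_html = """
<article class="product_pod">
  <h3><a href="a.html" title="Book A">Book A</a></h3>
  <p class="price_color">£51.77</p>
</article>
<article class="product_pod">
  <h3><a href="b.html" title="Book B">Book B</a></h3>
</article>
"""


def extract_book(article):
    """Return a dict for one product_pod, using None for anything missing."""
    link = article.find("a", title=True)  # first <a> that has a title attribute
    price_tag = article.find("p", class_="price_color")
    return {
        "Title": link["title"] if link is not None else None,
        "Price": price_tag.get_text(strip=True) if price_tag is not None else None,
    }


soup = BeautifulSoup(sample_html, "html.parser")
rows = [extract_book(a) for a in soup.find_all("article", class_="product_pod")]
print(rows)
```

Rows with a None value can then be skipped or logged, depending on how strict you want the scraper to be.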
📊 Step 5: Store the Data
To save the scraped data, we’ll use pandas to create a DataFrame and export it as a CSV file.
import pandas as pd
data = []
for book in books:
    title = book.h3.a['title']
    price = book.find('p', class_='price_color').text
    data.append({'Title': title, 'Price': price})
df = pd.DataFrame(data)
df.to_csv('books.csv', index=False)
print("Data saved to books.csv")
You’ll now find a books.csv file in your working directory containing all the scraped data.
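The prices are stored as text such as "£51.77", which makes sorting or averaging awkward. As an optional extra, the sketch below converts the Price column to numbers and reads the CSV back to confirm what was written. The two sample rows are made up, and an in-memory buffer stands in for books.csv so the example does not touch the disk.

```python
import io

import pandas as pd

# Hypothetical rows in the same shape as the `data` list built in Step 5.
data = [
    {"Title": "Book A", "Price": "£51.77"},
    {"Title": "Book B", "Price": "£13.99"},
]

df = pd.DataFrame(data)

# Keep only digits and the decimal point, then convert the column to float.
df["Price"] = df["Price"].str.replace(r"[^\d.]", "", regex=True).astype(float)

# Write to an in-memory buffer; pass a filename such as 'books.csv' to write to disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Reading the CSV back is a quick way to check the result.
buffer.seek(0)
round_trip = pd.read_csv(buffer)
print(round_trip)
```

This simple pattern assumes prices have no thousands separators; a value like "1,250.00" would need extra handling.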
🔄 Bonus: Scrape Multiple Pages
Many websites split their listings across multiple pages. Let’s modify our script to scrape all of them.
import requests
import pandas as pd
from bs4 import BeautifulSoup
base_url = 'https://books.toscrape.com/catalogue/page-{}.html'
page_number = 1
all_books = []
while True:
    url = base_url.format(page_number)
    response = requests.get(url)
    if response.status_code != 200:
        break  # Stop when there are no more pages
    soup = BeautifulSoup(response.text, 'html.parser')
    books = soup.find_all('article', class_='product_pod')
    if not books:
        break  # No more books found
    for book in books:
        title = book.h3.a['title']
        price = book.find('p', class_='price_color').text
        all_books.append({'Title': title, 'Price': price})
    page_number += 1
df = pd.DataFrame(all_books)
df.to_csv('all_books.csv', index=False)
print("All data saved to all_books.csv")
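Counting page numbers works when the URL pattern is predictable. An alternative, sketched below, is to follow the page's own "next" link until it disappears. The helper name next_page_url is hypothetical, and the pager markup (an <li class="next"> wrapping a link) is an assumption about the page structure, so inspect the real HTML before relying on it.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def next_page_url(html, current_url):
    """Return the absolute URL of the next page, or None on the last page."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumed markup: <li class="next"><a href="page-3.html">next</a></li>
    next_item = soup.find("li", class_="next")
    if next_item is None or next_item.a is None:
        return None
    # urljoin resolves a relative href against the URL of the current page.
    return urljoin(current_url, next_item.a["href"])


# Hypothetical page fragments standing in for downloaded HTML.
middle_page = '<ul class="pager"><li class="next"><a href="page-3.html">next</a></li></ul>'
last_page = '<ul class="pager"><li class="previous"><a href="page-49.html">previous</a></li></ul>'

middle_result = next_page_url(middle_page, "https://books.toscrape.com/catalogue/page-2.html")
last_result = next_page_url(last_page, "https://books.toscrape.com/catalogue/page-50.html")
print(middle_result)
print(last_result)
```

In a real loop you would fetch a page, extract its books, call next_page_url on the same HTML, and stop as soon as it returns None.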
📌 Tip: Be respectful by adding a delay between requests, for example at the end of each loop iteration:
import time
time.sleep(2) # Wait 2 seconds before next request
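Here is one way the delay could be packaged as a helper. The function name polite_sleep and the random jitter are additions for illustration, not part of the original script; the jitter simply keeps requests from arriving at perfectly regular intervals.

```python
import random
import time


def polite_sleep(base_delay=2.0, jitter=0.5):
    """Sleep for base_delay seconds plus a random extra of up to jitter seconds."""
    delay = base_delay + random.uniform(0, jitter)
    time.sleep(delay)
    return delay


# Hypothetical loop; in the pagination script the call would go at the end of
# each iteration, just before page_number += 1.
for page_number in range(1, 4):
    print(f"Would fetch page {page_number} here")
    polite_sleep(base_delay=0.01, jitter=0.01)  # tiny values so the demo runs fast
```

For real scraping, the default of roughly 2 seconds matches the tip above; increase it if the site asks for a longer crawl delay.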
⚠️ Important Notes on Ethics and Best Practices
Even though you’re building your own scraper, always follow ethical practices:
- Respect robots.txt
- Avoid excessive requests
- Identify your bot with a custom User-Agent
- Don’t scrape sensitive or private data
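Checking robots.txt can be automated with Python's built-in urllib.robotparser module. The sketch below parses a made-up robots.txt from memory so it runs offline; the bot name and the rules are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

USER_AGENT = "my-first-scraper"  # hypothetical bot name

# Hypothetical robots.txt content, parsed from memory so the example runs offline.
sample_rules = """
User-agent: *
Disallow: /private/
Crawl-delay: 2
""".splitlines()

parser = RobotFileParser()
parser.parse(sample_rules)

# can_fetch() reports whether the given user agent may request the URL.
allowed = parser.can_fetch(USER_AGENT, "https://example.com/catalogue/page-1.html")
blocked = parser.can_fetch(USER_AGENT, "https://example.com/private/data.html")

# crawl_delay() returns the requested delay in seconds, or None if not specified.
delay = parser.crawl_delay(USER_AGENT)

print(allowed, blocked, delay)
```

Against a live site you would instead call parser.set_url() with the address of the site's robots.txt, then parser.read(), before asking can_fetch() about each URL you plan to scrape.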

🎉 Congratulations! You Just Built a Web Scraper
You’ve successfully created a working web scraper in Python that:
- Fetches HTML content
- Parses and extracts relevant data
- Stores it in a structured format (CSV)
- Handles pagination
This foundation can be extended to scrape product listings, job boards, news articles, and more!
Related Articles:
Part 1: Web Scraping! The Ultimate Guide for Data Extraction
Part 2: Web Scraping! Legal Aspects and Ethical Guidelines
Part 3: Web Scraping! Different Tools and Technologies
Part 4: How to Build Your First Web Scraper Using Python
Part 5: Web Scraping Advanced Techniques in Python
Part 6: Real-World Applications of Web Scraping