🤖 Automating Web Scraping with Selenium in Python
Ever tried scraping a website and got… nothing? That’s probably because the data is loaded with JavaScript, and standard tools like requests + BeautifulSoup can’t see it. Enter Selenium: a browser automation tool that lets you scrape dynamic websites just like a human user would.
In this blog, you’ll learn:
✅ What Selenium is and when to use it
✅ How to install and set up Selenium
✅ How to scrape dynamic content
✅ Real-world examples with wait times and button clicks
🧠 What is Selenium?
Selenium is a browser automation framework with Python bindings. It lets you:
- Open and interact with web pages
- Click buttons, scroll, and fill out forms
- Wait for dynamic content to load
- Extract visible content after JavaScript renders it
🧰 What You’ll Need
- Python 3.x
- The selenium library
- A WebDriver (e.g. ChromeDriver for Chrome, GeckoDriver for Firefox)
Install Selenium
pip install selenium
Download ChromeDriver
Go to: https://chromedriver.chromium.org/downloads (for Chrome 115 and newer, that page points you to the Chrome for Testing downloads).
Make sure the driver version matches your Chrome browser version, then add the driver to your system PATH or place it in your script folder.
Note: with Selenium 4.6+, the bundled Selenium Manager can fetch a matching driver automatically, so this manual step is often optional.
🚀 Getting Started with Selenium
Step 1: Launch a Browser
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
service = Service("chromedriver") # or give full path
driver = webdriver.Chrome(service=service)
driver.get("http://quotes.toscrape.com/js") # JavaScript version of the site
🔍 Step 2: Extract Elements
Unlike BeautifulSoup, Selenium uses .find_element() and .find_elements():
quotes = driver.find_elements(By.CLASS_NAME, "text")
authors = driver.find_elements(By.CLASS_NAME, "author")

for quote, author in zip(quotes, authors):
    print(f"{quote.text} — {author.text}")
⏳ Step 3: Wait for Content to Load
Dynamic pages need time to load content. Use WebDriverWait:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CLASS_NAME, "quote"))
)
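The call blocks until the condition holds or the 10-second timeout expires, at which point it raises a TimeoutException. A typical wait-then-extract pattern, with the timeout handled explicitly:
from selenium.common.exceptions import TimeoutException

try:
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, "quote"))
    )
    quotes = driver.find_elements(By.CLASS_NAME, "text")
except TimeoutException:
    print("Content did not load within 10 seconds")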
🧭 Clicking Buttons, Navigating Pages
Selenium is great for paginated sites.
from selenium.common.exceptions import NoSuchElementException

while True:
    quotes = driver.find_elements(By.CLASS_NAME, "text")
    for q in quotes:
        print(q.text)
    try:
        next_btn = driver.find_element(By.LINK_TEXT, "Next")
        next_btn.click()
        WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.CLASS_NAME, "quote"))
        )
    except NoSuchElementException:
        break  # no Next link on the last page
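One subtlety: right after the click, the old page’s quotes can still satisfy presence_of_element_located, so the wait may pass before the next page actually renders. Waiting for the clicked element to go stale is a more reliable signal; a sketch:
next_btn = driver.find_element(By.LINK_TEXT, "Next")
next_btn.click()
# The old button detaches from the DOM once the new page loads
WebDriverWait(driver, 10).until(EC.staleness_of(next_btn))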
🧹 Clean Up
driver.quit()
Always close the browser when you’re done; otherwise the driver and browser processes keep running in the background.
📦 Saving Data to CSV
Run this while the browser is still open, before you call driver.quit():
import pandas as pd
data = []
quotes = driver.find_elements(By.CLASS_NAME, "quote")

for q in quotes:
    text = q.find_element(By.CLASS_NAME, "text").text
    author = q.find_element(By.CLASS_NAME, "author").text
    data.append({"Quote": text, "Author": author})

df = pd.DataFrame(data)
df.to_csv("quotes_selenium.csv", index=False)
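If you’d rather skip the pandas dependency, the standard-library csv module writes the same file; a minimal sketch reusing the data list built above:
import csv

with open("quotes_selenium.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["Quote", "Author"])
    writer.writeheader()
    writer.writerows(data)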
✅ Summary
| Task | Code |
|---|---|
| Load page | driver.get(url) |
| Find element | driver.find_element(By.CLASS_NAME, ...) |
| Wait for JS | WebDriverWait(...).until(...) |
| Click | element.click() |
Selenium lets you interact with websites as if you were a user, unlocking pages that traditional scrapers can’t reach.
⚠️ Pro Tips
- Use headless mode for background automation:
  from selenium.webdriver.chrome.options import Options
  options = Options()
  options.add_argument("--headless")  # on recent Chrome, "--headless=new" is preferred
  driver = webdriver.Chrome(service=service, options=options)
- Add random delays with time.sleep() between actions to avoid detection.
- Don’t scrape sensitive or copyrighted data.
- Always check the site’s robots.txt (see the sketch after this list).
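A quick programmatic check of robots.txt, using Python’s built-in urllib.robotparser (the URL is this post’s example site):
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("http://quotes.toscrape.com/robots.txt")
rp.read()
print(rp.can_fetch("*", "http://quotes.toscrape.com/js"))  # True if scraping is allowed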
🔁 Bonus: Scraping Infinite Scroll?
You can use:
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
Then wait for content to load and repeat.
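Putting that together: scroll, wait, and stop once the page height stops growing. A minimal sketch of that loop:
import time

last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(2)  # crude pause; an explicit wait on new elements is more robust
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break  # no new content loaded, we've reached the bottom
    last_height = new_height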
💡 Use Cases
- Scraping product prices from JS-heavy e-commerce sites
- Scraping content behind login forms
- Automating form submissions or report downloads