[Interactive Crawling 1] Selenium Basics

Let the browser drive itself to collect data.

Introduction

When the data you need is not published as a ready-made dataset, web scraping is often the only way to get it. This series introduces Selenium, an automation tool that controls a real browser, using the Central Weather Administration site as the running example.

Preparation

Install Selenium

pip install selenium

Install WebDriver

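Selenium talks to the browser through a driver executable (ChromeDriver for Chrome). The webdriver-manager package downloads a driver that matches the installed browser automatically: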

pip install webdriver-manager
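
Before moving on, it can help to confirm that both packages landed in the Python environment you intend to use. The following is a minimal sketch using only the standard library; the helper name `installed_version` is made up for this example and is not part of Selenium or webdriver-manager.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report on the two packages installed in the steps above.
for name in ("selenium", "webdriver-manager"):
    found = installed_version(name)
    print(f"{name}: {found or 'not installed'}")
```

If either line reports "not installed", the most likely cause is that pip installed into a different interpreter or virtual environment than the one running the script.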

Basic Usage

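from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# Selenium 4 takes the driver path through a Service object
service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service)
driver.get("https://www.cwa.gov.tw/")  # open the Central Weather Administration site
print(driver.title)
driver.quit()  # close the browser and end the driver session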

This script opens Chrome, navigates to the site, prints the page title, and closes the browser.
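
One weakness of a straight-line script like this is that if any step raises an exception, `driver.quit()` is never reached and the browser stays open. Below is a minimal sketch of one way to guard against that with `try`/`finally`. The names `browser_session`, `fetch_title`, and `make_chrome` are hypothetical helpers invented for this example, and the Selenium imports are placed inside `make_chrome` only so the rest of the code can be read and exercised on its own.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(make_driver):
    """Create a driver, hand it to the caller, and always quit it afterwards."""
    driver = make_driver()
    try:
        yield driver
    finally:
        # Runs on success and on error, so no browser window is left behind
        driver.quit()


def fetch_title(make_driver, url):
    """Open `url` in a fresh browser session and return the page title."""
    with browser_session(make_driver) as driver:
        driver.get(url)
        return driver.title


def make_chrome():
    """Build a Chrome driver using webdriver-manager (assumes Selenium 4)."""
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    return webdriver.Chrome(service=Service(ChromeDriverManager().install()))


# Usage (launches a real browser):
# print(fetch_title(make_chrome, "https://www.cwa.gov.tw/"))
```

Passing the driver factory in as an argument keeps the cleanup logic independent of the browser, so the same helper would work with a Firefox factory as well.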

Conclusion

Selenium simplifies automating browser actions and lays the foundation for more advanced crawling techniques.
