[Interactive Crawling 3] Selenium Advanced – Headless Mode in Containers

Run Selenium without opening a browser window using Docker.

Introduction

Building on the earlier Selenium basics articles, this installment shows how to run crawlers inside Docker in headless mode, so that automation can run on servers that have no display.

Containerizing Chrome

The Selenium project maintains docker-selenium, which publishes ready-made images for Chrome, Firefox, and Edge.

Workflow

Headless Mode

Use Chrome options to enable headless execution:

from selenium.webdriver.chrome.options import Options

chrome_options = Options()
# Run Chrome without opening a visible browser window
chrome_options.add_argument("--headless")
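
In containers, headless Chrome is often started with a few extra switches. The following is a minimal sketch under stated assumptions, not part of the original setup: the helper names `headless_args` and `build_options` and the default window size are made up for illustration. `--no-sandbox` and `--disable-dev-shm-usage` are existing Chrome switches, but whether you need them depends on the image and on how much shared memory the container has.

```python
def headless_args(width=1920, height=1080):
    """Return Chrome switches for headless use (illustrative defaults)."""
    return [
        "--headless",                       # no visible browser window
        f"--window-size={width},{height}",  # headless has no real screen, so set a viewport
        "--no-sandbox",                     # often needed when Chrome runs as root in a container
        "--disable-dev-shm-usage",          # use /tmp instead of /dev/shm for shared memory files
    ]


def build_options(width=1920, height=1080):
    """Build a selenium Options object from the switches above."""
    # Imported lazily so headless_args() works even without selenium installed.
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in headless_args(width, height):
        options.add_argument(arg)
    return options
```

Keeping the switches in a plain list makes them easy to log or to override per environment before they are applied to `Options`.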

Docker Compose

services:
  chrome:
    image: selenium/standalone-chrome
    shm_size: 2g
    ports:
      - "4444:4444"
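
The container needs a few seconds after startup before it accepts sessions, so a script launched immediately may fail to connect. Selenium Grid exposes a `/status` endpoint whose JSON body contains `value.ready`. The sketch below polls it using only the standard library. The function names, the retry counts, and the `http://localhost:4444/status` URL (derived from the port mapping above) are assumptions for illustration.

```python
import json
import time
import urllib.error
import urllib.request


def grid_is_ready(status_body):
    """Return True if a Grid /status JSON body reports value.ready == true."""
    try:
        payload = json.loads(status_body)
    except (TypeError, ValueError):
        return False  # missing or non-JSON body counts as "not ready"
    if not isinstance(payload, dict):
        return False
    value = payload.get("value")
    return isinstance(value, dict) and value.get("ready") is True


def fetch_status(url, timeout=2.0):
    """Fetch the raw /status body; return None if the server is not reachable yet."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except (urllib.error.URLError, OSError):
        return None


def wait_for_grid(url="http://localhost:4444/status", attempts=30, delay=1.0,
                  fetch=fetch_status):
    """Poll the status endpoint until it reports ready or attempts run out."""
    for _ in range(attempts):
        if grid_is_ready(fetch(url)):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_for_grid()` before creating the driver returns `True` once the Grid reports ready and `False` if it never does within the given attempts. The `fetch` parameter exists only so the polling logic can be exercised without a running container.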

Sample Script

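from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless")

# Connect to the Chrome container published on port 4444 by Docker Compose
driver = webdriver.Remote(command_executor="http://localhost:4444/wd/hub", options=options)
# ... perform scraping ...
driver.quit()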
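
If the scraping code raises before `driver.quit()` runs, the session can keep occupying a slot in the container until it times out. One way to guard against that is a small context manager. This is a sketch under assumptions, not the article's own code: `remote_chrome` and the `factory` parameter are hypothetical names introduced here.

```python
from contextlib import contextmanager


def _default_factory(url):
    """Create a headless remote Chrome session (selenium imported lazily)."""
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    return webdriver.Remote(command_executor=url, options=options)


@contextmanager
def remote_chrome(url="http://localhost:4444/wd/hub", factory=_default_factory):
    """Yield a driver and always call quit(), even if the body raises."""
    driver = factory(url)
    try:
        yield driver
    finally:
        driver.quit()
```

Used as `with remote_chrome() as driver:` followed by the scraping calls, the session is released whether the block finishes normally or fails partway through.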

Conclusion

Running Selenium in headless containers simplifies deployment and paves the way for scalable crawling pipelines.
