Preface
Project Amaterasu uses Taiwan freeway traffic data. This article shows how to fetch and organize the required files.
Downloading Data
Preparation
The open data site at the Ministry of Transportation hosts the M03A dataset, containing vehicle counts by type. Files are offered either as directories of recent data or as compressed archives for older days.
Because manual downloads would be tedious, we automate the process.
Fetch Files Automatically
By inspecting the webpage, we find that each file can be retrieved simply by appending its filename to the base URL. With that knowledge, we can craft a script using wget.
This approach downloads multiple files concurrently, which shortens the total time because each download spends most of its time waiting on the network.
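As a rough illustration of the idea, here is a minimal sketch of a concurrent downloader. It is not the project's original script: BASE_URL is a placeholder, and the helper names (build_url, fetch, download_all) and the worker count are assumptions made for this example. It also assumes wget is installed and on the PATH.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Placeholder (assumption): substitute the real base URL found on the site.
BASE_URL = "https://example.invalid/M03A/"


def build_url(base_url: str, filename: str) -> str:
    """Append the filename to the base URL."""
    return base_url.rstrip("/") + "/" + filename


def fetch(url: str, dest_dir: Path) -> int:
    """Run wget for one URL and return its exit code (0 means success)."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    # -q: quiet, -nc: do not re-download existing files, -P: target directory
    result = subprocess.run(["wget", "-q", "-nc", "-P", str(dest_dir), url])
    return result.returncode


def download_all(filenames, dest_dir, base_url=BASE_URL, workers=4, fetcher=fetch):
    """Download every file concurrently; return {filename: exit code}."""
    urls = [build_url(base_url, name) for name in filenames]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so codes line up with filenames.
        codes = list(pool.map(lambda u: fetcher(u, Path(dest_dir)), urls))
    return dict(zip(filenames, codes))
```

Calling download_all with a list of filenames and a target folder returns a dictionary of exit codes, so failed downloads (non-zero codes) can be retried. The fetcher parameter exists only so the function can be exercised without touching the network.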
Organizing Data
After downloading, unzip archives and arrange them into a consistent folder structure ready for later preprocessing.
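One possible way to do this is sketched below. The layout it produces (one folder per archive, named after the archive) is an assumption for illustration, as is the function name extract_all; adjust both to whatever structure the later preprocessing expects.

```python
import shutil
from pathlib import Path


def extract_all(archive_dir, output_dir):
    """Unpack every supported archive into output_dir/<archive name>/."""
    archive_dir, output_dir = Path(archive_dir), Path(output_dir)
    # Suffixes that shutil can unpack, such as ".zip" and ".tar.gz".
    suffixes = [ext for _, exts, _ in shutil.get_unpack_formats() for ext in exts]
    extracted = []
    for archive in sorted(archive_dir.iterdir()):
        suffix = next((s for s in suffixes if archive.name.endswith(s)), None)
        if suffix is None or not archive.is_file():
            continue  # not an archive we know how to unpack
        # Folder name is the archive name with its archive suffix removed.
        target = output_dir / archive.name[: -len(suffix)]
        target.mkdir(parents=True, exist_ok=True)
        shutil.unpack_archive(str(archive), str(target))
        extracted.append(target)
    return extracted
```

The function returns the list of folders it created, which makes it easy to check that every downloaded archive was unpacked.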
Summary
Automating downloads with wget and ThreadPoolExecutor makes gathering large volumes of traffic data manageable, paving the way for subsequent steps in Project Amaterasu.