This project demonstrates the implementation of an ETL (Extract, Transform, Load) pipeline using Apache Airflow and Docker. The pipeline extracts data from a Forex API, processes it, and loads it into a PostgreSQL database.
- Docker installed on your machine.
- Docker Compose installed on your machine.
- An API key from the Fixer Forex API for accessing the data.
- Clone the repository: `git clone https://github.com/SammyGIS/forex-etl-airflow-docker.git`
- Change into the project directory: `cd forex-etl-airflow-docker`
- Build and run the Docker containers: `docker-compose up -d`
- Access the Airflow web interface: open your browser and go to http://localhost:8080, then log in using the default credentials (username: `airflow`, password: `airflow`).
- Trigger the `forex_pipeline` DAG in the Airflow web interface.
- Task to check if the Forex API is available.
- Uses an HTTP sensor to verify the presence of 'EUR' in the API response.
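
A minimal sketch of what this sensor could look like, assuming an Airflow HTTP connection named `forex_api` pointing at the Fixer API and a `latest`-style endpoint (both names are assumptions, not taken from the repository):

```python
from airflow.providers.http.sensors.http import HttpSensor

# Hypothetical connection id and endpoint; adjust to match the actual DAG.
is_api_available = HttpSensor(
    task_id="is_api_available",
    http_conn_id="forex_api",                                 # HTTP connection to the Fixer API
    endpoint="latest",                                        # assumed endpoint path
    response_check=lambda response: "EUR" in response.text,   # succeed only if 'EUR' appears
    poke_interval=5,                                          # re-check every 5 seconds
    timeout=20,                                               # give up after 20 seconds
)
```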
- Task to create a PostgreSQL table named `rates`.
- Defines columns for the table: `rate` (float) and `symbol` (text).
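
A hedged sketch of this task using Airflow's PostgresOperator; the `postgres_default` connection id is an assumption:

```python
from airflow.providers.postgres.operators.postgres import PostgresOperator

create_table = PostgresOperator(
    task_id="create_table",
    postgres_conn_id="postgres_default",   # assumed Postgres connection id
    sql="""
        CREATE TABLE IF NOT EXISTS rates (
            rate   FLOAT,
            symbol TEXT
        );
    """,
)
```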
- Task to extract data from the Forex API.
- Uses a SimpleHttpOperator to make a GET request and retrieve the data.
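
A minimal sketch of the extraction task, again assuming the `forex_api` connection and a `latest`-style endpoint; the parsed JSON response is pushed to XCom for the transform step:

```python
import json

from airflow.providers.http.operators.http import SimpleHttpOperator

extract_data = SimpleHttpOperator(
    task_id="extract_data",
    http_conn_id="forex_api",                                     # assumed connection id
    endpoint="latest",                                            # assumed endpoint path
    method="GET",
    response_filter=lambda response: json.loads(response.text),  # push parsed JSON to XCom
    log_response=True,
)
```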
- Task to transform the extracted data.
- Utilizes a Python function to process and normalize the data, saving the result as a CSV file.
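
One way such a transform function could look, assuming a Fixer-style payload with a nested `rates` object and a CSV path shared with the load task (both are assumptions):

```python
import pandas as pd

from airflow.operators.python import PythonOperator


def _transform_data(ti):
    # Pull the parsed JSON pushed to XCom by the extract task.
    payload = ti.xcom_pull(task_ids="extract_data")
    # Flatten the nested "rates" object into (rate, symbol) rows matching the table layout.
    rows = [(rate, symbol) for symbol, rate in payload["rates"].items()]
    df = pd.DataFrame(rows, columns=["rate", "symbol"])
    # Persist the normalized data for the load step (path is an assumption).
    df.to_csv("/opt/airflow/dags/files/forex_rates.csv", index=False, header=False)


transform_data = PythonOperator(
    task_id="transform_data",
    python_callable=_transform_data,
)
```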
- Task to load the processed data into the PostgreSQL database.
- Uses the Postgres Hook to copy data from the CSV file into the `rates` table.
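
A hedged sketch of the load step using `PostgresHook.copy_expert`, with the same assumed connection id and CSV path as above:

```python
from airflow.operators.python import PythonOperator
from airflow.providers.postgres.hooks.postgres import PostgresHook


def _load_data():
    hook = PostgresHook(postgres_conn_id="postgres_default")   # assumed connection id
    # Bulk-copy the CSV produced by the transform step into the rates table.
    hook.copy_expert(
        sql="COPY rates FROM STDIN WITH (FORMAT csv)",
        filename="/opt/airflow/dags/files/forex_rates.csv",    # assumed path
    )


load_data = PythonOperator(
    task_id="load_data",
    python_callable=_load_data,
)
```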
- Tasks are linked in the following order: `is_api_available >> create_table >> extract_data >> transform_data >> load_data`.
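
Put together, the dependency chain sits inside the DAG definition roughly like this (start date and schedule are assumptions; placeholder operators stand in for the tasks sketched above, just to show the structure):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator

with DAG(
    dag_id="forex_pipeline",
    start_date=datetime(2023, 1, 1),   # assumed start date
    schedule_interval="@daily",        # assumed schedule
    catchup=False,
) as dag:
    # Placeholders for the operators described in the sections above.
    is_api_available = EmptyOperator(task_id="is_api_available")
    create_table = EmptyOperator(task_id="create_table")
    extract_data = EmptyOperator(task_id="extract_data")
    transform_data = EmptyOperator(task_id="transform_data")
    load_data = EmptyOperator(task_id="load_data")

    # The dependency chain described above.
    is_api_available >> create_table >> extract_data >> transform_data >> load_data
```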
Reference tutorial: https://blog.devgenius.io/etl-process-using-airflow-and-docker-226aa5c7a41a