In this project, I applied various data engineering techniques to the Figure Eight disaster dataset to build a model for an API that classifies disaster messages.
The project includes a Flask app that takes an input message and classifies it into different categories. This helps people who don't want to read entire text messages in emergency situations.
Just copy a text message and paste it into the textbox; the app will classify it into categories.
data/process_data.py cleans and transforms the text for multi-output classification.
Steps:
- Loads messages.csv and categories.csv
- Merges and cleans the data
- Stores the merged and cleaned DataFrame in an SQLite database
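The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual process_data.py: the table name `DisasterMessages` and the raw category format (`"related-1;request-0;..."`) are assumptions about the Figure Eight CSV layout.

```python
import sqlite3

import pandas as pd


def clean_and_store(messages_csv, categories_csv, db_path):
    """Merge the two CSVs, expand the categories column, and store to SQLite."""
    messages = pd.read_csv(messages_csv)
    categories = pd.read_csv(categories_csv)
    df = messages.merge(categories, on="id")

    # Assumed raw format: "related-1;request-0;...".
    # Split it into one binary column per category.
    cats = df["categories"].str.split(";", expand=True)
    cats.columns = [c.split("-")[0] for c in cats.iloc[0]]
    for col in cats.columns:
        cats[col] = cats[col].str[-1].astype(int)

    df = pd.concat([df.drop(columns=["categories"]), cats], axis=1)
    df = df.drop_duplicates()

    # Table name "DisasterMessages" is an assumption.
    with sqlite3.connect(db_path) as conn:
        df.to_sql("DisasterMessages", conn, index=False, if_exists="replace")
    return df
```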
models/train_classifier.py contains an ML pipeline that:
- Loads the stored data
- Splits it into train and test sets
- Processes the text with text_tokenize.py
- Trains the model tuned in the ML Pipeline Preparation notebook
- Shows the accuracy, precision, recall, and F1 scores for each category
- Saves the model to a pickle file
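A pipeline of this shape can be sketched as below. This is a simplified stand-in for train_classifier.py, assuming a TF-IDF vectorizer feeding a multi-output Random Forest; the actual tuned hyperparameters come from the ML Pipeline Preparation notebook.

```python
import pickle

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline


def build_and_train(X, Y, model_path):
    """Train a multi-output text classifier, report per-category scores, pickle it."""
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("clf", MultiOutputClassifier(RandomForestClassifier(n_estimators=50))),
    ])
    X_train, X_test, Y_train, Y_test = train_test_split(
        X, Y, test_size=0.2, random_state=42
    )
    pipeline.fit(X_train, Y_train)

    # Precision/recall/F1 for each category column separately.
    Y_pred = pipeline.predict(X_test)
    for i, col in enumerate(Y.columns):
        print(col)
        print(classification_report(Y_test.iloc[:, i], Y_pred[:, i], zero_division=0))

    with open(model_path, "wb") as f:
        pickle.dump(pipeline, f)
    return pipeline
```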
Using the Flask framework, the app has been deployed to Heroku; you can check out the deployed app.
- You can run the web app locally using run.py

(Screenshots: text message input area and results area.)

The app classifies the text message into categories.
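The classification endpoint can be sketched as below. This is a simplified, self-contained version: the real run.py renders go.html instead of returning JSON, the `/go` route name mirrors the template name, and the stub model and category list stand in for the unpickled models/classifier.pkl.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


# Stand-in for the pipeline loaded from models/classifier.pkl,
# so this sketch runs without the trained model.
class _StubModel:
    def predict(self, texts):
        return [[1, 0, 1]]


model = _StubModel()
CATEGORIES = ["related", "aid_related", "weather_related"]  # illustrative subset


@app.route("/go")
def go():
    # Classify the pasted message and map each category to its 0/1 flag.
    query = request.args.get("query", "")
    labels = model.predict([query])[0]
    results = {cat: int(flag) for cat, flag in zip(CATEGORIES, labels)}
    return jsonify(message=query, categories=results)
```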
- The code is written in Python 3.x; see requirements.txt for the project dependencies.
- Run process_data.py: this script cleans the data from disaster_messages.csv and disaster_categories.csv and stores it in DisasterResponse.db

  `python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db`

  - data/process_data.py: cleans the data
  - data/disaster_messages.csv: the messages data
  - data/disaster_categories.csv: the categories of the messages
  - data/DisasterResponse.db: database for storing the processed data
- Run train_classifier.py: this script loads the data from the database, builds an ML pipeline, and saves the trained model

  `python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl`

  - models/train_classifier.py: trains the model on the processed data
  - data/DisasterResponse.db: stored data produced by process_data.py
  - models/classifier.pkl: path for saving the trained model
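Loading the stored data back out of the database can be sketched as follows. The table name `DisasterMessages` and the non-category columns (`id`, `message`, `genre`) are assumptions; match whatever process_data.py passed to `DataFrame.to_sql`.

```python
import sqlite3

import pandas as pd


def load_data(db_path, table="DisasterMessages"):
    """Read the cleaned table and split it into message text X and labels Y."""
    with sqlite3.connect(db_path) as conn:
        df = pd.read_sql(f"SELECT * FROM {table}", conn)
    X = df["message"]
    # Everything except the id/text/genre columns is a binary category label.
    Y = df.drop(columns=["id", "message", "genre"])
    return X, Y
```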
- Run run.py: after the data has been cleaned and the classifier trained, you can run run.py to start the web app
I provided 3 graphs based on the training data:
- The messages belong to 3 genres; here is the distribution:
- The top 3 categories are: 1- related, 2- aid_related, 3- weather_related
- The data is imbalanced in most categories, so prediction accuracy is above 90% for almost every category while recall and precision are very low. I applied class weighting in the Random Forest classifier so that ones and zeros do not carry equal importance for our case.
- Here you can see the distribution of message lengths by genre. Most messages are shorter than 400 characters.
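The class weighting mentioned above can be expressed through scikit-learn's built-in option, sketched below; whether train_classifier.py uses exactly `class_weight="balanced"` or custom weights is an assumption.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

# "balanced" reweights each class inversely to its frequency in the training
# data, so the rare positive labels are not drowned out by the majority zeros.
clf = MultiOutputClassifier(
    RandomForestClassifier(n_estimators=100, class_weight="balanced")
)
```

Without the weighting, a classifier can reach 90%+ accuracy on an imbalanced category simply by always predicting zero, which is exactly why recall and precision are the scores to watch here.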
|-- Notebooks-----------------#Notebook files for data process and model training
|-- README.md
|-- app
| |-- run.py------------------# Flask app
| |-- static------------------# Github and Linkedin Logos
| |-- templates
| | |-- go.html-------------# Results page
| | `-- master.html---------# Main page
| `-- text_tokenize.py--------# Text tokenizer
|-- data
| |-- DisasterResponse.db-----# Stored Data
| |-- disaster_categories.csv-# Categories.csv
| |-- disaster_messages.csv---# Messages.csv
| `-- process_data.py---------# Data processor
|-- img-------------------------# readme images
|-- models
| |-- classifier.pkl----------# Trained model
| |-- text_tokenize.py--------# Text tokenizer
| `-- train_classifier.py-----# Classifier
`-- requirements.txt------------# Required Python Libraries
The whole project is written in Python 3.8; requirements.txt lists the necessary libraries.