GitHub - HenrikPereira/EDSBproject: Project for EDSB project in EDSA 2019 post-graduation (NOVA

Human Resources Analysis - Predicting Attrition

EDS Bootcamp project | Enterprise Data Science and Analytics (2019)

Use Case: Human Resources Analysis Predicting Attrition

Authors: Bruno Candeias¹, David Oliveira², Henrique Pereira³ & Manuel Oom⁴

M20180313: [email protected]
M20181430: [email protected]
M20181395: [email protected]
M20181431: [email protected]

I. Important Files

The most important files in this work are the following:

Final Presentation_EDSB_20191209_v0.01.ppt
HumanResourcesAnalysis_PredictAttrition.pbix
Data_models.ipynb
data_pre_proc.py
auxiliary.py

Please check the requirements file for further information.

II. Status Report

We chose this use case mainly because it lets us explore different models, giving us a broader view of the problem, both descriptive and predictive. In addition, human resources turnover is a constant presence in our professional lives, which also motivated the choice.

Our approach follows this workflow:

Data Exploration;
Model Evaluation & Selection;
Results Presentation.

III. Dataset Exploration

The dataset (HR_DS.csv) used in this use case (Human Resources Analysis Predict Attrition) contains 1470 records and 35 columns:

Age; Attrition; BusinessTravel; DailyRate; Department; DistanceFromHome; Education; EducationField; EmployeeCount; EmployeeNumber; EnvironmentSatisfaction; Gender; HourlyRate; JobInvolvement; JobLevel; JobRole; JobSatisfaction; MaritalStatus; MonthlyIncome; MonthlyRate; NumCompaniesWorked; Over18; OverTime; PercentSalaryHike; PerformanceRating; RelationshipSatisfaction; StandardHours; StockOptionLevel; TotalWorkingYears; TrainingTimesLastYear; WorkLifeBalance; YearsAtCompany; YearsInCurrentRole; YearsSinceLastPromotion; YearsWithCurrManager
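
A quick structural check on load helps confirm the figures above. The sketch below is only an illustration, not the project's code: `load_hr_data` and `shape_matches` are hypothetical helpers, and the small in-memory sample stands in for HR_DS.csv. It reads a CSV with pandas, compares the shape against the 1470 x 35 stated here, and lists columns that hold a single value, which are worth inspecting before modelling.

```python
import io

import pandas as pd

# Shape stated in this README for HR_DS.csv.
EXPECTED_ROWS, EXPECTED_COLS = 1470, 35


def load_hr_data(source):
    """Read the HR dataset and list columns that hold a single value."""
    df = pd.read_csv(source)
    # A column with one distinct value carries no information for a model.
    constant_cols = [c for c in df.columns if df[c].nunique(dropna=False) <= 1]
    return df, constant_cols


def shape_matches(df):
    """True when the frame has the row and column counts stated above."""
    return df.shape == (EXPECTED_ROWS, EXPECTED_COLS)


# Tiny in-memory stand-in for the real file (values are made up).
sample_csv = io.StringIO(
    "Age,Attrition,Over18,StandardHours\n"
    "41,Yes,Y,80\n"
    "49,No,Y,80\n"
    "37,Yes,Y,80\n"
)
sample_df, sample_constant = load_hr_data(sample_csv)
```

On the real file the call would look like `load_hr_data("HR_DS.csv")`, with the path adjusted to wherever the CSV lives.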

We explored the dataset to evaluate each variable and how the variables correlate with one another. Our first findings were:

Much of our data (DistanceFromHome, MonthlyIncome, NumCompaniesWorked, PercentSalaryHike, TotalWorkingYears, YearsAtCompany, YearsSinceLastPromotion) is skewed and not normally distributed;
Columns such as YearsWithCurrManager and YearsInCurrentRole show two distinct distributions, with a cutoff at about 5 years;
Several variables have outliers: MonthlyIncome, NumCompaniesWorked, PerformanceRating, StockOptionLevel, TotalWorkingYears, TrainingTimesLastYear, YearsAtCompany, YearsInCurrentRole, YearsSinceLastPromotion, YearsWithCurrManager;
Regarding Attrition and its relationship with the other variables, we found that younger employees leave more often in every category except SalesExecutive, ManufacturingDirector, Manager and Divorced. We also found that gender matters under some conditions.
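
The checks behind findings like these can be sketched with pandas. This is a minimal illustration under stated assumptions, not the analysis actually run in the project: the function names, the skewness threshold of 1.0, the 1.5 x IQR outlier rule, the age split at 35 and the demo values are all hypothetical choices.

```python
import numpy as np
import pandas as pd


def skew_report(df, threshold=1.0):
    """Numeric columns whose absolute sample skewness exceeds the threshold."""
    skew = df.select_dtypes(include="number").skew()
    return skew[skew.abs() > threshold].sort_values(ascending=False)


def iqr_outlier_counts(df, k=1.5):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR] in each numeric column."""
    num = df.select_dtypes(include="number")
    q1, q3 = num.quantile(0.25), num.quantile(0.75)
    iqr = q3 - q1
    # axis=1 aligns the per-column bounds with the frame's columns.
    mask = num.lt(q1 - k * iqr, axis=1) | num.gt(q3 + k * iqr, axis=1)
    return mask.sum()


def attrition_rate_by(df, group_col, target="Attrition", positive="Yes"):
    """Share of leavers within each level of group_col."""
    return df[target].eq(positive).groupby(df[group_col]).mean()


# Made-up demo data; the real input would be the HR_DS.csv frame.
demo = pd.DataFrame({
    "MonthlyIncome": [2000, 2100, 2200, 2300, 2400, 2500, 2600, 19000],
    "Age": [25, 30, 35, 40, 45, 50, 55, 60],
    "Attrition": ["Yes", "Yes", "No", "No", "No", "No", "No", "No"],
})
# The split at 35 is an arbitrary illustration of "younger" vs "older".
demo["AgeBand"] = np.where(demo["Age"] <= 35, "35 or younger", "over 35")

skewed = skew_report(demo)
outliers = iqr_outlier_counts(demo)
rates = attrition_rate_by(demo, "AgeBand")
```

Calling `attrition_rate_by` on columns such as JobRole, MaritalStatus or Gender within each age band is one way to look for the exceptions described above.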

For the Model Evaluation & Selection step, we will study several predictive classification models to apply to the dataset for this use case, namely:

XGBoost;
Logistic Regression Classifier;
Linear Support Vector Classification;
C-Support Vector Classification;
Random Forest Classifier;
Keras Deep Neural Network.
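
One way to compare candidates like these is to cross-validate them on the same folds. The sketch below is an assumption-laden illustration rather than the project's pipeline. It covers only the four scikit-learn models from the list, leaving out XGBoost and the Keras network to avoid extra dependencies. It uses a synthetic imbalanced dataset instead of HR_DS.csv, and it scores with ROC AUC, a metric chosen here and not stated in the source. `compare_models` is a hypothetical helper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC


def compare_models(X, y, n_splits=5, seed=0):
    """Mean cross-validated ROC AUC for each candidate classifier."""
    # Stratified folds keep the class ratio stable across splits.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    candidates = {
        # Linear models and SVMs are scale-sensitive, so they get a scaler.
        "LogisticRegression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "LinearSVC": make_pipeline(StandardScaler(), LinearSVC(max_iter=5000)),
        "SVC": make_pipeline(StandardScaler(), SVC()),
        "RandomForest": RandomForestClassifier(
            n_estimators=100, random_state=seed
        ),
    }
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
        for name, model in candidates.items()
    }


# Synthetic stand-in data; the 80/20 class split is an arbitrary choice.
X, y = make_classification(
    n_samples=300, n_features=10, weights=[0.8, 0.2], random_state=0
)
scores = compare_models(X, y)
```

With the real data, `X` would be the encoded feature matrix and `y` the Attrition column mapped to 0/1; categorical columns such as BusinessTravel or JobRole would need encoding first.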

