Medium Post

https://medium.com/@rohanhazra4/sparkify-user-churn-analysis-eeb1ed88775f

Sparkify

Predicting user churn using PySpark

Project Motivation

Sparkify is a fictitious music streaming app similar to Spotify or Apple Music. Churn prediction, or churn analysis, identifies users of a service who are likely to downgrade or cancel it. For Sparkify, downgrading means moving from a paid subscription to the ad-supported tier. Churn analysis is crucial for a business because it identifies at-risk customers and so helps prevent a loss of revenue.

File Descriptions

Sparkify.ipynb: Jupyter notebook containing the code and in-depth explanations.

Libraries

Pandas
NumPy
PySpark
datetime
matplotlib

Summary

Machine learning at scale with PySpark can be used to predict user churn. Although I used a smaller subset of the data in this project, running the model on the full 12 GB dataset should produce more accurate results.

A Random Forest classifier is used to predict churn. The F1 score achieved is 0.91.
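For readers unfamiliar with the metric, the sketch below shows how an F1 score is computed from a classifier's predictions. It is plain Python rather than the notebook's code, the labels and predictions are made up for illustration, and the helper name `f1_for_class` is hypothetical. It computes F1 for the churn class only; the README does not say which F1 variant was reported, and PySpark's `MulticlassClassificationEvaluator` with `metricName="f1"` reports a class-weighted F1, so the two numbers are not directly comparable.

```python
def f1_for_class(labels, predictions, positive=1):
    """Return (precision, recall, f1) for one class (hypothetical helper)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)  # correctly flagged churners
    fp = sum(1 for y, p in pairs if y != positive and p == positive)  # flagged, but stayed
    fn = sum(1 for y, p in pairs if y == positive and p != positive)  # churned, but missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


# Made-up example data: 1 = churned, 0 = stayed.
labels      = [1, 1, 1, 0, 0, 0, 0, 0]
predictions = [1, 1, 0, 0, 0, 0, 0, 1]

precision, recall, f1 = f1_for_class(labels, predictions)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because churned users are usually a small minority, F1 is a more informative summary than plain accuracy: a model that predicts "stayed" for everyone can score high accuracy while its F1 for the churn class is zero.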

Acknowledgements

The dataset is provided by Udacity.

curiousrohan/sparkify

Folders and files

Latest commit

History

Repository files navigation

Medium Post

Sparkify

Table of Contents

Project Motivation

File Descriptions

Libraries

Summary

Acknowledgements

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages