Machine Learning Algorithms applied to The Movies Dataset to predict a successful movie (in terms of rating).

muhammedyusuf678/The-Movies-Dataset-1

 
 

The-Movies-Dataset

Machine Learning Algorithms applied to The Movies Dataset to determine success factors.

Link to kaggle dataset: https://www.kaggle.com/rounakbanik/the-movies-dataset

Usage:

Data_Cleaning_PreProcessing.ipynb - explores the data, cleans it, parses the JSON column for genres, addresses the class-imbalance problem, and normalizes the numerical columns (min-max scaling).
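As a rough illustration of two of the steps above (parsing the genres column and min-max scaling), here is a minimal sketch using only the standard library. It assumes the genres cell is a stringified list of dicts with a `name` key; the helper names `parse_genres` and `min_max_scale` and the sample values are hypothetical and not taken from the notebook.

```python
import ast

def parse_genres(raw):
    """Turn a stringified list of dicts into a list of genre names.

    Assumption: each cell looks like "[{'id': 16, 'name': 'Animation'}]".
    Unparseable or missing cells yield an empty list.
    """
    try:
        items = ast.literal_eval(raw)
    except (ValueError, SyntaxError, TypeError):
        return []
    if not isinstance(items, list):
        return []
    return [d["name"] for d in items if isinstance(d, dict) and "name" in d]

def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical sample values for illustration only.
raw_genres = "[{'id': 16, 'name': 'Animation'}, {'id': 35, 'name': 'Comedy'}]"
genres = parse_genres(raw_genres)

runtimes = [81.0, 104.0, 127.0]
scaled = min_max_scale(runtimes)
```

In a notebook, the same scaling would more likely be done with a library scaler over whole columns; the plain functions here only show the arithmetic.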

DimensionalityReductionPCA.ipynb - applies PCA, SelectKBest, and SelectPercentile.
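The three techniques named above could be applied along these lines with scikit-learn. This is a sketch on synthetic data, not the notebook's code: the data set, the `f_classif` scoring function, and the choices of 5 components, `k=5`, and the 50th percentile are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, SelectPercentile, f_classif

# Synthetic stand-in for the pre-processed movie features (assumption).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=4, n_redundant=2, random_state=0
)

# PCA is unsupervised: it projects onto the top principal components.
X_pca = PCA(n_components=5).fit_transform(X)

# SelectKBest keeps the k features with the highest univariate scores.
X_kbest = SelectKBest(score_func=f_classif, k=5).fit_transform(X, y)

# SelectPercentile keeps the top-scoring percentage of features.
X_pct = SelectPercentile(score_func=f_classif, percentile=50).fit_transform(X, y)
```

PCA produces new combined features, while the two selectors keep a subset of the original columns, so the selected features remain directly interpretable.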

Gridsearch #4 XXX - applies grid search to each classifier to find the best parameters, both for the original data and for data after feature selection and/or dimensionality reduction.
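A grid search over one classifier, with and without a feature-selection step, might look like the sketch below. The classifier (`LogisticRegression`), the parameter grids, the 3-fold cross-validation, and the synthetic data are assumptions chosen for illustration; the README does not state which classifiers or grids the notebooks use.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the pre-processed data (assumption).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=4, random_state=0
)

# Search on the data as-is.
plain = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0, 10.0]},
    cv=3,
)
plain.fit(X, y)

# Search with feature selection inside the pipeline, so the selector
# is refit on each training fold and does not see the validation fold.
pipe = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),
    ("clf", LogisticRegression(max_iter=1000)),
])
grid = {"select__k": [3, 5], "clf__C": [0.1, 1.0, 10.0]}
selected = GridSearchCV(pipe, param_grid=grid, cv=3)
selected.fit(X, y)

print(plain.best_params_, round(plain.best_score_, 3))
print(selected.best_params_, round(selected.best_score_, 3))
```

Comparing `best_score_` between the two searches gives one way to judge whether feature selection helps a given classifier.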

The .npy files store the data after pre-processing so that later notebooks can load it without repeating the cleaning steps.
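Saving and reloading an array in this way can be sketched with NumPy as follows. The file name `X_clean.npy` and the array contents are hypothetical, and a temporary directory is used so the example leaves no files behind.

```python
import os
import tempfile

import numpy as np

# Hypothetical pre-processed feature matrix.
X_clean = np.array([[0.0, 0.5], [1.0, 0.25]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "X_clean.npy")
    np.save(path, X_clean)    # write the array in NumPy's binary format
    X_loaded = np.load(path)  # read it back into memory
```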

Languages

  • Jupyter Notebook 99.7%
  • Python 0.3%