In this project, our group aimed to:
- Understand the current situation of the Houston Rockets basketball team by mining and analyzing structured historical player statistics and unstructured text data from various social media platforms, and
- Provide actionable, data-driven recommendations derived from our machine learning models to improve team performance and build a championship-winning team.

There are three main folders in this project:
- Player Performance Prediction
  - Eric_Gordon.xlsx - Data for Eric Gordon
  - Player_Performance_Prediction.ipynb - Notebook for player performance prediction
- Reddit Sentiment Analysis and Play - by - Play Prediction
  - NBA_Analytics.ipynb - Notebook for predicting the outcome of a game
  - Reddit Public Sentiment - Scraping and Sentiment Analysis.ipynb - Notebook for web-scraping Reddit posts
  - WebScraping_PlaybyPlay.ipynb - Notebook for web-scraping basketball-reference.com play-by-play data
  - master_datasheet.xlsx - Dataset of all team and player information
  - nba_games_data_sentimentanalysis_weighted.csv - Dataset with Twitter sentiment scores
  - play-by-play-2022-04-10.json - Dataset for play-by-play information
  - reddit_scrape_posts.json - Dataset for Reddit posts
  - reddit_sentiment.csv - Dataset for Reddit sentiment dates
  - reddit_sentiment_scores.json - Dataset for Reddit sentiment scores
- Twitter Sentiment Analysis and Topic Modelling
  - NBA Reporter Twitter Data Mining.ipynb - Notebook for web-scraping Twitter reporters
  - NBA Reporters.xlsx - Dataset for Twitter reporters
  - Twitter_Data_Pre_Processing_and_Topic_Modelling.ipynb - Notebook for Twitter sentiment scores and topic modelling
  - nba_games.xlsx - Dataset for NBA games
  - nba_games_data_sentimentanalysis_weighted.csv - Dataset for NBA games Twitter sentiment scores

Our team used both Google Colab and Jupyter Notebook for development, so the file paths in the notebooks may not match your environment. Each notebook's datasets are in the same folder as the notebook; please edit the file paths accordingly before loading them.
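
One possible way to avoid editing paths by hand is to resolve the data folder once at the top of a notebook. The sketch below is not taken from the project's notebooks: the helper names `in_colab`, `resolve_data_dir`, and `dataset_path` are hypothetical, the Colab folder is a placeholder you would replace with wherever you uploaded or mounted the project, and the check for `google.colab` in `sys.modules` is a common heuristic rather than an official API.

```python
import sys
from pathlib import Path
from typing import Optional

# Placeholder: replace with the folder where the project lives in your Colab session.
COLAB_DIR = "/content/drive/MyDrive/<your-project-folder>"


def in_colab() -> bool:
    """Heuristic: Colab kernels normally have the google.colab package loaded."""
    return "google.colab" in sys.modules


def resolve_data_dir(colab_dir: str = COLAB_DIR,
                     local_dir: str = ".",
                     colab: Optional[bool] = None) -> Path:
    """Return the folder that holds the datasets for the current environment.

    Pass colab=True or colab=False to override the automatic detection.
    """
    use_colab = in_colab() if colab is None else colab
    return Path(colab_dir) if use_colab else Path(local_dir)


def dataset_path(filename: str, **kwargs) -> Path:
    """Build the full path to a dataset stored next to the notebook."""
    return resolve_data_dir(**kwargs) / filename
```

Assuming a notebook loads its data with pandas, a call such as `pd.read_excel(dataset_path("Eric_Gordon.xlsx"))` would then work unchanged in a local Jupyter session, and in Colab once `COLAB_DIR` points at the uploaded folder.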