Twitter sentiment project comparing naive Bayes, neural networks, and other learning algorithms.
Data science project for analyzing Twitter sentiment using scikit-learn. Uses Python, JavaScript, SQLite, and Jupyter Notebook.
With a corpus of more than a million tweets labeled for sentiment, I used scikit-learn to train several models, each combining a different set of features with a learning algorithm.
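As a rough illustration of what "combining various features and learning algorithms" can look like in scikit-learn, here is a minimal sketch. The tiny corpus, the two feature extractors, and the two algorithms are assumptions chosen for brevity; they are not necessarily the ones used in this project (logistic regression in particular is only a stand-in).

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up stand-in for the real labeled corpus (1 = positive, 0 = negative).
texts = [
    "i love this phone", "what a great day", "best concert ever",
    "so happy right now", "i hate waiting in line", "worst service ever",
    "this movie was terrible", "so sad and tired",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Assumed feature extractors: plain word counts vs. tf-idf over unigrams and bigrams.
feature_extractors = {
    "unigram counts": lambda: CountVectorizer(),
    "unigram+bigram tf-idf": lambda: TfidfVectorizer(ngram_range=(1, 2)),
}

# Assumed learning algorithms.
algorithms = {
    "naive Bayes": lambda: MultinomialNB(),
    "logistic regression": lambda: LogisticRegression(max_iter=1000),
}

# Train one pipeline per (features, algorithm) pair.
models = {}
for feature_name, make_features in feature_extractors.items():
    for algorithm_name, make_algorithm in algorithms.items():
        model = make_pipeline(make_features(), make_algorithm())
        model.fit(texts, labels)
        models[(feature_name, algorithm_name)] = model

# Ask every trained model about the same unseen sentence.
for key, model in models.items():
    print(key, model.predict(["i love this"])[0])
```

With a real corpus you would hold out a test split and compare accuracy across the pairs instead of eyeballing a single prediction.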
The Twitter API is used to fetch fresh tweets, which are then classified. A frontend lets users search for the latest tweets and select the features and algorithm they prefer.
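The flow of picking features and an algorithm by name and then classifying freshly fetched tweets might be sketched as follows. Everything here is hypothetical: fetch_latest_tweets is a stub that returns canned text instead of calling the Twitter API, and the option names, training data, and classify_fresh function are invented for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import BernoulliNB, MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical option names a frontend could offer.
FEATURES = {"counts": CountVectorizer, "tfidf": TfidfVectorizer}
ALGORITHMS = {"multinomial_nb": MultinomialNB, "bernoulli_nb": BernoulliNB}

# Made-up training data (1 = positive, 0 = negative).
TRAIN_TEXTS = [
    "i love this", "great day today", "so happy right now",
    "i hate this", "terrible day today", "so sad right now",
]
TRAIN_LABELS = [1, 1, 1, 0, 0, 0]


def fetch_latest_tweets(query):
    """Hypothetical stand-in for a Twitter API search; returns canned text."""
    return [f"i love {query}", f"i hate {query}"]


def classify_fresh(query, feature="counts", algorithm="multinomial_nb"):
    """Train the chosen feature/algorithm pair, then label fetched tweets."""
    model = make_pipeline(FEATURES[feature](), ALGORITHMS[algorithm]())
    model.fit(TRAIN_TEXTS, TRAIN_LABELS)
    tweets = fetch_latest_tweets(query)
    return list(zip(tweets, model.predict(tweets).tolist()))


results = classify_fresh("python", feature="tfidf", algorithm="bernoulli_nb")
for tweet, label in results:
    print(label, tweet)
```

A real service would train the models once ahead of time and load them per request; retraining inside the function just keeps the sketch short.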
This was made as a final project for a data science course I took in the fall of 2014. It was temporarily deployed to Heroku.
I was obsessed with machine learning for a couple of years after taking Andrew Ng's course. He is still my hero.
But machine learning is just a tool. Like any technology, what is more important is the problem you are trying to solve.
This is something I didn't understand when I was poring over graduate papers looking for clever ways to classify the sentiment of 140-character messages.
It was a great way to play with feature selection and algorithms. However, the purpose was shortsighted. I was basically asking, "Computer, can you use matrix math to tell me if these random sentences are good or bad?" The answer was, "Yes, if you force me to."
The next step for this project would be to introduce clause-structure features and add an alert system to notify users.