This project processes Twitter data to classify tweets based on sentiment. Using Python, it calculates positive, negative, and net scores for each tweet based on predefined word lists. The script then outputs a CSV file with these scores, along with the number of retweets and replies for each tweet.
- Reads Twitter data from `project_twitter_data.csv`.
- Cleans tweets by removing punctuation for accurate word matching.
- Uses `positive_words.txt` and `negative_words.txt` to identify sentiment scores for tweets.
- Outputs sentiment scores as:
  - Positive Score: Count of positive words.
  - Negative Score: Count of negative words.
  - Net Score: Positive score minus negative score.
- Outputs processed data to `resulting_data.csv`, including:
  - Number of retweets.
  - Number of replies.
  - Positive, negative, and net sentiment scores.
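
The cleaning and scoring steps above can be sketched roughly as follows. This is a minimal illustration, not necessarily how `sentiment_classifier.py` does it: the function names (`strip_punctuation`, `count_matches`, `score_tweet`) and the sample word sets are hypothetical, and lower-casing words before matching is an assumption.

```python
import string


def strip_punctuation(word):
    """Remove every ASCII punctuation character from a word."""
    return word.translate(str.maketrans("", "", string.punctuation))


def count_matches(text, vocabulary):
    """Count the words in `text` that appear in `vocabulary`.

    Words are split on whitespace, stripped of punctuation, and
    lower-cased (the lower-casing is an assumption of this sketch).
    """
    words = (strip_punctuation(w).lower() for w in text.split())
    return sum(1 for w in words if w in vocabulary)


def score_tweet(text, positive_words, negative_words):
    """Return (positive score, negative score, net score) for one tweet."""
    positive = count_matches(text, positive_words)
    negative = count_matches(text, negative_words)
    return positive, negative, positive - negative


# Hypothetical word sets standing in for the real word lists.
positive_words = {"good", "great", "love"}
negative_words = {"bad", "awful", "hate"}

tweet = "I LOVE this, it's great! But the ending was bad..."
print(score_tweet(tweet, positive_words, negative_words))  # (2, 1, 1)
```

Removing punctuation before the lookup is what lets `great!` and `bad...` match their word-list entries.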
- `project_twitter_data.csv`: A CSV file containing columns for tweets, number of retweets, and number of replies.
- `positive_words.txt`: A list of predefined positive words.
- `negative_words.txt`: A list of predefined negative words.
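
One way to load a word list into a set for fast lookups is sketched below. The file layout is an assumption (one word per line, with blank lines and lines starting with `;` treated as comments), and `load_words` is a hypothetical helper name, so adjust it to the actual files.

```python
def load_words(path):
    """Read a word list into a set.

    Assumed format: one word per line; blank lines and lines that
    start with ';' are skipped. Words are lower-cased.
    """
    with open(path, encoding="utf-8") as handle:
        return {
            line.strip().lower()
            for line in handle
            if line.strip() and not line.startswith(";")
        }
```

A call such as `load_words("assets/positive_words.txt")` would then return the positive vocabulary as a set.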
- `resulting_data.csv`: A CSV file containing:
  - Number of retweets.
  - Number of replies.
  - Positive score.
  - Negative score.
  - Net score.
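
Writing rows in this shape could look roughly like the sketch below, which uses Python's standard `csv` module. The header labels, the `write_results` helper, and the sample rows are assumptions made for illustration; the real script may name or order its columns differently.

```python
import csv
import io

# Assumed column labels, in the order listed above.
HEADER = [
    "Number of Retweets",
    "Number of Replies",
    "Positive Score",
    "Negative Score",
    "Net Score",
]


def write_results(rows, out_file):
    """Write one CSV line per (retweets, replies, positive, negative) tuple."""
    writer = csv.writer(out_file)
    writer.writerow(HEADER)
    for retweets, replies, positive, negative in rows:
        # Net score is the positive score minus the negative score.
        writer.writerow([retweets, replies, positive, negative, positive - negative])


# Demo with made-up rows, written to an in-memory buffer instead of a file.
buffer = io.StringIO()
write_results([(3, 0, 2, 1), (1, 4, 0, 2)], buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open("resulting_data.csv", "w", newline="")`; the `newline=""` argument is what the `csv` module documentation recommends to avoid extra blank lines on some platforms.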
- Clone the repository and place the required files (`project_twitter_data.csv`, `positive_words.txt`, and `negative_words.txt`) in the `assets` directory.
- Run the script to process the data: `python sentiment_classifier.py`
- View the results in the `resulting_data.csv` file.