Sentiment Analysis COVID 19 Twitter

System Flowchart

[Figure: system flowchart]

Dataset

https://www.kaggle.com/datasets/dionisiusdh/covid19-indonesian-twitter-sentiment

Method

Preprocessing

  • Case Folding
  • Tokenizing
  • Filtering
  • Word Handling
  • Stemming
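The five preprocessing steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: "Filtering" is read as stopword removal, "Word Handling" as mapping informal spellings to standard forms, and the word lists and suffix rules below are tiny hypothetical stand-ins, not the resources this project uses. A real pipeline would likely rely on a dedicated Indonesian stemmer instead of the naive suffix stripper shown here.

```python
import re

# Hypothetical, deliberately tiny resources for illustration only.
STOPWORDS = {"yang", "di", "dan", "ini", "itu"}
SLANG = {"gk": "tidak", "ga": "tidak", "bgt": "banget"}
SUFFIXES = ("nya", "kan", "lah")  # naive stand-in for a real stemmer


def case_fold(text):
    """Case folding: lowercase, then blank out URLs, mentions, and hashtags."""
    return re.sub(r"https?://\S+|[@#]\w+", " ", text.lower())


def tokenize(text):
    """Tokenizing: keep runs of ASCII letters only."""
    return re.findall(r"[a-z]+", text)


def filter_stopwords(tokens):
    """Filtering (assumed meaning): drop stopwords."""
    return [t for t in tokens if t not in STOPWORDS]


def handle_words(tokens):
    """Word handling (assumed meaning): normalize informal spellings."""
    return [SLANG.get(t, t) for t in tokens]


def stem(token):
    """Stemming: strip one known suffix if at least 4 letters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Apply the steps in the order listed in this README."""
    tokens = tokenize(case_fold(text))
    tokens = handle_words(filter_stopwords(tokens))
    return [stem(t) for t in tokens]


print(preprocess("Vaksinnya gk ada di sini! https://t.co/abc @user #covid"))
# ['vaksin', 'tidak', 'ada', 'sini']
```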

Feature Selection

  • TF-IDF
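As a rough illustration of what TF-IDF computes, the sketch below weights each term by its count in a document times a smoothed inverse document frequency, then L2-normalizes each row. The smoothing and normalization are one common variant chosen here as an assumption; the README does not say which variant or library the project uses. The function name `tfidf` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocab, rows); rows[i][j] is the weight of vocab[j] in docs[i]."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(df)
    # Smoothed IDF: terms found in every document still get weight 1.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for doc in docs:
        tf = Counter(doc)
        row = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(w * w for w in row)) or 1.0
        rows.append([w / norm for w in row])  # L2-normalize each document
    return vocab, rows


docs = [["vaksin", "aman"], ["vaksin", "covid"]]  # toy tokenized tweets
vocab, rows = tfidf(docs)
print(vocab)  # ['aman', 'covid', 'vaksin']
```

In the first toy document, "aman" ends up with a higher weight than "vaksin" because "vaksin" occurs in both documents and is therefore less distinctive.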

Classification

  • Logistic Regression
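For readers who want to see what a three-class logistic regression does internally, here is a minimal pure-Python sketch of multinomial (softmax) logistic regression trained with batch gradient descent. It is not the project's implementation; the function names, learning rate, and epoch count are assumptions. In practice a library implementation such as scikit-learn's `LogisticRegression` would normally be preferred, though the README does not state which one is used.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(W, b, x):
    """Linear score per class: dot(W[k], x) + b[k]."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def train(X, y, n_classes, lr=0.5, epochs=200):
    """Fit weights by minimizing average cross-entropy with gradient descent."""
    n_features = len(X[0])
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        grad_W = [[0.0] * n_features for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, label in zip(X, y):
            probs = softmax(class_scores(W, b, x))
            for k in range(n_classes):
                # Gradient of cross-entropy w.r.t. the score of class k.
                err = probs[k] - (1.0 if k == label else 0.0)
                grad_b[k] += err
                for j, v in enumerate(x):
                    grad_W[k][j] += err * v
        for k in range(n_classes):
            b[k] -= lr * grad_b[k] / len(X)
            for j in range(n_features):
                W[k][j] -= lr * grad_W[k][j] / len(X)
    return W, b


def predict(W, b, x):
    """Return the index of the highest-scoring class."""
    s = class_scores(W, b, x)
    return max(range(len(s)), key=s.__getitem__)
```

The inputs `X` would be TF-IDF rows and `y` integer class indices (for example 0, 1, 2 for the three sentiment classes).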

Handling Imbalance

  • Undersampling
  • Oversampling
  • SMOTE
  • Cost-Sensitive Learning
  • Bagging
  • Tomek Links
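The two simplest techniques in this list, random undersampling and random oversampling, can be sketched without any library. The helper below is hypothetical and only illustrates the idea: it either trims every class to the size of the smallest one or duplicates minority samples up to the size of the largest one. SMOTE and Tomek links need distances in feature space and are not covered by this sketch.

```python
import random
from collections import Counter, defaultdict


def resample(samples, labels, mode="over", seed=0):
    """Randomly over- or undersample so all classes have equal counts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    sizes = [len(items) for items in by_class.values()]
    target = max(sizes) if mode == "over" else min(sizes)

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        if len(items) >= target:
            # Undersampling: draw without replacement.
            chosen = rng.sample(items, target)
        else:
            # Oversampling: keep originals, add random duplicates.
            chosen = items + rng.choices(items, k=target - len(items))
        out_samples.extend(chosen)
        out_labels.extend([label] * target)
    return out_samples, out_labels


labels = ["pos"] * 5 + ["neg"] * 3 + ["neu"] * 1  # made-up toy labels
samples = list(range(len(labels)))
_, over = resample(samples, labels, mode="over")
_, under = resample(samples, labels, mode="under")
print(Counter(over), Counter(under))
```

Resampling is normally applied to the training split only, so that the evaluation data keeps the original class distribution.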

Data Exploration

Class

Positive = 23521

Negative = 20055

Neutral = 9383

[Figure: total number of sentiments per class]
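The class counts above show why imbalance handling is needed: the largest class is about 2.5 times the size of the smallest. The short calculation below derives class shares and a common "balanced" weighting heuristic (total samples divided by the number of classes times the class count), which is one way to set per-class costs for cost-sensitive learning. Whether the project uses exactly this weighting is an assumption.

```python
# Class counts taken from the figures listed above.
counts = {"positive": 23521, "negative": 20055, "neutral": 9383}

total = sum(counts.values())
shares = {c: n / total for c, n in counts.items()}

# "Balanced" heuristic: rarer classes receive proportionally larger weights.
weights = {c: total / (len(counts) * n) for c, n in counts.items()}

# Ratio between the largest and the smallest class.
imbalance_ratio = max(counts.values()) / min(counts.values())

print(total)  # 52959
for c in counts:
    print(f"{c}: share={shares[c]:.3f} weight={weights[c]:.3f}")
print(f"imbalance ratio: {imbalance_ratio:.2f}")
```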

Top Features

  • Positive Class: [Figure: top features, positive class]
  • Negative Class: [Figure: top features, negative class]
  • Neutral Class: [Figure: top features, neutral class]

Word Cloud

  • Positive Class: [Figure: word cloud, positive class]
  • Negative Class: [Figure: word cloud, negative class]
  • Neutral Class: [Figure: word cloud, neutral class]

Evaluation

Accuracy

[Figure: model accuracy comparison]

[Figure: model accuracy comparison (2)]

Classification Report

[Figure: classification report]
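For reference, the quantities behind an accuracy score and a classification report can be computed as in the sketch below. The labels in the usage example are made up for illustration and are not results from this project. Accuracy alone can be misleading on imbalanced data, which is why per-class precision, recall, and F1 are reported as well.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def classification_report(y_true, y_pred):
    """Per-class precision, recall, F1, and support as a dict of dicts."""
    true_n, pred_n, tp = Counter(y_true), Counter(y_pred), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1  # correctly predicted instance of class t
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        precision = tp[label] / pred_n[label] if pred_n[label] else 0.0
        recall = tp[label] / true_n[label] if true_n[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": true_n[label],
        }
    return report


# Made-up toy labels, not project results.
y_true = ["pos", "pos", "neg", "neu"]
y_pred = ["pos", "neg", "neg", "neu"]
print(accuracy(y_true, y_pred))  # 0.75
print(classification_report(y_true, y_pred)["pos"])
```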