Sentiment Analysis using NLTK

Training data set: 31,961 tweets

Test data set: 17,197 tweets

Text Processing

Convert the text to lower case
Remove URLs
Remove special characters
Tokenization
Lemmatization
Vectorization
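
The steps above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the code from the repository: the function names (preprocess, bag_of_words), the regular expressions, and the sample tweet are assumptions made for the example. Lemmatization is left out because it needs an external resource (for instance NLTK's WordNet data), and a plain word count stands in for whatever vectorizer the project actually uses.

```python
import re
from collections import Counter

# Hypothetical patterns; the project's own rules may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def preprocess(tweet):
    """Lower-case a tweet, strip URLs and special characters, and tokenize."""
    text = tweet.lower()                 # convert to lower case
    text = URL_RE.sub(" ", text)         # remove URLs
    text = NON_ALPHA_RE.sub(" ", text)   # remove digits, punctuation, symbols
    return text.split()                  # whitespace tokenization


def bag_of_words(tokens):
    """Stand-in vectorization: map each token to its count."""
    return Counter(tokens)


if __name__ == "__main__":
    tokens = preprocess("Loving the new phone!!! https://example.com #happy")
    print(tokens)
    print(bag_of_words(tokens))
```

In a fuller pipeline, a lemmatizer would run on the token list before counting, so that inflected forms of a word share one feature.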

Models

1. Logistic Regression model

Accuracy: 95%
Precision: 90%
Recall: 31%

2. XGBoost model

Accuracy: 94%
Precision: 86%
Recall: 16%
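
Both models pair high accuracy with low recall. To make clear how the three figures relate, here is a small sketch that computes them from true and predicted labels. The function name classification_metrics and the ten sample labels are made up for illustration; they are not the project's data and do not reproduce the numbers above.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


if __name__ == "__main__":
    # Hypothetical labels: 4 positive tweets, 6 negative tweets.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    # The classifier flags only one of the four positives.
    y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    print(classification_metrics(y_true, y_pred))
```

In this toy case every flagged tweet is correct (precision 1.0), yet three of the four positives are missed (recall 0.25), while accuracy still comes out at 0.7 because the negatives dominate. A similar effect may be at work in the results above if the positive class is rare, though the class balance is not stated here.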

These are our findings:

Used:
GitHub: https://github.com/lumindak/sentiment_analysis