Sentiment analysis can be used to understand customers' opinions and concerns and to collect their feedback. With our tool, companies can identify which key actions by competitors that have already entered the Metaverse drew positive or negative sentiment.
The project contains Python files and Jupyter notebooks, organized in folders named after the branch of machine learning techniques they cover.
- Classification/ contains the source code used to compare the classifiers, as well as the .pkl file of the final classifier.
- Clustering/ contains the source code of the clustering techniques used for topic modeling.
- SentimentAnalysis/ contains the source code of the techniques used to perform sentiment analysis on the tweets.
- TopicModeling/ contains the source code of the techniques used for topic modeling.
- data/ contains the dataset used during the different steps, as well as some of the results obtained.
The data used in this study was gathered using Twitter's API and the snscrape tool.
Botometer was used for bot detection: tweets whose bot probability was above a certain threshold were removed.
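The bot-filtering step can be sketched as follows. This is a minimal illustration, not the project's code: it assumes bot probabilities have already been obtained (for example from Botometer) and stored per user, and the names `BOT_THRESHOLD` and `filter_bot_tweets`, the cutoff value, and the sample data are all hypothetical.

```python
# Hypothetical cutoff; the threshold actually used in the project is not stated here.
BOT_THRESHOLD = 0.8


def filter_bot_tweets(tweets, bot_scores, threshold=BOT_THRESHOLD):
    """Keep tweets whose author's bot probability is at or below the threshold.

    Authors without a score are kept (an assumption made for this sketch).
    """
    return [t for t in tweets if bot_scores.get(t["user"], 0.0) <= threshold]


# Invented sample data, for illustration only.
tweets = [
    {"user": "alice", "text": "Excited about the Metaverse launch!"},
    {"user": "spam_bot", "text": "Buy Metaverse land now!!!"},
    {"user": "bob", "text": "Not convinced by the Metaverse yet."},
]
bot_scores = {"alice": 0.05, "spam_bot": 0.97, "bob": 0.30}

kept = filter_bot_tweets(tweets, bot_scores)
print([t["user"] for t in kept])
```

Lowering the threshold removes more borderline accounts at the cost of discarding some tweets written by humans.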
A pipeline of NLP steps was then applied:
- Data Cleaning
- Tokenization
- Normalization
- Filtering
- Stemming
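The five steps above can be sketched with the standard library alone. This is only an illustration of the order of the steps: the stop-word list, the regular expressions, and the crude suffix-stripping `stem` function are assumptions made for this example, and a real pipeline would more likely rely on an NLP library with a proper stemmer such as Porter's.

```python
import re

# Tiny illustrative stop-word list (assumption); real pipelines use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of", "in", "and", "it"}


def clean(text):
    """Data cleaning: drop URLs and @mentions, keep hashtag words without '#'."""
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"@\w+", " ", text)
    text = text.replace("#", " ")
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Tokenization: split into alphabetic tokens."""
    return re.findall(r"[A-Za-z']+", text)


def normalize(tokens):
    """Normalization: lowercase and strip stray apostrophes."""
    return [t.lower().strip("'") for t in tokens]


def filter_tokens(tokens):
    """Filtering: remove stop words and one-character tokens."""
    return [t for t in tokens if len(t) > 1 and t not in STOP_WORDS]


def stem(token):
    """Stemming: naive suffix stripping, a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run the five steps in order and return the stemmed tokens."""
    tokens = filter_tokens(normalize(tokenize(clean(text))))
    return [stem(t) for t in tokens]


print(preprocess("Loving the new #Metaverse worlds! Check https://example.com @meta"))
# ['lov', 'new', 'metaverse', 'world', 'check']
```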
The models were combined using an ensemble learning approach.
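The exact ensemble method is not specified here. As one common possibility, hard majority voting over per-model labels could look like the sketch below; the function name `majority_vote`, the label names, and the sample predictions are invented for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists by hard majority voting.

    `predictions` holds one list of labels per model, all of the same length.
    On a tie, the label that was seen first (earliest model) wins.
    """
    combined = []
    for labels in zip(*predictions):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Invented predictions from three hypothetical sentiment models.
model_a = ["pos", "neg", "neu", "pos"]
model_b = ["pos", "neg", "pos", "neg"]
model_c = ["neg", "neg", "neu", "pos"]

print(majority_vote([model_a, model_b, model_c]))
# ['pos', 'neg', 'neu', 'pos']
```

Other ensemble strategies, such as averaging predicted probabilities or stacking, follow the same idea of letting several models decide together.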
The tweets posted on days that showed strongly positive or strongly negative sentiment were analyzed using topic modeling and clustering techniques. The results were compared with well-known news about the Metaverse from those days, showing that the algorithm can detect both the sentiment and the reason behind it.
More information can be found in the presentation and documentation files.
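Selecting the days to analyze can be sketched as follows, assuming each tweet already has a sentiment score in the range [-1, 1]. The cutoff value, the function names, and the sample scores and dates are made up for this example and are not taken from the project.

```python
from collections import defaultdict
from statistics import mean


def daily_mean_sentiment(scored_tweets):
    """Average the sentiment scores of all tweets posted on the same day."""
    by_day = defaultdict(list)
    for day, score in scored_tweets:
        by_day[day].append(score)
    return {day: mean(scores) for day, scores in by_day.items()}


def extreme_days(daily, cutoff=0.5):
    """Return (positive_days, negative_days) whose mean score passes the cutoff.

    The cutoff of 0.5 is a hypothetical value chosen for illustration.
    """
    positive = sorted(d for d, s in daily.items() if s >= cutoff)
    negative = sorted(d for d, s in daily.items() if s <= -cutoff)
    return positive, negative


# Invented (day, score) pairs, for illustration only.
scored = [
    ("2022-01-01", 0.8), ("2022-01-01", 0.6),
    ("2022-01-02", 0.1), ("2022-01-02", -0.1),
    ("2022-01-03", -0.7), ("2022-01-03", -0.5),
]

positive_days, negative_days = extreme_days(daily_mean_sentiment(scored))
print(positive_days, negative_days)
```

The tweets from the returned days would then be passed to the topic modeling and clustering steps.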