I’ve been taking a deep dive into real-time data, specifically working with AWS Kinesis Streams. The PowerPoint below is the result of combining my curiosity about sentiment analysis with Node.js and AWS Kinesis. Hope you enjoy.
A little about the data and calculating sentiment.
The data consists exclusively of real-time, English-only tweets from the Twitter Streaming API. From what I observed, about 300-400 tweets arrive per second. I used sentiment analysis to estimate how a person was feeling based on what they tweeted. A sentiment score of 0, for example, indicates indifference. A score of -1 means sad, -5 mad, and -10 pissed; +1 means OK, +5 happy, and +10 ecstatic. The graph takes the X tweets received in each second and plots the average of their calculated sentiment scores (a naive implementation).
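As a rough illustration of the naive per-second averaging described above, here is a minimal sketch in JavaScript. It is not the code behind the graph: the function name `averageSentimentPerSecond` and the input shape (`timestampMs`, `score`) are assumptions made for this example, and the scores are assumed to have been calculated already by whatever sentiment library is in use.

```javascript
// Minimal sketch: group already-scored tweets into one-second buckets
// and take the plain mean of the scores in each bucket.
// Hypothetical input shape: { timestampMs: number, score: number }
function averageSentimentPerSecond(scoredTweets) {
  const buckets = new Map(); // second -> { sum, count }

  for (const { timestampMs, score } of scoredTweets) {
    // Tweets with the same whole second land in the same bucket.
    const second = Math.floor(timestampMs / 1000);
    const bucket = buckets.get(second) || { sum: 0, count: 0 };
    bucket.sum += score;
    bucket.count += 1;
    buckets.set(second, bucket);
  }

  // One data point per second, in chronological order.
  return [...buckets.entries()]
    .sort(([a], [b]) => a - b)
    .map(([second, { sum, count }]) => ({
      second,
      average: sum / count,
      count,
    }));
}

// Usage with made-up sample data: two tweets in second 0, one in second 1.
const points = averageSentimentPerSecond([
  { timestampMs: 0, score: 1 },
  { timestampMs: 500, score: -5 },
  { timestampMs: 1200, score: 10 },
]);
console.log(points);
```

A plain mean like this treats every tweet equally, so a handful of extreme scores in a quiet second can swing the point a lot, which is one reason to call it naive.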
I’m thinking of moving to Kafka soon to support higher rates of ingestion. Yes, Kinesis does support this, but it’s not free. I love that I don’t have to worry much about the infrastructure while using Kinesis, but sadly, it’s money I don’t have. I’m also looking into changing the algorithm used to calculate the aggregate sentiment score.