How computers understand human emotions with sentiment analysis (Memairy AI Part 2)


This article is part 2 of a series on how Memairy uses artificial intelligence to create the best online journal app experience. We recommend that you first read “What is Artificial Intelligence? And what does AI have to do with an online diary?” (Memairy AI Part 1).

In this post, we will explore how Natural Language Processing (NLP) is implemented by Memairy to create a superior and more fulfilling journaling experience. NLP is used in the following three major ways:
  1. Sentiment Analysis
  2. Named Entity Recognition (NER)
  3. Topic Modeling
Let’s go into sentiment analysis and briefly describe how it can enhance your journal entries.

What is Sentiment Analysis?

Sentiment analysis is a technique used to interpret and classify text by tagging it with a certain sentiment, usually positive, negative, or neutral. Sentiments refer to the emotions that the author wishes to express in a particular piece of writing. Sentiments are inherently subjective. Thus, they may depend on the cultural background, values, and beliefs of a person. You and I may not always agree on the sentiment of the same text. Nonetheless, most people will generally consider emotions like happiness, excitement, and amusement as positive and emotions like sadness, fear, and anger as negative.

Humans have evolved to be very adept at inferring the emotional state of others. This ability must have been critical for survival. As a species, we are relatively weak physically compared to other similarly sized animals. We lack formidable claws, fangs, strength, or speed. Nevertheless, we have been highly successful, in large part due to our intelligence, our ability to cooperate, and, just as important, our ability to assess the willingness of others to cooperate with us. The price of failing to understand the emotions of others could be death.

caveman with computer

Today, sentiment analysis is as important as ever, but on a much greater scale than the small-group interactions of our prehistory. Many uses for sentiment analysis are now possible thanks to the vast amounts of text data available, from online reviews to social media posts. Businesses often use it to better understand their customers’ feelings from product reviews, survey responses, and social media. Sentiment analysis enables businesses to learn what their customers do and do not like, so they can adapt to better meet their customers’ needs.

Memairy uses this technique to determine the sentiment of each diary entry to provide an assessment of the user’s feelings. By adding a more complete context to the entry, a user can better understand and process their emotions. The insights of Sentiment Analysis arm the user with a powerful tool for introspection and self-reflection.

Humans may instinctively understand language, but only computers have the processing power to analyze millions of pieces of text quickly. For computers to do automated sentiment analysis well, they must be trained to comprehend how language can express human feeling.

How Sentiment Analysis Works

For computers to perform the complicated task of understanding sentiment, as with many similar artificial intelligence goals, a model is required. Sentiment analysis uses a classification model that determines the degree to which text expresses a sentiment. A model in this context means a set of mathematical operations that takes inputs, often called ‘features’ in the jargon of data science, and transforms them into a prediction.

In a greatly simplified procedure to build a model, data is first collected and tagged with the outcomes that we want the machine to learn to predict. Learning from tagged examples in this way is what you may have heard referred to as ‘machine learning’, a branch of artificial intelligence.

The next step is to extract features from the data. These features are the inputs that are used by machine learning algorithms to identify patterns that are key in determining the output.

Then, there are many sophisticated techniques used to determine the mathematical operations needed for the model to best turn the inputs into the desired output; in the case of sentiment analysis, the output is the sentiment of text.
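
The three steps above (tag the data, extract features, fit the model) can be sketched in a few lines of Python. This is only a toy illustration, not Memairy’s actual model: the training sentences are made up, the features are simple word counts, and the “sophisticated technique” is swapped for a basic Naive Bayes classifier with add-one smoothing. All function names here are hypothetical.

```python
import math
from collections import Counter

# Step 1: hypothetical tagged training data (text, sentiment label).
TRAINING_DATA = [
    ("i had a wonderful happy day", "positive"),
    ("what a great and exciting trip", "positive"),
    ("i feel sad and afraid today", "negative"),
    ("a terrible angry argument", "negative"),
]


def extract_features(text):
    """Step 2: turn raw text into features (here, lowercase word counts)."""
    return Counter(text.lower().split())


def train(examples):
    """Step 3: learn per-label word counts from the tagged examples."""
    word_counts = {}          # label -> Counter of words seen with that label
    label_counts = Counter()  # label -> number of training examples
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(extract_features(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def predict(model, text):
    """Return the label with the highest Naive Bayes log-probability."""
    word_counts, label_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_log_prob = None, -math.inf
    for label in label_counts:
        log_prob = math.log(label_counts[label] / total_examples)
        total_words = sum(word_counts[label].values())
        for word, count in extract_features(text).items():
            # Add-one smoothing keeps unseen words from zeroing out a label.
            p = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            log_prob += count * math.log(p)
        if log_prob > best_log_prob:
            best_label, best_log_prob = label, log_prob
    return best_label


model = train(TRAINING_DATA)
print(predict(model, "a happy and exciting day"))
print(predict(model, "i feel angry and sad"))
```

A real system would train on far more data and use richer features, but the shape of the process is the same: tagged examples go in, and a function that maps new text to a predicted sentiment comes out.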

machine learning model training diagram

Now equipped with a sentiment analysis model, we can provide the model with an arbitrary new text and get, as an output, a prediction of the sentiment of that text. The results of sentiment analysis typically have two outputs:

  • Score - the overall emotion of a document from -1.0 (most negative) to +1.0 (most positive).
  • Magnitude - a non-negative value that measures how much emotional content is present in the document; it will generally be greater for longer documents.
These outputs are categorized to be easily understood by the user.

Sentiment            Sample Values
Clearly Positive     “score”: 0.8, “magnitude”: 5.0
Moderately Positive  “score”: 0.2, “magnitude”: 6.0
Neutral              “score”: 0.0, “magnitude”: 0.0
Mixed                “score”: 0.0, “magnitude”: 8.0
Moderately Negative  “score”: -0.3, “magnitude”: 7.0
Clearly Negative     “score”: -0.7, “magnitude”: 4.0

“Neutral” documents have little emotional content, with score and magnitude both around 0.0, whereas “mixed” documents contain both positive and negative emotions that cancel each other out, resulting in a score around 0.0 but a higher magnitude.
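
To see why a mixed document can end up with a score near 0.0 and a high magnitude, here is a minimal sketch. It assumes, purely for illustration, that the document score is the average of per-sentence scores and that the magnitude is the sum of their absolute values. The category thresholds (`neutral_band`, `clear_band`, `min_magnitude`) are also hypothetical, chosen only so that the sample values in the table above map to their labels.

```python
def document_sentiment(sentence_scores):
    """Combine per-sentence scores into a (score, magnitude) pair.

    Assumption: score is the mean, magnitude is the sum of absolute values.
    """
    if not sentence_scores:
        return 0.0, 0.0
    # Positive and negative sentences cancel each other out in the average.
    score = sum(sentence_scores) / len(sentence_scores)
    # Every emotional sentence adds to the magnitude, whatever its sign.
    magnitude = sum(abs(s) for s in sentence_scores)
    return score, magnitude


def categorize(score, magnitude, neutral_band=0.1, clear_band=0.5, min_magnitude=1.0):
    """Map a (score, magnitude) pair to a readable label (hypothetical thresholds)."""
    if abs(score) < neutral_band:
        return "Mixed" if magnitude >= min_magnitude else "Neutral"
    strength = "Clearly" if abs(score) >= clear_band else "Moderately"
    polarity = "Positive" if score > 0 else "Negative"
    return f"{strength} {polarity}"


# Strongly positive and strongly negative sentences in the same document.
mixed_scores = [0.9, -0.9, 0.8, -0.8]
print(categorize(*document_sentiment(mixed_scores)))

# Sentences with no emotional content at all.
print(categorize(*document_sentiment([0.0, 0.0])))
```

Because the magnitude only ever grows as sentences are added, this sketch also shows why longer documents tend to have larger magnitudes.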

To make this more concrete, we analyzed the sentiment of each entry in Anne Frank’s diary. The entry with the most negative sentiment (score -0.5, magnitude 3.50) is dated MONDAY, JULY 19, 1943:

Dearest Kitty, North Amsterdam was very heavily bombed on Sunday. There was apparently a great deal of destruction. Entire streets are in ruins, and it will take a while for them to dig out all the bodies. So far there have been two hundred dead and countless wounded; the hospitals are bursting at the seams. We've been told of children searching forlornly in the smoldering ruins for their dead parents. It still makes me shiver to think of the dull, distant drone that signified the approaching destruction.
On the opposite end of the spectrum, the entry with the most positive sentiment (score 0.80, magnitude 3.50) is dated FRIDAY, JUNE 12, 1942:
I hope I will be able to confide everything to you, as I have never been able to confide in anyone, and I hope you will be a great source of comfort and support. COMMENT ADDED BY ANNE ON SEPTEMBER 28, 1942: So far you truly have been a great source of comfort to me, and so has Kitty, whom I now write to regularly. This way of keeping a diary is much nicer, and now I can hardly wait for those moments when I'm able to write in you. Oh, I'm so glad I brought you along!
Explore the sentiment analysis for all her diary entries.

Sentiment analysis is one of the more challenging tasks in Natural Language Processing. Even we humans sometimes misinterpret someone’s email or text. I’m sure that this has happened to all of us: a coworker misreads an email as rude, or a friend misunderstands a text. Add to this the way many social media messages mix text with emojis, and the interpretation of sentiment becomes even more complicated.

Let’s take a closer look at some of the major challenges a machine faces in understanding text and how sentiment analysis overcomes those challenges.

Emojis

To state the obvious, emojis are pictorial representations of facial expressions, objects, and locations, among other things. Using emojis can help to communicate emotions, reducing the chance that they will be misinterpreted.

text message misunderstood without emoji

One can imagine the difficulties that a computer faces when trying to understand such nuance in text. The cliché is true: “A picture is worth a thousand words”. The most sophisticated sentiment analysis incorporates the dense emotional information contained in emojis.
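
As a rough illustration of how emojis can carry the sentiment that the words alone leave ambiguous, the sketch below scores a message using only its emojis. The `EMOJI_SCORES` table and its values are invented for this example; a real system would use a much larger, carefully labelled set and combine it with the text itself.

```python
# Hypothetical emoji sentiment table: emoji -> score in [-1.0, +1.0].
EMOJI_SCORES = {
    "😊": 0.8,
    "😂": 0.6,
    "😢": -0.7,
    "😡": -0.9,
}


def emoji_score(text):
    """Average the scores of any known emojis found in the text."""
    scores = [EMOJI_SCORES[ch] for ch in text if ch in EMOJI_SCORES]
    if not scores:
        return 0.0  # No known emoji, so no emoji signal either way.
    return sum(scores) / len(scores)


# The same words read very differently depending on the emoji.
for message in ["You're late again 😂", "You're late again 😡", "You're late again"]:
    print(message, "->", emoji_score(message))
```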

Fact or Opinion

In all classification problems, including sentiment analysis, defining the categories is a critical component. We need to tag the training data with what we want to model, which demands a good definition of the categories. The model will only be as accurate as our tagging of the training data. In the case of sentiment analysis, we want to identify text as neutral, positive, or negative. So, how do we define what is neutral?

Sentences can be divided into either fact or opinion. Fact sentences do not contain emotional sentiments whereas opinion sentences do. For example, consider the following two sentences:

  • FACT: The ball is red.
  • OPINION: The ball is beautiful.
I think that you will agree with me that the first statement is neutral and that the word ‘beautiful’ in the second statement conveys a positive sentiment. Fact sentences are labeled as having a neutral sentiment.

Degree of Positive and Negative Sentiments

Now that we have a definition of what neutral sentiment is, we need to define positive and negative sentiment. This is primarily done using a collection of special dictionaries. These dictionaries are compilations of scores, manually assigned by people, that rank words such as ‘good’ and ‘great’, giving ‘great’ the higher positive sentiment. Sentiment analysis looks up words and phrases in these dictionaries to assign the degree to which they are positive or negative.
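
A dictionary lookup of this kind can be sketched as follows. The `SENTIMENT_LEXICON` below is a tiny, made-up stand-in for a real sentiment dictionary, and the scores are illustrative only. A sentence containing no dictionary words, like the fact sentence from the previous section, comes out neutral.

```python
# Hypothetical sentiment dictionary: word -> score in [-1.0, +1.0].
SENTIMENT_LEXICON = {
    "good": 0.5,
    "great": 0.8,
    "beautiful": 0.7,
    "bad": -0.5,
    "terrible": -0.8,
    "dirty": -0.4,
}


def tokenize(sentence):
    """Split on whitespace, lowercase, and strip surrounding punctuation."""
    return [word.strip(".,!?;:\"'").lower() for word in sentence.split()]


def sentence_score(sentence):
    """Average the dictionary scores of the words found in the sentence."""
    scores = [SENTIMENT_LEXICON[w] for w in tokenize(sentence) if w in SENTIMENT_LEXICON]
    if not scores:
        return 0.0  # No sentiment words: treat the sentence as neutral.
    return sum(scores) / len(scores)


print(sentence_score("The ball is red."))        # fact: no sentiment words
print(sentence_score("The ball is beautiful."))  # opinion: positive word
print(sentence_score("It was a good day."), sentence_score("It was a great day."))
```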

Context

The context of a sentence, meaning the order of its phrases and the surrounding words and sentences, is critical for understanding its meaning. These two sentences have very different meanings.

  1. The beautiful dog played with the dirty ball.
  2. The dirty dog played with the beautiful ball.
Each sentence has both positive (beautiful) and negative (dirty) sentiments, but they differ in whether the positive sentiment applies to the dog or the ball. To properly understand a sentence, it must be broken down into its component parts of speech, such as nouns, adjectives, verbs, and adverbs. Effective sentiment analysis uses the proximity of words to other positive or negative words to identify the sentiment toward specific entities like the dog and ball in the above sentences.

A major component of this is the identification of entities in the text, such as ‘dog’ and ‘ball’ in the simple sentences above. The natural language processing technique called Named Entity Recognition (NER) is critical here; however, that is a topic for another post.
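
The proximity idea can be sketched with the two dog-and-ball sentences. In this simplified example, the entities are handed to the function as a fixed set (standing in for what an NER step would provide), the sentiment scores are made up, and each sentiment word is simply attributed to the closest entity by word distance. Real systems rely on proper grammatical parsing rather than raw distance.

```python
# Hypothetical inputs: sentiment scores and entities an NER step might supply.
SENTIMENT_WORDS = {"beautiful": 0.7, "dirty": -0.4}
ENTITIES = {"dog", "ball"}


def entity_sentiment(sentence):
    """Attribute each sentiment word to the nearest entity in the sentence."""
    tokens = [word.strip(".,!?").lower() for word in sentence.split()]
    entity_positions = [(i, t) for i, t in enumerate(tokens) if t in ENTITIES]
    totals = {entity: 0.0 for _, entity in entity_positions}
    for i, token in enumerate(tokens):
        if token in SENTIMENT_WORDS and entity_positions:
            # Pick the entity with the smallest word distance to this token.
            _, nearest = min(entity_positions, key=lambda pos: abs(pos[0] - i))
            totals[nearest] += SENTIMENT_WORDS[token]
    return totals


print(entity_sentiment("The beautiful dog played with the dirty ball."))
print(entity_sentiment("The dirty dog played with the beautiful ball."))
```

The same words produce opposite per-entity results in the two sentences, which is exactly the distinction a single document-level score would miss.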

The Memairy AI-powered online diary mascot

What’s Next

Our goal with Memairy is to help people process emotions and experiences, deepen self-awareness, and improve their mental well-being. Read more about how Memairy uses artificial intelligence to create the best online journal app experience.
  1. What is Artificial Intelligence? And what does AI have to do with an online diary? (Memairy AI Part 1)
  2. How computers understand human emotions with sentiment analysis (Memairy AI Part 2)
  3. Named Entity Recognition (coming soon)
  4. Computer Vision (coming soon)