Natural language processing (NLP) is a field within artificial intelligence (AI) that seeks to process and analyze textual data in order to enable machines to understand human language. NLP is widely used in applications from search engines to personal assistants such as Siri* and Alexa*. However, the deployment of NLP applications in commercial environments faces major challenges – from scaling across different domains to amassing large amounts of annotated training data – as we discussed in a recent blog. One of the major goals of the Intel AI lab is to address those challenges. That’s why we have introduced an Aspect-Based Sentiment Analysis (ABSA) algorithm that enables fast and robust deployment across different domains. The algorithm was released as part of the NLP Architect open source library version 0.4 in April 2019.
Sentiment Analysis – Overview
Sentiment Analysis (SA) is the task of detecting subjective information in text. Business organizations widely use SA to understand customer opinions on products and services and to react accordingly. Most academic and industry research focuses on sentence-level SA, which generates a single sentiment score (positive/negative) per sentence. While this approach achieves high accuracy, it has a major drawback: many sentences include more than one aspect (sentiment target), each with its own sentiment polarity. Consider, for example, the following restaurant review sentence, which expresses positive polarity towards the ‘food’ aspect and negative polarity towards the ‘service’ aspect:
“The food was tasty but the service was poor”
In cases like this, a single sentiment score per sentence is too coarse to convey all the sentiment information the sentence expresses.
Aspect-Based Sentiment Analysis (ABSA)
ABSA is the task of extracting aspect terms and their related sentiment polarity. Because ABSA is fine-grained, organizations can use it to monitor the ratio of positive to negative sentiment expressed towards specific aspects of a product or service and to extract valuable, targeted insight.
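To make the contrast with sentence-level SA concrete, here is a minimal Python sketch built on the restaurant sentence above. The `AspectSentiment` class and both helper functions are hypothetical names chosen for illustration, and the aspect/opinion pairs are written by hand rather than produced by any model. Summing polarities is only a crude stand-in for a single sentence-level score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AspectSentiment:
    """One (aspect, opinion, polarity) triple; a hypothetical container."""
    aspect: str    # the sentiment target, e.g. "food"
    opinion: str   # the opinion term attached to it, e.g. "tasty"
    polarity: int  # +1 for positive, -1 for negative


# Hand-written annotation of the example sentence (not model output).
sentence = "The food was tasty but the service was poor"
aspect_level = [
    AspectSentiment("food", "tasty", +1),
    AspectSentiment("service", "poor", -1),
]


def sentence_level_score(pairs):
    """Collapse every polarity into one number (crude sentence-level stand-in)."""
    return sum(p.polarity for p in pairs)


def by_aspect(pairs):
    """Keep a separate polarity for each aspect, as ABSA output would."""
    return {p.aspect: p.polarity for p in pairs}


print(sentence_level_score(aspect_level))  # 0
print(by_aspect(aspect_level))             # {'food': 1, 'service': -1}
```

The single score cancels out to zero, while the per-aspect view keeps the praise for the food and the complaint about the service apart.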
The ABSA Challenge
A major challenge of ABSA is the domain sensitivity of both the opinion terms and the aspect terms. Aspects within the same domain usually share close semantic similarity, while aspects from two different domains usually have a large semantic distance between them. Here is a sample of aspect lexicons from two different domains, restaurant reviews and hotel reviews:
In addition to aspect terms, opinion terms are also domain-sensitive. An opinion term conveying positive sentiment polarity in one domain (e.g. “delicate” in movie reviews) may convey negative polarity in another domain (e.g. “delicate” in cell phone reviews).
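One simple way to picture this domain sensitivity is an opinion lexicon keyed by domain. The sketch below is an assumption made for illustration only: the domain names, the `"POS"`/`"NEG"` labels, and the `polarity` helper are hypothetical and do not reflect NLP Architect's lexicon format.

```python
# Hypothetical per-domain opinion lexicons; entries follow the "delicate" example.
OPINION_LEXICONS = {
    "movie_reviews": {"delicate": "POS"},
    "cell_phone_reviews": {"delicate": "NEG"},
}


def polarity(term, domain):
    """Return the polarity of an opinion term in a domain, or None if unknown."""
    return OPINION_LEXICONS.get(domain, {}).get(term.lower())


print(polarity("delicate", "movie_reviews"))       # POS
print(polarity("delicate", "cell_phone_reviews"))  # NEG
print(polarity("delicate", "hotel_reviews"))       # None
```

A lexicon built for one domain gives no answer, or the wrong one, when it is reused in another. This is why the lexicons need to be adapted to each target domain.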
Supervised learning algorithms handle this domain sensitivity well when labeled data from the target domain is available for training. However, generating labeled data is a labor-intensive and costly effort that requires human expertise. An alternative to supervised ABSA is lightly-supervised ABSA. This approach requires no labeled training data, which makes deployment faster, cheaper, and therefore more practical.
Intel AI Lab Lightly-Supervised ABSA
The Intel AI Lab has developed a lightly-supervised ABSA solution that was released as part of the NLP Architect open source library version 0.4 in April 2019. This solution enables a wide variety of users to generate a detailed sentiment report.
The solution flow is divided into two phases, training and inference:
- The training phase takes as input a collection of unlabeled text documents from the target domain and outputs domain-specific opinion lexicons and aspect lexicons. The user can edit the domain-specific lexicons, which is what makes this a lightly-supervised approach. For more details regarding the training phase, see NLP Architect’s documentation.
- The inference phase takes as input the opinion and aspect lexicons produced in the training phase, along with an unseen inference dataset from the target domain. It then generates a detailed report on the amount of positive and negative sentiment towards each aspect of the product/service in that dataset. Following is a screenshot of such a report, generated for an inference dataset containing restaurant reviews:
The green/red bars represent the amount of positive/negative sentiment detected in the inference dataset towards each aspect. When the user clicks the bar for a specific aspect, the solution displays a list of sentences from the inference dataset that contain positive and negative sentiment towards that aspect.
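The inference phase and the report can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not NLP Architect's implementation: the lexicons are tiny hand-written stand-ins for the training output, all function names are hypothetical, and each aspect is simply paired with the nearest opinion term in the sentence. See the library's documentation for its actual method.

```python
import re
from collections import defaultdict

# Tiny hand-written stand-ins for the lexicons produced by the training phase.
ASPECT_LEXICON = {"food", "service"}
OPINION_LEXICON = {"tasty": "POS", "great": "POS", "poor": "NEG", "slow": "NEG"}


def tokenize(sentence):
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def analyze(sentence):
    """Pair each aspect term with the polarity of the nearest opinion term.

    The nearest-term rule is an assumption made for this sketch.
    """
    tokens = tokenize(sentence)
    opinions = [(i, OPINION_LEXICON[t]) for i, t in enumerate(tokens)
                if t in OPINION_LEXICON]
    results = []
    for i, token in enumerate(tokens):
        if token in ASPECT_LEXICON and opinions:
            _, pol = min(opinions, key=lambda o: abs(o[0] - i))
            results.append((token, pol))
    return results


def build_report(sentences):
    """Count POS/NEG mentions per aspect and keep the sentences for drill-down."""
    counts = defaultdict(lambda: {"POS": 0, "NEG": 0})
    examples = defaultdict(lambda: {"POS": [], "NEG": []})
    for sentence in sentences:
        for aspect, pol in analyze(sentence):
            counts[aspect][pol] += 1
            examples[aspect][pol].append(sentence)
    return dict(counts), dict(examples)


reviews = [
    "The food was tasty but the service was poor",
    "Great food",
    "The service was slow",
]
counts, examples = build_report(reviews)
print(counts)                      # bar heights per aspect
print(examples["service"]["NEG"])  # drill-down for one bar
```

Here `counts` plays the role of the bar heights in the report, and `examples` holds the sentences a user would see after clicking a bar.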
We introduced an Aspect-Based Sentiment Analysis algorithm and solution with the following advantages for commercial use:
- Domain adapted
- No need for labeled training data
- Explainable AI: aspect-level visualization with drill-down capabilities