Irfan Shuttari, Veritas: Machine Learning in eDiscovery Upstream: Sentiment Analysis & Classification

Extract from Irfan Shuttari’s article “Machine Learning in eDiscovery Upstream: Sentiment Analysis & Classification”

With the global data sphere expected to reach 175 zettabytes by 2025, it has never been more important for organizations to understand their data as early as possible. It is important not only to apply traditional “right-side” EDRM phases like Analysis and Review further upstream, but also to fully leverage technology to tame “eDiscovery in the wild”.

Two machine learning tools that can be applied further upstream in eDiscovery are sentiment analysis and automated (auto) classification. Let’s look at these two technical approaches and how they help move eDiscovery further upstream to address today’s Big Data challenges.

What is Sentiment Analysis?

Sentiment analysis (also known as opinion mining or emotion AI) is the use of natural language processing (NLP) to identify and extract subjective information from source materials. It involves determining the attitude, sentiment, or emotion of a subject based on their spoken or written content. Sentiment analysis can be applied to a wide variety of data, and it can provide an organization with a wealth of information about its customers, products, brand, and even employees.
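
To make the idea concrete, here is a minimal sketch of one simple way sentiment can be extracted from text: counting words from a small lexicon. The word lists and the function name `classify_sentiment` are hypothetical and chosen purely for illustration. They are not taken from the article or from any particular product, and production systems typically rely on far larger lexicons or trained NLP models.

```python
import re

# Hypothetical toy lexicons, for illustration only.
POSITIVE = {"good", "great", "happy", "pleased", "excellent", "thanks"}
NEGATIVE = {"bad", "angry", "upset", "terrible", "worried", "unacceptable"}


def classify_sentiment(text: str) -> str:
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Score = number of positive words minus number of negative words.
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for message in [
        "Great work, thanks!",
        "This delay is unacceptable and I am upset.",
        "Meeting at noon.",
    ]:
        print(classify_sentiment(message), "|", message)
```

A word-counting approach like this ignores negation and sarcasm ("not good" would still score as positive), which is one reason real deployments tend to use trained models instead.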

The simplest form of sentiment analysis is binary classification (positive or negative), but it can also involve more nuanced classifications such as “neutral”, “happy” or “sad”. After the sentiment is determined, the results are interpreted to generate insights. This might involve identifying trends over time, comparing sentiment between different demographic groups and so forth.
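
The interpretation step described above can be sketched as a simple aggregation: once each document has a sentiment label, the labels are grouped by a period (or any other grouping key, such as a department) and turned into shares. The function name `sentiment_share_by_period` and the sample records below are assumptions made for illustration. They are not real data and not part of the original article.

```python
from collections import Counter, defaultdict


def sentiment_share_by_period(records):
    """Turn (period, label) pairs into {period: {label: share of that period}}."""
    counts = defaultdict(Counter)
    for period, label in records:
        counts[period][label] += 1
    # Convert raw counts into fractions of each period's total.
    return {
        period: {label: n / sum(c.values()) for label, n in c.items()}
        for period, c in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical labeled messages: (month, sentiment label).
    sample = [
        ("2023-01", "positive"),
        ("2023-01", "negative"),
        ("2023-02", "negative"),
        ("2023-02", "negative"),
    ]
    for period, shares in sorted(sentiment_share_by_period(sample).items()):
        print(period, shares)
```

Replacing the month with another key, for example a custodian or business unit, gives the kind of between-group comparison mentioned above.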
