The latest video in our explainer series takes a closer look at one of AYLIEN News API's core features: entity extraction.

Millions of news articles are published every day, but only a small fraction of them are about the entities you care about, such as the companies you deal with or their directors. So how do we identify which articles are relevant to these entities? The answer is entity extraction.

But what is entity extraction? Let’s take a standard news article. Chances are it mentions certain people, some companies or locations, or perhaps some products. We call these things entities, or named entities.

The goal of entity extraction is to accurately identify these entities, so that you or your application can quickly identify all the entities that are associated with a document.

Once applied to millions of articles, as we do in our News API, you can quickly and accurately identify the documents you should care about.

Now that we’ve defined what entity extraction means, I’ll quickly walk you through some of the key steps in the entity extraction process:

  • Entity disambiguation: Some entity names are ambiguous. For instance, the word “apple” could refer to the fruit or to the company Apple Inc., and the intended meaning has to be inferred from the context. For example, if the article is about technology companies, there’s a higher chance that “apple” refers to the company. We call this step entity disambiguation.
  • Entity prominence: Once you’ve identified the entities mentioned in an article, you will most likely want to get a sense of how central those entities are to the article: are they just passing mentions, or is the document predominantly about them? This distinction is made using entity prominence detection.
  • Entity-level sentiment: To determine whether a given entity is referred to in a positive or a negative tone, we perform entity-level sentiment analysis, which analyses the tone of the sentences that mention that entity.
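
To make the three steps above more concrete, here is a minimal Python sketch. It is a toy illustration only and not how AYLIEN's models work: the cue-word lists, the sentiment word lists, and the function names (disambiguate, prominence, entity_sentiment) are all assumptions made up for this example. It picks a sense by counting context words, measures prominence as the share of sentences that mention the entity, and averages a simple word-based score over the sentences that mention it.

```python
import re

# Hypothetical cue words for each sense of an ambiguous name (illustration only).
CANDIDATES = {
    "apple": {
        "Apple Inc.": {"iphone", "technology", "company", "shares", "ceo"},
        "apple (fruit)": {"fruit", "orchard", "pie", "juice", "tree"},
    }
}

# Hypothetical, very small sentiment lexicons (illustration only).
POSITIVE = {"record", "strong", "growth", "praised"}
NEGATIVE = {"lawsuit", "decline", "recall", "criticised"}


def tokenize(text):
    """Lowercase the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def disambiguate(name, text):
    """Pick the sense whose cue words overlap most with the words in the text."""
    tokens = set(tokenize(text))
    senses = CANDIDATES[name.lower()]
    return max(senses, key=lambda sense: len(senses[sense] & tokens))


def prominence(name, text):
    """Share of sentences that mention the name (0.0 = absent, 1.0 = every sentence)."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    hits = sum(name.lower() in tokenize(s) for s in sentences)
    return hits / len(sentences)


def entity_sentiment(name, text):
    """Average (positive words - negative words) over sentences mentioning the name."""
    scores = []
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        if name.lower() in tokens:
            positive = sum(t in POSITIVE for t in tokens)
            negative = sum(t in NEGATIVE for t in tokens)
            scores.append(positive - negative)
    return sum(scores) / len(scores) if scores else 0.0


article = (
    "Apple reported record iPhone sales. "
    "The technology company praised its suppliers. "
    "Analysts expect a decline in the wider market."
)

print(disambiguate("apple", article))      # sense chosen from context words
print(prominence("apple", article))        # share of sentences mentioning "apple"
print(entity_sentiment("apple", article))  # tone of the sentences mentioning "apple"
```

Note that the negative word "decline" in the last sentence does not affect the score for "apple", because that sentence does not mention the entity. That is the point of scoring sentiment at the entity level rather than for the article as a whole.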

AYLIEN’s neural-network-based entity extraction models can currently identify up to 5.6 million entities in news articles, with very high accuracy, across several languages. Additionally, we provide individual and aggregate sentiment scores for each entity in each article.
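
As a rough illustration of the difference between individual and aggregate sentiment scores, the sketch below groups per-mention scores by entity and averages them. The data shape, the example scores, and the use of a plain mean are assumptions for illustration; they do not represent the News API response format or the way AYLIEN computes its aggregate scores.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-mention sentiment scores for one article (illustration only).
mentions = [
    {"entity": "Apple Inc.", "sentiment": 0.6},
    {"entity": "Apple Inc.", "sentiment": -0.2},
    {"entity": "Tim Cook", "sentiment": 0.4},
]


def aggregate_sentiment(mentions):
    """Group individual mention scores by entity and return the mean per entity."""
    by_entity = defaultdict(list)
    for mention in mentions:
        by_entity[mention["entity"]].append(mention["sentiment"])
    return {entity: mean(scores) for entity, scores in by_entity.items()}


for entity, score in aggregate_sentiment(mentions).items():
    print(f"{entity}: {score:.2f}")
```

A plain mean treats every mention equally; a real system might weight mentions differently, for example by where they appear in the article.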


Why not try AYLIEN News API for yourself by signing up for a 14-day free trial?
