In our biggest feature rollout to date, we’ve brought a whole new level of discovery capabilities to solutions built with our News API.

  • Multilingual NLP: Further language support for global news coverage in 14 languages.
  • Translated Content: Our neural machine translation system translates every non-English article into English.

Together, Multilingual NLP and Translated Content can be used in the News API to:

  • Provide an English translation of news from across the globe, making more regions accessible.
  • Analyze a larger number of languages (up to 14, previously 5).
  • Provide multilingual search capabilities, making news events more discoverable.

For your business, these additions widen the reach of your worldwide news coverage, add value to each article collected, and increase the volume of stories processed.


Why is Multilingual Content Important?

A common problem with Media Intelligence solutions is that organizations are often forced to focus their analysis on English-language publications. This is usually due to:

  1. Limited access to content.
  2. Substandard analysis and search capabilities on multilingual content.
  3. Lack of a multilingual workforce.

We’ve improved the News API’s ability to ingest and analyze content on a global scale by adding full NLP support for 14 languages, including Turkish, Arabic, and Chinese.

This makes it easy to detect and fully understand stories and events in different geographies. We also identify and track the long tail of events that may not be covered by English-speaking publications.

For example, an event can break locally in French and only be picked up by an English-language publisher well over an hour later.

 

Working with NLP and Translated Content

Our News API sources content in different languages from across the globe. This content is analyzed and served through the API, with both original and English versions included.

Multilingual NLP and Translated Content require an Advanced or Enterprise license key. Start a free trial to get your API credentials or contact sales to upgrade your account.

Translations

The title and body of each story sourced are translated into English.

You can find the translations in the “translations” field of story objects.

{
  "title": "original title",
  "body": "original body",
  "language": "", 

  "<...>": "<...>",

  "translations" : {
    "en": {
      "body": "translated body",
      "title": "translated title"
    }
  }
}
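
As a rough sketch of how a client might read these fields, the helper below returns the English title and body of a story and falls back to the original text when no translation is present. It assumes the story has already been parsed into a plain Python dict with the shape shown above; the name `english_text` is hypothetical and is not part of the News API SDK.

```python
def english_text(story):
    """Return (title, body) in English for a story dict.

    Hypothetical helper: assumes the layout shown above, where machine
    translations live under story["translations"]["en"].
    """
    # Use the translation when present; otherwise fall back to the original text.
    translation = (story.get("translations") or {}).get("en") or {}
    title = translation.get("title") or story.get("title", "")
    body = translation.get("body") or story.get("body", "")
    return title, body


story = {
    "title": "original title",
    "body": "original body",
    "translations": {
        "en": {"title": "translated title", "body": "translated body"}
    },
}
print(english_text(story))
```

The fallback is a design choice for this sketch only: it keeps calling code simple when a story carries no "translations" entry.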

Searching multilingual content

There are three ways to search multilingual content in our News API.

1 - Search within the original text, regardless of language:

"title" : "" 

2 - Search the English text, whether that is the original article or the machine-translated version:

"translations.en.title": ""

3 - Search over a specific combination of original text and translated text. This can be useful for searching for proper nouns in a given language while specifying additional keywords in English:

"title": "",
"translations.en.title": ""

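The three modes differ only in which fields carry the query. The sketch below builds the keyword arguments for each mode; it assumes the argument names `title` and `translations_en_title` that the Python snippets later in this post pass to `list_stories`. The helper `build_title_search` is hypothetical and not an SDK function.

```python
def build_title_search(original=None, english=None):
    """Build keyword arguments for a title search.

    Hypothetical helper. The keys mirror the `title` and
    `translations_en_title` arguments used in this post's snippets.
    """
    if original is None and english is None:
        raise ValueError("provide at least one search term")
    params = {}
    if original is not None:
        # Mode 1: match against the original-language title.
        params["title"] = original
    if english is not None:
        # Mode 2: match against the English title.
        params["translations_en_title"] = english
    # Mode 3 is simply both keys set at once.
    return params


print(build_title_search(original="Путин"))
print(build_title_search(english="investors"))
print(build_title_search(original="Путин", english="businesses"))
```

The resulting dict could then be unpacked into a call such as `api_instance.list_stories(**params)`, alongside any language or date filters.
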
Code snippets

The code snippets below demonstrate the three search methods listed above.

In the first example, we perform a search for mentions of "Putin" (Путин), covering Russian language articles published in the previous month.

from __future__ import print_function
import aylien_news_api
from aylien_news_api.rest import ApiException
from pprint import pprint
configuration = aylien_news_api.Configuration()

# Configure API key authorization: app_id
configuration.api_key['X-AYLIEN-NewsAPI-Application-ID'] = 'YOUR_APP_ID'

# Configure API key authorization: app_key
configuration.api_key['X-AYLIEN-NewsAPI-Application-Key'] = 'YOUR_APP_KEY'
configuration.host = "https://api.aylien.com/news"

# Create an instance of the API class
api_instance = aylien_news_api.DefaultApi(aylien_news_api.ApiClient(configuration))

"""
Searches Russian stories for a query term in that language ("Путин", i.e. "Putin").
"""
response = api_instance.list_stories(
    title='Путин',
    language=['ru'],
    published_at_start='NOW-1MONTH/DAY',
    published_at_end='NOW/DAY',
    per_page=3
)

for item in response.stories:
    print(item.title)
    print(item.translations.en.title)

The next snippet searches Russian content for stories that mention "Putin" in the title and contain the word "businesses" in the translated article body.

"""
Searches Russian stories for "Путин" ("Putin") in the original title and for the English term "businesses" in the translated body.
"""
response = api_instance.list_stories(
    title='Путин',
    translations_en_body='businesses',
    language=['ru'],
    published_at_start='NOW-1MONTH/DAY',
    published_at_end='NOW/DAY',
    per_page=3
)

for item in response.stories:
    print(item.title)
    print(item.translations.en.body)

The final example is a search across all languages for articles with "investors" in the title, published within the previous month.

"""
Searches across all languages for a search term in English.
"""
response = api_instance.list_stories(
    title='investors',
    translations_en_title='investors',
    published_at_start='NOW-1MONTH/DAY',
    published_at_end='NOW/DAY',
    per_page=10
)

for item in response.stories:
    if item.language != 'en':
        print(item.language)
        print(item.title)
        print(item.translations.en.title)
    else:
        print(item.title)

All additional languages benefit from the analysis features supported by the News API, including Classification, Entity Extraction, Concept Extraction, and Sentiment.

Learn more about getting started with Multilingual content in our documentation.

Ready to get started?

These advanced discovery and investigation features are available in our Advanced and Enterprise packages, which you can try free for 14 days by signing up for our free trial.

Are you an existing News API user? Please contact your account manager by filling out our contact sales form.

Start your Free Trial
