We collect close to 1.2 M news articles every day and tag every single article with more than 26 metadata and enrichment tags using our proprietary ML and NLP models. This creates an insight-rich News Dataset for our customers. The tags and enrichments we add include timestamps, publisher information, entities mentioned, sentiment, and of course topical category tags.
The topical tags and enrichments we add to every article prove particularly useful for our customers looking to search for and surface relevant news articles in the apps or models they build. Our current categorization features are used by more than 70% of our customers as search filters or downstream tags.
We’re always striving to ensure our customers get the most out of our News API. Through countless customer conversations, and through our product team’s analysis of feature adoption and completeness, we identified a few categorization capabilities that could greatly enhance the effectiveness of our category tags.
Characteristics of an effective tagging system
- Precision: Accuracy of tags added to articles
- Coverage: The breadth and depth of the potential tags added to articles
- Relevance: How useful the tags and filters are for our customer use cases
With all of these in mind, we set about rolling out a more effective categorization approach: one that gives our customers more reliable and more relevant topical and industry-level tags that can be used to build targeted searches with confidence.
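To make the first two characteristics above more concrete, here is a minimal sketch of how precision and coverage might be measured on a small hand-labelled sample. It is purely illustrative and is not how Smart Tagger itself is evaluated: the function names, the article IDs, and the tag names are all hypothetical, and "coverage" is approximated here as the share of articles that receive at least one tag, which is only one possible proxy for breadth and depth.

```python
from typing import Dict, Iterable, Set


def tag_precision(predicted: Dict[str, Set[str]], gold: Dict[str, Set[str]]) -> float:
    """Share of assigned tags that also appear in the manually assigned (gold) tags."""
    assigned = sum(len(tags) for tags in predicted.values())
    if assigned == 0:
        return 0.0
    correct = sum(
        len(tags & gold.get(article_id, set()))
        for article_id, tags in predicted.items()
    )
    return correct / assigned


def article_coverage(predicted: Dict[str, Set[str]], article_ids: Iterable[str]) -> float:
    """Share of articles that received at least one tag (a simple coverage proxy)."""
    ids = list(article_ids)
    if not ids:
        return 0.0
    tagged = sum(1 for article_id in ids if predicted.get(article_id))
    return tagged / len(ids)


# Hypothetical sample: three articles with made-up tag names.
predicted_tags = {
    "a1": {"finance", "mergers"},
    "a2": {"product-recall"},
    "a3": set(),  # the tagger assigned nothing to this article
}
gold_tags = {
    "a1": {"finance"},
    "a2": {"product-recall"},
    "a3": {"esg"},
}

print("precision:", round(tag_precision(predicted_tags, gold_tags), 2))
print("coverage:", round(article_coverage(predicted_tags, gold_tags.keys()), 2))
```

Relevance, the third characteristic, depends on the customer use case and is harder to reduce to a single number, so it is left out of this sketch.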
Introducing Smart Tagger
Smart Tagger leverages state-of-the-art classification models that have been built using a vast collection of manually tagged news articles based on domain-specific industry and topical taxonomies. Smart Tagger uses a highly effective rule-based classification system for identifying categorical and industry-related news content.
The Smart Tagger update greatly enhances how our users can search for, tag, and group news articles via the News API. You can read more about using Smart Tagger in our Getting Started blog or in our Documentation.
As part of the Smart Tagger update, we’re introducing two new classification taxonomies: the AYLIEN Industry taxonomy and the AYLIEN Category taxonomy, which incorporates two curated category groupings, Adverse Events and Trading Impact Events.
You can explore the industry and category taxonomies here, or you can check out our blog on using both categorization features here.
News API categorization features
To date, we’ve offered news classification features based on two industry-standard taxonomies: IPTC News Codes, which provides topical tagging capabilities useful in the news and publishing space, and IAB-QAG, which provides a mix of industry and topical tags better suited to the digital advertising domain. The Smart Tagger taxonomies, on the other hand, are significantly more useful to our customer base, as they’ve been built specifically for the market analysis and financial services domains.
Comparing Smart Tagger, IPTC and IAB-QAG
Smart Tagger in Action
Below we’ve set out some simple examples of category and industry queries that highlight how accurate and effective Smart Tagger enrichments can be. Each example focuses on a key News API use case and compares the granularity and accuracy of the AYLIEN Category and Industry tags with our standard News API tags, which use the IPTC and IAB taxonomies.
Discovering market events of interest in the SaaS and Financial Services space
Identifying sales signals based on business impact events
Monitoring impactful stories that may affect an organization’s reputation
Tracking emerging risk events such as product recall announcements
Tracking ESG-related news in particular industries
You can read our how-to blog to see how easy it is to get up and running with Smart Tagger, or request access via the link below.