Following on from our successful launch of the enriched Coronavirus News Dataset, we've just made another three datasets available for download.
We built these datasets to give you a snapshot of the data that you can expect to retrieve from our News API, giving insight into the structure, quality of content and NLP enrichment we provide to our API users.
We've created datasets made up of tens of thousands of news articles collected from hundreds of trusted sources covering Natural Disasters, Financial Crimes and the Nasdaq 100.
Natural Disasters Dataset
Download a collection of news articles relating to natural disasters over an eight-month period.
Dataset details:
- Size: 0.5 GB (~34,000 news articles)
- Language: English content only
- Timeframe: May 2019 - Dec 2019
- Sources: 141 distinct sources
Financial Crimes Dataset
Investigate financial crime events reported in the media and the entities connected to them.
Dataset details:
- Size: 0.3 GB (~20,000 news articles)
- Language: English content only
- Timeframe: May 2019 - Dec 2019
- Sources: 147 distinct sources
Nasdaq 100 Dataset
Analyze news articles related to the Nasdaq 100 companies in the pre-COVID-19 world.
Dataset details:
- Size: 0.7 GB (~38,000 news articles)
- Language: English content only
- Timeframe: Nov 2019
- Sources: 107 distinct sources
The Data Format
Each dataset downloads as GZIP-compressed files. Once uncompressed, the files are in JSONL format, where each line holds one story object. Learn more about AYLIEN's story object here.
The Nasdaq 100 dataset includes a separate file for each company. The Natural Disasters and Financial Crimes datasets consist of one file each.
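As a rough illustration of working with this format, the sketch below reads a GZIP-compressed JSONL file one story at a time using only the Python standard library. The function names and any file name you pass in are hypothetical, and the sketch makes no assumption about which fields a story object contains.

```python
import gzip
import json
from typing import Any, Dict, Iterator


def iter_stories(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one story object (as a dict) per non-empty line of a gzipped JSONL file."""
    # "rt" opens the compressed file in text mode, so each line is a str.
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines, if the file has any
                yield json.loads(line)


def count_stories(path: str) -> int:
    """Count the story objects in a file without loading them all into memory."""
    return sum(1 for _ in iter_stories(path))
```

Reading line by line keeps memory use low, which helps with files of this size. For example, `next(iter_stories("natural-disasters.jsonl.gz")).keys()` would list the fields of the first story, assuming a file with that (made-up) name exists locally.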
Who can use the data?
These datasets provide a useful resource for evaluating the data that can be extracted via our News API, but they cannot be used in commercial projects.
If you have a particular dataset in mind that you'd like to dive into, please get in touch with us and we'll see if we can build it for you. If you'd like to gather your own, we offer a free 14-day trial of the news intelligence platform that you can sign up for below.