RavenPack Technology

Our State-of-the-Art NLP Infrastructure Produces Analytics Used by Some of the Largest Financial Institutions Worldwide

How to analyze

300 million unstructured documents per month

Over the past 20 years, RavenPack has built one of the largest civilian infrastructures for natural language processing at scale. Having started with news analytics and expanded to larger sets of business-relevant data sources and larger universes of tracked entities, the RavenPack infrastructure now processes millions of documents every day. The service turns large volumes of unstructured text into structured insights augmented with analytics such as sentiment scores. Powered by proprietary technology, the infrastructure performs five key tasks:

Content Collection: Curate data from over 40,000 sources or from your own proprietary content
Text Extraction: Transform any document into a normalized textual format
Enrichment: Tag content with sentiment, entities, events, relevance and more
Self Service Data: Enable the selection and filtering of data to create custom datasets
Data Delivery: Make the data available via a self service data platform or real-time APIs
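
To make the five tasks concrete, here is a minimal sketch of how such stages could be chained into one pipeline. It is an illustration only and does not reflect RavenPack's proprietary implementation: every function name, the `Record` type, the sample sources, and the tiny word-list sentiment scorer are hypothetical stand-ins chosen for this example.

```python
import re
from dataclasses import dataclass, field

# Hypothetical toy lexicon; a real enrichment step would be far richer.
POSITIVE = {"beats", "growth", "record"}
NEGATIVE = {"misses", "loss", "recall"}


@dataclass
class Record:
    source: str
    raw: str
    text: str = ""
    tags: dict = field(default_factory=dict)


def collect(sources):
    """Content collection: wrap each raw document with its source name."""
    return [Record(source=name, raw=raw) for name, raw in sources.items()]


def extract_text(record):
    """Text extraction: strip markup and normalize whitespace."""
    without_tags = re.sub(r"<[^>]+>", " ", record.raw)
    record.text = " ".join(without_tags.split())
    return record


def enrich(record):
    """Enrichment: attach a naive sentiment score (assumed scoring rule)."""
    words = set(re.findall(r"[a-z]+", record.text.lower()))
    record.tags["sentiment"] = len(words & POSITIVE) - len(words & NEGATIVE)
    return record


def select(records, min_sentiment):
    """Self-service selection: keep records that pass a user-chosen filter."""
    return [r for r in records if r.tags["sentiment"] >= min_sentiment]


def deliver(records):
    """Data delivery: emit flat, structured rows ready for an API or file."""
    return [{"source": r.source, "text": r.text, **r.tags} for r in records]


def run_pipeline(sources, min_sentiment=0):
    records = [enrich(extract_text(r)) for r in collect(sources)]
    return deliver(select(records, min_sentiment))


# Hypothetical sample input: two short documents from made-up sources.
sample = {
    "wire-a": "<p>Acme   beats estimates on record growth</p>",
    "wire-b": "<p>Acme misses estimates, reports loss</p>",
}
rows = run_pipeline(sample, min_sentiment=1)
print(rows)
```

Keeping each stage as a separate function mirrors the separation of tasks described above: the filter in `select` can change without touching extraction or enrichment.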
The latest generation of our infrastructure:

RavenPack Edge

The latest generation of our infrastructure, Edge, is the outcome of over 5 years of technological research and development. Edge achieves an unparalleled breadth of coverage and depth of analytic insights.

Capable of processing up to 3 times as many documents, from over 40,000 sources, Edge produces analytics both in real time and across our deep historical archive. In addition, Edge tracks more than 12 million entities, a 25-fold increase over the prior-generation product. Edge also benefits from an enhanced event taxonomy and incorporates new technology both to detect more events and to augment each event match by extracting more information from the document. The net result is nearly 5 times the number of records produced on a daily basis.

The RavenPack Data-as-a-Service platform scales both vertically and horizontally to maintain sub-second latency for the majority of documents flowing through the system.

RavenPack Edge is powered by

Machine Learning

Machine learning can be a powerful technique, particularly when coupled with a large and accurate training set. RavenPack’s traditional event sentiment, applied to our 20+ year archive, provides one of the most comprehensive sets of tagged sentiment on English-language news available anywhere. Using this curated, high-quality archive, RavenPack has trained a novel model and applied it to Edge, generating high-quality sentiment for each sentence across the entire document archive.
