Futurist and data technology expert Bernard Marr notes that as much as 90% of the data generated daily is unstructured, and that the volume is growing at a rate of 55% to 65% a year. That represents mountains of data that went largely unused until recently.
But as organizations expand data analysis, unstructured data can contribute valuable context, filling in gaps left by strictly quantitative data analysis.
In this article, we’ll break down everything you need to know about unstructured data, from what it is to how to get meaning out of it. We’ll also show you how Nexis® Data as a Service is your one-stop-shop for all of your unstructured data needs. Let’s dive in.
If you’ve ever used Excel spreadsheets (and who hasn’t), then you’re already well-acquainted with structured data. Whether it’s company financials or personal recipes, the structure of a spreadsheet makes it easier to parse the data to uncover earnings trends or every recipe that uses broccoli.
Unstructured data is often overlooked, says Dave Hanson, resident expert on all things data and General Manager of the Data as a Service and Entity Due Diligence and Monitoring solutions at LexisNexis®.
He explains the value, noting “structured, quantitative data tells you the ‘what’. Unstructured data, on the other hand, offers critical context. It answers the ‘why’ fueling quantitative data.”
Unstructured data is created both internally and externally. Internally, unstructured data comes in the form of text from emails, invoices, corporate communications, and other text-based content generated while doing business.
Externally, unstructured data includes photos, as well as text-based sources like news, social media, press releases, and more.
Of course, before you weave unstructured data into your processes or AI-powered applications, you need to understand the pitfalls and promise of these datasets.
As mentioned above, unstructured data often comes in the form of articles, social media, emails, or other text-based communication. There may be quantitative data reported in a news article, but those numbers are distributed throughout the text, so they aren’t as easy to extract and analyze as a spreadsheet. The volume of news being generated daily—in print, on the web, over airwaves—can also be pretty intimidating. But locked into all of that text are details that can help you make sense of quantitative data.
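To make the extraction problem concrete, here is a minimal sketch of pulling figures out of free text with a regular expression. The pattern, the function name `extract_figures`, and the sample sentence are all made up for illustration; real news text is far messier, and a pattern this simple would miss many formats.

```python
import re

# Hypothetical pattern: matches dollar amounts such as "$4.2 million"
# and percentages such as "12%". It is illustrative, not exhaustive.
FIGURE_PATTERN = re.compile(
    r"\$\d[\d,]*(?:\.\d+)?(?:\s(?:thousand|million|billion))?"
    r"|\d+(?:\.\d+)?%"
)


def extract_figures(text: str) -> list[str]:
    """Return every dollar amount or percentage found in the text, in order."""
    return FIGURE_PATTERN.findall(text)


# Made-up sentence standing in for a news article.
article = (
    "The retailer reported revenue of $4.2 million for the quarter, "
    "up 12% from a year earlier, while online orders grew 7.5%."
)
figures = extract_figures(article)
print(figures)
```

Even when the numbers come out cleanly, the text around them (which company, which quarter, compared with what) is still unstructured, which is why extraction alone rarely replaces a spreadsheet.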
In its raw form, unstructured data can be difficult to process, and the volume alone often poses a problem. If you’re trying to glean insight, it can be like finding a particular needle in acres of haystacks. That’s why you need to look at how a data provider enhances unstructured data to make it more user-friendly. (More on that later.)
Quantitative data is all about the numbers, and it can answer questions related to numbers. How many units sold last month? How does that compare to the same month the previous year? It’s data that can be easily validated and verified.
Unstructured data, however, is qualitative in nature. It describes or explains, capturing events, emotions, and perceptions.
Hanson notes, “Qualitative data is less about figures and more about text and contextual-based information, but with that comes a huge amount of potential to tell the story around what is happening.”
This is where the real value of unstructured data comes into play. Take the example of units sold. Analyzing news data from the same period can help explain why sales were up or down, for example when a quarter includes events that boost or slow sales, such as the holiday season.
Data without context can be misleading. Data informed by contextual insights enables better decisions.
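As a rough illustration of the units-sold example, the sketch below pairs monthly sales counts with news headlines from the same month. All figures, headlines, and names (`monthly_units`, `news_items`, `attach_context`) are invented for this example and do not come from any real dataset.

```python
from collections import defaultdict

# Made-up illustrative figures, not real sales data.
monthly_units = {"2023-10": 1200, "2023-11": 1850, "2023-12": 2400}

# Made-up headlines, each tagged with the month it was published.
news_items = [
    ("2023-11", "Retailers launch early Black Friday promotions"),
    ("2023-12", "Holiday shopping season lifts consumer spending"),
]


def attach_context(units, news):
    """Pair each month's unit count with the headlines published that month."""
    by_month = defaultdict(list)
    for month, headline in news:
        by_month[month].append(headline)
    return {
        month: {"units": count, "headlines": by_month.get(month, [])}
        for month, count in units.items()
    }


report = attach_context(monthly_units, news_items)
for month, row in report.items():
    print(month, row["units"], row["headlines"])
```

The join itself is trivial. The hard part in practice is finding the handful of relevant headlines among everything published that month.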
When you decide to integrate unstructured, third-party data in your processes and applications, you should consider three crucial factors.
Fake news and the loss of trust in media go hand in hand. In the third quarter of 2020, for example, there were 1.8 billion fake news engagements—and the pervasiveness of fake news continues still.
As a result, the 2023 Edelman Trust Barometer reveals that trust in media is still lagging behind trust in business, non-governmental organizations, and governments (which narrowly escaped last place by a percentage point).
With 50% of people mistrusting media, it’s critical that you curate data from reputable and varied sources. When sourcing unstructured datasets, look for well-provenanced data that captures diverse viewpoints so you can deliver insights that are relevant and unbiased.
The volume and unwieldy nature of unstructured data demand a solution. After all, you can have the best datasets at your fingertips, but piles of great unstructured data from reputable sources are still just piles of data.
“Enrichment is effectively data added to data, bringing structure to unstructured data. Data is simply not very usable, especially at volume, unless it has been enriched,” says Hanson.
How do enrichments help? They make huge datasets more searchable and allow you to slice and dice the data to uncover more insights. Say you’re searching for information about Apple. Enrichments allow you to easily exclude mentions not related to the entity. They effectively filter out unrelated mentions like “candy apple red” or “apple recipes” that might otherwise slow or skew your analysis.
Case in point: A financial insights company wanted to add topical news feeds that analysts could use to inform reports and data it curates for its own customers. When comparing third-party unstructured data providers, the enrichments proved to be a deciding factor in choosing between Nexis Data as a Service and a similar provider. Enrichments vary by dataset but generally include:
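The Apple example can be sketched in a few lines. The `EnrichedDocument` class, its `entities` field, and the tag value "Apple Inc." are assumptions made up for illustration. They do not describe the actual schema or API of Nexis Data as a Service.

```python
from dataclasses import dataclass, field


@dataclass
class EnrichedDocument:
    """A news item plus hypothetical entity tags added by an enrichment step."""
    headline: str
    entities: set[str] = field(default_factory=set)


def mentions_entity(doc: EnrichedDocument, entity: str) -> bool:
    """True when the enrichment tagged the document with the given entity."""
    return entity in doc.entities


# Made-up documents: only the first is about the company.
docs = [
    EnrichedDocument("Apple unveils new laptop lineup", {"Apple Inc."}),
    EnrichedDocument("Five easy apple recipes for fall"),
    EnrichedDocument("Classic car repainted in candy apple red"),
]

# A plain keyword search matches all three headlines.
keyword_hits = [d.headline for d in docs if "apple" in d.headline.lower()]

# Filtering on the entity tag keeps only the company mention.
entity_hits = [d.headline for d in docs if mentions_entity(d, "Apple Inc.")]

print(len(keyword_hits), entity_hits)
```

The keyword search cannot tell the company from the fruit. The entity tag can, because that judgment was already made when the data was enriched.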
These enrichments, along with relevancy scores, enable analysts to make targeted data calls and refine datasets down to what really matters. “Particularly in analytics or data integrations, enrichments allow analysts to draw much more insight, programmatically, across a high volume of data,” says Hanson.
Ultimately, the financial insights company chose Nexis Data as a Service because the enrichments—especially entities mentioned and tags related to mergers and acquisitions—led to a more complete results set than the competitor. That’s why Dave Hanson says, “The volume and quality of our enrichments are the magic that brings unwieldy data to life.”
When choosing a data provider, make sure they can deliver data in different ways. While a Search & Retrieve data API may be ideal for ongoing trend analysis or scheduled data calls for PEPs and sanctions data, deep historical analysis may require bulk delivery of decades of news data.
In other instances, a Flat File may be the best option. By partnering with a data provider that offers a wide range of delivery options, you can build a long-term relationship that can adapt as your data needs evolve.
LexisNexis has long been a go-to source of news, company, and legal information. As a result, we’ve spent decades honing, expanding, and working to improve the data we aggregate so it offers optimal flexibility and efficiency, whether within our own platforms for business, academic, and legal research or as unstructured datasets to power your own tools and applications.
Because we aggregate from global sources in 200 countries and across 37 languages, you can be confident that you’re capturing a multi-dimensional perspective. And with a 45+ year news archive and fresh data being ingested and enriched every day, Nexis Data as a Service is a one-stop-shop for a wide variety of quality, enriched data suited to use cases spanning predictive modeling and risk management, trend or investment analysis, and other data-driven projects or processes.
Ready to explore the options? Learn more about unstructured data available with Nexis® Data as a Service.