Data Enrichment is the enhancing of customer-provided data (first-party data) with data from an outside source (third-party data). Another name for Data Enrichment is Data Appending. The two names are interchangeable.
When data is enriched, the first-party data in one dataset is compared with the third-party data in a separate dataset. Third-party data is collected by a company with no direct connection to the customer. Sources of third-party data can include websites, social media networks, surveys, subscriptions and more. The third-party dataset typically has many more records, and much more information about those records, than the first-party dataset. The enrichment process looks for records in the two datasets that match.
Matches don’t have to be based solely on customer name, which can be entered inconsistently in databases. Matching can also be based on fields such as physical address, email address, phone number and more. When a match is found, any data in the third-party record that is missing from the matching first-party record is added to the first-party dataset. This creates a much fuller picture of that customer.
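The matching step described above can be sketched in a few lines of Python. This is a minimal illustration, not a production matcher: the field names (`email`, `phone`, `home_owner`), the normalization rules and the "first match wins" policy are all assumptions made for the example.

```python
def normalize_email(value):
    """Lowercase and trim an email so trivial differences do not block a match."""
    return value.strip().lower() if value else None


def normalize_phone(value):
    """Keep digits only, so '(555) 010-1234' and '555.010.1234' compare equal."""
    if not value:
        return None
    digits = "".join(ch for ch in value if ch.isdigit())
    return digits or None


def match_keys(record):
    """Return the (field, normalized value) keys a record can match on,
    in priority order: email first, then phone."""
    keys = []
    email = normalize_email(record.get("email"))
    phone = normalize_phone(record.get("phone"))
    if email:
        keys.append(("email", email))
    if phone:
        keys.append(("phone", phone))
    return keys


def enrich(first_party, third_party):
    """Fill fields that are missing or empty in each first-party record from a
    matching third-party record. Existing first-party values are kept."""
    # Index third-party records by every key they expose.
    index = {}
    for tp in third_party:
        for key in match_keys(tp):
            index.setdefault(key, tp)

    enriched = []
    for fp in first_party:
        result = dict(fp)  # copy, so the input is not modified
        for key in match_keys(fp):
            tp = index.get(key)
            if tp is not None:
                for field, value in tp.items():
                    if result.get(field) in (None, ""):
                        result[field] = value
                break  # first match wins (an assumption of this sketch)
        enriched.append(result)
    return enriched


# Hypothetical sample data: the name is entered differently in each dataset,
# but the email address matches once it is normalized.
customers = [{"name": "J. Smith", "email": "Jane.Smith@Example.com ", "phone": None}]
vendor = [{"name": "Jane Smith", "email": "jane.smith@example.com",
           "phone": "(555) 010-1234", "home_owner": True}]
enriched_customers = enrich(customers, vendor)
```

In this example the customer keeps the name already on file and gains the phone number and home-ownership flag from the third-party record.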
Every category of customer data can be enriched. A look at three of the more popular types of data enrichment makes clear how much information it can add and how valuable it can be.
Demographic data enrichment
Demographics is one of the primary methods of customer segmentation, but first-party data often leaves out important information about the customer that can be filled in by third-party data, including:
The options go on and on.
Geographic data enrichment
Geography is another important area of segmentation and enrichment. Geographic and demographic segmentation are often categorized as one type of segmentation known as geodemographic segmentation. With geographic data enrichment you can fill in quite a few bits of information, including:
With your geographic and demographic data enriched for these details, you’d be able to target a direct mail audience with great precision. For example, suppose you wanted to reach credit-worthy female homeowners between the ages of 40 and 65 who drive a domestic automobile, make at least $80,000 per year, live within a 20-mile radius of your store and are already your customers.
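Once records carry these attributes, selecting that audience is a simple filter. The sketch below assumes hypothetical field names (`credit_worthy`, `vehicle_origin`, `miles_from_store` and so on) and assumes the distance to the store has already been calculated during enrichment. Records missing any required field are excluded rather than guessed at.

```python
# Hypothetical field names; a real dataset will use its own schema.
REQUIRED_FIELDS = (
    "is_customer", "credit_worthy", "gender", "home_owner",
    "age", "vehicle_origin", "annual_income", "miles_from_store",
)


def in_target_audience(customer):
    """Return True if an enriched record meets every targeting criterion."""
    # A record that lacks any required attribute cannot be targeted precisely.
    if any(customer.get(name) is None for name in REQUIRED_FIELDS):
        return False
    return bool(
        customer["is_customer"]
        and customer["credit_worthy"]
        and customer["gender"] == "F"
        and customer["home_owner"]
        and 40 <= customer["age"] <= 65
        and customer["vehicle_origin"] == "domestic"
        and customer["annual_income"] >= 80_000
        and customer["miles_from_store"] <= 20
    )


def select_audience(customers):
    """Keep only the records that belong in the mailing."""
    return [c for c in customers if in_target_audience(c)]
```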
Targeting that specifically might be impossible with first-party data alone, so you’d have to mail to a much larger audience just to be sure your actual target was included. If each mail piece costs $1.25 and you have to send out 10,000, instead of the 2,000 that you need to reach your entire target audience, you mail 8,000 unnecessary pieces and waste $10,000.
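The arithmetic behind that figure is easy to check. The short sketch below uses the numbers from the example; the function name `mailing_waste` is made up for illustration.

```python
from decimal import Decimal


def mailing_waste(cost_per_piece, pieces_sent, pieces_needed):
    """Return the money spent on mail pieces that did not need to be sent."""
    extra_pieces = pieces_sent - pieces_needed
    return Decimal(cost_per_piece) * extra_pieces


# Figures from the example: $1.25 per piece, 10,000 sent, 2,000 needed.
broad_cost = Decimal("1.25") * 10_000      # cost of the untargeted mailing
targeted_cost = Decimal("1.25") * 2_000    # cost of the targeted mailing
wasted = mailing_waste("1.25", 10_000, 2_000)
print(broad_cost, targeted_cost, wasted)
```

The untargeted mailing costs $12,500 against $2,500 for the targeted one, so the difference of $10,000 is spent on the 8,000 pieces that reach no one in the target audience.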
Semantic data enrichment
Semantic data works by putting words in the correct context. It is an invaluable part of search, especially now that voice search is expected to reach 50 percent of all searches in 2020 (according to comScore). Semantic search interprets the context of different words in combination, so the search is narrowed to the intended request, rather than searching for every word in a long-tail search query.
Semantic data enrichment searches text entered by customers for clues about attitudes and preferences, then tags the customer record accordingly. For example, if a customer has entered in one of their online profiles that they like doughnuts, the next time they visit your website you might recommend a blog post about doughnuts that they likely would not have found any other way.
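As a rough illustration of the tagging step, the sketch below looks for interest keywords in profile text and uses the resulting tags to pick blog posts. Plain keyword matching is a deliberate simplification and an assumption of this example: it has no real understanding of context, so it would also tag "I can't stand doughnuts". The keyword lists and post titles are hypothetical.

```python
import re

# Hypothetical mapping from an interest tag to the words that suggest it.
INTEREST_KEYWORDS = {
    "doughnuts": {"doughnut", "doughnuts", "donut", "donuts"},
    "coffee": {"coffee", "espresso", "latte"},
}


def tag_interests(profile_text):
    """Return a sorted list of interest tags whose keywords appear in the text."""
    words = set(re.findall(r"[a-z]+", profile_text.lower()))
    return sorted(
        tag for tag, keywords in INTEREST_KEYWORDS.items() if words & keywords
    )


def recommend_posts(tags, posts_by_tag):
    """Collect the blog posts filed under each of the customer's tags."""
    return [post for tag in tags for post in posts_by_tag.get(tag, [])]


# Hypothetical usage: tag a profile, then look up matching posts.
tags = tag_interests("Weekend baker. I like doughnuts and a good latte.")
posts = recommend_posts(tags, {"doughnuts": ["Five doughnut glazes to try"]})
```

Here `tags` is `["coffee", "doughnuts"]`, and `posts` contains the one doughnut article because no coffee posts were supplied.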
Two ways data enrichment is processed
Data enrichment can be processed in two ways: in real time and by batch.
Real-time data enrichment happens as data enters your system. So, if a customer’s level of interest in a particular product or service is reflected by online activity, their customer record is updated immediately. If the person is a prospect, this new activity can automatically increase their lead score. The increased lead score could, in turn, automatically trigger an email or a sales call.
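A minimal sketch of that real-time flow might look like the following. The activity names, point values and follow-up threshold are invented for the example, and `notify` stands in for whatever sends the email or alerts the sales team.

```python
from dataclasses import dataclass, field

# Hypothetical scoring rules: points awarded for each kind of online activity.
ACTIVITY_POINTS = {
    "viewed_pricing": 10,
    "downloaded_brochure": 15,
    "requested_demo": 40,
}
FOLLOW_UP_THRESHOLD = 50  # assumed score at which a follow-up is triggered


@dataclass
class Lead:
    email: str
    score: int = 0
    activities: list = field(default_factory=list)
    follow_up_sent: bool = False


def on_activity(lead, activity, notify):
    """Update a lead the moment an activity event arrives.

    `notify` is any callable that takes the lead, for example a function
    that queues an email or creates a task for a sales representative.
    """
    lead.activities.append(activity)
    lead.score += ACTIVITY_POINTS.get(activity, 0)  # unknown events score 0
    # Trigger the follow-up once, the first time the threshold is reached.
    if lead.score >= FOLLOW_UP_THRESHOLD and not lead.follow_up_sent:
        notify(lead)
        lead.follow_up_sent = True
    return lead
```

Each event updates the record immediately, and the follow-up fires only once, so a prospect who keeps browsing is not contacted repeatedly.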
Batch data enrichment is for information you don’t need urgently. Updates can be scheduled to occur at regular intervals.
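By contrast, a batch job collects records and enriches them together when the job runs. The sketch below assumes a hypothetical `enrich_one` function that enriches a single record, and leaves the scheduling itself to whatever job scheduler you already use.

```python
def run_batch(pending, enrich_one, batch_size=500):
    """Enrich every queued record in fixed-size chunks, then empty the queue.

    `pending` is a list of records waiting for enrichment and `enrich_one`
    is a callable that returns the enriched version of one record.
    """
    results = []
    for start in range(0, len(pending), batch_size):
        chunk = pending[start:start + batch_size]
        results.extend(enrich_one(record) for record in chunk)
    pending.clear()  # everything queued has now been processed
    return results
```

Processing in chunks keeps each call to the third-party source a manageable size, which matters when thousands of records have accumulated between runs.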
Don’t go overboard
It’s important to limit each data enrichment project to the information you need for a specific purpose, especially when you consider that the number of ways data can be enriched is limited only by the data you have. Burdening datasets with unneeded data wastes storage space and slows down processing.
It’s also important to have confidence in the third-party data you rely on.