The artificial intelligence revolution – or the Fourth Industrial Revolution, as it’s been called – has been built on a foundation of information: lots and lots of big data, including petabytes of diverse data stored in data lakes.
At the turn of the century, we were still busy with relational databases, storing structured data in pre-designed schemas to be accessed by queries from applications that had been written many years earlier. This was the extent of our ability to leverage data – but that was yesterday.
Today, the world has morphed into a living, breathing organism of unstructured data that has not only been engineered to store the data we sometimes unknowingly provide, but is also beginning to capture the subliminal motivations, decode the seemingly benign actions, anticipate the intent and comprehend the feelings behind that data. The secret question we type into Google when we think we’re alone is not so secret. Behind the scenes, the data has been captured to be used for something in the future. And perhaps today, we don’t yet know what that something might look like.
Data – or the use of data – is being re-imagined in creative ways to give us true insight into people’s lives and their propensity to act in a certain way. In his recent book “Everybody Lies”, Seth Stephens-Davidowitz points out that a true profile can be built not from what people tell you, but more accurately from what they type, unstructured, into their search engines. For example, you would never guess that among the top 5 search-related concerns during pregnancy in India, 3 are about having sex while pregnant, whilst in most western countries ‘stretch marks’ and ‘losing weight’ dominate the top positions. This may be a good example of what people will tell you vs. what they are really thinking.
In the same manner, we have heard that over-the-top (OTT) players like Netflix and Amazon use viewership data to home in on creating hit TV shows. In the past, we depended on Nielsen surveys and other third-party data to determine which TV shows were hits. Today, however, by analyzing search terms, social feeds and viewership data such as number of views, rewinds, and views of similar genres or actors, companies like Netflix and Amazon are able to predict the next big hit before its launch.
Advertising has also benefited immensely from the gathering and analysis of unstructured data. I hate ads – I think most people do. However, there are many times I have clicked on a Facebook-sponsored ad. Why? Because, eerily enough, Facebook knows me: it understands my intents, recognizes my location, and knows the types of things I like and “Like!”. So, suddenly, rather than being annoyed by a commercial for retirement homes, I am targeted with a personalized ad for a vacation I was planning on taking to the Maldives. This is truly the personalized, contextualized experience that so many companies have been searching for.
The simple premise of unstructured data is to capture all the disparate sources and repositories in a data lake, and to hope that one day, when you are ready to create value and deliver benefits from a new data-enabler, such as a fresh feature in an AI engine or a new sales campaign, the data will reveal the truth about your customers. And maybe, just maybe, the data will help improve the service or experience you provide to them. A data-driven win-win.