Machine Learning a Natural Touch

How the data we create as individuals is the key to actionable insights and the next generation of analytics

Marketing was once the least quantifiable business function. Now marketing can access vast democratized datasets, yet faces significant organizational and computational challenges.

The explosion of data shifts the balance of power from companies to customers. Data may have come late to marketing, but it has been central to other disciplines since the 1980s, when the ‘knowledge doubling curve’ first described how fast accumulated knowledge compounds.

High performing companies invest in data and analytics. Their organizational challenge is to create a framework for data as a strategic asset. Their computational challenge is to create meaning and insight actionable in the customer experience.

(Source: Domo) IDC and EMC analyzed rich data and headlined the accelerating volume: “It is doubling in size every two years, and by 2020 the digital universe – the data we create and copy annually – will reach 44 zettabytes, or 44 trillion gigabytes.”

The more interesting story is how little of the data we understand or use. The study found only 22% of the data was tagged. If data is not tagged or classified it becomes difficult to generate actionable metadata. Yet, only a fraction of the data classified is analyzed: in the study only 5% was analyzed at all.

(Source: IDC/EMC)

Why does this matter? There is a contrarian view from some in advertising that the role of data is overblown. Arguably there is misuse of data in digital advertising and little evidence of demonstrable breakthroughs from programmatic media. Now, the lack of transparency has triggered a formal media rebate investigation.

We use a fraction of the data available to us. We question the integrity of data as well as the effectiveness of the analytics and targeting. Why has it come to this?

Marketing data is still mostly ‘small’ data. It is ‘small’ firstly in the sense that it is not particularly high in volume: measurable in gigabytes, and definitely not zettabytes.

It is ‘small’ secondly in the sense that it is mostly structured data, which fits neatly into Excel spreadsheets and other traditional analytical platforms.

It is ‘small’ thirdly in the sense that it is not telling us much more than we knew five years ago. Our understanding of our customers is generally one-dimensional. The models we build are weak in relation to their potential. We still see churn modeling based on past behaviors and transactions with little development for years. What is the ‘big’ data marketing can access?

The IDC/EMC study confirms companies have liability or responsibility for 85% of the data from the digital universe. Yet, “the majority of the bits in the digital universe is created by people and not computers”.

People create unstructured data consisting primarily of language, i.e. text and voice, although images and video are growing fast. Unstructured data also embeds quantitative fields such as dates, timestamps and numbers.

(Source: sherpa.com)
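As an illustration, even ‘messy’ text yields quantitative fields to simple pattern matching. The sketch below (in Python, with an invented example sentence) pulls dates, prices and counts out of free text with regular expressions:

```python
import re

# A minimal sketch: the example sentence is hypothetical, and real text
# would need broader patterns than these.
review = "Ordered on 2016-03-14: 10 tomatoes at $3.50, delivered in 2 days."

dates = re.findall(r"\d{4}-\d{2}-\d{2}", review)           # ISO-style dates
prices = re.findall(r"\$\d+(?:\.\d{2})?", review)          # dollar amounts
numbers = re.findall(r"(?<![\d.$-])\d+(?![\d-])", review)  # bare counts

print(dates, prices, numbers)  # → ['2016-03-14'] ['$3.50'] ['10', '2']
```

In practice, fields extracted this way become the metadata that makes unstructured data classifiable and searchable.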

Unstructured data is complex and ‘messy’ and does not fit into traditional, relational SQL (structured query language) databases. A new breed of databases known as NoSQL has sprouted up, and traditional statistical techniques have been updated to deal with big, unstructured, real-time data.
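To make the contrast concrete, here is a minimal sketch of the document model behind NoSQL stores such as MongoDB: each record is a self-describing JSON document, so variable-shaped data needs no fixed relational schema. The field names below are illustrative, not from any real dataset:

```python
import json

# Each record carries its own structure; optional fields simply appear
# or not, with no schema migration required.
doc = {
    "user": "anon_42",
    "text": "Thinking about a new Chevy, looking for reviews...",
    "stage_hints": ["thinking about", "looking for"],  # optional field
    "timestamp": "2016-05-01T09:30:00Z",
}

serialized = json.dumps(doc)       # what would be sent to the store
restored = json.loads(serialized)  # round-trips without a schema
print(restored["stage_hints"])
```

A relational table would force every row into the same columns; the document model lets each piece of user-generated content keep its own shape.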

Through qualitative market research marketers have some experience of unstructured data, but that data tends to be sample-based and very small in scale. The function of marketing research is in desperate need of disruption.

In parallel with the rapid growth of data is the explosion in content creation. Enterprises are creating content heavy in text, voice, images and video. As Eric Schmidt of Google notes: “More content is being created in 48 hours than what was produced from the beginning of time until 2003.”

Here is perhaps the greatest opportunity for the partnership of CMOs and CIOs: to build deeper profiles and a holistic understanding of “the whole consumer” and their engagement with brands and content.

Natural Language Processing (NLP) is the ability of computers to understand human language, both text as it is written and speech as it is spoken. Other disciplines have developed NLP to deal with multiple messy data sets.

The most visible application, through Natural Language Understanding (NLU), is in human-computer artificial intelligence (AI) solutions like Apple’s Siri and Microsoft’s Cortana. These applications translate voice data into text and computer tasks, generating output in both text and voice through Natural Language Generation (NLG).

While NLU starts with raw text, NLG determines how a system’s knowledge and ideas should be communicated. NLG is a huge opportunity to enrich customer engagement. Marketers generate a growing volume of content. People consume that content and create their own. We are becoming proficient at tracking structured data through the customer lifecycle. Now we can add unstructured data.
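In its simplest form, NLG is template filling: structured knowledge goes in, a natural sentence comes out. The sketch below uses an invented purchase record to show the idea; production NLG systems range from templates like this to statistical models:

```python
# Hypothetical structured record a system might hold about a customer.
record = {"product": "tomatoes", "quantity": 10, "unit_price": 3.50}

def generate_summary(rec):
    """Render the system's knowledge as a natural-language sentence."""
    total = rec["quantity"] * rec["unit_price"]
    return (f"You bought {rec['quantity']} {rec['product']} "
            f"at ${rec['unit_price']:.2f} each, for ${total:.2f} in total.")

print(generate_summary(record))
# → "You bought 10 tomatoes at $3.50 each, for $35.00 in total."
```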

Here are four initial opportunities to apply the different branches of NLP to marketing problems:

  1. Identify new people segments from language analysis connecting behaviors, attitudes and social connections.
  2. Identify influencers and the communities they reach with knowledge enhanced by the content they create and the language they use.
  3. Improve content creation across the customer experience and touch points by matching conversations and reviews to behaviors.
  4. Measure drivers of demand by generating content that is relevant as well as targeted, for more effective engagement.

E-commerce companies serve user generated content online in the form of discussions, reviews and commentaries. An extremely powerful yet relatively simple application of NLP is to map the classic five stages of decision-making to content.

Each decision stage is distinct in its linguistic properties. When a customer is researching a product, one typically sees a higher occurrence of verb phrases like “thinking about”, “looking for” and “researching”. When a customer has completed a purchase, one is more likely to see greater use of numeric symbols for units and prices, e.g. “10 tomatoes at $3.50” or “a new Chevy for $38,000”.
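A rough sketch of this stage mapping, with keyword lists invented for illustration rather than taken from any production model, might score text against linguistic cues for each stage:

```python
import re

# Invented cue lists: research verbs signal information search,
# numeric price talk signals a completed purchase.
STAGE_CUES = {
    "information search": [r"\bthinking about\b", r"\blooking for\b",
                           r"\bresearching\b"],
    "purchase": [r"\$\d+", r"\b\d+ (?:tomatoes|units)\b"],
}

def score_stages(text):
    """Count cue matches per decision stage."""
    text = text.lower()
    return {stage: sum(len(re.findall(p, text)) for p in patterns)
            for stage, patterns in STAGE_CUES.items()}

def likely_stage(text):
    """Return the stage with the most cue matches."""
    scores = score_stages(text)
    return max(scores, key=scores.get)

print(likely_stage("I'm looking for a new Chevy and researching trims"))
print(likely_stage("Picked up 10 tomatoes at $3.50 yesterday"))
```

A real system would replace these hand-written cues with a trained probabilistic classifier, but the principle of mapping linguistic features to stages is the same.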

NLP is aided by the fact that humans tend to say similar things time and time again, which increases the probabilistic understanding of the data. In a recent proof-of-concept for an online community we were able to determine, in real time and with 90% accuracy, the decision stage of the site’s members.

We also see broader patterns. For another company, need recognition and information search are high at the beginning of the week. They decline through the week as buyers move into evaluation and purchase modes. Understanding this basic behavior through data enables marketers to serve the right content to their community over the week, and gives them high odds of getting a well-researched product review on Sunday, after all of the dust has settled.
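This weekly rhythm can be surfaced with nothing more than a count of stages by day of week. The sketch below uses made-up events rather than real community data:

```python
from collections import Counter
from datetime import date

# Hypothetical (date, decision stage) events for one week.
events = [
    (date(2016, 5, 2), "information search"),  # Monday
    (date(2016, 5, 2), "information search"),
    (date(2016, 5, 5), "evaluation"),          # Thursday
    (date(2016, 5, 7), "purchase"),            # Saturday
    (date(2016, 5, 8), "post-purchase"),       # Sunday
]

# Tally how often each stage appears on each weekday.
by_weekday = Counter((d.strftime("%A"), stage) for d, stage in events)
print(by_weekday[("Monday", "information search")])  # → 2
```

With this tally in hand, a marketer can schedule informational content early in the week and solicit reviews at the weekend.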

The beauty of a machine learning methodology like NLP applied to unstructured marketing data is that the models can develop actionable insights with significant precision.

Armed with this knowledge from modern analytics, firms can shift resources and focus to deliver a relevant and delightful customer experience.

Visit the Consilient-group website for more blogs and information.

Stewart Pearson
