Defining data literacy

What is data literacy? What does data-literate mean?

Thanks to Jordan Morrow, whose recent posts and expertise in data literacy inspired me to create this series.

Falling for data

I've loved data since I was 11, when I bought my first Strat-o-Matic baseball board game. Out of curiosity I rolled the two six-sided dice 200 times and tallied their totals, not realizing I was building my first histogram: a shape that peaks around 7 and tapers toward 2 and 12, and my first step toward eventually visualizing a normal distribution. I've used data my entire career to make smarter business decisions. Data can be an invaluable asset, but many organizations still struggle with where to start developing it. An asset is something that can be leveraged and acted on, and finding the hidden patterns and stories in data is how that asset is developed. Data science and data literacy are the linchpin to creating the best products and services that support today's digital economy. The analysis doesn't have to be complicated: even the simplest exploratory data analysis (EDA) can inspire and inform us.
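The dice experiment is easy to reproduce. Here is a minimal sketch in Python, assuming a pseudo-random generator stands in for physical dice. The function names `tally_two_dice` and `text_histogram` are illustrative, not from any library.

```python
import random
from collections import Counter


def tally_two_dice(rolls=200, seed=None):
    """Roll two six-sided dice `rolls` times and count each total (2 to 12)."""
    rng = random.Random(seed)  # a seed makes the run repeatable
    return Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(rolls))


def text_histogram(totals):
    """Draw one row per possible total, with one '#' per observed roll."""
    rows = []
    for total in range(2, 13):
        rows.append(f"{total:>2} | {'#' * totals[total]}")
    return "\n".join(rows)


if __name__ == "__main__":
    print(text_histogram(tally_two_dice(rolls=200, seed=11)))
```

With only 200 rolls the bars will be uneven, but the totals should cluster around 7, because more combinations of two dice add up to 7 than to any other total.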

Defining data literacy

Data literacy, like many terms associated with data engineering, data mining, and the digital world in general, has what seems like an endless number of definitions. Here's mine:

Data literacy is the knowledge, curiosity, and experience to create value from data.

The three (or five) V's of data

Data has no inherent value until we apply people, process, and technology. Data does have attributes, though, and those attributes describe it. I first saw the 3 V's of data in a Gartner article: volume, variety, and velocity. Two more V's have since been introduced: veracity and value.

Today organizations collect so much data (volume) that they often struggle to manage it--organize it, clean it, and make it available--and so struggle to create value from it. The data is also so broad (variety) that keeping it under control and using it to make smarter decisions is difficult: marketing data such as website, email, and search engine optimization; transactions; sales; engineering; HR; finance; and robots or other connected devices. That variety of data is also arriving faster and faster (velocity) as the channels for gathering data become more real-time, and as we all expect to be treated as individuals whose experiences matter as much as the service or product we're buying.

Veracity, a fancy word that continues the V alliteration, refers to the quality and accuracy of data. As with value, there's an argument to be made that high-quality, accurate data is not possible until people, process, and technology make it so. Volume, variety, and velocity are native to data (how much, from where, and how often), while veracity and value are transformational: they require manipulation. Understanding these attributes, and this difference, is important to data literacy.

Literacy and insights built on understanding, curiosity, and experience 

Great insights are built on understanding your data: where it comes from, how much you have to work with, and how often you'll get updated or new data. Then you'll have to determine the quality of that data, what you'll need to do to make it accurate and usable, and what you'll do to create something valuable from it. Combine what you know about your data, your curiosity about what can be done with it, and your experience in the space where you're solving problems, reaching goals, or testing hypotheses. That combination is a starting point for success and the first step toward data literacy.
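As a rough illustration of that first look at a dataset, the sketch below summarizes how much data there is, where it comes from, how often it arrives, and how much of it is missing. It is only a sketch under stated assumptions: the `profile` function, the `source` and `received` field names, and the sample records are all hypothetical, and missing values are just one simple proxy for quality.

```python
from datetime import date


def profile(records, source_key="source", date_key="received"):
    """Summarize a list of dict records: volume, variety, velocity, missing share."""
    volume = len(records)                                # how much
    sources = sorted({r[source_key] for r in records})   # from where
    dates = [r[date_key] for r in records]
    # Days covered, counting both the first and the last day.
    span_days = (max(dates) - min(dates)).days + 1 if dates else 0
    per_day = volume / span_days if span_days else 0.0   # how often
    fields = sorted({key for r in records for key in r})
    cells = volume * len(fields)
    # A cell counts as missing when the field is absent, None, or empty.
    missing = sum(1 for r in records for f in fields if r.get(f) in (None, ""))
    return {
        "volume": volume,
        "variety": sources,
        "velocity_per_day": per_day,
        "missing_share": missing / cells if cells else 0.0,
    }


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        {"source": "web", "received": date(2024, 1, 1), "amount": 10.0},
        {"source": "email", "received": date(2024, 1, 1), "amount": None},
        {"source": "sales", "received": date(2024, 1, 2), "amount": 25.0},
        {"source": "web", "received": date(2024, 1, 4), "amount": 5.0},
    ]
    print(profile(sample))
```

A summary like this does not create value on its own. It only tells you what you have to work with before you decide how to clean the data and what question to ask of it.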

In my next post I'll outline the skills--technical and personal--that make data-literate people who they are.

Thanks Sam Johnson. I really like that definition. Data needs to contribute to value...we forget that too often. Thanks for sharing.

