Data’s Hierarchy of Needs - Collection + Validation

Every craftsman knows that a structure is only as good as its foundation. When it comes to building a successful data structure, the collected data and its accuracy are the foundation. “Data’s Hierarchy of Needs” outlines this foundation of data collection and validation. Here we will discuss what data to collect and why validating it matters.

Collection

The first thought most people have when they begin data collection is “Collect everything and we’ll decide what’s important later.” This strategy is dangerous for three main reasons:

1. Deciding later what data is essential slows down the main task: collecting the data fundamental to your goals. Collecting any and all data takes time from the people who must collect and store it, and distracts from business goals.

2. Delaying the identification of important data introduces too much overhead. When you collect everything instead of only what you need, time is wasted and issues drag on, because someone has to go back and sort through the collected data later.

3. With regulations such as GDPR and CCPA, more data can mean more risk. Know what data you are allowed to collect.

To avoid these three issues, plan with those who have a vested interest in the data (stakeholders, analysts, et al.) and discuss what all parties need from it. Identify these needs so that you collect only what is deemed important. Remember that this process is iterative. You won’t always know what you need when you begin a new project, and that’s okay. Do establish who owns which information and the governance it requires. Come up with a plan for handling the various privacy laws now, not when the need arises.
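One way to make the agreed-upon needs concrete is to write them down as a collection plan and drop anything that is not in it. The sketch below is a minimal Python illustration, not a prescribed method. The field names, owners, and the names `COLLECTION_PLAN` and `filter_to_plan` are all hypothetical.

```python
# Hypothetical collection plan: each field the stakeholders agreed on,
# with its owner and whether it is personal data (relevant to GDPR/CCPA).
COLLECTION_PLAN = {
    "order_id": {"owner": "sales", "personal": False},
    "order_total": {"owner": "finance", "personal": False},
    "customer_email": {"owner": "marketing", "personal": True},
}


def filter_to_plan(record, plan=COLLECTION_PLAN):
    """Keep only the fields named in the plan and report what was dropped."""
    kept = {k: v for k, v in record.items() if k in plan}
    dropped = sorted(k for k in record if k not in plan)
    return kept, dropped


# Hypothetical raw event that includes fields nobody asked for.
raw = {"order_id": 17, "order_total": 42.5, "mouse_x": 311, "mouse_y": 90}
kept, dropped = filter_to_plan(raw)
print(kept)     # {'order_id': 17, 'order_total': 42.5}
print(dropped)  # ['mouse_x', 'mouse_y']
```

Because the process is iterative, the plan is expected to change: adding a field becomes a deliberate edit that names an owner, rather than something that happens by default.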

Validation

With the essential data in hand, the next step is to check its quality. If you build your data structure on the crumbling foundation of errant data, there is no hope of supporting the advanced analytics endeavors higher up in “Data’s Hierarchy of Needs”. Aligning data to a source of truth is paramount. Reporting on errant data leads to wrong decisions, and the consequences only become more drastic as your analytics become more advanced.
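As a rough illustration of aligning data to a source of truth, the Python sketch below compares collected values against a trusted reference and lists every disagreement. It assumes both datasets are simple key-to-number mappings. The function name `validate_against_source`, the tolerance, and the sample revenue figures are hypothetical.

```python
def validate_against_source(collected, source_of_truth, tolerance=0.01):
    """Compare collected values to a trusted reference, key by key.

    Returns a list of (key, problem) tuples; an empty list means the
    collected data agrees with the source of truth.
    """
    problems = []
    for key, expected in source_of_truth.items():
        if key not in collected:
            problems.append((key, "missing from collected data"))
        elif abs(collected[key] - expected) > tolerance:
            problems.append((key, f"expected {expected}, got {collected[key]}"))
    # Keys the trusted source does not know about are also suspicious.
    for key in collected:
        if key not in source_of_truth:
            problems.append((key, "not in source of truth"))
    return problems


# Hypothetical daily revenue: billing system (trusted) vs. analytics tracking.
billing = {"2024-01-01": 1200.0, "2024-01-02": 950.0, "2024-01-03": 1010.0}
tracked = {"2024-01-01": 1200.0, "2024-01-02": 905.0}

for key, problem in validate_against_source(tracked, billing):
    print(key, "->", problem)
# 2024-01-02 -> expected 950.0, got 905.0
# 2024-01-03 -> missing from collected data
```

Running a check like this before any reporting means a mismatch surfaces as a list of specific records to investigate, not as a wrong decision made later.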

Who’s Involved

Anyone with a vested interest in the data that a project will deliver should be involved. Since this is often a broad group, you can run into the issue of “too many cooks in the kitchen.” Of course, not everyone needs to be in every meeting. When starting a new data project, however, those involved in the descriptive, predictive, prescriptive, and optimization portions of “Data’s Hierarchy of Needs” must be explicit about their data needs.

Summary

It is vital to begin any project the right way. You must not only have materials, but the right materials. In the case of data and analytics projects, that means the right data. Collect only the data necessary to make your project a success and help achieve your goals. Once you have the right materials, make sure they are of high quality. This means having data that you and everyone involved can trust. If you don’t believe in the data you are using to make decisions, the entire project is doomed from the start. Once you have collected and validated the data you need, you can trust it and move on to describing it. In the next post, we will discuss the value and steps of describing data.


This is great, and also emphasizes the importance of conducting meaningful stakeholder interviews up-front. In a past life, I've had to come in behind folks who tracked literally everything. Unfortunately, they weren't supported by any business rules and by that point they were too embarrassed to even ask what was important!
