Data -> Process -> Technology
Data sits at the first level of any analysis. It goes through multiple iterations and churning before meaning can be made out of it, so its quality is critical for the process that follows. Data is the foundation of that process, which builds entirely on its behaviour and type.
Up to half of the time needed for analysis is typically spent in "cleaning" the data. This time is also, typically, underestimated. Often, once a clean data set is achieved, the analysis itself is quite straightforward.
DATA CLEANING is a two-step process: first DETECTION and then CORRECTION of errors in a data set.
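As a rough illustration of those two steps, here is a minimal sketch of a typical tabular workflow using pandas; the file name sales.csv and the columns amount and region are hypothetical placeholders, not from any real data set.

```python
import numpy as np
import pandas as pd

# Hypothetical data set; file and column names are placeholders.
df = pd.read_csv("sales.csv")

# --- DETECTION: find errors before changing anything ---
missing_per_column = df.isna().sum()            # missing values in each column
duplicate_rows = df.duplicated().sum()          # fully duplicated records
z = (df["amount"] - df["amount"].mean()) / df["amount"].std()
suspect_outliers = df[np.abs(z) > 3]            # values far from the rest

# --- CORRECTION: fix or remove what was detected ---
df = df.drop_duplicates()
df["region"] = df["region"].fillna("UNKNOWN")   # make missing categories explicit
df["amount"] = df["amount"].clip(               # cap extreme values
    lower=df["amount"].quantile(0.01),
    upper=df["amount"].quantile(0.99),
)
```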
When this cleaned data is put through a series of actions or steps taken to achieve a desired business metric, that series is commonly called a process. A process is normally a collection of actions, activities, steps or tasks.
A process is a collection of interrelated work tasks initiated in response to an event that achieves a specific result for the customer of the process.
A process is specific to a business, department or individual. It should:
- deliver a specific result
- have a customer who receives the result or is its beneficiary
- produce a result that is individually identifiable and countable
When a repetitive set of data goes through the same process without any major change, technology is brought in to increase productivity and make the input and output more systematic and faster. But there is a big catch in between, one that individuals and companies often ignore.
So, what happens when unclean data enters the technology environment without a diligent check?
A systematic generation of errors.
Suppose the data is only 50% accurate, and that same data enters a technology-enabled platform that is quite systematic in generating reports.
The result is a systematic 50% error in every report being manufactured. Sometimes it becomes difficult to judge whether the problem lies with the data, the process or the technology platform, especially in a complex process where multiple steps and heavy data crunching are involved.
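To make the arithmetic concrete, here is a small simulation of that effect; the 50% accuracy figure and the batch size are assumptions for illustration only.

```python
import random

random.seed(0)
ACCURACY = 0.5            # assumed share of correct input records
N_RECORDS = 10_000        # hypothetical batch size

# Each input record is either correct or wrong.
records_correct = [random.random() < ACCURACY for _ in range(N_RECORDS)]

# The "technology platform" is deterministic: it turns every input record into a
# report line without judging it, so every wrong record becomes a wrong line.
report_error_rate = 1 - sum(records_correct) / len(records_correct)
print(f"Share of erroneous report lines: {report_error_rate:.0%}")   # roughly 50%
```

Automation does not remove the error in the input; it reproduces it, systematically and at speed.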
Many business leaders also feel that it is better to run a manual process than a process with errors, because a process with errors wastes the money spent on technology, still demands manual effort, and delivers no desired result.
So, before going to the third level (Technology), the first and second levels (Data and Process) need to be in place and validated. Using a few basic statistics, one can check the logic, interpret the basic sense of the data, and run specific checks to clean it.
Some of the most commonly used tools are DESCRIPTIVE STATISTICS, SCATTER PLOTS and HISTOGRAMS. Many statistical packages can perform these operations easily and can import data from almost any format.
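As a sketch of what such a first pass can look like, the example below uses pandas and matplotlib; the file name orders.csv and the columns quantity and order_value are hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data set; pandas reads CSV, Excel, SQL and other formats in a similar way.
df = pd.read_csv("orders.csv")

# Descriptive statistics: count, mean, std, min/max and quartiles per numeric column.
print(df.describe())

# Histogram: spot skew, gaps and impossible values in a single column.
df["order_value"].plot(kind="hist", bins=30, title="Order value distribution")
plt.show()

# Scatter plot: check that two columns relate the way the business logic says they should.
df.plot(kind="scatter", x="quantity", y="order_value", title="Quantity vs order value")
plt.show()
```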
So, data and its quality are very important before you convert it to your business need and use it for decision making.
Process is one of the most important elements: while victory in gaining competitive advantage lies in streamlining company processes, the results of making changes to processes are not always predictable. Process complexity and variability combine to obscure relationships between cause and effect. Most companies fail to consider overall process performance as they create new policies and procedures designed to meet the shifting needs of their market, industry or strategy. Often, the resulting growth in processes creates huge inefficiencies.