WHY CLEAN YOUR DATA?
Knowing how to clean your data is advantageous for many reasons.
DATA CLEANING IS A 3-STEP PROCESS
STEP 1: FIND THE DIRT
Start data cleaning by determining what is wrong with your data.
Look for the following:
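As a rough illustration of what "finding the dirt" can look like in practice, here is a minimal sketch in plain Python. The sample records, the column names, and the three checks (missing values, duplicate rows, non-numeric ages) are assumptions made for this example; they are not a complete checklist.

```python
from collections import Counter

# Hypothetical sample records; the column names are made up for illustration.
rows = [
    {"id": 1, "name": "Ann", "age": "34"},
    {"id": 2, "name": "bob ", "age": ""},
    {"id": 2, "name": "bob ", "age": ""},
    {"id": 3, "name": None, "age": "abc"},
]


def is_missing(value):
    """Treat None and blank strings as missing."""
    return value is None or str(value).strip() == ""


def profile(rows):
    """Count a few common problems without changing the data."""
    missing = Counter()
    for row in rows:
        for key, value in row.items():
            if is_missing(value):
                missing[key] += 1

    # A row is a duplicate if an identical row has already been seen.
    seen = Counter(tuple(sorted(row.items())) for row in rows)
    duplicates = sum(count - 1 for count in seen.values())

    # Ages that are present but cannot be read as whole numbers.
    non_numeric_age = sum(
        1
        for row in rows
        if not is_missing(row["age"]) and not str(row["age"]).strip().isdigit()
    )
    return {
        "missing": dict(missing),
        "duplicates": duplicates,
        "non_numeric_age": non_numeric_age,
    }


report = profile(rows)
print(report)
```

The point of a report like this is to describe the problems before touching anything, so you know which cleaning techniques you will need in step 2.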
STEP 2: SCRUB THE DIRT
Knowing the problem is half the battle.
The other half is solving it.
How do you solve it, though?
One ring might rule them all, but one approach is not going to cut it with all your data cleaning problems.
Depending on the type of data dirt you’re facing, you’ll need different cleaning techniques.
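To make "different dirt, different technique" concrete, here is a small sketch in plain Python. The sample records and the four fixes shown (trimming whitespace, normalizing capitalization, converting types, dropping duplicates) are assumptions chosen for illustration, not the full set of techniques.

```python
# Hypothetical sample records; the column names are made up for illustration.
rows = [
    {"id": 1, "name": "Ann", "age": "34"},
    {"id": 2, "name": "bob ", "age": ""},
    {"id": 2, "name": "bob ", "age": ""},
    {"id": 3, "name": None, "age": "abc"},
]


def clean_name(value):
    """Whitespace and inconsistent case: trim and title-case; blanks become None."""
    if isinstance(value, str) and value.strip():
        return value.strip().title()
    return None


def clean_age(value):
    """Wrong type: convert digit strings to int; anything else becomes None."""
    if isinstance(value, str) and value.strip().isdigit():
        return int(value.strip())
    return None


def clean(rows):
    """Apply one fix per kind of problem, then drop rows that end up identical."""
    cleaned, seen = [], set()
    for row in rows:
        record = (row["id"], clean_name(row["name"]), clean_age(row["age"]))
        if record in seen:  # duplicates: keep only the first occurrence
            continue
        seen.add(record)
        cleaned.append({"id": record[0], "name": record[1], "age": record[2]})
    return cleaned


cleaned = clean(rows)
for row in cleaned:
    print(row)
```

Each problem gets its own small function, which makes it easy to add, remove, or swap a technique as you discover new kinds of dirt. Turning unreadable values into None rather than guessing a replacement is a design choice for this sketch; your data may call for something else.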
Step 2 is broken down into eight parts.
STEP 3: RINSE AND REPEAT
Once the data is cleaned, repeat steps 1 and 2.
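The repeat loop can be sketched in a few lines of plain Python. The function names (`find_issues`, `scrub`, `clean_until_stable`), the two checks, and the `max_passes` limit are all hypothetical choices for this example.

```python
def find_issues(values):
    """Step 1: return the kinds of problems still present."""
    issues = []
    if any(v != v.strip() for v in values):
        issues.append("whitespace")
    if len(set(values)) != len(values):
        issues.append("duplicates")
    return issues


def scrub(values, issue):
    """Step 2: apply the fix that matches one kind of problem."""
    if issue == "whitespace":
        return [v.strip() for v in values]
    if issue == "duplicates":
        return list(dict.fromkeys(values))  # keeps first occurrence, in order
    return values


def clean_until_stable(values, max_passes=5):
    """Step 3: repeat steps 1 and 2 until no issues remain or the limit is hit."""
    passes = 0
    issues = find_issues(values)
    while issues and passes < max_passes:
        values = scrub(values, issues[0])
        passes += 1
        issues = find_issues(values)
    return values, passes


result, passes = clean_until_stable(["a", " a", "b "])
print(result, passes)
```

In this toy input, the first check finds only stray whitespace. Stripping it turns " a" into "a", which exposes a duplicate that was invisible on the first pass, so a second round is needed.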
This is helpful for three reasons:
As the old machine learning wisdom goes: Garbage in, garbage out...