From the course: Advanced Analytics Engineering: Real-World Practice

Dealing with missing data in SQL

- [Instructor] Missing data is the downfall of quality analysis. It's hard to trust your own insights and conclusions when part of the dataset is missing or incomplete. It also makes datasets harder to join or merge. The data cleaning step can totally derail a project plan if we didn't know data was missing before we started working on the project. All this is to say it's a good idea to improve our skills when it comes to cleaning missing data. Data can be missing, null, or blank for various reasons. Number one, human error: data system design flaws, input errors, errors in Python and SQL code, and bad questionnaire or form design. Number two, system issues: system bugs, broken or failed API calls, or faulty data pipelines. Number three, intentional absence: optional fields or conditional data, and known missing data that can't be retrieved. And number four, latency or delay: data that hasn't been processed or recorded yet, staging lag, and intentional delays for accuracy. Keep in mind, not all…
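
Because a value can be missing as a null or as a blank, it can help to count the two separately before deciding how to clean a column. The following is a minimal sketch, not the instructor's own example: the table name `survey_responses`, the `email` column, and the sample rows are all hypothetical, and it uses an in-memory SQLite database through Python's standard `sqlite3` module so the SQL can be run anywhere.

```python
import sqlite3

# Hypothetical table, column, and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey_responses (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO survey_responses (id, email) VALUES (?, ?)",
    [
        (1, "ana@example.com"),
        (2, None),   # NULL: no value was ever recorded
        (3, ""),     # empty string: a value was recorded, but it is blank
        (4, "   "),  # whitespace only: effectively blank as well
        (5, "li@example.com"),
    ],
)

# Count NULLs and blanks separately. TRIM(NULL) is NULL, so the blank
# check never matches NULL rows and the two counts do not overlap.
query = """
SELECT
    COUNT(*)                                          AS total_rows,
    SUM(CASE WHEN email IS NULL THEN 1 ELSE 0 END)    AS null_emails,
    SUM(CASE WHEN TRIM(email) = '' THEN 1 ELSE 0 END) AS blank_emails
FROM survey_responses
"""
total_rows, null_emails, blank_emails = conn.execute(query).fetchone()
print(total_rows, null_emails, blank_emails)
```

Keeping the two counts apart is a design choice: a null and a blank often come from different causes, so they may call for different fixes.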
