From the course: Data Preparation, Feature Engineering, and Augmentation for AI Models
Data exploration and initial quality assessment
- [Instructor] Data exploration is a critical first step before any modeling or analysis happens. We do it because it helps reveal hidden patterns, relationships, and potential anomalies in our datasets. We use data exploration to identify data quality issues that could undermine our AI model's performance. It is also useful because it can inform both feature engineering and feature selection decisions. The overall benefit is that it helps reduce project risk by exposing problems early in the development cycle. One common data quality issue to watch out for is missing data. We see this as incomplete records, for example, records that have empty fields or null values. We can also have issues with inconsistent formats. For example, some dates might be in year-month-day format while others are…
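The two checks mentioned in the transcript, missing values and inconsistent date formats, can be illustrated with a minimal sketch. This is not the instructor's code: the sample records, the field names (`signup_date`, `email`), the two date patterns, and the helper functions are all hypothetical, chosen only to show the idea using the Python standard library.

```python
import re
from collections import Counter

# Hypothetical sample records; None and "" both stand in for missing values.
records = [
    {"id": 1, "signup_date": "2024-01-15", "email": "a@example.com"},
    {"id": 2, "signup_date": "01/16/2024", "email": ""},
    {"id": 3, "signup_date": None, "email": "c@example.com"},
    {"id": 4, "signup_date": "2024-02-01", "email": None},
]

# Assumed date patterns for illustration; real data may need more.
DATE_PATTERNS = {
    "year-month-day": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "slash-separated": re.compile(r"^\d{2}/\d{2}/\d{4}$"),
}


def is_missing(value):
    """Treat None and empty or whitespace-only strings as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def missing_counts(rows):
    """Count missing values per field across all rows."""
    counts = Counter()
    for row in rows:
        for field, value in row.items():
            if is_missing(value):
                counts[field] += 1
    return dict(counts)


def date_format_counts(rows, field):
    """Count how many non-missing values match each known date pattern."""
    counts = Counter()
    for row in rows:
        value = row.get(field)
        if is_missing(value):
            continue
        label = next(
            (name for name, pattern in DATE_PATTERNS.items() if pattern.match(value)),
            "unrecognized",
        )
        counts[label] += 1
    return dict(counts)


missing = missing_counts(records)
formats = date_format_counts(records, "signup_date")
print("Missing values per field:", missing)
print("Date formats found:", formats)
```

More than one label in the date format counts suggests the column mixes formats and likely needs normalizing before modeling. The slash-separated pattern is deliberately not labeled month-first or day-first, since that cannot be determined from the shape of the string alone.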
Contents
- Data exploration and initial quality assessment (4m 49s)
- Detecting and managing missing data (5m 13s)
- Detecting and managing outliers (3m)
- Challenge: Assess data quality of a dataset (18s)
- Solution: Assess data quality of a dataset (23s)
- Feature engineering: Scaling and normalizing data (4m 47s)
- Feature engineering: Categorical encodings (4m 8s)
- Challenge: Apply feature engineering to a dataset (18s)
- Solution: Apply feature engineering to a dataset (16s)