From the course: Data Quality Testing with Great Expectations
Approaches to debugging and fixing data quality issues - Great Expectations Tutorial
Now that we understand where the data quality problems happen in our taxi data, the next question is how to respond. Not every data quality issue should be handled the same way. And as we've seen in the taxi data example, fixing the source data isn't always possible.

A useful way to think about this is that there are three main response strategies, depending on the root cause and impact of the issue. First, fixing the data or the code. Second, filtering out or quarantining incorrect records. And third, adjusting the data tests.

The first strategy is to fix the data or the transformation code. When a data error is caused by a problem in your ingestion or transformation code, the best option is to fix that problem. This could mean correcting code that parses the data, handling data types correctly, or fixing incorrect joins.

The second strategy is to filter out or quarantine incorrect records. In our case, we saw that we have negative values for the total amount. We can decide to exclude…
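The second strategy above, filtering out or quarantining records with a negative total amount, could be sketched roughly as follows. This is a minimal illustration in plain Python, not the course's own code: the field name `total_amount`, the `trip_id` key, the sample rows, and the helper `split_valid_and_quarantined` are all hypothetical, and treating missing or non-numeric amounts as quarantined is an added assumption.

```python
def split_valid_and_quarantined(records, amount_field="total_amount"):
    """Split records into (valid, quarantined) lists.

    A record is kept as valid only if its amount is a number >= 0.
    Negative amounts go to quarantine; so do missing or non-numeric
    amounts (an assumption made for this sketch).
    """
    valid, quarantined = [], []
    for record in records:
        amount = record.get(amount_field)
        if isinstance(amount, (int, float)) and amount >= 0:
            valid.append(record)
        else:
            quarantined.append(record)
    return valid, quarantined


# Hypothetical sample rows standing in for the taxi data.
trips = [
    {"trip_id": 1, "total_amount": 12.5},
    {"trip_id": 2, "total_amount": -3.0},   # negative: quarantined
    {"trip_id": 3, "total_amount": 0.0},
    {"trip_id": 4, "total_amount": None},   # missing: quarantined
]

valid_trips, quarantined_trips = split_valid_and_quarantined(trips)
print("valid:", [t["trip_id"] for t in valid_trips])
print("quarantined:", [t["trip_id"] for t in quarantined_trips])
```

Keeping the rejected rows in a separate list, rather than dropping them, means they can still be inspected later when looking for the root cause.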
Contents
- Triggering actions with checkpoints (2m 55s)
- Understanding data test failures (2m 32s)
- Root cause analysis of test failures (3m 34s)
- Approaches to debugging and fixing data quality issues (2m 42s)
- Creating fuzzy expectations (2m 12s)
- Updating and deleting Expectations in an Expectation Suite (2m 29s)
- Creating custom Expectations (3m 24s)
- Monitoring ongoing data quality (2m 24s)