From the course: Data Quality Testing with Great Expectations
How GX fits into your data stack - Great Expectations Tutorial
Up until now, we've focused on learning the building blocks of GX: data contexts, expectations, validation definitions, and checkpoints. Now it's time to zoom out a little and talk about how GX actually fits into a real production data environment.

At a high level, GX is not a monitoring tool that sits outside of your data stack. It's a testing framework that runs inside your data pipelines. GX Core executes wherever your data already lives and wherever your pipeline code runs, whether that's in Python scripts, Airflow tasks, Spark jobs, or dbt workflows. This is a key fact to understand: GX doesn't move data, store copies of your data, or replace your pipeline logic. Instead, it validates data in place at specific points in the pipeline. The validation results can then be logged, used to trigger alerts, or used to stop downstream processing when something goes wrong.

So, where exactly should GX run in a production pipeline? The short answer is anywhere you move or transform data. A very…