From the course: Data Quality Testing with Great Expectations


Creating a batch definition

Now that we have a data source and a data asset for the taxi data, let's set up a batch definition so that we can access the data in the asset that we just configured. A batch definition in GX lets you request either all the records from a data asset or a subset based on a filter, for example, a date field.

Let's get back to our Jupyter Notebook to finish setting up our data connection with a batch definition. Just to recap, we've already set up a data source called myDataSource and a data asset called myDataAsset. Now we can create a batch definition using the addBatchDefinition statement like this. If you run this code, you won't see any output, since it's just a variable assignment. The "full table" name is a reserved name that configures the batch for the full table, so that it returns all of the records in your data asset as a single batch.

OK, now for the grand finale. Just to recap what we've done so far: first, we created a data source to connect to the Postgres database…
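
To make the difference between a whole-table batch and a filtered batch more concrete, here is a minimal sketch in plain Python. It is not the GX API and not the code shown in the notebook: the ToyBatchDefinition class, its get_batch method, and the sample taxi rows are all hypothetical, made up purely to illustrate the idea that a batch definition is a named recipe for which records to hand back.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class ToyBatchDefinition:
    """Hypothetical stand-in for a batch definition (not a GX class).

    It holds a name and an optional row filter. With no filter it
    returns every record as one batch; with a filter it returns a subset.
    """

    name: str
    row_filter: Optional[Callable[[Dict], bool]] = None

    def get_batch(self, records: List[Dict]) -> List[Dict]:
        # No filter: the whole "table" comes back as a single batch.
        if self.row_filter is None:
            return list(records)
        # Filter present: only the matching records form the batch.
        return [row for row in records if self.row_filter(row)]


# Made-up rows standing in for the taxi data asset (illustrative only).
taxi_rows = [
    {"trip_id": 1, "pickup_date": date(2024, 1, 1), "fare": 12.5},
    {"trip_id": 2, "pickup_date": date(2024, 1, 1), "fare": 7.0},
    {"trip_id": 3, "pickup_date": date(2024, 1, 2), "fare": 30.25},
]

# Whole-table definition: no filter, so all records are returned.
full_table = ToyBatchDefinition(name="full table")

# Filtered definition: only trips picked up on one date.
one_day = ToyBatchDefinition(
    name="pickups on 2024-01-01",
    row_filter=lambda row: row["pickup_date"] == date(2024, 1, 1),
)

full_batch = full_table.get_batch(taxi_rows)
daily_batch = one_day.get_batch(taxi_rows)

print(len(full_batch), "records in the full-table batch")
print(len(daily_batch), "records in the single-day batch")
```

As in the notebook, creating a definition here produces no output on its own, because it is only a variable assignment; records come back only when a batch is requested from it.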
