BigQuery and DataLab

Ruma Sinha

Published Jul 23, 2021

We can work directly with BigQuery from Jupyter notebook in DataLab. Can do the pip install as !pip install google-cloud-bigquery and we are good to go.

In the Jupyter notebook cell we can run the SQL query as:

%%bigquery --project $PROJECT
SELECT fixed_acidity,volatile_acidity,citric_acid,residual_sugar,chlorides,free_sulfur_dioxide,total_sulfur_dioxide,density,pH,sulphates,alcohol,quality
FROM `<project>.winequality.winequality`
ORDER BY quality DESC

The project is the current GCP project we are working. The table winequality contains the Wine quality data from Kaggle. This table is within the dataset of the same name. Once the query completes the output is displayed similar to pandas dataframe.

We can run a parameterized query as:

PARAMS = {"quality_val": 4}

%%bigquery --project $PROJECT --params $PARAMS
SELECT 
fixed_acidity,volatile_acidity,citric_acid,residual_sugar,chlorides,free_sulfur_dioxide,total_sulfur_dioxide,density,pH,sulphates,alcohol,quality
FROM `<project>.winequality.winequality`
WHERE quality = @quality_val
LIMIT 10

During the query runtime the value 4 is passed to the parameter quality_val and the query result comprise of data where quality = 4

We can directly load the query result into pandas dataframe with " %%bigquery df"

%%bigquery df --project $PROJECT
SELECT 
fixed_acidity,volatile_acidity,citric_acid,residual_sugar,chlorides,free_sulfur_dioxide,total_sulfur_dioxide,density,pH,sulphates,alcohol,quality
FROM `<project>.winequality.winequality`
ORDER BY quality DESC

df.describe()

The describe function of the pandas dataframe return the mean,max,min etc for all the numerical columns.

The boxplot of the columns density and quality as

df.boxplot('density','quality')

To view or add a comment, sign in

BigQuery and DataLab

Ruma Sinha

More articles by Ruma Sinha

Explore content categories

More articles by Ruma Sinha

BigqueryML and Logistic Regression

Loading data with csv file in Google Bigquery

Explore content categories