Databricks: Connecting to Azure SQL Database and loading the data into Azure datalake gen1

Hi everyone,

I will be writing series of Databricks topics in which I would like to share some connection mechanism. Here, I will be discussing about how you can connect to Azure SQL database. In later article, I will discuss about connection with Azure blob storage, Azure data lake Gen 1 and Gen 2 using python.

In this article, I am going to cover

  1. How to set up your jdbc connection for connecting Azure SQL?
  2. How to mount your Datalake gen1?
  3. How to load data from Azure SQL to Datalake gen1?
  4. How to connect your Power BI to Datalake for building the visualization?

In order to connect with Azure SQL, you need to enter the below details in Databricks notebook

No alt text provided for this image

JdbcHostName will be your Azure SQL server

JdbcPort will be 1433

JdbcDatabase will be your Azure SQL Database.

In Properties you need to pass credentials to login into Azure SQL server.

Note:

1.You can get Server details and DB details in Azure portal where you have created your azure resources.

2. Writing the credentials in the notebook is not a good practice. You can use Secrets and scopes instead.

Now, We need to set up our Azure data lake gen1. To do this we need to mount our Azure data lake store gen1 so that we can use this as our destination to write the output of our Azure SQL DB.

No alt text provided for this image

Note:

  1. Once you register an app in app registration, you will get ClientID and Directory ID.
No alt text provided for this image

2. You need to give access to this app and once you generate a key you can use that key in the config file above in credentials

Now, We are good to push the data from Azure SQL database to Datalake gen1. To do this we need to execute the below script in databricks notebook.

No alt text provided for this image

3. Now we have our sales data into our Azure datalake gen1.

No alt text provided for this image

We are going to add one more step here, We will try to connect this source data to Power BI. To do this Search For Azure data lake gen1 connector inside Get Data in your Power BI desktop.

No alt text provided for this image

Click on connect and Provide the URL of your datalake gen1

No alt text provided for this image

Click on Ok and now you are done, you can search for your container and start cleaning the data inside your Power query editor and load it to Power BI desktop.

No alt text provided for this image

Thanks for reading it!!

Amit


To view or add a comment, sign in

More articles by Amit Kumar

Explore content categories