Architecting Data Migration: Efficiently Loading Data from Oracle (On-Premises) to Snowflake

Loading Data from Oracle (On-Premises) to Snowflake: A Step-by-Step Guide

As a Snowflake Architect, the process of loading data from an on-premises Oracle database into Snowflake typically involves several stages: data extraction, file staging, and data loading. This guide outlines the key steps and provides a technical framework for moving data from an Oracle database (for example, a customer table with five columns) into a Snowflake environment.

1. Preliminary Setup and Assumptions

Prerequisites:

  • Oracle Database: Ensure that the Oracle database is accessible, with valid credentials and network permissions.
  • Snowflake Account: Valid Snowflake credentials (user, password, account URL, role, warehouse, and database).
  • ODBC/JDBC Drivers: Ensure the correct Oracle ODBC/JDBC drivers are installed if using any intermediate tools or scripts for data extraction.


2. Create a Stage in Snowflake

Before loading the data into Snowflake, the data needs to be staged in a temporary location such as S3 (for external stages) or an internal Snowflake stage. The stage provides a location where data is stored temporarily before loading it into the destination table.

Example: Create an S3 External Stage

sql

CREATE STAGE my_stage
  URL = 's3://my-bucket-name/my-path/'
  CREDENTIALS = (AWS_KEY_ID = '<AWS_KEY>' AWS_SECRET_KEY = '<AWS_SECRET>');

Alternatively, for an internal stage, you can omit the URL and use Snowflake's internal storage.
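As a minimal sketch, a named internal stage with an attached CSV file format might look like this (the stage name my_internal_stage is illustrative; in production, an external stage is typically defined with a storage integration rather than embedded credentials):

sql

-- Named internal stage; files are held in Snowflake-managed storage
CREATE STAGE my_internal_stage
  FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"');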

3. Extract Data from Oracle

The data extraction step typically involves exporting data from the Oracle database into a file format that Snowflake can read, such as CSV or Parquet. This is often done using tools like Oracle SQL*Plus, SQLcl, or ETL tools.

Using SQL*Plus to Export Data to CSV (SQL*Plus 12.2 or later supports CSV markup):

sql

SET MARKUP CSV ON QUOTE ON
SET FEEDBACK OFF
SET ECHO OFF
SPOOL /path/to/your/output_file.csv
SELECT * FROM customer;
SPOOL OFF;
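If SQLcl is available instead of classic SQL*Plus, its SQLFORMAT setting gives a similar result; a minimal sketch (the output path is a placeholder):

sql

-- SQLcl: render query results as CSV and spool them to a file
SET SQLFORMAT csv
SPOOL /path/to/your/output_file.csv
SELECT * FROM customer;
SPOOL OFF;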

Alternatively, if using a more sophisticated ETL tool such as Oracle Data Integrator (ODI), Apache NiFi, or Informatica, configure the tool to extract data from the customer table into the desired file format and upload it directly to the staging area.

4. Upload Data to Snowflake Stage

After extracting the data to a file, the next step is to upload it to the staging area in S3 or Snowflake's internal stage. This can be done using AWS CLI or Snowflake's Web UI.

For instance, using AWS CLI to upload to S3:

aws s3 cp /path/to/your/output_file.csv s3://my-bucket-name/my-path/

If you're using an internal stage, you can upload the file directly into Snowflake's internal storage via the Snowflake UI or the PUT command, as shown below.
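A minimal PUT sketch, assuming the internal stage created earlier (PUT is run from SnowSQL or another client that supports file transfers; the local path is a placeholder):

sql

-- Upload the local CSV to the internal stage; AUTO_COMPRESS gzips it during upload
PUT file:///path/to/your/output_file.csv @my_internal_stage AUTO_COMPRESS = TRUE;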


5. Define the Target Table in Snowflake

The target table in Snowflake must mirror the structure of the Oracle source table to ensure the data loads correctly.

Example: Create the customer table in Snowflake:

sql

CREATE TABLE customer (
    customer_id  INT,
    first_name   STRING,
    last_name    STRING,
    email        STRING,
    phone_number STRING
);

Ensure that the data types in Snowflake are aligned with the source table in Oracle (for example, STRING or VARCHAR in Snowflake for Oracle VARCHAR2, and NUMBER in Snowflake for Oracle NUMBER).


6. Load Data from the Stage into Snowflake Table

After the data has been uploaded to the staging area, the next step is to load the data into the target table. Snowflake uses the COPY INTO command to efficiently load large volumes of data from external or internal stages into the Snowflake table.

Example: Loading CSV Data Using COPY INTO

sql

COPY INTO customer
  FROM @my_stage/output_file.csv
  FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1)
  ON_ERROR = 'CONTINUE';  -- Optional: skips records with errors

  • FILE_FORMAT: Specifies the format of the file being loaded. Here it is a CSV file with an optional field enclosure (") to handle embedded delimiters and special characters, and SKIP_HEADER = 1 to skip the header row written during the export.
  • ON_ERROR: Optional parameter to handle loading errors. This can be set to CONTINUE, SKIP_FILE, or ABORT_STATEMENT depending on your error tolerance.

If you are loading a Parquet or JSON file, you would adjust the FILE_FORMAT parameter accordingly.
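As a hedged sketch for Parquet, assuming the staged file is named output_file.parquet and its column names match the target table:

sql

-- Parquet load: map Parquet column names onto the target table's columns
COPY INTO customer
  FROM @my_stage/output_file.parquet
  FILE_FORMAT = (TYPE = 'PARQUET')
  MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;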


7. Data Verification in Snowflake

Once the data is loaded into Snowflake, it’s essential to verify the load and perform basic checks:

sql

SELECT * FROM customer;

You can run additional validation or transformation queries to ensure that the data integrity has been maintained during the transfer.
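For instance, two simple checks: compare the row count against the Oracle source, and inspect the load result via the COPY_HISTORY table function (the one-day window below is illustrative):

sql

-- Row count to compare against SELECT COUNT(*) FROM customer in Oracle
SELECT COUNT(*) AS loaded_rows FROM customer;

-- Per-file load status, including parsed vs. loaded row counts and any errors
SELECT file_name, row_count, row_parsed, error_count, status
FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
       TABLE_NAME => 'CUSTOMER',
       START_TIME => DATEADD('day', -1, CURRENT_TIMESTAMP())));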


8. Post-Processing (Optional)

Post-processing can include tasks such as:

  • Automating the Data Load: Use Snowflake Streams and Tasks to automate data ingestion or transformations on a schedule or in near real time (a scheduled-Task sketch follows this list).
  • Clean-Up: If the data was loaded from an external stage (e.g., S3), you may wish to clean up the files after the load to save storage costs.
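As a sketch, a scheduled Task that periodically re-runs the COPY (the task name, warehouse, and schedule are illustrative assumptions):

sql

-- Re-run the load every hour; COPY INTO skips files it has already loaded
CREATE TASK load_customer_task
  WAREHOUSE = my_wh
  SCHEDULE = '60 MINUTE'
AS
  COPY INTO customer
    FROM @my_stage
    FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1)
    ON_ERROR = 'CONTINUE';

-- Tasks are created suspended; resume to start the schedule
ALTER TASK load_customer_task RESUME;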

Note that dropping a stage removes the stage object but not the files in the external S3 location; those can be deleted separately (for example, with the AWS CLI) or automatically by adding PURGE = TRUE to the COPY INTO command.

Example for dropping the stage once the load is complete:

sql

DROP STAGE my_stage;


Summary of Steps:

  1. Create a Stage: Set up an external or internal stage in Snowflake.
  2. Extract Data from Oracle: Use tools like SQL*Plus or an ETL tool to export the Oracle data to a flat file.
  3. Upload Data: Upload the extracted data file to Snowflake’s stage (either internal or external storage).
  4. Create Table in Snowflake: Define a schema that mirrors the Oracle table structure in Snowflake.
  5. Load Data: Use the COPY INTO command to load data from the stage into the Snowflake table.
  6. Verify Load: Run validation queries to ensure data integrity.
  7. Post-Processing: Optionally automate and clean up.


This process ensures the efficient transfer of data from Oracle (on-premises) to Snowflake with minimal risk of data loss or corruption, leveraging Snowflake’s powerful capabilities for scalable and cost-efficient data loading.

Let me know if you'd like further details on any of the steps!
