SQL Server Data Warehouse

SQL Server Data Warehouse

If you have a data warehouse on Microsoft SQL Server on-prem, and your ETL is SSIS on-prem, and you have SSAS cube and Power BI for reporting, how do you migrate that to the cloud? Would you "lift and shift" or build something new?

To migrate to cloud you need to move that on-prem SQL Server warehouse into Azure SQL. Use Azure SQL Migration Extension in Azure Data Studio: https://learn.microsoft.com/en-us/data-migration/sql-server/database/guide. Then migrate your SSIS packages to Azure Data Factory using this: https://learn.microsoft.com/en-us/sql/integration-services/lift-shift/ssis-azure-lift-shift-ssis-packages-overview?view=sql-server-ver16. Update your Power BI connection to point to the new warehouse in Azure SQL Database.

Unfortunately Azure Analysis Services (AAS) only supports tabular models, not multi dimensional cube. So if you want to use SSAS you will need to rebuild your SSAS cube using tabular model in AAS. The only option for multi dimensional cubes on Azure is to run SSAS in a virtual machine but I would not recommend this as it would paint yourself into a corner as Microsoft no longer support multi dimensional cubes.

That is if you are concern about the cost and the time it will take to migrate. The above option is the cheapest and shortest time line for you to migrate your on prem SQL Server data warehouse to Azure. But if it was me I would not do that. I would do the below instead, taking this opportunity to modernise the warehousing in my company, if the budget allow it.

In order to keep up with the era we are in now I would not migrate the on prem warehouse, but rebuild it from scratch using data warehouse "cloud native solution". I would choose Snowflake compared to Databricks because it is simpler and it is a database (whereas Spark is RDD, i.e. distributed dataset/files). I would choose Snowflake compared to other columnar like Big Query and Redshift because the support, trend and future prospect is better, particularly in AI.

This is a wild guess but you will need some costs so I'll try to guess. I would certainly make mistake because it involves numbers. When it involves numbers everyone make mistakes. The migration option to Azure above will probably take about 6 months from planning to completion. This include learning and trying things out in Dev before doing it for real in Prod. And it includes regression testing to make sure that the new functions the same way as the old. This assumes you will be doing it yourself i.e. 2 engineers from your current team. I'll take the "cost of employment" as £100k (salary, bonus, holiday, insurance, pension, tax, training, management, admin, etc.) so 2 developers for 6 months is £100k. The cost of Azure SQL and ADF during 6 months development is low, but let's budget that as £30k. Software, training, books, etc. let's say £20k. So the cost of migration is £150k. And then the cost of running it (usually calculated as annual figure) including production support team (say half FTE = Full Time Employee, £50k), development support (say half FTE, £50k), Azure cost (say £50k). So annual running cost = £150k.

If you rebuild in Snowflake and dbt, it would take much longer, probably 2 years, including learning. Using a team of 4 (1 BA, 2 engineers, 1 testers) plus a PM. So 5 FTEs x £100k x 2 years = £1m. The software cost (cloud cost, licence fee, storage cost) during 2 years development say £200k. So development cost is £1.2m whereas operation cost (annually) let's say similar to the above: half FTE for prod support (£50k), half FTE for dev support (£50k), plus Snowflake + Azure cost (say £200k/year) = £300k/year.

The cost of rebuilding is so high that the company probably would not be able to afford it. Companies like this, who still have a data warehouse on on-prem SQL Server using SSIS, have missed the boat 10 times (in the last 10 years). It is very unlikely that they can afford that £1.2m migration cost. It is much more likely that the company would choose the £150k "lift and shift" option which sadly wouldn't get them anywhere. Yes the SSIS will be "running" in Azure, but it's still SSIS packages, not ADF data flows. But if they don't set aside that budget in the next 5 years to migrate, they will be trapped into the past and will not be able to get out forever. And sadly, when the company's IT is "dead" (quote unquote), then the company itself would be "dead" too (closing down).

Anyway. I hope this gives enough illustration on what happens to companies which 15 years ago were considered in good position with regards to data datawarehousing (SQL Server on prem warehouse with SSIS) but sadly today is regarded as "backwards" or "at the back of the pack" in terms of technology. The data warehousing era has moved on to modern era of cloud technology.

Apologies if I feel passionate about this, I came from SQL Server / Microsoft BI background. Back in 2005 I was doing exactly that: T-SQL, SSIS, SSAS, SSRS. On prem. Indexing, partitioning, performance tuning. MDX. Yup MDX. Just look at the SSAS section on my blog here. And I know many companies are still on prem. I would love to hear your opinion on this. How to migrate SQL Server data warehouse to cloud. Whether you have a SQL Server DW or not. Your ideas and opinion might help our colleagues who are in those companies. For example, should the recode those SSIS packages into ADF data flows?

Disclaimer: what I wrote above is my own opinion. Not the view of my employer or my clients. Or previous employers/clients. And they are my opinion, not facts, which is always subjective. In particular the numbers: all the numbers above are estimates, made within only a minute time span, and not much research. So it's possible that they are inaccurate or incorrect. So please take it with a pinch of salt.

Keep learning!

List of my articles on Linkedin: https://www.garudax.id/pulse/list-all-my-articles-vincent-rainardi-eohge/

My blog on data platform and data science: https://dwbi1.wordpress.com/

I have to agree with the author. Our BI setup is based on on-prem SQL with SSIS and Power BI. Even though we now have Fabric capacity, moving everything to Fabric doesn’t really make sense at this point — there aren’t any major benefits for us. One possible path could be replicating the "Prod" databases into Fabric lakehouses using pipelines/notebooks, but that just adds extra steps and more complexity to the overall process. On-prem SQL will still remain main DWH anyway.

I could easily see Fabric as a game changer in this equation. But I guess in the end, are you doing a big bang migration, og do you shift your data sequentially to the new platform.

A pretty straight forward architecture. Thank you for sharing!

Like
Reply

This is by far the simplest and cost friendly form of implementing a data warehouse

ETL data collect/ELT/Orchestrator ---------------------------------------- SSIS offered with SQL server on premise was popular for a long time in the years 2000-2010 Do we know why Azure data factory did not reach same success ? Ingestions not so good performances ? Features wealth missing ? Great Github integrated source control but .. "tricky" CICD via Azure DEVOPS ? Regular innovations missing ? Cost ? .. DWH ------ Up to 2017 there was a DWH PAAS edition (MPP/OLAP) on Azure SQL Database familly

To view or add a comment, sign in

More articles by Vincent Rainardi

  • Unstructured Data - From Conversational Files to Conversational Analytics

    For decades analytics is about tables, numbers and relational databases. It is about structured data, as we call it.

    3 Comments
  • Business Analyst

    Before I was a data architect, I was a data engineer. And before I was a data engineer, I was a business analyst.

    1 Comment
  • CDO and CIO: What's the difference?

    So CIO is Chief Data Officer. And CDO is Chief Data Officer.

  • Snowflake dbt Projects

    How does Snowflake dbt projects look like? It looks like this: Snowflake dbt Projects and Cortex Code On the left you…

  • Stupid Questions

    There is NO such thing as a stupid question. Why? Because asking questions is a good way to get knowledge.

  • The Science of (Data) Migration

    Say you have a data warehouse in SQL Server or Oracle, and you need to migrate it to Snowflake or Databricks. The…

    1 Comment
  • Cortex Search

    Cortex is the AI capability in Snowflake. Of all the Cortex features, Cortex Search is probably the least well known.

  • AI-ready data: what does it mean?

    JI am a practical person and when I hear people talking “fluffy cloud” words like “AI-ready data” I always try find out…

  • Interval Data Type

    We all know a data type called Date. And Time.

  • Row Timestamp

    In Snowflake, the Row Timestamp is a column that stores when each row was last updated. It’s a brand new feature, went…

Others also viewed

Explore content categories