SQL Server Data Warehouse
If you have a data warehouse on Microsoft SQL Server on-prem, and your ETL is SSIS on-prem, and you have SSAS cube and Power BI for reporting, how do you migrate that to the cloud? Would you "lift and shift" or build something new?
To migrate to cloud you need to move that on-prem SQL Server warehouse into Azure SQL. Use Azure SQL Migration Extension in Azure Data Studio: https://learn.microsoft.com/en-us/data-migration/sql-server/database/guide. Then migrate your SSIS packages to Azure Data Factory using this: https://learn.microsoft.com/en-us/sql/integration-services/lift-shift/ssis-azure-lift-shift-ssis-packages-overview?view=sql-server-ver16. Update your Power BI connection to point to the new warehouse in Azure SQL Database.
Unfortunately Azure Analysis Services (AAS) only supports tabular models, not multi dimensional cube. So if you want to use SSAS you will need to rebuild your SSAS cube using tabular model in AAS. The only option for multi dimensional cubes on Azure is to run SSAS in a virtual machine but I would not recommend this as it would paint yourself into a corner as Microsoft no longer support multi dimensional cubes.
That is if you are concern about the cost and the time it will take to migrate. The above option is the cheapest and shortest time line for you to migrate your on prem SQL Server data warehouse to Azure. But if it was me I would not do that. I would do the below instead, taking this opportunity to modernise the warehousing in my company, if the budget allow it.
In order to keep up with the era we are in now I would not migrate the on prem warehouse, but rebuild it from scratch using data warehouse "cloud native solution". I would choose Snowflake compared to Databricks because it is simpler and it is a database (whereas Spark is RDD, i.e. distributed dataset/files). I would choose Snowflake compared to other columnar like Big Query and Redshift because the support, trend and future prospect is better, particularly in AI.
This is a wild guess but you will need some costs so I'll try to guess. I would certainly make mistake because it involves numbers. When it involves numbers everyone make mistakes. The migration option to Azure above will probably take about 6 months from planning to completion. This include learning and trying things out in Dev before doing it for real in Prod. And it includes regression testing to make sure that the new functions the same way as the old. This assumes you will be doing it yourself i.e. 2 engineers from your current team. I'll take the "cost of employment" as £100k (salary, bonus, holiday, insurance, pension, tax, training, management, admin, etc.) so 2 developers for 6 months is £100k. The cost of Azure SQL and ADF during 6 months development is low, but let's budget that as £30k. Software, training, books, etc. let's say £20k. So the cost of migration is £150k. And then the cost of running it (usually calculated as annual figure) including production support team (say half FTE = Full Time Employee, £50k), development support (say half FTE, £50k), Azure cost (say £50k). So annual running cost = £150k.
If you rebuild in Snowflake and dbt, it would take much longer, probably 2 years, including learning. Using a team of 4 (1 BA, 2 engineers, 1 testers) plus a PM. So 5 FTEs x £100k x 2 years = £1m. The software cost (cloud cost, licence fee, storage cost) during 2 years development say £200k. So development cost is £1.2m whereas operation cost (annually) let's say similar to the above: half FTE for prod support (£50k), half FTE for dev support (£50k), plus Snowflake + Azure cost (say £200k/year) = £300k/year.
Recommended by LinkedIn
The cost of rebuilding is so high that the company probably would not be able to afford it. Companies like this, who still have a data warehouse on on-prem SQL Server using SSIS, have missed the boat 10 times (in the last 10 years). It is very unlikely that they can afford that £1.2m migration cost. It is much more likely that the company would choose the £150k "lift and shift" option which sadly wouldn't get them anywhere. Yes the SSIS will be "running" in Azure, but it's still SSIS packages, not ADF data flows. But if they don't set aside that budget in the next 5 years to migrate, they will be trapped into the past and will not be able to get out forever. And sadly, when the company's IT is "dead" (quote unquote), then the company itself would be "dead" too (closing down).
Anyway. I hope this gives enough illustration on what happens to companies which 15 years ago were considered in good position with regards to data datawarehousing (SQL Server on prem warehouse with SSIS) but sadly today is regarded as "backwards" or "at the back of the pack" in terms of technology. The data warehousing era has moved on to modern era of cloud technology.
Apologies if I feel passionate about this, I came from SQL Server / Microsoft BI background. Back in 2005 I was doing exactly that: T-SQL, SSIS, SSAS, SSRS. On prem. Indexing, partitioning, performance tuning. MDX. Yup MDX. Just look at the SSAS section on my blog here. And I know many companies are still on prem. I would love to hear your opinion on this. How to migrate SQL Server data warehouse to cloud. Whether you have a SQL Server DW or not. Your ideas and opinion might help our colleagues who are in those companies. For example, should the recode those SSIS packages into ADF data flows?
Disclaimer: what I wrote above is my own opinion. Not the view of my employer or my clients. Or previous employers/clients. And they are my opinion, not facts, which is always subjective. In particular the numbers: all the numbers above are estimates, made within only a minute time span, and not much research. So it's possible that they are inaccurate or incorrect. So please take it with a pinch of salt.
Keep learning!
List of my articles on Linkedin: https://www.garudax.id/pulse/list-all-my-articles-vincent-rainardi-eohge/
My blog on data platform and data science: https://dwbi1.wordpress.com/
I have to agree with the author. Our BI setup is based on on-prem SQL with SSIS and Power BI. Even though we now have Fabric capacity, moving everything to Fabric doesn’t really make sense at this point — there aren’t any major benefits for us. One possible path could be replicating the "Prod" databases into Fabric lakehouses using pipelines/notebooks, but that just adds extra steps and more complexity to the overall process. On-prem SQL will still remain main DWH anyway.
I could easily see Fabric as a game changer in this equation. But I guess in the end, are you doing a big bang migration, og do you shift your data sequentially to the new platform.
A pretty straight forward architecture. Thank you for sharing!
This is by far the simplest and cost friendly form of implementing a data warehouse
ETL data collect/ELT/Orchestrator ---------------------------------------- SSIS offered with SQL server on premise was popular for a long time in the years 2000-2010 Do we know why Azure data factory did not reach same success ? Ingestions not so good performances ? Features wealth missing ? Great Github integrated source control but .. "tricky" CICD via Azure DEVOPS ? Regular innovations missing ? Cost ? .. DWH ------ Up to 2017 there was a DWH PAAS edition (MPP/OLAP) on Azure SQL Database familly