Data Warehousing - The Road Ahead

As little as 10 years ago, the main database options for deploying a Data Warehouse were provided by Oracle, Microsoft and IBM, using hardware installed on-premises with an expensive and inflexible license cost. Fast forward to today, and the market for Big Data and Analytics has exploded, with a massive wave of innovation and disruption including Hadoop, Data Lakes, Real Time Analytics, and a wholesale migration of IT systems to The Cloud.

The Data Warehouse industry has not remained immune to the upheaval. According to Datanyze, Oracle, the previous market leader, doesn’t even make the top 10 in terms of new sales and market share. In addition, three of the market leaders (Amazon Redshift, Google BigQuery and Snowflake) are dedicated cloud-based data warehouse solutions, all released in the last ten years.

My research indicates that cloud-based analytic platforms come with compelling benefits of cost control and extreme agility, with the ability to provision a warehouse within minutes. In addition, some solutions can scale up (or down) within seconds, and produce a fully working multi-terabyte database copy within minutes.

To address this subject area, I’ve written an eBook, Comparison of Cloud Data Warehouse Platforms, which is free to download. In this 26-page document, I explore how the selection criteria for an analytics database have changed, and compare solutions from a number of vendors against the ideal.

Options evaluated include Oracle, Microsoft, Amazon, Snowflake and Hadoop, and I think you’ll be genuinely surprised by the findings.

You’ll need to provide an email address to download the eBook, but it does mean you’ll receive further insightful articles on data warehouse technology directly in your mailbox, and rest assured, you won’t be spammed with rubbish.

A Comparison of Cloud Data Warehouse Platforms

Thank You

Thanks for reading this article. You can view more articles on Big Data, Cloud Computing, Database Architecture and the future of data warehousing on my web site www.Analytics.Today.

Hi John, I'm joining this party late. Your article makes it seem as if only your highlighted vendor can provide the cloudy goods. Not true! Teradata offers the same industry-leading capabilities via AWS, Azure, and Teradata Cloud as what customers have harnessed for years in their own data centers -- with all the cloud benefits you tout. Teradata's strategy is unique in that 100% software consistency and easy license portability address the reality -- as confirmed by Gartner, Forrester, Constellation, and others -- that most companies more than 10 years old will need a hybrid solution (on-prem + cloud) in order to leverage previous investments and massive on-premises data gravity. Respectfully, the "all-or-nothing" public-cloud-only approach of your favored vendor is just not feasible for most firms from either an economic or risk management perspective. With Teradata, cloud customers enjoy multiple elasticity options (Scale Up/Down/Out/In + Stop/Start) to boost compute when needed then dial it back (or off) to optimize spend. And, while some vendors’ lack of indexing and workload management may simplify life for users with basic or non-production requirements, most enterprises today want the ability to optimize performance and manage query SLAs deterministically rather than settle for best effort. We'd be happy to talk with you and others to demonstrate what's possible with analytics in the cloud using Teradata. Start at https://www.teradata.com/Products/Cloud

Cheers, Brian

I would add that Teradata has not been standing still on the Cloud front (AWS, Azure, Teradata Cloud, private cloud). Snowflake is included in this study, yet Teradata has so many more advantages: 1) A SQL Engine with analytic functions: 4D Analytics, nPath, Sessionisation, Attribution. 2) It can handle advanced data types: hybrid row/column, geospatial, JSON/BSON, temporal... 3) It has parallel integration with external data and analytic engines. 4) It has specialised Machine Learning and Graph Engines. 5) Advanced resource and workload monitoring and management, and performance optimisation... and so on. Check out https://www.teradata.co.uk/Resources/Videos/High-Performance-Analytics-in-the-Cloud

Richard.

Jon makes a valid point here. IBM has consistently innovated in the database market and continues to do so with its data warehouse offerings in the cloud, on-prem or hybrid. Check out this link: https://www.ibm.com/uk-en/marketplace/db2-warehouse-on-cloud

It seems that IBM Db2 Warehouse on Cloud (Db2Woc) has been overlooked and not included in this study, maybe because it is labelled old school. The truth is far from this narrow perception. New databases born in the cloud and built for the cloud may have perceived cloud deployment models, but do they really scale? Do they have the functionality, and do they allow easy migration from on-premises databases? And with a loosely coupled model of compute, storage and services, do they perform at large data volumes? Will a cloud database with caching and cloning really perform well? Imagine moving that data around, and the cost. Maybe, or maybe not. I cannot imagine taking my on-prem relational database, unloading it to files, putting it into AWS S3 storage via a Snowball, and then trying to rewrite my whole database functionality to run my database. Really? A database with a long history, adapted for the cloud, is more robust, feature-rich and performant; any migration is just an upgrade, and there have been many years to work out any issues. A heritage and experience of database architecture also allows easier, quicker and risk-free migration from on-premises structures to a HYBRID database management system that includes cloud databases. Some cloud databases do not even support features that are present in today's on-prem solutions; stored procedures are one that springs to mind. Detailed statistical and analytics functions are others. Some of these reasons may be behind the comment about a full migration to public cloud only taking a decade. A far better way is to move some workloads to the cloud, whilst others might go to private cloud and some stay on-prem, but all can talk to each other and exchange data via a common SQL engine (write once, deploy many) and also federate to other databases and data sources. Data virtualisation is better than data warehouse virtualisation!

Nice article with concise information on currently trending cloud-based DWH solutions. My view is that although Snowflake seems promising for now, we cannot discount Hadoop technology, as it is primarily being used for on-premises data lake implementations and has the potential to replace BI needs with the analytics offerings that are possible with Hadoop, along with the data security assurance that an on-premises solution brings.
