MPP comes to SQL Server

In the past, to process large amounts of data in SQL Server, you had to use an appliance called ADW (Analytics Data Warehouse), also commonly known as PDW (Parallel Data Warehouse). ADW is not just a special version of SQL Server but a whole appliance, including CPUs, memory and storage. It was very expensive, and because of the cost it wasn't used all that much. Expensive as it is, it is also very powerful, and the reason is that it uses MPP, or Massively Parallel Processing. MPP divides the computing work over multiple processing nodes, each working on its own slice of highly partitioned data.
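To make the MPP idea concrete, here is a minimal toy sketch in Python. It is not how the appliance is implemented: the function names, the sample `sales` rows and the use of threads as stand-ins for compute nodes are all assumptions made for illustration. It shows rows being hash-partitioned on a key, each "node" aggregating only its own partition, and a coordinator merging the partial results.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def distribute(rows, node_count):
    """Hash-partition rows on customer_id so each node owns a disjoint slice."""
    nodes = [[] for _ in range(node_count)]
    for customer_id, amount in rows:
        nodes[customer_id % node_count].append((customer_id, amount))
    return nodes


def local_sum(partition):
    """Work done independently on one node: total amount per customer."""
    totals = defaultdict(int)
    for customer_id, amount in partition:
        totals[customer_id] += amount
    return dict(totals)


def mpp_total_by_customer(rows, node_count=4):
    """Coordinator: fan the work out to the nodes, then merge the partial results."""
    partitions = distribute(rows, node_count)
    # Threads are only a stand-in for separate compute nodes (an assumption).
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(local_sum, partitions))
    merged = {}
    for partial in partials:
        # Keys never collide across nodes because of the hash partitioning.
        merged.update(partial)
    return merged


# Hypothetical sample data: (customer_id, amount)
sales = [(1, 10), (2, 5), (1, 7), (3, 2), (6, 4), (2, 1)]
result = mpp_total_by_customer(sales)
```

Because every row for a given customer lands on the same node, no node needs to see another node's data to finish its part of the query, which is what lets this style of processing scale by adding nodes.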

You can run the same sorts of workloads that ADW handles without the limitations and expense that come with the appliance. The way to do this is to enable Polybase in SQL Server 2016. While you will need the Enterprise Edition of SQL Server to do this, it is much cheaper and easier than you probably imagine. At its core, Polybase is a SQL Server implementation of Hive over HDFS (Hadoop). If you are familiar with Hadoop, you know that its power lies in its distributed file system and in MapReduce running over multiple processing nodes, which gives you processing power similar to MPP. If you are familiar with Hive, you know it provides a SQL interface that generates MapReduce jobs for your Hadoop cluster. Now imagine combining HDFS, MapReduce and T-SQL. That is exactly what Polybase does in SQL Server.
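As a rough illustration of what "a SQL interface over MapReduce" means, the toy Python sketch below mimics the phases a query such as `SELECT region, COUNT(*) ... GROUP BY region` could be broken into. It is not Hive or Polybase code: the `blocks` data, the function names and the comma-separated record layout are hypothetical, chosen only to show the map, shuffle and reduce steps.

```python
from collections import defaultdict


def map_phase(block):
    """Mapper: emit (region, 1) for every record in one file block."""
    return [(line.split(",")[0], 1) for line in block]


def shuffle(mapped_outputs):
    """Shuffle: bring all values emitted for the same key together."""
    grouped = defaultdict(list)
    for output in mapped_outputs:
        for key, value in output:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reducer: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical file blocks, each a list of "region,amount" records
# as they might be spread across a distributed file system.
blocks = [
    ["east,100", "west,250"],
    ["east,75", "north,30"],
    ["west,60", "east,20"],
]

# Each block is mapped independently, then the results are grouped and reduced.
counts = reduce_phase(shuffle(map_phase(block) for block in blocks))
```

Each call to `map_phase` only touches its own block, so in a real cluster the mappers can run on whichever nodes hold the data, and only the much smaller grouped output has to move across the network.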

With Polybase, you get a massively scalable and powerful MPP engine for your data analytics needs in a familiar, easy-to-use SQL Server implementation. If you need more power, you can just add more nodes to your cluster. If you need the benefits of relational technology, they are there too. Just think of how much you can accomplish when processing tons of data for your data warehousing and analytics needs. The possibilities are endless.

If you would like to know more about Polybase and how to architect a great powerful analytics solution, please feel free to contact me.  

MS has been trying to conjure up a credible MPP story since the acquisition of DATAllegro about a decade ago. Close. No cigar.

With the embedded R engine in SQL Server 2016 and the enhancements to in-memory analytics, this solution could be effective for mission-critical analytics that require more security controls.

The appliance you are referring to is APS (Analytics Platform System), which uses Massively Parallel Processing, running SQL Server technology on each of its compute nodes to retrieve data from storage nodes. APS is based on relational data storage, on which technologies like columnstore are implemented. The introduction of Polybase to APS allowed a single T-SQL query to process data stored in HDFS (the Hadoop file system) in combination with the relational engine. In SQL Server 2016, Polybase was introduced to the SQL Server SMP engine to work against HDFS in a similar fashion to Polybase on APS. It's a little misleading to say that MPP comes to SQL Server, as the processing takes place on the HDFS cluster.

Article by Jeff Johnson