Computational Loads - Serverless Architecture

Introduction

Today, there is a huge buzz in the developer community around serverless architectures. In general, serverless architectures are all about breaking your application business logic down into functions that run on a Platform as a Service (PaaS) infrastructure. There are no physical servers, and no Virtual Machines. You simply pay a “per function call” fee to execute the logic you have. This business model can be SUPER beneficial in terms of cost: you only pay for what you use (no fixed monthly fees), and the costs are incredibly affordable (often very small fractions of a penny). Depending on your application, this can add up to real cost savings. It also enables new computational designs to be built that parallelize load at an incredibly affordable price.

Azure Functions

Microsoft is currently on its next generation of Azure Functions, called “V2”. V1 was a paradigm shift for Microsoft in that it offered a “consumption model” that allowed a developer to spin up functions as needed. The pricing model for Azure Functions is based on a “per call” fee (currently $0.20 per million executions) and a memory/time fee (currently $0.000016/GB-s). Even better, the first million calls are free, and the first 400,000 GB-s are free as well. Based on our benchmarking, I can’t begin to express how inexpensive this model is relative to traditional server, Virtual Machine, or even App Service pricing (which is also an option for Azure Functions).

Pricing explained

For most developers, the first component of pricing is pretty straightforward: dollars per call. If I call a function 1 million times, I pay $0.20. Easy. However, the second consumption metric is GB-s, or gigabyte-seconds. This one is a bit harder to estimate. Having done some recent benchmarking, I believe there is an easy way to understand it. At its most basic level, GB-s equates to “how hard is the function to execute”. I think most developers would agree that organizations should pay more for functions that consume more compute cycles and take longer to execute. GB-s is how Microsoft Azure handles this.

Every function call a developer makes consumes memory. The amount of memory a function consumes depends on how much data is passed to the function and what the function does with that data in memory to process it. For most scenarios, the data passed is minimal, and the function's memory usage and execution time are minimal (measured in milliseconds). If you are using Azure Functions as a middle tier for your application (passing data to/from your database), your GB-s cost is going to be quite nominal. However, if you are writing Azure Functions that are computationally intense (passing in huge arrays of data as a parameter), you will have a relatively high GB-s cost.
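To make the math concrete, here is a minimal sketch that estimates a monthly bill from the two consumption meters described above. The rates and free grants are the figures quoted earlier; the workload numbers in the example call are purely hypothetical.

```python
# Rough Azure Functions consumption-plan cost estimate.
# Rates and free grants are the figures quoted in this article;
# the workload numbers in the example call are purely hypothetical.

EXECUTION_RATE = 0.20 / 1_000_000   # $ per execution
GB_SECOND_RATE = 0.000016           # $ per GB-s
FREE_EXECUTIONS = 1_000_000         # free executions per month
FREE_GB_SECONDS = 400_000           # free GB-s per month

def monthly_cost(executions, avg_memory_gb, avg_duration_s):
    """Estimate a month's consumption-plan cost for one function."""
    gb_seconds = executions * avg_memory_gb * avg_duration_s
    billable_executions = max(0, executions - FREE_EXECUTIONS)
    billable_gb_seconds = max(0, gb_seconds - FREE_GB_SECONDS)
    return (billable_executions * EXECUTION_RATE
            + billable_gb_seconds * GB_SECOND_RATE)

# Example: 5 million calls averaging 128 MB of memory and 200 ms each.
print(f"${monthly_cost(5_000_000, 0.125, 0.2):.2f}")  # roughly $0.80
```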

Dealing with LOTS of data

Note that today, the largest amount of memory an Azure Function can consume is 1.5 GB, which is a HUGE amount of data. Azure Functions consume memory in 128 MB increments, up to the 1.5 GB maximum. There are 2 ways to load data into memory in an Azure Function. The first is to pass the data as a parameter (in an array) to the function. The second approach (and I think the better approach for computing lots of data) is to load data from some sort of storage into the function itself. There are 3 strategies you can use for bringing large amounts of data into an Azure Function (a minimal sketch of the first strategy follows the list):

1. JSON blob (or I suppose CSV or XML if you had to) loaded directly from Azure Blob storage into an in-memory array.

2. Cosmos DB – load the JSON data from a Cosmos DB database.

3. SQL Server or some other relational database – load the data from a stored procedure of some sort.
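As an illustration of the first strategy, here is a minimal sketch (in Python, one of the languages Azure Functions supports) of an HTTP-triggered function that loads a JSON blob into an in-memory list and runs a placeholder calculation over it. The app setting, container, and blob names are hypothetical.

```python
import json
import os

import azure.functions as func
from azure.storage.blob import BlobClient


def main(req: func.HttpRequest) -> func.HttpResponse:
    # Hypothetical app setting holding the storage account connection string.
    conn_str = os.environ["DATA_STORAGE_CONNECTION"]

    # Hypothetical container and blob containing the data set to process.
    blob = BlobClient.from_connection_string(
        conn_str, container_name="computation-data", blob_name="dataset.json")

    # Pull the entire JSON document into an in-memory array.
    records = json.loads(blob.download_blob().readall())

    # Placeholder computation: sum one field across all records.
    total = sum(r.get("value", 0) for r in records)

    return func.HttpResponse(
        json.dumps({"count": len(records), "total": total}),
        mimetype="application/json")
```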

In a computationally intense Azure Function, the primary issue is “how do I keep cost to a minimum?”. There are a few approaches to take here. First, your Azure Functions are not going to be your big cost item. Getting the data into the function is where you will have the financial challenge. The reason stems from the nature of a computational architecture, which typically follows a “fan-out/fan-in” pattern:

[Diagram: fan-out/fan-in pattern – a pre-processor fans work out to many parallel Azure Functions, and a post-processor collects the results.]
In this case, the pre-processor breaks the data up into chunks, and then passes the parameters for each chunk to an individual Azure Function. Because these chunks can be processed in parallel, each individual function makes its own call to the database to fetch its own chunk of data to process. Often, this data is loaded into an in-memory array for calculations/comparisons. This parallelizes the load and allows the execution to complete in a fraction of the time it would take if a single function call processed all the work.
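Here is a minimal sketch of that fan-out step, assuming the pre-processor enqueues one message per chunk to an Azure Storage queue that triggers the worker function. The queue name, app setting, and chunk size are hypothetical, and depending on the Functions host configuration the messages may need to be Base64-encoded.

```python
import json
import os

from azure.storage.queue import QueueClient

CHUNK_SIZE = 1000  # hypothetical number of record IDs handed to each worker


def fan_out(record_ids):
    """Split the work into chunks and enqueue one message per chunk."""
    queue = QueueClient.from_connection_string(
        os.environ["DATA_STORAGE_CONNECTION"], queue_name="compute-chunks")

    for start in range(0, len(record_ids), CHUNK_SIZE):
        chunk = record_ids[start:start + CHUNK_SIZE]
        # Each queue message triggers one worker function instance, which
        # fetches and processes only its own chunk of data.
        queue.send_message(json.dumps({"ids": chunk}))
```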

The post-processor takes the calculated results from the functions and decides what to do with that data. Results can either be written back to the database, or some result can simply be passed to the calling application. Often, the post-processor can also be used to push data to a Big Data repository (e.g., HDInsight or a Spark cluster) for further analysis down the road. The possible applications for this type of architecture are practically limitless.

How to minimize database costs

So, getting back to database cost, the issue is how you inexpensively pull large amounts of computational data into Azure Functions. Clearly, pulling data from a blob is the least expensive way to go. Simply pull the data from the blob into an in-memory array, and then cycle through the data. This works really well IF your data is already formatted in the sort order you want, and you don’t need to filter the data in any complex way. Even worse, if your data is constantly being appended to from other sources, blobs quickly become a problem.

So, if you choose to pull data from a database, you have a choice of using a document database such as Cosmos DB, or a relational database such as SQL Server. In either case, parallelized queries to the database will cause your database connections and utilization to spike in a big way. In the case of Cosmos DB, I will have to consume lots of RUs (Request Units) to handle huge numbers of incoming connections and data load. In the case of SQL Server, I will need to up my DTU (Database Transaction Unit) subscription to very high levels to deal with numerous parallelized queries. In either case, the database quickly becomes the performance and financial bottleneck.

So what tricks can I use to minimize this? The first thing to remember is that computational problems typically don’t require sub-second response times. A user who is asking for hundreds of thousands or even millions of computations is willing to wait a reasonable amount of time. The inherent problem with a fan-out architecture is the potential for MASSIVE performance spikes if you get lots of users coming in and making big computational requests.

Architectural tricks

The first trick to reducing database cost is to stretch requests out over time. An analogy is how our electric power grid works. If everyone turns on their air conditioner at the same time on a hot day, the grid experiences a demand spike. Utilities deal with this by using smart meters to automatically cycle power off to electricity-guzzling appliances in people’s homes, and by installing large batteries to help manage demand loads. Bottom line, utilities look for ways to architect their grids for “average load” instead of “peak load”.

The way we do this in Azure is simple: we place a queuing mechanism in front of the system to give us the ability to serialize incoming requests. In Azure, a great solution is Azure Service Bus. This allows us to have a Request Coordinator application that reads messages off the queue and hands a capacity-constrained number of them off to pre-processors. For example, you may only allow 5 requests to be processed at a time. If all 5 requests are actively being processed, the 6th request has to wait until one of the first 5 finishes processing. The 6th request’s user is shown a “waiting for processing availability” message and an estimated time before their request will be handled. If you know your scale loads, this is easy to anticipate. The 6th user simply waits their turn, and load is deferred until later, protecting the system and avoiding spikes.
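A simplified, sequential sketch of that Request Coordinator, assuming the azure-servicebus Python SDK: the queue name and connection setting are hypothetical, handle_request stands in for the hand-off to a pre-processor, and MAX_ACTIVE plays the role of the “only 5 at a time” cap.

```python
import os

from azure.servicebus import ServiceBusClient

MAX_ACTIVE = 5  # capacity cap: hand off at most this many requests at once


def handle_request(body: bytes) -> None:
    """Hypothetical hand-off of one request to a pre-processor."""
    ...


def run_coordinator() -> None:
    client = ServiceBusClient.from_connection_string(
        os.environ["SERVICE_BUS_CONNECTION"])

    with client, client.get_queue_receiver(queue_name="compute-requests") as receiver:
        while True:
            # Pull at most MAX_ACTIVE requests; anything beyond that simply
            # waits its turn in the Service Bus queue instead of spiking load.
            batch = receiver.receive_messages(
                max_message_count=MAX_ACTIVE, max_wait_time=30)
            for message in batch:
                handle_request(b"".join(message.body))
                receiver.complete_message(message)
```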

The second trick is to switch database subscription levels dynamically, on the fly. For example, Microsoft (through PowerShell) provides the ability to change the database subscription level on the fly, and it takes approximately 30 seconds for Azure to apply the change. This can be handled in 2 ways. If you know your load peak happens at 9am, simply schedule your database to switch to a higher subscription level at that time, and lower it when the peak load passes. The other option is to “stair-step” the subscription based on a spike. For example, if 10 users are waiting (with 5 active pre-processors), simply up the subscription level to handle those additional 10, and 30 seconds later those extra 10 will start processing. After the queue is reduced, you can dynamically lower the subscription again. At night, you can run the database at the lowest possible subscription level to save significant money.
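PowerShell is one way to do this; as an alternative sketch, Azure SQL Database also accepts a T-SQL ALTER DATABASE statement to change the service objective, which the Request Coordinator could issue directly. The connection setting, database name, and target tier below are hypothetical.

```python
import os

import pyodbc


def set_service_objective(database: str, objective: str) -> None:
    """Ask Azure SQL to move a database to a new service objective (e.g. 'S3').

    The change is asynchronous; Azure applies it shortly after the request.
    """
    # Hypothetical connection string pointing at the logical server's master database.
    conn = pyodbc.connect(os.environ["SQL_MASTER_CONNECTION"], autocommit=True)
    with conn:
        conn.execute(
            f"ALTER DATABASE [{database}] MODIFY (SERVICE_OBJECTIVE = '{objective}');")


# Scale up ahead of the morning peak, then back down once it passes.
set_service_objective("computedb", "S3")
```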

Other technologies considered

There are another 2 Microsoft Azure technologies we evaluated: Azure Event Hubs and Azure Service Fabric. We decided against Azure Service Fabric due to the cost of implementation and management. It required spinning up lots of infrastructure (Virtual Machines), and was quite costly relative to Functions. Service Fabric Mesh is a PaaS alternative that is currently in preview, but it was not available in our timeline (prices are still not published).

We spent a great deal of time evaluating Azure Event Hubs. The concept behind Event Hubs is that it watches a stream of data, looks for patterns, and spits out results that match. Conceptually, it is like watching cars go down a highway and counting the number of “red cars” that go by. It was a very interesting approach, but we decided against it due to one significant limitation: it has a 256 KB message size limit. Our design pushed 256 KB JSON messages (via batch) into its ingestion engine. Although performance was fast, it wasn’t fast enough compared to Azure Functions for loading and calculating large data sets. Basically, the cost of the overhead of feeding it exceeded the benefits of its speed. It was also approximately 100X more expensive than Azure Functions, and ultimately much slower for a computationally intense solution.

The architecture we selected

Ultimately, we ended up going with SQL Server, even though we have become HUGE fans of Cosmos DB. Our issue was that users were constantly uploading new data to us, and we had the business requirement of being a “source of record”. This meant data was being constantly appended, and we had no control over exactly how customers gave us the data. As a result, we needed the ability to compare lots of records and only append the records that were genuinely new. SQL Server does an AMAZING job of handling complex queries quickly and returning large amounts of data fast. And because we didn’t have huge datasets over time, we were well within SQL Server’s overall storage size limits.
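As a sketch of that “append only new records” step, assuming each incoming record carries a natural key and lands in a staging table first; the table and column names are hypothetical, and the T-SQL is issued from Python with pyodbc.

```python
import pyodbc

# Insert only the staged records whose natural key is not already present.
# Table and column names are hypothetical placeholders.
APPEND_NEW_ROWS = """
INSERT INTO dbo.Records (RecordKey, Payload, ReceivedAt)
SELECT s.RecordKey, s.Payload, s.ReceivedAt
FROM dbo.StagedRecords AS s
WHERE NOT EXISTS (
    SELECT 1 FROM dbo.Records AS r WHERE r.RecordKey = s.RecordKey
);
"""


def append_new_records(conn_str: str) -> int:
    """Append genuinely new rows and return how many were added."""
    with pyodbc.connect(conn_str) as conn:
        cursor = conn.execute(APPEND_NEW_ROWS)
        conn.commit()
        return cursor.rowcount
```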

Document databases such as Cosmos DB and Azure Table storage simply do not do a good job of returning huge amounts of data at a time. Cosmos DB has a 2 MB document size limit, and it requires the application to use a continuation token to keep pulling results back in 2 MB chunks. This is time consuming and difficult to manage. It is also quite expensive, as each 2 MB chunk returned is another call to the database, consuming lots of expensive RUs. SQL Server ended up being the best choice, as it is fastest for what we needed and allows us to modify DTU subscription levels at will to minimize cost.

Why share this article?

Mike Graber is a senior Microsoft Azure Architect with significant development experience in both applications and databases. Please reach out to me any time if you are interested in hiring us for assistance with your cloud project!

Michael.Graber@ceappdev.com

