Demand Scaling - the art of matching compute instances to the demand workload

Stuart Bean

Published Jan 14, 2021

Most applications do not run all the time at 100% load, with high demand and maximum transaction throughput. Real-life demand ebbs and flows: it is sometimes high, but mostly low, with some periods of medium demand.

Aligning the amount of compute infrastructure with that real-life demand is the art of demand scaling (a form of intraday scheduling of compute need). In a fully automated and “cloud mature” approach, applications would be built to be “aware” of demand levels and able to scale up, that is, increase server instances, to cater for the increased workload, and likewise scale down when demand diminishes.

In cloud computing this is achieved by auto scaling - the use of cloud formations to define the ideal low instance count (say 1 or 2 front-line transaction servers behind a load balancer), together with settings that define what to monitor (CPU utilisation, say) and the load levels at which to increase the server count (CPU above 70%, say) or decrease it (CPU below 30%, say).
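The threshold rule described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not any cloud provider's API: the function name, the one-server step size and the upper limit of 10 servers are assumptions, while the 70% and 30% CPU thresholds and the low value of 1 server are the example figures used in this article.

```python
def desired_instance_count(current, cpu_percent,
                           min_count=1, max_count=10,
                           scale_up_above=70.0, scale_down_below=30.0):
    """Return the server count a simple threshold rule would ask for.

    Hypothetical helper: adds one server when CPU is above the upper
    threshold, removes one when CPU is below the lower threshold, and
    otherwise leaves the count unchanged. The result is always kept
    within [min_count, max_count].
    """
    if cpu_percent > scale_up_above:
        target = current + 1      # load is high: scale up by one server
    elif cpu_percent < scale_down_below:
        target = current - 1      # load is low: scale down by one server
    else:
        target = current          # load is in the comfortable band
    return max(min_count, min(max_count, target))


if __name__ == "__main__":
    print(desired_instance_count(current=2, cpu_percent=85.0))
    print(desired_instance_count(current=2, cpu_percent=20.0))
```

A real auto scaling service would normally also apply a cool-down period between changes so the server count does not flap when load hovers around a threshold.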

Using cloud formations in this way automates the application's “auto scaling” against demand. But the same can also be achieved for legacy applications (those unable to perform real-time monitoring and load analysis) by way of scripts. After human analysis of daily demand profiles, for example, scripts could be written that increase server counts at pre-determined times (say from 1 server to 3 servers at the 8am start of day, and increasing further around 11am, ready for the midday peak load). In this way the application's total compute can be scaled up to match peak demand periods and reduced for low-demand periods (like overnight).
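A scripted schedule of this kind might look like the following Python sketch. The 8am step to 3 servers and the 11am increase come from the example above; the count of 5 servers at 11am, the 6pm scale-down and all names are illustrative assumptions. The function only works out the target count, and the call that actually starts or stops servers is left out because it depends on the platform.

```python
from datetime import time

# (start time, server count), sorted by start time.
# Assumptions: the 11:00 count of 5 and the 18:00 scale-down are
# illustrative; only the 08:00 step from 1 to 3 servers is from the text.
SCHEDULE = [
    (time(0, 0), 1),    # overnight: minimum footprint
    (time(8, 0), 3),    # start of business day
    (time(11, 0), 5),   # ready for the midday peak
    (time(18, 0), 1),   # end of business day
]


def scheduled_server_count(now, schedule=SCHEDULE):
    """Return the server count of the latest schedule entry at or before `now`."""
    count = schedule[0][1]
    for start, servers in schedule:
        if now >= start:
            count = servers
    return count


if __name__ == "__main__":
    for t in (time(7, 59), time(8, 0), time(12, 30), time(23, 0)):
        print(t, scheduled_server_count(t))
```

A script like this could be run every few minutes by a job scheduler, comparing the scheduled count with the number of servers actually running and adjusting the difference.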

Applying this strategy ensures that cloud consumption, which is charged while compute instances are running, is optimised: cost savings come from running the maximum number of servers only when demand is high. Large savings can be achieved by scaling from 1 server up to a peak of 10 servers across the business working week and working hours, whether that is Monday to Friday or Monday to Sunday for a full 7-day business need, instead of running the peak 10 servers 24/7.
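To give a feel for the size of the saving, the Python sketch below compares weekly server-hours for a scaled profile against 10 servers running around the clock. The hourly profile is entirely hypothetical (1 server overnight and at weekends, 3 servers from 8am, 10 servers from 11am to 6pm on weekdays), and it assumes every server-hour is billed at the same rate, so the saving in server-hours equals the saving in cost.

```python
HOURS_PER_WEEK = 24 * 7
PEAK_SERVERS = 10


def weekly_server_hours(weekday_profile, weekend_profile, working_days=5):
    """Total server-hours in a week, given 24 hourly server counts per day type."""
    if len(weekday_profile) != 24 or len(weekend_profile) != 24:
        raise ValueError("each profile needs exactly 24 hourly values")
    rest_days = 7 - working_days
    return working_days * sum(weekday_profile) + rest_days * sum(weekend_profile)


# Hypothetical weekday: 00-08 1 server, 08-11 3 servers,
# 11-18 peak of 10 servers, 18-24 back to 1 server.
WEEKDAY = [1] * 8 + [3] * 3 + [PEAK_SERVERS] * 7 + [1] * 6
# Hypothetical weekend: 1 server all day.
WEEKEND = [1] * 24

scaled = weekly_server_hours(WEEKDAY, WEEKEND)
always_on = PEAK_SERVERS * HOURS_PER_WEEK
saving = 1 - scaled / always_on

if __name__ == "__main__":
    print(f"scaled: {scaled} server-hours, always on: {always_on} server-hours")
    print(f"saving: {saving:.1%}")
```

With these assumed figures the scaled profile uses 513 server-hours a week against 1,680 for the always-on fleet, a reduction of roughly 69%. A 7-day business would save less, since the weekend would follow the weekday profile.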
