Perishable Computing Cycles
Walking through the grocery store last night, I noticed the produce manager was not around. Usually I see someone in the produce section pulling old produce and restocking fresh items for the day's activities.
I thought: "Why can't they just discount all the produce expiring in the next 1-3 days, rather than paying for the labor to pull it plus the time and cost of disposal?" We are all struggling to eat healthier, and fresh produce is one of the items whose costs quickly add up for a family. Dynamic pricing management systems (intelligent, policy-driven software algorithms) are the answer being used today. They are otherwise known as yield management systems, which have a long, proven history in the hotel, airline, and taxi industries: as the time window to sell a perishable unit shrinks, its price decays, and the system's job is to maximize utilization of the resource for the asset owner/producer.
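The produce analogy maps directly onto a price-decay curve. As a minimal sketch of the idea (the linear decay, the 72-hour window, and the 30% price floor are all illustrative assumptions of mine, not any real store's or supplier's policy):

```python
def decayed_price(list_price: float, hours_to_expiry: float,
                  decay_window: float = 72.0, floor_frac: float = 0.3) -> float:
    """Discount linearly toward a price floor as expiry approaches.

    Illustrative policy: full price outside the decay window, then a
    straight-line ramp down to floor_frac of list price at expiry.
    """
    if hours_to_expiry >= decay_window:
        return list_price
    frac = max(hours_to_expiry, 0.0) / decay_window  # 1.0 = fresh, 0.0 = expired
    return list_price * (floor_frac + (1.0 - floor_frac) * frac)
```

The same curve applies whether the perishable unit is a head of lettuce, an airline seat, or an idle core-hour: the closer the expiry, the deeper the discount, but never below the floor that still beats throwing the unit away.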
In 2014 I was tasked with researching and compiling data on how many excess compute cycles typical F500 and tech-heavy companies had within their own CapEx IT infrastructure. Then I had to locate a large pool of compute cycles for a customer in BioIT services, and learned that the cloud suppliers also had quite a bit of excess capacity. So I ran a few numbers and figured out that the best way to explain the waste and losses to CTOs/CIOs was this grocery-produce dynamic pricing analogy. Every 24 hours, enterprises, organizations, universities, and cloud suppliers have CPU cycles that will not be used, and you cannot get them back tomorrow. You are going to pay for the electricity, licenses, chassis, labor, space, and people regardless, yet utilization rates sit around 55-75% at cloud suppliers and a ghastly 15-25% in enterprise IT computing infrastructure. Meanwhile, everyone wants different tiers of performance or low-cost cloud computing to use in their enterprise, cloud business, or BioIT research.
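The back-of-envelope math behind those conversations is simple. A hypothetical sketch (the dollar figure is made up for illustration; only the 15-25% and 55-75% utilization bands come from the numbers above):

```python
def idle_capex_cost(monthly_spend: float, utilization: float) -> float:
    """Spend attributable to capacity that expired unused this month.

    You pay the full monthly_spend either way; the unused fraction is
    the perished portion.
    """
    return monthly_spend * (1.0 - utilization)

# Hypothetical $1M/month estate:
enterprise_waste = idle_capex_cost(1_000_000, 0.20)  # low-end enterprise rate
cloud_waste = idle_capex_cost(1_000_000, 0.65)       # mid-band cloud supplier
```

At 20% utilization, roughly $800K of that hypothetical million evaporates every month; even a mid-band cloud supplier at 65% lets about $350K expire.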
Perishable is a word that begs context clarification in this sense. There are several distinct conditions:
- Customer-side perishability: kids outgrow the shoes you own even though they are in like-new condition. The owner trashes or donates the item and purchases a new pair that fits.
- Material-side perishability: the lettuce turns brown after a few days and has to be consumed or disposed of within a fixed time frame. The customer buys on a repeating basis, in small unit sizes, many times throughout the year.
- Supply-side perishability: not enough lettuce was available in the period when customers needed to buy it.
- Large-unit-change perishability: a larger-than-expected swing on the buy side or sell side causes problems. Either a run on the new superfood leaves buyers unsatisfied because it is out of stock, or customer flight to a different store leaves the grocery manager with a large amount of unsold inventory for that period.
What's odd is that, in my mind, perishable computing cycles don't really fit any of these categories. You can simply get more RAM/CPU if you outgrow the machine you have today, and more machines if you outgrow the current quantity. Your computing instance does not decay over a given time frame, because suppliers can live-migrate it to new hardware whenever they want, or you can move it wherever you want it to run. With the current state of the top 10 cloud suppliers and private intercloud interoperability, there is a perceived unlimited supply of compute resources for a single customer to tap in real time. If you have a large spike in computing demand, you can autoscale up and down to meet your consumption needs.
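The autoscale-up/autoscale-down loop mentioned above reduces to a simple threshold policy. A minimal sketch (the 80%/30% thresholds and the step size of one instance are illustrative assumptions, not any particular cloud's defaults):

```python
def autoscale(instances: int, cpu_util: float,
              scale_up_at: float = 0.80, scale_down_at: float = 0.30,
              min_instances: int = 1) -> int:
    """Toy threshold autoscaler: add an instance when hot, remove one
    when cold, otherwise hold steady at the current count."""
    if cpu_util > scale_up_at:
        return instances + 1
    if cpu_util < scale_down_at and instances > min_instances:
        return instances - 1
    return instances
```

Real autoscalers add cooldown periods and hysteresis so the fleet does not flap between sizes, but the buyer-side principle is exactly this: capacity tracks demand, so nothing the buyer rents perishes.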
The one acknowledged supply-and-demand principle that scares the other 327 hosting companies and smaller cloud companies is customer flight risk on hourly or on-demand pricing models. Why do you think they hate the hourly pricing and continuous price drops from the big 6 cloud suppliers? There is 100% customer revenue flight risk, yet they are treating their IT CapEx like packaged goods with a 3-year shelf life, expecting customers who want 3-year terms with known quantities, little change, and set buying periods. That is not what customers want or need today. In every case I ran across, it was always a mix.
Computing cycles are perishable like a seat on a plane that day, a bed in a hotel that night, or a rental car for the business traveler. The only thing that matters to the supplier is the overall monthly utilization of the resource at whatever price points you can earn. This led the most savvy cloud suppliers I worked with to increase profits 50%-150% with a 4-tier dynamic pricing diversification model integrated into their ecosystem: Retail (published pricing per hour or month), Bulk Order Retail, Wholesale to Key Partners, and a dynamic Spot price. The reason this has to be done with software is that you have 4 types of compute leases on the buy/demand side to deal with, along with the 4 pricing models from the supply side; then multiply those combinations by the near-infinite combinations of CPU/RAM/storage/network requirements of each OS or application service. Next, do that market analysis every single hour, like an active free-market trading center. (Aside: there are actually 7 types of dynamic pricing as defined here, but I am boiling it down to the ways people are actually buying and offering compute today.)
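To make the 4-tier idea concrete, here is a minimal sketch. The tier discounts and the linear spot formula are assumptions of mine for illustration; a real yield management system would re-price every instance shape every hour against live demand:

```python
# Illustrative $/core-hour price book; only the tier names come from the text.
PRICE_BOOK = {
    "retail": 1.00,       # published per-hour/per-month pricing
    "bulk_retail": 0.85,  # discounted bulk orders
    "wholesale": 0.70,    # key-partner rates
}

def spot_price(utilization: float, retail: float = 1.00,
               floor: float = 0.10) -> float:
    """Toy spot model: price tracks how full the estate is, with a floor
    so idle capacity is still sold for something rather than expiring."""
    return max(floor, retail * utilization)

def quote(tier: str, utilization: float) -> float:
    """Quote a $/core-hour price for a tier; 'spot' is computed live."""
    if tier == "spot":
        return spot_price(utilization)
    return PRICE_BOOK[tier]
```

The fixed tiers lock in predictable revenue from customers who want known quantities, while the spot floor converts otherwise-perishing cycles into marginal revenue, which is exactly the mix the airlines figured out decades ago.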
Computing types of leases:
- Job dependency leases, requiring a single OS or groups of OSes that must run in parallel.
- Best-effort leases, which wait in a queue until resources become available per matched supplier and buyer policy values.
- Advance reservation leases, which must start at a specific time in the future. Some have fixed-duration start/stop dates.
- Immediate leases, which must start right now, or not at all.
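Modeling the four lease types as data is what lets a pricing engine reason about them. A minimal sketch (the field names and the admission verdicts are my own assumptions for illustration):

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

class LeaseType(Enum):
    JOB_DEPENDENCY = auto()       # parallel group of OS instances
    BEST_EFFORT = auto()          # queued until policy-matched capacity frees
    ADVANCE_RESERVATION = auto()  # fixed future start (optionally fixed end)
    IMMEDIATE = auto()            # start now or reject

@dataclass
class Lease:
    kind: LeaseType
    cores: int
    start_hour: Optional[float] = None  # None = start whenever possible

def admit(lease: Lease, free_cores: int) -> str:
    """Toy admission policy dispatching on lease type."""
    if lease.kind is LeaseType.IMMEDIATE:
        return "start" if lease.cores <= free_cores else "reject"
    if lease.kind is LeaseType.ADVANCE_RESERVATION:
        return "reserve"   # capacity is held for start_hour
    return "queue"         # best-effort and dependency groups wait their turn
```

An immediate lease is the only one that can fail outright; the other three give the scheduler room to place, delay, or pre-stage work, which is where the optimizations below come in.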
Next generation use cases for these algorithms can...
- ... explicitly schedule the deployment overhead of virtual machines, instead of having it deducted from a user's allocation. For example, if a lease must start at 2pm, dynamic pricing mgmt systems will schedule the transfer of the necessary VM images to the physical nodes where the virtual machines will be running (and will make sure that the images arrive on time).
- ... leverage the suspend/resume capability of virtual machines to suspend preemptible leases when a higher-priority lease needs resources. They can also leverage cold migration of VMs (migrating a suspended VM to a different machine to resume it there). Live migration scheduling enables continuous chassis placement matching and optimization throughout the hardware lifecycle.
- ... schedule best-effort requests using a First-Come-First-Serve queue with backfilling (aggressive, conservative, or with any number of reservations).
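The FCFS-with-backfilling idea in the last bullet can be sketched in a few lines. This is the aggressive variant stripped to its core; real schedulers also use runtime estimates to guarantee the blocked head job a reservation, which is omitted here:

```python
def backfill_pass(queue: list, free_cores: int):
    """One aggressive-backfilling pass over an FCFS queue.

    Jobs are dicts like {"name": "A", "cores": 8}. Jobs start in arrival
    order when they fit; when the head job is too big, smaller jobs behind
    it may jump ahead into the leftover cores (the 'backfill').
    """
    started, still_waiting = [], []
    remaining = free_cores
    for job in queue:
        if job["cores"] <= remaining:
            started.append(job["name"])
            remaining -= job["cores"]
        else:
            still_waiting.append(job["name"])
    return started, still_waiting
```

With 6 free cores and a queue of an 8-core, a 2-core, and a 4-core job, the two small jobs backfill around the blocked head, so cycles that would otherwise perish while the head waits get sold and used.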
All of this is great, but what compute CapEx owners really need is continuous simulation, data exploration, and data visualization to ensure the decisions stay accurate. Then you need to automate so that the whole process becomes self-optimizing.
Very few CFOs/IT Controllers/VPs of IT Finance are there yet, or even have an initiative in place today for dynamic 4-7 tier pricing schemes, and they all still have to stitch many things together for the full nirvana state to be achieved. I feel it is critical for anyone who runs computing CapEx assets to look at how to get the most utilization out of those assets, to be application-centric in cost analysis so that workloads run on expertly architected, continuously right-sized infrastructure, and to never pay a penny more for a commodity OS instance than you have to. That holds regardless of your view as supplier or buyer of perishable computing cycles, because you will play both roles in the future.
How will you do that in 2015? 2016? How much bigger will we all let AWS get? How long will you accept anything less than 95% utilization of your IT CapEx?
Very interesting read, Brent E. The Internet of Things (IoT) and the constant need for Big Data will drive efficiencies for the nimble and agile companies that want to differentiate themselves in any given vertical. The new paradigm of compute and cost will evolve quickly when customers can qualify, test, and implement the architectures you propose. Simplicity will be key, but the ultimate driver will be the savings on multiple fronts that will get the attention of a CFO who wants to differentiate from the herd and gain a true competitive advantage.
Interesting read. The analogy to airline pricing and yield management is a good one. But sophisticated as that has become, it is an essentially static technology and service. When technology, services and pricing models all evolve continually and feed back into each other this is only going to become more complex. Will be interesting to see if any equilibriums appear.