Surviving the Surge

A Practitioner's Guide to Surviving Massive Launch Events

You're in a football stadium watching your favorite college team and trying to text your friend the location of your seat. "Not delivered". You anxiously await the release of your favorite new online video game and start it the moment it opens. "Servers Unavailable". You try to get insurance coverage under the new USA health plan on the day it opens to the general public. "Web site down". These are all problems of scale: the challenge of using a service at the same moment millions of others are trying to use it too.

Let me first say, this is a very difficult problem. There is no easy solution, and even those of us who have been through massive launch events multiple times still struggle to survive the surge. There are many variables, and they change every time with different people, processes, technologies, and applications.

This is a screenshot of a monitor during a recent game launch.  The surge is obvious.

Below are some strategies to help you survive the surge, all of which I've learned through challenging experiences.

  1. Evolve into a launch. Use public tests or open betas that evolve into the actual released GA product. This may be an uphill battle with marketing teams who prefer big bang launches. While this makes sense when it comes to focused advertising around specific dates, big bangs lead to big problems. Try to create a plan that evolves into the launch product over time.
  2. Open early at a secret time. "Game xyz will be available Friday at 9:00 am". All marketing promotions point to this time and your company just ran a Super Bowl ad to promote it further. Millions of people around the globe will be ready at that exact time. If you're in a scenario like this, my suggestion is to secretly open a little early. This may directly contradict what your marketing team wants. But we all want a smooth launch, and the best way to have one is to avoid a surge. Opening early will usually result in a more gradual traffic ramp, which is exactly what you want to ensure availability. The picture below is an actual screenshot of one of our recent game launches when we secretly opened early.

  3. Segment the user base. In games, we have the ability to separate players by region and also by platform (PC, PlayStation, Xbox). Sometimes we have control over this when working with the platform vendors, other times we do not. Anything you can do to segment the user base and open at different times to a subset of the total population will result in a smoother, more even ramp.

  4. Have a queue system. Building one is not trivial, but a well-built queue lets you control inflow traffic. A good queue system accomplishes two things: 1) it provides a trusted mechanism for you to control user flow into your online service, and 2) it gives end users a useful indicator that they are in a queue and what the anticipated wait time is.

  5. Last resort. If none of the other options are possible... this is a bad place to be but here you are! In this case, over-provision the environment and disable any auto-scaling rules (up and down). I have not seen auto-scaling features keep up with massive spikes in load. Take some additional hardware costs on the chin just until you're past the surge. Then scale things down and re-enable your auto-scaling rules.
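
To make the queue idea in strategy 4 a little more concrete, here is a minimal Python sketch of a waiting room that admits a fixed number of users per tick and reports an estimated wait. The class name `LaunchQueue`, its parameters, and the fixed admission rate are all assumptions made for illustration, not a description of any production system. A real queue would also need persistence, authentication, and protection against queue jumping.

```python
import math
from collections import deque


class LaunchQueue:
    """Hypothetical waiting room that admits a fixed number of users per tick."""

    def __init__(self, admit_per_tick, tick_seconds=1.0):
        if admit_per_tick <= 0:
            raise ValueError("admit_per_tick must be positive")
        self.admit_per_tick = admit_per_tick  # assumed rate the backend can absorb
        self.tick_seconds = tick_seconds      # assumed length of one admission tick
        self._waiting = deque()

    def join(self, user_id):
        """Add a user to the back of the queue and return their 1-based position."""
        self._waiting.append(user_id)
        return len(self._waiting)

    def position(self, user_id):
        """Return the user's current 1-based position (ValueError if not queued)."""
        return self._waiting.index(user_id) + 1

    def estimated_wait_seconds(self, position):
        """Estimate the wait as the number of ticks needed to reach this position."""
        ticks = math.ceil(position / self.admit_per_tick)
        return ticks * self.tick_seconds

    def admit_next(self):
        """Release up to admit_per_tick users from the front of the queue."""
        count = min(self.admit_per_tick, len(self._waiting))
        return [self._waiting.popleft() for _ in range(count)]


queue = LaunchQueue(admit_per_tick=2, tick_seconds=5.0)
for user in ["a", "b", "c", "d", "e"]:
    queue.join(user)

admitted = queue.admit_next()             # the first two users are let in
wait_for_e = queue.estimated_wait_seconds(queue.position("e"))
```

The point of the sketch is that the operator, not the crowd, sets `admit_per_tick`, and the same number drives the wait estimate shown to the user.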

If you are able to evolve your product into a launch, then your life just got a lot easier. If not, segment your user base, secretly open early, over-provision the environment, and have a queue system ready just in case you need it.
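
As a rough illustration of segmenting the user base, the sketch below assigns each region and platform combination its own opening time, spaced a fixed gap apart. The function name `staggered_schedule`, the region codes, and the 30-minute gap are hypothetical choices for the example. In practice the ordering and spacing would depend on what the platform vendors allow and on how much load each cohort brings.

```python
from datetime import datetime, timedelta, timezone
from itertools import product


def staggered_schedule(regions, platforms, first_open, gap):
    """Give every (region, platform) cohort its own opening time, one gap apart."""
    cohorts = product(regions, platforms)
    return {cohort: first_open + i * gap for i, cohort in enumerate(cohorts)}


# Hypothetical launch: two regions, two platforms, cohorts opened 30 minutes apart.
first_open = datetime(2024, 1, 5, 9, 0, tzinfo=timezone.utc)
schedule = staggered_schedule(
    regions=["NA", "EU"],
    platforms=["PC", "Xbox"],
    first_open=first_open,
    gap=timedelta(minutes=30),
)

for (region, platform), opens_at in schedule.items():
    print(f"{region}/{platform} opens at {opens_at.isoformat()}")
```

Each cohort then arrives as its own smaller ramp instead of one combined spike.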

I hope these tips help.









Excellent post, Dave. Thanks for sharing your insights based on EXPERIENCE.


Awesome article Dave! You've got more experience in this topic than pretty much anyone I know so it's great that you're sharing this insight!


