The Power of Queues
Queueing - one of the few English words with five consecutive vowels. Pretty cool, but not nearly as interesting as the impact queues have on building fault-tolerant, scalable applications.
These days many systems are built around APIs. Typically, a client application (or another server) calls an API to request or submit data, and REST APIs remain a very common approach. The problem is not the API itself - an API by itself is not highly coupled. The problem is that direct, synchronous API calls tightly couple the caller to the callee, and tightly coupled architectures just don't scale.
When one service calls another, or an application calls a service, via a direct API, the following challenges come into play.
- The API is not available. This is the main challenge when accessing a system via an API - what do you do when it's down? Meanwhile, you're holding a JSON payload while a million more requests pile up behind it. The app has to save this data locally and retry later, once the API is available again. But by then the backlog is so large that the API probably can't handle the load when it does come back. It never catches up.
- The API is available but returning error codes. A different scenario from the first, but it raises the same challenges: traffic backs up and the system gets overloaded.
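The buffer-and-retry workaround described above can be sketched as exponential backoff with jitter. This is a minimal illustration, not a full solution - `send_fn` is a hypothetical stand-in for whatever API client you're using, and the caller still has to buffer the payload if every attempt fails:

```python
import random
import time

def send_with_backoff(payload, send_fn, max_attempts=5, base_delay=1.0):
    """Retry a flaky API call with exponential backoff and jitter.

    `send_fn` is a hypothetical stand-in for the real API client; it
    should return True on success and False (or raise) on failure.
    """
    for attempt in range(max_attempts):
        try:
            if send_fn(payload):
                return True
        except ConnectionError:
            pass  # treat transport errors like a failed attempt
        if attempt < max_attempts - 1:
            # Sleep 1s, 2s, 4s, ... plus jitter so retries don't stampede
            # the API the moment it comes back.
            time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)
    return False  # caller must buffer the payload and try again later
```

Even with backoff, every caller is still coupled to the API's availability - which is exactly the problem a queue removes.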
Enter the message queue. With a queue between the two services, "Service 1" puts a message on the queue in a "fire and forget" fashion. As long as the message was successfully placed on the queue, "Service 1" goes about its business. Meanwhile, "Service 2" pulls messages from the queue and processes them when it can.
If more instances of "Service 2" need to be spawned to keep up with the workload, that's easy. The result is a truly decoupled, more resilient architecture, and the scalability factor is simply the number of "Service 2" instances you want processing messages from the queue. We can sleep peacefully at night.
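The pattern can be sketched in-process with Python's standard-library `queue` and `threading` modules standing in for a real broker (SQS, RabbitMQ, and so on). The service names and order payloads are illustrative:

```python
import queue
import threading

message_queue = queue.Queue()  # stand-in for a real message broker

def service_1(orders):
    """Producer: fire and forget -- enqueue each order and move on."""
    for order in orders:
        message_queue.put(order)

def service_2(results, lock):
    """Consumer: pull messages and process them at its own pace."""
    while True:
        order = message_queue.get()
        if order is None:              # sentinel: shut this worker down
            message_queue.task_done()
            break
        with lock:                     # guard the shared results list
            results.append(f"processed {order}")
        message_queue.task_done()

def run(num_workers=3):
    """Scale out by changing num_workers -- nothing else has to change."""
    results, lock = [], threading.Lock()
    workers = [threading.Thread(target=service_2, args=(results, lock))
               for _ in range(num_workers)]
    for w in workers:
        w.start()
    service_1([f"order-{i}" for i in range(10)])
    message_queue.join()               # wait until every message is handled
    for _ in workers:                  # one shutdown sentinel per worker
        message_queue.put(None)
    for w in workers:
        w.join()
    return results
```

Note that `service_1` never waits on `service_2` - it only waits on the enqueue. That is the decoupling.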
I know what you're thinking: "What if the queue is down?" A single point of failure at the queue would be a problem, which is why most architects run multiple queues, each in a different availability zone.
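The multi-zone setup implies a failover step on the publishing side: try the primary queue, and if that zone is unreachable, move on to the next. A minimal sketch, assuming each queue client exposes a `put` method that raises `ConnectionError` when its zone is down (the method name is illustrative, not any particular broker's API):

```python
def publish(message, queues):
    """Try each queue endpoint in turn, one per availability zone."""
    for q in queues:
        try:
            q.put(message)
            return True       # landed in the first healthy zone
        except ConnectionError:
            continue          # this zone is down; fail over to the next
    return False              # every zone down: buffer locally or raise
```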
Front-end engineers often struggle with this concept, because a decoupled application doesn't give them an immediate response: "Did the order go through? Who knows? We'd like to tell the user the order went through before we move on."
Building scalable systems requires a different architecture and often a different mindset. A seasoned scalability architect would suggest striving for statelessness and designing the system to compensate for the decoupling. For an online store, for example, collect just what you need to process the order and confirm the order to the user. If there's a problem during fulfillment, you can resolve it behind the scenes without losing the order - or any other orders, for that matter. If there's a problem that requires user intervention, you can notify them via email that their attention is needed to complete the order. Have you ever received such an email from Amazon or another high-scale online vendor? It's a sign that they have a highly scalable, decoupled system - most likely based on queues.
New credit cards are rare compared to the number of orders placed with cards already on file. So it makes sense to validate a new card number synchronously, while the user is still there. Orders using existing cards, though, are handled differently, in the background: there are millions of them, and if something goes wrong (an expired card, say) you can notify the user during fulfillment.
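That split - validate synchronously for the rare case, enqueue for the common one - can be sketched as follows. This is an illustrative example, not a real payment flow: the order dictionary shape is assumed, and the only synchronous check shown is the standard Luhn checksum on the card number:

```python
import queue

fulfillment_queue = queue.Queue()  # stand-in for a real broker

def luhn_valid(number):
    """Luhn checksum -- the standard card-number sanity check."""
    digits = [int(d) for d in str(number)][::-1]
    total = sum(digits[0::2])
    for d in digits[1::2]:
        total += d * 2 - 9 if d * 2 > 9 else d * 2
    return total % 10 == 0

def place_order(order):
    """New card: validate while the user waits. Existing card: just enqueue."""
    if order.get("new_card") and not luhn_valid(order["card_number"]):
        return "invalid card"        # immediate feedback; the user can fix it
    fulfillment_queue.put(order)     # everything else is handled in background
    return "order confirmed"
```

The user gets an answer right away in both paths - but only the rare path does any real work before responding.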
I love queues - they are one of the best tools in the arsenal of any architect focused on scalability.