Timeout bugs in cloud servers and cloud applications

The need to access data quickly is increasing, so effective and efficient infrastructure management is essential. With this rise comes high demand for storage capabilities that support software-defined storage (SDS) and hyper-converged infrastructure (HCI) in server technology.

The term server can refer to a physical machine, a virtual machine, or software. A server's job is to accept and respond to requests made over a network. Hadoop and Cassandra are examples of cloud server systems.

[Image: Hadoop Migration Effort]


With the rise in server applications, the likelihood of timeout bugs has also increased. For example, a connection timeout error occurs when the server is too busy to respond to a request. Failed handshake timeouts are very common in popular games and applications because many users are trying to use the server at the same time.
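As a rough illustration of the busy-server case, the sketch below connects to a local server that never answers and gives up after a fixed wait instead of hanging forever. The names fetch_with_timeout and start_silent_server, and the 0.2 second limit, are assumptions made for this example; they do not come from any of the systems mentioned here.

```python
import socket

def fetch_with_timeout(host, port, timeout_s):
    """Return the first bytes the server sends, or None if it stays silent too long."""
    try:
        # The timeout applies to the connection attempt and to later reads.
        with socket.create_connection((host, port), timeout=timeout_s) as conn:
            return conn.recv(1024)
    except socket.timeout:
        # Report the timeout explicitly instead of blocking indefinitely.
        return None

def start_silent_server():
    """Hypothetical stand-in for an overloaded server: it listens but never replies."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    return server

server = start_silent_server()
host, port = server.getsockname()
result = fetch_with_timeout(host, port, timeout_s=0.2)
server.close()
print("timed out" if result is None else result)
```

Without the timeout argument, the recv call would wait for as long as the server stays silent, which is the kind of hang that is hard to tell apart from a slow but healthy server.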


Timeout bugs are difficult to diagnose because most of them either show no error message or produce misleading ones. Dai et al. (2018) researched timeout bugs in cloud server systems such as these.

T. Dai, J. He, X. Gu and S. Lu, "Understanding Real-World Timeout Problems in Cloud Server Systems," 2018 IEEE International Conference on Cloud Engineering (IC2E), 2018, pp. 1-11, doi: 10.1109/IC2E.2018.00022.

The study highlighted the root causes of timeout bugs:

Misused timeout value, where a timeout variable is misconfigured, ignored, or incorrectly used.

Missing timeout checking, where inter-component communication lacks timeout protection.

Improper timeout handling, where a timeout event is handled by inappropriate retries or aborts.

Unnecessary timeout, where a timeout is used for a function call that does not need timeout protection.

Clock drifting, where timeout problems are caused by unsynchronized clocks between distributed hosts.
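To make two of these root causes more concrete, here is a minimal sketch of one way to guard against missing timeout checks and improper timeout handling: every attempt gets a timeout, retries are bounded, and an overall deadline caps the total wait. The function call_with_deadline, the exception OperationTimedOut, and the flaky_operation stand-in are hypothetical names invented for this example and are not taken from the study. The deadline is measured with time.monotonic(), which is not affected by wall-clock adjustments on the local host; it does not by itself solve clock drift between distributed hosts.

```python
import time

class OperationTimedOut(Exception):
    """Raised when the attempts or the overall time budget run out."""

def call_with_deadline(operation, attempt_timeout_s, max_attempts,
                       overall_deadline_s, backoff_s=0.0):
    # Measure the budget with a monotonic clock, not the wall clock.
    deadline = time.monotonic() + overall_deadline_s
    for attempt in range(1, max_attempts + 1):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # overall budget spent: stop retrying
        try:
            # Never wait longer than what is left of the overall budget.
            return operation(timeout_s=min(attempt_timeout_s, remaining))
        except TimeoutError:
            if attempt < max_attempts:
                # Back off a little longer after each failed attempt.
                pause = min(backoff_s * attempt, max(0.0, deadline - time.monotonic()))
                time.sleep(pause)
    raise OperationTimedOut(
        f"no reply within {max_attempts} attempts or {overall_deadline_s}s"
    )

# Hypothetical flaky operation: times out twice, then succeeds.
calls = []

def flaky_operation(timeout_s):
    calls.append(timeout_s)
    if len(calls) < 3:
        raise TimeoutError("simulated timeout")
    return "ok"

outcome = call_with_deadline(
    flaky_operation, attempt_timeout_s=0.5, max_attempts=5, overall_deadline_s=2.0
)
print(outcome, len(calls))
```

Raising a dedicated exception when the budget runs out also gives the caller a clear error to log, instead of the silent or misleading failures described earlier.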

Real-World Timeout Cases

One example is the loading timeout error in the game Call of Duty Mobile, which is usually the result of a poor internet connection (Wi-Fi or cellular network). While some bugs are minor and temporary, others are long-lasting and game-breaking. Another example is the Amazon DynamoDB service disruption.


More articles by Ellie Kulsuma
