Timeout bugs in cloud servers and cloud applications
The need to access data quickly is increasing, so effective and efficient infrastructure management is essential. With this rise comes high demand for storage capabilities that support software-defined storage (SDS) and hyper-converged infrastructure (HCI) in server technology.
The term server can refer to a physical machine, a virtual machine, or software. The job of a server is to accept and respond to requests made over a network. Hadoop and Cassandra are examples of cloud server systems.
With the rise in server applications, the likelihood of timeout bugs has also increased. For example, a connection timeout error occurs when the server is too busy to respond to a request. Failed handshake timeouts are very common in popular games and applications because many users are trying to use the server at the same time.
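To make the idea concrete, here is a minimal Python sketch of a client that gives up when a server stays silent. The helper names (start_silent_server, fetch_with_timeout) and the local test server are assumptions made for illustration; they do not come from any of the systems mentioned in this article.

```python
import socket
import threading


def start_silent_server():
    """Start a local server that accepts one connection but never replies."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    held = []  # keep the accepted socket alive so the connection stays open

    def accept_once():
        conn, _ = listener.accept()
        held.append(conn)

    threading.Thread(target=accept_once, daemon=True).start()
    return listener, held


def fetch_with_timeout(host, port, timeout_s):
    """Return the server's reply, or None if it does not answer in time."""
    try:
        # The timeout applies to connecting and to each later send/recv call.
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            sock.sendall(b"ping")
            return sock.recv(1024)
    except socket.timeout:
        # Without a timeout, recv() would block here indefinitely.
        return None
```

Calling fetch_with_timeout against the silent server returns None after roughly timeout_s seconds instead of hanging, which gives the caller a chance to report a clear error.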
Timeout bugs are difficult to diagnose because most of them either show no error message at all or produce misleading ones. Dai et al. (2018) researched timeout bugs in cloud server systems such as Hadoop and Cassandra.
The study highlighted the following root causes of timeout bugs:
Misused timeout value, where a timeout variable is misconfigured, ignored, or incorrectly used.
Missing timeout checking, where inter-component communication lacks timeout protection.
Improper timeout handling, where a timeout event is handled by inappropriate retries or aborts.
Unnecessary timeout, where a timeout is used for a function call that does not need timeout protection.
Clock drifting, where timeout problems are caused by unsynchronized clocks between distributed hosts.
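Two of these root causes, missing timeout checking and improper timeout handling, can be illustrated with a short Python sketch. It is only one possible approach, not the method from the study: call_with_deadline, OperationTimedOut, and the parameter names are hypothetical. The sketch passes an explicit timeout to every attempt, bounds the number of retries, and enforces an overall deadline measured with time.monotonic(), which is unaffected by adjustments to the local wall clock (it does not address clock drift between hosts).

```python
import time


class OperationTimedOut(Exception):
    """Raised when all attempts timed out or the overall deadline passed."""


def call_with_deadline(operation, per_try_timeout_s, max_attempts,
                       overall_deadline_s, sleep=time.sleep):
    """Call operation(timeout_s) with bounded retries and a total deadline.

    operation is a hypothetical callable that accepts a timeout in seconds
    and raises TimeoutError when it does not finish in time.
    """
    start = time.monotonic()
    backoff_s = 0.01
    for attempt in range(1, max_attempts + 1):
        remaining = overall_deadline_s - (time.monotonic() - start)
        if remaining <= 0:
            break  # overall deadline exhausted: stop retrying
        try:
            # Never give a single attempt more time than the deadline allows.
            return operation(min(per_try_timeout_s, remaining))
        except TimeoutError:
            if attempt < max_attempts:
                sleep(min(backoff_s, remaining))  # short pause before retrying
                backoff_s *= 2  # exponential backoff between attempts
    raise OperationTimedOut(
        f"gave up after {max_attempts} attempts or {overall_deadline_s}s"
    )
```

The caller gets either a result or one explicit OperationTimedOut error, rather than an endless retry loop or a silent abort.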
Real-World Timeout Cases
One example is the loading timeout error in the game Call of Duty. In Call of Duty Mobile, this error is usually the result of a poor internet connection (Wi-Fi or cellular network). While some bugs are minor and temporary, others are long-lasting and game-breaking. Another example is the Amazon DynamoDB service disruption.