Scalability: A deep-dive
I have been reading about scaling in distributed systems from various sources, and I covered scalability as one part of my last article, "Building Reliable, Scalable and Maintainable Applications".
This article takes a deeper look at scalability. The topics covered are:
What is scalability and when is an application considered scalable?
Scalability refers to an application's capacity to manage an increased workload without compromising its performance. The response time for a user request should remain consistent even when the number of concurrent requests increases, and the app's back-end infrastructure should be able to handle high traffic without added latency or system failure.
What is latency and how to measure it
Latency refers to the time it takes for a system to respond to a user request. A system's scalability is determined by its ability to maintain low latency even with an increased workload.
Latency is measured as the time between a user's action and the system's response. It can be split into two parts: network latency and application latency.
Network latency refers to the time it takes for a data packet to travel from point A to point B in a network. To reduce network latency, businesses use CDNs to deploy servers closer to the end-users in edge locations.
Application latency, on the other hand, is the time it takes for the application to process a user request. To reduce application latency, stress and load tests can be run to identify bottlenecks that slow down the system. Additional measures can also be taken to optimize code, improve hardware and software infrastructure, and implement caching and other techniques.
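As a rough illustration of measuring application latency, the sketch below times repeated calls to a handler and reports the median and 95th percentile. The `handle_request` function is a hypothetical stand-in for real application work, and the sample count is arbitrary. A real load test would also vary the number of concurrent callers.

```python
import statistics
import time


def handle_request(payload: int) -> int:
    """Hypothetical stand-in for application work (not from the article)."""
    return sum(i * i for i in range(payload))


def measure_latency_ms(func, *args) -> float:
    """Return the wall-clock time of one call, in milliseconds."""
    start = time.perf_counter()
    func(*args)
    return (time.perf_counter() - start) * 1000.0


def summarize(samples_ms):
    """Report median, 95th percentile and worst-case latency."""
    ordered = sorted(samples_ms)
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    p95 = statistics.quantiles(ordered, n=100)[94]
    return {"p50": statistics.median(ordered), "p95": p95, "max": ordered[-1]}


samples = [measure_latency_ms(handle_request, 10_000) for _ in range(200)]
report = summarize(samples)
print(report)
```

Percentiles are often more informative than an average here, because a small number of slow requests can hide behind a healthy-looking mean.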
Types of scalability and their pros and cons
Vertical, Horizontal and Cloud Elasticity Scaling
Vertical scaling means adding more power to our server. Let’s say our app is hosted by a server with 16 gigs of RAM. To handle the increased load, we now augment the RAM to 32 gigs. Here, we have vertically scaled the server.
Horizontal scaling involves adding more hardware resources, such as servers or data centers, to an existing resource pool to increase the system's computational power and handle increased traffic. In principle there is no fixed ceiling on how far the pool can grow, and capacity can be adjusted in real time as traffic fluctuates.
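Horizontal scaling only helps if requests are actually spread across the pool. The following is a minimal sketch of round-robin distribution, assuming a hypothetical `RoundRobinPool` class and made-up server names; production load balancers also handle health checks, weights and connection draining.

```python
from collections import Counter


class RoundRobinPool:
    """Hypothetical pool that hands out servers in rotation."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = 0  # index of the next request, counted from zero

    def add_server(self, name):
        """Scale out: add one more server to the pool."""
        self.servers.append(name)

    def route(self):
        """Pick the server for the next request."""
        server = self.servers[self._next % len(self.servers)]
        self._next += 1
        return server


pool = RoundRobinPool(["app-1", "app-2"])
before = Counter(pool.route() for _ in range(6))
print(dict(before))  # {'app-1': 3, 'app-2': 3}

pool.add_server("app-3")  # traffic grew, so a third server joins the pool
after = Counter(pool.route() for _ in range(6))
print(dict(after))  # {'app-1': 2, 'app-2': 2, 'app-3': 2}
```

Adding a server lowers the share of requests each one handles without changing any individual machine, which is the core idea behind scaling out.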
Cloud elastic scaling is the ability of the cloud to dynamically add or remove servers in the hardware resource pool in response to changes in traffic load on a website or application. This allows businesses to expand capacity and then shrink back to the original level on the fly, without having to add or remove servers manually.
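The decision an elastic scaler makes can be sketched with a simple proportional rule: resize the pool so that average utilization moves toward a target. The rule, the 60% target and the server limits below are assumptions chosen for illustration, not any cloud provider's actual algorithm.

```python
def desired_servers(current, utilization_pct, target_pct=60,
                    min_servers=1, max_servers=20):
    """Return the pool size that would bring utilization near the target.

    All parameters are hypothetical; utilization is an integer percentage
    averaged across the current servers.
    """
    if current < 1 or target_pct < 1:
        raise ValueError("current and target_pct must be positive")
    # Ceiling division on integers avoids floating-point rounding surprises.
    wanted = -(-(current * utilization_pct) // target_pct)
    return max(min_servers, min(max_servers, wanted))


print(desired_servers(4, 90))   # 6: scale out under heavy load
print(desired_servers(6, 20))   # 2: scale in when traffic drops
print(desired_servers(10, 150, max_servers=12))  # 12: capped by the limit
```

The minimum and maximum bounds matter in practice: the floor keeps the service available during quiet periods, and the ceiling keeps a traffic spike from turning into an unbounded bill.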
Horizontal scaling pros:
Horizontal scaling cons:
Vertical scaling pros:
Vertical scaling cons:
Determining the right scalability approach for an application
Determining the right scalability approach for your application depends on various factors such as application architecture, resource requirements, traffic patterns, and budget.
If your application has a monolithic architecture and limited resource requirements, vertical scaling can be a good choice. However, if your application has a distributed architecture and requires more resources, horizontal scaling can be a better choice.
If your application follows a microservice architecture, you can also scale only a specific service, based on a ballpark estimate of the load that service has to handle.
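One way to arrive at such a ballpark estimate is Little's Law, which says the average number of requests in flight equals the arrival rate multiplied by the average time each request spends in the system. The sketch below applies it to a single service. The function name, the traffic figures, the per-instance concurrency and the 25% headroom are all hypothetical values for illustration.

```python
def instances_needed(peak_rps, avg_latency_ms, concurrency_per_instance,
                     headroom_pct=25):
    """Estimate instance count for one service using Little's Law.

    in_flight = peak_rps * avg_latency_ms / 1000
    The result adds headroom_pct spare capacity and rounds up.
    """
    numerator = peak_rps * avg_latency_ms * (100 + headroom_pct)
    denominator = 1000 * 100 * concurrency_per_instance
    # Integer ceiling division; always keep at least one instance.
    return max(1, -(-numerator // denominator))


# 500 requests/s at 200 ms each means about 100 requests in flight.
# With 25% headroom and 16 concurrent requests per instance: 125 / 16 -> 8.
print(instances_needed(500, 200, 16))  # 8

# A quieter service: 50 requests/s at 100 ms fits on a single instance.
print(instances_needed(50, 100, 16))   # 1
```

An estimate like this is only a starting point; load testing the service should confirm or correct the assumed latency and per-instance concurrency.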
You should also consider your traffic patterns. If your traffic is consistent, then vertical scaling can be a better choice. However, if your traffic is variable or unpredictable, then horizontal scaling can be a better choice.
Lastly, you should consider your budget. Vertical scaling can be more cost-effective for small applications, while horizontal scaling can be more cost-effective for larger applications.
In summary, to determine the right scalability approach for your application, you should consider factors such as application architecture, resource requirements, traffic patterns, and budget. You may also want to consult with experts in the field for advice and guidance.
Bottlenecks that hurt the scalability of the system
There are several bottlenecks that can affect the scalability of a system, including:
Apart from the reasons mentioned above, there are other bottlenecks, such as insufficient configuration and setup of load balancers, adding business logic to the database, not picking the right database, and poorly performing code.
It's important to identify and address these bottlenecks to ensure the scalability of the system.
Improving and testing the scalability of our application
Reference and credits