Hands-On Enterprise Application Development with Python

The multiple options to scale up

The world of technology provides a number of options for scaling an application to accommodate an ever-increasing user base. Some of these options simply call for adding more hardware resources, while others require the application itself to be built to handle multiple requests internally. Most of the time, these options fall into two major categories: vertical scaling and horizontal scaling.

Let's take a look at both of them and figure out their pros and cons:

  • Vertical scaling: The whole concept of vertical scaling is based on adding more resources to the existing hardware to increase its processing power and accommodate increased concurrency. For example, by adding more processors, we can reduce the time it takes to process an individual request and hence increase the capacity for handling more requests. But vertical scaling has its limits: we can't keep adding more and more resources to the same hardware and expect the application to keep up with an ever-growing number of requests. Every piece of hardware has a defined upper limit on the resources it can accommodate, and once that limit is reached, no further resources can be added, capping how much capacity we can give the application. This problem brings us to the other approach to scalability: horizontal scaling.
  • Horizontal scaling: The concept of horizontal scaling is based on adding more nodes to increase the scalability of the application. By adding more nodes, where every node shares a part of the application's responsibility, we can improve the application's capacity to handle a higher number of concurrent requests. Another possibility is to run multiple instances of the application on a distributed set of hardware nodes behind a load balancer, which distributes incoming requests across the nodes to provide higher capacity (a minimal sketch of such a load balancer follows this list). The advantage of this approach is that, as the number of users rises, we can keep adding nodes to handle the increased load. This is one of the approaches that public cloud providers such as AWS have made famous.
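To make the idea of request distribution concrete, the following is a minimal round-robin load balancer sketched in Python. The backend addresses, the listening port, and the RoundRobinProxy name are hypothetical choices for illustration only; in production you would reach for a dedicated load balancer such as nginx or HAProxy rather than hand-rolling one:

import itertools
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical backend nodes, each running one instance of the application.
BACKENDS = ["http://127.0.0.1:8001", "http://127.0.0.1:8002"]
backend_cycle = itertools.cycle(BACKENDS)

class RoundRobinProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # Pick the next backend in round-robin order and forward the request.
        backend = next(backend_cycle)
        with urllib.request.urlopen(backend + self.path) as upstream:
            body = upstream.read()
        self.send_response(upstream.status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # The balancer listens on port 8000 and spreads incoming GET
    # requests evenly across the backend nodes.
    HTTPServer(("127.0.0.1", 8000), RoundRobinProxy).serve_forever()

Note that adding capacity then amounts to appending more node addresses to BACKENDS, which is exactly the property that makes horizontal scaling attractive.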

Both of these options become available once we have started deploying our application to the production infrastructure. But is there something we can do while building the application so that it maximizes the number of concurrent requests it can handle? Let's explore the answer to this question.