Building for Large-Scale Request Handling
In an enterprise environment, as the number of users grows, so does the number of users who try to access the web application at the same time. This raises an interesting problem: how do we scale the web application to handle a large number of concurrent requests?
A web application can be scaled to handle a large number of users in several ways. One of the simplest is to add more infrastructure and run more instances of the application. However, this technique, though simple, is expensive, since the infrastructure costs of running the application at scale can be huge. We need to craft our application so that it can handle many concurrent requests without requiring frequent infrastructure scaling.
Building on the foundation laid out in the previous chapter, we will see how to apply those techniques to build a scalable application that can handle a large number of concurrent requests. We will also learn a few other techniques that make the application easier to scale.
Over the course of this chapter, we will look at the following techniques for scaling our web application for large-scale request handling:
- Utilizing reverse proxies in web application deployment
- Using thread pools to scale up request processing
- Understanding the concept of single-threaded concurrent code with Python AsyncIO
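
As a rough preview of the last two techniques, the following minimal sketch handles the same batch of simulated requests first with a thread pool and then with single-threaded AsyncIO code. The handler names, the `response-N` strings, and the 0.05-second sleep standing in for I/O work are all hypothetical and chosen only for illustration; a real application would perform database or network calls instead.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request_blocking(request_id):
    """Hypothetical blocking handler; time.sleep stands in for I/O work."""
    time.sleep(0.05)
    return f"response-{request_id}"


async def handle_request_async(request_id):
    """Hypothetical coroutine handler; asyncio.sleep stands in for I/O work."""
    await asyncio.sleep(0.05)
    return f"response-{request_id}"


def serve_with_thread_pool(request_ids, workers=5):
    # Each request runs on one of a fixed number of worker threads,
    # so at most `workers` requests are in flight at the same time.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_request_blocking, request_ids))


async def serve_with_asyncio(request_ids):
    # All coroutines share a single thread; the event loop switches
    # between them whenever one is waiting at an `await`.
    return await asyncio.gather(
        *(handle_request_async(request_id) for request_id in request_ids)
    )


if __name__ == "__main__":
    ids = range(10)
    print(serve_with_thread_pool(ids))
    print(asyncio.run(serve_with_asyncio(ids)))
```

Both functions return the responses in the order the requests were submitted. The difference lies in how the waiting is overlapped: the thread pool relies on several operating system threads, while the AsyncIO version interleaves coroutines on one thread.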