So your business is growing, and your management system needs to support more simultaneous users. Two options come to mind: you can either upgrade the server hardware, which is scaling vertically, or balance the load across more servers, which is scaling horizontally.
What many IT managers may not realize is that not all online transactions are created equal, and that there is a third scaling option: Temporal Scaling.
It's okay to wait for certain tasks
Browser-based applications are collections of web pages that perform business functions, and they serve different purposes than informational web pages. Yet we have formed the expectation that every page will load within an acceptable time span. This expectation, combined with the notion that a business transaction usually amounts to a page load, makes us overlook the types of transactions that are not like opening a web page.
For example, if we expect the system to generate a complex report within a second, and we aim to support hundreds or thousands of users, the only scaling options are vertical or horizontal. However, if we queue up the report requests while handling all the other requests promptly, the overall performance of the system remains acceptable.
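The idea above can be sketched in a few lines: slow report requests go into a queue served by a background worker, while the web tier answers quick requests immediately. This is a minimal illustration, not AsyncD itself; the function names are made up for the example.

```python
import queue
import threading

report_queue = queue.Queue()
results = {}

def report_worker():
    # Process queued report requests one at a time, in arrival order.
    while True:
        task_id, params = report_queue.get()
        results[task_id] = f"report for {params}"  # stand-in for real work
        report_queue.task_done()

threading.Thread(target=report_worker, daemon=True).start()

def handle_request(kind, task_id, params=None):
    # Quick requests are served inline; slow report requests are enqueued
    # and the caller gets an immediate "queued" response instead of waiting.
    if kind == "report":
        report_queue.put((task_id, params))
        return "queued"
    return "done"

print(handle_request("page", 1))                 # served right away
print(handle_request("report", 2, "Q3 sales"))   # queued, user not blocked
report_queue.join()                              # demo only: wait for the worker
print(results[2])
```

The key property is that the response time of the fast path is unaffected by how many reports are waiting in the queue.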
Tasks that are suitable for Temporal Scaling
Certain tasks are resource-consuming and/or long-running by nature. They are better suited to being queued up and processed in order. These tasks include:
- mass emailing
- batch processing
- large reports
- video encoding

Queuing is done automatically, but poorly
One of the reasons we don't hear about temporal scaling, or hear it trivialized as mere queuing, is that it already happens implicitly on the web server. When the server host is overwhelmed by too many requests, the server software lines up the requests in a buffer. Once the buffer runs out, the site either crashes or blatantly drops new requests.
Clearly, queuing at the server layer is not at all scalable. The underlying issues are: 1) all requests are treated the same; 2) the queue scheduler reacts to resource usage after the fact; 3) the server software has no knowledge of task timing.
In addition to the limited ability of server software (Apache, Nginx, etc.) to prioritize a large number of diverse transaction types, timeouts also make temporal queuing awkward to implement. Both the browser and the server may terminate the process after a time threshold. The most important timeout of all is the "user timeout": if a transaction has to run for a long time on the server, it shouldn't tie up the browser and force the user to wait.
The Asynchronous Distributed Processor, or AsyncD, is a framework we created that handles long-running web requests at the application layer. Gyroscope now has a built-in interface to AsyncD.
An AsyncD implementation has three components: the web user interface, a server-side scheduler, and one or more workers. The user interface is built through API calls in libasyncd. It allows the user to either "request and leave" or "wait and watch". If the UI is set to monitor the process, a task handle returned by the scheduler is used to poll the task's completion status for live updates.
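The two client modes can be sketched as follows. This is a hypothetical model, not the libasyncd API: submitting a task returns a handle, and the caller either keeps the handle and leaves, or polls it for live status.

```python
import threading
import time

class TaskHandle:
    # Hypothetical task handle: tracks completion and progress of one task.
    def __init__(self):
        self._done = threading.Event()
        self.progress = 0
        self.result = None

    def status(self):
        # What a "wait and watch" UI would poll for live updates.
        return {"done": self._done.is_set(), "progress": self.progress}

def submit(task, *args):
    # Run the task in the background and hand back a handle immediately.
    handle = TaskHandle()
    def run():
        handle.result = task(handle, *args)
        handle.progress = 100
        handle._done.set()
    threading.Thread(target=run, daemon=True).start()
    return handle

def encode(handle, name):
    # Stand-in for a long-running job that reports progress as it goes.
    for pct in (25, 50, 75):
        handle.progress = pct
        time.sleep(0.01)
    return f"{name}.encoded"

h = submit(encode, "video")
# "request and leave": keep the handle and come back later.
# "wait and watch": poll the status until the task completes.
while not h.status()["done"]:
    time.sleep(0.01)
print(h.result)
```

Either way, the browser request that submitted the task returns immediately, which sidesteps the timeout problem described earlier.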
The scheduler distributes the load across multiple workers. On a server with limited resources, only one worker should be used. It is possible to configure the scheduler so that different types of tasks go to different workers. Each worker corresponds to a dedicated server thread. A worker can run on a remote server and can be revived in case of a crash.
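Type-based routing can be pictured as a routing table mapping task types to workers, with one queue per worker. The table contents here are illustrative assumptions, not actual AsyncD configuration.

```python
import queue

# One queue per worker; a remote worker would drain its queue over the network.
workers = {"media": queue.Queue(), "reports": queue.Queue()}

# Assumed routing table: which worker handles which task type.
routes = {
    "video_encode": "media",
    "large_report": "reports",
    "mass_email": "reports",
}

def dispatch(task_type, payload):
    # Look up the worker for this task type and enqueue the task there.
    worker = routes.get(task_type)
    if worker is None:
        raise ValueError(f"no worker configured for {task_type}")
    workers[worker].put((task_type, payload))
    return worker

print(dispatch("video_encode", "intro.mp4"))   # goes to the media worker
print(dispatch("large_report", "Q3"))          # goes to the reports worker
```

Routing by type keeps a flood of cheap tasks (say, mass emails) from starving an expensive one (a video encode), since each worker drains its own queue.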
The scheduler also offers a mechanism that detects and prevents redundant operations. For example, if one user requests encoding of a video file and another user then requests the same file, AsyncD will only encode it once and share the result with both users.
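Redundancy detection can be sketched by keying each task (for instance by file name or hash) and letting later requests subscribe to the task already in flight. The names below are illustrative, not the AsyncD API.

```python
# Map from task key to the list of users sharing that task's result.
pending = {}

def request_encode(video_key, user):
    # A second request for the same key piggybacks on the running task
    # instead of launching a duplicate encode.
    if video_key in pending:
        pending[video_key].append(user)
        return "joined"
    pending[video_key] = [user]
    return "started"

def finish(video_key, result):
    # Deliver the single result to every subscriber, then clear the entry.
    subscribers = pending.pop(video_key)
    return {user: result for user in subscribers}

print(request_encode("clip.mp4", "alice"))   # starts the encode
print(request_encode("clip.mp4", "bob"))     # joins the existing task
print(finish("clip.mp4", "clip.webm"))       # both users get the one result
```

The work is done once regardless of how many users ask for it, which matters most for exactly the expensive task types listed earlier.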