Improving Monolith's Availability

Introduction

"High availability is a characteristic of a system which aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period." - Wikipedia

Regardless of the software architecture, a high level of availability is the ultimate goal of moving to the cloud. The idea is to make your product available to your customers at any time.

Servers have a limited amount of resources at their disposal. Storage is quite cheap these days, but compute resources are precious, and we should aim to squeeze every bit of power out of them.

When you offer a SaaS solution, you do so in the form of a web service hosted somewhere in the cloud. If the traffic to your application is low, you will probably not have many problems serving all your clients, as modern web applications can handle a substantial number of concurrent user connections. Each request that hits your server is handled by a thread picked from the thread pool.

Once the traffic grows and the server has to handle all those requests, the thread pool shrinks in the sense that more threads are in use and fewer of them are available to handle new requests. We want to avoid this!

When the thread pool gets exhausted, the outside world sees an application that clients cannot reach. The faster we can handle a request and return the appropriate response, the better for our application's availability.
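To make the idea of a shrinking thread pool a bit more tangible, here is a minimal sketch that reads how many worker threads are currently in use. The `ThreadPoolProbe` class is a hypothetical helper and not part of the article's solution; it only relies on the standard `ThreadPool.GetMaxThreads` and `ThreadPool.GetAvailableThreads` calls.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not from the original project) that reports
// how many thread pool worker threads are busy right now.
public static class ThreadPoolProbe
{
    public static int BusyWorkerThreads()
    {
        // Upper limit of worker threads the pool may create.
        ThreadPool.GetMaxThreads(out int maxWorkers, out _);

        // Worker threads that could still be started or reused.
        ThreadPool.GetAvailableThreads(out int availableWorkers, out _);

        // The difference is the number of workers currently in use.
        return maxWorkers - availableWorkers;
    }
}
```

Logging this number under load is one simple way to see whether long-running requests are holding on to threads.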

Even though we like to think that using async/await makes our application asynchronous, from the caller's point of view it is fake asynchronous behaviour: the application is not in a blocking state, but the response cannot be returned to the caller until everything has finished.

Messaging makes true asynchronous behaviour possible. Although messaging is fundamentally a pragmatic reaction to the problems of distributed systems, as a technique it is not restricted to them. We can leverage messaging in monolithic applications too.

Messaging used in a Monolith

The idea is to leverage messaging to reduce the response time of HTTP requests. This has a direct impact on a monolithic application's availability.

When an HTTP request hits the server, a message gets published to the message broker and the response is returned "immediately". The monolith later consumes the message and does the actual work that the request was supposed to trigger.

We can see here that the monolith acts as both the publisher and the subscriber.

Bi-directional communication between a Monolith & Message Broker [source].
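The publisher and subscriber roles can be sketched inside a single process. The example below is only an illustration under stated assumptions: `InMemoryBroker` and `BrokerDemo` are hypothetical names, and an in-memory `Channel<Guid>` from `System.Threading.Channels` stands in for a real message broker, so nothing here survives a process restart.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical in-memory stand-in for the message broker.
public sealed class InMemoryBroker
{
    private readonly Channel<Guid> channel = Channel.CreateUnbounded<Guid>();

    // Publisher side: called on the request path, completes almost immediately.
    public ValueTask PublishAsync(Guid folderId) =>
        channel.Writer.WriteAsync(folderId);

    // Signals that no further messages will be published.
    public void Complete() => channel.Writer.Complete();

    // Subscriber side: handles messages one by one until the channel completes.
    public async Task ConsumeAsync(Func<Guid, Task> handler)
    {
        await foreach (var folderId in channel.Reader.ReadAllAsync())
        {
            await handler(folderId);
        }
    }
}

public static class BrokerDemo
{
    // Publishes two folder ids and returns the ids the subscriber handled.
    public static async Task<List<Guid>> RunAsync()
    {
        var broker = new InMemoryBroker();
        var handled = new List<Guid>();

        // Start the subscriber first; it waits for messages to arrive.
        var consumer = broker.ConsumeAsync(id =>
        {
            handled.Add(id);
            return Task.CompletedTask;
        });

        await broker.PublishAsync(Guid.NewGuid());
        await broker.PublishAsync(Guid.NewGuid());

        broker.Complete();
        await consumer;
        return handled;
    }
}
```

The same process both writes to and reads from the queue, which is the shape of the monolith in the diagram. A real broker adds durability and retries on top of this.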

Use case

Imagine you are building an application that offers a way to store files in a hierarchical structure of folders, with support for folder nesting. Files can get large, and we do not want to store them in our RDBMS. Instead, we store only the path to the actual file, which in turn lives somewhere else: maybe in our file system, in blob storage, or even in a NoSQL database. The point is, it is stored somewhere cheaper 😉.

What happens if a user deletes one of the folders?

Well, we need to make sure that we:

Delete that folder in our database.
Delete all file paths that belong in that folder.
Delete all the actual files that correspond to the above step.

So far so good, but imagine that the folder had hundreds or thousands of files. Even worse, what if the folder had many other folders within it, and those folders had other folders, each with hundreds or thousands of files? This would require a lot of IO calls, whether they are file system calls or network calls.

You could apply some batching techniques (if applicable) to reduce the number of calls. Nonetheless, the bigger the batch is, the longer it takes to complete that request.
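As a rough illustration of batching, the sketch below groups file paths into fixed-size chunks so that each chunk costs one call. `BatchDelete` and the `deleteBatch` callback are hypothetical and stand in for whatever file system or network call the storage offers; `Enumerable.Chunk` requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper (not from the original project).
public static class BatchDelete
{
    // Deletes the paths in groups of batchSize and returns the number of calls made.
    // deleteBatch is an assumed callback representing one IO call per batch.
    public static async Task<int> RunAsync(
        IEnumerable<string> paths,
        int batchSize,
        Func<IReadOnlyList<string>, Task> deleteBatch)
    {
        var calls = 0;

        // Chunk splits the sequence into arrays of at most batchSize items.
        foreach (var batch in paths.Chunk(batchSize))
        {
            await deleteBatch(batch);
            calls++;
        }

        return calls;
    }
}
```

With 2,500 paths and a batch size of 1,000 this makes three calls instead of 2,500, but the request still has to wait for all three to finish.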

With async/await you can make sure that the UI does not hang, but the HTTP response won't get started until everything has completed. This impacts the availability of our application!

Messaging to the rescue

With messaging, once the HTTP request hits our server, we can publish a message (a command to be more precise) to the message broker, which would contain only the ID of the folder that we want to delete.

I should point out that the message should be published to the broker with at-least-once delivery in mind.
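One consequence of at-least-once delivery is that the same message may arrive more than once, so the subscriber should tolerate duplicates. The sketch below shows one common way to do that by remembering processed message ids. `IdempotentHandler` is a hypothetical class, the in-memory `HashSet` is an assumption made for brevity (a real system would persist the ids), and the class is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical duplicate-tolerant handler (not from the original project).
public sealed class IdempotentHandler
{
    // Ids of messages that were already handled; kept in memory for this sketch.
    private readonly HashSet<Guid> processedMessageIds = new HashSet<Guid>();

    // Counts how often the (placeholder) delete work actually ran.
    public int DeletesPerformed { get; private set; }

    // Returns true if the work ran, false if the message was a duplicate.
    public bool Handle(Guid messageId, Guid folderId)
    {
        // HashSet.Add returns false when the id is already present.
        if (!processedMessageIds.Add(messageId))
        {
            return false;
        }

        // Placeholder for deleting the folder identified by folderId.
        DeletesPerformed++;
        return true;
    }
}
```

Deleting a folder is fairly forgiving here, since deleting something that is already gone can simply be treated as success.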

The moment the message gets stored in the Outbox table and the transaction has been committed, we can start the HTTP response. Here the monolith is the publisher, but we already mentioned that the monolith is also the subscriber of that message. That means that somewhere in our codebase we have a handler for that message, which triggers the actual work that was previously done while the HTTP request was running.
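The Outbox idea can be sketched without any infrastructure. In the example below, a list stands in for the Outbox table and a separate `Relay` step plays the role of the background processor that forwards pending rows to the broker. `OutboxMessage`, `OutboxSketch`, and the `publishToBroker` callback are hypothetical names; a real implementation would write the row inside a database transaction.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row of the Outbox table.
public sealed class OutboxMessage
{
    public OutboxMessage(string name, string payload)
    {
        Name = name;
        Payload = payload;
    }

    public string Name { get; }
    public string Payload { get; }
    public bool Published { get; set; }
}

// Hypothetical in-memory sketch of the Outbox pattern.
public sealed class OutboxSketch
{
    // Stands in for the Outbox table.
    private readonly List<OutboxMessage> outbox = new List<OutboxMessage>();

    // Request path: record the message and return right away.
    public void Enqueue(string name, string payload) =>
        outbox.Add(new OutboxMessage(name, payload));

    // Background processor: forward every unpublished row to the broker.
    // Returns the number of rows that were pending when the run started.
    public int Relay(Action<OutboxMessage> publishToBroker)
    {
        var pending = outbox.Where(m => !m.Published).ToList();

        foreach (var message in pending)
        {
            // If this throws, the row stays unpublished and is retried next run.
            publishToBroker(message);
            message.Published = true;
        }

        return pending.Count;
    }
}
```

Because a row is only marked as published after the broker call succeeds, a failed publish is retried on the next run, which is where the at-least-once behaviour comes from.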

Needless to say, we do this work on a background thread.

Example

Below we can see a JSON representation of a nested folder hierarchy with files and their corresponding paths.

Nested folder hierarchy and file paths.

The files are housed in our local file system. The hierarchy has been flattened in the file system. This is actually quite common in the blob storage services offered by various cloud providers.

Files housed in the file system.

Let's have a look at the DeleteFolderCommand.

public class DeleteFolderCommand : IRequest<Unit>
{
    public Guid FolderId { get; set; }
}

public class DeleteFolderCommandHandler : 
    IRequestHandler<DeleteFolderCommand, Unit>
{
    private readonly FileStore fileStore;
    private readonly SqlConnectionFactory factory;
    private readonly ApplicationDbContext context;

    public DeleteFolderCommandHandler(
        FileStore fileStore,
        SqlConnectionFactory factory,
        ApplicationDbContext context)
    {
        this.fileStore = fileStore;
        this.factory = factory;
        this.context = context;
    }

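    public async Task<Unit> Handle(DeleteFolderCommand request, 
        CancellationToken cancellationToken)
    {
        // Load the folder and all of its descendants as a flat list.
        var folderDtos = await LoadFlattenHierarchy(request.FolderId);

        // Collect the stored names of every file in the hierarchy.
        var paths = folderDtos
            .SelectMany(folder => folder.Files
                .Select(file => file.FileName));

        await fileStore.Remove(paths);

        // Materialize the ids first so the query filters on a plain list of ids.
        var folderIds = folderDtos.Select(folder => folder.Id).ToList();
        var folders = context.Folders
            .Where(f => folderIds.Contains(f.Id));

        context.Folders.RemoveRange(folders);
        await context.SaveChangesAsync(cancellationToken);

        return Unit.Value;
    }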
}

We get the folder id that the user wants to delete.
Inject dependencies like FileStore, SqlConnectionFactory, and ApplicationDbContext.
Load a flattened representation of the folder hierarchy.
Select all the paths of the files.
Delete those files from the file store.
Remove all the folders from our instance of ApplicationDbContext.
Save changes to the context.
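The step that loads a flattened representation of the hierarchy is not shown in the handler. As a rough idea of what flattening means, here is an in-memory sketch that walks a nested folder tree and returns every folder in one list. `FolderNode` and `FolderHierarchy` are hypothetical types; the original project loads its data from the database in a way not covered here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory folder with files and nested folders.
public sealed class FolderNode
{
    public Guid Id { get; set; } = Guid.NewGuid();
    public List<string> Files { get; } = new List<string>();
    public List<FolderNode> Children { get; } = new List<FolderNode>();
}

public static class FolderHierarchy
{
    // Returns the root folder and all of its descendants as a flat list,
    // using an explicit stack so deep nesting cannot overflow the call stack.
    public static List<FolderNode> Flatten(FolderNode root)
    {
        var result = new List<FolderNode>();
        var stack = new Stack<FolderNode>();
        stack.Push(root);

        while (stack.Count > 0)
        {
            var current = stack.Pop();
            result.Add(current);

            foreach (var child in current.Children)
            {
                stack.Push(child);
            }
        }

        return result;
    }
}
```

Once the hierarchy is flat, collecting all file names is a single `SelectMany` over the result, which mirrors what the handler does with its DTOs.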

DeleteFolderCommand is the command that does the actual work, but we do not want it to be callable from the API endpoint. Instead, we want the endpoint to call InitDeleteFolderCommand.

public class InitDeleteFolderCommand : IRequest<Unit>
{
    public Guid FolderId { get; set; }
}

public class InitDeleteFolderCommandHandler : 
    IRequestHandler<InitDeleteFolderCommand, Unit>
{
    private readonly ICapPublisher capPublisher;

    public InitDeleteFolderCommandHandler(
        ICapPublisher capPublisher)
    {
        this.capPublisher = capPublisher;
    }

    public async Task<Unit> Handle(InitDeleteFolderCommand request, 
        CancellationToken cancellationToken)
    {
        await capPublisher.PublishAsync("delete.folder", request.FolderId);
        return Unit.Value;
    }
}

We are using CAP as the library that implements the Outbox Pattern and the necessary communication with our message broker (RabbitMQ). But don't let that distract you! In the end, it is just an abstraction of a publishing mechanism to a message bus.

We get the folder id that the user wants to delete.
Inject our message bus publisher abstraction ICapPublisher.
Publish a message named "delete.folder" with the folder id as the payload.

After the command execution has finished, the HTTP response gets sent to the caller. Once the message is received by the monolith, we internally kick off the DeleteFolderCommand to do the actual work.

public class InitDeleteFolderHandler : ICapSubscribe
{
    private readonly ISender sender;

    public InitDeleteFolderHandler(ISender sender)
    {
        this.sender = sender;
    }

    [CapSubscribe("delete.folder")]
    public async Task Handle(Guid folderId)
    {
        await sender.Send(new DeleteFolderCommand()
        {
            FolderId = folderId
        });
    }
}

Notice ⚠️

We could have also written the code from DeleteFolderCommand directly within InitDeleteFolderHandler. But if you have a look at the source code, you can see that starting a command also starts a database transaction from the transaction pipeline, so sending DeleteFolderCommand lets the actual work run inside that transaction.

Comparing execution times

Below we can see the total execution time for InitDeleteFolderCommand. It only took 79 [ms] to complete the command, as opposed to the 12,805 [ms] that it took to complete DeleteFolderCommand.

Execution time for both commands.

I should emphasize that the HTTP response is sent to the caller as soon as InitDeleteFolderCommand has finished. We can see that DeleteFolderCommand started at 16:41:45, which is 8 [s] after the first command finished at 16:41:37. During that time, the message "delete.folder" was stored in the Outbox table and published to RabbitMQ by an outbox processor.

"delete.folder" message stored in the Outbox table.

"delete.folder" message stored in the Inbox table.

Which one of the commands would you like the client to invoke, and which one to run in the background?

Obviously, we want the client to call InitDeleteFolderCommand and to run DeleteFolderCommand in the background, for the reasons we elaborated on.

Summary

In this article, we have gone through the process of increasing a monolithic application's availability by leveraging messaging techniques, which are commonly used in distributed systems.

If you found this article helpful, please share it in your favorite forums 😉.
The solution project is available on GitHub.