Async messaging: it’s one of those things you come across once your application starts to see any kind of traffic and there are tasks you don’t want to (or simply can’t!) process whilst the user is waiting. Years ago, I bumped into this problem when I noticed that some parts of my web app (Secret Santa) were busy processing emails, which meant that everyone else was blocked from using the website! It was the first app I ever wrote, and one I built completely by myself without any outside help. I was naive, and I learned very quickly about asynchronous messaging.
Why Async Messaging?
My first interaction with async messages was with a fabulous library in my Rails app called Sidekiq, which uses Redis as a temporary holding place for messages so they can be processed later. With the advent of cloud computing, there are now many more options for doing this. They are far easier to use, and they reduce the compute needed for relatively simple tasks. Previously I needed to run a Linux server with multiple CPU cores to handle web traffic as well as background jobs such as sending emails. Now, I can throw the email sending task into a queue to be processed separately from the web service, which lets me shrink the web server since it no longer needs spare compute available just in case it gets busy. This is the beauty of “serverless”: you offload some tasks to functions (or whatever) that run only when you need them, rather than keeping something on all the time “just in case”.
Each cloud provider has its own native services to handle this kinda stuff. For example, AWS has SNS and SQS, and on Google Cloud there are a number of options, but the two I’d like to talk about today are Cloud Tasks and Cloud PubSub. On the face of it they kinda achieve the same thing, async messaging, but importantly they go about it in different ways.
PubSub
Much has been said already about PubSub, and you’ll find it in a bunch of tutorials and videos. If you’re not sure which one to choose and you don’t already have a messaging system in place, PubSub is probably the one you want. Still, let’s examine the main differences at a high level, and why you might want to choose one or the other.
As the name might suggest, in Cloud PubSub there’s a concept of a publisher and a subscriber. The publisher communicates with the subscribers anonymously and asynchronously by publishing an event to another concept called a topic. A topic can have multiple subscribers, each with a different responsibility.
For example, you could have a shopping cart service publish a “purchase completed” event to a topic, with information about that purchase. There might be multiple subscribers to that topic which do different things: sending a confirmation email with a receipt to the customer, sending the order to the warehouse for shipping, and notifying some sort of internal analytics that a purchase has been made. The subscribers to the topic don’t need to know who sent the message or even care who sent it; they just care that there’s a message for them to process so that their work can be done. If you’re interested in reading more and seeing other common use cases of Cloud PubSub, there are some examples here.
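To make the fan-out idea concrete, here’s a tiny in-memory sketch of the shopping cart example. To be clear, this is not the Cloud PubSub client library: `ToyBroker`, the `purchase-completed` topic name and the three handlers are all made up for illustration, and everything runs synchronously in one process, whereas the real service delivers messages asynchronously.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Message = Dict[str, str]
Subscriber = Callable[[Message], None]


class ToyBroker:
    """In-memory stand-in for a Pub/Sub-style broker (illustration only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Subscriber]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Subscriber) -> None:
        # Subscribers register against a topic, never against a publisher.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Message) -> int:
        """Give every subscriber its own copy; return how many received it."""
        handlers = self._subscribers[topic]
        for handler in handlers:
            handler(dict(message))
        return len(handlers)


log: List[str] = []
broker = ToyBroker()

# Three independent subscribers with different responsibilities.
broker.subscribe("purchase-completed", lambda m: log.append(f"email receipt for {m['order_id']}"))
broker.subscribe("purchase-completed", lambda m: log.append(f"ship order {m['order_id']}"))
broker.subscribe("purchase-completed", lambda m: log.append(f"record sale of {m['total']}"))

# The cart service only knows the topic name, not who is listening.
delivered = broker.publish("purchase-completed", {"order_id": "A123", "total": "42.00"})
print(delivered, log)
```

Notice that the publishing side never mentions a subscriber. You could add a fourth subscriber without touching the cart service at all, which is the decoupling PubSub is going for.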
Cloud Tasks
Google’s docs page describes Cloud Tasks as “letting you separate out pieces of work that can be performed independently outside of your main application flow, and send them off to be processed…” Sounds similar, right? Well, that’s because it is similar. Cloud Tasks’ main difference is that async messages are not anonymous. That means the sender of the task needs to know about the handler of the task: it explicitly names the endpoint each task gets delivered to. Ok, so what!? Well, what that means is that if you have some work that needs to be done by your application which is time consuming, and you don’t want your user to wait, then Cloud Tasks can fit well here.
Let’s use an example. Say someone wants to download a complicated report about their usage data. If that report takes a while to generate (e.g. 10-30 seconds or something) and you suddenly get several requests for that kind of work, then you run the risk of your service becoming locked up and unavailable whilst it’s processing all those requests. Instead, you could use Cloud Tasks for this and tell your users that they’ll be emailed a report. The difference here is that the task queue can smooth out the work required, but also each task has context and headers in it that need to come from the task creator. It’s like the app is calling some other service, but in a delayed fashion, and it’s decoupled from the app itself. If that makes sense. Sort of a way of saying “here’s my request (task), it came from me (headers), do it when you can (async)”.
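Here’s a rough sketch of what “here’s my request, it came from me, do it when you can” might look like for the report example. It only builds a plain dictionary, laid out the way I understand the Cloud Tasks REST API describes a task (`httpRequest`, `scheduleTime`); it doesn’t call Google Cloud at all. The function name, the handler URL and the payload fields are my own assumptions for illustration.

```python
import base64
import json
from datetime import datetime, timedelta, timezone
from typing import Optional


def build_report_task(
    handler_url: str,
    user_email: str,
    delay_seconds: int = 0,
    now: Optional[datetime] = None,
) -> dict:
    """Build a task description for a (hypothetical) report handler."""
    payload = json.dumps({"report": "usage", "email_to": user_email}).encode("utf-8")
    task = {
        "httpRequest": {
            # The creator names the exact endpoint that will do the work.
            "url": handler_url,
            "httpMethod": "POST",
            # Context travels with the task as headers and a body.
            "headers": {"Content-Type": "application/json"},
            # Bytes fields are base64 strings when sent as JSON.
            "body": base64.b64encode(payload).decode("ascii"),
        }
    }
    if delay_seconds > 0:
        # Optional "do it later": an RFC 3339 timestamp in UTC.
        start = now or datetime.now(timezone.utc)
        run_at = start + timedelta(seconds=delay_seconds)
        task["scheduleTime"] = run_at.strftime("%Y-%m-%dT%H:%M:%SZ")
    return task


task = build_report_task(
    "https://example.com/tasks/usage-report",  # hypothetical handler URL
    "sam@example.com",
    delay_seconds=60,
    now=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(json.dumps(task, indent=2))
```

The thing to notice, compared with PubSub, is the `url` field: the creator of the task decides exactly where it gets delivered.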
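Oh yeah, and the handler of a Cloud Task is just an HTTP endpoint. Tasks are delivered as POST requests by default (other methods such as PUT can be set on the task), so the handler needs to accept whichever HTTP method you create your tasks with.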
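On the receiving end, the handler is just an ordinary HTTP endpoint. Below is a framework-free sketch of the decision logic only, assuming tasks are created with POST and carry a JSON body with an `email_to` field (both are my assumptions). `X-CloudTasks-TaskRetryCount` is, as far as I know, one of the headers Cloud Tasks adds to each request, but treat the details as a sketch and not a reference.

```python
import json
from typing import Mapping, Tuple


def handle_report_task(method: str, headers: Mapping[str, str], body: bytes) -> Tuple[int, str]:
    """Return (HTTP status, message) for an incoming task request."""
    if method != "POST":
        return 405, "method not allowed"
    try:
        payload = json.loads(body)
        email_to = payload["email_to"]
    except (ValueError, KeyError, TypeError):
        # Empty body, invalid JSON, or a payload without the expected field.
        return 400, "bad payload"
    # Plain dict lookup is case-sensitive; a real web framework would
    # normalise header names for you.
    retry_count = int(headers.get("X-CloudTasks-TaskRetryCount", "0"))
    # ... generate the report and email it here ...
    return 200, f"report queued for {email_to} (retry count {retry_count})"


print(handle_report_task("POST", {}, b'{"email_to": "sam@example.com"}'))
print(handle_report_task("GET", {}, b""))
```

The status code matters here: a 2xx response tells Cloud Tasks the work is done, and anything else makes the task eligible for a retry according to the queue’s retry settings.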
Comparison
In short, both can achieve asynchronous messaging; they just go about it differently. Google say that Cloud Tasks provides tools for queue and task management which you can’t get with PubSub, such as scheduling specific delivery times, delivery rate controls and configurable retries, among a few others.
Rather than regurgitate Google’s docs verbatim, I’ll point you to the neat table they have over on their docs about how to choose between the two.
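To give a feel for what “configurable retries” can mean, here’s a small sketch of the backoff schedule as I understand it from the Cloud Tasks retry settings: the wait starts at a minimum backoff, doubles a set number of times, then grows linearly, and is capped at a maximum backoff. The function and parameter names are mine (loosely mirroring the queue’s min backoff, max backoff and max doublings settings), so read it as an approximation of the documented behaviour and not as Google’s implementation.

```python
from typing import List


def retry_intervals(min_backoff: int, max_backoff: int, max_doublings: int, retries: int) -> List[int]:
    """Seconds to wait before each retry, under the assumptions above."""
    intervals: List[int] = []
    interval = min_backoff
    # Once the doublings are used up, the wait grows by this fixed step.
    linear_step = min_backoff * 2 ** max_doublings
    for attempt in range(retries):
        intervals.append(min(interval, max_backoff))
        if attempt < max_doublings:
            interval *= 2
        else:
            interval += linear_step
    return intervals


print(retry_intervals(min_backoff=10, max_backoff=300, max_doublings=3, retries=8))
# → [10, 20, 40, 80, 160, 240, 300, 300]
```

With PubSub you’d typically lean on the subscription’s own redelivery behaviour instead, so this kind of per-queue tuning is one of the practical reasons to reach for Cloud Tasks.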
Honourable mention: Kafka
You’ve probably heard about Kafka, and I think it deserves a mention in this post, since it’s a choice many make for async messaging given its popularity and open source nature. Apache Kafka and Google Cloud Pub/Sub offer different strengths. Kafka is really good in complex, high-throughput scenarios, giving fine-grained control over data management. Google’s Pub/Sub prioritises simplicity and scalability within the Google Cloud ecosystem, which makes it ideal for cloud-native applications. In general, you might choose Kafka if you need extensive control and customisation, but you’d go with Pub/Sub for easier integration and managed scalability within Google Cloud.