After examining which patterns are embedded in Google Cloud Pub/Sub in an earlier post, I set out to implement a few common messaging patterns on top of Google Cloud Functions, Google's serverless implementation.
Serverless is one of the latest buzzwords in the cloud world and a name that's easily misunderstood. Of course, serverless applications still run on servers. The key point is that the application owner doesn't have to worry about which server their application runs on. Technically, that has been true for most PaaS (Platform-as-a-Service), but serverless takes the concepts of deployment automation to a new level. Mike Roberts wrote an excellent article explaining serverless in detail. I generally summarize the evolution towards serverless as follows:
The dramatic progress can be summarized in a table as follows. The figures are meant to be qualitative in nature and can vary depending on many factors:
Stage | Deployment unit | Deployment method | Deployment time |
---|---|---|---|
Physical server | Server | Manual | Months |
Virtual server | Operating System | OS automated, application usually manual | Days or weeks |
Container / PaaS | Application | Automated | Minutes |
Serverless | Function | Automated | "Real-time" |
Google Cloud Functions is Google Cloud Platform's implementation of a serverless architecture. Cloud functions can be written in JavaScript in a Node.js run-time environment. As the run-time environment manages the execution of the function, a function has to be bound to an endpoint or an event in order to be invokable. Google functions come in two flavors of binding:
As we are interested in asynchronous messaging, we implement a background function that is triggered by a message event on a pub/sub channel. Writing and deploying cloud functions is quite easy - you can deploy them from the command line if you have the Google Cloud SDK installed. You manage the function bindings via command line arguments. Once deployed, functions are invoked directly by the run-time - that's the power of serverless!
The big advantage of using cloud functions is that the amount of wrapper code needed to bind and invoke code is dramatically reduced. This makes writing messaging pattern examples actually easier, as so much is already taken care of.
Disclaimer: I am not a JavaScript developer, so some of this code is likely non-idiomatic and not production quality. My intention was to focus on the direct translation of the pattern into an expressive implementation. Feel free to send me a pull request with suggestions for improvement.
Let's start with a very simple pattern implementation, a Content-based Router. This pattern inspects an incoming message and routes it to different destination channels based on its content. To avoid exercising unneeded creativity, we stick with the Widgets and Gadgets example from the book, which routes incoming orders to a widget or gadget channel depending on the order type.
Setting up a basic cloud function is quite easy. The function is called with an event
parameter that holds all needed data and optionally a callback parameter that the
function must call when it is done. Alternatively, the function can return a promise.
So pretty much all we need to do is unpack the incoming data, look at the order type,
determine the correct channel, and forward the message to that channel. All this can
be done with a few lines of JavaScript:

    const Pubsub = require('@google-cloud/pubsub');
    const pubsub = Pubsub({projectId: "eaipubsub"});

    exports.contentBasedRouter = function contentBasedRouter(event) {
      // The Pub/Sub message arrives base64-encoded in event.data.data.
      const pubsubMessage = event.data;
      const payload = Buffer.from(pubsubMessage.data, 'base64').toString();
      console.log("Payload: " + payload);

      const order = JSON.parse(payload);
      const outChannel = getOutChannel(order.type);
      console.log("Publishing to: " + outChannel);

      // Get the topic (creating it if needed), then publish. The returned
      // promise tells Cloud Functions when the function is done.
      return pubsub.topic(outChannel).get({autoCreate: true}).then(function(data) {
        const topic = data[0];
        return topic.publish(order);
      });
    };

    // Maps the order type to the name of the destination channel.
    function getOutChannel(type) {
      switch(type) {
        case "widget": return "widgets";
        case "gadget": return "gadgets";
        default: return "unknown";
      }
    }

Most code examples in the book are extracts that rely on quite a bit of wrapper code
in order to function. In this case, the code above is all the code there is! There's
a fairly good JavaScript library for Google Cloud Platform, which we require
to publish messages. Luckily I found an option to automatically create a topic if
it doesn't yet exist, which eliminates a lot of conditional code that plagued my first
version. The pub/sub message arrives in the data field of the event parameter. JSON
data is base64 encoded, so we unpack it first and do some unnecessary logging for our
own entertainment. After we parse the JSON into an object, we look at the type field
to determine the channel to relay the message to. That's almost it - we still have to
get a reference to the topic and off we go. The get and publish methods return
JavaScript promises. In case of get, we handle the result in a then callback that
returns the promise from publish, so the promise handed back to Google Cloud Functions
settles only once the message has been published.
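The decode-and-route steps described above can be tried locally without deploying anything. The following is a minimal sketch, assuming the event shape used by the router (a base64-encoded JSON payload in event.data.data). The makeEvent and route helpers are hypothetical names introduced only for this illustration, and nothing is published.

```javascript
// Local dry run of the decode-and-route steps; no Google Cloud libraries involved.

// Same mapping as getOutChannel in the router.
function getOutChannel(type) {
  switch (type) {
    case "widget": return "widgets";
    case "gadget": return "gadgets";
    default: return "unknown";
  }
}

// Hypothetical helper: builds an event shaped like the one the router reads,
// with the JSON payload base64-encoded in event.data.data.
function makeEvent(order) {
  const encoded = Buffer.from(JSON.stringify(order)).toString('base64');
  return { data: { data: encoded } };
}

// Hypothetical helper: decodes the payload and picks the channel,
// but does not publish anything.
function route(event) {
  const payload = Buffer.from(event.data.data, 'base64').toString();
  const order = JSON.parse(payload);
  return { order: order, outChannel: getOutChannel(order.type) };
}

const result = route(makeEvent({ type: "gadget", quantity: 1, ID: 456 }));
console.log(result.outChannel);
```

Keeping the routing decision in a small function of its own makes it easy to check the mapping before any topic is involved.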
All that's missing is the dependency on Google Cloud Pub/Sub in the package.json file:

    {
      "dependencies": {
        "@google-cloud/pubsub": "~0.10.0"
      }
    }

Deploying a function from the Cloud SDK is quite straightforward:

    gcloud beta functions deploy contentBasedRouter --stage-bucket eaipubsub-functions \
        --trigger-topic orders

This command deploys the function and binds it to the topic orders within our project,
resulting in the full topic name projects/eaipubsub/topics/orders. We are now ready to
feed some messages into this topic and see our pattern function in action:

    gcloud beta functions logs read contentBasedRouter
    D contentBasedRouter 118580632422832 2017-04-23 17:53:34.912 Function execution started
    I contentBasedRouter 118580632422832 2017-04-23 17:53:38.788 Payload: { "type":"widget", "quantity":3, "ID":123 }
    I contentBasedRouter 118580632422832 2017-04-23 17:53:38.799 Publishing to: widgets
    D contentBasedRouter 118580632422832 2017-04-23 17:53:38.843 Function execution took 3933 ms, finished with status: 'ok'
    D contentBasedRouter 118580674297419 2017-04-23 17:54:13.921 Function execution started
    I contentBasedRouter 118580674297419 2017-04-23 17:54:15.873 Payload: { "type":"widget", "quantity":3, "ID":123 }
    I contentBasedRouter 118580674297419 2017-04-23 17:54:15.879 Publishing to: widgets
    D contentBasedRouter 118580674297419 2017-04-23 17:54:15.914 Function execution took 1994 ms, finished with status: 'ok'
    D contentBasedRouter 118580598336288 2017-04-23 17:54:15.921 Function execution started
    I contentBasedRouter 118580598336288 2017-04-23 17:54:15.936 Payload: { "type":"widget", "quantity":3, "ID":123 }
    I contentBasedRouter 118580598336288 2017-04-23 17:54:15.937 Publishing to: widgets
    D contentBasedRouter 118580598336288 2017-04-23 17:54:15.999 Function execution took 78 ms, finished with status: 'ok'

We can see that execution times vary quite a bit, which hints at some cache loading / warm-up being at work. After I converted the function to return a promise instead of explicitly calling back, the times appeared to get a bit more consistent. I didn't do any performance tests, though, and we have to keep in mind that a setup using pub/sub channels is not intended to minimize latency but to maximize throughput.
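For reference, the two ways of signaling completion mentioned here (calling back explicitly versus returning a promise) can be sketched side by side. This is an illustration only: fakePublish is a hypothetical stub standing in for the real publish call, and routerWithCallback and routerWithPromise are made-up names, not the deployed function.

```javascript
// Hypothetical stub standing in for the real publish call (assumption):
// resolves with the channel name after a short delay.
function fakePublish(channel, order) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(channel); }, 10);
  });
}

// Callback style: the function signals completion by calling callback().
function routerWithCallback(event, callback) {
  fakePublish("widgets", event.data).then(
    function () { callback(); },       // success: no error argument
    function (err) { callback(err); }  // failure: pass the error along
  );
}

// Promise style: the function returns the promise and the caller waits for it.
function routerWithPromise(event) {
  return fakePublish("widgets", event.data);
}

routerWithPromise({ data: "order" }).then(function (channel) {
  console.log("Promise style published to: " + channel);
});
routerWithCallback({ data: "order" }, function (err) {
  console.log(err ? "Callback style failed" : "Callback style done");
});
```

With the promise style there is no separate completion call to forget, which may be one reason it behaves more predictably, though that is a guess rather than something measured here.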
Cloud functions are stateless - that's how they can be instantiated and discarded by the framework at will. Stateful patterns like an Aggregator therefore require use of a database. I'll tackle those next.
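To illustrate why state has to live outside the function, here is a rough sketch of an aggregation step that receives its store as a parameter. Everything in it is an assumption made for illustration: InMemoryStore is a hypothetical stand-in for a real database, and aggregate, correlationId, and expectedCount are invented names, not part of any Google API. An in-memory store like this would not survive a function instance being discarded, which is exactly the point.

```javascript
// Hypothetical in-memory stand-in for a database (assumption). A deployed
// function would need an external store, because instances can be discarded.
class InMemoryStore {
  constructor() { this.items = new Map(); }

  // Appends a value under a key and resolves with a copy of the full list.
  append(key, value) {
    const list = this.items.get(key) || [];
    list.push(value);
    this.items.set(key, list);
    return Promise.resolve(list.slice());
  }
}

// Hypothetical aggregation step: collects messages by correlation ID and
// reports whether the expected number of messages has arrived.
function aggregate(store, message, expectedCount) {
  return store.append(message.correlationId, message).then(function (list) {
    return { complete: list.length >= expectedCount, messages: list };
  });
}

const store = new InMemoryStore();
aggregate(store, { correlationId: "order-123", item: "widget" }, 2)
  .then(function (first) {
    console.log("After first message, complete: " + first.complete);
    return aggregate(store, { correlationId: "order-123", item: "gadget" }, 2);
  })
  .then(function (second) {
    console.log("After second message, complete: " + second.complete);
  });
```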
The source code is available on github.com.