It's hard to imagine cloud without automation. Cloud computing has transformed the way we provision infrastructure and deploy applications because it makes functions that used to require lengthy manual processes available as an API call. Not taking advantage of this capability would seem outright silly. So, let's automate our Loan Broker application.
This is Part 4 of the Serverless Loan Broker mini-series. If you landed on this page first, here's a quick recap:
Automation is hardly a new idea—sysadmins have been replacing manual tasks with shell scripts for decades. Shell scripts make automation straightforward because they use the same commands that you'd issue by hand. For that very reason, my initial implementation uses the following shell script with calls to the AWS CLI (command-line interface) to create the bank functions (on Github):
account=`aws sts get-caller-identity --query Account --output text`

# TODO: edit to reflect your role to be used
role=arn:aws:iam::$account:role/service-role/CreditBureau-role-abcdefg

zip BankSns.zip BankSns.js

aws lambda delete-function --function-name=BankSnsPawnshop
aws lambda delete-function --function-name=BankSnsUniversal
aws lambda delete-function --function-name=BankSnsPremium

aws lambda create-function --function-name=BankSnsPawnshop \
  --runtime=nodejs12.x --handler=BankSns.handler --role=$role \
  --environment="Variables={BANK_ID=PawnShop,BASE_RATE=5,MAX_LOAN_AMOUNT=500000,MIN_CREDIT_SCORE=400}" \
  --zip-file=fileb://BankSns.zip

aws lambda create-function --function-name=BankSnsUniversal \
  --runtime=nodejs12.x --handler=BankSns.handler --role=$role \
  --environment="Variables={BANK_ID=Universal,BASE_RATE=4,MAX_LOAN_AMOUNT=700000,MIN_CREDIT_SCORE=500}" \
  --zip-file=fileb://BankSns.zip

aws lambda create-function --function-name=BankSnsPremium \
  --runtime=nodejs12.x --handler=BankSns.handler --role=$role \
  --environment="Variables={BANK_ID=Premium,BASE_RATE=3,MAX_LOAN_AMOUNT=900000,MIN_CREDIT_SCORE=600}" \
  --zip-file=fileb://BankSns.zip
The script packages the Bank source code, deletes any existing functions, and deploys three Bank Lambda functions, passing the respective configuration parameters.
Although convenient, this method has a major (and well-known) drawback: it isn't designed to deal with change. How would you deploy one additional bank? If you make a separate script for just that bank, you'd end up with a large collection of scripts that depend on each other. Deleting and re-creating all Bank functions, like this script does, unnecessarily interrupts the existing service. A smarter script could determine which banks already exist and update only the ones that need changing, but it'll become excessively complex as it has to cover an ever larger variety of cases. That's why modern cloud automation doesn't use shell scripts.
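To get a feel for where that complexity comes from, here is a minimal sketch of such a "smarter" script for a single bank. The helper name deploy_bank and its arguments are made up for illustration, and the sketch assumes the $role variable and BankSns.zip file from the script above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): create a bank
# function if it is missing, otherwise update it in place.
# Usage: deploy_bank <function-name> <environment-variables>
deploy_bank() {
  local name=$1 vars=$2

  if aws lambda get-function --function-name "$name" > /dev/null 2>&1; then
    # The function exists: push new code and settings without deleting it.
    aws lambda update-function-code --function-name "$name" \
      --zip-file fileb://BankSns.zip
    # The code update has to finish before the configuration can change.
    aws lambda wait function-updated --function-name "$name"
    aws lambda update-function-configuration --function-name "$name" \
      --environment "Variables={$vars}"
  else
    aws lambda create-function --function-name "$name" \
      --runtime nodejs12.x --handler BankSns.handler --role "$role" \
      --environment "Variables={$vars}" \
      --zip-file fileb://BankSns.zip
  fi
}

# Example call (commented out so that sourcing this file has no side effects):
# deploy_bank BankSnsPawnshop "BANK_ID=PawnShop,BASE_RATE=5,MAX_LOAN_AMOUNT=500000,MIN_CREDIT_SCORE=400"
```

Even this version ignores banks that were removed, roles that changed, and updates that fail halfway, which is exactly the ever-growing list of cases described above.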
Dating back to 2011, 5 years after the birth of AWS, AWS CloudFormation is one of the earliest cloud automation tools (Terraform came about in 2014 with Terraform 1.0.0 being released in 2021). CloudFormation looks more like a data structure than a script, using hierarchical YAML (or equivalent JSON) syntax to describe the resources that should be deployed. It follows a declarative approach by specifying a desired target state, e.g., three banks. By letting the tool figure out what resources to (de-)provision to reach that state from the current setup, it solves the script explosion problem from above.
CloudFormation is a new language, but you can get a small head start from the CLI by describing existing resources in YAML format via the --output parameter:
$ aws lambda list-functions --output=yaml --query='Functions[?starts_with(FunctionName, `BankSns`) == `true`]'
Although the result won't fit the CloudFormation script 100%, for example because CloudFormation requires additional settings like the source file, it's a reasonable start for simple resources.
I like to start simple, so my first CloudFormation template creates just a single bank function:
AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  BankRole:
    Type: String
Resources:
  BankSnsPawnShop:
    Type: AWS::Lambda::Function
    DeletionPolicy: Delete
    Properties:
      Runtime: nodejs12.x
      Code:
        S3Bucket: loanbroker-source
        S3Key: BankSns
      Handler: BankSns.handler
      Role:
        Ref: BankRole
      FunctionName: 'BankSnsPawnShop2'
      Description: 'Pawn Shop'
      Environment:
        Variables:
          BANK_ID: PawnShop
          BASE_RATE: '5'
          MAX_LOAN_AMOUNT: '500000'
          MIN_CREDIT_SCORE: '400'
After specifying a version header and requiring the security role as a parameter, my template defines a single resource of type AWS::Lambda::Function. All parameters that we passed on the command line before are now represented as nested YAML elements. For example, the environment settings can be found under Properties/Environment/Variables and the name of the handler code under Properties/Handler.
CloudFormation pulls the ZIP'd source code from an S3 bucket, so before we can execute the CloudFormation template, we upload (copy) it there (yes, via command line):
$ aws s3 cp BankSns.zip s3://loanbroker-source/BankSns
CloudFormation is based on the concepts of a template and a stack. The template is the file above - a description of the resources that should be provisioned. A stack is an actual deployment of the template that tracks the deployed state. You could think of it as classes and objects instantiated from a class, with the key difference that your instances are running systems, which also incur costs. Deleting a stack equates to deprovisioning all resources associated with it.
We create a stack LoanBrokerPubSub based on the template above to deploy a single bank function, specifying the required security role parameter (as always, replace it with your role).
$ aws cloudformation create-stack --stack-name LoanBrokerPubSub \
    --template-body file://LoanBrokerPubSub.yml \
    --parameters ParameterKey=BankRole,ParameterValue=arn:aws:iam::1234567890:role/service-role/CreditBureau-role-abcdef
{
    "StackId": "arn:aws:cloudformation:us-east-2:1234567890:stack/LoanBrokerPubSub/abcdef"
}
The CLI returns a StackId so we can refer to it later. Just having an ID doesn't mean the whole stack was created flawlessly, though, so it's a good idea to check the detailed event log:
$ aws cloudformation describe-stack-events --stack-name LoanBrokerPubSub
This command will show the time and status for each resource that is created by the template (you'll fondly remember the JMESPath syntax for the query parameter from Part 2).
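As a sketch of how that query parameter can tame the event log, the helpers below trim each event to its timestamp, resource, and status, and wait for stack creation to finish first. The function names stack_events and await_stack are made up for this example; the CLI commands and the StackEvents field names are the standard ones.

```shell
# Hypothetical helper: list timestamp, resource, and status for each stack event.
stack_events() {
  aws cloudformation describe-stack-events --stack-name "$1" \
    --query 'StackEvents[].[Timestamp,LogicalResourceId,ResourceStatus]' \
    --output table
}

# Hypothetical helper: block until creation finishes, then show the events.
# "wait stack-create-complete" exits with a non-zero code if creation fails.
await_stack() {
  if aws cloudformation wait stack-create-complete --stack-name "$1"; then
    echo "Stack $1 created"
  else
    echo "Stack $1 failed; inspect the events below" >&2
  fi
  stack_events "$1"
}

# Example call (commented out so that sourcing this file has no side effects):
# await_stack LoanBrokerPubSub
```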
So far, so good, but we want to do a bit more than just deploy a single bank. Being largely a data structure, CloudFormation doesn't support loops or expansions as you would find them in regular programming languages.
Provisioning multiple banks therefore follows the proven CTRL+C / CTRL+V method. A thorough read of the CloudFormation Resource Reference and a good dose of grit enabled me to also create the QuoteRequestChannel as an SNS topic, the QuoteResponseChannel as an SQS queue, and the message filter as an EventBridge rule. The script wires everything together by subscribing the Bank functions to the SNS channel and sending responses to the SQS queue via EventBridge. Those links are accomplished with resource references, e.g., to subscribe the banks to the topic (some unrelated settings are omitted):
BankUniversalSubscription:
  Type: AWS::SNS::Subscription
  Properties:
    TopicArn: !Ref QuoteRequestChannel
    Protocol: lambda
    Endpoint: !GetAtt BankSnsUniversal.Arn
and to send responses to the EventBridge bus:
SendQuoteUniversal:
  Type: AWS::Lambda::EventInvokeConfig
  Properties:
    FunctionName: !Ref BankSnsUniversal
    DestinationConfig:
      OnSuccess:
        Destination: !GetAtt FilterMortgageQuotesBus.Arn
Even for our simple example, the code grows quickly, so you'll find the 200+ lines on Github.
Security constructs such as queue and event bus policies are also represented as CloudFormation resources. For example, allowing our banks to receive quote requests and to publish responses requires the following policies:
BankPawnShopInvokePermission:
  Type: 'AWS::Lambda::Permission'
  DeletionPolicy: Delete
  Properties:
    Action: 'lambda:InvokeFunction'
    FunctionName: !Ref BankSnsPawnShop
    Principal: sns.amazonaws.com
    SourceArn: !Ref QuoteRequestChannel

AllowMessagesToResponseChannel:
  Type: AWS::SQS::QueuePolicy
  DeletionPolicy: Delete
  Properties:
    Queues:
      - !Ref QuoteResponseChannel
    PolicyDocument:
      Statement:
        - Action:
            - "SQS:SendMessage"
            - "SQS:ReceiveMessage"
            - "SQS:DeleteMessage"
            - "SQS:ChangeMessageVisibility"
          Effect: "Allow"
          Resource: !GetAtt QuoteResponseChannel.Arn
          Principal:
            AWS: !Ref AWS::AccountId
        - Action:
            - "SQS:SendMessage"
          Effect: "Allow"
          Resource: !GetAtt QuoteResponseChannel.Arn
          Principal:
            Service: events.amazonaws.com
Once defined, we can deploy the banks and queues (the Step Functions broker and aggregator aren't yet included) with a single command:
$ aws cloudformation update-stack --stack-name LoanBrokerPubSub \
    --template-body file://LoanBrokerPubSub.yml \
    --parameters ParameterKey=BankRole,ParameterValue=arn:aws:iam::1234567890:role/service-role/CreditBureau-role-abcdef
Proud of our creation, we can test the setup by sending a quote request to the SNS channel and looking for responses on the response queue:
$ aws sns publish --topic-arn arn:aws:sns:us-east-2:1234567890:MortgageQuoteRequest2 \
    --message '{ "SSN": "123-45-6666", "Amount": 500000, "Term": 30, "Credit": { "Score": 803, "History": 22 } }' \
    --message-attributes '{ "RequestId": { "DataType": "String", "StringValue": "ABC12345" } }'
The great news is that with a single command we can deploy a serverless solution that combines Publish-subscribe Channels, Message Queues, Message Filters, and Lambda Functions. However, composing such an automation script blends resource provisioning, application composition, component configuration, and security settings. Not only does this mix separate concerns, it also makes writing and debugging these scripts a non-trivial exercise.
Automating with CloudFormation is a huge step ahead from CLI scripts. However, getting the syntax right can be challenging at times (at least for me). The following snippet that configures the event bus to filter out empty mortgage quote responses might not look too complicated but took me an extraordinary amount of time to get right:
RouteMortgageQuotes:
  Type: AWS::Events::Rule
  DeletionPolicy: Retain
  Properties:
    Name: RouteMortgageQuotes2
    Description: "Filter out empty quotes"
    EventBusName: !GetAtt FilterMortgageQuotesBus.Name
    EventPattern:
      detail:
        requestContext:
          functionArn: [{ prefix: !Sub 'arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:BankSns2' }]
        responsePayload:
          bankId: [{ exists: true }]
    Targets:
      - Arn: !GetAtt QuoteResponseChannel.Arn
        InputPath: $.detail.responsePayload
        Id: MortgageQuotes
After trying to assemble an expression using the Fn::Join function, the Burning Monk saved me by recommending Fn::Sub instead. Nevertheless, I found the cognitive load to be high as this single line combines syntax from multiple languages:
functionArn: [{ prefix: !Sub 'arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:BankSns2' }]
This line filters incoming events to only process those originating from our banks, identified by the name prefix BankSns2. Because EventBridge only supports prefix matching, the code constructs a full ARN (Amazon Resource Name) prefix.
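!Sub is syntactic sugar for Fn::Sub, a CloudFormation intrinsic function that substitutes placeholders in a string.
${AWS::Region} and ${AWS::AccountId} are CloudFormation pseudo parameters that resolve to the current region and account ID, so the string expands to something like arn:aws:lambda:us-east-2:1234567890:function:BankSns2.
The prefix match is EventBridge event pattern syntax, which takes the form [{ prefix: "string"}].
functionArn is the field name used in the envelope (requestContext) added by the Lambda destination to form the invocation record.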
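The substitution and the prefix match can be mimicked in a few lines of shell, which helped me reason about which events pass. This is only an illustration of the semantics, not how EventBridge is implemented; the helper matches_prefix and the function names BankSns2PawnShop and CreditBureau are assumptions made for the example.

```shell
# Values taken from the examples in this post; substitute your own.
region=us-east-2
account=1234567890

# What !Sub does with the two pseudo parameters: plain string substitution.
prefix="arn:aws:lambda:${region}:${account}:function:BankSns2"

# Hypothetical helper mimicking a prefix match on a single field.
# Usage: matches_prefix <value> <prefix>
matches_prefix() {
  case "$1" in
    "$2"*) return 0 ;;
    *)     return 1 ;;
  esac
}

# An event from a bank function (assumed name) passes the filter.
matches_prefix "${prefix}PawnShop" "$prefix" \
  && echo "bank function: event passes"

# An event from any other function (assumed name) is dropped.
matches_prefix "arn:aws:lambda:${region}:${account}:function:CreditBureau" "$prefix" \
  || echo "other function: event is dropped"
```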
This single line pulls from several topic domains, including CloudFormation syntax, functions, and pseudo parameters, EventBridge filter patterns, and Lambda Destination event formats, so it's low-code but in a polyglot flavor. Luckily, my simple use case didn't need to supply integer values.
Defining the event bus target utilizes another syntax. This EventBridge SQS Target filters down the message content to the actual quote, removing the event envelope (a classic Content Filter):
Targets:
  - Arn: !GetAtt QuoteResponseChannel.Arn
    InputPath: $.detail.responsePayload
    Id: MortgageQuotes
These harmless-looking lines also combine syntax from multiple sources:
InputPath is an EventBridge attribute to pass filtered content to the target. Alternatively, you can use the Input (JSON literal) or InputTransformer attributes (a map of JSONPath expressions plus a template in form of a JSON literal with placeholders for the fields calculated in the map).
$.detail.responsePayload references a specific field from the event using JSONPath syntax. Nested fields are represented with dot notation here, as opposed to nested elements in the EventPattern attribute.
Arn references the MortgageQuotes SQS channel that non-empty mortgage quotes are sent to, using the intrinsic function GetAtt, which accepts the name of an attribute in dot notation, however without a $.
Although automation scripts are a great help, they can be verbose at times with too much boilerplate text. Oddly enough, they also contain very dense lines that combine syntax from multiple languages. Knowing that architecture is defined by meaningful decisions, I'd like to find a way to amplify the meaningful text while reducing the noise. In short, I am looking for useful abstractions. In AWS-land the next stop is SAM - The Serverless Application Model.
The Serverless Application Model comprises several elements, including the ability to build and run Lambda functions locally and to speed up (non-production) deployments with SAM Accelerate. At the heart of it are additional resource types that promise to make CloudFormation templates less verbose.
In our automation template above, defining a Lambda function that consumes messages from an SNS Topic and sends responses to EventBridge via Lambda Destination required three resources (AWS::Lambda::Function, AWS::SNS::Subscription, AWS::Lambda::EventInvokeConfig) plus IAM permissions (AWS::Lambda::Permission, AWS::Events::EventBusPolicy). SAM condenses this to a single Lambda resource with Events and EventInvokeConfig properties (I omitted the environment parameters below):
BankSnsPawnShop:
  Type: AWS::Serverless::Function
  Properties:
    PackageType: Zip
    Runtime: nodejs12.x
    CodeUri: src
    Handler: BankSns.handler
    FunctionName: 'BankSns3PawnShop'
    Description: 'Pawn Shop'
    Events:
      MortgageQuoteRequest:
        Type: SNS
        Properties:
          Topic: !Ref QuoteRequestChannel
    EventInvokeConfig:
      DestinationConfig:
        OnSuccess:
          Type: EventBridge
          Destination: !GetAtt FilterMortgageQuotesBus.Arn
You notice the new resource type AWS::Serverless::Function. Instead of specifying the ZIP package, this resource type accepts the source code because SAM will actually build and package the function (for my trivial example it's really just ZIP-ing up the source file). Also, this resource no longer specifies a Role attribute as SAM will generate one for us, including the required permissions. The reduction from five resources to one is surely welcome. The resulting template (on GitHub) is now 129 non-empty lines, a 35% reduction.
You build and run SAM applications from the command line (CloudShell has SAM pre-installed):
$ sam build
$ sam deploy --guided
The --guided option allows you to enter required settings (like the CloudFormation stack name) on the command line and stores them in a local file so you don't have to repeat the process. SAM expands the serverless resources into actual resources:
CloudFormation stack changeset
----------------------------------------------------------------------------------------------
Operation   LogicalResourceId                                 ResourceType
----------------------------------------------------------------------------------------------
+ Add       BankSnsPawnShopEventInvokeConfig                  AWS::Lambda::EventInvokeConfig
+ Add       BankSnsPawnShopMortgageQuoteRequestPermission     AWS::Lambda::Permission
+ Add       BankSnsPawnShopMortgageQuoteRequest               AWS::SNS::Subscription
+ Add       BankSnsPawnShopRole                               AWS::IAM::Role
+ Add       BankSnsPawnShop                                   AWS::Lambda::Function
+ Add       BankSnsPremiumEventInvokeConfig                   AWS::Lambda::EventInvokeConfig
+ Add       BankSnsPremiumMortgageQuoteRequestPermission      AWS::Lambda::Permission
+ Add       BankSnsPremiumMortgageQuoteRequest                AWS::SNS::Subscription
+ Add       BankSnsPremiumRole                                AWS::IAM::Role
+ Add       BankSnsPremium                                    AWS::Lambda::Function
+ Add       BankSnsUniversalEventInvokeConfig                 AWS::Lambda::EventInvokeConfig
+ Add       BankSnsUniversalMortgageQuoteRequestPermission    AWS::Lambda::Permission
+ Add       BankSnsUniversalMortgageQuoteRequest              AWS::SNS::Subscription
+ Add       BankSnsUniversalRole                              AWS::IAM::Role
+ Add       BankSnsUniversal                                  AWS::Lambda::Function
+ Add       FilterMortgageQuotesBus                           AWS::Events::EventBus
+ Add       QuoteRequestChannel                               AWS::SNS::Topic
+ Add       QuoteResponseChannel                              AWS::SQS::Queue
+ Add       RouteMortgageQuotes                               AWS::Events::Rule
Here you can see roles created for each function, e.g. BankSnsPawnShopRole, plus the required permissions. The cool part is that with just two command lines, you're ready to send quote requests and fetch the results from the MortgageQuotes3 channel (that query parameter is really coming in handy):
$ aws sns publish --topic-arn arn:aws:sns:us-east-2:1234567890:MortgageQuoteRequest3 \
    --message '{ "SSN": "123-45-6666", "Amount": 500000, "Term": 30, "Credit": { "Score": 803, "History": 22 } }' \
    --message-attributes '{ "RequestId": { "DataType": "String", "StringValue": "ABCD1234" } }'

$ aws sqs receive-message --queue-url https://sqs.us-east-2.amazonaws.com/1234567890/MortgageQuotes3 \
    --query='Messages[].Body' --max-number-of-messages=5 --wait-time-seconds=5
[
    "{\"rate\":6.134508050538122,\"bankId\":\"PawnShop\",\"id\":\"ABCD1234\"}",
    "{\"rate\":5.692421843390756,\"bankId\":\"Universal\",\"id\":\"ABCD1234\"}"
]

$ aws sqs purge-queue --queue-url https://sqs.us-east-2.amazonaws.com/1234567890/MortgageQuotes3
sqs receive-message doesn't delete messages from the queue, so for testing it's handy to purge the queue after receiving messages (it's great that SQS includes a Channel Purger). Also, be aware that a single call to sqs receive-message might not return all messages at once.
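If you'd rather not purge, one alternative is to poll until the queue stays empty and delete each message as you go. The following is a rough sketch under a few assumptions: the helper name drain_queue is made up, message bodies are assumed not to contain tab characters, and the handling of an empty result relies on the CLI's text output, so verify it against your CLI version.

```shell
# Hypothetical helper: read messages until one long poll comes back empty,
# printing each body and deleting the message afterwards.
# Usage: drain_queue <queue-url>
drain_queue() {
  local url=$1 batch handle body
  while :; do
    batch=$(aws sqs receive-message --queue-url "$url" \
      --max-number-of-messages 10 --wait-time-seconds 5 \
      --query 'Messages[].[ReceiptHandle,Body]' --output text)
    # With --output text, an empty result shows up as "None" or as nothing.
    if [ -z "$batch" ] || [ "$batch" = "None" ]; then
      break
    fi
    # Each line holds a receipt handle and a body, separated by a tab.
    printf '%s\n' "$batch" | while IFS="$(printf '\t')" read -r handle body; do
      printf '%s\n' "$body"
      aws sqs delete-message --queue-url "$url" \
        --receipt-handle "$handle" < /dev/null
    done
  done
}

# Example call (commented out so that sourcing this file has no side effects):
# drain_queue https://sqs.us-east-2.amazonaws.com/1234567890/MortgageQuotes3
```

Deleting by receipt handle touches only the messages you actually read, whereas a purge also discards anything that arrived in the meantime.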
What initially tripped me up is that specifying an EventBridge Target to an SQS channel doesn't automatically give the event bus permission to send messages. If you consider that the EventBridge resource type is still AWS::Events::Rule, this makes sense as all the SAM magic is reserved for resources of the AWS::Serverless types. So, the SAM template still needs an AWS::SQS::QueuePolicy to allow our event bus to send messages.
SAM certainly makes automation template development easier, especially for complex resources like API gateways. But did it really provide us with a new model, i.e. abstraction, as the name suggests?
Cloud without automation is just going to be a better data center, which isn't really what anyone wants. Especially with serverless applications, automation takes on a whole new meaning as it's less concerned with provisioning (the platform takes care of that) but much more with composition (which resource sends messages where) and application settings (like our bank parameters). However, these expanded responsibilities also stretch what automation languages like CloudFormation were originally designed to do. Time for us architects to zoom out.
Even for our simple demo application, the automation script handles multiple levels of application management: resource provisioning, application composition, component configuration, and security settings.
Having all these capabilities coded and version controlled is a definite asset. However, you can expect these respective settings to be made by different parties at different times and perhaps subject to different authorization rules. Parameters allow separation only for very simple cases (like the role identifier in my simple example), so you might need to resort to YAML / JSON manipulation (they're easily parsed!), or built-in mechanisms like nested stacks. The danger here is that you are building layers over layers (your templating system - SAM - CloudFormation - AWS API), which will make debugging cumbersome.
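As a minimal sketch of such a home-grown templating layer, the shell functions below generate the repetitive bank resources from a short list of parameters. The function names and the output file name are made up for illustration; the bank settings are the ones from the CLI script at the top, and the resource layout follows the single-bank template.

```shell
# Hypothetical generator: emit one CloudFormation Lambda resource per bank.
# Usage: bank_resource <bank-id> <base-rate> <max-loan-amount> <min-credit-score>
bank_resource() {
  cat <<EOF
  BankSns$1:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: nodejs12.x
      Code:
        S3Bucket: loanbroker-source
        S3Key: BankSns
      Handler: BankSns.handler
      Role: !Ref BankRole
      FunctionName: 'BankSns$1'
      Environment:
        Variables:
          BANK_ID: $1
          BASE_RATE: '$2'
          MAX_LOAN_AMOUNT: '$3'
          MIN_CREDIT_SCORE: '$4'
EOF
}

# Assemble a complete template from a fixed header and the list of banks.
generate_template() {
  cat <<EOF
AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  BankRole:
    Type: String
Resources:
EOF
  while read -r id rate max score; do
    bank_resource "$id" "$rate" "$max" "$score"
  done <<BANKS
PawnShop 5 500000 400
Universal 4 700000 500
Premium 3 900000 600
BANKS
}

# Example call (commented out so that sourcing this file has no side effects):
# generate_template > LoanBrokerBanks.yml
```

Adding a bank is now a one-line change, but a typo in the generator surfaces only as a CloudFormation error against the generated file, which is the debugging cost of the extra layer.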
As shown above, using YAML (or JSON if you prefer) syntax is convenient for expressing resource definitions. However, most automation languages like CloudFormation combine elements from multiple technical domains into the same file and often the same line. For example, the following snippet nests CloudFormation-defined elements (Properties, EventPattern) with event node names that are defined by AWS resources (detail, requestContext) and event node names defined by your application (bankId):
Properties:
  EventPattern:
    detail:
      requestContext:
        functionArn: [{ prefix: !Sub 'arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:BankSns2' }]
      responsePayload:
        bankId: [{ exists: true }]
On top of this, the expression combines syntax from CloudFormation functions (e.g., !Sub or ${AWS::Region}) and that used by the respective service (e.g., prefix in this case).
The basic syntax also makes the scripts verbose: a script that implements only a subset of the Loan Broker (the most complex elements, i.e., the Step Functions orchestrator and the Aggregator are missing) already contains over 200 lines.
The automation scripts we created provide a huge convenience and in fact a new way of working. For example, I had made a mistake by not specifying the correct bank IDs, and all I had to do was update the SAM template, run build and deploy, and SAM figured out that it only needed to re-provision those functions (SAM Accelerate might even have been able to do it on the fly). However, the automation languages don't really provide abstractions over the cloud resources: we are still dealing with Functions and EventBridges and Step Functions. You might guess that I have some ideas on how to describe asynchronous, distributed systems at a higher level of abstraction.
Using such abstractions isn't syntactic sugar or shortcuts but a different vocabulary that emphasizes the key design decisions. For example, I'd want to be able to express that my EventBridge Rule acts as a Message Filter and a Content Filter, highlighting the key parameters, which are the filter predicate and the content selection.
Distributed solutions should be loosely coupled, meaning that a change in one component doesn't propagate to other components. Not specific to automation tools but rather the platform itself, I spotted several instances where the current implementation would struggle to meet this test. For example, the EventBridge pattern filters on the specific event header added by the Lambda Destination (detail.requestContext.functionArn). If I were to change the implementation to use a different composition mechanism (e.g. sending messages directly from the function code), this logic in the EventBridge pattern would fail.
Likewise, when a Lambda Destination calls EventBridge, the data is inside the detail element (we referenced that above). However, for Lambda-to-Lambda calls you have the option of sending the response only by specifying responseOnly in the CDK Configuration. That option isn't available for other destination targets. Making the availability of options independent from other settings makes a system orthogonal and therefore more freely composable. Unix pipes are a great example: they are freely composable because all components read from and write to standard streams.
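To make the pipe analogy concrete, here is a small sketch with made-up sample data (rates loosely rounded from the quotes above). The function names are invented for the example; each stage only reads standard input and writes standard output, so stages can be added, removed, or reordered without touching the others.

```shell
# Made-up sample data: one "bankId rate" pair per line; an empty quote has no rate.
quotes() {
  printf '%s\n' "PawnShop 6.13" "Universal 5.69" "Premium"
}

# Message Filter: keep only quotes that carry a rate.
drop_empty() { awk 'NF == 2'; }

# Content Filter: strip everything but the rate.
rate_only() { awk '{ print $2 }'; }

# Pick the lowest rate (LC_ALL=C keeps "." as the decimal separator).
lowest() { LC_ALL=C sort -n | head -n 1; }

quotes | drop_empty | rate_only | lowest   # prints 5.69
```

Dropping rate_only from the pipeline, or placing another filter in front of it, requires no change to the remaining stages, which is the kind of orthogonality I'd like to see between cloud services.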
Luckily, I am not the first person to encounter these limitations. A new generation of automation tools like AWS CDK, Pulumi, or CDK for Terraform provide programming libraries for cloud automation. We might appropriately call this IaaC - Infrastructure as actual Code (which, not coincidentally, is also a chapter in my book Cloud Strategy).
That automation code gives us an ideal starting point for abstraction and tooling. That'll be the perfect topic for the next post!