Boosting Lambda performance with AWS Amplify
This will be the first in a series of articles I'll be writing about the engineering journey of Trueme - a dating, friendship, and advice platform for single parents.
We have used AWS Amplify to great effect to build out and manage our backend infrastructure — including DynamoDB for profile creation and search, Cognito for user management, S3 for photo storage, and Lambda/API Gateway for our APIs — all from the command line.
Out of the box, Amplify allows us to bootstrap Lambda functions with just a few CLI prompts.
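If you haven't seen it, the flow is essentially two commands (the exact prompts vary a little between CLI versions):

# Scaffold a new Lambda function; the CLI walks you through
# prompts for a name, runtime, and starter template.
amplify add function

# Deploy the function (and its CloudFormation stack) to the cloud.
amplify push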
During our beta testing cycle, several users noted that our APIs were slow, so I made a round of code-level optimisations, but they made only a slight difference.
I looked at our CloudWatch logs and noticed that some of the search API requests were taking upwards of 2000ms to execute. I knew our DynamoDB queries were reasonably quick, as we query GSIs (global secondary indexes) rather than using Scan, so the problem lay elsewhere.
After a bit of RTFM, I realised I could throw more memory at the Lambda functions by updating the CloudFormation template associated with each one.
Amplify generates a file called {function-name}-cloudformation-template.json, which holds all the orchestration information necessary to run that Lambda. Its Resources property covers things like environment variables, IAM roles, runtime versions, and timeout values.
I noticed that one can override the default setting of 128 MB by adding a MemorySize property to this section, so I bumped it up to 1024 MB.
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Lambda resource stack creation using Amplify CLI",
  "Parameters": {
    ...
  },
  "Conditions": {
    ...
  },
  "Resources": {
    "LambdaFunction": {
      ...
      "Properties": {
        "MemorySize": "1024",
        ...
      }
    }
  }
}
Then I realised that this would update the memory size for the Lambda functions in every one of our environments, all of which are non-production except one. While it is unlikely we would ever exceed the free tier in our non-prod environments, it felt prudent to increase the allocation for our production functions only.
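If you ever lose track of which environments an app has, the Amplify CLI can list them (a quick sketch; the prod environment name here is just ours):

# List all Amplify environments for this app
amplify env list

# Switch the local workspace to a different environment
amplify env checkout prod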
Luckily, the boilerplate CloudFormation templates gave us a clue. The Conditions section contains a property called ShouldNotCreateEnvResources, which looks something like this:
"Conditions": {
"ShouldNotCreateEnvResources": {
"Fn::Equals": [
{
"Ref": "env"
},
"NONE"
]
}
}
Without looking at the documentation, I could tell that this was some kind of function that returns true if the env parameter is NONE. So I had a crack at creating something similar: a condition that returns true if the current Amplify environment is our production environment.
"Conditions": {
"ShouldNotCreateEnvResources": {
"Fn::Equals": [
{
"Ref": "env"
},
"NONE"
]
},
"IsProduction": {
"Fn::Equals": [
{
"Ref": "env"
},
"prod"
]
}
}
Then I updated the MemorySize property to use Fn::If, which takes the name of a condition followed by the value to use when it is true and the value to use when it is false.
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Lambda resource stack creation using Amplify CLI",
  "Parameters": {
    ...
  },
  "Conditions": {
    ...
  },
  "Resources": {
    "LambdaFunction": {
      ...
      "Properties": {
        "MemorySize": {
          "Fn::If": [
            "IsProduction",
            "1024",
            "128"
          ]
        },
        ...
      }
    }
  }
}
So I ran amplify push on my dev environment and checked the function's basic settings in the Lambda console. The memory allocation remained at 128 MB.
This wasn't necessarily a sign of success; the true test would come when I deployed our production environment.
After deploying to production, a quick check showed all other environments still running at 128 MB, while production was running at 1024 MB.
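If you prefer the terminal to the console, the same check can be done with the AWS CLI (the function name below is a placeholder for whatever Amplify generated):

# Print the deployed function's memory allocation;
# this should read 1024 in prod and 128 everywhere else.
aws lambda get-function-configuration \
  --function-name myFunction-prod \
  --query MemorySize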
I noticed a decent speed increase in my version of the app, so I decided to leave it for a few days while we onboarded a few dozen more beta testers, and then analyse the data in CloudWatch.
What I saw was stunning. This graph shows the execution time for our main search API function for roughly a day and a half either side of the MemorySize update.
As you can see, the variation between minimum and maximum before the change was huge, anywhere from tens of milliseconds all the way to 2000 milliseconds! After the change, the spread between minimum and maximum was much, much smaller, with an average of about 145-150ms per request.
I couldn't believe this was down to memory alone, so I looked at the CloudFormation documentation again and noticed a very important point tucked away in the description:
MemorySize: The amount of memory that your function has access to. Increasing the function's memory also increases its CPU allocation. The default value is 128 MB. The value must be a multiple of 64 MB.
So there you go: increasing the memory available to a Lambda function also increases its CPU allocation. A two-for-one deal.
What other performance hacks have you found with Amplify? What features would you like to see integrated into the Amplify CLI?