Amazon released Lambda nearly 2 years ago at AWS re:Invent 2014. The service provides a managed environment that runs code invoked by an HTTP request or internally by configuring supported AWS services as event sources.
Calling this code and understanding how it accepts input, processes the request and then returns valid output is relatively straightforward. However, understanding the environment running the code and how it may affect its execution and influence its design requires some poking around.
So let’s take a look under Lambda’s hood and highlight interesting points.
When a Lambda function first executes, Lambda creates a new Linux container and copies the function’s code into it for execution. After the function has completed its execution, Lambda has the right to terminate the instance.
This requires writing Lambda functions in a stateless style, meaning that the function should not rely on the underlying compute infrastructure’s local file system or child processes to manage any kind of state between requests.
Persistent state must be kept in a remote store such as Amazon S3, Amazon DynamoDB or another data storage service. Keeping the Lambda environment stateless allows multiple instances to run in parallel when handling large bursts of concurrent requests.
Container Instance Reuse
Lambda’s right to terminate instances after they have finished executing enforces its stateless nature, but terminating instances after every function call would incur a performance hit as the entire container will need re-instantiation each time. To manage this case Lambda will reuse running instances to handle subsequent requests.
This is great. Not only will Lambda scale by spinning up as many as 100 instances at a time to process large bursts of requests, it will also keep instances running to handle subsequent requests, avoiding the overhead of creating and instantiating new containers and their environments each time the function executes.
Now this raises some questions. How long can an instance stick around before terminating? How many instances will remain running after a large burst and for how long? To answer these questions I created and performed some simple tests that show some interesting results. Let’s take a look.
Test 1 – Warm Instance Runtime
This will test how long Lambda container instances will run for before terminating. To determine an instance’s runtime, I created a Lambda function that increments a global integer each time the function executes.
The test will call the function once per second with the expectation that the global integer will continue to increase the longer the instance remains running. After the global integer resets back to 1, we can determine that a new instance has started.
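A function along these lines might look like the following minimal sketch, written here in Python. The original test code is not shown, so the names `handler` and `invocation_count` are assumptions made for illustration.

```python
# Module-level state is initialised once per container instance,
# not once per request, so it survives for as long as the instance is reused.
invocation_count = 0  # assumed name for the "global integer"


def handler(event, context):
    """Hypothetical Lambda entry point: bump the counter and report it."""
    global invocation_count
    invocation_count += 1
    # A value of 1 means this is the first call served by a fresh instance.
    return {"count": invocation_count}
```

Calling this once per second and logging `count` with a timestamp is enough to spot the moment the value drops back to 1.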
Let’s have a look at the results below.
Test 2 – Idle Instance Timeout
The previous test determined the runtime of an instance that is constantly receiving requests. In this test, we will determine how long an idle instance will remain running before terminating.
I used the same Lambda function written for the first test, but instead of calling the function once per second, the test calls it once and then again after 5 minutes.
If the global integer has reset, the test concludes that the instance was replaced and records the timestamp. If the integer has not reset, the test increases its wait time by 5 minutes, so it waits 10 minutes before making the next call.
This will repeat with the wait time increasing by 5 minutes each time until the global integer resets. The test ran over a 24 hour period.
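The probing loop can be sketched roughly as follows. This is an illustration rather than the original test harness: `invoke` is a hypothetical callable that calls the function and returns its counter value, and `max_wait_minutes` is an arbitrary cap added so the loop always ends.

```python
import time


def find_idle_timeout(invoke, sleep=time.sleep, step_minutes=5, max_wait_minutes=120):
    """Return the idle wait (in minutes) after which the counter reset, or None.

    `invoke` is an assumed callable returning the function's global counter.
    `sleep` is injectable so the loop can be exercised without real waiting.
    """
    invoke()  # first call: make sure an instance is warm
    wait = step_minutes
    while wait <= max_wait_minutes:
        sleep(wait * 60)        # leave the instance idle
        if invoke() == 1:       # counter restarted, so a new instance answered
            return wait
        wait += step_minutes    # still warm: idle a little longer next time
    return None
```

Passing a fake `sleep` and a fake `invoke` makes it possible to check the loop logic in milliseconds before pointing it at a real function.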
Test 3 – Instance Burst Idle Timeout
The previous tests focused on a single running instance. Next, I wanted to observe the idle timeouts of many concurrent instances.
The Lambda function from the previous tests was modified by adding a 1 second sleep. This helps ensure that a unique Lambda instance handles each request, because every invocation runs long enough for all of the requests to be in flight at the same time. It prevents an instance from finishing early and being reused for a request that has not yet been handled.
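As a rough sketch of that modification, again in Python and with assumed names (`handler`, `invocation_count`, `HOLD_SECONDS`), the counter function only needs a sleep before it returns:

```python
import time

HOLD_SECONDS = 1      # the 1 second sleep described above
invocation_count = 0  # assumed name for the per-instance counter


def handler(event, context):
    """Hypothetical entry point that keeps the instance busy for a moment."""
    global invocation_count
    invocation_count += 1
    count = invocation_count  # capture the value before sleeping
    # While this call sleeps, the instance cannot pick up another request,
    # so concurrent requests should land on separate instances.
    time.sleep(HOLD_SECONDS)
    return {"count": count}
```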
The test ran 4 independent variations of 25, 50, 75 and 90 concurrent requests. Each time the test initiates a set of concurrent requests, it records the values they return and then waits 60 seconds before making another set of requests. This 60 second interval stays fixed and repeats 20 times for each variation.
Lambda limits the number of running instances to a maximum of 100. For this reason, I limited the last variation to 90 concurrent requests to avoid bumping up against this threshold.
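One way to fire such a burst from a client is with a thread pool, as in the sketch below. It is not the original harness: `invoke` is a hypothetical callable that calls the function once and returns its counter value, and counting values equal to 1 is just one possible way to tell how many fresh instances answered.

```python
from concurrent.futures import ThreadPoolExecutor


def run_burst(invoke, concurrency):
    """Send `concurrency` requests at once and return the counter values."""
    # One worker per request so that no request queues behind another.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(invoke) for _ in range(concurrency)]
        return [future.result() for future in futures]


def count_fresh_instances(values):
    """A counter value of 1 means a newly created instance served the request."""
    return sum(1 for value in values if value == 1)
```

Running `run_burst(invoke, 25)` once a minute and logging `count_fresh_instances` for each round would show how many instances from the previous burst were still warm.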
If your site maintains a steady stream of traffic, it seems that the instances may continue running anywhere from 45 minutes to 3 hours. This presents an opportunity to store a short-lived cache in the global namespace of a Lambda function. Can your service benefit from a bit of caching? Perhaps, but you must weigh the risks.
This is only a quick peek under Lambda’s hood, and Amazon has the right to change this behaviour at any time. If you require a long-term solution or are integrating with critical infrastructure, it is probably not the best idea to rely on it.
However, if you have an adequate fallback plan in case this behaviour disappears, or you only need something short term, this could give your serverless solution that extra edge.
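To make the caching idea concrete, here is a minimal sketch of a time-limited cache held in module-level state. Everything in it is an assumption for illustration: the names `cached_lookup` and `load`, and the 60 second lifetime. Because the instance can vanish at any time, the cache only ever saves work and is never the source of truth.

```python
import time

CACHE_TTL_SECONDS = 60  # hypothetical lifetime; tune to how stale data may be
_cache = {}             # key -> (expires_at, value); lives as long as the instance


def cached_lookup(key, load, now=time.monotonic):
    """Return a cached value for `key`, calling `load(key)` on a miss or expiry.

    `load` is an assumed callable that fetches the value from the real store.
    """
    current = now()
    entry = _cache.get(key)
    if entry is not None and entry[0] > current:
        return entry[1]  # still fresh on this warm instance
    value = load(key)    # cold instance or expired entry: go to the source
    _cache[key] = (current + CACHE_TTL_SECONDS, value)
    return value
```

A fresh instance simply starts with an empty `_cache`, which is the fallback: every lookup goes to the remote store until the cache fills again.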
These tests provide some insight into the behaviour of the Lambda runtime environment. They ran over the course of a weekend and may produce different results if run on different days of the week. What will happen if a number of different Lambda functions are being called simultaneously under load? Will Lambda restart instances more often?
I encourage you to explore and share your findings in the comments below.