AWS Bites Podcast

104. Explaining Lambda Runtimes

Published 2023-11-17 - Listen on your favourite podcast player

In this episode, we celebrate AWS Lambda's 9th birthday by taking a deep dive into Lambda runtimes. We discuss how Lambda works, compare official runtimes vs. custom runtimes, and explain when and why building a custom runtime might be worth the effort. We talk through how custom runtimes work, options for deploying them, and potential use cases where they could be beneficial over standard runtimes.

AWS Bites is brought to you by fourTheorem, an Advanced AWS Partner. If you are moving to AWS or need a partner to help you go faster, check us out at fourtheorem.com!

In this episode, we mentioned the following resources:

Let's talk!

Do you agree with our opinions? Do you have interesting AWS questions you'd like us to chat about? Leave a comment on YouTube or connect with us on Twitter: @eoins, @loige.

Help us to make this transcription better! If you find an error, please submit a PR with your corrections.

Luciano: Happy 9th birthday AWS Lambda! Yes, AWS Lambda was launched nine years ago this week. And to celebrate this birthday today, we're going to answer the question, what's inside a Lambda function? I don't mean your JavaScript or Python code, I mean everything around it. How does Lambda work as a service? How does it execute your code and integrate it with the rest of the AWS ecosystem? Today, we'll deep dive into the fascinating topic of Lambda runtimes.

We will discuss how Lambda works, what a runtime is, and we will compare official runtimes versus custom runtimes. And if you stick until the very end of this episode, we will also share when and why putting the effort into learning about custom runtimes, or even building one, might actually be worth your time. I am Luciano, I'm here with Eoin, and today we are here for another episode of AWS Bites podcast. AWS Bites is brought to you by fourTheorem, an advanced AWS partner. If you're moving to AWS or need a partner to help you go faster, check us out at fourTheorem.com. Let's start by recapping what a FaaS serverless system is, how it works, and in general, how it relates to AWS Lambda. What do you say? Yep, Lambda is the FaaS, or Functions as a Service, service within AWS.

Eoin: And it's an event-based system. You write some code in the form of a function; that function takes an event as its input and returns a single response. It supports multiple programming languages. So as a client, you send your function code and the event configuration to your cloud provider like AWS, and they will make sure to run your code when the event happens. And this is the magic: the service just figures out where to run your code and how to place it within their vast set of compute infrastructure.
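To make the shape of that contract concrete, here is a minimal sketch of such a function in Python. The event payload shown is just an illustrative API Gateway-style example, not a fixed schema:

```python
import json

# A minimal Lambda handler: it receives an event (a dict) and a context
# object, and returns a single response. The event shape below mimics an
# API Gateway payload purely for illustration.
def handler(event, context):
    # API Gateway sends null when there are no query parameters
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    # The returned value is the function's single response
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```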

And it's also well integrated with lots and lots of other things. Your function can be used to connect or to extend different cloud services. A few examples of that: you can use a Lambda function with API Gateway to define the logic for your web requests. You can use a Lambda function to process jobs from a queue like SQS and signal which ones have been processed correctly and which ones may have failed. And another example is that you can use Lambda functions to define your custom GraphQL resolvers. But there's lots and lots more besides that, which I think we've covered in previous episodes. So this magic, how does it work? Well, I think at the core of it is the concept of a runtime. So what is a runtime and why do we need one?

Luciano: So yeah, you said that the cloud provider needs to have some kind of infrastructure that it can use to execute the code when needed, so when a specific event happens. This infrastructure also needs to make sure that all the information is passed correctly into the function, so some kind of description of the event needs to be passed as an input. And then the function is going to do some magic, it's going to do some computation, and eventually provide a response or an output.

And the runtime also needs to collect that output and do something useful with it. For instance, if there are integrations to be triggered, it needs to take the output and use it to wire things together correctly. And of course, there might be errors, because in the cloud there are always errors around the corner. So if there are errors, the runtime needs to make sure it captures them.

In some cases, there might be the possibility to retry the execution, so it needs to make sure the execution is retried. If there are too many retries, eventually it needs to stop retrying and make sure the errors are communicated correctly to the user in the form of logs. So the runtime has to do basically all of this coordination around the execution of a specific Lambda function. There is also an extension system inside Lambda, so the runtime is also responsible for integrating possible extensions.

And this is something that you might have seen, for instance, if you use an external provider to collect telemetry, like Datadog: they might provide their own extension that you embed in the Lambda execution environment. And as your Lambda is running, they can collect all sorts of information and record it in their telemetry system so you can inspect it later.
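As a rough sketch of what such an external extension does under the hood, here is a minimal example in Python against the documented Lambda Extensions API. The extension name is a hypothetical placeholder (it normally has to match the extension executable's file name), and a real extension would need proper error handling:

```python
import json
import os
import urllib.request

# Base URL of the documented Lambda Extensions API
API = f"http://{os.environ['AWS_LAMBDA_RUNTIME_API']}/2020-01-01/extension"

def register():
    # Register for invoke and shutdown lifecycle events; the name is a
    # hypothetical placeholder matching the extension's file name
    req = urllib.request.Request(
        f"{API}/register",
        data=json.dumps({"events": ["INVOKE", "SHUTDOWN"]}).encode(),
        headers={"Lambda-Extension-Name": "demo-extension"},
        method="POST",
    )
    with urllib.request.urlopen(req) as res:
        return res.headers["Lambda-Extension-Identifier"]

def main():
    extension_id = register()
    while True:
        # Blocks until the next lifecycle event is available
        req = urllib.request.Request(
            f"{API}/event/next",
            headers={"Lambda-Extension-Identifier": extension_id},
        )
        with urllib.request.urlopen(req) as res:
            event = json.load(res)
        # A real telemetry extension would collect and flush data here
        if event.get("eventType") == "SHUTDOWN":
            break

if __name__ == "__main__":
    main()
```

Speaking of runtimes, there are generally two main categories.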

One is built-in runtimes and the other is custom runtimes. When we talk about built-in runtimes, we generally talk about the common languages that we have seen with Lambda: Node.js, Python, Java, .NET, Ruby, Go. The dedicated Go runtime has actually recently been deprecated, and we'll talk a little bit more about that in a second. And generally, you can expect that the most recent versions of these programming languages are supported.

So if you have a long-term supported version of a programming language, that's generally going to be supported with a built-in runtime. You can also use a custom runtime, as I mentioned, and the idea there is that you can support virtually anything else that you want to run in Lambda. There are some cases that are actually well supported, even though they are still custom runtimes, because AWS provides libraries that make it easy for you to build a custom runtime supporting specific languages.

And this is generally the case for languages that compile to native binaries, for instance, Rust, Go, and C++. And I was mentioning before that Go was deprecated as a built-in runtime, and this is because now you have a library that allows you to very easily build a binary that contains all your code and the runtime itself, and then you can ship it as a custom runtime. So it's pretty much the same experience you would get with Rust or C++.

And that's, of course, not all. You can effectively build custom runtimes for anything you want. Maybe you want to run older or newer versions of Node.js or Python, or languages that are not even supported by Lambda itself with the built-in runtimes. A very common example is Bref, which is basically an open source PHP runtime. Another one exists for the Swift language, which is really well supported, even though it's not officially coming from AWS.

So you need to download it from an open source project and figure out exactly how to compile it and ship it. And then there might be other interesting use cases, even though maybe a little bit less mature at this point. For instance, I've seen Lua runtimes, WebAssembly runtimes, Elixir, PowerShell, Bash. And there are even crazier, I would say esoteric, examples, like the Brainfuck language. People have even spent their time building a COBOL or a Fortran runtime, mostly just for fun. So let's maybe try to deep dive a little bit into what a custom runtime actually is. How does it work?

Eoin: Yeah, well, a custom runtime is really just a program that communicates between your handler and the control plane that is passing events in from the Lambda service itself. When you're creating a runtime, you essentially just create a program that needs to be called bootstrap and is placed at the root of your Lambda package. This can be a Linux binary or a shell script. Remember that the Lambda runtime environment is just a Linux environment, and it's running on Amazon's Firecracker lightweight virtual machines, which are really low-overhead, highly optimized, container-like things that provide an isolated and secure sandbox for a Lambda function.

So your bootstrap program needs to target the Amazon Linux distribution. I think recently they've been moving to Amazon Linux 2023, the latest version, which has just been released. Now, what does this program do? Well, there are two phases within this runtime: initialization and then processing. In the initialization phase, it's going to retrieve some settings, and it can read special environment variables.

One is _HANDLER, which tells you which handler should be executed. Then you've got the LAMBDA_TASK_ROOT variable, which tells you where the code is stored. And then you've got the AWS_LAMBDA_RUNTIME_API environment variable, which gives you the host and port of the Lambda runtime API. This is a really important part, which we'll talk about in a little bit. There are lots of other environment variables too.

The full link to all of them will be in the show notes. Once that's done, it can load the handler file, and this leads into your function initialization. There are language-specific operations here. It might require initializing your runtime environment, like your JVM, for example, and then loading classes, loading jars, etc., or loading libraries. And for compiled languages, so we're talking about Rust, Go, C++, the code is generally preloaded as part of that runtime binary.

You also need to think about handling errors during this phase. So if any error happens while loading the runtime, the program needs to notify a specific API endpoint and exit cleanly with an error code. When it moves into the processing phase, it's essentially running a loop, like an event loop: it fetches one event at a time from the runtime API and passes the event payload to the handler function.

Then it will collect the handler's response and forward it back to AWS. There are also other secondary things that it needs to think about, like propagating tracing information, creating the context object, handling errors, and cleaning up resources. Now, we talked about this runtime API. This is how you communicate with the AWS Lambda service. And the AWS Lambda service is responsible for receiving the events from its API, like from Invoke or InvokeAsync.

And then it needs to think about worker placement, finding a worker that has the capacity to run your function, and then passing the event to the runtime on that worker. So your runtime is running on a fleet of workers, and the Lambda service is going to pass events to it. You need to poll for them using this runtime API. There's a GET method on a specific invocation next path that you need to poll to get the next event.

And you just do this one event at a time. The call will simply hang until there's a new event available, so you might have to set a long timeout on this HTTP connection. When you're finished, there's also a POST to an invocation response URL, where you can signal that the request has been completed, and that's used to send the response payload back to AWS so that it can use it for other downstream invocations.

You can actually use this API to do response streaming as well, which we discussed in a previous episode. And we'll give a link to that episode in the show notes, as well as the link to how to use this API for response streaming. Another one to be aware of is the invocation error response URL. And that's a separate path that you need to use if you've got an error in your function, and you need to report that back.

And then you can pass in special headers to report the specific kind of error. The body of that will also contain error information and even a stack trace. The fourth URL that might be useful to know in the runtime API is the one you can use to report initialization errors during the initialization phase of your runtime. So that's basically how the runtime API works. I think all Lambda runtimes are using this to communicate with the Lambda service, just in slightly different ways. But one of the things you mentioned, Luciano, is that you can create your own custom runtime, and then you can interact with this runtime API directly.
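To tie those four endpoints together, here is a minimal sketch of what a bootstrap program could look like, written in Python purely for readability (a real bootstrap can be any Linux executable). The paths are the documented Runtime API ones; the error types and the missing context object are simplifications:

```python
#!/usr/bin/env python3
# Minimal custom runtime bootstrap sketch. A real runtime needs more
# robust error handling, a proper context object, and tracing support.
import importlib
import json
import os
import sys
import traceback
import urllib.request

API = f"http://{os.environ['AWS_LAMBDA_RUNTIME_API']}/2018-06-01/runtime"

def post(path, payload):
    req = urllib.request.Request(
        f"{API}/{path}", data=json.dumps(payload).encode(), method="POST")
    urllib.request.urlopen(req).close()

# Initialization phase: load the handler named by _HANDLER (e.g. "app.handler")
try:
    sys.path.insert(0, os.environ["LAMBDA_TASK_ROOT"])
    module_name, function_name = os.environ["_HANDLER"].rsplit(".", 1)
    handler = getattr(importlib.import_module(module_name), function_name)
except Exception:
    post("init/error", {"errorMessage": traceback.format_exc(),
                        "errorType": "Runtime.InitError"})
    sys.exit(1)

# Processing phase: fetch one event at a time, forever
while True:
    # This GET blocks until the Lambda service has an event for us
    with urllib.request.urlopen(f"{API}/invocation/next") as res:
        request_id = res.headers["Lambda-Runtime-Aws-Request-Id"]
        event = json.load(res)
    try:
        result = handler(event, None)  # None stands in for the context object
        post(f"invocation/{request_id}/response", result)
    except Exception:
        post(f"invocation/{request_id}/error",
             {"errorMessage": traceback.format_exc(),
              "errorType": "Runtime.HandlerError"})
```

So if somebody's thinking about using a custom runtime, what do you have to do to ship that?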

Luciano: Yeah, I guess the question is: you have built this integration using the runtime API you just described, now how do you actually push it to production? Generally speaking, there are two options. One is that you can zip the bootstrap file along with your code and ship everything as one package, or you can create a Lambda layer. Zipping everything together makes more sense when you're doing something that's a one-off kind of use case.

You are doing something once, and you don't expect it to become a general use case within your company, or even within the open source space for other people or other customers. So maybe it's just easier to build one zip file and ship it. This is also effectively the only option when you use compiled languages, again like Go, C++, or Rust, because you are producing a single binary that contains both the runtime code, which comes in as a library, and your own custom handler code.

You just zip it and ship it as one artifact that contains both the runtime and your own custom business logic. The other option, as I mentioned, is a Lambda layer, and this is more convenient when you have a use case that is a little bit more common. You might want multiple Lambdas using pretty much the same runtime, or maybe you are building something that could even be an open source project.

Maybe you want to support a new language, and you expect other people to be willing to use the same runtime because they also want to play with that language in Lambda. And the way you do this is actually pretty simple, because again, you just need to zip that bootstrap file, and then you can publish it as a Lambda layer. Another case where this is very convenient is when you have interpreted languages. Once you have shipped the runtime as a layer, anyone who wants to use it just needs to go, even from the web UI, to the Lambda service, create a new Lambda, select the custom runtime, select the specific layer that implements the runtime, and then use the built-in editor to create script files.

For instance, if we have built a runtime that can support Bash scripting, you just need to select the layer, create a file called, I don't know, handler.sh, write your code there, and, assuming you follow the file naming and directory conventions the runtime expects, you can just run your Lambda from there without needing to do anything more complicated than that. So this is actually convenient in the cases where you have either scripted languages or you want to do something a bit more reusable.
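As an illustrative sketch of that flow, here is how you could publish such a runtime layer and wire it to a function with boto3. The file names, layer and function names, and role ARN are all hypothetical placeholders:

```python
import boto3

lambda_client = boto3.client("lambda")

# Publish the zipped bootstrap (runtime-layer.zip is a hypothetical file
# containing the bootstrap executable at its root) as a reusable layer
with open("runtime-layer.zip", "rb") as f:
    layer = lambda_client.publish_layer_version(
        LayerName="bash-custom-runtime",          # hypothetical name
        Content={"ZipFile": f.read()},
        CompatibleRuntimes=["provided.al2023"],
    )

# Create a function on the 'provided' runtime that uses the layer;
# the function package only needs to contain handler.sh
with open("function.zip", "rb") as f:
    lambda_client.create_function(
        FunctionName="hello-bash",                # hypothetical name
        Runtime="provided.al2023",
        Role="arn:aws:iam::123456789012:role/lambda-role",  # placeholder
        Handler="handler.sh",
        Code={"ZipFile": f.read()},
        Layers=[layer["LayerVersionArn"]],
    )
```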

But one thing that is always worth mentioning when it comes to Lambda layers is that they are not a way to escape the package size limitations that you have with Lambda, because layers count toward the total 250 megabytes unzipped that you can have for your Lambda package. So if you have a very big runtime, maybe because you have something like a JVM, or something that includes lots of native libraries your code can use, this can easily run into the hundreds of megabytes. In that case, you need to be very careful, because the runtime alone might go over the 250 megabytes, or it might leave very little space for the user code. Which brings us to the next topic, because zipping the code and Lambda layers are actually not the only two options; there is also the option of using containers. Do you want to talk about that, Eoin?

Eoin: Yeah, I'm warming more and more to the idea of container image deployments for Lambda, because they're showing a lot of benefits, and one of the huge benefits is that you've got 10 gigabytes to include all of your layers and dependencies and everything else. Everything we've talked about so far has been about zip packaged functions. When you have standard zip packaged functions, you have the option of using the built-in or the custom runtimes. In the case of the built-in runtimes, they are completely managed by AWS, who are responsible for keeping them up to date and secure. This is one of the big benefits of Lambda in general, and one of the reasons why people don't like custom runtimes or container image deployments is that you sacrifice that if you go with one of those.

With container images, you don't have the provided built-in runtimes like you do with zip packaged functions. Instead, AWS maintains and provides base images that you can use to build the container image that you deploy for your function, and these are available for all of the provided runtimes already mentioned. The shared responsibility model here is different, though, because although AWS provides these base images, they are not going to be automatically updated without you redeploying your function.

You will need to continuously build and deploy against the latest base image in order to stay secure and up to date. You also have the option of going with a completely custom approach with container image deployments, similar to zip packaged functions, where you use the very same Lambda runtime interface client that communicates with this runtime API, and you just add that client to your container image.

So you have a choice with the container image build. You either start with one of the base images and add your subsequent layers, or you can start with your own image. If you've got some machine learning image, for example, and you need all of its base components, then you just add the runtime interface client and the entry point at the end. So if you're using Lambda container image deployment to take advantage of existing images you already have, and you just want to use them with Lambda, it's quite likely that you'll start with your own base image and add that runtime interface client, even if you otherwise have no need for a special custom runtime.

With container images, you also have the benefit that you can use the runtime interface emulator, which allows you to run your container image locally. You just get an HTTP endpoint to post events to, and it then behaves a lot more like a real Lambda function. Not completely like a Lambda function, but it's a nice local emulation capability that you get for free with container images, and I think it's sometimes nicer than the other local emulation options you have.
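For example, here is a minimal sketch of invoking a container image locally through the runtime interface emulator, assuming the image was started with something like docker run -p 9000:8080 my-image (the image name and port mapping are just the conventional example values):

```python
import json
import urllib.request

# The runtime interface emulator exposes this documented local endpoint
URL = "http://localhost:9000/2015-03-31/functions/function/invocations"

# Any JSON event payload will do; this one is just an example
event = {"name": "AWS Bites"}
req = urllib.request.Request(
    URL, data=json.dumps(event).encode(), method="POST")
with urllib.request.urlopen(req) as res:
    print(json.load(res))  # prints the function's response payload
```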

Now, it's probably worth saying that container images might even be a preferable way to deploy functions if you're trying to reduce cold start times. I've been doing a bit of benchmarking of certain runtimes recently, particularly runtimes that involve typically heavy package dependencies. I'm talking particularly about the Python data science stack, when you need pandas and pyarrow and numpy and all of these things, and you quickly run into the 250 megabyte limit.

Now, AWS actually released a paper where they describe all of the special performance optimizations that they made for container image deployment, which caches files that are used by multiple images, even by multiple different customers. So the time to load a 10 gigabyte function image may actually be less than the time it can take to cold start a 250 megabyte zip. That's very counterintuitive, but it is the case, and I've definitely seen results that show it. We'll link that paper in the show notes. It's pretty short, but it talks about the neat caching strategies that the Lambda team put in place to make sure that container image deployments can be really fast, even though you're talking about up to 10 gigabytes of storage. So going back to runtimes and custom runtimes then, Luciano, what is our recommendation for people? Do you need a custom runtime? Is this something people should be thinking about doing in their job for any particular reason, or are there good use cases for it?

Luciano: I've personally been playing a lot with the Rust runtime, so I kind of had to explore this space a little bit more in depth, and I am very excited to understand more how Lambda works and to use Rust in the context of Lambda. But if I have to be honest and think about kind of the generic use cases that I've seen in the industry, I think the answer to the question, do you really need a custom runtime?

Most of the time, the answer is probably not. And the reason is that the official runtimes basically give AWS a bigger chunk of that shared responsibility, and you are free to think more about the business value that you want to provide. You don't have to think about all the details of the runtime; you just write your own code, and everything should work out of the box for you. AWS focuses on keeping the runtime up to date, performant, and secure, and you just focus on writing your code and making sure that it's as bug-free as possible.

And AWS-provided runtimes are also potentially more optimized to reduce cold start times, because AWS can easily keep the runtime cached on the local workers, which is not something that can be done with your own custom runtime, because, of course, every published custom runtime can be different from customer to customer. Most likely it's going to be very different, so there is really no point in AWS trying to cache it locally.

And the other thing is in terms of pricing, because with the AWS-provided runtimes, you don't pay for the cold start phase for the most part. There is actually a very interesting article by Luc van Donkersgoed that explains a little bit of the research he has been doing, but the summary of it is that if you run your own custom runtime, you pay not just for the execution time, but also for the initialization time during a cold start. For example, if your custom runtime takes 300 milliseconds to initialize and your handler runs for 50 milliseconds, a cold invocation is billed for roughly the full 350 milliseconds, whereas with a managed runtime you'd mostly pay for just the 50 milliseconds.

So there might be an impact in terms of additional cost if your runtime is not noticeably faster than what you could get with the built-in runtimes. And again, I think the point of this episode was more to try to understand a little better how Lambda works under the hood. But then there are cases where you actually might need a custom runtime. What could those cases be? One case could be that you have some legacy code that runs on a very old version of Python, let's say Python 2, which is something I still see frequently enough, and you don't have time right now to move it to something more up to date and use the latest runtime.

So what you can do as a quick and dirty solution is create your own runtime using Python 2, and then you can run your own code. Of course, this is far from ideal, because you need to be aware that you are still exposed to a bunch of security issues, since old runtimes like that are no longer supported from a security perspective. So this is only a dirty hack that you can use for a limited amount of time, and eventually you need to have a plan to migrate to newer versions.

A more interesting use case is when you want to be on the bleeding edge and try very new runtime versions, for instance Python 3.12, which I think was released last month, and I believe there isn't an official version of that runtime supported by AWS yet. So if for any reason you want to use some of the newest features or the additional performance gains that that version provides, and you're willing to take on the cost of building your own runtime in exchange, that could be a very valid use case.

And we can have a very similar conversation about Node.js 20, even though it seems that AWS is going to release that one very soon. Another use case, which we have actually seen across a bunch of people, is experimenting with different JavaScript runtimes. Maybe they want to play with Deno or Bun in the context of Lambda, either to be able to run TypeScript more natively or because they want to compare the different performance characteristics.

And because there is no official Deno or Bun runtime, the only option you have in that case is to build your own runtime and package everything that way. We already mentioned other cases, like wanting to use compiled languages such as Rust, Go, or C++, and this can be a good use case when you are looking for extreme performance, or to reduce latency to the very minimum, or maybe because you need to use some kind of native library that only exists for these compiled languages.

In those cases, I would recommend you don't reinvent the wheel. AWS gives you libraries for these languages that are really well maintained and have a really good developer experience. So just use the library: that will cover 90 percent of what you need to do, and you can focus on actually writing the business logic of your own Lambda. And the last point is, if you want to use a language that is not supported yet, or maybe is never going to be supported because it's kind of a niche language, that's definitely a good use case.

And a true story is that we once had a customer with a significant existing code base in Tcl (sometimes pronounced 'tickle'), which is a relatively old language, but apparently there is lots of software that historically has been built with this scripting language. And as part of their migration strategy to the cloud, we considered using a custom Lambda runtime, because some of their workload was very event driven.

So creating a Lambda would have been very convenient from an architecture perspective, but of course the runtime was missing, and we had to weigh up whether it was worth building it or not. This is the kind of consideration you might be making when you're facing these migration scenarios, and you have workloads that might be well suited to running in a Lambda, but the language support is just not there yet.

And I think that's everything. We are at the end of this episode. I really hope that you found this particular episode informative and useful. We always look for your feedback and your comments, so don't be shy: reach out to us and tell us what we can do better, what you liked, and what you didn't like. And if you think this is useful, please remember to like and subscribe, and share it with your friends and colleagues. That way we can grow the channel together and always make sure we provide the best value we can to you. So thanks again, and we'll see you in the next episode.