AWS Bites Podcast

101. Package and Distribute Lambda Functions for fun and profit

Published 2023-10-27 - Listen on your favourite podcast player

Today we embark on a fascinating journey into the world of AWS Lambda functions and how to make them accessible to the public. In a recent use case, involving the creation of a public Lambda function for AWS users, we asked ourselves some interesting questions. How can you securely, cost-effectively, and conveniently publish AWS resources, especially Lambda functions, for others to use? And... can we possibly make some money out of this?

Join us as we explore various options and share our findings for making your AWS resources available to the world. We dive into the Serverless Application Repository (SAR), an AWS treasure trove for publishing resources. And SAR isn't the only way! We also discuss alternatives like CloudFormation templates, GitHub publishing, Terraform modules, and container images. We explore the pros and cons of these methods and debate the implications in terms of cost, security, and ease of use. Finally, we touch on the AWS Marketplace as a platform to monetize your AWS resources.

AWS Bites is brought to you by fourTheorem, an Advanced AWS Partner. If you are moving to AWS or need a partner to help you go faster, check us out at fourtheorem.com!

In this episode, we mentioned the following resources:

Let's talk!

Do you agree with our opinions? Do you have interesting AWS questions you'd like us to chat about? Leave a comment on YouTube or connect with us on Twitter: @eoins, @loige.


Eoin: We recently had a use case for creating and publishing a public Lambda function so other AWS users could make use of it. This gave us an interesting challenge. How do you easily publish a function or indeed any other AWS resource in a way that is simple for users to adopt, but also is secure, cost-friendly and maintainable? Today, we are going to go through all of the options and let you know what we recommend if this is something you want to do. My name is Eoin, I'm joined by Luciano, and this is the AWS Bites podcast. AWS Bites is brought to you by fourTheorem, an advanced AWS partner. If you are moving to AWS or need a partner to help you go faster, check us out at fourtheorem.com. Luciano, you raised this question recently. Apart from your generosity, what was the rationale for thinking about making a Lambda function public in the first place?

Luciano: Let me try to describe the specific use case. I wanted to create an OIDC authorizer for API Gateway. And I think in order to really understand why that could be interesting, we need to remember something that we talked about in a previous episode (we'll have the link in the show notes): API Gateway currently has effectively two competing implementations, REST and HTTP.

And these two implementations have different feature sets. The business use case I was working on required us to have a private API Gateway, so something that is accessible only from a private VPC. And because we are using OIDC tokens, there needs to be some kind of authorizer that can verify that the OIDC token received is valid and that it's related to a specific user with the right permissions before the request is forwarded to the backend.

Now, if you want to use an OIDC authorizer with the HTTP version of API Gateway, there is actually a very nice one already built in, so you just need to configure it, you don't have any extra cost, and AWS will take care of everything for you. But if you want a private API Gateway, that's only available in the REST version of API Gateway. So this forced us to choose the REST version, which meant we didn't have the option to use the built-in OIDC authorizer.

So at that point, the only option was to build our own custom authorizer, which thankfully is something you can do with Lambda. You can create a Lambda that acts as an authorizer and give it to API Gateway, so API Gateway is going to call that Lambda to validate the token and then decide whether to forward the request or not, depending on the result of that validation. Now, we have the solution working, and this is apparently a gap that exists in API Gateway: if you're doing private API Gateways and you're forced to use REST, you don't have an OIDC authorizer. So I think it could make sense to make our solution available to other people, because this particular use case could be relatively common in the market.
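
To make the authorizer flow described above concrete, here is a minimal sketch of a token-based Lambda authorizer for a REST API. It is not the implementation discussed in the episode: `verify_token_stub` is a hypothetical placeholder standing in for real OIDC validation (signature, issuer, audience, expiry), and the demo token and ARN are made up. Only the shape of the event and of the returned policy follows what API Gateway expects.

```python
from typing import Any, Callable, Dict, Optional


def build_policy(principal_id: str, effect: str, resource: str) -> Dict[str, Any]:
    """Build the response document API Gateway expects from an authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,  # "Allow" or "Deny"
                    "Resource": resource,
                }
            ],
        },
    }


def verify_token_stub(token: str) -> Optional[str]:
    """Hypothetical stand-in for OIDC validation.

    A real authorizer would verify the JWT signature against the provider's
    keys and check claims. Here we only accept one hard-coded demo token and
    return the subject it belongs to.
    """
    return "demo-user" if token == "Bearer valid-demo-token" else None


def handler(
    event: Dict[str, Any],
    context: Any,
    verify: Callable[[str], Optional[str]] = verify_token_stub,
) -> Dict[str, Any]:
    """Entry point: allow the call only if the token verifies."""
    subject = verify(event.get("authorizationToken", ""))
    effect = "Allow" if subject else "Deny"
    return build_policy(subject or "anonymous", effect, event["methodArn"])
```

API Gateway passes the caller's token in `authorizationToken` and the invoked method in `methodArn`; the returned policy decides whether the request is forwarded to the backend.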

So we were trying to think, okay, if we want to open source it, how do we make it easy for users to install it? Ideally, it should be something like a one-click install with some minimal configuration, and the first thing that came to mind was SAR, the Serverless Application Repository, so that's probably our first option to consider. SAR basically allows you to define your infrastructure as code, using CloudFormation or something similar, and then publish it in this kind of publicly available repository.

And you have to specify a special metadata section, which is AWS::ServerlessRepo::Application, which is the way that you can attach additional metadata to your project, things like description and version, so when people are browsing this catalog of different solutions, they will see exactly what the specific solution is about. You can also use parameters, so every time that you need something configurable, that's one way to expose the option to the users so they can provide their own configuration.

For example, in our use case, we most likely need to make the authorizer support different token providers. If you're using Azure AD, you probably want to specify your own tenant, but you might be using other OIDC providers. And maybe the user also wants to make sure that the tokens are issued for a specific audience, so they will need to provide that audience, or maybe they want to validate other token claims. So they need to have a way to specify all these different options, and parameters allow us to make that flexible enough.
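
As a rough illustration of the metadata and parameters just described, the sketch below builds a SAM template as a Python dictionary and prints it as JSON. The application name, author, parameter names (`JwksUri`, `Audience`), handler and code path are all hypothetical; only the `AWS::ServerlessRepo::Application` metadata key and its `Name`, `Description`, `Author` and `SemanticVersion` fields come from the SAR publishing format.

```python
import json
from typing import Any, Dict


def build_template() -> Dict[str, Any]:
    """Return a SAM template (as a dict) that could be published to SAR."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Transform": "AWS::Serverless-2016-10-31",
        # Metadata shown to people browsing the SAR catalog.
        "Metadata": {
            "AWS::ServerlessRepo::Application": {
                "Name": "oidc-authorizer",  # hypothetical name
                "Description": "Lambda authorizer that validates OIDC tokens",
                "Author": "example-author",  # hypothetical author
                "SemanticVersion": "0.1.0",
            }
        },
        # Values each user supplies when installing the application.
        "Parameters": {
            "JwksUri": {
                "Type": "String",
                "Description": "URL of the OIDC provider's key set",
            },
            "Audience": {
                "Type": "String",
                "Default": "",
                "Description": "Expected audience claim (optional)",
            },
        },
        "Resources": {
            "AuthorizerFunction": {
                "Type": "AWS::Serverless::Function",
                "Properties": {
                    "Handler": "app.handler",  # hypothetical module path
                    "Runtime": "python3.11",
                    "CodeUri": "./src",
                    "Environment": {
                        "Variables": {
                            "JWKS_URI": {"Ref": "JwksUri"},
                            "AUDIENCE": {"Ref": "Audience"},
                        }
                    },
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_template(), indent=2))
```

Passing the parameters through as environment variables is just one design choice; it keeps the function code identical for every installation while letting each user point it at their own provider.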

You can also use this approach to make private resources or private solutions, so it's something you might consider to use internally in your own company. If you have certain things that you think might be useful for other teams or for other projects, you can just publish them as SAR applications, and then they will be available inside your AWS account. So it's not something that you use only for public things, but you can also consider it for private solutions that you want to make reusable.

Now, once you have published something on SAR, other people can install it using the CLI or using infrastructure as code. The name is Serverless Application Repository, so you might think, okay, this is just for Lambdas, right? This is just for serverless things. But in reality, because you are effectively writing CloudFormation, you can use this approach to publish any CloudFormation template with any resource, so you can also go beyond the scope of serverless applications, if that's something that makes sense for you.
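
For the infrastructure-as-code route, a consumer's SAM template can reference a published application through an `AWS::Serverless::Application` resource. The helper below is only a sketch of that resource's shape; the application ARN, version and parameter values are invented for illustration.

```python
from typing import Any, Dict


def nested_sar_app(
    application_id: str, semantic_version: str, parameters: Dict[str, str]
) -> Dict[str, Any]:
    """Build an AWS::Serverless::Application resource pointing at a SAR app."""
    return {
        "Type": "AWS::Serverless::Application",
        "Properties": {
            "Location": {
                "ApplicationId": application_id,
                "SemanticVersion": semantic_version,
            },
            # Passed through to the published template's Parameters section.
            "Parameters": parameters,
        },
    }


# Hypothetical ARN and values, shown only to illustrate the call.
authorizer_app = nested_sar_app(
    "arn:aws:serverlessrepo:eu-west-1:123456789012:applications/oidc-authorizer",
    "0.1.0",
    {"JwksUri": "https://login.example.com/keys", "Audience": "my-api"},
)
```

Pinning `SemanticVersion` means a consumer only picks up a new release of the application when they change their own template.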

So the idea is more that if you want to make bits and pieces of your infrastructure reusable and configurable, that's one way of doing it, regardless of whether it's serverless or not. I really like this approach because the user experience is pretty good. You can browse this catalog and see all the different solutions. There is some degree of documentation that describes every single solution, and then when you want to install it, it's pretty much a one-click approach.

Or it's very seamless the way you do it. There is one disadvantage, though. I have been using some of these publicly available solutions from other creators, and the problem is that, especially with things like Node.js, where the runtime evolves quite rapidly, the owners of these solutions don't always keep the runtime up to date. So you might end up in a situation where you want to use a specific SAR solution, but the runtime is not available in AWS anymore. So you are kind of forced either to open a PR and get the owner to update and republish, or just fork it and maintain it yourself. So this might be one of the downsides: because somebody else is maintaining the solution, you need to make sure that they're actually committed to keeping it up to date and maintaining it every time there is a breaking change like that. Now, of course, this is not the only way to share Lambda functions. We have other options. So any ideas?

Eoin: Once you have a CloudFormation template, there's actually a lot of options around how you can share it. You can just create it and publish it on GitHub or anywhere else on the web. The main disadvantage there is that, by just putting it in a GitHub repo, you're giving your users a bit more work. Versioning support is something you will have to think about yourself. And another thing is that you will have to decide how to package the Lambda function code. So if you publish it in a GitHub repository, you can always just let the user package and deploy for themselves.

For example, you could provide a SAM template along with the code assets. This might be more work for the user because they'll need all the tools to deploy the function in whatever language you have chosen. On the other hand, it does have the advantage that the code is easily visible and the user has the freedom to change things as needed, fork it and make their own versions of it. Now, when you're creating Lambda functions, you have the options of specifying the code inline.

You can also specify it as a base64-encoded zip file or put a zip file on S3. I think the zip file on S3 is probably the most common. The inline options are easier to publish because you don't have to worry about buckets, but they limit how much code you can have because there's a maximum there of four megabytes for the zip. And then you have to think about how you bundle dependencies into that inline code.

If you do go for the bucket option, you essentially have to make the bucket public if you want it to be shared and usable as is by the user. Now, you can restrict your bucket to read-only access on specific prefixes, so just GetObject. And you can even use condition keys in the policy to restrict access to the AWS Lambda service itself, so that it's the only principal that will be able to read the code when it's deploying the Lambda function.
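
As a sketch of the read-only, prefix-scoped policy described above, the function below generates a bucket policy that grants only `s3:GetObject` on a single prefix. The bucket name and prefix are hypothetical. The condition keys mentioned in the episode are deliberately left out because the transcript doesn't name them; the GitHub example referenced in the episode is the place to look for the exact configuration.

```python
import json
from typing import Any, Dict


def public_read_policy(bucket: str, prefix: str) -> Dict[str, Any]:
    """Allow anonymous s3:GetObject on one prefix only (no list, no write)."""
    clean_prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadOfPublishedArtifacts",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                # Only objects under the given prefix are readable.
                "Resource": f"arn:aws:s3:::{bucket}/{clean_prefix}/*",
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and prefix.
    print(json.dumps(public_read_policy("example-artifacts", "releases/"), indent=2))
```

Keeping published artifacts under a dedicated prefix means everything else in the bucket stays private even though the bucket itself has a public statement.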

And this is something that we have tried out, and we have a GitHub repository with a code example showing you exactly how to configure that bucket. It is a public bucket, so it's not going to be to everyone's taste. It's getting to a stage now where public buckets are becoming socially unacceptable, as they're regarded as being a bit insecure, but it can be done for specific cases like this. Sometimes if you want to provide code publicly via S3, you need to have public access.

So there are ways of doing that where you give the least privilege possible. Another simple way to let users deploy your CloudFormation template is to create a one-click URL. That's been around for a while. You might have seen websites and GitHub repos with click to deploy in CloudFormation, and it just gives you the ability to have a button on your website that would take users directly to the CloudFormation UI with the template preloaded from S3.
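
A one-click link of this kind is just a console URL with the template location and stack settings encoded in it. The sketch below builds one using the console's quick-create link format; the region, bucket URL, stack name and parameter are hypothetical, and it assumes the console accepts percent-encoded values.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def quick_create_url(
    region: str,
    template_url: str,
    stack_name: str,
    params: Optional[Dict[str, str]] = None,
) -> str:
    """Build a CloudFormation console link with the template preloaded."""
    query = {"templateURL": template_url, "stackName": stack_name}
    # Template parameters are passed with a "param_" prefix.
    for name, value in (params or {}).items():
        query[f"param_{name}"] = value
    return (
        "https://console.aws.amazon.com/cloudformation/home"
        f"?region={region}#/stacks/create/review?{urlencode(query)}"
    )


# Hypothetical values, for illustration only.
link = quick_create_url(
    "eu-west-1",
    "https://example-bucket.s3.amazonaws.com/template.yaml",
    "oidc-authorizer",
    {"Audience": "my-api"},
)
```

The resulting link can sit behind a button in a README or on a website; the user still reviews and confirms the stack in their own account.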

Now, if you don't want to make the template available on GitHub or S3, you can publish it as a module on the CloudFormation registry. So this also gives you options for public and private access like the serverless application repository, and it will also allow you to do versioning. It's basically a way to publish a set of resources in a template and let other users include that module in their template.

Then when the user deploys their own template, CloudFormation will automatically inline all of your resources from the module. Now, the CloudFormation registry is there for lots of different purposes. You can publish your own providers there too. It's not incredibly common to see it used, but if you're doing something public, it's an option. If you want it to be private, so just for your organization's accounts, then you might say, well, why bother using the CloudFormation registry? Because I think Service Catalog is probably a more common approach than the CloudFormation registry in that case. So you've got, I think, quite a few options when you just have a CloudFormation template and some code, but of course, it's not restricted to CloudFormation, Luciano. Modules are something you can do also with Terraform or OpenTofu, I guess.

Luciano: Yeah, exactly, and this is one of the killer features of Terraform. If people have a preference for Terraform over CloudFormation, you can make things reusable by defining your infrastructure as code as Terraform modules. The idea is that you can package together a collection of resources in Terraform files, and then you expose an interface that can receive inputs for configuration and provide outputs, so you can connect what you generated from your module with the rest of the infrastructure that you're working on.

And this is a fairly common approach. People using Terraform should be quite used to this, especially since modules have been available for so long. So in this particular use case, what you could do is basically define everything you need for this Lambda Authorizer to work as a module and then make this module available. So now, once you have the module, what are the options to make it available? And I think the most common one is to just publish it on GitHub, because one of the many ways that you can install modules coming from third parties is just by referencing the GitHub repository.

So if that's a public repo, it's very easy. Most of the publicly available modules out there follow this approach. There is also another option called the Terraform Registry, which is a bit more centralized. It's managed by HashiCorp, the company behind Terraform. And as a Terraform user, you can publish your own modules into the registry. This is basically designed specifically for sharing Terraform modules and providers.

So you can also use that not just for modules, but if you have more advanced use cases where you want to create custom providers, maybe because you are interacting with resources outside AWS, you're using other, I don't know, SaaS solutions, and you are creating your own Terraform binding code to be able to provision resources into other third-party SaaS. So it's very powerful because it kind of gives you all the extensibility of Terraform in one place.

Once you have your modules in the Terraform registry, you can easily include them in your configuration so anyone can just reference a module from the registry. Another approach which is somewhat similar to CloudFormation is that you can just make things available in a URL. So this is called direct download, and it's basically a tarball or a zip file that can live in a CDN or your own website or basically anywhere else where you can access through an HTTP URL.

And this way, you have another way to reference all the Terraform module code and include it into another project when needed. I don't know if I see a use case for this. I think if I had this need, I would probably prefer GitHub rather than having to think about how to host that package myself at a URL. But if you have strong reasons not to use GitHub, that can probably be another approach.
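
The three ways of referencing a module mentioned above differ only in the `source` address. To stay in one language, the sketch below emits Terraform's JSON configuration syntax from Python rather than HCL; the organization, module names, URL and the `audience` input are all hypothetical.

```python
import json
from typing import Any, Dict, Optional


def module_block(
    source: str, version: Optional[str] = None, **inputs: Any
) -> Dict[str, Any]:
    """Build the body of one Terraform module block."""
    body: Dict[str, Any] = {"source": source}
    if version is not None:
        # Version constraints apply to registry sources.
        body["version"] = version
    body.update(inputs)  # module input variables
    return body


config = {
    "module": {
        # 1. Straight from a public GitHub repository.
        "authorizer_github": module_block(
            "github.com/example-org/terraform-aws-oidc-authorizer",
            audience="my-api",
        ),
        # 2. From the Terraform Registry: <namespace>/<name>/<provider>.
        "authorizer_registry": module_block(
            "example-org/oidc-authorizer/aws", version="1.0.0", audience="my-api"
        ),
        # 3. Direct download of an archive over HTTPS.
        "authorizer_http": module_block(
            "https://example.com/oidc-authorizer-module.zip", audience="my-api"
        ),
    }
}

if __name__ == "__main__":
    # Could be written to a file such as main.tf.json.
    print(json.dumps(config, indent=2))
```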

There is another idea that might come to mind. So far we have been thinking about packaging this Lambda into a zip file, but you probably know that this is not the only way to provide Lambda code: the other option is to use container images. And if we think about that, we open up another bunch of possible options for hosting the Lambda code, because basically we can host it anywhere you can host a container image.

And the obvious options are you can use Docker Hub or you can use an ECR registry or you can even use GitHub registry because recently GitHub opened up support also for Docker images in their own registry. This is actually not something we have tried, but because you can do public images, that should work out of the box because it's still using the OCI standard in terms of allowing people to pull their containers.

There is one thing that I think is worth discussing because we were actually thinking about this option. When you think about hosting stuff on S3, if this is your own bucket, then imagine that that thing becomes very successful, it goes viral, everyone is using it, you have millions of downloads per week, and then suddenly you realize, wait, I'm paying for all of this. I'm paying for all the access to this bucket.

So this is maybe not desirable because you created something open source, you are not monetizing it, and suddenly you even have a cost. One thing that is commonly done in those cases, when people are sharing resources over S3 but don't want to take a hit on the cost, is to use a feature called Requester Pays, which basically allows you to say: you can only access this public resource if you agree to pay for the access cost.

There is one problem with this in our particular case, which is that you need to pass a header in the S3 request to say, I accept the cost of downloading this. So it's almost like saying, I am aware that there is a cost, I'm not just downloading this for free, I'm going to be taking on the cost of the download. And this way, AWS basically allows this mechanism where you as a publisher don't have to take the cost and the user is aware that they are paying the cost for the download.
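
To show what that opt-in looks like from the downloader's side, here is a small sketch that prepares the arguments for an S3 `GetObject` call against a Requester Pays bucket. It assumes the third-party boto3 SDK would be used to send the request (shown only in a comment), and the bucket and key are hypothetical.

```python
from typing import Dict

# The HTTP header that tells S3 the caller accepts the request and transfer cost.
REQUESTER_PAYS_HEADER = {"x-amz-request-payer": "requester"}


def requester_pays_get_object_kwargs(bucket: str, key: str) -> Dict[str, str]:
    """Arguments for a GetObject call that opts in to paying for the download.

    With boto3 (an assumption, not part of this sketch) these would be used as:
        boto3.client("s3").get_object(**kwargs)
    where RequestPayer="requester" is sent as the x-amz-request-payer header.
    """
    return {"Bucket": bucket, "Key": key, "RequestPayer": "requester"}


# Hypothetical bucket and key.
kwargs = requester_pays_get_object_kwargs("example-artifacts", "releases/authorizer.zip")
```

The request also has to be authenticated so that S3 knows which account to bill, which is why this only works when the caller controls how the download is made.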

The problem is that because it requires an extra header, this is not something that is built in when you reference the Lambda code from S3. You cannot just say, use the x-amz-request-payer header. So basically, this option is not available for you. And this is one reason to use SAR: in that case, all your code is going to be hosted in a bucket that is owned by AWS. So you don't have to worry about cost, and you don't have to worry about the fact that people cannot specify the request payer header. So basically, this reinforces the idea that SAR is probably the simplest approach for this particular use case. That makes me think of something you mentioned, that SAR is like an app store for your resources. So now the next question is, what if we actually want to make a business out of it? Maybe we want to make this something that people have to pay to use, because maybe it's giving them so much value, and we have so much maintenance burden, that it makes sense to offer it as a paid service. Does that make sense? And if it does make sense, what are the options in terms of hosting it as a paid solution?

Eoin: While SAR might be a bit like an app store, you can't monetize it. But there is an option for that. You can publish CloudFormation templates to the AWS Marketplace and set a price there. You might be familiar with AWS Marketplace as a way to get third-party products like AMIs and SaaS solutions. You can even buy things like Datadog through AWS Marketplace, so that they end up on your AWS bill.

But you can also provide CloudFormation templates as a product, and then you can charge people a monthly or a once-off cost, or use lots of other billing options. So if you think there is a market for your highly prized Lambda function, this is a reasonable option. Now, I don't really have any experience of doing CloudFormation templates in the AWS Marketplace, but I frequently listen to the Cloudonaut podcast.

We've mentioned it a few times where Michael and Andreas talk about their experiences in publishing their products to the Marketplace. And we will have a link in the show notes to a really great blog post from them with an accompanying video. And it's all about how to provide CloudFormation as a product on the AWS Marketplace. And unlike the Apple App Store, they don't completely fleece you on commission.

I think it's much more reasonable on the Marketplace. I think that's probably a good place to wrap this one up. And I hope we've covered all the options for publishing Lambda functions and indeed other AWS resources for public consumption. We generally recommend SAR, the Serverless Application Repository, as the first option since it handles the code distribution and a lot of the complexity. But let us know if you can think of any other creative ways of doing this. Also, watch out to see what Luciano does in the future, and whether he manages to become a high-earning AWS Marketplace CloudFormation entrepreneur! 😜 Thank you for listening and until next time, goodbye. 👋