Luciano: Lambda functions are small units of code that achieve a very specific purpose. It's always a good idea to keep your code short, clean and simple, and yet sometimes you find yourself writing a lot of boilerplate code in every function to do common things like parsing events, validation, loading parameters and a lot more. The Middy.js framework was designed to help you keep Lambda function code simple, letting you focus on the business logic and clearing away duplication and boilerplate. By the end of this episode, you will know how Middy.js works with JavaScript and TypeScript, how to perform validation, event parsing and parameter loading, and how you can even write and use your own middlewares with Middy. You're also going to learn a little bit about the history of Middy, who is using Middy right now, and how the community is evolving around Middy. My name is Luciano and today I'm joined by Eoin and this is the AWS Bites podcast.
Eoin: Luciano, I know that you created Middy.
Eoin: I remember hearing about it all the way back when it launched in 2017, 2018. How did it come about and why did you start it?
Luciano: Yeah, it's an interesting story. I'm going to try to summarize it in a minute.
Luciano: But basically I was working with this company, which was a spin-off of ESB, which in Ireland is one of the main electricity providers. And we were building an innovative project around energy trading. And we decided to build it entirely serverless, which I think was very brave at the time. This was around, I think 2016 was actually the year. And so it was the very beginning of serverless. Lambda was still quite new.
There wasn't really a lot of documentation out there and case studies. But we were really excited about this idea because the project was like a startup and we wanted to keep different components very simple and then build on top of those and evolve it that way. And funny enough, our assumption was that if we use Lambda, then our code is going to be very simple and we're going to focus strictly on the business logic because everything else is done for you by Lambda, by the runtime itself. That was kind of our initial assumption.
And then we realized very quickly after we wrote the first prototype that in reality, our Lambdas were far from simple. There was so much boilerplate in every single Lambda and there was so much inconsistency because we were literally copy-pasting this boilerplate around and then we were not keeping it in sync. And at that point, we realized that there must be a better way to manage all our code base, avoid all this duplication and make sure that our functions would be, I suppose, as focused on the business logic as possible.
So basically, we explored some of the ideas that we had seen in frameworks like Express.js or other Node.js frameworks. And we thought, OK, in those frameworks, you literally have the same problems, even though you build more monolithic applications. You still have to do a bunch of things like validation, authentication, and response serialization, all these kinds of concerns that generally go around the actual business logic that you want to implement in a particular endpoint. So we took inspiration from the way that frameworks like Express solved that problem by basically using middlewares and trying to push all these concerns outside your controllers or Lambdas or whatever you want to call your unit of business logic for an endpoint. And we tried to apply that same principle to our Lambdas.
At that point, we realized, OK, in a Lambda, you don't really have something like Express, even though at the time there were already ways to put Express in a Lambda, but we didn't feel that was the right way of doing it. And we also wanted something a little bit more generic that we could use even for non-HTTP related Lambdas. So we basically ended up implementing our own middleware engine specifically built for Lambda. And then we used that and it helped us a lot to simplify our code and remove all this duplication. All this generic boilerplate code became a unit that we could easily write once, make testable, reuse, keep evolving, and then consistently use everywhere else. And that was it. We basically used it for about one year and we were very happy with it. And eventually we decided to open source it. And that was basically how Middy came to be.
Eoin: OK, very good.
Eoin: And I guess you and your team at this startup were the original contributors. Have you managed to grow much of a community around it? Are there other maintainers now?
Luciano: Yeah, that's an interesting story because just shortly after we open sourced this project, the company I was working with effectively stopped. It was a startup, a very experimental project, kind of a spin-off born just to experiment with a particular idea. But then eventually they decided not to go ahead with that idea. So the whole project ended and everyone found a different path working for other companies. So the main core group kind of dissolved at that point and everyone was doing something else and people were not really interested in continuing to work on Middy because they didn't really have a use case anymore. But I'm very passionate about open source and I felt like there was something there that was worth continuing. Some people were starting to use it and they were very happy. They actually found it was solving the same problems for them that we saw. So we realized, okay, there is value in the community for something like this.
So what I did, even though I moved to a company where I was not doing that much serverless anymore, was keep maintaining Middy for about another year. And meanwhile, a little bit of a community organically formed around the open source project, like people that were just coming randomly asking for help or maybe submitting PRs and contributing in all sorts of different ways, writing documentation as well. And among these people, specifically, there was Will Farrell, who was one of the main contributors and he was helping a lot in making sure that Middy was a serious project, not just something done and left on GitHub where people might just copy-paste things. Yeah. He was literally putting a lot of effort into making sure that it was always up to date, the documentation was clear, there were examples, and also adding more and more middlewares, because Middy is not just the runtime, there are also a bunch of built-in middlewares that you can just use and configure.
So eventually I decided I did not have enough time and focus to continue being the main maintainer of Middy. So I asked if somebody wanted to step in and take over and then Will decided to do that. And that was, I think, around 2019 or 2020, but I think we made that official in 2020 with the v1 release.
Eoin: Okay.
Eoin: So I remember using the pre-v1 release and you could use Middy, you installed one package, it came, I think, with a bunch of built-in middlewares and you could also write your own. How has it changed since then? Because I know that you've had another milestone release recently.
Luciano: Yeah, I think the first big change was that when we started working on Middy, it was still the time where everything was callbacks. Even writing a Lambda, the signature was your function, then event, context and callback. And actually Middy already supported a way to not use callbacks, but to use async/await and promises. But at the time, async/await wasn't even available in the versions of Node.js that were mainstream. So basically the way it was working was by using Babel: you needed to transpile your code and then it was just giving you an interface. But at the end of the day, your Lambda was still being exposed to the Lambda runtime as a callback-based function. So it was kind of an abstraction layer, and it was a little bit messy. And I think that's something that we kept doing throughout all the 0.x versions. Then with version 1, I think the ecosystem was mature enough to start to use async/await consistently. So we decided to go with version 1 because we cleaned up all that mess and made it much more integrated with the ecosystem, basically real async, and not just a simulation of all of that through transpilers. So that was the first big milestone.
Also, initially Middy was very monolithic. It was just one package and you got everything: the core middleware engine, but also, I don't know, I think there was something in the order of 10 or 12 different middlewares with all their own dependencies, because different middlewares might have different dependencies. So it wasn't really a small package. You really needed to have a strong use case for it to make sense to import that package and include it in your Lambdas.
So we decided, okay, if we break this down into smaller units and we do a monorepo where every unit is published independently, then people can just install the core and then only the middlewares that they really need to use. And this way we can offer an API that is much more lightweight and it's not going to affect your Lambda runtime because you are importing only the code that you actually need. So this was another big change from version 0 to version 1. Everything became a monorepo. We started to adopt the @middy namespace on npm, so you install @middy/core and then you decide to install all the other middlewares independently. So that was, yeah, I suppose the story of experimenting with version 0.x and getting to a state with version 1 where we felt, okay, this is really something that people can use and have a good experience with.
Eoin: Okay, nice. So you mentioned all these different middlewares then.
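To make the callback-versus-async point from this answer concrete, here is a minimal TypeScript sketch of how an async handler can be exposed through a callback-style interface. It only illustrates the general idea and is not the code Middy 0.x used; the names `toCallbackHandler`, `AsyncHandler` and `CallbackHandler` are made up for this example.

```typescript
// Callback-style handler signature: (event, context, callback).
type Callback<R> = (error: Error | null, result?: R) => void;
type CallbackHandler<E, R> = (event: E, context: unknown, callback: Callback<R>) => void;

// Promise-based handler signature: (event, context) => Promise.
type AsyncHandler<E, R> = (event: E, context: unknown) => Promise<R>;

// Hypothetical adapter: expose an async handler through the callback interface.
function toCallbackHandler<E, R>(handler: AsyncHandler<E, R>): CallbackHandler<E, R> {
  return (event, context, callback) => {
    handler(event, context).then(
      (result) => callback(null, result),
      (error: unknown) => callback(error instanceof Error ? error : new Error(String(error))),
    );
  };
}

// The business logic is written with async/await...
const greet: AsyncHandler<{ name: string }, string> = async (event) => `Hello ${event.name}`;

// ...while the function handed to the runtime still follows the callback convention.
const callbackGreet = toCallbackHandler(greet);
```

With async handlers supported natively by the runtime, an adapter like this is no longer needed, which is the clean-up described for version 1.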
Eoin: So what are some of the common things you can do with Middy? Maybe we could talk about some of the canonical examples with these core middlewares.
Luciano: Yeah, so there is actually a page in the documentation, and we'll drop a link in the show notes, that basically showcases all the official middlewares. So we have this concept of community-maintained middlewares, but also official middlewares. And the difference is that we recognize that there are a bunch of use cases that are so common that it's worth having those use cases solved within Middy. And every time we do a new release, we make sure that all these middlewares are maintained and they work well with the changes that we might have introduced in the new version. So that's why we have this list of official middlewares, and we basically maintain them together with the core engine. But then, of course, there is an active community and people are creating all sorts of middlewares that are useful to them. So on the website, you can also find a list of community-maintained middlewares, and they are not necessarily always up to date or tested together with the core, but we curated a selection of the ones that we think are reasonably well written and that you might use without too many issues.
So the ones that are in the core, I'm just going to mention a few. We organize them in different groups. There are ones that are related to handling input, I don't know, doing validation of the input, or dealing with the fact that certain events in AWS are a little bit flaky. There are certain gotchas that are not obvious, like, I don't know, certain strings are encoded in ways that you might not expect. And we also have middlewares that will normalize your JSON for you, basically giving you a cleaner JSON so you don't have to think about it.
For instance, the one use case that I think is worth mentioning just to explain this better is S3 events. So when you have a file in S3, if the path of the file contains certain characters, you will receive an object as an event that says the key of this file is a string, but that string encodes the special characters in a certain way. And that has actually been a source of bugs for me in the past, because I never realized that the string was encoded until I actually had a case where it was using special characters. And then my Lambda would explode because I would just take the string as is and use it without realizing that I needed to decode it first. So we have a normalizer that will make sure that if there is any special character, when you get your event, it's already converted to a proper clean string that you can just use. I think the example is if you have a space, rather than getting a space, you get a %20 or something like that, or a plus. I'm not really sure, but it's one of those gotchas that, yeah, you don't expect. So this is one class where you can simplify handling inputs and validation and make sure that the events are clean enough so that you can just use the data without having to do additional conversion. Then there is also parsing stuff.
For instance, if you are building, I don't know, an API that receives data from a form, you might want to use the proper algorithm to decode that form-encoded input. Or if it's JSON, you don't want to do JSON.parse manually. Maybe you just want to have the body already parsed as an object. Or, I don't know, if it's XML, because you are implementing an API that needs to receive XML, there is a parser for that as well. And then there is also something similar for responses. So if you're building an API that needs to send a response in a certain format, like again, JSON or XML or YAML, whatever, you can have your own serializers and do that. And the best part is that there is also a content negotiation middleware, where if you want to build an API that can receive different types of inputs and respond in different formats, it follows the HTTP specification to negotiate. OK, I am receiving XML and I expect to receive back XML, and your Lambda business logic remains completely abstracted from all of that. It just needs to receive an object and produce an object back. And then this middleware takes care of deserializing and serializing requests and responses respectively. So basically you have all these middlewares to try to focus more and more on the business logic and leave all these extra concerns to the middleware layers.
Eoin: That sounds really great.
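The S3 key gotcha described in this answer can be sketched in a few lines of TypeScript. S3 event notifications URL-encode object keys and represent spaces as `+`, so the key has to be decoded before use. This is a standalone illustration, not Middy's own normalizer; `S3EventLike`, `decodeS3Key`, `normalizeS3Event` and the sample bucket and key are assumptions made up for the example.

```typescript
// Minimal shape of the part of an S3 notification used in this sketch.
interface S3EventLike {
  Records: { s3: { bucket: { name: string }; object: { key: string } } }[];
}

// Turn "+" back into spaces first, then decode the percent-encoded characters.
function decodeS3Key(rawKey: string): string {
  return decodeURIComponent(rawKey.replace(/\+/g, " "));
}

// Hypothetical normalizer: returns a copy of the event with decoded keys.
function normalizeS3Event(event: S3EventLike): S3EventLike {
  return {
    Records: event.Records.map((record) => ({
      s3: {
        bucket: record.s3.bucket,
        object: { key: decodeS3Key(record.s3.object.key) },
      },
    })),
  };
}

// Sample event: the uploaded object was named "photos/red flower & vase.jpg".
const rawEvent: S3EventLike = {
  Records: [
    { s3: { bucket: { name: "example-bucket" }, object: { key: "photos/red+flower+%26+vase.jpg" } } },
  ],
};

const cleanKeys = normalizeS3Event(rawEvent).Records.map((record) => record.s3.object.key);
```

Running this kind of normalization in a middleware means the handler can use the key directly, for example in a follow-up call to read the object.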
Eoin: So let's say I've got a set of Lambda functions and I've been doing serverless for a few months or maybe even years, but I realize that every time I'm doing JSON.parse on the body and I have to construct a response that has the status code and the encoded result, and I'm thinking, OK, this is causing bugs. There's duplication everywhere. I want to clean this all up. How do you get started then with Middy? What's the process?
Luciano: Yes, so I will say that again, I'm going to point to the documentation.
Luciano: There is a getting started section, which gives you examples and so on. But I think the main thing you should do is just run npm install @middy/core, and that gives you just the middleware engine. And at that point you need to decide, OK, what am I doing? Am I building an API? Do I need to parse JSON? If I need to do that, I can install @middy/http-json-body-parser. And similarly, you can install a bunch of middlewares that you think you're going to need, like validation, error handling and so on. And then the way that Middy changes your way of writing Lambdas is actually very subtle. It doesn't force you to change your coding style too much because you are still writing your handler in the same way.
You are still writing the same function signature. The only difference is that for every handler that you write, you need to do something we call "middify the handler", which basically means take that handler and wrap it with this middleware layer, the middleware runtime engine. So this is literally a function that you import from core that is called middy. So you just need to call middy, pass the handler inside, and you basically get a new instance of the same function handler, which has, let's say, additional superpowers. And these superpowers are that you can use the .use syntax to specify which middlewares you want to attach.
So the idea is that you write your handler and you don't worry too much about all these extra concerns in your handler. So you assume that the data coming into your handler is already clean and ready to be used and that you don't need to do anything extra to send back a response, just provide an object. You middify this handler and then you attach all the middlewares that you need to actually do all the pre-processing and post-processing of the request and response.
There are slight variations in the syntax that you can use today because we try to listen to feedback and figure out ways that could be simpler in different use cases. So if you look at the documentation, you can find that you can use other things, not just .use: you can use .before, .after and .onError, because we have different use cases and if you're writing something very, very simple, you don't necessarily need to write or use fully-fledged middlewares, you can find shortcuts. So I'm going to let people check the documentation for more details about that, but in broad strokes: write your handler, keep it simple, middify it and then .use all the middlewares that you want to use.
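The "write your handler, middify it, then .use middlewares" flow can be sketched with a tiny self-contained middleware engine in TypeScript. This is a simplified stand-in written for illustration, not Middy's implementation: `miniMiddy`, `jsonBodyParser` and `jsonResponse` are hypothetical names, and the real library (imported from `@middy/core`, with middlewares such as `@middy/http-json-body-parser`) handles many more cases.

```typescript
interface Request {
  event: Record<string, unknown>;
  context: unknown;
  response?: unknown;
  error?: unknown;
}
type Hook = (request: Request) => void | Promise<void>;
interface Middleware { before?: Hook; after?: Hook; onError?: Hook }
type Handler = (event: Record<string, unknown>, context: unknown) => Promise<unknown>;

interface MiniMiddy {
  (event: Record<string, unknown>, context: unknown): Promise<unknown>;
  use(middleware: Middleware): MiniMiddy;
}

// Hypothetical engine: wraps a handler and runs the attached middlewares around it.
function miniMiddy(handler: Handler): MiniMiddy {
  const middlewares: Middleware[] = [];
  const run = async (event: Record<string, unknown>, context: unknown): Promise<unknown> => {
    const request: Request = { event, context };
    try {
      // "before" hooks run in the order the middlewares were attached.
      for (const m of middlewares) await m.before?.(request);
      request.response = await handler(request.event, request.context);
      // "after" hooks run in reverse order.
      for (const m of [...middlewares].reverse()) await m.after?.(request);
      return request.response;
    } catch (error) {
      request.error = error;
      request.response = undefined;
      // "onError" hooks may set a response; otherwise the error is rethrown.
      for (const m of [...middlewares].reverse()) await m.onError?.(request);
      if (request.response !== undefined) return request.response;
      throw error;
    }
  };
  const wrapped: MiniMiddy = Object.assign(run, {
    use: (middleware: Middleware): MiniMiddy => {
      middlewares.push(middleware);
      return wrapped;
    },
  });
  return wrapped;
}

// Parses a JSON string body before the handler runs.
const jsonBodyParser = (): Middleware => ({
  before: (request) => {
    if (typeof request.event.body === "string") {
      request.event.body = JSON.parse(request.event.body);
    }
  },
});

// Wraps whatever the handler returned into an HTTP-style response.
const jsonResponse = (): Middleware => ({
  after: (request) => {
    request.response = { statusCode: 200, body: JSON.stringify(request.response) };
  },
});

// The handler contains only business logic: object in, object out.
const handler = miniMiddy(async (event) => {
  const body = event.body as { name: string };
  return { greeting: `Hello ${body.name}` };
})
  .use(jsonBodyParser())
  .use(jsonResponse());
```

The handler never calls `JSON.parse` or builds a status code itself, which is the boilerplate the middlewares take over.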
Eoin: And then you can start deleting all that boilerplate code you had before, which is nice. One of my favorite things in software development, deleting code you don't need anymore. So given that you got started, are there any interesting examples of people out there using Middy in production, or open source projects that are building on top of Middy?
Luciano: Yeah, that's a good question.
Luciano: So we are trying to collect more use cases and hopefully we'll be able to showcase them on the website. We haven't done that in a formal way yet, but we have been very happy about public mentions that we got from some pretty big names. There was a conference, I think it was one of the ServerlessDays a few years ago, where Lego mentioned that they were using Middy internally for some of their own APIs built on top of Lambda. Then recently, I think it was at the last re:Invent, if I'm not wrong, Taco Bell also mentioned Middy in their own presentation as one of the things that they use for serverless. And I think the best one is the fact that the upcoming Lambda Powertools for TypeScript also supports Middy. So of course, it's not the only way you can use Powertools, but if you are already using Middy, they make it easier for you to add all the extra functionality that they are providing with Powertools. So I think that's an interesting validation also from AWS that they think Middy is actually solving a problem for the Lambda ecosystem in Node.js. And I recently noticed that there is a repository called AWS Solutions, open source from AWS, where there are also a bunch of examples that use Middy. This is AWS providing examples on how to use Lambda and they suggest using Middy. So that's another very good validation that the project makes sense and it's actually solving a real problem for people.
Eoin: That's great.
Eoin: I knew Middy was useful, but I didn't know that it was powering tacos. So that's...
Luciano: Yeah.
Luciano: That was actually the comment I got from Will when we shared the news that Middy had been mentioned. It was like, oh, it's amazing to see that this open source project is helping people to have more tacos.
Eoin: Making the world a better place. Okay.
Eoin: So let's say you're up and running and using some of these really good official middlewares or some of the third-party middlewares out there. What about writing your own middleware? Is that something that people would commonly need to do? And how would you set about that task?
Luciano: Yeah, that's a very good question.
Luciano: So again, there is a section in the documentation with examples and so on, and we'll be linking that in the show notes. But I will say that for simple use cases, you generally don't need to do that because the default middlewares are probably going to cover most of your needs. But there are cases where, I don't know, maybe you're doing something very custom. For instance, you have your own authentication mechanism, right? So you'll need to validate credentials in a way that is not a canonical way of doing it. Maybe it's not using JWT, maybe it's not using Cognito. So you have your own mechanism and you need to use your own libraries to do that. And of course, this is one of those concerns that you don't want to copy and paste into every single Lambda. You don't even want to have to remember to call a function inside your Lambda handler every time and manage the error. You probably want to just say "use validation" or "use authentication" somewhere, and then keep your Lambda code as clean as possible. So this is one use case where you could decide, okay, I'm just going to use Middy, write my own authentication middleware, and then attach that particular middleware to every handler where I want to use the authentication feature.
And there are different ways you can write a middleware. The simplest one is to literally just write one function, and this function needs to have a very specific signature that, by the way, is close to the signature of a Lambda function: you just receive an event and a context. Or rather, you can call it a request, which is an object that we use that contains both the event and the context and gives you extra functionality. And basically, the only thing you need to do is, rather than saying .use on the middified handler, you say .before and you pass this function to it.
And then that function can either return or throw an exception to handle the different use cases: you want to stop the execution early with a success, or you want to fail because maybe the authentication is not valid. Or, if you don't do anything in that function, Middy assumes that everything was fine, the authentication was okay, and when that function completes, it's going to run your own handler for you. And that's one use case. Sometimes you want to have actions that happen before your actual handler runs, after your handler runs, and also in case of error, where you want specific logic to be executed because maybe you need to clean up something. In those cases, it's worth writing a fully-fledged middleware where the syntax is very similar, but you have an object that contains a before function, an after function, and an onError function, and you define the behaviors that you want to happen in those three different phases.
Eoin: That sounds really good.
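As a rough TypeScript sketch of the custom authentication middleware described above: an object with `before`, `after` and `onError` functions, where `before` can return early, throw, or do nothing and let the handler run. Everything here is an assumption made for illustration, including the tiny `wrap` runner standing in for the real engine, the `customAuth` name, the `x-api-token` header and the hard-coded token set.

```typescript
interface Request {
  event: { httpMethod?: string; headers?: Record<string, string> };
  context: Record<string, unknown>;
  response?: unknown;
  error?: unknown;
}
type Hook = (request: Request) => unknown | Promise<unknown>;
interface Middleware { before?: Hook; after?: Hook; onError?: Hook }
type Handler = (event: Request["event"], context: Request["context"]) => Promise<unknown>;

// Hypothetical credential store; a real check would call your own auth library.
const VALID_TOKENS = new Set(["secret-token"]);

const customAuth = (): Middleware => ({
  before: (request) => {
    // Stop early with a success: preflight requests never reach the handler.
    if (request.event.httpMethod === "OPTIONS") return { statusCode: 204 };
    const token = request.event.headers?.["x-api-token"];
    // Fail: throwing hands control to the onError phase.
    if (token === undefined || !VALID_TOKENS.has(token)) throw new Error("Invalid credentials");
    // Success: decorate the context and let the handler run.
    request.context.user = "demo-user";
    return undefined;
  },
  after: (request) => {
    request.context.authChecked = true;
  },
  onError: (request) => {
    request.response = { statusCode: 401, body: "Unauthorized" };
  },
});

// Simplified stand-in for the middleware engine (not Middy's code).
function wrap(handler: Handler, middlewares: Middleware[]): Handler {
  return async (event, context) => {
    const request: Request = { event, context };
    try {
      for (const m of middlewares) {
        const early = await m.before?.(request);
        if (early !== undefined) return early; // a returned value short-circuits
      }
      request.response = await handler(request.event, request.context);
      for (const m of [...middlewares].reverse()) await m.after?.(request);
      return request.response;
    } catch (error) {
      request.error = error;
      request.response = undefined;
      for (const m of [...middlewares].reverse()) await m.onError?.(request);
      if (request.response !== undefined) return request.response;
      throw error;
    }
  };
}

// The handler stays free of any authentication code.
const securedHandler = wrap(
  async (_event, context) => ({ statusCode: 200, body: `Hello ${String(context.user)}` }),
  [customAuth()],
);
```

The same shape would suit the tenant-scoped role idea: a `before` function that attaches scoped credentials to the context so every handler picks them up consistently.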
Eoin: And just as you were talking about authentication, I was thinking about the case you commonly have with a multi-tenanted application, where your Lambda function needs to assume a specific role that is scoped down for a tenant or for a specific user, for example, one that restricts them to specific key prefixes on S3. We talked about that in a previous episode. And that's the kind of thing that you would be doing for every function, and you need to make sure that you're doing it for every function and that people are using that scoped-down policy. So that seems like an ideal application for a custom middleware that could be invoked before your handler and ensure that your context is decorated with a session that you can use to make calls out to AWS for DynamoDB and S3. You mentioned the documentation, and I've seen the documentation, it's really good and improving all the time. Is there anything specific we need to point to, or are there any tutorials out there that people have contributed that will help people to get started?
Luciano: Yeah, that's a good question. I think we have some links in the documentation.
Luciano: There was one recently by Serverless Stack, I think, where they show how to use Middy with Serverless Stack. And I think a good pointer, and this is something that we want to expand more in our own official documentation, is that Middy integrates very well with basically all the tools, because it's not an opinionated take on how you deploy your code. It's more about helping you to write the code with a different style that promotes focusing on the business logic and keeping every other concern outside the business logic. Because of that, you can use Middy with Terraform, with Serverless Framework, with Serverless Stack, with, I don't know, CDK, CloudFormation, everything you are currently using. It just changes the way you write your code in the way that any other library would affect your code. But it doesn't affect anything else outside the code.
So we want to have a section called integrations in the documentation, and we have already started that. But it's still pretty much a to-do. There are different pages, but if you open most of them, it's like, please help us to fill in this guide. But we really want to highlight the fact that Middy plays well with most of the other tools, so it's not really going to force you down a particular path. So maybe that's something for the audience. If people are already using Middy and they want to contribute, it would be nice to get some help in writing some of this documentation. I want to just give a final shout out to Will, who has been maintaining Middy in an excellent way for, I think, more than two years at this point. I feel like I took my distance from the project more and more in the last years, and this project wouldn't be at this level today if Will wasn't there putting a lot of effort every day into maintaining it. So I just want to say again, thank you to Will for effectively making the project available to everyone today.
Eoin: Good shout. Yeah, that's great. Okay, thanks everyone for listening. We'll see you next time.