AWS Bites Podcast

42. How do you containerise and run your API with Fargate?

Published 2022-06-24 - Listen on your favourite podcast player

We recently talked about migrating a monolithic application to AWS, using EC2, load balancers, S3 and RDS. In this episode we want to talk about a slightly different setup, where we are going for containers instead of EC2 and we want to deploy them on Fargate. We are going to cover all the components you will need in your architecture, the reasons to choose Fargate over the alternatives, and discuss some CDK tricks to get started quickly (and the pitfalls that might come with them).

In this episode, we mentioned the following resources:

Let's talk!

Do you agree with our opinions? Do you have interesting AWS questions you'd like us to chat about? Leave a comment on YouTube or connect with us on Twitter: @eoins, @loige.

Help us to make this transcription better! If you find an error, please submit a PR with your corrections.

Eoin: A couple of episodes back, we talked about the process of migrating a monolithic application to AWS using EC2, load balancers, S3 and RDS. Today we want to talk about a slightly different approach where we're going to use containers instead of EC2 and we want to deploy them into Fargate. So we're going to cover all the components you need in that architecture, why you'd choose Fargate over some of the alternatives, and some CDK tricks to help you get started faster.

My name is Eoin, I'm here with Luciano and this is the AWS Bites podcast. In episode 37, we talked about migrating a monolithic application to AWS. And in that case, we talked about how you'd choose EC2 because adopting containers was a step too far for the team. The team was already having to learn a lot of new skills approaching AWS for the first time. But what about if you do have an appetite to move to containers and you've already got some of those skills? So we're going to talk about that example where we take something like an API backend written in Python that can run in a container. What are the simplest ways of getting it to run in a scalable and reliable way using containers when you're moving into the cloud? So there are a lot of ways to run containers in AWS. Why would we go for Fargate, Luciano?

Luciano: Yeah, I think another one would be App Runner, which is probably the simplest that I've seen so far, or at least that's the way it's presented. But it's still very new and it probably deserves its own dedicated episode when we have some more time to actually play with it and see what it feels like. So Fargate so far seems like the default choice to me because, well, I had some experience with it and it's basically built on top of ECS.

So all the concepts are the same if you're familiar with ECS, which stands for Elastic Container Service. And just to summarize the main reasons: it's basically very simple to set up. It doesn't require you to manage instances, as in EC2 instances. It's kind of serverless that way. You just say, run this container for me, and it will figure out some hidden instance where to run it for you. It supports autoscaling and also integrates very well with Elastic Load Balancers, and also with CodeDeploy. So you get scalability through Elastic Load Balancing and through multiple containers running in a cluster, but you can also fine-tune your pipeline with CodeDeploy to actually build and deploy your containers. You mentioned, though, that it's interesting to go through all the different components that an architecture like this will actually require under the hood. Should we describe what those components are? Yeah.

Eoin: I mean, I went through a lot of detail in episode 37 and a lot of it will be the same here. It's just the compute layer that we're switching out: instead of EC2, we're going for ECS with Fargate. So the VPC will be similar. You've got a public and a private subnet. You've got the NAT gateway and your internet gateway, so that you've got outbound internet access. You've got your routing tables and the VPC security groups.

So that's your networking foundations really. And then you'll have an application load balancer on top. The difference between our EC2 approach and the Fargate approach is that the targets in your target groups within the application load balancer will be different. And yeah, we'll assume, again, that we're using HTTPS, so we'll have a Route 53 hosted zone for the DNS and we'll have a certificate using Certificate Manager.
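For reference, here is a minimal CDK v2 (TypeScript) sketch of those networking and DNS foundations; the domain names are placeholders and the sizing is just an example, not a definitive setup.

```typescript
import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as route53 from 'aws-cdk-lib/aws-route53';
import * as acm from 'aws-cdk-lib/aws-certificatemanager';

export class FoundationsStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // VPC with public and private subnets, routing tables and a NAT gateway
    // for outbound internet access from the private subnets.
    const vpc = new ec2.Vpc(this, 'Vpc', { maxAzs: 2, natGateways: 1 });

    // Existing Route 53 hosted zone for the API's domain (placeholder name).
    // Note: fromLookup requires the stack to have an explicit account/region env.
    const zone = route53.HostedZone.fromLookup(this, 'Zone', {
      domainName: 'example.com',
    });

    // TLS certificate for HTTPS, validated via DNS records in that zone.
    const certificate = new acm.Certificate(this, 'Certificate', {
      domainName: 'api.example.com',
      validation: acm.CertificateValidation.fromDns(zone),
    });
  }
}
```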

So those are the parts that stay the same. And then the different parts are around ECS and Fargate. When you're working with Fargate, there are a few different resource types you need to create. So you've got your task definition, which is where you define the container image and all of the container configuration, like environment variables, what ports you are exposing, how much memory and CPU your container needs, and any volume mounts as well.

So that's your task definition. You'll also have the ECS cluster itself, which is kind of like a boundary that all of your Fargate services will run in. And that's where you'd basically just specify the VPC. You don't have to configure any EC2 instances because it's Fargate; all of that is taken care of for you. So then the last and probably the most important thing you need to create is the Fargate service itself.
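As a rough illustration, continuing the hypothetical stack above, the cluster and a task definition might look something like this in CDK (the image name, port and sizes are only examples):

```typescript
import * as ecs from 'aws-cdk-lib/aws-ecs';
// ...inside the same Stack, reusing the `vpc` from the earlier sketch.

// The ECS cluster is just a logical boundary tied to the VPC: no EC2
// instances to manage, because the tasks will run on Fargate.
const cluster = new ecs.Cluster(this, 'Cluster', { vpc });

// The task definition describes the container: image, CPU and memory,
// environment variables, exposed ports and any volumes.
const taskDefinition = new ecs.FargateTaskDefinition(this, 'ApiTask', {
  cpu: 256,
  memoryLimitMiB: 512,
});

taskDefinition.addContainer('api', {
  // Placeholder image name; this could also be built from a local Dockerfile.
  image: ecs.ContainerImage.fromRegistry('my-api:latest'),
  environment: { STAGE: 'prod' },
  portMappings: [{ containerPort: 8080 }],
  logging: ecs.LogDrivers.awsLogs({ streamPrefix: 'api' }),
});
```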

And the service is where you specify, okay, how many of those tasks that I've outlined in my task definition do I want to run, and how do I scale them? And the Fargate service is the bit that integrates well with the other pieces. So it integrates well with our application load balancer: when you start a task, it will register the IP address of that container in the target group so that traffic can start to be directed to that container.

It will also maintain a desired level of healthy containers. So you can specify in your service what the minimum healthy percentage is, how many containers you desire to run, your desired count, and your maximum count as well. And then you can specify your auto scaling configuration. And this is what makes it, I suppose, very advantageous to move to this container-based approach, because based on whatever criteria you specify, you can choose to scale those containers up and down.

So that could be based on a schedule, if you know that all your traffic happens on, I don't know, Monday to Friday at 9 AM. Or you can base it on the API request count. But like auto scaling groups with EC2, you can also base it on any metric: the CPU of your containers, memory utilization of your containers, or actually any other metric. It could even be a custom metric that you're generating within the containers themselves. So there are lots of different triggers you could use to say, this is when to scale up, this is when to scale down, and there's a lot of advanced configuration there. So there's quite a lot there, right?
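Still in the same hypothetical stack, the service itself plus a couple of example scaling policies could be sketched like this; the counts, thresholds and the schedule are made up for illustration:

```typescript
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as elbv2 from 'aws-cdk-lib/aws-elasticloadbalancingv2';
import * as appscaling from 'aws-cdk-lib/aws-applicationautoscaling';
// ...still inside the same Stack, reusing `vpc`, `cluster` and `taskDefinition`.

// The Fargate service keeps the desired number of tasks running and
// replaces unhealthy ones.
const service = new ecs.FargateService(this, 'ApiService', {
  cluster,
  taskDefinition,
  desiredCount: 2,
  minHealthyPercent: 50,
  maxHealthyPercent: 200,
});

// Application load balancer in the public subnets; the service registers
// each task's IP address in the target group it is attached to.
const alb = new elbv2.ApplicationLoadBalancer(this, 'Alb', {
  vpc,
  internetFacing: true,
});
const listener = alb.addListener('Http', { port: 80 });
listener.addTargets('Api', { port: 8080, targets: [service] });

// Auto scaling between 2 and 10 tasks, driven by CPU and by a schedule.
const scaling = service.autoScaleTaskCount({ minCapacity: 2, maxCapacity: 10 });
scaling.scaleOnCpuUtilization('CpuScaling', { targetUtilizationPercent: 60 });
scaling.scaleOnSchedule('MorningScaleUp', {
  schedule: appscaling.Schedule.cron({ weekDay: 'MON-FRI', hour: '9', minute: '0' }),
  minCapacity: 4,
});
```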

Luciano: Yeah, one interesting detail that is maybe worth mentioning for people that are more skilled with, I don't know, something like Kubernetes, is that a task is probably a little bit closer to the concept of a pod in the Kubernetes world. Because it's not necessarily one-to-one with a container image or a container definition, whatever you want to call it, but it could even contain multiple containers, like the idea that you could run a main application and a sidecar container. So that is just an interesting thing that I wasn't really aware of at the beginning. The task is a higher-level concept than just one container; it could be multiple containers that need to run together.
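A small sketch of that idea (image names are placeholders, not a real setup): one Fargate task definition declaring a main container plus a sidecar, which are then scheduled, scaled and stopped together, much like containers in a pod.

```typescript
import * as ecs from 'aws-cdk-lib/aws-ecs';
// Inside a CDK Stack.

// One task definition, two containers: they share the task's CPU/memory
// and lifecycle.
const taskDefinition = new ecs.FargateTaskDefinition(this, 'ApiWithSidecar', {
  cpu: 512,
  memoryLimitMiB: 1024,
});

// Main application container (placeholder image name).
taskDefinition.addContainer('api', {
  image: ecs.ContainerImage.fromRegistry('my-api:latest'),
  portMappings: [{ containerPort: 8080 }],
  essential: true,
});

// Sidecar container, e.g. a metrics or log-forwarding agent
// (placeholder image name). Not essential: the task keeps running if it stops.
taskDefinition.addContainer('sidecar', {
  image: ecs.ContainerImage.fromRegistry('my-sidecar:latest'),
  essential: false,
});
```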

Eoin: That's true. And the CloudWatch agent is one example: you'd normally run it on EC2, but sometimes you'll want to run it alongside your containers if you want to get the EMF metrics out and stuff. That's one application for that, where you would run an application with the CloudWatch agent as a sidecar. So for people who are used to ECS, this is probably okay. But maybe you're thinking, okay, I was expecting to hear about a simple way to get started with containers on AWS, and if you've never heard of any of this before, it probably doesn't sound very simple. So what do you think? Would you recommend any kind of templates or tutorials out there? What's the best way to get started easily?

Luciano: Yeah, there is one particular way that I used. This was about one year ago, I think, so my view of this thing might be a little bit out of date right now, but I'm going to try to describe the experience I had about one year ago anyway, and you can challenge me if you've had a more recent experience. So there is something as part of CDK, a set of patterns that are maintained as higher-level constructs by AWS itself.

So you just install them from AWS, and what these ECS patterns for CDK do is they basically allow you to define where the source code of your application is, plus a few configuration toggles that you can play around with to say, okay, do you need volumes? Do you want to associate a domain name? Is it going to use a load balancer? And you literally end up writing 10 lines of code, and it's configuration-style code: you copy, paste and change a few things.

And later you just do cdk deploy, it takes maybe around 10 minutes, and it will provision almost everything for you. The magic thing is that you basically just connect to the domain that you specified and your application is working. If you're used to AWS, where you usually have to really deep dive to do anything, getting there with 10 lines of code feels like a little bit of magic, a different experience from what you're used to when dealing with AWS.
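For context, the kind of "10 lines" Luciano is describing looks roughly like this with the aws-ecs-patterns module in CDK v2 (TypeScript); the Dockerfile path, domain and sizes are placeholders for illustration:

```typescript
import * as cdk from 'aws-cdk-lib';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as ecsPatterns from 'aws-cdk-lib/aws-ecs-patterns';
import * as elbv2 from 'aws-cdk-lib/aws-elasticloadbalancingv2';
import * as route53 from 'aws-cdk-lib/aws-route53';

export class ApiStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // One construct that creates the cluster, task definition, Fargate
    // service, application load balancer, DNS record and certificate wiring.
    new ecsPatterns.ApplicationLoadBalancedFargateService(this, 'Api', {
      cpu: 256,
      memoryLimitMiB: 512,
      desiredCount: 2,
      taskImageOptions: {
        // Build the image from a local Dockerfile; CDK pushes it to the
        // bootstrap ECR repository at deploy time.
        image: ecs.ContainerImage.fromAsset('./api'),
        containerPort: 8080,
      },
      publicLoadBalancer: true,
      domainName: 'api.example.com',
      domainZone: route53.HostedZone.fromLookup(this, 'Zone', {
        domainName: 'example.com',
      }),
      protocol: elbv2.ApplicationProtocol.HTTPS,
      redirectHTTP: true,
    });
  }
}
```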

And this can of course have pros and cons, because on one side you get started very, very quickly, but it might give you a false sense of confidence that you know what you're doing, while in reality there is a lot of stuff that is being hidden from you. And I think if you are building a serious application, eventually you might want to know what's happening behind the scenes. And when I actually did that about one year ago, I was surprised. It was kind of a microservice project with, I think, something like five applications running on different domains, but those applications were related to each other.

Each user would log in once and jump between different domains, depending on which part of the application they were trying to use. So it was like, okay, we deploy them together as part of the same cluster, and they would be using five different subdomains on the same main domain. And I realized at some point that this thing was provisioning five different load balancers rather than just creating rules on the same load balancer and dividing the traffic that way.

Maybe it could be fine-tuned if you're willing to spend more time staring at the configuration; maybe there are ways to actually reuse the same load balancer. But it was something that I only figured out further down the road, when I started to look back at all the resources and thought, okay, this is going to be expensive, and for no reason, because you are provisioning five load balancers when you might use just one and route the traffic in a more efficient way. So that's the caveat, and it's kind of a golden rule with CDK anyway: when you use higher-level constructs, they can do so much stuff that you're not aware of. So it's always good to have a look at some point and make sure you really understand what's going on. And probably there are opportunities to fine-tune a few things for your actual use case.
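If you do want a single load balancer in a setup like that, one possible approach (not necessarily what the pattern supports out of the box) is to drop to slightly lower-level constructs and add one host-header rule per service. A hypothetical sketch, assuming `vpc` and one FargateService per application (`serviceA`, `serviceB`) already exist in the stack:

```typescript
import * as elbv2 from 'aws-cdk-lib/aws-elasticloadbalancingv2';
// Inside a CDK Stack; `vpc`, `serviceA` and `serviceB` are assumed to exist.

const alb = new elbv2.ApplicationLoadBalancer(this, 'SharedAlb', {
  vpc,
  internetFacing: true,
});

const listener = alb.addListener('Http', {
  port: 80,
  // Fallback when no host-header rule matches.
  defaultAction: elbv2.ListenerAction.fixedResponse(404),
});

// One target group and routing rule per subdomain, all on the same ALB.
listener.addTargets('AppA', {
  port: 8080,
  targets: [serviceA],
  priority: 10,
  conditions: [elbv2.ListenerCondition.hostHeaders(['a.example.com'])],
});
listener.addTargets('AppB', {
  port: 8080,
  targets: [serviceB],
  priority: 20,
  conditions: [elbv2.ListenerCondition.hostHeaders(['b.example.com'])],
});
```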

Eoin: I've had a similar kind of shock with the CDK patterns, because the first one I ever used was not the application load balancer one, but another one that is designed more for background processing. It's called the queue processing Fargate service. You can specify an SQS queue and it will create all the infrastructure you need to scale the number of workers up and down based on the number of messages in the queue.

And it was really easy to get started with. That sometimes gives you a very misleading sense of security, like you say, because it was only later that I realized it had created the NAT gateway as well, because it created the whole VPC. You don't necessarily want to create a VPC for every single thing you deploy, right? You probably want to think about your VPC design a little bit more carefully. And you can specify your own VPC in these services, but it's always worthwhile to do a cdk synth before you deploy and actually look at all the resources that are being deployed.
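A minimal sketch of that queue processing pattern, this time passing in an existing VPC (via a cluster) instead of letting the construct create one; the names, paths and scaling steps are illustrative assumptions:

```typescript
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as ecsPatterns from 'aws-cdk-lib/aws-ecs-patterns';
import * as sqs from 'aws-cdk-lib/aws-sqs';
// Inside a CDK Stack.

// Reuse an existing VPC instead of letting the pattern create a new one
// (with its own NAT gateway) behind your back.
const vpc = ec2.Vpc.fromLookup(this, 'SharedVpc', { vpcName: 'shared-vpc' });
const cluster = new ecs.Cluster(this, 'WorkerCluster', { vpc });
const queue = new sqs.Queue(this, 'JobsQueue');

new ecsPatterns.QueueProcessingFargateService(this, 'Worker', {
  cluster,
  queue,
  memoryLimitMiB: 1024,
  image: ecs.ContainerImage.fromAsset('./worker'),
  // Scale the number of worker tasks based on the queue depth.
  scalingSteps: [
    { upper: 0, change: -1 },
    { lower: 100, change: 1 },
    { lower: 500, change: 3 },
  ],
});
```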

In our case, for example, we ended up with S3 traffic going through that NAT gateway where we would have preferred a VPC endpoint. It wasn't a massive bit of bill shock, but there was some unexpected cost there for sure. Still, I think these patterns are a really good thing, right? Because I've been using them recently as well with this application load balancer service, and I'm still impressed that you can create those 10 lines of code and wire it through to Route 53.

You don't have to configure the certificate resource, the load balancer resource, all the VPC resources; it's all done for you. And you can just as well point it at your Dockerfile. You don't even have to push a container anywhere and CDK will manage all of that for you. So it's really good for getting started, but then also think about, okay, now that I've got started, how do I want to keep going?

Do I want to continue using this pattern, or is this just a learning experience where I can see all the things it generates and then pick and choose the pieces, understand them, and kind of replicate that with lower-level constructs in CDK, or with CloudFormation or Terraform? So I think CDK is sometimes a neat trick for getting started and figuring out how everything should fit together, but you don't necessarily have to keep going with it.
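For the specific S3-through-NAT cost mentioned above, a gateway VPC endpoint is essentially a one-liner in CDK, assuming you own the VPC construct yourself rather than letting a pattern create it:

```typescript
import * as ec2 from 'aws-cdk-lib/aws-ec2';
// Inside a CDK Stack, on a VPC you define yourself.

const vpc = new ec2.Vpc(this, 'Vpc', { maxAzs: 2, natGateways: 1 });

// Route S3 traffic through a gateway endpoint instead of the NAT gateway,
// which avoids NAT data processing charges for S3 access.
vpc.addGatewayEndpoint('S3Endpoint', {
  service: ec2.GatewayVpcEndpointAwsService.S3,
});
```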

Maybe it's worthwhile talking about the deployment then, because with containers you have to think about: okay, you've got the repository, you need to deploy that. Then you've got the cluster itself and the service, and then you've got your load balancer in front. What happens when I've got a new version of the image and I want to deploy it? What happens to my current connections? What happens to existing users? What happens if I deploy a container that's got a bug in it? How do we manage this? How will we break this down? How does that first CDK pattern manage it, actually? Do you want to talk about that?

Luciano: Yeah, I can try to explain what I remember from my previous experience. I hope it's still up to date and relevant, but I think in broad strokes it should still be the way it works today. And it's very convenient. As you said, in your CDK code you literally say, this is where my Dockerfile is. You literally give it a relative path, so you probably have your Dockerfile in the same folder where you have your CDK code, and magically it's going to do a lot of stuff for you.

And that magic means that when you do cdk deploy, the same deploy process is going to spin up a Docker build in the background and wait for the container to build. And the CDK bootstrap phase also creates an ECR repository for you. So basically at that point, it has built the container and there is already a repository for it.

So it's going to push a new version of that image. And at that point you have everything ready to start a new deployment, because your image is up and you can just say, okay, now I want to switch over all my containers to the new version. And that's interesting as well, because we know that there might be a lot of complexity there, but you don't get to see it with CDK. And if I understand correctly what's going on, it does kind of a blue-green deployment, where it's basically provisioning a new version of the container as a new service.

I don't know if that's the right terminology, but basically you get a copy of the previous version that is still running and the new version that starts to spin up. It doesn't kill the old version until the new version is up and running, all the health checks pass, and it has been registered as a new target in the load balancer. Then it starts to drain the connections from the previous version, moves the connections to the new version and eventually starts to destroy all the old containers.

And of course there are health checks. So if your new version of the container cannot really run and receive connections correctly, it will kind of roll back. It will just kill the new containers and say, okay, this deployment failed, it wasn't able to pass the health checks. The only issue I had with this process at the time, and again, I was trying to run five different applications, is that the steps were very sequential.

So I had to go through five builds of the same thing, sequentially, not in parallel. So okay, building container one takes a few minutes, then provisioning container one takes up to 15 or 30 minutes, depending on the case, because of all the time spent draining connections, uploading everything, health checks and so on. And then it goes on to the second container, third container, fourth container. So it might take a long time to do that. And there is actually an article we're going to link in the show notes that gives you ideas on how you can tweak the configuration of the health checks to speed up the process. So if you have containers that can come up very fast and prove that they are healthy very, very quickly, you can fine-tune some of these configurations, and then you have much shorter times to tell, okay, this container is already good to go, start to swap them out.
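That kind of tuning can be expressed on the pattern's target group. A sketch with aggressive example values, assuming the container exposes a fast health endpoint (the path and timings are assumptions, not recommendations):

```typescript
import { Duration } from 'aws-cdk-lib';
// `loadBalancedService` is an ApplicationLoadBalancedFargateService
// created as in the earlier sketch.

// Tighter health checks so new tasks are marked healthy sooner.
loadBalancedService.targetGroup.configureHealthCheck({
  path: '/health',
  interval: Duration.seconds(5),
  timeout: Duration.seconds(4),
  healthyThresholdCount: 2,
  unhealthyThresholdCount: 2,
});

// Shorter connection draining so old tasks are retired faster
// (the default deregistration delay is 300 seconds).
loadBalancedService.targetGroup.setAttribute(
  'deregistration_delay.timeout_seconds',
  '30'
);
```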

Eoin: Yeah, I think that's a good way to, I guess, get started with the deployments; you can use that workflow. If you have to wait that long, though, it's going to be a little bit awkward. And I'd also recommend thinking about your long-term deployment strategy, because we mentioned at the start that CodeDeploy integrates well with Fargate, and that's another way to manage shifting traffic over to a new version of the container image.

So you could think of your CDK stack as managing the infrastructure for this service; then the container image updates could be done with CodeDeploy. And what CodeDeploy gives you is a really nice set of features, I think, because it allows you to trigger a CodeDeploy deployment, and it requires two target groups with your load balancer. So you can do a blue-green deployment strategy, in which case it will start new versions of the task, but put them in the second target group.

And then it can use the traffic-shifting features of the load balancer, so the weighting of the traffic between the two target groups can be adjusted over time, and it can start shifting some of the requests over to the new version of the image. And then there are all sorts of advanced health checks you can do. So you can just use your load balancer health check, but you can also put hooks into the deployment so that it will check that all of the expected business logic is working as well as you want it to, or you can check for alarms, and then you can gradually shift traffic over to the new version.
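A rough, assumption-heavy outline of what that can look like in CDK (TypeScript), rather than a drop-in setup: the service needs the CODE_DEPLOY deployment controller, and the two target groups and the production listener are assumed to be created elsewhere in the stack.

```typescript
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as codedeploy from 'aws-cdk-lib/aws-codedeploy';
// Assumes `cluster`, `taskDefinition`, `blueTargetGroup`, `greenTargetGroup`
// and a production `listener` already exist, and that the service is
// attached to the blue target group via that listener.

// The service must use the CodeDeploy deployment controller instead of the
// default rolling updates managed by ECS.
const service = new ecs.FargateService(this, 'ApiService', {
  cluster,
  taskDefinition,
  desiredCount: 2,
  deploymentController: { type: ecs.DeploymentControllerType.CODE_DEPLOY },
});

// CodeDeploy shifts traffic between the blue and green target groups,
// e.g. 10% at a time every 5 minutes, and rolls back on failure.
new codedeploy.EcsDeploymentGroup(this, 'BlueGreen', {
  service,
  blueGreenDeploymentConfig: {
    blueTargetGroup,
    greenTargetGroup,
    listener,
  },
  deploymentConfig: codedeploy.EcsDeploymentConfig.CANARY_10PERCENT_5MINUTES,
  autoRollback: { failedDeployment: true },
});
```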

And if anything is detected, any kind of failed health check or alarm, it will revert back to the old version. So you can get much safer deployment strategies, and it decouples, I guess, the creation of the cluster and the service and all that stuff from the deployment of the software on it. So I think it's definitely worthwhile. And you can integrate it into CodePipeline, or you can integrate it into whatever else you're using to deploy, be it GitLab or Bitbucket or GitHub Actions, say. Given that we've talked about everything from the foundations to setting up all of the resources with CDK patterns and then deployment with CodePipeline, that's probably a good place to wrap it up. What we'll do is give a link to that ECS pattern in the show notes, and we'll also give a link to episode 37, where we talked about migrating to the cloud with EC2. Thank you very much for listening. We'll see you in the next episode. Have a great day.