AWS Bites Podcast

93. CDK Patterns - The Good, The Bad and The Ugly

Published 2023-08-11

In today's episode, we're diving into the fascinating world of CDK Patterns - those ingenious building blocks that can transform your cloud journey. We uncover what CDK Patterns are, where to find them, and why you'll want to use them!

With CDK's object-oriented abstraction, L2 and L3 Constructs bring a whole new level of convenience. We'll explore where to find these powerful patterns, from the ones baked right into CDK to the inspiring examples showcased by community websites such as cdkpatterns.com.

Why bother with CDK Patterns and L3 Constructs? Well, imagine encapsulating best practices, avoiding tedious configuration repetition, and ensuring a consistent approach across your services. That's just the tip of the iceberg!

Of course, we'll be candid about the challenges you might encounter, like versioning and resource oversight. Fear not! We'll share practical tips to address these hurdles, including automated testing and vigilant monitoring using CDK diff functionality.

And wait, there's more! We'll reveal some exciting alternatives to CDK Patterns, giving you a broader perspective on reusable modules for your cloud adventures.

fourTheorem is the company that makes AWS Bites possible. If you are looking for a partner to accompany you on your cloud journey, check them out at fourtheorem.com!

Let's talk!

Do you agree with our opinions? Do you have interesting AWS questions you'd like us to chat about? Leave a comment on YouTube or connect with us on Twitter: @eoins, @loige.

Eoin: Today, we are going to be diving into the world of CDK patterns, what they are, where to find them and why you might want to use them. Join us as we discuss the benefits and challenges of using these powerful, reusable modules and explore some alternatives available if the thought of generating infrastructure with dynamic, reusable code gives you nightmares. I am Eoin, here again with Luciano for another episode of the AWS Bites podcast. fourTheorem is the company that makes AWS Bites possible. If you're looking for a partner to accompany you on your cloud journey, check them out at fourtheorem.com. Luciano, can you give me a quick recap on CDK, just for people who don't remember or don't know what it's all about?

Luciano: Yes, so CDK stands for Cloud Development Kit, and it's basically an object-oriented abstraction for CloudFormation. So the idea is that rather than using YAML or JSON to write your infrastructure as code, you can use real code like JavaScript, TypeScript, Python, C#, Java or Go (I believe those are the ones supported) to define the infrastructure that you want to be provisioned in your AWS environment.

It's actually not really limited to CloudFormation, because if you look at the bigger picture in the realm of CDK, you also have a project called CDK for Terraform. So you can also generate infrastructure that is then deployed with Terraform. And I think there is also a project that allows you to provision Kubernetes configuration using CDK as well. Today, we only want to focus on the CloudFormation one, because this is the one that we have been using the most and the one we know the best.

So, yeah, infrastructure as code, as we said, is generally declarative. And that brings certain challenges, because it's always very tricky to do things like loops or conditional logic, or to add extra code hooks, maybe to do something before or after. You always need to figure out your own orchestration, your own bash scripts to wrap around things, or generate code dynamically using Jinja templates.

I've seen all sorts of variations of that, just because there are limitations in the way you typically write infrastructure as code declaratively with languages like YAML or JSON. So CDK tries to fill that gap and give you a nicer experience. The idea is that you write code where, by instantiating a bunch of classes, you define the things that you want to appear in your infrastructure, how they are configured and how they are integrated together, because you can easily reference properties from one another.

And then at some point, when you're happy with it, you can run a step that is called synthesize. What synthesize does is take all of that code, whatever your language of choice, evaluate it and convert it into a proper CloudFormation template that can then be deployed, still using CloudFormation behind the scenes. So CDK gives you all the fundamental building blocks, and they generally map one to one to what you get with CloudFormation.
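To make the synthesize idea a bit more concrete, here is a toy model in plain Python. It is only a sketch and does not use the real CDK library: `ToyStack`, `ToyResource` and `synth` are hypothetical names made up for this illustration. What it tries to show is the flow just described: instantiating classes builds an in-memory description, and a separate step turns that description into a CloudFormation-shaped document.

```python
import json


class ToyResource:
    """Hypothetical stand-in for a construct: one instance, one resource."""

    def __init__(self, stack, logical_id, resource_type, properties):
        self.logical_id = logical_id
        self.resource_type = resource_type
        self.properties = properties
        stack.resources.append(self)  # register with the parent stack

    def ref(self):
        # A reference is resolved by CloudFormation at deploy time, not here.
        return {"Ref": self.logical_id}


class ToyStack:
    """Hypothetical stand-in for a stack: collects resources, then synthesizes."""

    def __init__(self):
        self.resources = []

    def synth(self):
        # Turn the in-memory objects into a CloudFormation-shaped dict.
        return {
            "Resources": {
                r.logical_id: {"Type": r.resource_type, "Properties": r.properties}
                for r in self.resources
            }
        }


stack = ToyStack()
queue = ToyResource(stack, "JobsQueue", "AWS::SQS::Queue", {"VisibilityTimeout": 60})
# Wiring one resource to another is just passing a reference around in code.
ToyResource(
    stack,
    "QueueUrlParam",
    "AWS::SSM::Parameter",
    {"Type": "String", "Value": queue.ref()},
)

template = stack.synth()
print(json.dumps(template, indent=2))
```

The real library does far more (logical ID generation, assets, validation), but the shape of the workflow is roughly this: code first, template second, deployment last.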

But then you also have other abstractions on top of that. The basic abstraction is the construct, which represents the entities that you can define in CloudFormation. But you can also use constructs to define your own custom things. And then you also have assets, which are not really a CloudFormation thing, but a nice extension that CDK gives you to be able to deploy code, for instance Lambda functions or container images, as part of your own infrastructure definition. So how do we start to make sense of all these different concepts? For instance, can we start by describing better what constructs are and how they are organized?

Eoin: Yeah, the construct is the main thing you need to be concerned with in CDK. Constructs are essentially classes that generate one or more CloudFormation resources. And you have three different levels. There's actually a fourth level, but levels one, two and three are the main ones you would encounter in the wild. Level one constructs are just simple representations of a CloudFormation resource exactly as it's defined in the CloudFormation docs; they're just generated from that definition.

It's just the same as writing CloudFormation, except it's represented by a class. So you get type safety and code completion. And often for new services, this is all you get. You just get the `Cfn` resources, because the names of all L1, or level one, constructs begin with `Cfn`. Where it really starts to add value is with the level two, or L2, constructs. They provide more convenient helper functions and types to reduce the amount of code that you have to write and allow you to connect resources together more easily.

So if we take an example, the L1 construct for an S3 bucket would be the `CfnBucket` class, and it would require you to pass a string for the encryption method. But the L2 construct for a bucket has typed values for unencrypted, KMS, S3 managed, etc. The L2 construct also has helper functions like `grantPut` that will generate the right policy statement to allow a principal to put an object in that bucket.
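The difference between the two levels can be sketched with a toy model in plain Python. None of this is the real CDK API: `Encryption`, `l1_bucket`, `ToyBucket` and `grant_put` are hypothetical names for this illustration, and the real `grantPut` helper produces a policy that differs in its details. The sketch only tries to show the two ideas mentioned above: typed values instead of raw strings, and a helper that writes the policy statement for you.

```python
from enum import Enum


class Encryption(Enum):
    """Hypothetical typed choices, in the spirit of an L2 construct."""

    UNENCRYPTED = None
    S3_MANAGED = "AES256"
    KMS = "aws:kms"


def l1_bucket(sse_algorithm):
    # L1 style: the caller passes the raw string, typos and all.
    return {
        "Type": "AWS::S3::Bucket",
        "Properties": {
            "BucketEncryption": {
                "ServerSideEncryptionConfiguration": [
                    {"ServerSideEncryptionByDefault": {"SSEAlgorithm": sse_algorithm}}
                ]
            }
        },
    }


class ToyBucket:
    """Hypothetical L2-style wrapper around the L1 shape above."""

    def __init__(self, logical_id, encryption):
        if not isinstance(encryption, Encryption):
            raise TypeError("encryption must be an Encryption member")
        self.logical_id = logical_id
        self.encryption = encryption

    def to_resource(self):
        if self.encryption is Encryption.UNENCRYPTED:
            return {"Type": "AWS::S3::Bucket", "Properties": {}}
        return l1_bucket(self.encryption.value)

    def grant_put(self, principal_arn):
        # Build the statement so the caller never hand-writes the policy JSON.
        object_arn = {
            "Fn::Join": ["", [{"Fn::GetAtt": [self.logical_id, "Arn"]}, "/*"]]
        }
        return {
            "Effect": "Allow",
            "Principal": {"AWS": principal_arn},
            "Action": ["s3:PutObject"],
            "Resource": object_arn,
        }


bucket = ToyBucket("UploadsBucket", Encryption.S3_MANAGED)
statement = bucket.grant_put("arn:aws:iam::123456789012:role/uploader")
print(bucket.to_resource())
print(statement)
```

A misspelled string such as `"AES-256"` passes straight through `l1_bucket`, while `ToyBucket` rejects anything that is not one of the enumerated choices before a template is ever produced.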

And this is one of the big benefits of level two constructs for many, since it can reduce the human error encountered in creating IAM policies. And we know all about that. Then if we're moving into the realm of CDK patterns, then we're looking at generally level three or L3 constructs. And these are really higher order constructs that combine multiple level one and level two constructs together to achieve a specific use case.

For example, you could create a construct to create a cluster of EC2 instances, security groups, VPC, network routing, logging and backups all in one class. You can kind of compare it to React components, where you have simple components and then you have higher order components. And that's exactly what L3 constructs are trying to do. So today we're talking about CDK patterns, and these are often created by providing L3 constructs. And there are tons of CDK patterns out there. And you can also create your own quite easily. So Luciano, where can people start to find the CDK patterns and level three constructs?

Luciano: The first thing that comes to mind is that CDK itself has a concept of patterns built in. There are a couple of interesting sub-libraries that are already available once you install CDK. One is called `aws-ecs-patterns` and another one is called `aws-route53-patterns`. The ECS one, I think, is fairly powerful, because ECS is notoriously complex to configure yourself. There are so many resources and so many configuration options that having patterns is really needed there, because otherwise you might always be reinventing the wheel and always bumping into the same old mistakes.

So what you get out of the box with the ECS patterns library is that, if you want to run a web application on Fargate backed by a load balancer, all of that is made very easy with this specific pattern. Similarly, you can switch the application load balancer for a network load balancer as well. Another use case that is covered very nicely is when you want to use Fargate, for instance, to process jobs coming from an SQS queue.

And you can do all of that with a container running on Fargate that scales. And it's very easy to configure all the different resources this way when you use this particular pattern. And the interesting thing, again, is that because it's a pattern, it gives you a higher level abstraction. So in your code, you just instantiate one class or very few classes. And then behind the scenes, it doesn't really map one to one, like one class to one resource, but actually ends up creating all the necessary resources for you.

So you get load balancer, you also get health checks. If it needs to create queues, it's going to create the queue, it's going to create auto scaling rules for you. And of course, everything that is customizable, you just will have a higher level interface to specify how to customize the different things. And another interesting detail is that you can even let it create VPCs or you can use a VPC that you already have in your account, for instance.

So you can also reference other existing resources in some cases. So the cool thing is that it's something that is going to save you a lot of time and a lot of headaches, because it's easier to end up with the result you want without making mistakes. But at the same time, it's hiding a little bit of what's being generated. You need to be really diligent about looking into the generated resources. They are not so transparent anymore from the code that you are writing. So sometimes there might be things that you didn't account for. Maybe it's creating a NAT Gateway that you didn't need, but now you're suddenly paying for it. So the general advice there is: don't trust CDK blindly. Always spend time looking at what's being generated, review the stacks, review all the resources in the stack and make sure you understand why all the resources are there and if you really need them.
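One way to act on that advice is to summarize a synthesized template by resource type. The following is a small sketch in plain Python, under a few assumptions: the template is already loaded as a dictionary (CDK writes templates as JSON files in its output directory, so `json.load` would get you there), and `COSTLY_TYPES` is a hypothetical watch list chosen for this example rather than any official one.

```python
from collections import Counter

# Resource types worth a second look because they are billed while idle.
# This watch list is an assumption for the sketch, not an official list.
COSTLY_TYPES = {
    "AWS::EC2::NatGateway",
    "AWS::ElasticLoadBalancingV2::LoadBalancer",
}


def summarize(template):
    """Count resources per CloudFormation type in a synthesized template."""
    return Counter(r["Type"] for r in template.get("Resources", {}).values())


def costly(template):
    """Return the logical IDs of resources whose type is on the watch list."""
    return sorted(
        logical_id
        for logical_id, r in template.get("Resources", {}).items()
        if r["Type"] in COSTLY_TYPES
    )


# Hand-written sample standing in for a file produced by the synth step.
sample = {
    "Resources": {
        "Vpc": {"Type": "AWS::EC2::VPC"},
        "NatA": {"Type": "AWS::EC2::NatGateway"},
        "NatB": {"Type": "AWS::EC2::NatGateway"},
        "Service": {"Type": "AWS::ECS::Service"},
    }
}

print(summarize(sample))
print(costly(sample))
```

A check like this could run in a pipeline after synthesis, so that an unexpected NAT Gateway shows up in a review rather than on the bill.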

Eoin: AWS also has an open source set of extensions for CDK called AWS Solutions Constructs. This is a different type of CDK pattern really, because rather than providing reusable higher order constructs for complex configurations like Fargate with load balancers and all the other integrations, it is essentially around 50 different simple patterns for connecting commonly used resources together, normally two resources. An example would be connecting a CloudFront distribution to an API Gateway. So they're not as rich as the ECS patterns, but more just examples of connecting two services together with the right permissions.

Luciano: Yeah, actually that reminds me of another similar project, which is more open source and community driven, called cdkpatterns.com. You might have heard of this one because it was also mentioned by Werner Vogels at one of the recent re:Invents. So it's kind of a similar idea. It's still giving you examples of solutions that you might want to deploy using CDK. So it's code that you can easily take, bring into your own CDK project and change as necessary.

And just to give you some examples, you can build an API where the backend is Lambda and that backend uses Polly and Translate, maybe to do interesting things with audio and text. Or there are other examples where you take a CSV and import it into DynamoDB, and from there you create a processing pipeline that does other interesting things. These are typically not L3 constructs. Again, they are more examples that you can take and change as needed. So they are not meant to be highly reusable or highly configurable; they are use cases that we commonly see. Just take them and adapt them to your actual needs. But I think the main question that we still have is: what is really the value there? Why would we want to use L3 constructs or higher level constructs in general?

Eoin: The reason for using CDK patterns, and level three constructs in particular, is that they're like reusable modules that can be shared within a community or an organization, especially if you're all in on CDK. We've seen companies do this where they go all in on CDK, use it for everything and then do lots of sharing and collaboration and have central teams managing these reusable components. And there are lots of good reasons for doing that.

Firstly, it allows you to encapsulate proven best practices. It also allows you to build in Well-Architected Framework principles. And you mentioned duplication and reusability. It helps you to stay DRY by avoiding duplicating the same configuration for groups of resources everywhere you go. And then you get consistency in the usage of services across the organization. So when people go from one team to another and one project to another, they've got consistency and they can understand how things work. So it stops you from reinventing the wheel and hopefully allows your teams to go faster, because they're getting the encapsulated best practices for your organization out of the box. So that's the positive, but it's not without its trade-offs. So what are some of the challenges?

Luciano: Indeed, there are challenges. As with every technology, there is always a trade-off between some nice things and some less nice things, and you need to find the balance and figure out when it's worth it or not. So one of the challenges is versioning and keeping teams up to date when improvements are made. And this applies to changes in CDK itself. We have seen, for instance, a fairly big change between version one and version two.

So there might have been some disruptions for people having to upgrade from one version to the next one. But it also applies to changes that you make in your own CDK code, right? How do you keep that in sync with other teams? If you change some of the best practices, how can you track down the places where you are not using that best practice yet? And all these kinds of concerns would exist even if you used other tools.

So it's not necessarily a problem with CDK itself, but it's still something that you need to think about. CDK is not magically solving that problem for you. And in general, I would say that if everyone is using patterns, again, there is a risk that you no longer think in terms of the AWS resources being created; you just think about use cases and you kind of start to lose track of the bigger picture there.

At the end of the day, you want to know which resources you are creating, because they will impact you in terms of cost, quota and security. So if you stop looking at those, you might end up with lots of problems that you didn't expect and be surprised when you have a security issue or when you start to reach a quota. Or maybe you have a massive bill shock and you don't realize why: maybe you just deployed a simple API project and it's costing you way more than it should.

So all these things might be a problem. And again, the suggestion there is to always keep an eye on what's happening behind the scenes and always try to think in terms of AWS resources at the end of the day. You should still focus on the abstraction layers, but without forgetting that those abstraction layers will create resources, and those are ultimately what you should care about.

And another thing is that CDK is not necessarily deterministic, for various reasons. Again, there are changes in CDK itself, but the way you write your stack might also not be deterministic on its own. For instance, say you call `Math.random()` in your code, right? That value will change every single time you synthesize your stack. And if you're using that to synthesize different resources, or maybe to change the name of a property or the name of a resource, you end up with a stack that is always different.
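The effect of a random value on synthesis can be shown with a small plain-Python sketch. It does not use CDK: `synth`, `random_name` and `stable_name` are hypothetical helpers for this illustration, and deriving a name from a hash of fixed inputs is just one possible way to keep output stable, not something the episode prescribes.

```python
import hashlib
import random


def random_name(prefix):
    # Changes on every run, so every synth yields a "new" resource name.
    return f"{prefix}-{random.randint(0, 10**9)}"


def stable_name(prefix, *parts):
    # Derived only from its inputs, so the same inputs give the same name.
    digest = hashlib.sha256("/".join(parts).encode()).hexdigest()[:8]
    return f"{prefix}-{digest}"


def synth(namer):
    """Produce a tiny CloudFormation-shaped dict using the given naming function."""
    return {
        "Resources": {
            "Queue": {
                "Type": "AWS::SQS::Queue",
                "Properties": {"QueueName": namer()},
            }
        }
    }


# Two synth runs with a deterministic name produce identical templates.
first = synth(lambda: stable_name("jobs", "my-app", "prod"))
second = synth(lambda: stable_name("jobs", "my-app", "prod"))
print(first == second)

# Two synth runs with a random name will almost always differ,
# so every deployment would report a change to the queue.
noisy_a = synth(lambda: random_name("jobs"))
noisy_b = synth(lambda: random_name("jobs"))
print(noisy_a == noisy_b)
```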

So when you try to deploy that, you will always have changes, even though you might not want those changes, because logically you are not changing anything relevant. So these are just issues that people will bump into initially when they start to use CDK, because they think, "I can write all the code I want. It's just code." But I think it's still important to understand that there is a very specific mental model.

There are phases. Ultimately, you are generating CloudFormation. You are still deploying a CloudFormation stack. So you really need to understand some of the inner workings of CDK to avoid some common mistakes. I was personally burned a few times by trying to do conditional logic with values that sometimes are not immediately available when the CDK code is evaluated. CDK has this concept of tokens, which are values that will be available only at deployment time.

So if you try to write if statements, maybe checking if those kinds of values are true or false and then generating some resources or others based on that, that code is just not going to work for you. It's just going to always be true or always be false. And that conditional logic is not going to work the way you expect. Similarly, you can have other problems if you use the same approach for loops. Maybe you don't go through the loop at all, or maybe you just do one iteration, or maybe you do endless iterations. So just be aware that you really need to understand the mental model and how the execution flow of CDK works, because you cannot really write all the code you want. It's not going to magically do everything you want to do in code. The code you write still needs to fit nicely with the model that CDK was built for.
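The token pitfall can be imitated with a toy model in plain Python. `ToyToken`, `naive_build` and `build_resources` are hypothetical names for this sketch and not the CDK API; the real library has its own token machinery, including a way to check whether a value is still unresolved. The point here is only to show why an `if` on a deploy-time value always goes the same way at synth time.

```python
class ToyToken:
    """Hypothetical placeholder for a value only known at deployment time."""

    def __init__(self, description):
        self.description = description

    def __str__(self):
        # What synth-time code actually sees: an opaque marker, not a value.
        return "${Token[" + self.description + "]}"


def is_unresolved(value):
    return isinstance(value, ToyToken)


def naive_build(enable_cache):
    # Bug: a placeholder object is always truthy, whatever it resolves to later.
    return ["Api", "Cache"] if enable_cache else ["Api"]


def build_resources(enable_cache):
    # Safer: refuse to branch on something that has no value yet.
    if is_unresolved(enable_cache):
        raise ValueError("cannot branch on a deploy-time value at synth time")
    resources = ["Api"]
    if enable_cache:
        resources.append("Cache")
    return resources


flag = ToyToken("EnableCacheParameter")
print(str(flag))
print(naive_build(flag))       # the branch is taken regardless of the deployed value
print(build_resources(False))  # plain values known at synth time branch as expected
```

When the decision really has to be made at deployment time, it belongs in the template itself (CloudFormation has conditions for that) rather than in an `if` statement that runs during synthesis.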

Eoin: It is possible to address those challenges, but it's also good to be aware of what it takes to mitigate the risk there. So lots of automated testing and continuous delivery is one thing that will definitely help. If you have an organization and are going all in on CDK, it also helps to have dedicated people to maintain these constructs, rather than trying to scramble to maintain them in a distributed fashion across multiple teams who are focused on other goals.

Good semantic versioning enforcement, of course, is always important for reusable modules. Great documentation will really help because it helps to make everybody self-sufficient. Observability as well, so that when things go wrong, you can detect early and have maybe like canary checks in your deployment pipeline as well. So that even if things build and deploy successfully, you can check what happens.

Cost management is another thing, because if you're using patterns and aren't really looking at what's being generated, then one of the risks is that you can incur cost under the hood. So if you've got good observability on the cost side of things, that will help with that risk. Another simple one is just keeping a close eye on change sets. So use `cdk diff` and CloudFormation change sets, and inspect the generated output and what has changed from one version to the other, so that when you upgrade to a new version of a construct you can detect changes that you may not have expected. So this is the CDK world of reusable modules. But do we have to use CDK if we want to get this level of reusability, or are there alternatives for people who just don't want to go into CDK?
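The idea behind inspecting what changed can be sketched as a comparison of two synthesized templates. This plain-Python example is only an illustration of the principle; it is not how `cdk diff` or change sets are implemented, and `diff_templates` and the sample templates are hypothetical.

```python
def diff_templates(old, new):
    """Compare two CloudFormation-shaped dicts by logical ID."""
    old_res = old.get("Resources", {})
    new_res = new.get("Resources", {})
    shared = set(old_res) & set(new_res)
    return {
        "added": sorted(set(new_res) - set(old_res)),
        "removed": sorted(set(old_res) - set(new_res)),
        "changed": sorted(k for k in shared if old_res[k] != new_res[k]),
    }


# Hand-written samples standing in for the output of two construct versions.
before = {
    "Resources": {
        "Queue": {"Type": "AWS::SQS::Queue", "Properties": {"VisibilityTimeout": 30}},
        "Vpc": {"Type": "AWS::EC2::VPC", "Properties": {}},
    }
}
after = {
    "Resources": {
        "Queue": {"Type": "AWS::SQS::Queue", "Properties": {"VisibilityTimeout": 60}},
        "Vpc": {"Type": "AWS::EC2::VPC", "Properties": {}},
        "Nat": {"Type": "AWS::EC2::NatGateway", "Properties": {}},
    }
}

report = diff_templates(before, after)
print(report)
```

Here an upgrade that looked harmless in code would surface as one changed queue and one added NAT Gateway, which is exactly the kind of surprise worth catching before deployment.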

Luciano: Yeah, there are definitely alternatives, and I know lots of people who don't like this idea of writing code to define stacks. They prefer something more declarative. They still prefer something that looks more like YAML. And I can understand that way of thinking; of course, there are good reasons for it. So what can you do in that case, if you are more on that side and want to stick with writing infrastructure as code in a declarative way, not using programming languages?

So if you use CloudFormation, there are a few options. You can create your own CloudFormation library. For instance, you can create your own CloudFormation macros to try to do more stuff, or even to do things that CloudFormation cannot do today. Maybe integrate with other providers outside AWS. You can definitely use macros for that. You can also use CloudFormation templates. There is actually a really good repository, which we will have in the show notes, that has lots of examples.

So with CloudFormation templates, you basically build stacks that are highly parameterized. And then, just by passing specific parameters, you can adjust that particular stack or solution to your needs. And there is also AWS Service Catalog, which is somewhat similar to the idea of CloudFormation templates. If instead you are a user of Terraform, Terraform comes with a built-in concept of modules. So there is already an idea in Terraform itself of reusable units that are configurable and that you can compose together. And there is a really good repository called terraform-aws-modules, which has a huge collection of solutions and high-level modules. Somehow they remind me of L3 constructs in CDK, but applied to Terraform. So we will have that link as well in the show notes. And it's definitely a must if you're doing AWS using Terraform.

Eoin: We have plenty of options there. Well, I think that's it for today's episode of AWS Bites. We hope we gave you a valuable take on CDK patterns and the power you get, but also the responsibility you need to take if you want to make them work well. As always, we want to thank you for learning and sharing AWS ideas with us. Please leave us a review and share the podcast with your colleagues and friends. We really appreciate your support and look forward to bringing you more cloud goodness in the next episode. Catch you then.