AWS Bites Podcast

16. What are the pros and cons of CDK?

Published 2021-12-23 - Listen on your favourite podcast player

In this episode, Eoin and Luciano explore the various pros and cons of AWS Cloud Development Kit (CDK).

We start by describing what CDK is and what it is used for. Then we spend a bit of time covering the details of how CDK actually works, defining L1, L2 and L3 constructs and the integration with CloudFormation.

In the central part of the episode we deep dive into a bunch of pros and cons of CDK, mostly trying to describe the tradeoffs and the pitfalls.

Finally, we close the episode by giving a piece of advice on what we believe is the best way to get started with CDK to minimize the surprises and be able to reap all the benefits of this amazing tool.

Let's talk!

Do you agree with our opinions? Do you have interesting AWS questions you'd like us to chat about? Leave a comment on YouTube or connect with us on Twitter: @eoins, @loige.

Help us to make this transcription better! If you find an error, please submit a PR with your corrections.

Eoin: Hello and welcome to AWS Bites, the weekly show with bite-sized episodes where we answer your questions about AWS. My name is Eoin and I'm joined by Luciano. Before we get started, make sure you give us a follow and subscribe so you can be notified when the next episode goes live. Today's question is, what are the pros and cons of CDK? And as always, let's start off by talking about definitions. What is CDK?

Luciano: Yeah, CDK is a relatively new tool slash service from AWS. The name stands for Cloud Development Kit, and it is essentially another way of writing infrastructure as code. But this time, rather than using a declarative language like JSON or YAML or something similar, you can actually use real imperative programming languages, and many of them are supported, for instance TypeScript, Java, C#, Python. I actually don't know if even Go is supported. Maybe it's in beta, but yeah, the idea is that you can probably use a language that you are already comfortable with when you're writing code. Do you want to tell us how it works?

Eoin: There's a lot of complexity under the hood, and a lot of magic actually, in supporting all those languages. I know that they've got this project called jsii, which, I think, basically allows them to write it in TypeScript and then all these other languages get generated. But yeah, so it's generating CloudFormation, as we said. We talked a lot about CloudFormation in the past and how important CloudFormation is as a service if you're using AWS.

But CDK gives you layers of abstraction on top of it. So the question becomes how much of an abstraction do you want to have on top of your CloudFormation resources? Because CloudFormation can be very verbose, but it's also very clear and declarative in some ways. When you're reading a JSON or YAML file, you can get very comfortable with that and it's very easy to see what you're about to deploy.

CDK will give you a number of levels. So there's this concept of L1 and L2 constructs. An L1 or level one construct is basically a programmatic, object-oriented wrapper around each of these CloudFormation resources. So you have these classes whose names begin with Cfn, and those are really just a typed layer on top of the resources you're going to generate. Then you've got these L2 constructs, which are higher levels of abstraction.

And those are CDK classes that will give you sane defaults for a lot of use cases. So you can create an SQS queue, but you don't have to declare all of the properties. It will give you some sensible defaults. And then beyond that, you've got higher level patterns and constructs that can be whole applications even. They're groups of CloudFormation resources with lots of defaults and some configurability, but the idea is that they're patterns that allow you to deploy a lot of resources with a couple of lines of code. And that'll lead us into, I guess, some of the advantages and disadvantages of CDK, because people who have experienced lots of abstractions in various programming paradigms over a number of decades will understand that abstractions don't always come for free. So let's go through the pros and cons. That's what we're here to talk about. What do you think are some of the advantages of CDK, Luciano?
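To make the L1 versus L2 distinction concrete, here is a minimal, dependency-free sketch. It is a toy model and not the real aws-cdk-lib API: the class names (`ToyCfnQueue`, `ToyQueue`), the property names and the default values are all assumptions chosen for illustration. The "L1" class forces the caller to spell out every property, while the "L2" class fills in defaults and delegates to the L1 class.

```typescript
// Toy model of CDK construct levels. NOT the real aws-cdk-lib API:
// every class, property and default below is hypothetical.

// Shape of one resource in a synthesized CloudFormation-style template.
interface ToyResourceJson {
  Type: string;
  Properties: Record<string, unknown>;
}

// "L1" style: a thin typed wrapper that maps one-to-one onto a
// CloudFormation resource. The caller supplies every property.
class ToyCfnQueue {
  readonly logicalId: string;
  readonly visibilityTimeout: number;
  readonly messageRetentionPeriod: number;

  constructor(logicalId: string, visibilityTimeout: number, messageRetentionPeriod: number) {
    this.logicalId = logicalId;
    this.visibilityTimeout = visibilityTimeout;
    this.messageRetentionPeriod = messageRetentionPeriod;
  }

  toCloudFormation(): ToyResourceJson {
    return {
      Type: "AWS::SQS::Queue",
      Properties: {
        VisibilityTimeout: this.visibilityTimeout,
        MessageRetentionPeriod: this.messageRetentionPeriod,
      },
    };
  }
}

// "L2" style: every property is optional and falls back to a default
// picked by the library author (illustrative values here).
interface ToyQueueProps {
  visibilityTimeoutSeconds?: number;
  retentionDays?: number;
}

class ToyQueue {
  private readonly l1: ToyCfnQueue;

  constructor(logicalId: string, props: ToyQueueProps = {}) {
    this.l1 = new ToyCfnQueue(
      logicalId,
      props.visibilityTimeoutSeconds ?? 30,
      (props.retentionDays ?? 4) * 24 * 60 * 60, // days converted to seconds
    );
  }

  toCloudFormation(): ToyResourceJson {
    return this.l1.toCloudFormation();
  }
}

// Both objects describe the same resource; only the amount the caller
// has to write (and therefore know about) differs.
const explicitQueue = new ToyCfnQueue("Orders", 30, 345600);
const defaultedQueue = new ToyQueue("Orders");

console.log(JSON.stringify(defaultedQueue.toCloudFormation(), null, 2));
```

The trade-off discussed in the episode shows up here in miniature: the second form is shorter, but the retention period is now a decision made inside the library rather than in your own code.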

Luciano: I suppose the most obvious ones, you're probably getting that already, are that if you are already comfortable with a programming language because you are mostly writing code most of your time, you are already in a way good to go. You don't need to learn anything new. Of course, you need to learn what are the classes that you need to use to build different things. The different levels that you described are not so obvious at first.

So there is still, of course, a learning curve, but at least you don't have to learn an entirely new language. For instance, when I started doing Terraform, I needed to learn all the syntax and the nitty-gritty details of HCL, the HashiCorp Configuration Language. And that sometimes is a little bit of a barrier that you can avoid with something like CDK. Also, we mentioned that there are all the different levels, and these levels exist for what you get out of the box from AWS.

That is what you get if you just start to use CDK, but there are also third-party patterns that you can use: you can search online and download the ones that you think make the most sense for you. And you can even avail of work that other people are doing, just bringing it into your project, and you are more or less, let's say, good to go. We'll probably spend a few words on that later. And I suppose another great thing is that all the languages that are supported are typed languages.

And because of that, you get a very good level of auto-completion and type checking. So for instance, if you are writing, let's say, a low-level construct, like a Cfn one, when you start to initialize the classes, you're going to get good auto-complete. You can immediately see all the properties. You can see documentation inline. With CloudFormation or Terraform, the flow I used to have was that you always have the documentation on one page and your editor on the other, or maybe on two different screens, and you're always looking at two different places to try to reconcile them. Now it's a little bit more streamlined: in one window, you have everything you need to figure out what kind of properties you need to set and where.

Eoin: When I started using CDK, when it first came out and it was in beta, I opened up, I think, VS Code and started writing TypeScript, and I got all this auto-completion and type checking and immediate error feedback on what properties were missing. You know, it's something we're used to when developing code, but with CloudFormation, even then, I don't know if cfn-lint was available. Getting feedback on CloudFormation was typically something you needed to deploy in order to get, so that was a big productivity win. That was really good for me. And then other things as well, like IAM policy generation. Have you found that with CDK you spend much less time handcrafting and tuning policies and figuring out why you're getting "malformed policy document", "legacy parsing failed", these kinds of errors?

Luciano: Yeah, this is actually a really good point. Let me give a practical example. For instance, you define an S3 bucket and then you want to allow a particular EC2 instance to read and write in that S3 bucket. What you would typically do with CloudFormation is create three distinct resources: the bucket, the EC2 instance, and then a policy you craft yourself that ties the two together, giving the right permissions.

With CDK, because you have this kind of object-oriented approach, you take the object that describes the S3 bucket in your code, call grantReadWrite on it, and pass a reference to the instance. And that will automatically generate a policy for you. So it's a little bit of abstraction, but I think it feels a lot more readable, and it's easier to get the link between the two resources right because you don't really have to reference things manually. You essentially let the autocomplete guide you and it will most likely do the right thing for you. So that's something I really liked, and I think even when I gave that code to somebody else who wasn't familiar with either CloudFormation or CDK, they immediately realized, okay, you are creating this instance, you are creating this bucket, and then you are granting permissions for the instance to read and write in that bucket. So I think that's another very clear advantage.
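The grant idea Luciano describes can be sketched without any dependencies. This is a toy model and not the real aws-cdk-lib API: `ToyBucket`, `ToyInstance` and `ToyRole` are hypothetical classes, and the list of S3 actions is deliberately simplified compared to what a real library would generate. The point is only that the bucket object knows its own ARN, so one method call can produce a correctly linked policy statement.

```typescript
// Toy model of the "grant" pattern. NOT the real aws-cdk-lib API:
// all class names are hypothetical and the action list is simplified.

interface ToyPolicyStatement {
  Effect: "Allow";
  Action: string[];
  Resource: string[];
}

class ToyRole {
  readonly roleName: string;
  readonly statements: ToyPolicyStatement[] = [];

  constructor(roleName: string) {
    this.roleName = roleName;
  }
}

class ToyInstance {
  // Each instance owns a role that collects its permissions.
  readonly role: ToyRole;

  constructor(id: string) {
    this.role = new ToyRole(`${id}Role`);
  }
}

class ToyBucket {
  readonly arn: string;

  constructor(bucketName: string) {
    this.arn = `arn:aws:s3:::${bucketName}`;
  }

  // Adds a read/write statement scoped to this bucket to the grantee's role.
  // The caller never has to copy an ARN by hand.
  grantReadWrite(grantee: ToyInstance): void {
    grantee.role.statements.push({
      Effect: "Allow",
      Action: ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
      Resource: [this.arn, `${this.arn}/*`],
    });
  }
}

const uploadsBucket = new ToyBucket("uploads");
const workerInstance = new ToyInstance("Worker");

// Reads almost like the sentence "grant the worker read/write on uploads".
uploadsBucket.grantReadWrite(workerInstance);

console.log(JSON.stringify(workerInstance.role.statements, null, 2));
```

Because the statement is generated, it is worth printing or otherwise inspecting the result at least once, which ties in with the later advice in the episode about understanding what an abstraction produces.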

Eoin: There is one thing I like as well. CloudFormation has a lot of features like change sets and stack sets, and change sets allow you to make a plan for what you're about to deploy, inspect that plan, and then apply it in a separate step. Terraform also has this concept with Terraform plans. But change sets don't tend to be used very commonly.

I think I heard someone recently saying that some ridiculously high percentage of the CloudFormation in the world is deployed using the Serverless Framework, like 80 or 90%. And the Serverless Framework doesn't use change sets by default. I think there's a plugin for it, but it tends not to get used. So people tend to just deploy. But CDK is very much built around change sets. It has this synthesis process, which allows you to see the template. Then when you're deploying, it creates a change set that you can inspect and verify, particularly the security changes, before you deploy. So this is nice. It's allowing you to follow best practices by default rather than having to add that in yourself.

Luciano: Yeah, this is something I used to do a lot with Terraform. When I was prototyping something, even before deploying, I would run terraform plan, which is pretty much the same thing we are describing. You get a list of what is going to change if you deploy right now. Maybe I didn't want to deploy; it was just a good sanity check to see that I was really going in the right direction in describing the changes I wanted to happen in the infrastructure. And now with CDK, you can just do cdk diff, I think is the command, and it will give you a list of what's going to change compared to your current infrastructure if you apply this. And I think that's really powerful and really useful. Especially if you are starting to use infrastructure as code, it will give you a lot more confidence, when you are writing it for the first time, that you are going in the right direction.
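Conceptually, a diff step compares the template you have just synthesized with the one that is currently deployed and reports what would be added, removed or changed. The sketch below is a simplified illustration of that idea only. It is not how the cdk CLI is implemented, and `ToyTemplate` and `diffTemplates` are hypothetical names.

```typescript
// Toy illustration of a template diff. NOT the cdk CLI's implementation:
// the types and the function name are hypothetical.

type ToyTemplate = Record<string, { Type: string; Properties: Record<string, unknown> }>;

interface ToyDiff {
  added: string[];
  removed: string[];
  changed: string[];
}

function diffTemplates(deployed: ToyTemplate, synthesized: ToyTemplate): ToyDiff {
  const result: ToyDiff = { added: [], removed: [], changed: [] };

  for (const logicalId of Object.keys(synthesized)) {
    if (!(logicalId in deployed)) {
      result.added.push(logicalId);
    } else if (JSON.stringify(deployed[logicalId]) !== JSON.stringify(synthesized[logicalId])) {
      // Naive comparison: sensitive to key order, which is fine for a sketch.
      result.changed.push(logicalId);
    }
  }

  for (const logicalId of Object.keys(deployed)) {
    if (!(logicalId in synthesized)) {
      result.removed.push(logicalId);
    }
  }

  return result;
}

const deployedTemplate: ToyTemplate = {
  Uploads: { Type: "AWS::S3::Bucket", Properties: {} },
  Jobs: { Type: "AWS::SQS::Queue", Properties: { VisibilityTimeout: 30 } },
};

const nextTemplate: ToyTemplate = {
  Jobs: { Type: "AWS::SQS::Queue", Properties: { VisibilityTimeout: 60 } },
  Alerts: { Type: "AWS::SNS::Topic", Properties: {} },
};

const pendingChanges = diffTemplates(deployedTemplate, nextTemplate);
console.log(pendingChanges);
```

In this example the removal of `Uploads` is exactly the kind of change you would want to notice before deploying rather than after.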

Eoin: One of the other things that is really rising to the fore with CDK is how easy it can make doing pipelines. So if you use CodePipeline, CodeBuild, CodeDeploy, all these services, creating pipelines for them with CloudFormation is hard work, and maintaining those pipelines is really hard work. Building CI/CD stacks was one of the first things I ever used CDK for.

And I still use it very commonly because it just makes that process much easier, particularly when you have dynamic behavior in your pipelines. You know, the stacks you are deploying change, the number of steps in your pipeline changes, you want to be able to replicate a deployment pipeline for a dev environment or for a new set of accounts. CDK really facilitates that. And beyond that, if you are really going all in on CDK and you are using L2 constructs for creating Lambda functions and all of your resources for each service in your application, with multiple stacks within your application, there is a pattern, a kind of level three construct, called CDK Pipelines.

And this will basically create pipelines out of the box with very good defaults that would deploy all of the stacks in your application. And so this is really good. It means you have to go all in on CDK. That's the only thing, but it is a really nice advantage. And CDK pipelines then are also self-mutating. So if you have got the pipeline code in the same repo as all of your application stacks, then you commit to a branch or trigger a release, then when the pipeline runs, it will first make sure that the pipeline itself is running the latest version of that code, and then it will deploy everything else. So it's really nice from a change management point of view. You can imagine a PR that introduces a new service into your application and it includes the pipeline changes as well. So it makes it really easy from a code review, collaboration point of view.

Luciano: So that's really nice. Another thing that is probably relevant to what you just said is the concept of assets in CDK, which I think is really clever and can simplify your life a little bit in many use cases. For instance, if you use, I don't know, something like Terraform, every time you need to deploy, let's say, a Lambda or a container in ECS, you of course need to specify where the source code for that is.

And generally that means, okay, I need to create an artifact, publish it to S3 or to a container registry, and then I can reference that particular artifact in my infrastructure as code. With CDK, there is a way to abstract all that work: if you have the code colocated with your infrastructure as code, you can just reference assets in the same project. And then behind the scenes, that asset abstraction will, for instance, upload the source code of a Lambda to S3 or push a container image to a container registry. And it can even do the build phase within the context of CDK. So it's a slightly more streamlined process where you don't end up using different tools and different pipeline steps just to deploy your changes, which can be nice. I guess there are pros and cons, but it can be nice, especially if you're working on a small project, because it makes your life easier. But now that we've talked a lot about the good, what do we have to say about the bad? Do you want to mention something on that one?

Eoin: Yeah, there's quite a lot to cover here. In my own experience, one of the big things is that there hasn't been a lot of consistency in it, especially when you're talking about some of the L2 constructs. Not every service in AWS has L2 constructs. So I remember trying to deploy AWS Batch and you had to use the L1 constructs. So sometimes it falls a little bit behind CloudFormation, and it takes time for those L2 constructs to emerge.

But I suppose one of the main things, if we're looking at disadvantages of CDK and approaches like that, is that this is an abstraction layer and abstractions should always be treated with a decent amount of caution. If you look at object relational mapping as an example of that, there's always a price to pay for abstractions. One of the things then is if you don't understand the details of what is being generated, this is a dangerous thing.

So you've got a client-side application that's making decisions for you about the resources that will be deployed in your cloud. And if you're using this as a way of escaping having to understand what you're deploying in the cloud, that's a dangerous thing, right? Because you can really end up with performance or cost problems or other unexpected behaviors emerging from what you're deploying.

So there are a couple of cases where CDK is really good, but I wouldn't say it's good if you're coming at it as a beginner and using it to completely skip having a good understanding of what CloudFormation does and what it's doing for you. Going back to the episode where we talked about our favorite services, we mentioned CloudFormation as one of our top services. I think it might've been in my number one spot.

There's a reason for that, right? It's just critical to everything you do, and having predictable, deterministic deployments is really important. If you don't understand what's being generated, you might lose that. Absolutely. So yeah, I can't emphasize that enough, right? So it's important to go in with your eyes open. We have seen a lot of change as well. I started using it in the beta phase and, as you would expect, when it went to general release there were some breaking changes. But there have also been a lot of style changes since then, even across the version one series, and deprecation of methods of doing things.

And recently the last couple of weeks we've had version two come out and I've already seen a lot of people complain about breaking changes and how it's not ready for general release and how they have to start redesigning their stacks. So this is something that's important to clarify. There's always a trade off with these levels of abstraction. The other thing that can be quite confusing to new users is that you begin deploying a CDK application by executing this bootstrap phase.

And it's not immediately obvious what this is for or why you would do it, but you need to start, I suppose, with a bucket and some other things that CDK can use to deploy its resources. So it prepares a bucket for assets, it now also creates an ECR repository for container images, and it generates some policies for cross-account deployment. And these are all useful things, but when you're deploying your own application stack, sometimes it's a surprise to see that you need to deploy another stack first just to deploy the stack you're actually targeting. So that's a bit of complexity and something that's not universally understood. So I put that down as a disadvantage.

Luciano: Yeah, actually I'm not sure I understood that so well until now. So thank you for explaining that to me as well. I've always done the bootstrap phase, but I was a little bit like, okay, whatever, I guess I have to do this. But it's good to know what actually happens behind the scenes, I think. I have another one, which is something that has bitten me quite a few times actually. And that's basically that you think about CDK as, okay, this is just code. I can do whatever I want.

You can write all sorts of business logic in there, loops and if statements, and do things as you would in any other regular programming language, but sometimes you don't get the behavior you would expect. And this is because in reality, what CDK is doing is managing an entire life cycle, and you are defining resources that will need to be provisioned at certain points. Sometimes you reference resources or values that might or might not exist at that particular moment in time, and CDK will only figure that out when you actually try to deploy. So for instance, one thing that happened to me a few times is that I was trying to read the content of an SSM parameter and base a piece of business logic on it, so that the type of resources I wanted to provision was determined by the value in that SSM parameter. And that doesn't always work. I mean, you need to do different things to make it work. So that was an interesting one, and it might be confusing why that happens. So that's a quirk, I would say, of CDK.
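One way to picture the SSM quirk: when your program runs, a value that will only be resolved at deploy time is represented by an opaque placeholder (CDK calls these tokens), so an ordinary `if` on it compares against the placeholder rather than the real value. The sketch below is a toy model under that assumption. `toyDeployTimeParameter` and the placeholder format are hypothetical and do not reproduce the real aws-cdk-lib behaviour in detail.

```typescript
// Toy model of synth-time versus deploy-time values. NOT the real
// aws-cdk-lib API: the function name and placeholder format are hypothetical.

let toyTokenCounter = 0;

// While the program runs (synthesis), the parameter's value is unknown,
// so the caller receives a placeholder string standing in for
// "whatever the deployment engine resolves later".
function toyDeployTimeParameter(parameterName: string): string {
  toyTokenCounter += 1;
  return "${ToyToken[" + parameterName + "." + toyTokenCounter + "]}";
}

const deploymentStage = toyDeployTimeParameter("/app/stage");

const resourcesToCreate: string[] = ["Queue"];

// This branch looks reasonable but can never be taken: the comparison runs
// during synthesis, when deploymentStage is still the placeholder.
if (deploymentStage === "production") {
  resourcesToCreate.push("Alarm");
}

console.log(deploymentStage);
console.log(resourcesToCreate);
```

The "different things" you need to do usually amount to either resolving the value while the program runs (aws-cdk-lib's SSM module has `StringParameter.valueFromLookup` for this) or pushing the decision down into the template itself, for example with a CloudFormation condition.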

Eoin: There's another thing. I mean, I think there's a lot of debate in the community and in public discourse around CDK's benefits versus its disadvantages. And one of the things that's cited quite often is that it's not a deterministic deployment path, because if the client code that's generating your CloudFormation changes, the CloudFormation template can change without you having changed any of the inputs.

So it's not really a deterministic path. We kind of talked about that when we're dealing with the disadvantage of abstraction. Another kind of disadvantage that I've seen referred to are the kind of cultural barriers that it can bring about in an organization or that it fails to deal with. And if you look at how things have gone over the past few years in cloud and the emergence of a DevOps culture, what you're trying to do is break down barriers and walls between operations and developers.

And if you want to be able to create these cross-functional teams that can build it and run it, I think there's a danger that using programmatic constructs and imperative languages can actually rebuild some of those walls. Think of people who have great expertise and experience in SysOps and are coming maybe from a background of using configuration management tools like Ansible and Chef and Puppet and lots of other tools, and even Terraform. Imagine coming along and telling them that this is better because it uses imperative languages, that imperative languages are real programming languages, and that they're better because they allow you to do all of this great object-oriented stuff.

I don't think that's a really genuine or helpful argument. These languages aren't inherently better at all. They're just more familiar to people who are used to writing software with business logic. It doesn't necessarily mean that they're a better tool for this job. In fact, there are plenty of arguments to say that infrastructure should always be declarative. So I think organizations need to think about this before just going all in, because sometimes what you can end up doing is isolating people in your organization who have really good skills and SysOps experience, essentially gatekeeping by saying you now need to have this set of skills in order to be able to do infrastructure in a modern software application. And I don't think that's the case. I think we should be able to have cross-functional teams where we meet people where they are and understand that, for infrastructure, imposing your set of tools on them isn't necessarily the best thing. I think that's a really important disadvantage that can emerge, and something that should slow people down from adopting it just for the sake of it because it seems easier to get started.

Luciano: Yeah, that's an amazing point. I totally agree on that. And it will be interesting to see what the industry decides, I suppose, in a few years: whether tools like CDK are going to be more mainstream or whether eventually we are going to go back to more declarative approaches, maybe, I don't know, with different tools or different languages.

Eoin: So how do you think people should get started then if they're, I guess, people are going to come at it from different angles. Maybe people are using CloudFormation already or using something else. What's the best place to get started with CDK?

Luciano: Yeah, I think I would like to suggest a little bit of a backward approach because what we say is that the main advantage of CDK is that it's a level of abstraction and you can deploy things probably quicker than you will do with just writing CloudFormation from scratch. But at the same time, we say that there is a danger that if you do that, you're not going to really know what's going on really at the stack level, like what kind of resources are you going to end up deploying.

So what I would like to suggest, and maybe this is a little bit of an experiment, so please let us know what you think if you try it, is to start by using CDK almost like CloudFormation. And by that I mean, use just level one constructs. So you are literally just writing CloudFormation, but in something like TypeScript. And that will give you a good introduction, in my opinion, to the tooling around it and to what's really happening in the different phases.

And then from that point on, you can start to avail of the different abstractions. You can use level two constructs or level three constructs or third-party constructs to get a higher level of abstraction where maybe, I don't know, you just want to create a VPC and you are okay with some of the defaults. But at that point, you should be comfortable enough to know where to check to see what's actually going on behind the scenes. So this is what I'm suggesting. Start from the lowest level and then add abstraction as you feel more and more comfortable and as you feel you understand what those abstractions are really doing for you. So maybe that will give you fewer surprises, I would say. Maybe it's a little bit more painful to reap the benefits of CDK, but it's also probably a safer approach, with fewer surprises at the end of the day.

Eoin: I think that's good. Yeah, it's good to not jump into these things feet first and go straight into these high levels of abstraction. So that makes sense. And sometimes I've actually used CDK just to generate CloudFormation so I can see what syntax I should be writing manually, and then I just create the template. So you can always use it in that mode of operation as well. Maybe that's the best of both worlds. Okay, so I think given that we've finished up on how to get started and we've covered all the disadvantages and advantages we can think of, it's time to finish up and ask anybody who's listening for feedback: let us know what you think in the comments, share it with your colleagues and friends, and let us know how you get on with CDK. And if you've enjoyed the episode, give us a thumbs up as well and follow and subscribe. So we're going to see you in the next episode. Thanks very much for listening and goodbye.