AWS Bites Podcast

109. What is the AWS Project Development Kit (PDK)?

Published 2024-01-12 - Listen on your favourite podcast player

This episode of the AWS Bites Podcast provides an overview of the AWS Project Development Kit (PDK), an open-source tool to help bootstrap and maintain cloud projects. We discuss what PDK is, how it can help generate boilerplate code and infrastructure, keep configuration consistent across projects, and some pros and cons of using a tool like this versus doing it manually.

Is PDK something you should use for your cloud projects? Let's find out!

AWS Bites is brought to you by fourTheorem, the ultimate AWS partner for modern applications on AWS. We can help you to be successful with AWS! Check us out at fourtheorem.com!

In this episode, we mentioned the following resources.

Let's talk!

Do you agree with our opinions? Do you have interesting AWS questions you'd like us to chat about? Leave a comment on YouTube or connect with us on Twitter: @eoins, @loige.

Help us to make this transcription better! If you find an error, please submit a PR with your corrections.

Luciano: Ever find yourself bootstrapping new cloud projects, endlessly copy-pasting config files and drowning in repetitive tasks? What if you could ditch the copy-paste chaos and channel all of that energy into real value, crafting your project business logic itself? Today we are going to discuss a new AWS open source project that aims to address this problem. It's called AWS Project Development Kit, aka PDK. If you stick around, you will learn what PDK is, what kind of features it supports, how to get started with it, and finally we will also disclose our opinion on it, what are the pros and cons, and whether you should be using it or not, depending on the kind of company you work with. My name is Luciano and I'm joined by Eoin for another episode of AWS Bites podcast.

Eoin: We have discussed CDK in the past, but what is PDK?

Luciano: Yeah, it's another three-letter acronym, so it might be more confusing than ever with all the variety of tools that exist in AWS. We have a new one with another acronym: the name PDK stands for Project Development Kit, and it's relatively new. I actually checked, and the first commit was more than two years ago, so it has been going on for a while, but I think only now it's starting to get a little bit of traction.

Maybe it's becoming a little bit more stable. We only discovered it last week, so it came onto our radar very recently. So the idea is that if you actually have the problem that we discussed in the introduction, so you spin up projects and projects, you work in a very active company with lots of stuff going on, you probably end up copy-pasting a lot, with lots of boilerplate being somehow managed, and this tool is trying to give structure to that kind of problem.

So how do you consistently put up new projects? That's the goal of the tool. And just to give you an example, if you start a JavaScript project, you probably know how many configuration files you need. You probably need a TypeScript config, linting config, coverage config. You will have some kind of CI/CD, and when you start to add on top of that also infrastructure with CDK configuration and all that kind of stuff, it only compounds, and you have all these massive configuration files that you have to manage over and over and over again, and then every time there is a change in your organization, good luck trying to keep everything standardized and up to date.

So that's kind of the problem space that this tool tries to address. And the example that you get from the website is basically you can build a monorepo, so that seems to be kind of one of the main cases that the tool tries to address. So when you build a monorepo, you can also have additional things like how is the build pipeline going to work? Maybe I only want to build the projects that are actually changed from my latest commit, and you need to have some kind of mechanisms to do all of that consistently.
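To make the "only build the projects that actually changed" idea concrete, here is a minimal sketch in TypeScript. It is not how PDK implements this; the `packages/<name>/` layout, the `PackageGraph` type, and both function names are assumptions made up for illustration. It maps changed files to packages, then adds every package that depends on a changed one.

```typescript
// Hypothetical dependency graph: each package lists the packages it depends on.
type PackageGraph = Record<string, string[]>;

// Map changed file paths to the packages that own them,
// assuming a packages/<name>/ directory layout.
function changedPackages(changedFiles: string[]): Set<string> {
  const changed = new Set<string>();
  for (const file of changedFiles) {
    const match = /^packages\/([^/]+)\//.exec(file);
    if (match) changed.add(match[1]);
  }
  return changed;
}

// Expand the changed set with every package that depends,
// directly or transitively, on a changed package.
function affectedPackages(graph: PackageGraph, changed: Set<string>): string[] {
  const affected = new Set<string>();
  changed.forEach((pkg) => affected.add(pkg));
  let grew = true;
  while (grew) {
    grew = false;
    for (const pkg of Object.keys(graph)) {
      if (!affected.has(pkg) && graph[pkg].some((dep) => affected.has(dep))) {
        affected.add(pkg);
        grew = true;
      }
    }
  }
  return Array.from(affected).sort();
}

// Illustrative monorepo: names are invented for this example.
const graph: PackageGraph = {
  api: ["shared"],
  website: ["api"],
  infra: ["api", "website"],
  docs: [],
  shared: [],
};

const changed = changedPackages(["packages/shared/src/index.ts", "README.md"]);
console.log(affectedPackages(graph, changed));
// → [ 'api', 'infra', 'shared', 'website' ]
```

A change in `shared` ripples out to everything that depends on it, while `docs` is skipped. A real build tool would typically add caching on top of this.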

Generally, you also have caching. You might have dependencies between the different projects that are part of your monorepo. So this tool also tries to give you all the tools to manage all the different problems in a consistent way. It is built on top of CDK and Projen, so two very interesting technologies. We have been speaking about CDK before, and we have an entire episode dedicated to that, so we will have a link in the show notes. But if you've never heard of CDK, it's pretty much a way to do infrastructure as code using programming languages. So rather than using YAML, you can use TypeScript, Python, Java, Go, and so you can think of it as Terraform or CloudFormation, but using those languages and variables and classes and instantiating constructs rather than defining declarative code.

Eoin: Okay, so yeah, I mean, we like CDK for a certain number of cases. I think we've both got our reservations when it comes to wider CDK adoption, but we know why it shines. It's really good for bootstrapping and prototyping and building dynamic infrastructure as code, because you just create those classes and it can create a load of boilerplate for you. So I can kind of understand that if you've got project templates and all that boilerplate that you have with new projects, especially with full-stack projects and monorepos, it gets to be a lot, right? And with copy-pasting everything every time, it's really very difficult to get a handle on it and then keep it up to date over time as some of those dependencies change. So I'd say, okay, there's definitely a problem to be solved here. I guess I haven't seen a perfect solution to this. We've seen some odd things in the past. And you mentioned Projen. So Projen is something I hear about from CDK experts. I hear Matthew Bonig and other CDK Day speakers talking about Projen from time to time. Do you know much about it? Can you describe how it fits into the ecosystem?

Luciano: Yeah, it's probably worth detailing a little bit more what Projen is. By the way, it's spelled P-R-O-J-E-N. So it sounds like project, but it's Projen instead, which I always find a bit of a funny name. Anyway, it's an open-source tool that was started by Elad Ben-Israel. And there is actually a very interesting talk with a demo that summarizes very well in just 15 minutes what the project does.

And you can find it from the... It's one of the talks from a CDK Day from a couple of years ago. So we will have the link in the show notes if you want just a 15-minute intro of what it does and how it works. But the idea is that it basically allows you to synthesize project configuration files. We discussed before, if you start a JavaScript project, probably you need a package.json, a tsconfig.json.

You will need some kind of .gitignore, ESLint configuration, Jest, and all that kind of stuff. So the idea is that it can synthesize all of that stuff for you in an opinionated way, but you can probably pick and choose from different templates. And the more revolutionary idea is that it doesn't just do that once. Like you don't just bootstrap your project and that's it, and then you take it from there.

It is a way to manage all these files going forward as well. So these files that are created, you can kind of consider them read only. So you're not going to touch them manually anymore. But instead you're going to be using a more, let's say, consistent configuration, where that configuration gives you a programmatic interface that looks like the one you get with CDK. So you write typed code with classes and variables and so on.

And in this code, you actually try to describe exactly what it is that you want to generate. For instance, if you want to generate a front end with React, it will know how to generate code for all these configuration files that are suitable for that particular environment. And of course, you can tweak things. So you can change configuration options that maybe you want to change, that may be different from what you get by default.

And it's basically a CDK, but for generating project templates with all the related configuration. So if you use something like Projen, you can spin up multiple projects in your organization. And then if there is a change that you want to apply across the board, you can easily just change one line of configuration and reapply to all the projects. And basically will align all the configuration across all of them, which seems very powerful. So again, it's just a tool to generate configuration, but also tries to give you a way to keep that configuration consistent, which is something that is missing in most of the other tools for generating boilerplate or configuration or just bootstrapping projects.
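The "configuration as typed code" idea described above can be sketched as follows. This is deliberately not the Projen API: `ProjectOptions`, `synth`, and `orgDefaults` are hypothetical names invented for this illustration. The point is only that generated files are derived from one typed description, so changing a single default and re-synthesizing updates every project.

```typescript
// Hypothetical project description. These option names are illustrative
// assumptions, not Projen's real options.
interface ProjectOptions {
  name: string;
  nodeVersion: string;
  strictTypeScript: boolean;
}

// Synthesize the generated (treated as read-only) files
// as a map of file path -> file content.
function synth(options: ProjectOptions): Record<string, string> {
  const packageJson = {
    name: options.name,
    engines: { node: `>=${options.nodeVersion}` },
    scripts: { build: "tsc", test: "jest" },
  };
  const tsconfig = {
    compilerOptions: { strict: options.strictTypeScript, outDir: "dist" },
  };
  return {
    "package.json": JSON.stringify(packageJson, null, 2) + "\n",
    "tsconfig.json": JSON.stringify(tsconfig, null, 2) + "\n",
    ".gitignore": ["node_modules/", "dist/"].join("\n") + "\n",
  };
}

// Organization-wide defaults live in one place...
const orgDefaults = { nodeVersion: "20", strictTypeScript: true };

// ...and each project only states what is specific to it.
const projects: ProjectOptions[] = [
  { ...orgDefaults, name: "orders-api" },
  { ...orgDefaults, name: "billing-api" },
];

for (const project of projects) {
  const files = synth(project);
  console.log(project.name, Object.keys(files));
}
```

Bumping `nodeVersion` in `orgDefaults` and running the synthesis again would rewrite `package.json` for both projects, which is the "change one line and reapply" workflow in miniature.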

Eoin: Okay. Is this the kind of project, do you think, where you might have a platform engineering team or something maintaining these constructs, this Projen stuff? And they would say, okay, there are new versions of all of this front end stuff or even backend libraries coming out all the time. So we'll have maybe a test project and a build pipeline that will test the generator for the project. We'll check that it all works and builds together, and then people will have an easier time upgrading and aligning to the latest version and keeping all of their dependencies kind of evergreen in the individual projects in a bigger company.

Luciano: Yeah. I haven't used it myself outside of just quick demos, to be honest, but I can see organizations using it that way. And you can also take it to the next step and even automate that. So you can have pipelines that will look for updates on, I don't know, common dependencies or maybe look for vulnerabilities and automatically try to apply the updates and run all the tests. And if everything is green, they can just submit a pull request. So something like the Dependabot you get with GitHub, but applied to your project configuration rather than just your own dependencies. So that could be a cool use case, but I haven't seen it, to be honest. I'm just making it up on the fly because it kind of makes sense.

Eoin: Okay. It sounds kind of valuable. Every time I go back to a project I've written even like a year or two years ago, a full stack project, I find there's so much that has stagnated or decayed because everything moved on and Node versions are out of LTS and frontend libraries have all these vulnerabilities in them. So you can't just leave code alone anymore. You have to constantly refresh it with the latest dependencies. So I can see how that might help. How do we then get started with Projen? What's the first hello world we need to think about?

Luciano: Yeah. So Projen will be part of your PDK. So you don't really have to think about it too much because PDK is, let's say, an additional layer of abstraction that uses Projen and CDK. So if we look at the quick start guide that you can find in the PDK documentation, the idea is that you can start by creating a monorepo and then on that monorepo you can add additional things. So some of the things that you can do are, for instance, you can create an API backend and you can use either OpenAPI or Smithy.

It can also generate documentation in a variety of formats for your own APIs, for your own frontend. It can generate infrastructure as code. For instance, if you want to ship your API as Lambdas, which I think is the default, it will generate all the infrastructure as code using CDK for that API layer to be deployed in Lambda. It can also generate libraries. For instance, one thing I found interesting is that it's very easy if you create an API and you want in the future to create a React frontend.

It can also generate a library that contains React hooks that are built with all the auto completion and all the models that you have in your own API. So it makes it much easier for you to then call that API from the frontend because you don't have to think about how do I do this specific API call. You just use the hooks and you have a much more expressive experience that way. It can also generate the boilerplate code for your Lambda handlers, React frontends for single page applications, deploying them on S3.

It can also use Cloudscape, which is the design system published by AWS. And finally, one of the cool things that I think are very interesting is that it can also generate diagrams. So for your own architecture, based on the CDK code that you have, as you build, it also spits out diagrams that you can visualize to make sense of the architecture that was generated. So to get started, what you need to do is you need to have Node.js installed.

And then at that point, you can run npm install aws-pdk. And that will just install the CLI tool. It might require some additional dependencies. Of course, you need to have the AWS CLI installed. It might need to use Git, depending on the modules you use. It might also need to use Java and Maven, again, depending on the kind of modules that you're going to be using with PDK. So you might need to install additional dependencies.

But generally speaking, having Node.js and the AWS CLI should be more than enough to get started. Then you can scaffold a new PDK project. And this is a bit of a meta statement: you can use a generator to generate a generator by saying pdk new and the template. For instance, pdk new monorepo-ts will generate the basic PDK project to manage a TypeScript-based monorepo. And once you've done that, you can already browse around.

So it will generate a bunch of files for you. And you can see, for instance, that it created a .projenrc file, which is the starting point where all the configuration lives, and that you can modify to add additional bits and pieces and then make changes to the project structure. So at this point, if you run pdk, it will actually do all of that. It will generate all the files. But then you can also run pdk build, which is kind of a wrapper around running build on all the packages, basically.

And if you do that, since it's going to create also CDK stuff, you can also go inside. I think it's going to be packages/infra. And you can see all the CDK stuff that was generated and built. And inside that folder, you will also find the diagrams for the architecture that was generated. So this is one more way to generate diagrams. We have a dedicated episode on how to generate diagrams for AWS.

If you use this tool, it's an additional way. So if you're interested in this topic, we'll link to the previous show where we talk about creating diagrams for AWS. And then finally, there are additional commands. For instance, this is only local so far, but at some point you will want to deploy all of this code to AWS. So PDK as a wrapper gives you a high-level command called pdk deploy, which can deploy in dev mode using CDK hotswap, or for production.

And basically it will deploy everything that needs to be deployed. And finally, if you are just testing things out and you want to destroy everything, there is a pdk destroy command that will clean up all the deployed infrastructure. So I guess the next question will be where do you go from here? So this is just a quick run-through on how you get started and what you can do with it.

But if you look at the documentation, there are more examples. There is an interesting guided tutorial where you're going to be building an API and a front end, then adding login using Cognito to this front end. And finally, you can also add more advanced capabilities. It's kind of a shopping list application where it's a little bit more interactive and you get to see all the moving parts. And there is also a developer guide that covers specific modules. So you can see, for instance, what the static website module does in detail and all the other modules that are available by default. So we will have links to all these tutorials and the detailed documentation in the show notes if you want to take it from there.

Eoin: Okay, cool. And I've already fallen into the habit of mixing up Projen and PDK. But if we're looking at PDK, then I just tried, as you mentioned, npm installing it. And I can see that when I want to do a pdk new, it gives me options of doing a React website, infrastructure projects, and then the monorepo. You mentioned monorepo TS, but it looks like there's also monorepo Java and Python. So those are the languages that are supported. Is that right? Java, Python, and TypeScript?

Luciano: I think there is a difference between the code that you write, which probably depends on what Projen supports, and the code you actually generate, where you can probably create your own modules and generate pretty much anything if you really want to make an effort. So, yeah, in a sense, I think we need to distinguish between the code that you can use to manage PDK and Projen and the code that you can generate. And I think, yeah, you probably have options on the first one, but on the other one, it's pretty much up to you. You will have good defaults if you stick with Python or TypeScript and standard ones. But if you want to go wild and use, I don't know, Elixir, you can probably still do that as long as you put the effort into generating all the modules for that.

Eoin: Okay, nice. Yeah, it looks like those are the Projen supported languages from what I can see anyway, those three. Okay, which I guess will cover bases for a lot of people, but there's no C#, it seems, which is supported in CDK. So maybe that's something that will come down the tracks. I think that was a pretty good overview. What about your opinion then? What's the good and bad? What are the pros and cons as we look at the current state of PDK?

Luciano: Yeah, let's start with the good things. I think it's great to have a tool that tries to standardize the way that you bootstrap projects, but not just bootstrap, but also maintain them over time, which I think is where most of the other tools fail, because it's easy to generate a template and just copy paste it. There are millions of tools that you can use to do that and customize things a little bit. But then after that, you're pretty much on your own and then projects will drift over time and then it becomes harder to add that level of consistency. A tool like this can actually be a solution to that problem. Now, this is a vague statement intentionally because we haven't used it to the level that we can say that it actually fulfills that objective, but it seems very promising.

The other thing is that it's also focusing on giving you some degree of standardization when it comes to documentation, through using OpenAPI and other tools to generate documentation and diagrams. And this is one of the big pain points that I see, especially going to larger organizations. Documentation is always very inconsistent. Sometimes it's lacking. It's not up to date. So a tool like this can also address that kind of problem and encourage people to be consistent, and actually make it very easy to have that level of consistency. And the other thing is that if you like CDK, I think it's a very similar experience, so it will feel familiar enough. So I'm going to put that on the good side. It is not going to try to reinvent a new paradigm, but it's giving you something that, if you have used CDK, should be pretty familiar. Do you want to ask?

Eoin: Okay, so that's the good part of PDK. Are there any cons? What do you think? What would stop people from using it today at least?

Luciano: Yeah, this is a very opinionated take based on my quick experience with PDK. So I hope it's not going to sound too harsh because that would be unfair. But the first thing that kind of made me a little bit, I don't know, not enjoy the experience is that it feels very opinionated. There are lots of starters and all the starters are very opinionated in a way that is very AWS heavy, which is not necessarily a bad thing. But it feels like the tool wants to encourage people to use more of these AWS things that maybe are not necessarily the best in class for the specific topics of interest. For instance, you get to use CDK for infrastructure as code, which we know is a tool that does the job, but it's AWS specific for the most part. So AWS is already pushing that kind of tool onto you if you use PDK. Similarly, Projen is probably a little bit more generic, but has a lot of ties with CDK and other AWS tools. So you will probably see Projen more in the AWS space than in other ecosystems. Then there is Smithy, which is the internal tool from AWS to generate types and models and then create APIs based on that. You can also use OpenAPI, but Smithy seems to be kind of the default. Then there is Cloudscape.

Eoin: And does that support GraphQL or is GraphQL outside of this completely? Because I know everything that you said Smithy could do seems to be already supported in other GraphQL ecosystems and tooling like Apollo, etc.

Luciano: Might be. Honestly, I haven't looked in detail. I expect that there will be modules for that to make it easier. I don't know if they are already built in or if you can easily create your own. The other one is Cloudscape. So there is this relatively new design system that was published by AWS. And you might argue that it is not the most famous design system, at least not yet. But of course it's a tool coming from AWS. So they had to pick the design system from AWS. And then there is Cognito for user management. So again, it's just that it's very opinionated in a too AWS-heavy way.

So you probably need to go the extra mile if you want to use other options out there and create all your own starters, starter modules and use, I don't know, maybe you don't like Cloudscape. You want to use Bootstrap or Material Design. You're probably going to be a little bit on your own to figure out how to migrate that original Cloudscape module into something else. And then there might be a little bit too many dependencies. So you need to have Node.js, CDK, probably Java, Maven, AWS CLI.

So you already need to buy in into the entire developer ecosystem of AWS plus additional new things if you want to use this tool. So maybe the starting point, if you're just starting from scratch, might feel like there is a little bit too much work involved into just getting the development environment ready. And I think this is the main one that I have now is that it can feel quite complex, especially at first.

And if you think about it, that is because there are many layers of abstraction. So when do layers of abstraction become too many? I don't know. But it feels like we are getting to a limit here because you basically have a project to manage projects and you basically have to bootstrap this project to manage projects. And then every time you have to do a change, you will have this PDK making changes into CDK, which will make changes, for instance, into generated CloudFormation.

So there are so many levels of generating code. So if you have an issue, first of all, good luck trying to figure out where the issue is, then figuring out at which level you actually need to apply the fix and how to do that. And then I expect that there will be so many escape hatches sometimes just because maybe something is not supported at one level and you need to figure out exactly how to do a workaround.

So, yeah, I guess having too many layers just adds complexity. And with that level of complexity, the development experience might not always be nice, especially when you have to deal with bugs and things to fix. So my final, I guess, judgment is that it feels like something that AWS wanted to have internally because they probably built lots of examples, lots of projects, and they have their own preference on how to do all of that.

And that's probably why all the main tools that are coming as default ones are AWS specific tools. And they just made it open source because other people might find it useful. But right now the state of the project makes me think that this started as an internal AWS thing, as a convenience thing, and now it's maybe getting into general adoption. So all my opinions might get better over time as maybe with more general adoption, there will be more variety in that ecosystem. But right now, it just feels like something fresh out of AWS that only AWS can really get value out of.

Eoin: Got it. That makes sense. I think there's probably also a benefit in the fact that, if it is internally used, it probably means that it's more likely to be supported and grow in the future. And that there are people inside AWS who are going to be passionate about it. And it's not like some sort of side project that they've launched as open source and it's kind of a solution looking for a problem. If they're already using it, you know, I've seen Smithy pop up as a dependency in the new SDKs. So they're obviously using this ecosystem, and that might actually bode well for the future. Who knows? What's the general community sense around PDK then and adoption? Have you seen much traction?

Luciano: I've seen a conversation on Twitter, mostly started by Vlad Ionescu, an AWS Hero. So we will have the link in the show notes if you want to read all of that. He seems to be very skeptical and I'm not going to try to quote his opinion too much because I don't want to paraphrase it. But yeah, I guess the feeling is that this is not necessarily a tool that he wants to use, but the thread is interesting because other people chimed in and they bring different opinions as well. So I think it's worth reading if you're looking for other perspectives on the tool.

Eoin: Okay, fair enough. And if people don't feel like adopting PDK, is it the only show in town or would you recommend any alternatives if people have the same problems that we're trying to address here?

Luciano: Yeah, I guess in terms of bootstrapping projects, the most famous ones that I know are Yeoman, which was, I'm going to say was because I'm seeing it less and less in the last few years. But it was the tool of choice for all things front end. So all things like JavaScript projects, front ends, APIs, and so on. While in the Python space, probably people will be a lot more familiar with something like Cookiecutter, but the concepts are the same.

You can define a template, you can define some kind of configuration. And then when you install the template, you can combine that template with your own configuration and will generate a project with some degree of customization. And I think both projects after that step, like you are on your own, you need to figure out how to keep it going and how to keep that consistency. So in that sense, PDK might be a little bit better trying to address also what is the next step after you generated the project?
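The "template plus your own configuration" step that generators like these perform can be sketched in a few lines. This is a toy illustration, not the Yeoman or Cookiecutter implementation: the `{{ key }}` placeholder syntax, the function names, and the sample template are all assumptions made for this example.

```typescript
// Replace {{ key }} placeholders with values from the answers.
// Placeholders with no matching answer are left untouched.
function render(template: string, answers: Record<string, string>): string {
  return template.replace(
    /\{\{\s*(\w+)\s*\}\}/g,
    (placeholder: string, key: string) =>
      Object.prototype.hasOwnProperty.call(answers, key) ? answers[key] : placeholder
  );
}

// Render both the file paths and the file bodies of a template.
function generate(
  templateFiles: Record<string, string>,
  answers: Record<string, string>
): Record<string, string> {
  const output: Record<string, string> = {};
  for (const path of Object.keys(templateFiles)) {
    output[render(path, answers)] = render(templateFiles[path], answers);
  }
  return output;
}

// Hypothetical template: file paths and contents are invented for illustration.
const projectTemplate: Record<string, string> = {
  "{{ project_name }}/README.md": "# {{ project_name }}\n\nMaintained by {{ team }}.\n",
  "{{ project_name }}/package.json": '{ "name": "{{ project_name }}" }\n',
};

const generated = generate(projectTemplate, {
  project_name: "orders-api",
  team: "payments",
});
console.log(Object.keys(generated));
// → [ 'orders-api/README.md', 'orders-api/package.json' ]
```

Note that this is a one-shot operation: once the files are written, nothing ties them back to the template, which is exactly the "you are on your own" limitation described above.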

How do you guarantee that degree of consistency? So interesting different approaches there, but the tools are somewhat similar. Then we might also talk about specific generators. For instance, I don't know if you do React, there are many ways to generate React projects. The most famous is Create React App, but there are lots of other alternatives. Or I don't know if you do Lambdas, there are many generators. For instance, if you use Rust, we spoke before about Cargo Lambda.

So that's another very domain specific tool that allows you to generate projects. And then there are other approaches that are probably the most common that I see all the time when people just create a repository on GitHub, they publish it as a template. And then in GitHub itself, you can easily just say use this template and it's going to clone that repository for you in your own space. And then from that point on, you are on your own and you have just a starting point.

And there are CLI tools like degit, which basically does the same thing. It's just going to clone a repository for you and clean up all the Git artifacts, all the Git history. So you just have a clean slate starting from whatever was published in the specific repository that you are starting from. And one more alternative that I believe is worth mentioning is Terraform. One of our colleagues, Conor Maher, did a very good demo. We will have the link in the show notes showing that you can use Terraform to also bootstrap projects and have some degree of consistency.

Because there are modules, for instance, for Git to create repositories, integrate with GitHub or other repositories. And then from there, you can add a bunch of additional things. You can bootstrap scripts that, for instance, will install dependencies and so on. So even a tool like Terraform can be used to do something like this. And it will be a little bit more general purpose because you can use it also to bootstrap other types of projects that don't necessarily have to be AWS specific.

Eoin: Okay, nice. Yeah, I could see that. Looking at the list of existing tools like Yeoman and Cookiecutter, I've used them and they're great for bootstrapping projects, but not so great at keeping them up to date and retrofitting upgrades into an existing project. And I can see how PDK is trying to address that a little bit, but it seems like there's a lot of complexity. Kind of makes me ask the question, do you want to use these tools at all? Or are you just better doing it by hand, trying to communicate within an organization and provide references and templates for people just to copy paste and then just use some other good hygiene or automation to keep things up to date? Do you need a project management tool like PDK?

Luciano: I think the answer, as with every good technical question, is: it depends. And I will answer with another question, which is what is the value here? What are we trying to achieve? I think the most important thing is trying to reach that level of global consistency. And by that, I mean that you probably want to get new projects started very quickly. You want to embed best practices in your templates. You want to reduce the amount of choices that developers have to face every time they are starting a new project.

And also you want to create projects that are similar enough that it should be easy for developers to cross collaborate, or maybe if they have to change team to make the transition as easy as possible. So basically you want to have a tool that keeps people productive, and it makes people focus on the things that matter, which is probably the business logic and not all this layer of configuration that you are repeating over and over and changing maybe slightly enough to create surprises.

But the other question is, when do you really need all of this? Because if you are a small scale company, or maybe a company that just started, you probably don't need all of this. You actually really want to go through the trouble of building projects from scratch to really understand what is the setup that works best for you? What are the technologies that you prefer to use for the specific domain you are in? And if you use this kind of boilerplate or starters, they will have their own opinions that maybe are not very suitable for your specific organization.

So you might be biased into using something that is not necessarily the best option for you. So that's kind of one of the risks that if you are a new organization, maybe is not the best option. So you probably want to spend a little bit of time figuring out yourself what is the best way for you to build software. But even for larger organizations, I think there is a little bit of a risk that you might end up stagnating in your technology choices.

You might end up just doing projects always in the same way. You might not embrace new technologies or new ways to build software. Or even worse, sometimes you need to build something very specific. And just because you have a boilerplate to build an API in a certain way, you just go with that without even thinking what are the trade offs. And sometimes you end up just using a hammer for everything, even though sometimes you don't necessarily need that hammer. You might need another tool.

So I think the risk with all these tools is just it's good when you need to do something very similar to avoid all the boilerplate and all the repetition. But we know that software projects are always different and in subtle ways sometimes. So having a little bit of freedom and picking the technologies and the approach that can be more suitable for the specific domain you're working on. I think it's something that you need to figure out how to guarantee anyway if you want to keep building high quality software. And the final thing is that one good use case where this might be very valuable is when you have compliance obligations. Because in that case, I think it's actually important to have some kind of template where you are sure that all the best practices are built in and you don't have to go through a compliance review over and over every time you do something new.

Eoin: That sounds like really good advice. And thanks for a very thorough overview of PDK this year. Is there anything we wanted to say to wrap up?

Luciano: I think in general, this is a very, I guess, personal opinion space. I think it's easy to have people with very different opinions. Some people actually enjoy the process of starting a new project and picking up libraries and maybe checking out what are the new things and whether they are worth using. I think there is actually a space there where you can be productive in ways that you generally cannot be productive when you're working on an existing project.

So some people might actually enjoy all of that work that might feel just like busy work, and there is value there for sure. So I'm really curious to hear people's opinions and see, like, do you actually do all of that work or would you rather just have a template that has made all of the decisions for you and just get it over with and focus on the actual business logic that you're building. And if you've used this tool or other tools, like, do you like them? What do you like? What don't you like? Maybe there is space for something else that we haven't covered today that can solve the problems we are discussing. So I think that brings us to the end of this episode. I hope that you enjoyed it. As usual, if you enjoyed it, please give us a like, subscribe, share it with your friends and colleagues, and we hope to see you soon in the next episode.