AWS Bites Podcast

27. How do you organize AWS Accounts?

Published 2022-03-10 - Listen on your favourite podcast player

Eoin and Luciano try to answer the question of which AWS accounts your team needs and how to organize them. In this episode of AWS Bites we discuss common ways to organize AWS accounts and environments from the perspective of a company running production workloads. We try to answer questions like “how many accounts and how many environments?”. We also discuss how you and your team can be more productive by effectively managing AWS accounts and environments. Finally, we explore some of the security and cost-related tradeoffs that come up when organizing AWS accounts.

Thanks to David Lynam for suggesting this awesome topic!

Let's talk!

Do you agree with our opinions? Do you have interesting AWS questions you'd like us to chat about? Leave a comment on YouTube or connect with us on Twitter: @eoins, @loige.

Help us to make this transcription better! If you find an error, please submit a PR with your corrections.

Luciano: Hello, today we're going to answer the question: what accounts do you need for your team? We will explore some of the common ways to organize AWS accounts and environments, from the perspective of a company running production workloads: how many accounts do they need, and how many environments? We will discuss how to become more productive with AWS accounts, and finally we'll discuss some of the common security and cost-related trade-offs. My name is Luciano, today I'm joined by Eoin, and this is the AWS Bites podcast. First of all, I'd like to say a big thank you to David Lynam because he suggested this awesome topic on Twitter. Please keep sending us your suggestions; we are more than happy to try to address common questions that you might have. I would suggest we start with a quick definition of what we mean by an AWS account and what we mean by an environment, just to make sure we are all on the same page.

Eoin: Okay, so an account, let's try to define that as the fundamental unit that you get started with on AWS. Most people will start off with one account, put in a credit card and grow from there, but you can add as many accounts as you need across your organization and some companies have hundreds of accounts. But an account is really a logical concept but it's a very strict security boundary as well because once you try to access resources in another account that's a cross-account access concern. So it's very much a walled garden for everything you're doing in AWS and it's very useful for isolating workloads. So I think one of the things that you need to think about when you're thinking about an account is what do you want to share with other people and what do you want to isolate, right? So that's a concern. Is there anything else we need to say about what an account is?

Luciano: Maybe I will only add that you can have this kind of tree structure for accounts so you can group them logically in a way that you will have like a master account and all the sub-accounts there, which is not always obvious. Organizational units as well are like these folders really for accounts. So yeah, that's important to know. Yeah, that's true.

Eoin: So what's an environment then? Because we talk about accounts and environments. Are they the same thing?

Luciano: So I would consider an environment more of a loose concept, I suppose, from the perspective of AWS. It's more about the way that you decide to build and ship applications throughout your pipeline. So probably you're going to have a development environment that you use only for building new features, and as you build you might want to test a few things manually. You might have other environments, for instance for QA or for all sorts of automated testing. Then finally you will have your production environment, which is where the software you are running is going to be accessed by your users, and that's the main way of delivering the product to your customers. So yeah, it tends to be a little bit more loose because AWS doesn't give you that concept in a structured way, and you'll need to decide how to apply it using accounts.

Eoin: So I guess the one environment that everybody would probably have is the production environment and after that it's going to vary?

Luciano: Exactly, yeah. So what do we think is the most common way to structure accounts and environments for, I suppose, a big enough company? It doesn't have to be extremely big, but maybe not just a one-person company. Maybe you start to have a few products and a few employees in the team. So what can we recommend?

Eoin: It used to be a lot more common to mix lots of different environments in one account. These days, in 2022, it's more common to have one account per environment, but also per application or domain or team. How you split that up will vary from company to company, based on the relationship between the number of products you have and the number of teams. Really you want to think about what security boundary you're trying to enforce and which resources you want to share and not share, and you want to avoid people stepping on each other's toes by sharing an account. You might find that you have to put constraints in place so that people don't delete other people's resources. If you're doing that, that's a sign that you need to split into multiple accounts, I think. But I think the AWS best practice is more or less one account per application per environment. Yeah.

Luciano: So if we want to do an example, let's take our e-commerce example again. If you have, I don't know, a product application and a fulfillment application, you might already have a few different accounts. Let's just say you have a dev and a production environment: you will have two accounts for the first application and two more accounts for the other application. So you can see how it's kind of a matrix of all the environments you have and all the applications you have, if you want very granular control in terms of boundaries.
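A minimal sketch of the application-by-environment matrix described above. The application and environment names and the `app-env` naming scheme are assumptions made for illustration, not something the hosts prescribe.

```python
from itertools import product

# Hypothetical application and environment names, used only for illustration.
APPLICATIONS = ["product", "fulfillment"]
ENVIRONMENTS = ["dev", "prod"]


def account_matrix(applications, environments):
    """Return one account name per (application, environment) pair."""
    return [f"{app}-{env}" for app, env in product(applications, environments)]


accounts = account_matrix(APPLICATIONS, ENVIRONMENTS)
print(accounts)
```

Two applications and two environments already give four accounts, which is why the number of accounts grows quickly with this approach.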

Eoin: It's probably worth mentioning as well that if you've got a team of five people producing the fulfillment application as well as the product application and the billing application, and those are really just microservices whose deployments can live quite easily together in one account, you don't need to have an account per microservice per environment. There's a bit of overhead in managing multiple accounts, because those are more things you have to monitor and take care of. So just because you have multiple separately deployable units in your application, it doesn't mean they always need to have a separate account. It's perfectly okay to start with one account if you've got just one team on that, because with a small team especially, you're probably going to be able to make sure that you don't have any conflicts there.

Luciano: Do we agree that I suppose the rule of thumb is how you structure your organization should be somewhat reflected in how do you structure accounts, right? Yeah, I think so. This is Conway's law, isn't it?

Eoin: Exactly. Your architecture is going to follow your team structure anyway. So maybe your account structure also fits into that.

Luciano: And I suppose, to summarize this concept, the benefit of doing that is that when you create accounts, you are trying to enforce autonomy on one side, but also security boundaries, as we mentioned. But there is another interesting topic, which is that every account comes with quotas. AWS will allow you to do a certain number of things depending on the services you use. And of course, the more teams are trying to use the same account, the more likely it is that you hit those limits, and that can become a bit of a complication. It can slow you down. So keeping accounts more granular also makes it easier to use every account responsibly without having to ask for increased quotas or do other things like that. Absolutely. Yeah.

Eoin: When we talk about one account per application per environment, we should also mention that it's pretty common to have shared tooling accounts, like for CI/CD, where you might put your pipelines and other things that aren't specific to the runtime environment of your application. There you have a different set of access criteria and concerns from your runtime environment, and it's good to keep those two things separate.

Luciano: It's kind of like a control plane that then can act on behalf of the other accounts, right? Yeah, that's it. Yeah. And it works pretty well.

Eoin: And you get very familiar then with providing cross-account access, which is a good skill and a discipline to have when you've got multiple accounts.

Luciano: So I suppose the next question is, do we think that every developer in a team should have their own account? Is that too much or are there benefits in that?

Eoin: It's a good goal to consider, but it's not necessarily required for people to be productive. And it kind of depends on what your account requirements are. Some applications might need a certain level of setup when you create an account. For example, you might need to go to AWS and get limit increases, and some of that goes through support and comes with a lead time. So you don't necessarily need to make it one account per developer and stand up a new account every time a new developer joins the team. You can certainly do that, and it provides a good level of autonomy. But if you've got a team that's collaborating with each other anyway, and you might have pairing sessions, you don't necessarily need one per developer. What we see quite often is that you have a pool of accounts and you just allocate them to people working on different features in a given sprint, say, in your team. And that tends to work pretty well.
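The pool-of-accounts idea can be sketched as a small allocator. This is only an illustration under assumed names (`AccountPool`, `dev-1`, the feature names); in practice the bookkeeping might live in a wiki page, a spreadsheet, or a chat bot.

```python
class AccountPool:
    """Track which shared development accounts are free or in use."""

    def __init__(self, account_names):
        self._free = list(account_names)
        self._in_use = {}  # account name -> feature being worked on

    def checkout(self, feature):
        """Allocate a free account to a feature, or raise if none are left."""
        if not self._free:
            raise RuntimeError("no free development accounts in the pool")
        account = self._free.pop(0)
        self._in_use[account] = feature
        return account

    def release(self, account):
        """Return an account to the pool once the feature is done."""
        del self._in_use[account]
        self._free.append(account)

    def assignments(self):
        """Return a copy of the current account-to-feature mapping."""
        return dict(self._in_use)


pool = AccountPool(["dev-1", "dev-2"])  # hypothetical account names
first = pool.checkout("checkout-redesign")
second = pool.checkout("search-filters")
pool.release(first)
print(pool.assignments())
```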

Luciano: Yeah, I agree with that. I actually prefer that approach because it encourages people to talk more with each other and try to figure out: okay, what are you working on? Maybe, because you're working in this environment, let's finish this feature together. With a single account per developer, instead, it's a lot more "this is my account, let me do my thing, and you can keep doing your own thing in your own account", which, I don't know, I guess encourages a non-collaborative environment at that point.

Eoin: I guess the main goal with having developer accounts is that people on the team have an environment where they can experiment freely and don't really have to worry about breaking things that matter to other people. I think there's a lot of people doing interesting things like in their continuous deployment pipelines then having automated processes. So when you create a pull request or when you create a branch, it'll automatically deploy a stack from that branch into an account. And then when you merge the PR and delete the branch, it will tear down that stack. So you can automate a lot of that stuff as well, and it can make it very productive for people so that if you've got lots of microservices they need to deploy, they don't have to go to the manual effort of doing that every time they open a new PR.
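One small piece of the branch-based deployment flow is deriving a stack name from the branch name. The sketch below assumes the stacks are CloudFormation stacks, whose names may only contain letters, digits and hyphens, must start with a letter, and are limited to 128 characters. The function name, the `app-branch` scheme, and the example branch are hypothetical.

```python
import re

MAX_STACK_NAME_LENGTH = 128  # CloudFormation limit on stack name length


def stack_name_for_branch(app, branch):
    """Derive a CloudFormation-safe stack name from a git branch name.

    Assumes `app` itself starts with a letter and is already a valid prefix.
    """
    # Replace every run of characters that is not a letter, digit or hyphen.
    slug = re.sub(r"[^A-Za-z0-9-]+", "-", branch).strip("-")
    name = f"{app}-{slug}" if slug else app
    return name[:MAX_STACK_NAME_LENGTH].rstrip("-")


print(stack_name_for_branch("fulfillment", "feature/JIRA-123_new_checkout"))
```

A pipeline could deploy a stack with this name when the branch or pull request is created and delete the stack with the same name when the branch is removed.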

Luciano: Yeah, that's a good practice because I suppose that will also help to keep those environments as clean as possible, so that when people move across these accounts, they have some expectations about where they are starting from. So I suppose the next topic is this: we are talking a lot about freedom and having different accounts where you can experiment freely, so we should probably also discuss whether there is something that you should restrict, and how. I suppose there are two ways of seeing this. You could be very defensive. You could say: okay, I'm going to allow you only specific actions in this account, so you can provision only specific resources, and that's it. You cannot do anything else. Or the opposite mindset could be trust but verify. You have more freedom, but you also have a way to detect what's going on and maybe realize if something dangerous is happening. So you can have monitoring or configuration tools that allow you to see: okay, somebody is provisioning a very big EC2 instance and that's going to affect cost, or something like that. Do you have any preference for one or the other?

Eoin: Yeah, I think the preference is to try and err on the side of putting in place just guardrails for people. You have the concept of detective guardrails, which are freer for your teams, because basically you're just keeping an eye on things. And then if someone does something that you don't particularly like, you work with them and you try to remediate the situation. Then the stricter approach is preventive guardrails, where you're putting in place things like service control policies and permissions boundaries and saying you can't do that.

If you want to do something new, you have to open a ticket. Ultimately it will come down to the real levels of compliance that you need in your business, and to trying to be as trustful as possible of teams, because that's the most productive way to do it: just let people move forward. And there are lots of things you can use, like CloudTrail and AWS Config, which allow you to observe what's going on and then get alerts. So say we don't want people to create buckets that are not encrypted. One of the things you could do there is put in place a Config rule, and then you get alerted when people create an unencrypted bucket. A much stricter approach is: don't allow people to create buckets, because only we know how to create buckets properly. That's something that will guarantee your compliance at all times, but it is going to slow down your teams, because creating buckets is something developers need to do quite frequently to try things out. So understand the tradeoffs there. And I think you can approach this also from a cost perspective.
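To make the detective approach concrete, here is a sketch of the kind of check such a rule performs. It does not use the AWS Config API; the inventory format (a list of dictionaries with `name` and `encryption` keys) and the bucket names are assumptions made purely for illustration.

```python
def find_unencrypted_buckets(buckets):
    """Return the names of buckets with no default encryption recorded."""
    return [b["name"] for b in buckets if not b.get("encryption")]


# Hypothetical inventory, e.g. collected by a scheduled job in each account.
inventory = [
    {"name": "team-a-logs", "encryption": "aws:kms"},
    {"name": "team-a-scratch", "encryption": None},
    {"name": "team-b-uploads"},
]

for name in find_unencrypted_buckets(inventory):
    # Detective guardrail: nothing is blocked, someone is notified instead.
    print(f"ALERT: bucket {name} has no default encryption")
```

The bucket still gets created in this model; the team is alerted afterwards and can work with the owner to fix it.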

Luciano: For instance, you could have other rules that prevent people from spinning up very expensive EC2 instances, or maybe rules at the regional level, because we know there are certain regions that are more expensive than others, and most likely you are not going to need to use all the possible regions. So that's another rule that can help you to limit the surface without restricting too much the kind of activities that developers might want to do. Yeah, that's a really good one.
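A sketch of what such cost-oriented preventive rules could look like as a service control policy, built here as a Python dictionary and printed as JSON. The allowed regions and instance families are assumptions, and the list of exempted global services is illustrative rather than complete, so treat this as a starting point to review and not as a ready-made policy.

```python
import json

ALLOWED_REGIONS = ["eu-west-1", "us-east-1"]   # assumption for this sketch
ALLOWED_INSTANCE_TYPES = ["t3.*", "t4g.*"]     # assumption for this sketch


def build_dev_guardrail_scp(allowed_regions, allowed_instance_types):
    """Build an SCP document that limits regions and EC2 instance types."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOtherRegions",
                "Effect": "Deny",
                # Global services are exempted; this list is not exhaustive.
                "NotAction": ["iam:*", "organizations:*", "sts:*", "support:*"],
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": allowed_regions}
                },
            },
            {
                "Sid": "DenyOtherInstanceTypes",
                "Effect": "Deny",
                "Action": "ec2:RunInstances",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    "StringNotLike": {"ec2:InstanceType": allowed_instance_types}
                },
            },
        ],
    }


scp = build_dev_guardrail_scp(ALLOWED_REGIONS, ALLOWED_INSTANCE_TYPES)
print(json.dumps(scp, indent=2))
```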

Eoin: People might say, and actually maybe it's worth calling this out: doesn't that violate the principle of least privilege, if you're being so permissive with accounts? Because in cloud security, the important thing is the principle of least privilege, not being over-permissive with permissions. I think we're maybe talking about two different things. Certainly in your production and test accounts, you need to be enforcing least privilege. In development accounts, you want people to be able to be productive. So you can encourage least privilege there by analyzing CloudTrail, by retrospectively looking at things using IAM Access Analyzer, and by trying to continually refine the IAM policies that people use. It doesn't necessarily mean locking down everything by default. No, that's a very good clarification.

Luciano: For instance, I totally like the approach that you can experiment as much as you want in the development account. But as you do that, one principle that I really like to follow, and encourage other teams to follow, is this: when you are building an app, don't give it any IAM permissions as a starting point. Then every time you get a failure, because of course you cannot do a specific action, I don't know, read from or write to an S3 bucket, enable permissions gradually. That way it's easier to end up with a very locked-down IAM policy for that particular service. And you can totally do that in a testing account. So yeah, it's a good point to separate how you, as a developer, define IAM policies for your services from what you, as a developer, can actually do in an AWS account. So what about non-developer accounts then?
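The "start with nothing and add permissions as you hit failures" workflow can be sketched as incrementally building an IAM policy document. The helper names and the bucket ARN below are hypothetical; only the policy document structure and the S3 action names are standard IAM.

```python
import json


def empty_policy():
    """Start from a policy that grants nothing."""
    return {"Version": "2012-10-17", "Statement": []}


def allow(policy, action, resource):
    """Grant one action on one resource, reusing the statement for that resource."""
    for statement in policy["Statement"]:
        if statement["Resource"] == resource:
            if action not in statement["Action"]:
                statement["Action"].append(action)
            return policy
    policy["Statement"].append(
        {"Effect": "Allow", "Action": [action], "Resource": resource}
    )
    return policy


BUCKET_OBJECTS = "arn:aws:s3:::example-orders-bucket/*"  # hypothetical bucket

policy = empty_policy()
# First access-denied error: the service could not read an object.
allow(policy, "s3:GetObject", BUCKET_OBJECTS)
# Second access-denied error: the service could not write an object.
allow(policy, "s3:PutObject", BUCKET_OBJECTS)
print(json.dumps(policy, indent=2))
```

Each addition corresponds to one observed failure, so the resulting policy contains only the actions the service actually needed.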

Eoin: What else do you need? Because we focused on, we mentioned production accounts and developer accounts. What about testing? Yeah, that's a good point.

Luciano: I've seen at least two different ways of working, and I think they are equally good. One is probably a little bit more traditional. You start from a development account, you build a feature, and eventually you want to ship this feature to production. There is an intermediate step, where the feature ends up in another account, which is the QA account or acceptance test account, whatever you want to call it. It's basically an account that is dedicated to testing. So before you actually ship to production, you can observe and use that feature in that environment. You can run manual and automated tests to make sure it's actually doing what you want. And then from that environment and account, you can move to the production account. Another approach could be that you skip that QA step entirely and you just go from development to production. This is a slightly more modern approach. And of course, when you do that, you need to put some boundaries in place to make sure you are not breaking production.

And that's generally done through the use of feature flags, and it follows the principle of testing in production. Basically, what you ship from development to production is technically disabled by default, and then individual users can enable it. Then they can start to play around with the feature and see if it's actually doing what it should. That's a good practice because it removes some of the concern around how the data is going to look in my QA account versus the production account. Because you are already in the production account, you don't have that concern. You are testing against production straight away, without trying to replicate it in a QA environment. As I said, if you were to use a QA environment, you could use CloudFormation or any other infrastructure-as-code tool to get exactly the same infrastructure and configuration. But when it comes to data, it might not be possible to get all the production data into a QA environment because of volumes, and because of compliance, security and privacy reasons. So probably you'll need a process that copies some of this data, and as it's being copied, you might also need to anonymize it. That adds more complexity, of course.
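A sketch of the copy-and-anonymize step mentioned above, using salted hashing to replace sensitive values while keeping records joinable. The field names, the salt, and the record shape are assumptions. Hashing like this is pseudonymization and may not be enough for every compliance requirement.

```python
import hashlib

SENSITIVE_FIELDS = {"email", "name"}  # assumption: which fields are sensitive


def anonymize_record(record, salt):
    """Return a copy of the record with sensitive fields replaced by a hash."""
    anonymized = {}
    for key, value in record.items():
        if key in SENSITIVE_FIELDS:
            digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
            anonymized[key] = digest[:12]  # short, stable pseudonym
        else:
            anonymized[key] = value
    return anonymized


# Hypothetical production record being copied to a QA environment.
production_record = {"name": "Jane Doe", "email": "jane@example.com", "order_total": 42}
qa_record = anonymize_record(production_record, salt="qa-copy-2022")
print(qa_record)
```

The same input and salt always give the same pseudonym, so relationships between copied records are preserved without exposing the original values.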

Eoin: Yeah, trying to keep it as close as possible is important because if you have the environments matched one for one, you should match the volume of data and try and simulate the number of transactions and the transaction frequency as well. So I suppose the next topic is how do you set up? Where do you start?

Luciano: Because we know it's not easy to provision multiple accounts for different environments. Is there any tool or any best practice to get started? You could just start with a console and use that for organizations and accounts as a lot of people do.

Eoin: But I find it's really worthwhile putting some effort into using infrastructure as code for your organizations and accounts. This is probably the most pragmatic approach, and it gives you a lot of stability, control and visibility into what changes you're making. There are two options there. You could use Terraform, because Terraform will allow you to create organization accounts. And there's an open source project which we use, called Org Formation. Org Formation is essentially an additional layer on top of CloudFormation that allows you to manage organizations and accounts and do things across lots of different accounts in your organization. So if, every time you spin up an account, you need to put a certain set of resources in place, like your Config rules or your CloudTrail, it allows you to do that really easily, and it doesn't take a lot of code. The project also comes with a lot of great example templates for that. So I'd really recommend either of those. I think it's also really good to use single sign-on on AWS, because if you're using IAM users, not only is that almost a deprecated way of logging on to AWS, you also end up with long-term credentials per account, which can be very difficult to manage. SSO allows you to sign on and have assignments between your users in your identity provider, the accounts and the permissions. So there's this triangle, and you can configure all those things with infrastructure as code. There are some rough edges there, because a lot of the AWS APIs don't allow you to control all of these things as you wish. For example, there's still no API to delete an account. So you can create accounts very easily programmatically, but you can't delete them programmatically, and deleting them becomes a pain.
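The SSO "triangle" of identity-provider groups, accounts and permissions can be modelled as a list of assignments. This is a plain data-model sketch, not the AWS SSO API or an Org Formation template; the group, account and permission set names are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Assignment:
    """One edge of the triangle: who gets which permissions in which account."""
    group: str
    account: str
    permission_set: str


def expand_assignments(groups, accounts, permission_set):
    """Give every listed group the same permission set in every listed account."""
    return [Assignment(g, a, permission_set) for g, a in product(groups, accounts)]


# Hypothetical names for illustration only.
assignments = (
    expand_assignments(["developers"], ["product-dev", "fulfillment-dev"], "PowerUser")
    + expand_assignments(["developers"], ["product-prod"], "ReadOnly")
)
for assignment in assignments:
    print(assignment)
```

Keeping this mapping in version control means a change in who can access what goes through the same pull request review as any other infrastructure change.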

Luciano: Yeah. The other thing I really like about SSO is that when you go to the web console and you do your SSO login, you end up on a page where it's very clear what accounts exist and what roles you as a user have in every account. I think just that is extremely valuable, because you literally have a dashboard where you can see what you have across the different AWS accounts.

Eoin: Yeah, I agree. It's a game changer when you're coming from IAM users. There's also this thing of account vending machines. I think if you've got a really large organization and you're doing this frequently and you need a lot of automation, it's something you can look at. There's a Landing Zone solution that AWS provides to kind of automate this, and you can use Service Catalog and a lot of AWS resources to automate the process of creating an account and provisioning it with all these things. You could imagine it could be a massive project for your company. That's the only thing I'd say. So I wouldn't go all in on automating that approach unless you really want to dedicate a lot of engineering hours to it. There's nothing that will just work really well out of the box for you. I think you'll get 90% of what you need with just Org Formation or Terraform and a process to manage that. Then you can have pull requests against your organization for new accounts, and it just works like everything else. So yeah, that's the approach I'd recommend for provisioning new accounts for developers and staging. And then it's something you can have templates for: for every new application, you have a set of dev accounts, a staging account, a production account and your CI/CD account. Yeah, I like that approach.

Luciano: I think infrastructure as code applies very well to this particular topic too. Okay, so I think that covers everything we had for this episode. I am personally really, really curious to see how you have organized your accounts and environments. We have seen many different configurations with our customers and our projects, and I think there are many different ways that are equally valid. So if you do something different from what we suggested, please let us know, because we are really curious to find out what your trade-offs are and why you decided to do something different. Also, this is a topic that is always evolving. Even AWS is constantly pushing new tools and new recommendations. So again, if you know any other way that's equally valid for you, please let us know and let's have a chat. And with that, we'll see you at the next episode. Bye.