AWS Bites Podcast

137. Transit Gateway Explained

Published 2024-12-12 - Listen on your favourite podcast player

In this episode, David Lynam provides an overview of AWS Transit Gateway, which aims to simplify complex network connectivity between VPCs, VPNs, and on-premises networks. We discuss the limitations of using VPC peering and the benefits Transit Gateway provides through its hub-and-spoke model. The main components of Transit Gateway are explained, including attachments, route tables, associations, and route propagation. We go through some example use cases like sharing Transit Gateways across accounts, network isolation for compliance, routing traffic through security services, and bandwidth/scaling capabilities.

AWS Bites is sponsored by fourTheorem, an Advanced AWS partner that works collaboratively with you and sets you up for long-term success on AWS. Find out more at fourtheorem.com.

In this episode, we mentioned the following resources:

Let's talk!

Do you agree with our opinions? Do you have interesting AWS questions you'd like us to chat about? Leave a comment on YouTube or connect with us on Twitter: @eoins, @loige.

Help us to make this transcription better! If you find an error, please submit a PR with your corrections.

Eoin: Welcome to AWS Bites episode 137, where we're going to dive into one of AWS's more powerful services, Transit Gateway. Networking in the cloud can feel like a bit of a tangled web of connections with VPC peering, VPNs and direct connect, all weaving together into a complex mesh. Transit Gateway aims to simplify all of this, providing a centralized hub designed to streamline connectivity and make network management a whole lot easier.

In this episode, we're going to break down exactly what Transit Gateway is, how it works and why it's a game changer for organizations of all sizes. Whether you're managing a few VPCs or scaling to hundreds, Transit Gateway is built to handle the challenge. Plus, we're going to explore some real-world use cases that give you a feel for how Transit Gateway can help you. My name is Eoin and today I'm joined for the first time by David Lynam. Let's get started. AWS Bites is brought to you by fourTheorem. Sometimes AWS is overwhelming and you might need someone to provide clear guidance in the fog of cloud offerings. That someone is fourTheorem, so check out fourTheorem at fourtheorem.com. Now, very welcome, Dave.

Dave: Thanks, Eoin.

Eoin: Glad you could join us. Can you help set the scene? I know you're a bit of a closet Transit Gateway expert, so maybe we can just start with the topic of VPCs.

Dave: Yeah, sure. So a VPC, or virtual private cloud, is an isolated network environment where you can define and manage an IP range or a CIDR block. And this CIDR block forms the base for creating subnets, from which you assign IP addresses to service components within the VPC. So typically in a VPC, you'll have a couple of different types of subnets. You'll have a public subnet where you host your public resources such as public EC2 instances, load balancers, and maybe a NAT gateway. You'll also have private subnets, which host instances that are not directly accessible from the Internet, but that can connect out to the Internet through a NAT gateway. You'll also have isolated subnets, which basically have no inbound or outbound Internet connectivity. So subnets are associated with route tables that define how interfaces in the network can route traffic to the Internet and other networks. By default, you can only route within the VPC CIDR block.

Eoin: So a VPC is pretty much self-contained by default then. So if you're going to scale and if you've got any kind of complexity or you need systems to talk to each other, you're going to have to think about routing outside of the VPC, like to other VPCs or even networks outside of AWS. So how is that typically done?

Dave: Yeah, so when you need to connect two VPCs, the default approach is traditionally to create a VPC peering relationship. To set up a peering connection, one VPC acts as the requester and the second VPC acts as the accepter. The accepter must accept the peering connection. You can then add route table entries in one VPC to route traffic to the other VPC, and matching entries in the second VPC to route the return traffic.

When you have limited requirements, this can be a very useful and cost-effective solution, but there are quite a few limitations. Firstly, you cannot use peering to route to the Internet or through NAT gateways or VPNs in another VPC. You cannot do transitive peering either, which is what we'll get on to in a few minutes. So, for example, if you're in VPC A and you want to send traffic to VPC C, and VPC A and VPC B are peered and VPC B and VPC C are peered, you cannot send traffic from A all the way through B to C.

So it can become quite complex as well if you have multiple peerings. It's quite easy if you have two VPCs, but if you have many more, it gets very cumbersome to manage all those VPC peerings. As for why the transitive peerings I mentioned are not possible, there's a small rule of thumb that I like to use: when a packet comes into a VPC, if the destination for that packet is outside of the VPC, the VPC will drop that packet.
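To make the two peering limitations above concrete, here is a minimal sketch in Python. It is a toy model rather than anything AWS provides: the VPC names and the `PEERINGS` set are hypothetical, and the only rule encoded is the one Dave describes, that traffic flows only across a direct peering. The second function counts how many peering connections a full mesh would need, which is the standard pair count n(n-1)/2.

```python
# Hypothetical peering connections: A<->B and B<->C, but no direct A<->C.
PEERINGS = {frozenset({"A", "B"}), frozenset({"B", "C"})}


def can_reach(src: str, dst: str) -> bool:
    """Peering is not transitive: only a direct peering carries traffic."""
    return src == dst or frozenset({src, dst}) in PEERINGS


def full_mesh_peerings(vpc_count: int) -> int:
    """Peering connections needed so that every pair of VPCs is peered."""
    return vpc_count * (vpc_count - 1) // 2


print(can_reach("A", "B"))     # True: directly peered
print(can_reach("A", "C"))     # False: B will not forward on A's behalf
print(full_mesh_peerings(10))  # 45 peerings for just 10 VPCs
```

The quadratic growth in the last line is one way to see why managing many peerings gets cumbersome.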

So I think that's a good rule to keep in mind when we're talking about transit gateways. What we mean by a transitive network is one where traffic from one VPC goes beyond a second VPC to a third VPC, so going through two autonomous networks. Before Transit Gateway, it was possible to implement transitive connections by creating a gateway with an EC2-hosted VPN software product in a transit VPC and peering each VPC with this transit VPC.

The VPN acts as the destination and forwards the packet to the true destination in another VPC. As well as having to pay for the EC2 instance and the VPN software, and carry the total cost of ownership of managing all this, there is also a fair amount of complexity and management overhead with the VPC peering. To support network resilience, you would also need to deploy the architecture across multiple availability zones and implement redundant connections. This is typically referred to as a hub-and-spoke architecture, where the transit VPC is the hub and the connected VPCs are the spokes.

Eoin: I can understand why people were looking for a better solution because it is very, it seems like it's quite complex, a lot of moving parts, a lot of maintenance. And yeah, not something that you'd really warm to. Now, these days, when you have an AWS organization with multiple networks, there are lots of cases where you've got VPCs spread across all those accounts and networks. And you might need to connect them together. So it's typical to need routing through a centralized account, maybe, or to access on-premises networks. You also have to think about VPNs, like site-to-site VPNs or client VPNs. And then you have to think about just east-west routing, as they call it, between services or applications. So let's get into Transit Gateway then. How does Transit Gateway help?

Dave: Yeah, sure. So as we mentioned there about the Transit VPC, the Transit Gateway is basically a managed hub-and-spoke network. It takes the management of the hub-and-spoke architecture off your hands. So you really only need to worry about routing the traffic or directing the traffic where you want it to go. So it provides a centralized hub for connections between multiple VPCs. And it scales to thousands of VPCs by extending through other Transit Gateways. It supports connections to on-prem through Direct Connect and VPN. And it's highly available across multiple AZs in a region. You can peer Transit Gateways in different regions together as well. And you can also do very fine-grained routing and network isolation. And it also works across accounts. So it's very good if you have a large organization: you might have AWS Organizations enabled, you might have many, many accounts, and you might segregate your business workloads by account. And then you could join them together where it makes sense using a Transit Gateway.

Eoin: That sounds pretty good. What are the main components then? What does even a simple setup look like if you're just getting started?

Dave: Yeah, sure. So the main component of the Transit Gateway is the attachment. If you want to attach VPCs to your Transit Gateway, you create an attachment for that particular VPC. You can also attach VPNs. You can also attach Direct Connect gateways to reach your on-prem networks. You can also create peering attachments, which connect to another Transit Gateway to chain your Transit Gateway connections. When you create a VPC attachment, you also pick which subnets you want the attachment to be placed in.

And once you have all your networks attached, you need to think about routing. Transit Gateway has its own route tables, which are separate to the VPC and subnet route tables. The Transit Gateway route table is a powerful thing. It lets you control how traffic can flow between attachments. Attachments and route tables are the first two building blocks. The two other important concepts to be aware of are associations and propagation.

So, association: every attachment is associated with one route table, but you can create lots of route tables for fine-grained control and segmentation. Then propagation: if you want a VPC to be routable through a Transit Gateway, you can propagate its CIDR block to the Transit Gateway route tables, which is essentially using BGP under the hood. And it allows the Transit Gateway to learn about the routes that the attached VPC knows about.

With these four building blocks, you can create some very powerful scenarios. Let's imagine we have three AWS accounts, and each one of those accounts has a VPC. So let's say VPC A, VPC B, VPC C. What we could do is we can create a Transit Gateway. Let's say we had another account, which was even a networking account. We can actually share the Transit Gateway with those three accounts. So we can have a centralized management of the Transit Gateway.

And each one of those accounts can then attach their VPC to that Transit Gateway. And the routing of that Transit Gateway traffic can be managed inside the networking account. So that would allow resources inside VPC A to send traffic to VPC B or VPC C, and vice versa for the other two VPCs, via the attachments. Yeah, one other point, I suppose. So I mentioned there about the Transit Gateway being shared.

So you would use AWS RAM, or Resource Access Manager, to share the Transit Gateway with the other accounts. So it's a really nice feature because you're able to centralize ownership of the routing in a network-managed account. And then you don't have to worry about your business domains or your business accounts having to create Transit Gateways of their own. They just use the Transit Gateway that's managed centrally for the AWS organization. Moving on from that then, if the VPCs want to send traffic from the service components in their private subnets, they would add a route in those subnets' route tables with the Transit Gateway itself as the target. So any traffic that is not for their local VPC, they will send to the Transit Gateway. The Transit Gateway will pick up that traffic and send it to the appropriate VPC, where it will arrive at its destination.
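The four building blocks and the three-VPC scenario described above can be sketched as a small in-memory model. This is an illustration only and not the AWS API: the class name `TransitGatewayModel`, the attachment names, the route table name `shared-rt` and the CIDR ranges are all assumptions made up for the example. Association decides which route table is consulted for traffic arriving from an attachment, and propagation is what places an attachment's CIDR into a route table.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class TransitGatewayModel:
    """Toy model of attachments, route tables, associations and propagation."""
    attachment_cidrs: dict = field(default_factory=dict)  # attachment -> network
    route_tables: dict = field(default_factory=dict)      # table -> {network: attachment}
    associations: dict = field(default_factory=dict)      # attachment -> table

    def attach(self, attachment: str, cidr: str) -> None:
        self.attachment_cidrs[attachment] = ipaddress.ip_network(cidr)

    def associate(self, attachment: str, table: str) -> None:
        # Traffic arriving from this attachment is looked up in this table.
        self.route_tables.setdefault(table, {})
        self.associations[attachment] = table

    def propagate(self, attachment: str, table: str) -> None:
        # The table learns a route to the attachment's CIDR.
        routes = self.route_tables.setdefault(table, {})
        routes[self.attachment_cidrs[attachment]] = attachment

    def next_hop(self, source_attachment: str, destination_ip: str):
        """Return the attachment traffic is forwarded to, or None if no route."""
        routes = self.route_tables[self.associations[source_attachment]]
        ip = ipaddress.ip_address(destination_ip)
        matches = [network for network in routes if ip in network]
        if not matches:
            return None
        # The most specific (longest prefix) matching route wins.
        return routes[max(matches, key=lambda network: network.prefixlen)]


# Three hypothetical VPCs sharing one route table: everyone can reach everyone.
tgw = TransitGatewayModel()
for name, cidr in [("vpc-a", "10.0.0.0/16"), ("vpc-b", "10.1.0.0/16"), ("vpc-c", "10.2.0.0/16")]:
    tgw.attach(name, cidr)
    tgw.associate(name, "shared-rt")
    tgw.propagate(name, "shared-rt")

print(tgw.next_hop("vpc-a", "10.2.0.5"))     # vpc-c
print(tgw.next_hop("vpc-a", "192.168.1.1"))  # None: nothing propagated that range
```

Using a single shared route table is the permissive case; splitting associations and propagations across several tables is what enables segmentation.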

Eoin: I think that's a good point because you have the Transit Gateway route tables, which are separate, and you can use them for all sorts of advanced segmentation or just be very permissive if you want. But then you still need to have your subnet route tables as well. And you need to say, okay, well, you might have a catch-all that says 0.0.0.0/0 goes to the Transit Gateway, right? And then Internet traffic might go through it, or it could be a specific CIDR block that is even more specific.
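As a rough illustration of how a subnet route table with a catch-all behaves, the sketch below resolves a destination by picking the most specific matching route, which is how VPC route tables choose between overlapping entries. The CIDR ranges and the target names `local` and `tgw-example` are hypothetical values chosen for the example.

```python
import ipaddress

# Hypothetical private-subnet route table for a VPC with CIDR 10.0.0.0/16.
SUBNET_ROUTES = {
    "10.0.0.0/16": "local",      # traffic for the VPC itself stays local
    "0.0.0.0/0": "tgw-example",  # catch-all: everything else goes to the Transit Gateway
}


def resolve_target(routes: dict, destination_ip: str):
    """Return the target of the most specific route that matches, or None."""
    ip = ipaddress.ip_address(destination_ip)
    best_network, best_target = None, None
    for cidr, target in routes.items():
        network = ipaddress.ip_network(cidr)
        if ip in network and (best_network is None or network.prefixlen > best_network.prefixlen):
            best_network, best_target = network, target
    return best_target


print(resolve_target(SUBNET_ROUTES, "10.0.4.20"))     # local
print(resolve_target(SUBNET_ROUTES, "10.2.0.5"))      # tgw-example (another VPC)
print(resolve_target(SUBNET_ROUTES, "93.184.216.34"))  # tgw-example (Internet-bound)
```

Replacing the catch-all with a narrower block such as `10.0.0.0/8` would send only private-range traffic to the Transit Gateway and leave Internet-bound traffic without a route from this subnet.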

Yeah. Another thing I was involved in recently was a project where we wanted to have a multi-account setup with VPCs for application traffic. So it was essentially web applications and other things, even databases, where end users wanted access, but it had to be secure and could not be on the public internet. So what we were actually asked to do was look into a VPN setup. And we've done quite a lot of AWS client VPN, which is generally pretty straightforward to set up.

You know, you can do certificate-based authentication or integrate with an identity provider. But this kind of VPN can terminate in your networking account, just like you described there. And you can associate it then with the Transit VPC, so this kind of VPC in the network account, and that's also attached to the Transit Gateway. We talk a lot about segmentation sometimes when we're talking about Transit Gateway.

I think we've mentioned it there a few times. So one thing we wanted here was that VPN users could be able to access these applications. But the applications, you don't want them to be able to route between each other. Right? They should be segmented from each other. They're isolated domains. They shouldn't talk to each other except through maybe an event bus or something. Right? So you want to avoid that kind of direct traffic.

So the way you can do that is by setting up two Transit Gateway route tables. One is associated with the Transit VPC attachment, and it lets you route through to those domain accounts, say application accounts. The other, a different route table, is associated with each of those domains' VPC attachments. So let's call that the applications route table. So then the domains can propagate their CIDRs to the Transit VPC route table.

And the Transit VPC propagates its CIDR range to the applications route table. And then they can both talk to each other. But the routing is segmented then so that client VPN connections can route to the applications. The applications can route back out to the VPN. But the domain accounts cannot route to each other. And like you said as well, you still need to make sure you add those subnet route table entries in the Transit VPC, pointing the destination CIDRs at the Transit Gateway, so the VPN clients can reach the applications. So you always need to think about both the Transit Gateway route tables and the subnet route tables. It might sound more complicated than it is, but it's just one of those things, right? You just have to try it a few times and make the mistakes. And then the concepts, I think, are pretty powerful and replicable. Then, you know, it's not as advanced as it might seem. Yeah. Any other use cases we should cover?
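The two-route-table segmentation described in this turn can be checked with a few lines of Python. This is a simplified sketch, not AWS tooling: the attachment names (`transit-vpc`, `app-1`, `app-2`) and route table names are invented for illustration, and a route is treated as present whenever the destination attachment has propagated into the route table that the source attachment is associated with.

```python
# Which Transit Gateway route table each attachment is associated with.
ASSOCIATIONS = {
    "transit-vpc": "transit-vpc-rt",
    "app-1": "applications-rt",
    "app-2": "applications-rt",
}

# Which attachments have propagated their CIDRs into each route table.
PROPAGATIONS = {
    "transit-vpc-rt": {"app-1", "app-2"},  # the domains propagate here
    "applications-rt": {"transit-vpc"},    # only the Transit VPC propagates here
}


def can_route(source: str, destination: str) -> bool:
    """True if the source's associated route table holds a route to the destination."""
    return destination in PROPAGATIONS[ASSOCIATIONS[source]]


print(can_route("transit-vpc", "app-1"))  # True: VPN clients can reach the application
print(can_route("app-1", "transit-vpc"))  # True: replies can route back to the VPN side
print(can_route("app-1", "app-2"))        # False: the domains stay isolated
```

Adding a new application domain under this scheme means one more association with `applications-rt` and one more propagation into `transit-vpc-rt`, with no change to the isolation between domains.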

Dave: Yeah, there's a couple of interesting ones there I've come across. So similar to what you just mentioned there, there's network isolation for compliance reasons, such as PCI, where networks are in scope because they hold credit card details, for example. Using a Transit Gateway is a very good idea for restricting which networks can get to that data because, under compliance, any connected networks are always audited.

So being able to show and being able to restrict what networks can actually get to the PCI data means you're reducing the scope of what's inside of an audit. So it's quite good for that. You can also turn on, like, Transit Gateway flow logs as well and be able to show that, monitor that data going in and out. The other kind of one I've come across is security services. So quite often security services are quite expensive and you kind of deploy them in maybe a security account.

And what you'd like to do is have maybe all your traffic go through these security services. But obviously you don't want to deploy them in every single account. So what you would do is basically reduce your ingress and your egress to a single account. And what you'd do is route the traffic coming in and out of your AWS accounts through the security services, in a thing called a middlebox. So it's basically intercepting the traffic, sending it off to the security service, and pulling it back. So things like AWS Network Firewall and those kinds of things would quite often be used there. And you can use the Transit Gateway then to direct all that traffic and make sure that it all goes back through that security service. So there's lots of good patterns for these kinds of things using the Transit Gateway.

Eoin: When we're talking about the benefits of this kind of stuff, we should also talk about pricing and limits. Anything interesting to talk about there? How does it stack up price-wise?

Dave: Yeah. So, like, you're paying basically on two parameters, essentially. You're paying an hourly rate for each attachment you have, and you're paying for the data you're processing. Typically, it's about $0.02 per gigabyte of data processed. And for the attachments then, it's roughly $0.05 per hour in many regions. So that's about $36.50 per month. But the reality is you're not going to be connecting just one VPC. You're going to be connecting at least two. So at a very minimum, you're going to be talking about $73 a month to connect two VPCs. Traffic between peered VPCs in the same region is billed as if it were AZ-to-AZ data, while traffic between VPCs in different regions is billed as if the data had been sent out to the Internet. So it's a little bit more expensive.
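The arithmetic behind those figures can be reproduced with a short calculation. The rates are the approximate ones quoted in the episode and vary by region, and the 730 hours per month is an assumption chosen because it matches the quoted $36.50; the 1,000 GB in the last line is a made-up traffic volume for illustration.

```python
HOURS_PER_MONTH = 730              # assumption: average hours in a month
ATTACHMENT_PRICE_PER_HOUR = 0.05   # approximate rate quoted in the episode (USD)
DATA_PRICE_PER_GB = 0.02           # approximate rate quoted in the episode (USD)


def monthly_cost(attachments: int, gb_processed: float = 0.0) -> float:
    """Estimate the monthly Transit Gateway bill in USD."""
    attachment_cost = attachments * ATTACHMENT_PRICE_PER_HOUR * HOURS_PER_MONTH
    data_cost = gb_processed * DATA_PRICE_PER_GB
    return round(attachment_cost + data_cost, 2)


print(monthly_cost(1))                     # 36.5
print(monthly_cost(2))                     # 73.0
print(monthly_cost(2, gb_processed=1000))  # 93.0
```

The fixed attachment charge dominates at low traffic volumes, which is why the comparison with VPC peering matters most for small setups.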

Eoin: What about limits then? Are there any capacity issues you might need to think about?

Dave: Yeah. One of the benefits of a Transit Gateway is that it's very scalable. So we can have five Transit Gateways per account, which is a soft limit. You can get AWS to increase that. Each Transit Gateway can have up to 20 route tables. And we can have in those route tables a total of 10,000 routes. So that's quite a lot of routing already. And each Transit Gateway can have 5,000 attachments. That's quite a bit. It might be worth noting as well that you can't really get around this by using pending attachments, by not accepting them. There is a limit of about 10 pending attachments, so you can't be smart and try to get 10,000 of them in a pending state and move on. If we were to compare the Transit Gateway with a VPN tunnel, we get a huge increase in throughput across the network. We're getting about 100 gigabits per second per VPC attachment per AZ, whereas with a VPN tunnel we're only at about 1.25 gigabits per second. So if you're looking for speed, the Transit Gateway is probably the way to go.

Eoin: I think that last one is important then. So the bandwidth is per VPC attachment per AZ. So as you scale out, you just get more bandwidth really. So that's pretty cool. Okay. Thanks a million, Dave. Just to wrap up, then, we should probably point to the AWS documentation for Transit Gateway, which I think is one of the best pieces of AWS documentation I've come across, because it explains how a fairly advanced topic like this works, but it's not very long. And most of it is just giving example scenarios like the ones we talked about today, but others as well. And I just think whoever was involved in that deserves some praise, because it's really worth checking out. It'll be in the show notes. Yeah. If anyone out there has any other use cases for Transit Gateway that we missed, do let us know in the comments below. Thanks for joining us. Cheers, Dave.

Dave: Thanks all. See you again in another one.

Eoin: And we'll catch you all in the next episode.