AWS Bites Podcast

43. When is it OK to cheat on AWS?

Published 2022-07-01 - Listen on your favourite podcast player

We do love AWS, but sometimes we have to admit that it’s not always a silver bullet. There are definitely use cases where it might be worth considering alternatives to AWS. In this episode we will discuss some of these use cases and try to highlight the advantages that other platforms or services can have over AWS in very specific circumstances. First of all we clarify why we like AWS and why (and when) it’s worth sticking with it. Then, we discuss some of the reasons why it might be worth considering alternatives to AWS. At this point we go into the specifics and talk about authentication services (Auth0), search services (Elasticsearch, Algolia), CDN services (GitHub Pages, Netlify, Vercel, Cloudflare, Fastly, Akamai), databases (MongoDB Atlas, DigitalOcean managed databases, IBM Compose, Cloudflare R2 and D1, Upstash, Confluent Kafka), headless CMS services (Contentful, Storyful, Airtable, Google Spreadsheet), and virtual machine services (DigitalOcean, Linode).

Let's talk!

Do you agree with our opinions? Do you have interesting AWS questions you'd like us to chat about? Leave a comment on YouTube or connect with us on Twitter: @eoins, @loige.

Luciano: We do love AWS, but we have to admit that it's not always a silver bullet. There are definitely use cases where it might be worth considering alternatives. Today we will discuss some of these use cases and we will try to highlight what are the advantages that other platforms or services can have over AWS in very specific circumstances. My name is Luciano and today I'm joined by Eoin and this is AWS Bites podcast. So Eoin, why don't we start by asking ourselves why we would prefer to go all in on AWS for everything?

Eoin: That seems to be kind of the default, that you try and look for the AWS solution first, but you don't want to get distracted by that single focus and maybe be a little bit more open-minded. But the reason you do it is because generally the way the AWS services are delivered, they're well integrated with each other. You also get like unified billing and in general people have a level of trust in AWS. If you pick a vendor as important as your primary public cloud vendor, you want to go all in on them and you want to trust them and you believe that they're going to stick around and they're not going to retire services, they're not going to be acquired by some bigger company who then retires the services. So that's a good reason to go all in on AWS. But like I said, you probably shouldn't always, so why would you want to go for third-party services instead in some cases?

Luciano: Yeah, that's a good point. I would also add that AWS generally gives you very good levels of scalability. You have this peace of mind that if your project actually grows a lot, you can find a very good path for scaling up with AWS. But I mean, realistically, that's not always a realistic expectation, right? Sometimes you have very modest projects and it's okay to deal with them so you don't have to think about that level of scalability.

Other cases, I guess, are when you have very specific needs and there might be tools or companies that are focusing on that one particular need and providing a very, very good service. And I think we'll discuss some of these examples later on during this episode. So basically, in those cases, you might end up having a much better user experience, and probably even a better price and feature set, than just going with one of the many AWS services.

In other cases, it's more about the billing, I guess. It's more about how easy it is for me to predict cost in a specific service rather than in AWS. Take some offerings, for instance virtual machines, which we'll talk about. If you just need one virtual machine, the cost unit is generally easier if you go with something like DigitalOcean or Linode, rather than thinking in terms of AWS, where you have so many different parameters that it is a little bit harder to predict the cost in advance.

And in general, we'll talk about simplicity because, of course, we know, and we have been talking a lot about this in previous episodes, that AWS has a decent barrier to entry. You need to learn a bunch of concepts before you can start to deploy anything serious on AWS. So if you are just looking for, again, a simpler way to just deploy things in the cloud and launch products, other platforms might give you better access to the resources and a simpler user experience in general. So I guess, yeah, let's start by picking some examples. I suppose my first one, because it's also a topic we have been discussing before, is what about authentication? Would you go with Cognito or are there alternatives that you would prefer in some cases?

Eoin: Yeah, I would definitely use either Cognito or likely something like Auth0. I think these are the two which compete very well with each other. Cognito, maybe we should do a deep dive on it in a future episode because there are a lot of features there. Unfortunately, it suffers from a couple of drawbacks, like poor naming of some of the concepts in there, which makes it very confusing and makes it difficult to get to grips with the documentation. There are some feature gaps as well. So from a developer experience point of view, it suffers, unfortunately.

Auth0, on the other hand, has a really nice developer experience and a really good onboarding. So if you want to integrate authentication, authorization, like sign up, log in, log in with social identity providers, Auth0 will allow you to do that very quickly. And the whole developer experience in general is going to be much smoother. The other thing about Auth0 is that it solves those kind of small company startup problems, like how do I add sign up and log into my application in an easy way?

But it also solves equally well the enterprise integration scenarios like SAML integration and OpenID Connect for enterprise as well. So there are a lot of features supported there. The only disadvantage is, and as with all of these things, you just have to compare the pricing models, I think, the user-based pricing model with Auth0. I believe it's still quite expensive when you get to large numbers of users.

Cognito is pretty inexpensive for large numbers of users by comparison. I suppose the other thing you need to think about is just do you need deep integration with IAM? So if you're using Cognito, you get good integration to API Gateway authorizers and AppSync authorizers. You won't get that out of the box, and you just have to use a custom authorizer with Auth0 or any other third party. So that's the authentication, authorization scenario. What else? What else is a topic where people... I think Auth0 is probably the number one thing that people would reach for outside of AWS. What else is pretty popular for people who are... There is an option in AWS, but it's not people's number one choice.
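As a rough illustration of the custom authorizer mentioned above, here is a minimal sketch of an API Gateway Lambda authorizer (token type) in Python. It is not taken from the episode. The `verify_token` function is a hypothetical placeholder for whatever JWT verification you use against your Auth0 tenant (for example, checking the signature against the tenant's published keys); only the event fields and the response shape follow what API Gateway expects.

```python
from typing import Any, Callable, Dict

Claims = Dict[str, Any]


def build_policy(principal_id: str, effect: str, resource: str) -> Dict[str, Any]:
    """Build the response shape API Gateway expects from a Lambda authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": resource,
                }
            ],
        },
    }


def make_authorizer(
    verify_token: Callable[[str], Claims],
) -> Callable[..., Dict[str, Any]]:
    """Create a handler around a token verifier.

    `verify_token` is an assumption of this sketch: it should return the
    token claims for a valid token and raise an exception otherwise.
    """

    def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
        # Token authorizers receive the raw header value in `authorizationToken`.
        header = event.get("authorizationToken", "")
        scheme, _, token = header.partition(" ")
        if scheme.lower() != "bearer" or not token:
            # API Gateway turns this exact message into a 401 response.
            raise Exception("Unauthorized")
        try:
            claims = verify_token(token)
            principal_id = claims["sub"]
        except Exception:
            raise Exception("Unauthorized")
        return build_policy(principal_id, "Allow", event["methodArn"])

    return handler
```

Passing the verifier in as a parameter keeps the policy-building logic independent of any particular JWT library, which also makes the handler easy to exercise locally with a fake verifier.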

Luciano: I guess we can talk about search because it's one of those use cases where eventually, for any reasonable application, you'll need some search functionality, and that's where you start to ask yourself, okay, how do I build this in AWS, and then you realize it can be much more involved than you actually wanted it to be, and then you might want to start thinking about alternatives outside AWS. But if we just stick to AWS, probably the default is Amazon OpenSearch, so Amazon's version of Elasticsearch.

And while that is good, it's still a little bit annoying that you still need to think in terms of servers. You need to provision a cluster and find the servers, and there is not really a serverless model, which is something that I guess is going to go into my own personal wish list for AWS, because of course most of the time you just want an API or something that allows you to index some data and then search over the indices.

And with OpenSearch, of course, you need to do a lot more before you can get access to that level of API. And there is Elastic Cloud as an alternative, so the cloud from the Elasticsearch creators, but that one is also very similar, meaning that it's based on instances, and you need to provision those instances. So I guess maybe the user experience can be a little bit better. I actually don't know because I haven't tried it.

But again, you need to go outside AWS, create a whole account, and manage the billing for something that eventually is still based on instances, and it doesn't seem to give you a lot of advantages. So yeah, I guess if you want to zoom in a little bit more on different types of use cases, maybe we can also talk about other alternative products, because I suppose one use case is about search. So you really just want to index data and then search over it.

Another very common use case for Elasticsearch is about managing logs, distributed logs, and centralizing them somehow. And for those use cases, I've used Elasticsearch, and I think it works really, really well, especially when combined with Kibana. But again, it's a little bit of a messy setup, and it takes a little bit of time before you get it right. So in those cases, if you want to avoid all of that, you can use third-party services focused 100% on the logs use case.

And these are like Loggly, Logz.io, Splunk, Datadog, Honeycomb, and so on. Equivalently, if you want to think about pure product search, or documentation search, document search in a more general way, Algolia is probably the go-to service where you just want to say: give me an account, give me an API, I don't want to think about servers. And it gives you a very nice developer experience, and you get very good documentation.

There are also cases where you can get the service for free. For instance, if it's an open-source project, you can easily make a request to Algolia. There is a web form you need to fill in, asking, can you support us by giving us a certain level of access for free? And at least in my experience, it's been pretty easy to go through the process, and they seem to be very open to supporting open-source projects.

And the API is generally very nice, and you also get pre-made scripts or widgets that you can just copy into your application, and everything magically works, at least to a certain degree. The only thing is that if you use it at scale, because you have a really successful e-commerce with thousands of products, I think it might get expensive very quickly, based on what I've seen on the pricing. But that's always kind of the trade-off between going and doing all the work of setting up infrastructure, where maybe you get a little bit of economy of scale, and just using a software as a service, which might be very cheap at the beginning, but then the cost curve grows very steeply, and you might end up paying a lot when you are reaching those levels of growth. Yeah, so I think that covers search and logs pretty well. What about something else very common? I'm thinking, hosting files like CDNs or websites.

Eoin: Yeah, this is actually something we talked about in one of the previous episodes, I think back near the beginning, we talked about how to deploy a static website on AWS, and we talked specifically about CDNs. I know we mentioned things like GitHub Pages and Vercel and Netlify as ways of getting a static website up and running very quickly on a CDN. Netlify is actually the service that we use for hosting the AWS Bites website, awsbites.com, funnily enough.

And then you also have, I guess, not just static websites, but sometimes you've got other CDN needs, like for video or documents or images, so there's lots of different use cases out there that can leverage a CDN. So CloudFront is pretty good. I think CloudFront is a great option, and I use it quite a lot, but it's not suitable for every scenario. We mentioned before that it might be slow to update. Configuration is complex enough.

So this is an area where there are quite a few alternatives. Cloudflare comes up quite often, I think, especially in the kind of small to medium business space. They've got also a lot of other products around it, like for protection and firewall, that kind of stuff. And then you have things like Fastly, which is known for being particularly fast. And then you've got the big older enterprise players like Akamai, which have been around forever, it seems.

So there's a lot of alternatives out there in the CDN space, and some of them are pretty fast to get up and running with, especially the more modern services. And yeah, I think it's one area where it really depends on your use case, but you've got good options out there. So if we think back then to data, let's maybe think about databases. You've got a lot of options in AWS, really, with RDS, right? You've got DynamoDB, which has grown in popularity in a phenomenal way. Then you have things like Aurora Serverless, which are kind of serverless, not quite there yet, a little bit debatable in whether you should use them or not. So what else is out there that people should be looking at?

Luciano: I guess looking outside AWS, the main first contender to DynamoDB is probably MongoDB. And there is MongoDB Atlas, which is kind of, I suppose, a good alternative to DynamoDB, if you like document-based databases like Dynamo or NoSQL databases. And the good news is that recently there is a serverless option for MongoDB Atlas, so probably much more lightweight in terms of configuring the whole cluster and scaling it and so on. I haven't used it yet, but it seems to be an interesting alternative.

If you're looking, if you like to use MongoDB and if you're looking for the kind of model where it's like, just give me a database and scale it for me and I'm going to pay as I go. Another interesting one is that I noticed that recently enough, I think it was either this year or at some point last year, DigitalOcean launched kind of an alternative to RDS. It looks very similar, even though it covers Postgres and MySQL as kind of relational databases, but it also covers Redis and MongoDB as other classes of databases that in AWS they will go kind of somewhere else, in a different category of databases.

And this one is interesting because if you like DigitalOcean and if you use it, you probably know that they spend a lot of time making sure that the developer experience is good and it's very easy to see, to understand how to provision anything and get things up and running. You get lots of documentation, lots of tutorials, and the UI is generally very driven to kind of a workflow where it's like step one, step two, step three, and at the end you have everything up and running as you would expect. And I saw a small demo there showing how to provision a Postgres database and it seems really simple to get started with. So if you are already using DigitalOcean for something else and you just want a database that is more managed than provisioning everything yourself, that's probably a good alternative to use. And I remember you mentioned to me some time ago, Eoin, about something I think from IBM called Compose, which is probably in a similar space. Do you want to talk about that?

Eoin: It's actually been a while since I've used Compose, but when I did use it, I was really pleased with how easy it was to get up and running. I was using it for MongoDB at the time. So it's not only targeted at hosting MongoDB. They also host lots of other databases like Postgres, MySQL. I think they also have things like Redis and Kafka. I'm not sure about Kafka, but they've got quite a few different options. So it's one to check out. They were acquired by IBM, like you say, so they're an IBM service now. And yeah, I just thought it was really seamless to get a MongoDB instance up and running.

So it might be something else to look at. Maybe another one to mention finally on databases is Cloudflare. Cloudflare seems to come up with new products very frequently these days. And one of the recent ones was that they gave early access to their new relational database called D1. And I think the idea here is that if you want a SQL database and you're using Cloudflare Workers, this is the solution that they are proposing. And it's an interesting one, right?

It's a bit like Cloudflare workers. It was a different approach to serverless functions. This is a different approach to serverless databases as well, because it's actually built on top of SQLite as a database engine. So yeah, it would be very interesting to watch that. I haven't used it yet. I think it's still in early access, so you can sign up to the wait list. I'll wait to see what other people say first and take my guidance from them.

So maybe slightly related, since we mentioned Kafka and Redis there a little bit, we covered this when we talked about event services. But you've also got things like Upstash for serverless Redis, which is really nice, and Confluent Cloud as well for Kafka and Upstash are doing Kafka as well. So those are definitely ones to check out if you're interested in Kafka and Redis. Since we were talking about databases, maybe it's a good idea to talk a little bit about the future direction of this.

And I think this is something we've alluded to once or twice. We've got databases out there, but you still end up building what seems like the same CRUD code every time you build an application. Everything seems to be a little bit like a CMS at its heart. And I think we've mentioned things like Contentful, Firebase for headless CMSs, or even backend as a service. I always feel like this is the way the industry is going and should be going, but it's maybe not getting there quite as fast as I expected.

I thought that maybe by 2022 more of us would be building systems on top of vendors that would provide just an API as a service, that we just give it our data, they give us an API back, and we don't have to worry too much about all the data access wiring and the query patterns and the optimization and the building of indexes and normalization. But it seems like we're still not there, right? Especially if you adopt DynamoDB, it seems like people are actually getting deeper into the weeds now, and you have to gain a high level of proficiency with things like single table design to use it properly. But I think eventually we have to move away from that and start treating APIs as managed services, and maybe just define a schema and we get a GraphQL API and we don't know how it's stored anymore. What do you think? What are the alternatives out there? Is there anything that you can use today?

Luciano: I think you make a very good point. I agree that there was an explosion at some point a few years ago, I think about three or four years ago, of this kind of headless CMS. And there was a lot of buzz around the Jamstack and building static websites off of data managed by somebody else, maybe somebody in your team that is focused on content, and they will use this kind of headless CMS UI, while then you have developers using a build process to take all this data and build new versions of the website.

That seems to be the most likely future for building marketing web pages or product pages. I agree with you that it doesn't feel like it has been adopted as much as we would have expected a few years ago. That might change because I think there is still a big push for these companies, and I think the market is starting to realize that there are many advantages in keeping your front end very simple and statically rendered, while you still want to retain all the capabilities of managing data and changing things very quickly.

So I am still convinced that this is a good solution going forward. But there is an interesting, I think, maybe it's an edge case, I'm not really sure, but I think there are many cases where you want to build just a small website for even sometimes even for just a limited amount of time. I'm thinking like, I don't know, event websites where, I don't know, for instance, like a conference, where you want to showcase that this event is happening and there will be, let's say, speakers and people attending.

So there is some data to manage, and you most likely want to use some sort of database to manage all the data as you keep building the whole event. But at the same time, the website is going to be very simple so you can take advantage of statically rendering it and also use something like Netlify or Vercel and so on. And yeah, because most likely you're not going to have a very complex dataset, I have seen cases, and actually built something myself very successfully, where you can use very simple data alternatives that are not really meant to be databases, like, I don't know, a Google spreadsheet.

And because they offer APIs, you can just have everything there in one page. Most people would be comfortable enough with inputting data there. And then you can easily build an API, even just at build time: if you have a static website, at build time you can just fetch the data through the API and use that as a mini database. And if you don't like Google spreadsheets, because of course the APIs are not the most straightforward for this particular use case, I think Airtable does a better job of giving you slightly better APIs.
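To make the build-time idea above a bit more concrete, here is a minimal sketch in Python, assuming the data lives in an Airtable table and is read through Airtable's REST API. It is not the code used for the project mentioned in the episode. The base ID, table name, token, and the `Name` and `Talk` field names are all hypothetical, and the sketch ignores pagination (Airtable returns records in pages), so treat it as a starting point only.

```python
import html
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, List


def fetch_records(base_id: str, table: str, token: str) -> Dict[str, Any]:
    """Fetch one page of records from an Airtable table (network call)."""
    url = f"https://api.airtable.com/v0/{base_id}/{urllib.parse.quote(table)}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def to_rows(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Keep only the user-defined fields of each record."""
    return [record.get("fields", {}) for record in payload.get("records", [])]


def render_speakers(rows: List[Dict[str, Any]]) -> str:
    """Render a static HTML list; `Name` and `Talk` are hypothetical columns."""
    items = [
        "<li>{}: {}</li>".format(
            html.escape(str(row.get("Name", ""))),
            html.escape(str(row.get("Talk", ""))),
        )
        for row in rows
    ]
    return "\n".join(["<ul>", *items, "</ul>"])
```

A build script could then call `render_speakers(to_rows(fetch_records(base_id, table, token)))` once per build and write the result into the generated page, so the published site never talks to the spreadsheet at runtime.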

And also what I really like about Airtable is that when you open the document, you can really watch it in real time when new data comes in. Like let's say that you also have a read-write pattern from the front end, maybe, I don't know, you're asking people, do you want to attend this event? They can click a button and say yes and give you, I don't know, an email address or other information. You can actually see in real time the data appearing in Airtable and there is like a nice highlight effect. So it actually looks really nice then to use that as a back end to watch things changing. And recently I heard somebody mention that Notion is starting to put out APIs as well and they seem to be fairly good. And because more and more people are using Notion as their own personal database for everything, I think that might become another alternative for simple cases to just use the data there, use the data through an API to build websites and experiences based on top of that data.

Eoin: I like that. Yeah. I mean, I think Airtable was kind of designed almost as like an easy to use database with a spreadsheet kind of layer on top, wasn't it? So I think it's a perfectly viable solution and it's good to see that going. I think maybe a few years ago there were a lot of companies building on top of Parse, and with Parse, it didn't end well because it ended up getting acquired and retired. So a lot of people were left kind of hanging there. So maybe that's one of the reasons why this kind of thing didn't take off as it should. People felt a little bit wounded. But I hope, I definitely hope there's options there in the future.

Maybe if we take it back to basics, at the start you mentioned a couple of things about the idea of getting into AWS and all the things you need to understand if you just want to get an EC2 instance up and running. You can definitely do it quickly, but there's probably a lot you should be understanding, especially in terms of security and billing. It's probably worth mentioning, we've covered DigitalOcean already. They've got a user experience that helps you get up and running pretty quickly there. And there's other alternatives out there like Linode. And the idea with any of those specialist providers, which provide simple instances very well, is that you get more predictable cost generally. And it's very easy to see what you're getting, to understand it, and there's not a lot of complexity with it because they're essentially just providing those basic elements for you and they're not trying to do 200 other services along with it like AWS.

Luciano: Yeah, absolutely. I actually used Linode to host the website for my first startup. This was like seven years ago, I think now. And at the time I didn't really know much about, OK, how do you run your own servers? How do you host products on a server and put them online? I have to admit, I was quite scared about doing all of that. We were looking for like a sysadmin to help us. But because we were a startup with a small budget, we couldn't afford that. And eventually we were like, OK, let's just bite the bullet and try to do it ourselves and see where we end up.

And I was actually very pleased with how easy it was to really understand what the building blocks were. How do you start to create an instance? How do you provision software on that instance? And this was already like seven years ago, so I assume now the experience is much, much better. Then you have a very nice dashboard that allows you to see how things are performing, whether there is any bottleneck, and whether you should increase something.

Is disk reading and writing doing OK? Is the network doing OK? And I really enjoyed that experience. And I think in the end the server was actually doing quite well, even in terms of performance. And it was relatively cheap because we did go with a kind of medium instance and one instance was good enough for that particular startup. So it was very nice also for us, as a startup with a very small budget, to have very predictable costs. I think we were paying something like fifty dollars a month. So that was giving us a lot of confidence that that solution would be good enough for us in that period of time.

Eoin: I think it's great to have that level of competition there. It's important that AWS doesn't completely run away with the market. And maybe on that topic, we do see some services aiming to come in and replace or give a viable alternative to specific AWS services and being API compatible. So one example of that is, again, Cloudflare, who introduced R2 with a lot of fanfare, which is a replacement for S3. So it's an API-compatible drop-in replacement. And one of the standout features of that was its pricing. So the pricing, especially for data ingestion, sorry, for data egress, was vastly superior to Amazon's.
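As a rough sketch of what "API-compatible drop-in replacement" can mean in practice, the Python snippet below points a regular S3 client at a different endpoint. It assumes the third-party `boto3` package is installed, and it assumes the endpoint format and the `auto` region that Cloudflare documents for R2; the account ID and credentials are placeholders. The function names are hypothetical and not part of any library.

```python
from typing import Any, Dict


def r2_client_kwargs(
    account_id: str, access_key_id: str, secret_access_key: str
) -> Dict[str, str]:
    """Build keyword arguments for an S3 client that talks to an R2 endpoint.

    The endpoint format and the "auto" region are assumptions based on
    Cloudflare's R2 documentation; check them against the current docs.
    """
    return {
        "service_name": "s3",
        "endpoint_url": f"https://{account_id}.r2.cloudflarestorage.com",
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
        "region_name": "auto",
    }


def make_r2_client(
    account_id: str, access_key_id: str, secret_access_key: str
) -> Any:
    """Create the client; boto3 is imported lazily because it is third-party."""
    import boto3

    return boto3.client(
        **r2_client_kwargs(account_id, access_key_id, secret_access_key)
    )
```

With a client created this way, existing code that calls, say, `client.put_object(Bucket=..., Key=..., Body=...)` should not need to change, which is the point of a compatible API. Not every S3 feature is necessarily supported by a compatible service, so it is worth testing the specific calls you rely on.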

Amazon actually followed up with a significant change to the free tier for S3. But Cloudflare has also got Workers. So as an alternative, perhaps, although they're slightly different to Lambda functions, you've got Cloudflare Workers and they've got a database storage there as well. So Workers KV, it's called, a key-value store. So it might be kind of an alternative to DynamoDB global tables. I think it's great to see that competition, especially if they're pushing AWS on price or features or performance, because it keeps them on their toes and it keeps everybody innovating, I hope. So what do you think there? Would you be keen to adopt R2 instead of S3, do you think, or other services that are just drop-in replacements for AWS services?

Luciano: Yeah, that's a good question. I have to admit that my experience with Cloudflare is kind of a little bit conflicted because there are things that I really like, like how easy it is to set up the CDN. For instance, when you're building a static website hosted on GitHub Pages, you can literally set up the integration with Cloudflare, if you want that extra level of CDN and more configuration in terms of DNS and so on, in literally minutes. And it works really well. And most of the time it's going to be free because for most use cases, you don't even need to go and pay for the custom enterprise plan.

But when it comes to these other new services, the last time I tried to use them was probably six months ago, so I hope it has now improved. But I found that the UI was a little bit confusing. It was a little bit complex to understand. Okay, what if I want to invite people to help me with this project? Do I then need to switch from a personal plan to an enterprise plan? And then it seems that if you want to do that, you can. But there is no easy migration of your plan, or at least there wasn't. So I think they still need to get the whole developer experience right in terms of transforming Cloudflare from a CDN company to more of a cloud offering company. I think there is work to do there, but definitely I love to see all the new products that they are pushing and to see that they are quite different from their territory. So they're trying to innovate in the market and also they are competing on price. So I would really love to see what is going to come out from Cloudflare in the near future. And I think it's going to become a more and more prominent competitor in the cloud space.

Eoin: If other people out there have seen third-party services which they use instead of something we maybe take for granted in AWS, we'd love to hear about it because we're definitely keen to explore other options, especially if it helps us achieve that kind of serverless vision where you're just offloading more and more effort to a cloud provider and just getting started in a very, very simple way. That was great to talk through those. We'd love to hear your ideas. Thanks for listening. Continue to listen to AWS Bites and follow us and subscribe to the YouTube channel and share the podcast with your friends and rate it and everything. We really appreciate having you here with us and we'll see you in the next episode. Goodbye.