Help us to make this transcription better! If you find an error, please submit a PR
with your corrections.
Eoin: Everyone loves the simplicity of S3 for storing and retrieving data. But when you start pushing the boundaries and want really large objects, high throughput, and faster access, it can start to become a bit of a minefield. AWS recently released Mountpoint for S3, a new client that promises to make fast access to S3 as simple as any file system. Today, we're going to take a look at Mountpoint for S3. And by the end, you should know where you might use it and when you should give this a hard pass. I'm Eoin. I'm joined by Luciano for another episode of AWS Bites. fourTheorem is the company that makes AWS Bites possible. If you're looking for a partner to accompany you on your cloud journey, check them out at fourtheorem.com. Luciano, why would you need something like Mountpoint for S3? What do you think? What are the use cases that it might solve?
Luciano: Yeah, that's a good question. So we were reading through the announcement, and there are some use cases detailed there. And the first one is big data applications, specifically when big data applications like data lakes don't directly support S3. So you can effectively use Mountpoint to mount S3 as a file system, like a FUSE file system, and then just give it to the application you're using. But this is a bit of an interesting use case because the kind of big data applications we have been using, like Dremio, Snowflake, and others, all of them already have S3 integration, so it wasn't really convincing. But it seems that there are other advantages. For instance, it's very optimized for performance. So if you're dealing with large objects, or if you need very high read throughput, or if you need to read without downloading an object entirely because you just need a subset of the data, in all those use cases I think Mountpoint can give you ideal performance. So maybe that's already good enough to justify using Mountpoint. We were also trying to figure out some additional use cases where Mountpoint can be useful. And we were thinking, okay, what if you have created a script, maybe you were doing something quick and dirty locally, and now you need to run it against data that is available in S3? You're probably going to be faster just using Mountpoint rather than changing all your code to use an SDK or the CLI. So that could be another use case. And a common one is a Unix pipeline: you read from S3, you do some kind of manipulation, you save back to S3. If you were doing that on a local file system, you can immediately support S3 that way. And similarly, we have seen people doing a lot of work analyzing CSV files or Parquet files using notebooks, doing all sorts of analytics.
And often enough, people are just working off of local files, and then they need to use real data in S3. And they have all the code written for generic file system operations. They don't want to change their code to use maybe Boto3 or some other kind of direct integration with S3. So in that case, you have another valid use case for Mountpoint. And finally, this is something we like to remark: if you need to explore what you have in an S3 bucket and you are not very familiar with the AWS CLI, because maybe you haven't used AWS that much, you can just mount the bucket with Mountpoint and then explore the files using familiar bash commands like ls, for instance. So that could be another use case, and it might be much more convenient than browsing through the AWS web console, especially when you have lots and lots of files in the bucket. So should we talk a little bit more about how it is really implemented and some of the notable characteristics of this implementation?
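To illustrate that point: once a bucket is mounted, generic file system code works against it unchanged. Here is a minimal sketch in Python; the mount path `/mnt/my-bucket` is hypothetical, and the same function works identically against any local directory.

```python
import os

def preview_files(root, limit=5, nbytes=200):
    """List files under a directory and peek at the first bytes of each.

    `root` could be a local folder during development, or an S3 bucket
    mounted with Mountpoint (e.g. /mnt/my-bucket) -- the code is the same
    either way, which is the whole appeal of the FUSE approach.
    """
    previews = {}
    for name in sorted(os.listdir(root))[:limit]:
        path = os.path.join(root, name)
        if os.path.isfile(path):
            # A ranged read like this maps to a partial GET under the hood,
            # so you never download the whole object just to peek at it.
            with open(path, "rb") as f:
                previews[name] = f.read(nbytes)
    return previews
```

The same script runs unmodified whether `root` points at local test data or at a mounted bucket, which is the "quick and dirty script" use case described above.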
Eoin: Yeah, this is where it gets actually kind of interesting, looking at how they're implementing this new client. It's written in Rust; like a lot of the new performance-critical things they're doing at Amazon, they seem to be favoring Rust. And it's only supported on Linux at the moment. The idea of using Rust is to reduce latency and binary size, which is good for serverless applications, thinking about cold starts, and to reduce resource consumption. And it provides file system operations that are intended to deliver optimal S3 performance. So the idea is that you get a simple interface, but you don't compromise on speed because of this level of abstraction. And it uses the Linux FUSE subsystem. So that's the subsystem that you might have used before, if you're a Linux user, for providing user space file systems.
Now, I was a little bit confused because there are alternatives that already do this kind of thing with FUSE, and I was wondering what this provides that's different. And it seems, from reading through the documentation and the code base, that the whole philosophy here is to intentionally not implement operations that would result in suboptimal performance. So to remove those foot guns where you might try a simple operation on the file system that results in thousands of operations against S3 under the hood, which might take days and might end up costing you a lot. So I think that is a little bit reassuring to see. We will have to see how it plays out in practice. And it's also built on top of the native CRT.
So CRT is something you might come across very rarely, but the CRT is the AWS Common Runtime. It's a set of libraries that Amazon provides. And we can maybe talk a little bit about that further on. So given this implementation and design, when does it not work? Well, we've already mentioned it doesn't work on anything that isn't Linux because it uses FUSE. So it's not supported on macOS. When I was playing around with it, I had to use Docker on Mac. And it doesn't work in Fargate because it needs special permissions, and that's explicitly called out in the documentation: Fargate doesn't provide the special permissions needed for the FUSE device.
So if you wanted to use S3 with Fargate today, you're left with using the object paradigm, just doing GetObject and PutObject yourself, or you can use something like EFS with DataSync to sync up data from S3. And then when it comes to the specific operations, you wouldn't use it when you need to edit an existing object. So you can't change the middle of an object; you can only do sequential writes when you're writing an object for the first time. You can't do symlinks, because those aren't supported in S3. You can't do directory renames. And in general, you wouldn't use it for something like web serving either. I mean, you can do it, but performance is not going to be the best because you generally want caching there. So maybe before we go into the CRT and some of those things, Luciano, do you want to talk about some of the alternatives to Mountpoint that are out there, or other kinds of use cases in this realm?
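To make the sequential-write constraint concrete: writing a brand-new object through the mount is fine as a plain streaming copy, because that only ever writes forward. Anything that seeks, appends to an existing object, or edits in place falls outside what Mountpoint supports. A minimal sketch, with illustrative paths:

```python
import shutil

def upload_via_mount(src_path, dst_path, chunk_size=8 * 1024 * 1024):
    """Copy a local file to a Mountpoint path using sequential writes only.

    dst_path would be something like /mnt/my-bucket/data/out.bin (a new
    object). A streaming copy like this fits Mountpoint's write model:
    one pass, front to back, on a file that doesn't exist yet. Opening an
    existing object for modification ('r+' or append mode) is the kind of
    operation that would fail against the mount.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # copyfileobj streams chunk by chunk, never seeking backwards.
        shutil.copyfileobj(src, dst, length=chunk_size)
```

The function works on any file system, so you can develop and test it locally before pointing `dst_path` at a mounted bucket.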
Luciano: One that I've been using in the past is s3fs-fuse, which I think you already mentioned. It has been around for a long time and seems pretty reliable, but it tries to be as POSIX-compliant as possible. So sometimes you might find those kinds of foot guns where you try a simple operation and it results in something that's not very optimal in S3. So it might be a little bit dangerous. And while we were researching this episode, we found out that there is an alternative called Goofys, which is written in Go. In terms of design principles, it's somewhat similar to Mountpoint, meaning they don't try to implement everything in a POSIX-compliant way, but they try to keep it as performant as possible.
And in general, I would say the real alternative is: don't try to do this stuff if you can avoid it. Try to stick with the object storage paradigm, use the CLI or the SDK, and do the specific operations that the actual service provides. Don't try to simulate the same things through a different abstraction, because all these abstractions are a bit leaky, they don't always map one-to-one, and you might end up in this kind of weird situation where either it doesn't work, or it's too expensive, or it's too slow. So the alternative is: try not to do that whenever you can. So speaking about performance, what can we say? Because that's, on one side, one of the main concerns, since it might be a little bit obscure, but on the other hand it's a bit of a promise that by using this kind of tool, you get the best performance that you can possibly get.
Eoin: Yeah, we mentioned that it's fairly simple just to read and write from S3 at the beginning, but when you start pushing the boundaries with large objects and high throughput, that's when it gets a little bit trickier. And S3 will give you performance tips in the documentation, like saying you should use byte-range requests in parallel in order to get your object faster, rather than reading from start to finish. There are lots of other tricks, like using multipart uploads and even using multiple IP addresses. So if you're just using DNS with S3, you might get back one IP address that's used for your request. But if you're on a high-bandwidth EC2 instance, you might want to maximize the number of flows, because there's a cap on the bandwidth you can use for an individual flow. So you might want to use multiple IP addresses. So this is how it starts to become a little bit of a minefield. And this was really well illustrated on the cloudonaut podcast, when Andreas and Michael Wittig went through this whole pain in order to try to download five-terabyte objects, the maximum object size, really quickly.
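The parallel byte-range trick mentioned above boils down to splitting the object into ranges and fetching each one concurrently. A minimal Python sketch of just the range math; the 8 MiB part size is an arbitrary choice, and each resulting string is a valid HTTP Range header you could pass to an S3 GetObject request:

```python
def byte_ranges(object_size, part_size=8 * 1024 * 1024):
    """Split an object into HTTP Range header values for parallel GETs.

    Byte ranges are inclusive on both ends (per the HTTP spec), so a
    16 MiB object with 8 MiB parts yields two ranges:
    bytes=0-8388607 and bytes=8388608-16777215.
    """
    ranges = []
    start = 0
    while start < object_size:
        # The final part may be shorter than part_size.
        end = min(start + part_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges
```

Each range can then be downloaded on its own connection (for example, a thread pool issuing ranged GetObject calls) and the parts stitched back together in order, which is essentially what the CRT automates for you.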
And as part of the CRT, they've also built an S3 client. It's designed for low overhead and high throughput, and it automatically uses byte-range requests, parallelization, and multipart uploads. And I think ultimately the goal with the CRT is to provide a common code base that all of the SDKs can use, so they don't have to implement all of this optimization in every language separately. Right now, the CRT integrates fairly easily with the Java SDK, and it's possible with the Python Boto3 one as well, but it seems to be very vague how to do it with other languages, even though they provide bindings for several languages. One of the interesting claims here is that the team says they prove algorithmic correctness using the fancy automated reasoning that they're really into at AWS. There's a link to that in the show notes. Now, going back to Mountpoint: Mountpoint is built on top of the CRT, so performance should be pretty optimal. But as of yet, we don't see any published benchmarks. I don't see any benchmarks even from the s3fs-fuse team showing what the difference is. So that would be really interesting. Setting up benchmarks and running them on S3 takes a lot of effort, so we haven't had a chance to do that yet. But if anyone out there feels like it, I'd be really interested to see the results. Are you optimistic, Luciano, or do you see any potential problems with Mountpoint?
Luciano: Yeah, I think on one side, it's fair to say that it's a relatively new project, so it will improve over time for sure. And it will get better, I imagine. Although there are some potential problems that we have observed in the experimentation we did over the last few days. One interesting thing is that we were wondering, because this is an abstraction, how is it going to impact cost? Like, what kind of S3 requests are actually happening behind the scenes, right? So initially, we didn't really find a way to see that.
Eventually, we figured out that there is a CLI flag you can enable to get more verbose logs. And these logs will give you a fair number of details about the S3 operations that are happening. For instance, if you do a put or a get, you get details like how many parts are being used for the put. And that could be very useful to understand exactly what kind of operations are happening and how fast they are, and it can give you an indication of cost. The only annoying gotcha there is that you don't see the paths being used in S3. So if you just look at the logs, it's a little bit out of context. If you try to correlate the different operations with what you were trying to do, you need to stitch together your command-line history with the logs to make sense of everything. But this is probably just something that's missing. It could be easily added by the team, or maybe, if somebody is willing to do a PR, that's probably an easy feature to add to the project, which after all is an open source project. And the other problem, and we have been saying this over and over during this episode, but I think it's worth reiterating, is that we are using a POSIX model which is not really POSIX. So lots of foot guns there. It could be dangerous, and it's probably wrong in the first place. So if you use it, use it in moderation and be aware of exactly the kind of trade-offs you are buying into, because if you try to use it as a general file system, you are going to have problems for sure. So what do you think? Should we say that the final verdict is to use it or not to use it?
Eoin: Generally not, I would say, right now! Then again, if people have found it interesting and want to try it out for their own use cases, they'll probably already have a good feeling from what we've said so far. I think it's better to stick with the object paradigm when you're talking about an object store, rather than trying to shoehorn it into a file system model. But look, you could use it for a period of time during a migration, while you work on the changes needed to use an object storage paradigm. I think you gave a good example of that back in the episode where we talked about migrating a CMS for a legal firm to AWS, using something like s3fs-fuse at the time. It's better, I think, to try to use more native S3 integrations. I'm curious to hear if there are cases where you really need something like this. But look, if you need to use it, you can use it as a last resort; understand the risks and put your logging and metrics in place. If you wanted to use it for web serving, ultimately you're really better off using a CDN in front of S3. So I think, in general, the jury is still out. If there are very compelling use cases that we haven't spotted, let us know, because we're really curious. And if you've done any benchmarking, please share it with everybody, because the whole area of S3 performance, when it gets into really optimizing it, can take a lot of time. If you've got any data on that, I'd love to see it, because we can all benefit from it. So thanks very much for listening. Please like and subscribe and share with your friends, and we'll see you in the next episode.