I want to load a 10 TB Parquet file from S3, and I'm trying to decide which EC2 instances to use.
Should I go for instances whose combined memory is larger than 10 TB? Or is it enough that their combined SSD storage is large enough for everything to be spilled to disk? Thanks.