Hi, thanks for the reply. I finally found time to glance through the design doc.
It seems to have nothing to do with the paper I mentioned. The paper tries to
solve the problem that the I/O ops required for shuffle grow quadratically
with the number of tasks (shuffle files), therefore we nee
Hi everyone,
we are facing the same problem Facebook had, where the shuffle service is a
bottleneck. For now we have worked around it with a large task size (2g) to
reduce shuffle I/O.
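
To illustrate why a larger task size helps, here is a rough back-of-the-envelope
sketch. It assumes map and reduce tasks are the same size and that every reducer
fetches one block from every mapper; the 100 TiB shuffle size and the function
name are made up for the example, not taken from our jobs.

```python
import math

MIB = 1024 ** 2
TIB = 1024 ** 4


def shuffle_fetches(total_bytes: int, task_bytes: int) -> int:
    """Estimate shuffle block fetches for a given task size.

    Assumption: the number of map tasks equals the number of reduce tasks,
    and each reducer reads exactly one block from each mapper.
    """
    num_tasks = math.ceil(total_bytes / task_bytes)
    return num_tasks * num_tasks


if __name__ == "__main__":
    total = 100 * TIB  # hypothetical shuffle size
    for task_mib in (256, 512, 1024, 2048):
        fetches = shuffle_fetches(total, task_mib * MIB)
        print(f"task size {task_mib:>4} MiB -> {fetches:,} block fetches")
```

Under these assumptions, doubling the task size halves the number of tasks and
cuts the number of block fetches by a factor of four.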
I saw a very nice presentation from Brian Cho on Optimizing shuffle I/O at
large scale[1]. It is an implementation of white