After digging in a bit more, it looks like maxRecordsPerFile does not provide full parallelism as expected. Any thoughts on this would be really helpful.
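To make the observation concrete, here is a rough back-of-the-envelope model in plain Python (not Spark code). It assumes that each partition is written by exactly one task, that a task writes its files one after another and starts a new file once the record limit is reached, and that a repartition beforehand spreads records evenly. The function names and the partition sizes are made up for illustration.

```python
def files_per_task(partition_sizes, max_records_per_file):
    """Number of files each write task produces (assumed: one task per partition)."""
    # Integer ceiling division: a task starts a new file every max_records_per_file records.
    return [-(-n // max_records_per_file) for n in partition_sizes]


def summarize(partition_sizes, max_records_per_file):
    """Summarize a write under the assumptions stated above."""
    per_task = files_per_task(partition_sizes, max_records_per_file)
    return {
        "write_tasks": len(partition_sizes),             # upper bound on write parallelism
        "total_files": sum(per_task),
        "max_files_per_task": max(per_task, default=0),  # files the slowest task writes in sequence
    }


MAX_RECORDS = 100_000

# Hypothetical skewed input: one large partition and three small ones.
skewed = [1_000_000, 10_000, 10_000, 10_000]
single_pass = summarize(skewed, MAX_RECORDS)

# Same data, evenly repartitioned so that each partition fits in one file.
total = sum(skewed)
num_partitions = -(-total // MAX_RECORDS)
base, extra = divmod(total, num_partitions)
even = [base + 1 if i < extra else base for i in range(num_partitions)]
repartitioned = summarize(even, MAX_RECORDS)

print("single pass:  ", single_pass)
print("repartitioned:", repartitioned)
```

Under these assumptions the record limit only caps file size: the skewed write still runs as 4 tasks, and one of them writes 10 files back to back, while the repartitioned write runs as 11 tasks of one file each, at the cost of a shuffle.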
On Sat, Nov 23, 2019 at 11:36 PM Rishi Shah <rishishah.s...@gmail.com> wrote:

> Hi All,
>
> Version 2.2 introduced the maxRecordsPerFile option for writing data. Could
> someone help me understand the performance impact of using maxRecordsPerFile
> (a single pass at writing data with this option) vs. repartitioning (a 2-stage
> process where we write the data down and then consolidate it later)?
>
> --
> Regards,
>
> Rishi Shah