Spark SQL did not support explicit partitioners even before Tungsten, and often enough that did hurt performance. Even now Tungsten will not do the best job every time, so the OP's question is still germane.
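For what it's worth, here is a minimal PySpark sketch of the usual workaround, not a definitive recipe: repartition the DataFrame by a column, then drop to the RDD API to get one call per partition. The column names, the sample rows, and the helpers count_keys_in_partition and run_example are all made up for illustration. Note that repartition only hash-partitions on the column; it does not let you plug in your own partitioner.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def count_keys_in_partition(rows: Iterable) -> Iterator[Tuple[str, int]]:
    """Consume one partition's rows and emit (key, count) pairs for it.

    Each row only needs to support row["key"], which both a pyspark Row
    and a plain dict do.
    """
    counts = Counter(row["key"] for row in rows)
    yield from sorted(counts.items())


def run_example():
    # Imported here so the helper above can be tried without a Spark install.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[2]")
        .appName("per-partition-sketch")
        .getOrCreate()
    )
    # Hypothetical sample data.
    df = spark.createDataFrame(
        [("a", 1), ("b", 2), ("a", 3), ("c", 4)], ["key", "value"]
    )

    # Hash-partition by column so that equal keys share a partition.
    by_key = df.repartition(2, "key")

    # One call to the helper per partition, via the underlying RDD.
    result = by_key.rdd.mapPartitions(count_keys_in_partition).collect()
    print(result)
    spark.stop()
```

Calling run_example() on a machine with Spark available runs the whole thing locally. The per-partition function is plain Python over an iterator, so it can be checked on its own with a list of dicts.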
2017-06-25 19:18 GMT-07:00 Ryan <ryan.hd....@gmail.com>:

> Why would you like to do so? I think there's no need for us to explicitly
> ask for a forEachPartition in spark sql because tungsten is smart enough to
> figure out whether a sql operation could be applied on each partition or
> there has to be a shuffle.
>
> On Sun, Jun 25, 2017 at 11:32 PM, jeff saremi <jeffsar...@hotmail.com>
> wrote:
>
>> You can do a map() using a select and functions/UDFs. But how do you
>> process a partition using SQL?