Hi Malc,

Unless I am mistaken, all operations run serially in local mode, so a group by is always performed by a single reducer.
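For reference, the execution mode is chosen with the -x flag when Pig is launched. A rough sketch, assuming your script is called myscript.pig (a placeholder name, not taken from your message) and that a Hadoop cluster is available for the second command:

```shell
# Placeholder script name (an assumption, not from the original message).
SCRIPT=myscript.pig

# Local mode: the whole script runs in a single JVM on the local file system.
pig -x local "$SCRIPT"

# MapReduce mode: jobs are submitted to Hadoop, so input files must be
# reachable from the cluster (typically on HDFS).
pig -x mapreduce "$SCRIPT"
```

With only about 40MB of input, the job start-up overhead in MapReduce mode may outweigh the gain, so it is worth timing both.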
You can either use MR mode to take advantage of parallelism, or reduce the size of the data to be grouped, if possible. Hope this is helpful.

Thanks,
Cheolsoo

On Fri, Jan 4, 2013 at 9:35 AM, Malcolm Tye <[email protected]> wrote:

> Hi,
>
> Any ideas on how to make Pig run quicker when running it in local mode?
>
> I'm processing 3 files of about 13MB each with 3 group by statements in my
> script, which seem to suck up the time. There are no joins.
>
> Increasing the heap size has made no difference, and it doesn't use all of
> that anyway.
>
> I'm on default settings apart from that.
>
> Thanks
>
> Malc
