BTW - a partial solution here: https://github.com/apache/spark/pull/2470
This doesn't address the 0 size block problem yet, but makes my large job
on hundreds of terabytes of data much more reliable.

On Fri, Jul 4, 2014 at 2:28 AM, Mridul Muralidharan <mri...@gmail.com> wrote:

> In our clusters, number of containers we can get is high but memory
> per container is low : which is why avg_nodes_not_hosting data is
> rarely zero for ML tasks :-)
>
> To update - to unblock our current implementation efforts, we went
> with broadcast - since it is intuitively easier and minimal change; and
> compress the array as bytes in TaskResult.
> This is then stored in disk backed maps - to remove memory pressure on
> master and workers (else MapOutputTracker becomes a memory hog).
>
> But I agree, compressed bitmap to represent 'large' blocks (anything
> larger than maxBytesInFlight actually) and probably existing to track
> non zero should be fine (we should not really track zero output for
> reducer - just waste of space).
>
>
> Regards,
> Mridul
>
> On Fri, Jul 4, 2014 at 3:43 AM, Reynold Xin <r...@databricks.com> wrote:
> > Note that in my original proposal, I was suggesting we could track whether
> > block size = 0 using a compressed bitmap. That way we can still avoid
> > requests for zero-sized blocks.
> >
> > On Thu, Jul 3, 2014 at 3:12 PM, Reynold Xin <r...@databricks.com> wrote:
> >
> >> Yes, that number is likely == 0 in any real workload ...
> >>
> >>
> >> On Thu, Jul 3, 2014 at 8:01 AM, Mridul Muralidharan <mri...@gmail.com>
> >> wrote:
> >>
> >>> On Thu, Jul 3, 2014 at 11:32 AM, Reynold Xin <r...@databricks.com> wrote:
> >>> > On Wed, Jul 2, 2014 at 3:44 AM, Mridul Muralidharan <mri...@gmail.com>
> >>> > wrote:
> >>> >
> >>> >> >
> >>> >> > The other thing we do need is the location of blocks. This is
> >>> >> > actually just O(n) because we just need to know where the map was run.
> >>> >>
> >>> >> For well partitioned data, won't this involve a lot of unwanted
> >>> >> requests to nodes which are not hosting data for a reducer (and lack
> >>> >> of ability to throttle).
> >>> >>
> >>> >
> >>> > Was that a question? (I'm guessing it is). What do you mean exactly?
> >>>
> >>>
> >>> I was not sure if I understood the proposal correctly - hence the
> >>> query : if I understood it right - the number of wasted requests goes
> >>> up by num_reducers * avg_nodes_not_hosting data.
> >>>
> >>> Of course, if avg_nodes_not_hosting data == 0, then we are fine !
> >>>
> >>> Regards,
> >>> Mridul
> >>>
> >>
> >>
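
For anyone following the thread, here is a rough sketch of the idea discussed above: keep only the map location plus one bit per reducer saying whether the block is non-empty, so reducers can skip requests for zero-sized blocks. It is an illustration only, not Spark code. The names `MapOutputSummary`, `hosts_to_fetch_from` and `wasted_requests` are hypothetical, and a plain Python integer stands in for the compressed bitmap.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MapOutputSummary:
    """Hypothetical per-map-task record: map location plus one bit per reducer."""
    host: str
    num_reducers: int
    non_empty: int  # bit r is set when the block for reducer r has size > 0

    @staticmethod
    def from_sizes(host: str, sizes: List[int]) -> "MapOutputSummary":
        bits = 0
        for reducer_id, size in enumerate(sizes):
            if size > 0:
                bits |= 1 << reducer_id
        return MapOutputSummary(host, len(sizes), bits)

    def has_block_for(self, reducer_id: int) -> bool:
        return (self.non_empty >> reducer_id) & 1 == 1


def hosts_to_fetch_from(summaries: List[MapOutputSummary], reducer_id: int) -> List[str]:
    """Hosts holding at least one non-empty block for this reducer, in first-seen order."""
    hosts: List[str] = []
    for s in summaries:
        if s.has_block_for(reducer_id) and s.host not in hosts:
            hosts.append(s.host)
    return hosts


def wasted_requests(summaries: List[MapOutputSummary], num_reducers: int) -> int:
    """Requests that hit hosts with no data if every reducer contacts every map host."""
    all_hosts = {s.host for s in summaries}
    return sum(
        len(all_hosts) - len(hosts_to_fetch_from(summaries, r))
        for r in range(num_reducers)
    )


# Well partitioned example: each map host has data for exactly one reducer.
summaries = [
    MapOutputSummary.from_sizes("node-a", [10, 0, 0]),
    MapOutputSummary.from_sizes("node-b", [0, 0, 5]),
    MapOutputSummary.from_sizes("node-c", [0, 7, 0]),
]
```

In this toy case each of the 3 reducers has 2 hosts with nothing for it, so contacting every host blindly costs num_reducers * avg_nodes_not_hosting data = 3 * 2 wasted requests, while the per-reducer bit lets each reducer contact only the one host that has its data.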