Hi,

I'm working on node-level aggregation for MapReduce. Please see the JIRA:
https://issues.apache.org/jira/browse/MAPREDUCE-4502

I'm waiting for review by the community.
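As a rough illustration of what node-level aggregation means (this is not the patch in MAPREDUCE-4502, and it uses no Hadoop APIs), here is a minimal plain-Java sketch with word-count style data. The class and method names (NodeLevelAggregationSketch, combineTask, aggregateNode) are hypothetical and exist only for this example; it assumes Java 9 or later for List.of and Map.entry. The first step combines the output of a single map task; the second step merges the already-combined outputs of several tasks that are assumed to have run on the same node.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class NodeLevelAggregationSketch {

    /** Per-task combine: sums the counts per key within ONE map task's output. */
    public static Map<String, Integer> combineTask(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, Integer> combined = new TreeMap<>();
        for (Map.Entry<String, Integer> kv : mapOutput) {
            combined.merge(kv.getKey(), kv.getValue(), Integer::sum);
        }
        return combined;
    }

    /** Hypothetical node-level step: merges the combined outputs of tasks on one node. */
    public static Map<String, Integer> aggregateNode(List<Map<String, Integer>> taskOutputs) {
        Map<String, Integer> merged = new TreeMap<>();
        for (Map<String, Integer> task : taskOutputs) {
            task.forEach((k, v) -> merged.merge(k, v, Integer::sum));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Raw (key, 1) pairs emitted by two map tasks assumed to run on the same node.
        List<Map.Entry<String, Integer>> task1 = List.of(
            Map.entry("a", 1), Map.entry("b", 1), Map.entry("a", 1));
        List<Map.Entry<String, Integer>> task2 = List.of(
            Map.entry("a", 1), Map.entry("c", 1));

        Map<String, Integer> c1 = combineTask(task1);
        Map<String, Integer> c2 = combineTask(task2);
        System.out.println("task 1 combined: " + c1);   // {a=2, b=1}
        System.out.println("task 2 combined: " + c2);   // {a=1, c=1}
        System.out.println("node aggregated: " + aggregateNode(List.of(c1, c2))); // {a=3, b=1, c=1}
    }
}
```

With per-task combining alone, key "a" would be shuffled to the reducer twice (once per task); after the node-level step it is shuffled once. This only works because summing is associative and commutative, which is the same requirement a regular combiner has.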
It can also be implemented in Tez, as Bikas and Gopal mentioned.

Thanks,

On Wed, Oct 23, 2013 at 1:28 AM, Bikas Saha <bi...@hortonworks.com> wrote:
> +1. A node level or rack level or any level intermediate combiner is
> fairly straightforward to add in Tez. Please carry over your question to
> the Apache Tez dev mailing list d...@tez.incubator.apache.org if you are
> interested in following that path.
>
> Bikas
>
> -----Original Message-----
> From: go...@hortonworks.com [mailto:go...@hortonworks.com] On Behalf Of
> Gopal Vijayaraghavan
> Sent: Tuesday, October 22, 2013 9:03 AM
> To: common-dev@hadoop.apache.org
> Subject: Re: Combiner Execution
>
> Hi,
>
> I'll answer your questions in reverse.
>
>> According to http://developer.yahoo.com/hadoop/tutorial/module4.html the
>> output is already combined over all Mappers in a node. But we can not find
>> how this is happening. Can someone point us to where this combiner is
>> executed?
>
> You'll find the Combiner runner somewhere buried inside MapTask.java, hunt
> for the combinerRunner in there.
>
> The Combiner only combines the output of a single map-task (after
> sorting). This kicks in only if the number of spills in that one map-task
> exceeds minSpillsForCombine.
>
> It does not do any cross-task actions and the MR framework (as it is
> today) doesn't leave enough room for scheduling a cross-task activity (i.e.
> MR is strictly bi-partite).
>
>> For a class project my group and I are looking to experiment with
>> combining the output from Mappers on the same node or in the same rack. We
>> found the idea at http://wiki.apache.org/hadoop/HadoopResearchProjects.
>
> Your general idea is sort of chalked out in Apache Tez (per-host/per-rack
> multi-level combiner trees, which is designed to be more flexible with its
> plumbing) -
> https://issues.apache.org/jira/browse/TEZ-145
>
> Cheers,
> Gopal
>
> --
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified
> that any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender
> immediately and delete it from your system. Thank You.

--
- Tsuyoshi