Re: MultithreadedMapper

kenyh Thu, 26 Jul 2012 23:14:38 -0700

A multithreaded mapper gets more chances to combine the mapper output, and
the locality of some global data is also better. But the implementation in
Hadoop 1.0.2 uses heavy synchronization, which adds a lot of overhead. Are
there any optimizations for the multithreaded mapper?
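
To make the contention point concrete, here is a minimal sketch in plain Java.
It is not Hadoop's code. BatchedOutputSketch, SharedSink and run are
hypothetical names made up for illustration. It assumes the cost comes from
every worker thread taking a shared lock once per output record, and it shows
one possible mitigation: each thread buffers records privately and takes the
lock once per batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not Hadoop code: worker threads share one output sink,
// much as the map threads of a multithreaded mapper share one output path.
final class BatchedOutputSketch {

    // Stand-in for the shared, synchronized output path (assumption).
    static final class SharedSink {
        private final List<String> records = new ArrayList<>();
        private final AtomicLong lockAcquisitions = new AtomicLong();

        // One lock acquisition per call, however many records are passed in.
        synchronized void writeAll(List<String> batch) {
            lockAcquisitions.incrementAndGet();
            records.addAll(batch);
        }

        synchronized int size() {
            return records.size();
        }

        // Number of writeAll calls, i.e. how often writers took the lock.
        long lockAcquisitions() {
            return lockAcquisitions.get();
        }
    }

    // Runs `threads` workers. Each emits `perThread` records and flushes its
    // private buffer to the sink every `batchSize` records.
    static SharedSink run(int threads, int perThread, int batchSize) {
        SharedSink sink = new SharedSink();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                List<String> local = new ArrayList<>(batchSize);
                for (int i = 0; i < perThread; i++) {
                    local.add("thread" + id + "-record" + i);
                    if (local.size() == batchSize) {
                        sink.writeAll(local); // records are copied under the lock
                        local.clear();
                    }
                }
                if (!local.isEmpty()) {
                    sink.writeAll(local); // flush the remainder
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return sink;
    }
}
```

With 4 threads and 1000 records each, run(4, 1000, 1) takes the writer lock
4000 times, while run(4, 1000, 100) takes it 40 times for the same 4000
records. Whether this helps in a real job depends on where the time actually
goes, so treat it as an illustration of the idea only.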



syscokid wrote:
> 
> Why multithread the mapper? Just create more mappers. That way you spread
> the data load as well as the mapping load potentially across multiple
> nodes.
> 
> 
> kenyh wrote:
>> 
>> I wonder if there are any optimizations for the multithreaded mapper to
>> reduce contention on input reading and output?
>> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/MultithreadedMapper-tp34213805p34219011.html
Sent from the Hadoop core-dev mailing list archive at Nabble.com.
