Almost all operations in Hive can exploit MapReduce for parallelism
(it isn't really done at the thread level). Essentially, if you run a
Hive job and there are multiple mappers or reducers, that is the parallelism.
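
For reference, here is a rough sketch of the session settings that influence
this, assuming a Hive of this era running on classic MapReduce. The values
are only illustrative, not recommendations:

```sql
-- Let independent stages of a single query run at the same time.
set hive.exec.parallel=true;
-- Upper bound on how many stages may run concurrently (illustrative value).
set hive.exec.parallel.thread.number=8;
-- Input bytes per reducer, used when Hive estimates the reducer count (illustrative value).
set hive.exec.reducers.bytes.per.reducer=256000000;
-- Alternatively, force a fixed number of reducers per job (illustrative value).
set mapred.reduce.tasks=32;
```

The number of mappers generally follows from the number of input splits, so
it is driven by input size and split settings rather than set directly.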

On Fri, Jun 22, 2012 at 5:14 AM, Jayanth Muthya <jayanthmut...@gmail.com> wrote:
> Thanks for clarifying, I'll look into it too and see if I can find anything.
>
> -Jayanth
>
> On Thu, Jun 21, 2012 at 10:47 PM, Jerome Banks <jer...@klout.com> wrote:
>
>> set hive.exec.parallel=true;
>>
>> This will run Hive jobs in parallel, if they are able to do so.
>>
>> As for multi-threading in the actual job itself, I don't think so, but I'm
>> not sure. The query planner will merge steps together, in order to try to
>> minimize the number of MR jobs needed to run a query, but I think those are
>> chained together in a single thread, on both the mapper and the reducer.
>>
>> When I was at Quantcast, we had some multi-threading in the mappers and
>> reducers, to try to increase throughput by utilizing the CPU when the job
>> would otherwise be blocked on IO. This helps if your IO is very slow,
>> but once IO is no longer the bottleneck, you spend a lot of time
>> context-switching, and it is no longer efficient.
>>
>> Interesting question, I'll look into it some more. Let me know if you find
>> out anything.
>>
>> -- jerome
>>
>> On Thu, Jun 21, 2012 at 1:16 AM, Jayanth Muthya <jayanthmut...@gmail.com> wrote:
>>
>> > Hi,
>> > I was looking into some of the source code for Hive and had a few
>> > questions regarding parallelism in Hive. Can a map task in
>> > Hive exploit parallelism and run multiple threads? If it can,
>> > does it do so by default, or does a user have to configure the settings?
>> > This question seems really basic; I just started looking into Hadoop/Hive.
>> > Thanks in advance!
>> >
>> > -Jay
>> >
>>
