If you expect a MapReduce job to be faster than a Scan on small data,
your expectation is wrong.

Every MR job has a fixed startup cost of a few seconds, and you can't
get around it.

What other people have been trying to tell you is that you don't have
enough data to benefit from the parallel execution advantages of
Hadoop and HBase.
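
To make the trade-off concrete, here is a rough back-of-the-envelope
model, not a measurement. The class name, the 5-second startup cost, the
rows-per-second rate and the task count are all made-up assumptions for
illustration; plug in your own numbers.

```java
public class JobCostSketch {
    // Hypothetical fixed startup cost of one MapReduce job, in seconds.
    static final double MR_STARTUP_SECONDS = 5.0;

    // Time for a single client-side Scan reading every row sequentially.
    static double scanSeconds(long rows, double rowsPerSecond) {
        return rows / rowsPerSecond;
    }

    // Time for an MR job: fixed startup plus the rows split evenly over tasks.
    static double mrSeconds(long rows, double rowsPerSecond, int parallelTasks) {
        return MR_STARTUP_SECONDS + rows / (rowsPerSecond * parallelTasks);
    }

    // Row count at which both approaches take the same time under this model.
    // Solves rows/r = S + rows/(r*p) for rows, giving S*r*p/(p-1).
    static double breakEvenRows(double rowsPerSecond, int parallelTasks) {
        if (parallelTasks < 2) {
            throw new IllegalArgumentException("need at least 2 parallel tasks");
        }
        return MR_STARTUP_SECONDS * rowsPerSecond * parallelTasks / (parallelTasks - 1);
    }

    public static void main(String[] args) {
        double rate = 10_000.0; // assumed rows per second per reader
        int tasks = 5;          // assumed number of parallel map tasks
        for (long rows : new long[] {10_000L, 1_000_000L}) {
            System.out.printf("rows=%d scan=%.1fs mr=%.1fs%n",
                    rows, scanSeconds(rows, rate), mrSeconds(rows, rate, tasks));
        }
        System.out.printf("break-even at about %.0f rows%n", breakEvenRows(rate, tasks));
    }
}
```

Below the break-even row count the startup cost dominates and the plain
Scan wins; above it the parallelism starts to pay off.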

J-D

On Wed, Jun 8, 2011 at 4:43 AM, Andre Reiter <[email protected]> wrote:
> cool, just one change
>
> scan.setCaching(1000);
>
> reduced the processing time of my MR job from 60sec to 10sec !
> nice :-)
>
> PS: now looking for other optimizations...
>
>
>
> Stack wrote:
>>
>> See http://hbase.apache.org/book/performance.html
>> St.Ack
>>
>
>
