Re: full table scan

Andre Reiter Sat, 11 Jun 2011 01:37:21 -0700

Jean-Daniel Cryans wrote:

> You expect a MapReduce job to be faster than a Scan on small data,
> your expectation is wrong.


I never expected an MR job to be faster in every context.

> There's a minimal cost to every MR job, which is of a few seconds, and
> you can't go around it.


Sure, every MR job has some overhead, and a few seconds are OK, but not a
whole minute...

So how long should a full scan of, e.g., 1,000,000,000 rows take on an
HBase cluster with, e.g., 3 region servers?
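
To put rough numbers on that question, here is a back-of-envelope sketch in
Python. It assumes rows are spread evenly over the region servers, that all
servers scan in parallel, and that a fixed job startup overhead is added on
top. The per-server scan rates and the 30-second overhead are placeholder
assumptions for illustration, not measurements from any HBase cluster; the
function names are hypothetical as well.

```python
def estimated_scan_seconds(total_rows, region_servers,
                           rows_per_sec_per_server, job_overhead_sec=0.0):
    """Idealised estimate: even row distribution, fully parallel servers."""
    if region_servers <= 0 or rows_per_sec_per_server <= 0:
        raise ValueError("server count and scan rate must be positive")
    scan_time = total_rows / (region_servers * rows_per_sec_per_server)
    return job_overhead_sec + scan_time


def format_duration(seconds):
    """Render a number of seconds as 'Hh MMm SSs'."""
    minutes, secs = divmod(int(round(seconds)), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


if __name__ == "__main__":
    # Assumed per-server scan rates (rows/second); replace with measured values.
    for rate in (10_000, 50_000, 200_000):
        t = estimated_scan_seconds(1_000_000_000, 3, rate, job_overhead_sec=30)
        print(f"{rate:>8} rows/s per server -> {format_duration(t)}")
```

The estimate scales linearly with the row count and inversely with the
number of servers, so the useful input is a scan rate measured on the real
cluster with a small sample of the table.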

I'm just wondering whether it's worth running the full scan only once a day
and persisting the results.
I had hoped to be able to process it on demand, but if it takes too much
time, that's not acceptable.

andre
