Re: Hive or Pig - Which one gives best performance for reading HBase data

Jörn Franke Wed, 14 Sep 2016 12:06:05 -0700

They should perform rather similarly. You may gain some performance by using 
Tez or Spark as the execution engine, but in an export scenario do not expect 
much improvement.
In any scenario, avoid having only one reducer. Use several, e.g. by exporting 
to multiple output files instead of one. This avoids funnelling all the data 
over the network to a single node.
It might be worth checking Pherf (part of Apache Phoenix) for exporting 
selected columns from HBase.
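
A minimal Hive sketch of such an export, under stated assumptions: the HBase 
table name `big_table`, the column family `cf`, the column names and the output 
path are all hypothetical placeholders. Only the wanted columns are mapped, and 
a plain SELECT like this normally runs without a single final reducer, so the 
output directory receives several files rather than one.

```sql
-- Optional: try Tez as the execution engine (only if it is installed).
-- SET hive.execution.engine=tez;

-- Map only the needed HBase columns into a Hive external table.
-- 'big_table', 'cf' and the column names are hypothetical placeholders.
CREATE EXTERNAL TABLE hbase_export_src (
  rowkey STRING,
  col1   STRING,
  col2   STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:col1,cf:col2")
TBLPROPERTIES ("hbase.table.name" = "big_table");

-- Write the selected columns as delimited text files.
-- No ORDER BY or aggregation, so nothing forces a single reducer.
INSERT OVERWRITE DIRECTORY '/tmp/hbase_export'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT rowkey, col1, col2
FROM hbase_export_src;
```

If a single flat file is needed in the end, the part files can be concatenated 
afterwards, e.g. with `hdfs dfs -getmerge`.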


> On 14 Sep 2016, at 19:19, Nagabhushanam Bheemisetty <nbheemise...@gmail.com> 
> wrote:
> 
> Hi,
> 
> I have a situation where I need to read data from a huge HBase table and dump 
> it to another location as a flat file. I am not interested in all the columns; 
> I need only, let's say, 10 out of 100+ columns. So which technology, Hive or 
> Pig, gives better performance? I believe both of them will use a SerDe to 
> extract and build each record.
> 
> Any thoughts?
> 
> Thanks
