Re: Can a bucket be added to a partition?

2013-05-01 Thread Jie Li
I tried this interesting idea but also found it a little confusing. I guess you'll need to change the table schema so that it has both buckets and partitions. And to take advantage of the buckets inside the partitions, for example using the bucket map join, you'll need to specify one particular parti

Re: Huge join performance issue

2013-04-27 Thread Jie Li
To help us understand the performance and identify the bottlenecks, could you do two things: 1) run the EXPLAIN command and share the output with us; 2) share the Hadoop job histories generated by the query. They can be collected following http://www.cs.duke.edu/starfish/tutorial/jo

Hive Operator Counters

2013-02-05 Thread Jie Li
Hi all, Has anyone noticed that the operator counters are not properly maintained? They are useful for understanding the query plan and execution, e.g. how many rows each operator processes and produces, and how much time each operator spends. NUM_INPUT_ROWS NUM_OUTPUT_ROWS TIME_TAKEN T

Map-only aggregation

2013-01-04 Thread Jie Li
Hi all, Can Hive implement an aggregation as a map-only job? As we know, the data may be pre-partitioned via PARTITIONED BY or CLUSTERED BY, so we don't need the reduce phase to repartition the data. The bucket join seems to take advantage of the buckets for joins, so I wonder if there is some simi
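To make the reasoning behind this question concrete, here is a small Python sketch (not Hive code, and not how Hive implements anything). It assumes every group key lives in exactly one bucket, so each mapper can aggregate its own bucket and the per-bucket results are already final. The bucket layout and all function names are made up for illustration.

```python
from collections import Counter


def bucket_rows(rows, num_buckets):
    """Mimic bucketing by key: each (key, value) row goes to one bucket by key hash."""
    buckets = [[] for _ in range(num_buckets)]
    for key, value in rows:
        buckets[hash(key) % num_buckets].append((key, value))
    return buckets


def map_only_sum(bucket):
    """Aggregate a single bucket locally, as one mapper would."""
    totals = Counter()
    for key, value in bucket:
        totals[key] += value
    return dict(totals)


def aggregate_without_reduce(buckets):
    """Concatenate per-bucket results; no shuffle or merge step is needed."""
    result = {}
    for bucket in buckets:
        partial = map_only_sum(bucket)
        # A key never spans two buckets, so partial results are disjoint and final.
        assert result.keys().isdisjoint(partial.keys())
        result.update(partial)
    return result


if __name__ == "__main__":
    rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
    print(aggregate_without_reduce(bucket_rows(rows, 3)))
```

The sketch only holds when the grouping key is the same as the bucketing key; grouping on any other column would still need a repartitioning step.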

Re: HiveHistoryViewer concurrency problem

2013-01-04 Thread Jie Li
Hi Qiang, Could you describe how HiveHistoryViewer is used? I'm also looking for a tool to understand the Hive log. Thanks, Jie
On Sat, Jan 5, 2013 at 9:54 AM, Qiang Wang wrote:
> Does anybody have an idea about this?
>
> https://issues.apache.org/jira/browse/HIVE-3857
>
> 2013/1/4 Qiang Wang

Re: A tool to analyze and tune performance for Hive?

2012-12-13 Thread Jie Li
the query or jobs? It'll be nice to have some sample data available, so users can try a quick demo. Jie
On Thu, Dec 13, 2012 at 9:12 PM, Zheng, Kai wrote:
> You may have a try for HiTune & HiBench. Just google for them.
>
> -----Original Message-----
> From: Jie Li [mai

A tool to analyze and tune performance for Hive?

2012-12-13 Thread Jie Li
Hi everyone, May I know if there is any tool available to analyze and tune the performance of Hive queries? And what is the state of the art? I have some experience tuning Pig, based on manually clicking through JT web pages, collecting pieces of data from here and there, and guessing what might be

Re: Specify per-query configuration via a file

2012-07-08 Thread Jie Li
Perfect! Yeah, there is a "--config" option to specify the Hive configuration directory. Thanks! Jie
On Sun, Jul 8, 2012 at 6:26 PM, Edward Capriolo wrote:
> You can already do that; Hive allows you to specify different configs. Try
> hive --help for more info.
>
> On Sun

Re: Specify per-query configuration via a file

2012-07-08 Thread Jie Li
eld, please excuse typos.
> -----Original Message-----
> From: Jie Li
> Date: Sat, 7 Jul 2012 16:31:50
> To:
> Reply-To: user@hive.apache.org
> Subject: Specify per-query configuration via a file
>
> Hi all,
>
> Besides "-hiveconf x=y", does Hive support specifying per-query
> configuration via a file?
>
> Thanks,
> Jie

Specify per-query configuration via a file

2012-07-07 Thread Jie Li
Hi all, Besides "-hiveconf x=y", does Hive support specifying per-query configuration via a file? Thanks, Jie
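One possible workaround, sketched under assumptions rather than as a built-in Hive feature: keep the settings in a plain key=value file and expand them into repeated "-hiveconf x=y" arguments before launching the CLI. The file format, the helper names, the query file name, and the sample property values below are all hypothetical; only the "-hiveconf" and "-f" options come from the Hive CLI itself.

```python
import shlex


def parse_conf(text):
    """Parse 'key=value' lines; skip blank lines and '#' comments."""
    settings = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"not a key=value line: {raw!r}")
        settings.append((key.strip(), value.strip()))
    return settings


def hive_argv(settings, query_file):
    """Build an argument list: one -hiveconf pair per setting, then -f <file>."""
    argv = ["hive"]
    for key, value in settings:
        argv += ["-hiveconf", f"{key}={value}"]
    argv += ["-f", query_file]
    return argv


if __name__ == "__main__":
    # Hypothetical per-query settings; the values are illustrative only.
    sample = """
    # settings for one query
    mapred.reduce.tasks=32
    hive.exec.parallel=true
    """
    print(shlex.join(hive_argv(parse_conf(sample), "query.hql")))
```

The resulting list can be passed to subprocess.run as-is, which avoids shell quoting issues for values that contain spaces.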