Question on MAPJOIN Vs JOIN performance

2015-04-15 Thread Harsha HN
Hi All, I went through the Facebook engineering page mentioned below: https://www.facebook.com/notes/facebook-engineering/join-optimization-in-apache-hive/470667928919 I set the following for auto conversion of joins: set hive.auto.convert.join=true; set hive.mapjoin.smalltable.filesize=10;(

Re: [ANNOUNCE] New Hive Committer - Mithun Radhakrishnan

2015-04-15 Thread @Sanjiv Singh
Congratulations !!! Regards Sanjiv Singh Mob : +091 9990-447-339 On Wed, Apr 15, 2015 at 11:47 PM, Sergey Shelukhin wrote: > Congrats! > > From: , Cheng A > Reply-To: "user@hive.apache.org" > Date: Tuesday, April 14, 2015 at 18:03 > To: "user@hive.apache.org" , "d...@hive.apache.org" < >

Re: Extremely Slow Data Loading with 40k+ Partitions

2015-04-15 Thread Daniel Haviv
How many reducers are you using? Daniel > On 16 Apr 2015, at 00:55, Tianqi Tong wrote: > > Hi, > I'm loading data to a Parquet table with dynamic partitions. I have 40k+ > partitions, and I have skipped the partition stats computation step. > Somehow it's still extremely slow loading data in

RE: [ANNOUNCE] New Hive Committer - Mithun Radhakrishnan

2015-04-15 Thread Xu, Cheng A
Please send mail to user-unsubscr...@hive.apache.org. From: Teja Kunapareddy [mailto:tejakunapare...@gmail.com] Sent: Thursday, April 16, 2015 8:37 AM To: user@hive.apache.org Subject: RE: [ANNOUNCE] New Hive Committer - Mithun Radhakrishnan unsubscribe

RE: [ANNOUNCE] New Hive Committer - Mithun Radhakrishnan

2015-04-15 Thread Teja Kunapareddy
unsubscribe From: Vaibhav Gumashta [mailto:vgumas...@hortonworks.com] Sent: Wednesday, April 15, 2015 5:08 PM To: user@hive.apache.org; venkatanathen kannan; d...@hive.apache.org; Chris Drome Cc: mit...@apache.org Subject: Re: [ANNOUNCE] New Hive Committer - Mithun Radhakrishnan Congrats M

Re: [ANNOUNCE] New Hive Committer - Mithun Radhakrishnan

2015-04-15 Thread Vaibhav Gumashta
Congrats Mithun. -Vaibhav From: venkatanathen kannan <venkatanat...@yahoo.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org>, venkatanathen kannan <venkatanat...@yahoo.com> Date: Wednesday, April 15, 2015 at 11:40 AM To: "user@hive.ap

Extremely Slow Data Loading with 40k+ Partitions

2015-04-15 Thread Tianqi Tong
Hi, I'm loading data to a Parquet table with dynamic partitions. I have 40k+ partitions, and I have skipped the partition stats computation step. Somehow it's still extremely slow loading data into partitions (800MB/h). Do you have any hints on the possible reason and solution? Thank you Tianqi T

Re: [ANNOUNCE] New Hive Committer - Mithun Radhakrishnan

2015-04-15 Thread venkatanathen kannan
Congrats Mithun ! On Wednesday, April 15, 2015 2:18 PM, Sergey Shelukhin wrote: Congrats! From: , Cheng A Reply-To: "user@hive.apache.org" Date: Tuesday, April 14, 2015 at 18:03 To: "user@hive.apache.org" , "d...@hive.apache.org" , Chris Drome Cc: "mit...@apache.org" Subjec

Re: [ANNOUNCE] New Hive Committer - Mithun Radhakrishnan

2015-04-15 Thread Sergey Shelukhin
Congrats! From: , Cheng A <cheng.a...@intel.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Tuesday, April 14, 2015 at 18:03 To: "user@hive.apache.org" <user@hive.apache.org>, "d...@hive.apache.org

Re: External Table with unclosed orc files.

2015-04-15 Thread Alan Gates
So it was in the map reduce job itself? Then the best information would be the logs from the MR job, so we can see what it's doing (or perhaps not doing). Alan. Grant Overby (groverby) wrote: > It wasn’t reliably reproducible for us. If we killed the compaction > job in yarn and manually triggere

RE: [ANNOUNCE] New Hive Committer - Mithun Radhakrishnan

2015-04-15 Thread Teja Kunapareddy
unsubscribe From: kulkarni.swar...@gmail.com [mailto:kulkarni.swar...@gmail.com] Sent: Wednesday, April 15, 2015 9:40 AM To: d...@hive.apache.org; Viraj Bhat Cc: user@hive.apache.org; mit...@apache.org Subject: Re: [ANNOUNCE] New Hive Committer - Mithun Radhakrishnan Congratulations!!

Re: [ANNOUNCE] New Hive Committer - Mithun Radhakrishnan

2015-04-15 Thread kulkarni.swar...@gmail.com
Congratulations!! On Wed, Apr 15, 2015 at 10:57 AM, Viraj Bhat wrote: > Mithun Congrats!! > Viraj > > From: Carl Steinbach > To: d...@hive.apache.org; user@hive.apache.org; mit...@apache.org > Sent: Tuesday, April 14, 2015 2:54 PM > Subject: [ANNOUNCE] New Hive Committer - Mithun Radha

Re: [ANNOUNCE] New Hive Committer - Mithun Radhakrishnan

2015-04-15 Thread Viraj Bhat
Mithun Congrats!! Viraj From: Carl Steinbach To: d...@hive.apache.org; user@hive.apache.org; mit...@apache.org Sent: Tuesday, April 14, 2015 2:54 PM Subject: [ANNOUNCE] New Hive Committer - Mithun Radhakrishnan The Apache Hive PMC has voted to make Mithun Radhakrishnan a committer o

RE: [Hive 0.13.1] - Explanation/confusion over "Fatal error occurred when node tried to create too many dynamic partitions" on small dataset with dynamic partitions

2015-04-15 Thread Mich Talebzadeh
Hi, I believe partitioning followed by hash cluster allows only up to 32 buckets within a single partition? HTH, Mich NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you

[Hive 0.13.1] - Explanation/confusion over "Fatal error occurred when node tried to create too many dynamic partitions" on small dataset with dynamic partitions

2015-04-15 Thread Daniel Harper
Hi there, We've been encountering the exception Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveFatalException: [Error 20004]: Fatal error occurred when node tried to create too many dynamic partitions. The maximum number of dynamic partitions is controlled by hive.e

Re: External Table with unclosed orc files.

2015-04-15 Thread Grant Overby (groverby)
It wasn’t reliably reproducible for us. If we killed the compaction job in yarn and manually triggered compaction for the same partition, it would succeed. We would see this about 1 time every 2 days / 200 partitions. There weren’t any errors logged that we noticed. The job was simply sitting th

Re: Dataset for hive

2015-04-15 Thread venkatanathen kannan
Hi Gopal & Xiaohe, Thanks for sharing. Thanks, VK On Wednesday, April 15, 2015 9:23 AM, xiaohe lan wrote: I only had time to generate the data a few minutes ago. It can generate 100G of data for me in tens of minutes on my 5-node cluster. Thanks all for helping me. Regards, Xiaohe O

Question on MAPJOIN V/s JOIN performance

2015-04-15 Thread Harsha HN
Hi All, I went through the Facebook engineering page mentioned below: https://www.facebook.com/notes/facebook-engineering/join-optimization-in-apache-hive/470667928919 I set the following for auto conversion of joins: set hive.auto.convert.join=true; set hive.mapjoin.smalltable.filesize=10

Re: Dataset for hive

2015-04-15 Thread xiaohe lan
I only had time to generate the data a few minutes ago. It can generate 100G of data for me in tens of minutes on my 5-node cluster. Thanks all for helping me. Regards, Xiaohe On Fri, Apr 3, 2015 at 9:00 PM, Fabio C. wrote: > Thanks Gopal, but since it was a while ago and I didn't have to gener

Re: External Table with unclosed orc files.

2015-04-15 Thread Alan Gates
Grant Overby (groverby) wrote: > Thanks for the link to the hive streaming bolt. We rolled our own bolt > many moons ago to utilize hive streaming. We’ve tried it against 0.13 and > 0.14 . Acid tables have been a real pain for us. We don’t believe they are > production ready. At least in our use