Re: OOM/GC limit Error

2013-12-29 Thread Navis류승우
Could you post the Hive version and the execution plan for the query?
2013/12/21 Martin, Nick
> Hi all,
>
> I have two tables:
>
> tbl1: 81m rows
> tbl2: 4m rows
>
> tbl1 is partitioned on one column and tbl2 has none.
>
> I’m attempting the following query:
>
> SELECT
> tbl1

ORC file tuning

2013-12-29 Thread Avrilia Floratou
Hi all, I'm using Hive 0.12 and running some experiments with the ORC file format. The HDFS block size is 128MB and I was wondering what the best stripe size to use is. The default one (250MB) is larger than the block size. Is each stripe splittable, or in this case will each map task have to access data
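The arithmetic behind the question can be sketched as follows. This is only an illustration under simplifying assumptions that are not in the original message: stripes are laid out back to back from offset 0, with no file header, footer, or padding, and the helper name `blocks_touched` is hypothetical. It does not say whether a stripe is splittable; it only shows how many HDFS blocks a stripe of the quoted size would overlap.

```python
MB = 1024 * 1024


def blocks_touched(stripe_index, stripe_size, block_size):
    """Return the HDFS block indices overlapped by one stripe's byte range.

    Assumption: stripes are contiguous and start at offset 0. A real ORC
    file also has a header, footer, and possibly padding, so actual
    offsets will differ.
    """
    start = stripe_index * stripe_size
    end = start + stripe_size - 1  # last byte belonging to this stripe
    return list(range(start // block_size, end // block_size + 1))


if __name__ == "__main__":
    # Sizes taken from the message: 250MB stripes on 128MB blocks.
    for i in range(3):
        print(i, blocks_touched(i, 250 * MB, 128 * MB))
```

Under these assumptions, any stripe larger than the block size necessarily overlaps more than one block, so a task reading a whole stripe would have to fetch at least part of it from a block other than the one where the stripe starts.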

Re: Hive, datanucleus, jdbc, localmode.

2013-12-29 Thread Jay Vyas
Yes this blog + hive_test should be merged into a jira and then officially integrated into hive in some way I think.
> On Dec 29, 2013, at 8:44 AM, Edward Capriolo wrote:
>
> This article describes exactly what hive test does.
>
> :)
>
>> On Sat, Dec 28, 2013 at 9:18 PM, Jay Vyas wrote:
>

Re: [DISCUSS] Proposed Changes to the Apache Hive Project Bylaws

2013-12-29 Thread Thejas Nair
On Sun, Dec 29, 2013 at 12:06 AM, Lefty Leverenz wrote:
> Let's discuss annual rotation of the PMC chair a bit more. Although I
> agree with the points made in favor, I wonder about frequent loss of
> expertise and needing to establish new relationships. What's the ramp-up
> time?
The ramp up t

Re: [DISCUSS] Proposed Changes to the Apache Hive Project Bylaws

2013-12-29 Thread Edward Capriolo
"Also something similar to this was said in the thread, "my +1 issue was never committed for N days". Will new bylaws solve this problem? What is the root of the problem?" For the record, I will suggest the root of this problem is that there are too many people working in "silos", and as the proj

Re: [DISCUSS] Proposed Changes to the Apache Hive Project Bylaws

2013-12-29 Thread Edward Capriolo
I think the term is good. We do not want to have a lame duck scenario. Strictly, the chair is only responsible for this: http://www.apache.org/dev/pmc.html#chair In many of the other ASF projects the chair is a more active organizer of the committers. I have seen chairs suggest road maps, hang ou

Re: Dynamic columns in Hive Table - Best Design for the problem

2013-12-29 Thread Edward Capriolo
Basically, when you have data like this, it is best to treat all the columns as a single string and write a tool to break the entire row apart. You could use a UDF or a UDTF actually. Look at something like parseUrl... select myRow(row) as id string, events List A UDTF allows you to produ
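The row-splitting idea described above can be sketched outside Hive like this. It is a minimal Python illustration, not a Hive UDF or UDTF, and everything specific in it is an assumption rather than something from the thread: the row is taken to be a comma-delimited string whose first field is the id and whose remaining fields are events, and the names `split_row` and `explode_row` are hypothetical.

```python
def split_row(row, delimiter=","):
    """Split one raw row string into (id, list of events).

    Assumption: the first field is the id and every later non-empty
    field is an event. The real data layout in the thread is not shown.
    """
    fields = [field.strip() for field in row.split(delimiter)]
    row_id = fields[0]
    events = [field for field in fields[1:] if field]
    return row_id, events


def explode_row(row, delimiter=","):
    """Yield one (id, event) pair per event, similar in spirit to how a
    table-generating function turns one input row into several rows."""
    row_id, events = split_row(row, delimiter)
    for event in events:
        yield row_id, event


if __name__ == "__main__":
    print(split_row("42,login,click,logout"))
    print(list(explode_row("42,login,click,logout")))
```

The first helper mirrors the "one row in, one structured value out" shape of a plain UDF, while the generator mirrors the "one row in, many rows out" shape that the message attributes to a UDTF.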

Re: Dynamic columns in Hive Table - Best Design for the problem

2013-12-29 Thread Raj Hadoop
Matt, Thanks for the suggestion. Can you please provide more details on what type of UDAF I should develop? I have never worked on a UDAF before, but would like to explore it. Any tips on how to proceed? Thanks, Raj
On Saturday, December 28, 2013 2:47 PM, Matt Tucker wrote: It looks li

Re: Hive, datanucleus, jdbc, localmode.

2013-12-29 Thread Edward Capriolo
This article describes exactly what hive test does. :)
On Sat, Dec 28, 2013 at 9:18 PM, Jay Vyas wrote:
> Anyone try this yet :
>
> http://hadoop-pig-hive-thejas.blogspot.com/2013/04/running-hive-in-local-mode.html
>
> ?
>
> On Sat, Dec 28, 2013 at 8:09 PM, Jay Vyas wrote:
>
>> -Local mod

Re: [DISCUSS] Proposed Changes to the Apache Hive Project Bylaws

2013-12-29 Thread Lefty Leverenz
Let's discuss annual rotation of the PMC chair a bit more. Although I agree with the points made in favor, I wonder about frequent loss of expertise and needing to establish new relationships. What's the ramp-up time? Could a current chair be chosen for another consecutive term? Could two chair