RE: Question about how to add the debug info into the hive core jar

2013-03-20 Thread java8964 java8964
I am not sure the existing logging information is enough for me. The exception trace is as follows: Caused by: java.lang.IndexOutOfBoundsException: Index: 8, Size: 8 at java.util.ArrayList.rangeCheck(ArrayList.java:604) at java.util.ArrayList.get(ArrayList.java:382) at org.apache.hadoop.hive.ser

Re: Question about how to add the debug info into the hive core jar

2013-03-20 Thread Abdelrhman Shettia
Hi Yong, Have you tried running the Hive query in debug mode? The Hive log level can be changed by passing the following conf when starting the hive client.
# hive -hiveconf hive.root.logger=ALL,console -e " DDL statement ;"
# hive -hiveconf hive.root.logger=ALL,console -f ddl.sql ;
Hope this helps

Re: Where is exception stacktrace and root cause for UDF

2013-03-20 Thread Harsh J
They should be viewable if you enable DEBUG mode: hive -hiveconf hive.root.logger=DEBUG,console On Thu, Mar 21, 2013 at 1:31 AM, Marc Limotte wrote: > Hi. > > I'm trying to understand what happens to Exceptions that are thrown by > custom UDFs. I have a UDF that throws a RuntimeException wi

Re: Using TABLESAMPLE on inner queries

2013-03-20 Thread Ramki Palle
You may use percent-based sampling (block sampling) for non-bucketed tables, though there are some restrictions. https://cwiki.apache.org/Hive/languagemanual-sampling.html Regards, Ramki. On Wed, Mar 20, 2013 at 12:27 PM, Mark Grover wrote: > Hey Dean, > I am not a power user of the sampling f

Re: Using TABLESAMPLE on inner queries

2013-03-20 Thread Mark Grover
Hey Dean, I am not a power user of the sampling feature but my understanding was that sampling in Hive only works on bucketed tables. I am happy to be corrected though. Mark On Wed, Mar 20, 2013 at 12:20 PM, Dean Wampler < dean.wamp...@thinkbiganalytics.com> wrote: > Mark, > > Aside from what mi

Re: Using TABLESAMPLE on inner queries

2013-03-20 Thread Dean Wampler
Mark, Aside from what might be wrong here, isn't it true that sampling with the bucket clause still works on non-bucketed tables; it's just inefficient because it still scans the whole table? Or am I an idiot? ;) dean On Wed, Mar 20, 2013 at 2:17 PM, Mark Grover wrote: > Hi Robert, > Sampling i
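
The point raised above, that a bucket-style sample of a non-bucketed table still has to read every row, can be illustrated with a small sketch. This is not Hive's implementation: the function name bucket_sample, the CRC32 stand-in hash, and the made-up rows are all assumptions chosen for illustration. It only mimics the idea behind TABLESAMPLE(BUCKET 1 OUT OF 4 ON s), keeping rows whose hashed key falls into one of four buckets.

```python
import zlib


def bucket_sample(rows, key, bucket, num_buckets):
    """Keep rows whose hashed key lands in the requested 1-based bucket.

    Hypothetical helper: CRC32 is only a stand-in for whatever hash the
    real engine applies to the ON expression.
    """
    kept = []
    for row in rows:  # every row is visited, i.e. a full scan
        h = zlib.crc32(str(key(row)).encode("utf-8"))
        if h % num_buckets == bucket - 1:
            kept.append(row)
    return kept


# Made-up data standing in for a table with a single column "s".
rows = [{"s": f"value-{i}"} for i in range(10_000)]

# Rough analogue of: TABLESAMPLE(BUCKET 1 OUT OF 4 ON s)
sample = bucket_sample(rows, key=lambda r: r["s"], bucket=1, num_buckets=4)
print(len(sample) / len(rows))  # fraction of rows kept, expected near one quarter
```

When a table is physically bucketed on the same column, an engine can in principle read only the matching bucket files instead of filtering every row, which is the efficiency difference the question above is about.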

Re: Using TABLESAMPLE on inner queries

2013-03-20 Thread Mark Grover
Hi Robert, Sampling in Hive is based on buckets. Therefore, your table needs to be appropriately bucketed. I would recommend storing the results of your inner query in a bucketed table. See how to populate a bucketed table at https://cwiki.apache.org/Hive/languagemanual-ddl-bucketedtables.html The

Question regarding serde2.s3.S3LogDeserializer.java

2013-03-20 Thread Brandon Luth
Hello all, I recently came across the following piece of code in https://svn.apache.org/repos/asf/hive/trunk/contrib/src/java/org/apache/hadoop/hive/contrib/serde2/s3/S3LogDeserializer.java : static { StackTraceElement[] sTrace = new Exception().getStackTrace(); sTrace[

Using TABLESAMPLE on inner queries

2013-03-20 Thread Robert Li
Hi everyone, I'm trying to use the TABLESAMPLE function to sample data; however, my case is a little more complicated and I am having trouble getting it to run. I know that this works fine and gives me about 25% of the whole dataset: select distinct s from testtable TABLESAMPLE(BUCKET 1 OUT OF 4 O

Re: Does Hive support collation?

2013-03-20 Thread Alan Gates
No, Hive does not support collation at this time. Alan. On Mar 18, 2013, at 9:09 PM, Jon Klein wrote: > Hi, > > I'm using Hive for dealing with some international characters. > Does Hive have collation support so I can specify case sensitivity, accent > sensitivity or width sensitivity for stri