Re: Bucketing external tables

2013-04-16 Thread Bejoy KS
considered these, then can you please post your CLI logs here so that we can help you better. Regards, Bejoy KS From: Sadananda Hegde To: user@hive.apache.org Sent: Thursday, April 11, 2013 11:16 PM Subject: Re: Bucketing external tables I was able to load data

Re: Bucketing external tables

2013-04-11 Thread Sadananda Hegde
I was able to load data into bucketed tables. I verified that the number of files created in each of the partition folders matches the number of buckets specified in my CREATE statement. But I don't see any improvement in query speed. I tried 90, 360, and 720 buckets. I have SET

Re: Bucketing external tables

2013-04-06 Thread Mark Grover
Glad to hear! On Fri, Apr 5, 2013 at 3:02 PM, Sadananda Hegde wrote: > Thanks, Mark. > > I found the problem. For some reason, Hive is not able to write Avro > output file when the schema has a complex field with NULL option. It read > without any problem; but cannot write with that structure.

Re: Bucketing external tables

2013-04-05 Thread Sadananda Hegde
Thanks, Mark. I found the problem. For some reason, Hive is not able to write an Avro output file when the schema has a complex field with the NULL option. It reads such files without any problem, but it cannot write with that structure. For example, the insert was failing on this array-of-structures field. { "name": "Pa

Re: Bucketing external tables

2013-04-03 Thread Mark Grover
Can you please check your JobTracker logs? This is a generic error related to grabbing the Task Attempt Log URL; the real error is in the JT logs. On Wed, Apr 3, 2013 at 7:17 PM, Sadananda Hegde wrote: > Hi Dean, > > I tried inserting a bucketed hive table from a non-bucketed table using > insert ove

Re: Bucketing external tables

2013-04-03 Thread Sadananda Hegde
Hi Dean, I tried inserting into a bucketed Hive table from a non-bucketed table using an INSERT OVERWRITE with a SELECT FROM clause, but I get the following error. -- Exception in thread "Thread-225" java.lang.NullPointerExcepti

Re: Bucketing external tables

2013-03-30 Thread Dean Wampler
The table can be external. You should be able to use this data with other tools, because all bucketing does is ensure that all occurrences for records with a given key are written into the same block. This is why clustered/blocked data can be joined on those keys using map-side joins; Hive knows it
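
To illustrate the point above about records with a given key always landing in the same bucket, here is a minimal Python sketch. It is not Hive's code: `bucket_for` and `bucketize` are hypothetical helper names, the sample rows are made up, and using an integer key as its own hash is a simplifying assumption. It only shows why two tables bucketed on the same key into the same number of buckets can be joined bucket by bucket.

```python
from collections import defaultdict

def bucket_for(key: int, num_buckets: int) -> int:
    # Assumption for illustration: an integer key acts as its own hash.
    # Masking keeps the value non-negative before taking the modulo.
    return (key & 0x7FFFFFFF) % num_buckets

def bucketize(rows, key_index, num_buckets):
    # Group rows by bucket number; every row sharing a key ends up together.
    buckets = defaultdict(list)
    for row in rows:
        buckets[bucket_for(row[key_index], num_buckets)].append(row)
    return buckets

# Made-up sample data: (key, payload)
orders = [(101, "a"), (206, "b"), (101, "c")]
users = [(101, "x"), (206, "y")]

order_buckets = bucketize(orders, 0, 4)
user_buckets = bucketize(users, 0, 4)

# A join only needs to compare bucket i of one table with bucket i of the other.
joined = [
    (o, u)
    for b in order_buckets
    for o in order_buckets[b]
    for u in user_buckets.get(b, [])
    if o[0] == u[0]
]
```

Because both tables use the same key and the same bucket count, matching keys can never sit in differently numbered buckets, so each bucket pair can be joined independently.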

Re: Bucketing external tables

2013-03-30 Thread Sadananda Hegde
Thanks, Dean. Does that mean bucketing is exclusively a Hive feature and not available to others like Java, Pig, etc.? Also, my final tables have to be managed tables, not external tables, right? Thanks again for your time and help. Sadu On Fri, Mar 29, 2013 at 5:57 PM, Dean Wampler

Re: Bucketing external tables

2013-03-29 Thread Dean Wampler
I don't know of any way to avoid creating new tables and moving the data. In fact, that's the official way to do it, from a temp table to the final table, so Hive can ensure the bucketing is done correctly: https://cwiki.apache.org/Hive/languagemanual-ddl-bucketedtables.html In other words, you

Bucketing external tables

2013-03-29 Thread Sadananda Hegde
Hello, We run M/R jobs to parse and process large and highly complex XML files into Avro files. Then we build external Hive tables on top of the parsed Avro files. The Hive tables are partitioned by day, but the partitions are still huge and joins do not perform that well. So I would like to try out