Re: change column type of orc table will throw exception in query time

2014-08-21 Thread wzc
hi all: I've created a JIRA for this problem: https://issues.apache.org/jira/browse/HIVE-7847 . Thanks. 2014-08-22 1:59 GMT+08:00 wzc : > hi all: > > I test the above example with hive trunk and it still fail. After some > debugging, finally I find the cause of the problem: > > Hive use Com

Ideal Bucket Size

2014-08-21 Thread Natarajan, Prabakaran 1. (NSN - IN/Bangalore)
Hi, how can I determine an ideal bucket size? Info: 1) I have 2 billion rows in a Hive table, stored in ORC format. 2) I want to bucket on a column X. 3) Column X has 100 million unique values. 4) Reason for bucketing - I want an efficient distinct count on X - this
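
To make the numbers in the question concrete, here is a rough Python sketch of the arithmetic. It assumes X hashes evenly across buckets; the candidate bucket counts and both helper names are hypothetical and are not a recommendation from the thread. Python's built-in hash stands in for Hive's own bucketing hash, which is a different function. The second helper shows why bucketing can help a distinct count: each value lands in exactly one bucket, so per-bucket distinct counts can simply be added.

```python
# Hypothetical helpers for illustration only; not Hive APIs.

def bucket_estimates(total_rows, distinct_values, num_buckets):
    """Rough per-bucket sizes, assuming values hash uniformly."""
    return {
        "rows_per_bucket": total_rows // num_buckets,
        "distinct_per_bucket": distinct_values // num_buckets,
    }


def distinct_count_by_bucket(values, num_buckets):
    """Count distinct values bucket by bucket and sum the results.

    Every value maps to exactly one bucket, so no value is counted twice.
    """
    buckets = [set() for _ in range(num_buckets)]
    for value in values:
        # Stand-in for Hive's bucketing hash (assumption).
        buckets[hash(value) % num_buckets].add(value)
    return sum(len(bucket) for bucket in buckets)


TOTAL_ROWS = 2_000_000_000   # from the question
DISTINCT_X = 100_000_000     # from the question

for candidate in (64, 256, 1024):  # illustrative bucket counts, not advice
    print(candidate, bucket_estimates(TOTAL_ROWS, DISTINCT_X, candidate))
```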

Re: Denormalizing JSON arrays

2014-08-21 Thread Subramanian, Sanjay (HQP)
Ok guys, figured it out. Looks like the collective Hive group's positive thought waves are helping me immensely – LOL. After many experiments, this is the HQL that worked beautifully. I had forgotten that explode can take an array as a param as well! HIVE QUERY == use sansub01 ;
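
The query itself is cut off in this preview, so the following is only a conceptual Python sketch of what exploding an array does: one output row per array element, with the other fields repeated. The function name, field names and sample record are hypothetical and do not come from the thread's table.

```python
import json


def explode_rows(record, array_field):
    """Yield one flat row per element of record[array_field].

    An empty or missing array yields no rows, which is how a plain
    (non-OUTER) lateral view over explode behaves.
    """
    base = {key: value for key, value in record.items() if key != array_field}
    for element in record.get(array_field) or []:
        row = dict(base)              # copy the non-array fields
        row[array_field] = element    # attach a single array element
        yield row


# Hypothetical sample record, not the schema from the thread.
raw = (
    '{"resume_id": "r1", "scores": '
    '[{"skill": "java", "score": 7}, {"skill": "sql", "score": 9}]}'
)
rows = list(explode_rows(json.loads(raw), "scores"))
```

Each entry in rows now carries resume_id plus a single element of scores, which is the denormalized shape the original question asks for.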

Denormalizing JSON arrays

2014-08-21 Thread Subramanian, Sanjay (HQP)
Hey guys, how do I denormalize the JSON arrays with a select statement? Thanks. Warm Regards, sanjay DDL === USE sansub01 ; ADD JAR ./json-serde-1.3-SNAPSHOT-jar-with-dependencies.jar ; DROP TABLE IF EXISTS res_score ; CREATE EXTERNAL TABLE res_score ( uniqueResumeIden

Re: change column type of orc table will throw exception in query time

2014-08-21 Thread wzc
hi all: I tested the above example with Hive trunk and it still fails. After some debugging, I finally found the cause of the problem: Hive uses CombineFileRecordReader, and one CombineFileSplit often contains more than one path. In this case, the schema for these two paths (dt='20140718' vs dt=

Re: New lines causing new rows

2014-08-21 Thread Charles Robertson
I have fixed this - I was using the wrong character to replace line breaks. I was replacing \r when it needed to be \n. Regards, Charles On 18 August 2014 08:44, Charles Robertson wrote: > Hi Andre, > > Table and view definitions: > > CREATE EXTERNAL TABLE tweets_raw ( >id BIGINT, >cre
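
The fix described above can be illustrated with a small Python sketch. It is not the Hive view from the thread (which is cut off in this preview); the sample text and the helper name are hypothetical. It shows that replacing only \r leaves a \n line break in place, and that handling \r\n, \n and \r together removes every variant.

```python
def strip_line_breaks(text, replacement=" "):
    """Replace Windows (\r\n), Unix (\n) and bare \r line breaks."""
    # Handle \r\n first so a Windows line break yields one replacement, not two.
    return (
        text.replace("\r\n", replacement)
        .replace("\n", replacement)
        .replace("\r", replacement)
    )


tweet = "first line\nsecond line"    # hypothetical text with a \n line break

only_cr = tweet.replace("\r", " ")   # the mistake: the \n is left untouched
fixed = strip_line_breaks(tweet)     # the \n is replaced as intended
```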