hi all:
I've created a JIRA for this problem:
https://issues.apache.org/jira/browse/HIVE-7847 .
Thanks.
Hi,
How can I determine an ideal number of buckets?
Info:
1) I have 2 billion rows in a Hive table, stored in ORC format.
2) I want to create buckets on a column X.
3) Column X has 100 million unique values.
4) Reason for bucketing: I want an efficient distinct count on X - this
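(Not an answer from the thread, but a minimal sketch of the DDL this involves, with hypothetical table/column names and a purely illustrative bucket count:)

-- hypothetical names; 256 buckets is only an illustrative starting point,
-- not a recommendation for 100M distinct values
CREATE TABLE big_table_bucketed (
  x BIGINT,
  payload STRING
)
CLUSTERED BY (x) INTO 256 BUCKETS
STORED AS ORC;

-- on Hive releases of this era, inserts only honor the bucket spec with:
SET hive.enforce.bucketing = true;

INSERT OVERWRITE TABLE big_table_bucketed
SELECT x, payload FROM big_table;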
Ok guys,
Figured it out.
Looks like the collective Hive Groups positive thought waves are helping me
immensely – LOL.
After many experiments, this is the HQL that worked beautifully. I had
forgotten that explode can take an array as a param as well!
HIVE QUERY
==
use sansub01;
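(The HQL itself is cut off above; as a hedged reconstruction of the explode point only, not sanjay's exact query, with a hypothetical array column named scores:)

-- explode takes an array and emits one row per element
SELECT explode(scores) AS score
FROM res_score;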
Hey guys
How do I denormalize the JSON arrays with a select statement?
Thanks
Warm Regards
sanjay
DDL
===
USE sansub01;

ADD JAR ./json-serde-1.3-SNAPSHOT-jar-with-dependencies.jar;

DROP TABLE IF EXISTS res_score;

CREATE EXTERNAL TABLE res_score (
  uniqueResumeIden
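(The CREATE TABLE is cut off above. For reference, a self-contained sketch of the same pattern with hypothetical columns, assuming the jar added above is rcongiu's Hive-JSON-Serde, whose SerDe class is org.openx.data.jsonserde.JsonSerDe:)

CREATE EXTERNAL TABLE res_score_demo (
  resume_id STRING,
  scores ARRAY<STRUCT<skill:STRING, score:DOUBLE>>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/tmp/res_score_demo';

-- denormalize: one output row per element of the scores array
SELECT t.resume_id, s.skill, s.score
FROM res_score_demo t
LATERAL VIEW explode(t.scores) x AS s;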
hi all:
I tested the above example with Hive trunk and it still fails. After some
debugging, I finally found the cause of the problem:
Hive uses CombineFileRecordReader, and one CombineFileSplit often contains
more than one path. In this case, the schemas for these two paths
(dt='20140718' vs dt=
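(A common workaround while HIVE-7847 is open, not mentioned in this thread: disable split combining so paths with different partition schemas never share a split.)

-- default is org.apache.hadoop.hive.ql.io.CombineHiveInputFormat,
-- which merges small files across partition paths into one split
SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;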
I have fixed this - I was using the wrong character to replace line breaks.
I was replacing \r when it needed to be \n.
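(In HiveQL terms, a sketch of the idea, assuming the cleanup happens at query time; the text column name is hypothetical since the table DDL below is cut off:)

-- replace real newlines (\n) rather than carriage returns (\r)
SELECT regexp_replace(text, '\n', ' ') AS text_clean
FROM tweets_raw;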
Regards,
Charles
On 18 August 2014 08:44, Charles Robertson wrote:
> Hi Andre,
>
> Table and view definitions:
>
> CREATE EXTERNAL TABLE tweets_raw (
>   id BIGINT,
>   cre