Re: Using an UDF in the WHERE (IN) clause

2014-03-10 Thread Navis류승우
(KW_IN expressions)
    -> ^(TOK_FUNCTION KW_IN $precedenceEqualExpression expressions)
expressions
    : LPAREN expression (COMMA expression)* RPAREN -> expression*
    ;
The arguments of IN must be wrapped in parentheses. But it seemed not possible to use an array-returning expression i

select count * throwing exception

2014-03-10 Thread Vaibhav Jain
Hi, I have a table created by the following query: CREATE EXTERNAL TABLE IF NOT EXISTS partition_table (partkey STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.had

RE: Using an UDF in the WHERE (IN) clause

2014-03-10 Thread java8964
I don't know, from a syntax point of view, whether Hive will allow "columnA IN UDF(columnB)". What I do know is that even if the above worked, it would not do partition pruning. Partition pruning in Hive is strictly static; any dynamic values provided to the partition column won't enable partition pru
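
A minimal sketch of the static alternative this thread points toward, assuming the partition values can be computed outside Hive (for example by a wrapper script) and passed in as a variable. The variable name parts, its values, and the file name query.hql are hypothetical; mytable and partitionCol come from the original question.

```sql
-- Hypothetical invocation (values are illustrative only):
--   hive --hivevar parts="20140309,20140310" -f query.hql
-- Hive substitutes ${hivevar:parts} textually before compiling the query,
-- so the planner sees a parenthesized list of literals rather than a UDF call.
SELECT *
FROM mytable
WHERE partitionCol IN (${hivevar:parts});
```

Because the IN list consists of constants at compile time, the predicate should be eligible for static pruning; running EXPLAIN EXTENDED on the query is one way to check which partitions are actually read.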

Re: data get truncated.

2014-03-10 Thread Kim Chew
Thanks Szehon. Mine is stored as a SEQUENCEFILE, not TEXTFILE. Kim On Mon, Mar 10, 2014 at 1:25 PM, Szehon Ho wrote: > No there is no ignoring of key, you can declare a different key column if > you dont want it to be in your 'value'. Say if you want to create a table > with two fields sep

Re: data get truncated.

2014-03-10 Thread Szehon Ho
No, there is no ignoring of the key; you can declare a different key column if you don't want it to be in your 'value'. Say you want to create a table with two fields separated by some separator (say '\t' in your case?), then you would do: CREATE TABLE TEST(key INT, value STRING) ROW FORMAT DELIMITE

Re: data get truncated.

2014-03-10 Thread Kim Chew
So I have generated my input file in SequenceFile format like this: the key is typed IntWritable and the value is typed Text. For example, 167 1105|11748184969223627771|172.31.2.71|0|sta1|... And I create my table like this: CREATE TABLE if not exists KIM_TEST_SEQ ( value string) ROW FORMAT DELIMIT

cluster by in a subquery

2014-03-10 Thread Imran Akbar
Hi, I'm trying to use the "cluster by" statement in Hive to write a query like this: FROM (SELECT * FROM attribute_table CLUSTER BY id, name, value, amount) map_output INSERT OVERWRITE TABLE attributed_table SELECT TRANSFORM (map_output.id,...) USING 'python2.7 data_attribution.py' AS id, name,

Error SQLState:08S01 while connecting to remote Hive through Thrift API

2014-03-10 Thread Jegan
Hi All, I am trying to fetch some data from a remote Hive server using the Thrift API. I am running the following Hive script to fetch the data: add jar ${hiveconf:JAR_LOCATION}; drop table if exists monitor_event; create external table monitor_event (ec string, dt string, ip string, seq string) part

Using an UDF in the WHERE (IN) clause

2014-03-10 Thread Petter von Dolwitz (Hem)
Hi, I'm trying to get the following query to work. The parser doesn't like it. Is anybody aware of a workaround? SELECT * FROM mytable WHERE partitionCol IN my_udf("2014-03-10"); partitionCol is my partition column of type INT and I want to achieve early pruning. I've tried returning an array of INTs