Re: Get 'TBLPROPERTIES' of a Hive table in Java

2014-07-22 Thread Tuong Tr.
Hi, You should be able to find what you need to do in the getSchema() method inside hive/ql/src/java/org/apache/hadoop/hive/ql/io/avro/AvroGenericRecordReader.java. The table properties do not get copied into the JobConf that is passed in to the M/R job, so there is some complex logic to trac

RE: Hive Statistics

2014-07-22 Thread Navdeep Agrawal
Thank you, Nitin, for the reply. I am using a MySQL database, and I can also see a new row created for the partition, but all values are zero. I think explicitly specifying the MySQL database won't make a difference. From: Nitin Pawar [mailto:nitinpawar...@gmail.com] Sent: Tuesday, July 22, 2014 11:05 PM To: us

Hive UDF gives duplicate result regardless of parameters, when nested in a subquery

2014-07-22 Thread 丁桂涛(桂花)
Recently I developed a Hive Generic UDF *getad*. It accepts a map type and a string type parameter and outputs a string value. But I found the UDF output really confusing in different conditions. Condition A: select getad(map_col, 'tp') as tp, getad(map_col, 'p') as p, getad(map_col, 'sp')

Re: Query difference between 2 tables

2014-07-22 Thread Brenden Cobb
Perfect, thanks very much. -BC From: Gunther Hagleitner <ghagleit...@hortonworks.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Tuesday, July 22, 2014 3:03 PM To: "user@hive.apache.org" <user@hive

Re: select list for dynamic partition insert

2014-07-22 Thread Kristof Vanbecelaere
I see. The last column in u_data is unixtime while I wanted to partition on rating. I just assumed Hive would use the same-named column as the one mentioned in the partition spec. Thanks for clarifying this, I missed that bit in the documentation. On Tue, Jul 22, 2014 at 10:13 PM, Prasanth Jayach
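The positional rule mentioned above can be modeled in a few lines of Python. This is only an illustrative sketch, not Hive code: the sample rows are for illustration, the column names userid and movieid are assumed from the usual movielens layout, and partition_values is a hypothetical helper. It mimics Hive taking the dynamic partition value from the trailing column(s) of the select list by position, not by name.

```python
# Toy model of how a dynamic partition insert picks its partition value:
# by position (the trailing select-list columns), not by column name.
# Column names and sample rows are assumptions for illustration only.

COLUMNS = ("userid", "movieid", "rating", "unixtime")

ROWS = [
    (196, 242, 3, 881250949),
    (186, 302, 3, 891717742),
    (22, 377, 1, 878887116),
    (244, 51, 2, 880606923),
    (166, 346, 1, 886397596),
]


def select(rows, names):
    """Project rows onto the given column names, in the given order."""
    idx = [COLUMNS.index(n) for n in names]
    return [tuple(row[i] for i in idx) for row in rows]


def partition_values(selected_rows, n_partition_cols=1):
    """Hypothetical helper: distinct trailing values, i.e. the partitions created."""
    return {row[-n_partition_cols:] for row in selected_rows}


# "select *" leaves unixtime last, so every distinct timestamp becomes a partition.
star = partition_values(select(ROWS, COLUMNS))

# Listing the columns with rating last yields one partition per distinct rating.
fixed = partition_values(select(ROWS, ("userid", "movieid", "unixtime", "rating")))

print(len(star), "partitions with select *")
print(len(fixed), "partitions with rating last")
```

In HiveQL terms this would correspond to replacing "select *" with an explicit column list that ends in rating, which is why the original statement ran into the per-node partition limit.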

Re: select list for dynamic partition insert

2014-07-22 Thread Prasanth Jayachandran
From the error message it looks like there are too many distinct values in the partition column. Try increasing hive.exec.max.dynamic.partitions.pernode to a number >100. If you already know the number of distinct values in the partition column, try a value greater than or equal to that numb

select list for dynamic partition insert

2014-07-22 Thread Kristof Vanbecelaere
While playing with the movielens data set to learn about dynamic partitions I ran "from u_data insert overwrite table u_data_p partition (rating) select *". This failed with [Error 20004]: Fatal error occurred when node tried to create too many dynamic partitions. The maximum number of dynamic pa

Re: Hive Stats

2014-07-22 Thread Bala Krishna Gangisetty
What is the value of the "COLUMN_STATS_ACCURATE" field in the "DESCRIBE FORMATTED " output under the "Table Parameters" section? My suspicion is that it's false even after executing your analyze query. If so, Hive launches the mapred jobs. Try using the analyze query below: analyze table mytable partition(server_date=

Re: Query difference between 2 tables

2014-07-22 Thread Gunther Hagleitner
select t1.x from t1 left outer join t2 on t1.x = t2.x where t2.x is null should work. Thanks, Gunther. On Tue, Jul 22, 2014 at 11:01 AM, Brenden Cobb wrote: > Hi- > > I'm stuck on Hive .10 right now and I'm trying to figure out how to > accomplish the equivalent of a not exists or minus statem
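The left outer join workaround can be tried out on any SQL engine. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for Hive; the tables t1 and t2 and the column x come from the thread, while the sample values are made up. The select list is qualified as t1.x because an unqualified x is ambiguous in SQLite when both tables have that column.

```python
import sqlite3

# In-memory database with two small tables; the values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (x INTEGER);
    CREATE TABLE t2 (x INTEGER);
    INSERT INTO t1 VALUES (1), (2), (3), (4);
    INSERT INTO t2 VALUES (2), (4), (5);
""")

# Anti-join: keep rows of t1 that found no match in t2.
anti_join = """
    SELECT t1.x
    FROM t1
    LEFT OUTER JOIN t2 ON t1.x = t2.x
    WHERE t2.x IS NULL
    ORDER BY t1.x
"""
missing = [row[0] for row in conn.execute(anti_join)]

# The subquery form the workaround replaces, for comparison.
not_in = [
    row[0]
    for row in conn.execute(
        "SELECT x FROM t1 WHERE x NOT IN (SELECT x FROM t2) ORDER BY x"
    )
]

print(missing)
print(missing == not_in)
```

The two forms agree here because t2.x contains no NULLs; with a NULL in t2.x the NOT IN form returns no rows, while the join form still returns the unmatched values.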

RE: hive 13: dynamic partition inserts

2014-07-22 Thread Gajendran, Vishnu
Hi Prasanth, Thanks a lot for your quick response. From: Gajendran, Vishnu Sent: Tuesday, July 22, 2014 11:47 AM To: user@hive.apache.org Cc: d...@hive.apache.org Subject: RE: hive 13: dynamic partition inserts Hi Prasanth, Thanks a lot for your quick response.

RE: hive 13: dynamic partition inserts

2014-07-22 Thread Gajendran, Vishnu
Hi Prasanth, Thanks a lot for your quick response. From: Prasanth Jayachandran [pjayachand...@hortonworks.com] Sent: Tuesday, July 22, 2014 11:28 AM To: user@hive.apache.org Cc: d...@hive.apache.org Subject: Re: hive 13: dynamic partition inserts Hi Vishnu Yes.

Re: hive 13: dynamic partition inserts

2014-07-22 Thread Prasanth Jayachandran
Hi Vishnu, Yes. There is a change in the way dynamic partitions are inserted in Hive 13. The new dynamic partitioning is highly scalable and uses much less memory. Here is the related JIRA: https://issues.apache.org/jira/browse/HIVE-6455. Setting "hive.optimize.sort.dynamic.partition" to false wi

Query difference between 2 tables

2014-07-22 Thread Brenden Cobb
Hi- I'm stuck on Hive 0.10 right now and I'm trying to figure out how to accomplish the equivalent of a not exists or minus statement: Select x from t1 where x not in ( select x from t2). I know this kind of subquery is available in 0.13 but would like to find a workaround. Appreciate any sugges

RE: hive 13: dynamic partition inserts

2014-07-22 Thread Gajendran, Vishnu
adding user@hive.apache.org for wider audience From: Gajendran, Vishnu Sent: Tuesday, July 22, 2014 10:42 AM To: d...@hive.apache.org Subject: hive 13: dynamic partition inserts Hello, I am seeing a difference between hive 11 and hive 13 when inserting to a table

Re: Hive Statistics

2014-07-22 Thread Nitin Pawar
By default Hive stores the statistics in a Derby database. If you want a persistent look at column statistics, you may want to create a MySQL-based database for column statistics. Your queries look fine. On Tue, Jul 22, 2014 at 10:50 PM, Navdeep Agrawal <navdeep_agra...@symantec.com> wrote: > Hi ,

dynamic partition inserts in hive 13

2014-07-22 Thread Gajendran, Vishnu
Hello, I am seeing a difference between Hive 11 and Hive 13 when inserting into a table with dynamic partitions. In Hive 11, when I set hive.merge.mapfiles=false before doing a dynamic partition insert, I see a number of files (generated by each mapper) in the specified HDFS location as expected. B

Hive Statistics

2014-07-22 Thread Navdeep Agrawal
Hi, I am trying to compute statistics on an ORC file but I am unable to see any changes in PART_COL_STATS, even when using set hive.compute.query.using.stats=true; set hive.stats.reliable=true; set hive.stats.fetch.column.stats=true; set hive.stats.fetch.partition.stats=true; set hive.cbo.enable=tr

Hive Stats

2014-07-22 Thread Navdeep Agrawal
I am trying to compute statistics on an ORC file but I am unable to see any changes in PART_COL_STATS, even when using set hive.compute.query.using.stats=true; set hive.stats.reliable=true; set hive.stats.fetch.column.stats=true; set hive.stats.fetch.partition.stats=true; set hive.cbo.enable=true; to

Re: Drop Partition by ID

2014-07-22 Thread Devopam Mittra
Please try using an escape character around the '%' if you have not already done so. Regards, Dev. On Mon, Jul 21, 2014 at 7:32 PM, fab wol wrote: > Hi everyone, > > I have the following problem: I have a partitioned managed table (Partition > table is a string which represents a date, eg. log-date="2014-07-

Querying arrays of structs with regular expressions or like/rlike functions

2014-07-22 Thread Roberto Coluccio
Hello folks, I am performing some tests with Hive 0.12.0 (cdh5.0.3). I have quite a complex data model; in particular, I modeled a field in my table as an array of structs, like: people array< struct< name:string, surname:string,

Get 'TBLPROPERTIES' of a Hive table in Java

2014-07-22 Thread Something Something
I am writing a custom InputFormat class in which I need to get the 'TBLPROPERTIES' of a Hive table. This InputFormat will extend the HiveInputFormat. I thought the table properties would be available to me in the 'JobConf', but they are not there. What's the best way to get TBLPROPERTIES of a Hive