Re: Hive Query Performance Tuning

2019-12-03 Thread Matthew Dixon
Hi Rajbir, some thoughts to consider. I’m wondering what the row_number() functionality is doing. Because the window specification has no ORDER BY clause, the result may not be deterministic. Is this the expected behaviour? I ask because analytic functions can be expensive to compute, so make sure you
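
A minimal sketch of the ordering point above. The table `events` and its columns `user_id` and `event_time` are hypothetical and do not come from the thread. Giving the window an explicit ORDER BY makes the numbering repeatable, provided the ordering columns are unique within each partition.

```sql
-- Hypothetical table and columns, for illustration only.
SELECT user_id,
       event_time,
       row_number() OVER (PARTITION BY user_id ORDER BY event_time) AS rn
FROM events;
```

If two rows in a partition share the same event_time, their relative order is still arbitrary, so a tie-breaker column may be needed in the ORDER BY.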

Re: Hive query starts own session for LLAP

2017-09-27 Thread Gopal Vijayaraghavan
> Now we need an explanation of "map" -- can you supply it? The "map" mode runs all tasks with a TableScan operator inside LLAP instances and all other tasks in Tez YARN containers. This is the LLAP + Tez hybrid mode, which introduces some complexity in debugging a single query. The "only" mod

Re: Hive query starts own session for LLAP

2017-09-26 Thread Lefty Leverenz
Thanks for the explanations of "all" and "only" Sergey. I've added them to the wiki, with minor edits: hive.llap.execution.mode . Now we need an explanation of "map" -- ca
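
For readers following along, a short sketch of inspecting and changing the property discussed in this thread. The value "only" is just one of the modes named in the messages ("map", "all", "only"); which one suits a given cluster is not something this sketch decides.

```sql
-- Print the current value for this session.
SET hive.llap.execution.mode;

-- Switch to one of the modes named in the thread (illustrative choice).
SET hive.llap.execution.mode=only;
```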

Re: Hive query starts own session for LLAP

2017-09-25 Thread Sergey Shelukhin
Hello. Hive would create a new Tez AM to coordinate the query (or use an existing one if HS2 session pool is used). However, the YARN app for Tez should only have a single container. Is this not the case? If it’s running additional containers, what is hive.llap.execution.mode set to? It should be s

Re: Hive query on ORC table is really slow compared to Presto

2017-06-22 Thread Gopal Vijayaraghavan
> 1711647 -1032220119 Ok, so this is the hashCode skew issue, probably the one we already know about. https://github.com/apache/hive/commit/fcc737f729e60bba5a241cf0f607d44f7eac7ca4 String hashcode distribution is much better in master after that. Hopefully that fixes the distinct speed issue h

Re: Hive query on ORC table is really slow compared to Presto

2017-06-21 Thread Mich Talebzadeh
With ORC tables, have you tried: set hive.vectorized.execution.enabled = true; set hive.vectorized.execution.reduce.enabled = true; SET hive.exec.parallel=true; -- set hive.optimize.ppd=true; HTH Dr Mich Talebzadeh

Re: Hive query on ORC table is really slow compared to Presto

2017-06-21 Thread Premal Shah
Gopal, Thanx for the debugging steps. Here's the output *hive> select count(1) as collisions, hash(ip) from table group by hash(ip) order by collisions desc limit 10;* 4 -1432955330 4 -317748560 4 -1460629578 4 1486313154 4 -320519155 4 1875999753 4 -141

Re: Hive query on ORC table is really slow compared to Presto

2017-06-14 Thread Gopal Vijayaraghavan
> SELECT COUNT(DISTINCT ip) FROM table - 71 seconds > SELECT COUNT(DISTINCT id) FROM table - 12,399 seconds Ok, I misunderstood your gist. > While ip is more unique that id, ip runs many times faster than id. > > How can I debug this ? Nearly the same way - just replace "ip" with "id" in my exp

Re: Hive query on ORC table is really slow compared to Presto

2017-06-14 Thread Premal Shah
Hi Gopal, Thanx for the reply. I just want to clarify a few things. 1. The count distinct ip query runs fast and so it's not a problem 2. I would not expect the ip column to use DICTIONARY encoding too 3. I am more concerned about the count distinct id or count distinct master_id column which if

Re: Hive query on ORC table is really slow compared to Presto

2017-06-12 Thread Gopal Vijayaraghavan
Hi, I think this is worth fixing because this seems to be triggered by the data quality itself - so let me dig in a bit into a couple more scenarios. > hive.optimize.distinct.rewrite is True by default FYI, we're tackling the count(1) + count(distinct col) case in the Optimizer now (which came

Re: Hive query on ORC table is really slow compared to Presto

2017-06-12 Thread Michael Segel
Silly question… What about using COUNT() and a GROUP BY instead? I’m going from memory…. this may or may not work. Since you want the row_id only in order to de-dupe, right? On Jun 12, 2017, at 3:59 PM, Premal Shah wrote: Thanx Gopal. Sorry, took me a few d
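
A sketch of the COUNT plus GROUP BY rewrite suggested above. The table name `accounts_orc` is a placeholder; `id` is the column discussed in the thread. Whether this is faster than COUNT(DISTINCT id) depends on the Hive version and the plan, so it is worth comparing both.

```sql
-- Placeholder table name; `id` is the column from the thread.
SELECT COUNT(*) AS distinct_ids
FROM (
  SELECT id
  FROM accounts_orc
  WHERE id IS NOT NULL   -- COUNT(DISTINCT id) ignores NULLs, so drop them here too
  GROUP BY id
) t;
```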

Re: Hive query on ORC table is really slow compared to Presto

2017-06-12 Thread Premal Shah
Thanx Gopal. Sorry, took me a few days to respond. Here are some findings. hive.optimize.distinct.rewrite is True by default I do see Reducer 2 + 3. However, this might be worth mentioning. The distinct query on an ORC table takes a ton of time. I created a table with the TEXTFILE format from th

Re: Hive query on ORC table is really slow compared to Presto

2017-04-04 Thread Gopal Vijayaraghavan
> SELECT COUNT(*), COUNT(DISTINCT id) FROM accounts; … > 0:01 [8.59M rows, 113MB] [11M rows/s, 146MB/s] I'm hoping this is not rewriting to the approx_distinct() in Presto. > I got similar performance with Hive + LLAP too. This is a logical plan issue, so I don't know if LLAP helps a lot. A cou

Re: hive query plain has not index description

2017-01-19 Thread min zou
It's fixed; the params were not working. 2017-01-19 17:34 GMT+08:00 min zou : > hi, i have created a table hive_hbase_visitor2 in hive, and created an > index on the table,but when i execute the query plan about *select ** from > hive_hbase_visitor2 where name='knlf', the description of index wa

Re: hive query

2016-08-12 Thread Joanne Chan
The query is assuming Keyword/Hour is unique which I am not sure if that's an assumption per requirement. If not, you'd probably want to group by those two columns. select k.keyword , h.hour , sum(coalesce(t.totalcount,0)) from (select distinct keyword from t) as k join (select

RE: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-26 Thread Markovitz, Dudu
om t group by c1; 1 [12,13,12,14,11,14] [12,13,14,11] [15,11,13,13,13,11] [15,11,13] 2 [15] [15] [11] [11] 3 [11,12,11] [11,12] [13,15,13] [13,15] From: Deepak Khandelwal [mailto:dkhandelwal@gmail.com] Sent: Tuesday, April 26, 2016 8:35 PM To: user@hive.apache.o

RE: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-26 Thread Ryan Harris
all pairs. hope that helps From: Deepak Khandelwal [mailto:dkhandelwal@gmail.com] Sent: Tuesday, April 26, 2016 11:35 AM To: user@hive.apache.org Subject: Re: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and c

Re: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-26 Thread Deepak Khandelwal
Thanks a lot Dudu. Could you also tell me how I can use concat with a group by clause in Hive? I have n rows with col1, col2, col3 and I want a result grouped by col1 that concatenates all values of col2 and col3. Id,key,value, value2 __ 1,fname,Dudu, m1 1,lname,Markowitz, m2 2,fname
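
One way to approach the question above, sketched with the column names from the sample data (id, value, value2) and a placeholder table name `my_table`. It assumes a Hive version that provides collect_list; the order of elements inside each concatenated string is not guaranteed.

```sql
-- Placeholder table name; columns follow the sample rows in the question.
SELECT id,
       concat_ws(',', collect_list(value))  AS all_values,
       concat_ws(',', collect_list(value2)) AS all_values2
FROM my_table
GROUP BY id;
```

Swapping collect_list for collect_set would drop duplicate values within each group.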

Re: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-23 Thread Mich Talebzadeh
http://talebzadehmich.wordpress.com On 23 April 2016 at 08:07, Markovitz, Dudu wrote: > Hi Mich, it seems the request was for unpivot. > > > > Dudu > > > > *From:* Mich Talebzadeh [mailto:mich.talebza...@gmail.com] > *Sent:* Saturday, April 23, 2016 10:04 AM > *To:* user > *Subj

RE: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-23 Thread Markovitz, Dudu
Hi Mich, it seems the request was for unpivot. Dudu From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com] Sent: Saturday, April 23, 2016 10:04 AM To: user Subject: Re: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name

Re: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-23 Thread Mich Talebzadeh
try this -- populate table user_parameters with user_id values (unique) from user_details INSERT user_parameters SELECT user_id, null, null FROM user_details -- Update remaining columns UPDATE user_parameters SET param_name = t1.user_name, param_value = t1.user_address FROM

RE: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-23 Thread Markovitz, Dudu
Another example (with first name and last name), same principle Dudu Given the following table: id, first_name,last_name __ 1,Dudu,Markovitz 2,Andrew,Sears select id,key,value from my_table lateral view explode (map('fname',first_name,'lname',last_name)) t; The result wil
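
The same unpivot, sketched with explicit column aliases on the lateral view. Table and column names are the ones in the message; the aliases simply restate the default names, which can help on Hive versions that require them.

```sql
-- One output row per (id, key, value) pair built from the map.
SELECT id, t.key, t.value
FROM my_table
LATERAL VIEW explode(map('fname', first_name, 'lname', last_name)) t AS key, value;
```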

Re: Hive query on Tez slower than on MR (fails in some cases) ..

2016-02-19 Thread Gopal Vijayaraghavan
Hi, > Here's the Tez DAG swimlane. Haven't gotten vertex.py to work.. will >send that too soon. Pretty clear that the map-side is fine - splitting sort buffers isn't bothering this at all. We want to over-partition Reducer 7 and possibly have all of them pick the total # of reducers dynamically

Re: Hive query on Tez slower than on MR (fails in some cases) ..

2016-02-18 Thread Gopal Vijayaraghavan
> On Tez, this is run as a single DAG of M-R+ ... Can't tell which vertex is the slow one in this. More tooling for isolating which vertex is taking up time (and which task) https://github.com/apache/tez/tree/master/tez-tools/swimlanes or alternatively run https://github.com/t3rmin4t0r/tez-s

Re: Hive Query Timeout in hive-jdbc

2016-02-02 Thread Loïc Chanel
Then indeed Tez and MR timeout won't be any help, sorry. I would be very interested in your solution though. Regards, Loïc Loïc CHANEL System & virtualization engineer TO - XaaS Ind - Worldline (Villeurbanne, France) 2016-02-02 11:27 GMT+01:00 Satya Harish Appana : > Queries I am running over H

Re: Hive Query Timeout in hive-jdbc

2016-02-02 Thread Satya Harish Appana
The queries I am running over Hive JDBC are DDL statements (none of them are selects or inserts, which would result in an execution engine (Tez/MR) job being launched; all of them are create external table, drop table, and alter table add partitions). On Tue, Feb 2, 2016 at 3:54 PM, Lo

Re: Hive Query Timeout in hive-jdbc

2016-02-02 Thread Loïc Chanel
Actually, Hive doesn't support timeout, but Tez and MapReduce do. Therefore, you can set a timeout on these tools to kill failed queries. Hope this helps, Loïc Loïc CHANEL System & virtualization engineer TO - XaaS Ind - Worldline (Villeurbanne, France) 2016-02-02 11:10 GMT+01:00 董亚军 : > hive

Re: Hive Query Timeout in hive-jdbc

2016-02-02 Thread 董亚军
Hive does not support a timeout on the client side, and I don't think one is recommended: if the client exits with a timeout exception, the HiveServer side may still be running the job. This would result in an inconsistent state. On Tue, Feb 2, 2016 at 4:49 PM, Satya Harish Appana < satyaharish.app...@gmail.

Re: Hive query hangs in reduce steps

2016-01-09 Thread Suresh V
Hi Gopal - actually no, the table is not partitioned/bucketed. Every day the whole table gets cleaned up and populated with the last 120 days' data... What other properties can I try to improve the performance of the reduce steps? Suresh V http://www.justbirds.in On Sat, Jan 9, 2016 at 8:52

Re: Hive query hangs in reduce steps

2016-01-09 Thread Suresh V
Hi Mich We have to use TEZ as the engine since the data volume is high and with MR it takes several hours. With TEZ it used to take about an hour max. Thanks Suresh. On Sat, Jan 9, 2016 at 7:34 AM, Mich Talebzadeh wrote: > Hi Suresh, > > > > I have the same issue when I use Hive on Spark. > >

Re: Hive query hangs in reduce steps

2016-01-09 Thread Gopal Vijayaraghavan
Hi, > The job completes fine if we reduce the # of rows processed by reducing >the # of days data being processed. > > It just gets stuck after all maps are completed. We checked the logs and >it says the containers are released. Looks like you're inserting into a bucketed & partitioned table an

RE: Hive query hangs in reduce steps

2016-01-09 Thread Mich Talebzadeh
Hi Suresh, I have the same issue when I use Hive on Spark. What normally works is Hive on MR. Have you tried: set hive.execution.engine=mr; Sounds like it times out for one reason or other! From: Suresh V [mailto:verdi...@gmail.com] Sent: 09 January 2016 11:35 To: user@hive.apa

Re: Hive Query on Hbase snapshot error

2015-09-24 Thread Sandeep Nemuri
hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo -snapshot test_snapshot -stats -schema On Thu, Sep 24, 2015 at 3:43 PM, Sandeep Nemuri wrote: > You can check snapshot state if it is healthy or not using below command. > > > On Thu, Sep 24, 2015 at 2:55 PM, 核弹头す <510688...@qq.com> wrote: > >>

Re: Hive Query on Hbase snapshot error

2015-09-24 Thread Sandeep Nemuri
You can check whether the snapshot is healthy using the command below. On Thu, Sep 24, 2015 at 2:55 PM, 核弹头す <510688...@qq.com> wrote: > Hi all, > > > I am using hive to query on an hbase snapshot. But I got the following error: > > FAILED: IllegalArgumentException > org.apache.hadoop.hbase.snap

Re: Hive Query failing !!!

2015-09-22 Thread Nitin Pawar
OK, sorry, my bad. I had overlooked that your query is doing joins via the where clause. On Tue, Sep 22, 2015 at 12:20 PM, @Sanjiv Singh wrote: > Nitin, > > Following setting already there at HIVE. > set hive.exec.mode.local.auto=false; > > Surprisingly , when it did following setting , it starte

Re: Hive Query failing !!!

2015-09-21 Thread @Sanjiv Singh
Nitin, Following setting already there at HIVE. set hive.exec.mode.local.auto=false; Surprisingly , when it did following setting , it started working set hive.auto.convert.join=true; can you please help me understand , what had happened ? Regards Sanjiv Singh Mob : +091 9990-447-339 O

Re: Hive Query failing !!!

2015-09-21 Thread Nitin Pawar
Can you try setting these set hive.exec.mode.local.auto=false; On Tue, Sep 22, 2015 at 11:25 AM, @Sanjiv Singh wrote: > > > *Hi Folks,* > > > *I am running given hive query . it is giving error while executing. > please help me get out of it and understand possible reason for error.* > > *Hive

Re: Hive query over JDBC not honoring fetch size

2015-08-19 Thread Emil Berglind
Also, I tried setting the "hive.fetch.task.conversion" property in the JDBC URL, like so: jdbc:hive2://192.168.132.128:1/default?hive.fetch.task.conversion=none, but it is still creating mapreduce tasks for the query, so it effectively seems to be ignoring that property. On Wed, Aug 19, 2015

Re: Hive query over JDBC not honoring fetch size

2015-08-19 Thread Emil Berglind
When I run the "SELECT * FROM " query it is running it as a mapreduce job. I can see it in the Yarn Manager and also in the Tez UI. This is also when the fetch size is not honored and it tries to basically return all results at once. Is there a way to make this work? On Wed, Aug 19, 2015 at 10:53

Re: Hive query over JDBC not honoring fetch size

2015-08-19 Thread Prem Yadav
actually it should be something like getHandleIdentifier()=hfhkjhfjhkjfh-dsdsad-sdsd--dsada: fetchResults() On Wed, Aug 19, 2015 at 3:49 PM, Prem Yadav wrote: > Hi Emil, > for either of the queries, there will be no mapreduce job. the query > engine understands that in both case, it need not do

Re: Hive query over JDBC not honoring fetch size

2015-08-19 Thread Prem Yadav
Hi Emil, for either of the queries, there will be no mapreduce job. the query engine understands that in both case, it need not do any computation and just needs to fetch all the data from the files. The fetch size should be honored in both cases. Hope you are using hiveserver2. You can try connec

Re: Hive Query Error

2015-07-09 Thread Ajeet O
Hi Nitin , How to check this, you mean to check hive-site.xml. please let me know how to check this. From: Nitin Pawar To: "user@hive.apache.org" Date: 07/09/2015 07:35 PM Subject: Re: Hive Query Error can u check your config? host appears twice

Re: Hive Query Error

2015-07-09 Thread Nitin Pawar
Can you check your config? The host appears twice: 01hw357381.tcsgegdc.com: 01hw357381.tcsgegdc.com. It should be hostname:port. Also, once you correct this, do an nslookup on the host to make sure it's identified by the hive client. On Thu, Jul 9, 2015 at 7:19 PM, Ajeet O wrote: > Hi All , I have installe

Re: Hive Query

2015-06-03 Thread João Alves
Hey all, Has anyone else also found the coalesce function to be prone to some weird behaviours? e.g.1: Giving null when it shouldn’t. e.g.2: I had to change a coalesce(v1,v2,v3) to coalesce(v1,v2,v3,null) (???) otherwise the query would crash! Regards, João Alves > On 03 Jun 2015, at 17:00

Re: Hive Query

2015-06-03 Thread gabriel balan
Hi, IF(ISNOTNULL(sum(columnname)), sum(columnname), 0) as sumVendor Or *coalesce( sum(columnname),0) as ...* As explained here , COALESCE(T v1, T v2, ...) Returns the first v that is not NULL, or NULL if all v's are NULL

Re: Hive Query o/p to HDFS as CSV file

2015-01-09 Thread Jason Dere
A workaround might be to create an external table with the correct format, insert overwrite into the external table, then drop the external table (which I think shouldn't delete the directory) On Jan 9, 2015, at 5:46 AM, vengatesh.babu wrote: > Hi, > > How to write Hive query output to HDFS
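
A sketch of the workaround described above. The table names, columns, and HDFS path are placeholders, not taken from the thread. Note that plain delimited text does not quote fields, so values that themselves contain commas would need extra handling.

```sql
-- Placeholder names and path, for illustration only.
CREATE EXTERNAL TABLE csv_export (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/tmp/csv_export';

-- Write the query result as comma-separated text under the location above.
INSERT OVERWRITE TABLE csv_export
SELECT id, name FROM source_table;

-- Dropping an external table removes only the metadata; the files stay in HDFS.
DROP TABLE csv_export;
```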

Re: Hive Query o/p to HDFS as CSV file

2015-01-09 Thread vengatesh.babu
Hi, How to write Hive query output to HDFS directory as CSV file(comma separated). Thanks Vengatesh Babu K M On Wed, 07 Jan 2015 11:46:39 +0530 vengatesh.babu wrote Hi, I want to write hive query output into HDFS file in CSV Format( comma separate

Re: hive query with in statement

2014-08-13 Thread Seeling Cheung
From: ilhami Kalkan To: user@hive.apache.org Sent: Wednesday, August 13, 2014 6:03 AM Subject: Re: hive query with in statement Hi Kevin, I'm using 0.12 version and IN statement works fine except this situation: select * from table1 where callhour in (1,2,3,4); --> success  (ca

Re: hive query with in statement

2014-08-13 Thread Tuong Tr.
ami Kalkan To: user@hive.apache.org Sent: Wednesday, August 13, 2014 1:14 AM Subject: Re: hive query with in statement Thanks Navis it works. On 13-08-2014 09:03, Navis류승우 wrote: Could you try "cast(calldate as string)"? > > >Thanks, >Navis > > > > > >2014

Re: hive query with in statement

2014-08-13 Thread ilhami Kalkan
Hi Kevin, I'm using 0.12 version and IN statement works fine except this situation: select * from table1 where callhour in (1,2,3,4); --> success (callhour type: int) select * from table1 where name in ('foo1','foo2','foo3','foo4'); --> success (name type: string) select * from table1 where c

Re: hive query with in statement

2014-08-13 Thread Kevin Weiler
This is a relatively old stack overflow post. I’m not sure what version you guys are using, but IN seems to work just fine for me. -- Kevin Weiler IT IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL 60606 | http://imc-chicago.com/ Phone: +1 312-204-7439 | Fax: +1 312-244-33

Re: hive query with in statement

2014-08-13 Thread ilhami Kalkan
Thanks Navis it works. On 13-08-2014 09:03, Navis류승우 wrote: Could you try "cast(calldate as string)"? Thanks, Navis 2014-08-12 20:22 GMT+09:00 ilhami Kalkan: Hi all, I have a problem with IN statement in HiveQL. My table "cdr", column "call

Re: hive query with in statement

2014-08-12 Thread Navis류승우
Could you try "cast(calldate as string)"? Thanks, Navis 2014-08-12 20:22 GMT+09:00 ilhami Kalkan : > Hi all, > I have a problem with IN statement in HiveQL. My table "cdr", column > "calldate" which type is "date". First query is successfully return: > select * from cdr where calldate = '2014-0
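
Spelled out, the suggestion above might look like the following. The table `cdr` and column `calldate` come from the question; the two date literals are placeholders, since the original values are cut off in the archive.

```sql
-- Compare the date column as a string so IN matches string literals.
SELECT *
FROM cdr
WHERE cast(calldate AS string) IN ('2014-08-11', '2014-08-12');
```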

Re: hive query with in statement

2014-08-12 Thread Sreenath
Hi, hive doesn't support IN clause. you might want to check out http://stackoverflow.com/questions/7677333/how-to-write-subquery-and-use-in-clause-in-hive On 12 August 2014 17:07, ilhami Kalkan wrote: > Hi all, > I have a problem with IN statement in HiveQL. My table "cdr", column > "calldate"

Re: Hive Query (Replace)

2014-07-01 Thread Lefty Leverenz
I'm no expert, but could you use the regexp_replace or translate function? From the Hive wiki: regexp_replace(string INITIAL_STRING, string PATTERN, string REPLACEMENT) > > Returns the string
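
Two small illustrations of the functions mentioned above, using literal inputs so no table is needed. This assumes a Hive version that accepts SELECT without a FROM clause; on older versions the same expressions can be run against any one-row table.

```sql
-- Remove every match of the pattern 'oo' or 'ar'.
SELECT regexp_replace('foobar', 'oo|ar', '');   -- 'fb'

-- Replace each '-' character with '_'.
SELECT translate('a-b-c', '-', '_');            -- 'a_b_c'
```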

Re: hive query to select top 10 product of each subcategory and select most recent product info

2014-04-11 Thread Adrian Hains
I think you need to separate out the logic that does your group by aggregations from the logic of then retrieving all of the other columns for a single row from that set. Something like: select tbl.myKeyColumn1, tbl.myKeyColumn2, tbl.otherValueColumn1, tbl.ot

Re: hive query to select top 10 product of each subcategory and select most recent product info

2014-04-11 Thread Nitin Pawar
will it be a good idea to just get top 10 ranked products by whatever your ranking is based on and then join it with its metadata (self join or any other way) ? On Fri, Apr 11, 2014 at 1:52 PM, Mohit Durgapal wrote: > Hi Nitin, > > The ddl is as follows: > > CREATE EXTERNAL TABLE user_logs( > us
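
A sketch of the "rank first, then pick the top 10" idea, assuming a Hive version with windowing functions (0.11 or later). The table `user_logs` and column `scatg` appear in the DDL posted in this thread; the `pid` column and ranking by row count are assumptions made purely for illustration.

```sql
-- Assumed: pid identifies a product, and "top" means most log rows.
SELECT scatg, pid, cnt
FROM (
  SELECT scatg, pid, cnt,
         row_number() OVER (PARTITION BY scatg ORDER BY cnt DESC, pid) AS rnk
  FROM (
    SELECT scatg, pid, COUNT(*) AS cnt
    FROM user_logs
    GROUP BY scatg, pid
  ) counts
) ranked
WHERE rnk <= 10;
```

The result can then be joined back to the table to pull in the remaining product columns, as the message suggests.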

Re: hive query to select top 10 product of each subcategory and select most recent product info

2014-04-11 Thread Mohit Durgapal
Hi Nitin, The ddl is as follows: CREATE EXTERNAL TABLE user_logs( users_iduuid string, siteid int, site_catid int, stext string, catg int, // CATEGORY scatg int, // SUBCATEGORY catgname string, scatgname string, brand string, // PRODUCT BRAND NAME prrange strin

Re: hive query to select top 10 product of each subcategory and select most recent product info

2014-04-11 Thread Nitin Pawar
may be you can share your table ddl, your query and what output r u looking for On Fri, Apr 11, 2014 at 12:26 PM, Mohit Durgapal wrote: > I have a hive table partitioned by dates. It contains ecomm data in the > format siteid,sitecatid,catid,subcatgid,pid,pname,pprice,pmrp,pdesc > > > > What

Re: HIVE QUERY HELP:: HOW TO IMPLEMENT THIS CASE

2014-03-04 Thread Stephen Sprague
ok. my conscience got the best of me. maybe for worse though. :) This to me is like giving you a rope and a stool and i don't think it'll end well. That said consider something like this: {code} select a.foo1, a.foo2, --column to be updated. you need to position it properly --if null th

Re: HIVE QUERY HELP:: HOW TO IMPLEMENT THIS CASE

2014-03-04 Thread Stephen Sprague
Let's just say this. Coercing hive into doing something it's not meant to do is kinda a waste of time. Sure you can rewrite any update as a delete/insert but that's not the point of Hive. Seems like you're going down a path here that's not optimal for your situation. You know, I could buy a Tesla a

RE: Hive query parser bug resulting in "FAILED: NullPointerException null"

2014-02-27 Thread java8964
Can you reproduce with an empty table? I can't reproduce it. Also, can you paste the stack trace? Yong From: krishnanj...@gmail.com Date: Thu, 27 Feb 2014 12:44:28 + Subject: Hive query parser bug resulting in "FAILED: NullPointerException null" To: user@hive.apache.org Hi all, we've experien

Re: hive query to calculate percentage

2014-02-26 Thread Manish
p by timestamp_dt ) b on (a.timestamp_dt = b.timestamp_dt) 2) If you are using hive 11 or above, using windows functions. Yong Date: Tue, 25 Feb 2014 18:27:34 -0600 Subject: Re: hive query to calculate percentage From: kkrishna...@gm

RE: hive query to calculate percentage

2014-02-25 Thread java8964
estamp_dt) b on (a.timestamp_dt = b.timestamp_dt) 2) If you are using hive 11 or above, using windows functions. Yong Date: Tue, 25 Feb 2014 18:27:34 -0600 Subject: Re: hive query to calculate percentage From: kkrishna...@gmail.com To: user@hive.apache.org Modfiy the query to :select totalcount / sum(t
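
The window-function option mentioned above could be sketched as follows, assuming Hive 0.11 or later. The table and column names are the ones used elsewhere in this thread; the alias `fraction` is made up for the example.

```sql
-- Each row's share of its partition's total.
SELECT timestamp_dt,
       totalcount,
       totalcount / SUM(totalcount) OVER (PARTITION BY timestamp_dt) AS fraction
FROM daily_count_per_kg_domain
WHERE timestamp_dt = '20140219';
```

Multiplying `fraction` by 100 gives a percentage.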

Re: hive query to calculate percentage

2014-02-25 Thread Krishnan K
Modify the query to: select totalcount / sum(totalcount) from daily_count_per_kg_domain where timestamp_dt = '20140219' group by timestamp_dt; if you don't specify the where clause, you will get results for all partitions. On Tue, Feb 25, 2014 at 3:14 PM, Manish wrote: > I have a partitioned ta

Re: Hive Query :: Implementing case statement

2014-02-19 Thread yogesh dhari
Hello Stephen, Yes, actually I have used Left Outer Join instead of Join; there were left outer joins in the RDBMS query instead of joins. Thanks again :) On Thu, Feb 20, 2014 at 10:45 AM, Stephen Sprague wrote: > Hi Yogesh, > > i overlooked one thing and for completeness we should make note of it

Re: Hive Query :: Implementing case statement

2014-02-19 Thread Stephen Sprague
Hi Yogesh, i overlooked one thing and for completeness we should make note of it here. change: -- non-intersected rows select a.* from TABLE_SQL a join NEW_BALS b on (a.key=b.key) where b.NEW_BALANCE is null to -- non-intersected rows select a.* from TABLE_SQL a *LEFT OUTER* join NEW_BAL

Re: Hive Query :: Implementing case statement

2014-02-19 Thread yogesh dhari
Thanks a lot Stephen Sprague :) :) It worked. I just had to remove the " ; " from here, because it was throwing a subquery syntax error... create table NEW_BALS as select * from ( select b.prev as NEW_BALANCE, a.key from TABLE_SQL a join TABLE_SQL_2 b on (a.key=b.key) where a.code='1'; UNION ALL sel

Re: Hive Query :: Implementing case statement

2014-02-18 Thread Stephen Sprague
maybe consider something along these lines. nb. not tested. -- temp table holding new balances + key create table NEW_BALS as select * from ( select b.prev as NEW_BALANCE, a.key from TABLE_SQL a join TABLE_SQL_2 b on (a.key=b.key) where a.code='1'; UNION ALL select b.prev as N

Re: Hive Query :: Implementing case statement

2014-02-18 Thread Navis류승우
If key is unique, you might overwrite values by using hbase handler. 2014-02-18 22:05 GMT+09:00 yogesh dhari : > Yes, Hive does not provide update statement, I am just looking for the > work arround it, how to implement it > > > > > > On Tue, Feb 18, 2014 at 6:27 PM, Peter Marron < > peter.mar..

Re: Hive Query :: Implementing case statement

2014-02-18 Thread yogesh dhari
Yes, Hive does not provide an update statement. I am just looking for a workaround for it and how to implement it. On Tue, Feb 18, 2014 at 6:27 PM, Peter Marron < peter.mar...@trilliumsoftware.com> wrote: > From https://cwiki.apache.org/confluence/display/Hive/Home > > > > "Hive is not designed for

RE: Hive Query :: Implementing case statement

2014-02-18 Thread Peter Marron
>From https://cwiki.apache.org/confluence/display/Hive/Home "Hive is not designed for OLTP workloads and does not offer real-time queries or row-level updates." As far as I am aware "UPDATE" isn't even in the Hive DML. Z Peter Marron Senior Developer Trillium Software, A Harte Hanks Company Th

Re: Hive Query Error

2014-02-05 Thread Stephen Sprague
file this one under RTFM. On Wed, Feb 5, 2014 at 9:11 AM, Nitin Pawar wrote: > its create table xyz stored as sequencefile as select blah from table > > > On Wed, Feb 5, 2014 at 10:37 PM, Raj Hadoop wrote: > >> *I am trying to create a Hive sequence file from another table by running >> the fo

Re: Hive Query Error

2014-02-05 Thread Nitin Pawar
its create table xyz stored as sequencefile as select blah from table On Wed, Feb 5, 2014 at 10:37 PM, Raj Hadoop wrote: > *I am trying to create a Hive sequence file from another table by running > the following -* > > *Your query has the following error(s):* > OK FAILED: ParseException line 5

Re: Hive query taking a lot of time just to launch map-reduce jobs

2013-11-26 Thread David Morel
On 26 Nov 2013, at 7:02, Sreenath wrote: Hey David, Thanks for the swift reply. Each id will have exactly one file. and regarding the volume on an average each file would be 100MB of compressed data with the maximum going upto around 200MB compressed data. And how will RC files be an advant

Re: Hive query taking a lot of time just to launch map-reduce jobs

2013-11-25 Thread Sreenath
Hey David, Thanks for the swift reply. Each id will have exactly one file. Regarding the volume, on average each file would be 100MB of compressed data, with the maximum going up to around 200MB of compressed data. And how would RC files be an advantage here? On Mon, Nov 25, 2013 at 5:50 PM, Dav

Re: Hive query taking a lot of time just to launch map-reduce jobs

2013-11-25 Thread David Morel
On 25 Nov 2013, at 11:50, Sreenath wrote: hi all, We are using hive for Ad-hoc querying and have a hive table which is partitioned on two fields (date,id).Now for each date there are around 1400 ids so on a single day around that many partitions are added.The actual data is residing in s3. n

Re: Hive Query Questions - is null in WHERE

2013-10-17 Thread Raj Hadoop
  Thanks. It worked for me now when i use it as an empty string. From: Krishnan K To: "user@hive.apache.org" ; Raj Hadoop Sent: Thursday, October 17, 2013 11:11 AM Subject: Re: Hive Query Questions - is null in WHERE For string columns, nu

Re: Hive Query Questions - is null in WHERE

2013-10-17 Thread Krishnan K
For string columns, null will be interpreted as an empty string and for others, it will be interpreted as null... On Wednesday, October 16, 2013, Raj Hadoop wrote: > All, > > When a query is executed like the below > > select field1 from table1 where field1 is null; > > I am getting the resul
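
Following the explanation above, a filter that catches both cases for a string column might look like this. `field1` and `table1` are the names used in the question.

```sql
-- Match rows where the string column is either NULL or empty.
SELECT field1
FROM table1
WHERE field1 IS NULL OR field1 = '';
```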

Re: Hive Query via Hue, Only column headers in downloaded CSV or XSL results, sometimes

2013-09-30 Thread Prasad Mujumdar
9:38 AM > To: user@hive.apache.org > Subject: RE: Hive Query via Hue, Only column headers in downloaded CSV or > XSL results, sometimes > > Hmm.. No replies on this one? Is no one use Hue? :-) That would be > interesting to know .. if not Hue, how are others exposing Hive to &q

RE: Hive Query via Hue, Only column headers in downloaded CSV or XSL results, sometimes

2013-09-30 Thread Martin, Nick
et the returns fairly quickly and are able to export. -Original Message- From: Sunderlin, Mark [mailto:mark.sunder...@teamaol.com] Sent: Monday, September 30, 2013 9:38 AM To: user@hive.apache.org Subject: RE: Hive Query via Hue, Only column headers in downloaded CSV or XSL results, some

RE: Hive Query via Hue, Only column headers in downloaded CSV or XSL results, sometimes

2013-09-30 Thread Martin, Nick
Sent: Monday, September 30, 2013 9:38 AM To: user@hive.apache.org Subject: RE: Hive Query via Hue, Only column headers in downloaded CSV or XSL results, sometimes Hmm.. No replies on this one? Is no one use Hue? :-) That would be interesting to know .. if not Hue, how are others exposing Hive to

RE: Hive Query via Hue, Only column headers in downloaded CSV or XSL results, sometimes

2013-09-30 Thread Sunderlin, Mark
Hmm.. No replies on this one? Is no one using Hue? :-) That would be interesting to know .. if not Hue, how are others exposing Hive to "end users" without giving them a direct login to a node on the cluster? --- Mark E. Sunderlin Data Architect | AOL NETWORKS BDM P: 703-265-6935 | C: 540-3

Re: Hive Query - Issue

2013-09-03 Thread Sanjay Subramanian
Hi, when you do a SELECT *, the partition columns are returned as the last N columns (if you have N partition columns). In this case the 63rd column in SELECT * is the partition column. Instead of SELECT *, do a SELECT col1, col2, col3, ….. Not to show the candle to t

Re: Hive Query - Issue

2013-09-02 Thread manish dunani
Hello, I think you are working with a dynamic partition. Then you do not need to mention its value. You only need to put the partition like this; try this: insert overwrite table table_baseline partition (sourcedate) select * from (select * from table_a where sourcedate='tablea_2013_08' union all s
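
A sketch of a dynamic-partition insert along the lines described above. The table names and the partition column `sourcedate` come from the thread; `col1` and `col2` stand in for the real column list, which is not shown. The partition column goes last in the SELECT so that Hive can map it to the PARTITION clause.

```sql
-- Session settings commonly needed for fully dynamic partitions.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- col1, col2 are placeholders for the non-partition columns.
INSERT OVERWRITE TABLE table_baseline PARTITION (sourcedate)
SELECT col1, col2, sourcedate
FROM table_a
WHERE sourcedate = 'tablea_2013_08';
```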

Re: hive query error

2013-08-21 Thread 闫昆
thanks Bing I found it 2013/8/22 Bing Li > By default, hive.log should exist in /tmp/. > Also, it could be set in $HIVE_HOME/conf/hive-log4j.properties and > hive-exec-log4j.properties > - hive.log.dir > - hive.log.file > > > 2013/8/22 闫昆 > >> hi all >> when exec hive query throw exception as

Re: hive query error

2013-08-21 Thread Bing Li
By default, hive.log should exist in /tmp/. Also, it could be set in $HIVE_HOME/conf/hive-log4j.properties and hive-exec-log4j.properties - hive.log.dir - hive.log.file 2013/8/22 闫昆 > hi all > when exec hive query throw exception as follow > I do not know where the error log is I found $HIVE_HOME/

Re: Hive Query Issue

2013-08-07 Thread Sanjay Subramanian
Date: Tuesday, August 6, 2013 4:10 AM To: user@hive.apache.org Subject: RE: Hive Query Issue Hi, Thanks for your reply. I have a small tes

Re: Hive Query Issue

2013-08-07 Thread Sunita Arvind
> -- > Date: Tue, 6 Aug 2013 16:24:38 +0530 > Subject: Re: Hive Query Issue > From: nitinpawar...@gmail.com > To: user@hive.apache.org > > > when you run select * from table .. it does not launch a mapreduce job, > > where are when you put

RE: Hive Query Issue

2013-08-06 Thread Manickam P
also. All my data nodes are up and running. I don't have any clue here. Thanks, Manickam P Date: Tue, 6 Aug 2013 16:24:38 +0530 Subject: Re: Hive Query Issue From: nitinpawar...@gmail.com To: user@hive.apache.org when you run select * from table .. it does not launch a mapreduce job, wher

Re: Hive Query Issue

2013-08-06 Thread Nitin Pawar
When you run select * from table, it does not launch a MapReduce job, whereas when you add a condition, Hive needs to process the data, so it launches a MapReduce job. Now when you start this query, go to your JobTracker page and see how many jobs are running. Is it able to start your job?

Re: hive query is very slow,why?

2013-07-19 Thread Nitin Pawar
Huang, the number of records is huge, and we do not know your table definition or your cluster capacity. There are multiple reasons the query could be slow. Can you share all the details on: 1) What's your table definition? 2) What's the cluster capacity? 3) when you launched query did the

Re: hive query is very slow,why?

2013-07-18 Thread ch huang
the table records are more than 12000 On Fri, Jul 19, 2013 at 9:34 AM, Stephen Boesch wrote: > one mapper. how big is the table? > > > 2013/7/18 ch huang > >> i wait long time,no result ,why hive is so slow? >> >> hive> select cookie,url,ip,source,vsid,token,residence,edate from >> hb_cook

Re: hive query is very slow,why?

2013-07-18 Thread Stephen Boesch
one mapper. how big is the table? 2013/7/18 ch huang > i wait long time,no result ,why hive is so slow? > > hive> select cookie,url,ip,source,vsid,token,residence,edate from > hb_cookie_history where edate>='1371398400500' and edate<='1371400200500'; > Total MapReduce jobs = 1 > Launching Job

Re: Hive Query

2013-07-12 Thread Edward Capriolo
Hive DOES support IN, but only a row-wise IN. The query you are trying to do should work. On Fri, Jul 12, 2013 at 9:46 AM, Dean Wampler wrote: > Use a semi-join, which is more or less the same thing.. You might also see > if the having clause will help. > > dean > > On Fri, Jul 12, 2013 at

Re: Hive Query

2013-07-12 Thread Dean Wampler
Use a semi-join, which is more or less the same thing. You might also see if the having clause will help. dean On Fri, Jul 12, 2013 at 6:13 AM, Manickam P wrote: > Hi, > > I need to run hive query like select * from employee where employee_id IN > (100,102). I came to know that hive does not s

Re: Hive Query

2013-07-12 Thread Nitin Pawar
> Date: Fri, 12 Jul 2013 17:00:38 +0530 > Subject: Re: Hive Query > From: nitinpawar...@gmail.com > To: user@hive.apache.org > > > Manickam, > > How does support the in clause > > what hive does not support is "subquery inside in clause" > >

RE: Hive Query

2013-07-12 Thread Manickam P
, Manickam P Date: Fri, 12 Jul 2013 17:00:38 +0530 Subject: Re: Hive Query From: nitinpawar...@gmail.com To: user@hive.apache.org Manickam, Hive does support the in clause what hive does not support is "subquery inside in clause" you can perfectly run the query you have written but currently

Re: Hive Query

2013-07-12 Thread Nitin Pawar
Manickam, Hive does support the IN clause. What Hive does not support is a "subquery inside the IN clause". You can run the query you have written without any problem, but currently, as far as I know, Hive does not let you do this: "select * from table where coln in (select coln from table where coln=blah)" if you
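
The semi-join suggested earlier in this thread can be sketched as follows, as a stand-in for an IN clause that would otherwise need a subquery. This is only an illustration: the wanted_ids table is hypothetical and not from the thread, and employee is assumed to have an employee_id column as in the original question.

```sql
-- A literal IN list, as in the original question, works as written:
SELECT * FROM employee WHERE employee_id IN (100, 102);

-- Semi-join standing in for "employee_id IN (SELECT employee_id FROM wanted_ids)".
-- wanted_ids is a hypothetical table holding the ids to match.
SELECT e.*
FROM employee e
LEFT SEMI JOIN wanted_ids w
  ON (e.employee_id = w.employee_id);
```

With LEFT SEMI JOIN, columns of the right-hand table can be referenced only in the ON condition, not in the SELECT list or WHERE clause, so the result contains each matching employee row once, much as an IN subquery would.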

Re: Hive Query having virtual column INPUT__FILE__NAME in where clause gives exception

2013-06-17 Thread Jitendra Kumar Singh
Thanks guys for reply. Following query also did not work hive> select count(*), filename from (select INPUT__FILE__NAME as filename from netflow) tmp where filename='vzb.1351794600.0' group by filename; FAILED: SemanticException java.lang.RuntimeException: cannot find field input__file__name from
