Re: [EXTERNAL] Re: Slow Hive query with a lot of 'get_materialized_views_for_rewriting'

2023-11-17 Thread Krisztian Kasa
Replied Message >> From Eugene Miretsky >> Date 11/16/2023 02:21 >> To >> Subject Slow Hive query with a lot of >> 'get_materialized_views_for_rewriting' >> Hey! >> >> We have a catalog with quite a lot of databases and tables.

Re: [EXTERNAL] Re: Slow Hive query with a lot of 'get_materialized_views_for_rewriting'

2023-11-16 Thread Butao Zhang
L] Re: Slow Hive query with a lot of 'get_materialized_views_for_rewriting' | May I ask when hive4 can be released? Replied Message | From | Butao Zhang | | Date | 11/17/2023 12:24 | | To | user@hive.apache.org | | Cc | | | Subject | Re: [EXTERNAL] Re: Slow Hive quer

Re: [EXTERNAL] Re: Slow Hive query with a lot of 'get_materialized_views_for_rewriting'

2023-11-16 Thread lisoda
May I ask when hive4 can be released? Replied Message | From | Butao Zhang | | Date | 11/17/2023 12:24 | | To | user@hive.apache.org | | Cc | | | Subject | Re: [EXTERNAL] Re: Slow Hive query with a lot of 'get_materialized_views_for_rewriting' | Thanks for the info.

Re: [EXTERNAL] Re: Slow Hive query with a lot of 'get_materialized_views_for_rewriting'

2023-11-16 Thread Butao Zhang
Iceberg integration, enhanced materialized view, etc. Thanks, Butao Zhang Replied Message | From | Eugene Miretsky | | Date | 11/17/2023 09:06 | | To | | | Subject | Re: [EXTERNAL] Re: Slow Hive query with a lot of 'get_materialized_views_for_rewriting' | Hey! Hive versio

Re: [EXTERNAL] Re: Slow Hive query with a lot of 'get_materialized_views_for_rewriting'

2023-11-16 Thread Eugene Miretsky
is called by mistake. > > Thanks, > > Butao Zhang > Replied Message > From Eugene Miretsky > Date 11/16/2023 02:21 > To > Subject Slow Hive query with a lot of > 'get_materialized_views_for_rewriting' > Hey! > > We have a catalog with fairl

Re: Slow Hive query with a lot of 'get_materialized_views_for_rewriting'

2023-11-15 Thread Butao Zhang
etsky | | Date | 11/16/2023 02:21 | | To | | | Subject | Slow Hive query with a lot of 'get_materialized_views_for_rewriting' | Hey! We have a catalog with quite a lot of databases and tables. When we run a simple query (select * from table limit 5;) on an idle cluster, it takes a

Slow Hive query with a lot of 'get_materialized_views_for_rewriting'

2023-11-15 Thread Eugene Miretsky
Hey! We have a catalog with quite a lot of databases and tables. When we run a simple query (select * from table limit 5;) on an idle cluster, it takes around 20 seconds, sometimes longer (the first run usually takes 40s+). Looking at the hive-metastore logs, during most of the query time the logs sh

MapReduce Job name of a Hive Query

2020-03-17 Thread Rajbir singh
Hello, Question regarding how to set hive mapreduce job name for hive query child jobs Our hive query creates 9 map-reduce jobs and 17 stages(when I ran EXPLAIN command, output showed 17 STAGES and STAGE DEPENDENCIES). Every child job has the same mapreduce.job.name value To distinguish these

Re: Is there any way to find Hive query to Datanucleus queries mapping

2020-02-11 Thread Chinna Rao Lalam
op/hive/ql/metadata/Hive.java#L5405 > > cheers, > Zoltan > > On 2/10/20 1:07 PM, Chinna Rao Lalam wrote: > > Hi All, > > > > Is there any way to find Hive query to Datanucleus queries mapping. > > > > "select * from table" this hive query will g

Re: Is there any way to find Hive query to Datanucleus queries mapping

2020-02-10 Thread Zoltan Haindrich
/blob/0d9deba3c15038df4c64ea9b8494d554eb8eea2f/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L5405 cheers, Zoltan On 2/10/20 1:07 PM, Chinna Rao Lalam wrote: Hi All, Is there any way to find Hive query to Datanucleus queries mapping. "select * from table" this hive query will generate multiple Datanucl

Is there any way to find Hive query to Datanucleus queries mapping

2020-02-10 Thread Chinna Rao Lalam
Hi All, Is there any way to find Hive query to Datanucleus queries mapping. "select * from table" this hive query will generate multiple Datanucleus queries and execute on configured DB. In our DB some of the queries are running slow, So we want to see hivequery->datanucleus que

Re: rename output error during hive query on AWSs3-external table

2020-02-04 Thread Sungwoo Park
Not a solution, but looking at the source code of S3AFileSystem.java (Hadoop 2.8.5), I think the Exception raised inside S3AFileSystem.rename() is swallowed and only a new HiveException is reported. So, in order to find out the root cause, I guess you might need to set Log level to DEBUG and see wh

RE: rename output error during hive query on AWSs3-external table

2020-02-04 Thread Aaron Grubb
Check this thread: https://forums.aws.amazon.com/thread.jspa?messageID=922594 From: Souvikk Roy Sent: Tuesday, February 4, 2020 3:06 AM To: user@hive.apache.org Subject: rename output error during hive query on AWSs3-external table Hello, We are using some external tables backed by aws S3

rename output error during hive query on AWSs3-external table

2020-02-04 Thread Souvikk Roy
Hello, We are using some external tables backed by AWS S3, and we are intermittently getting this error, most likely at the last stage of the reduce. I see some similar posts on the net but could not find any solution. Is there any way to solve it: org.apache.hadoop.hive.ql.metadata.HiveException: Un

Re: Hive Query Performance Tuning

2019-12-03 Thread Matthew Dixon
sks or stages take longer can be a good way to understand what aspects of the query are most expensive to compute. Matt From: Rajbir singh Reply to: "user@hive.apache.org" Date: Tuesday, 3 December 2019 at 09:25 To: "user@hive.apache.org" Subject: Hive Query Performance Tu

Hive Query Performance Tuning

2019-12-02 Thread Rajbir singh
Hi All, I have a hive query which does the aggregation of amounts by reading from hive tables and loads the results to another hive table. I am trying to fine tune the attached query. I read online and came up with the following. I would really appreciate any ideas. Thank you 1. Indexing

Hive Query Optimization

2019-08-26 Thread Soupam Mandal
0 We have 7 tables and each table is partitioned by record_date.There is a query which involves inner join with all these tables and join is based on consumer_id. The join involves multiple partition join. Currently querying 1 week data takes very long time around 20-30 mins. We want to optimize t

To get Hive query from timeline server

2018-11-05 Thread Raghuraman Murugaiyan
Hi all, I am working on a Dashboard project, to list all the Hive jobs submitted in the Yarn and all the other related details like No. of Mappers, No. of reducers etc., Is there any chance to get the Hive query(complete query) from the Time line server ? Regards, Raghu M

How to Execute Hive Query in Java asynchronously

2017-10-10 Thread Anant Agarwal
Could not find any clear documentation on How to Execute Hive Query in Java asynchronously. as per https://issues.apache.org/jira/browse/HIVE-4617 this is supported in Hive. Have tried using the Hive Thrift protocol which gives an instance of TOperationHandle which I serialize to a file and

Re: Hive query starts own session for LLAP

2017-09-27 Thread Gopal Vijayaraghavan
> Now we need an explanation of "map" -- can you supply it? The "map" mode runs all tasks with a TableScan operator inside LLAP instances and all other tasks in Tez YARN containers. This is the LLAP + Tez hybrid mode, which introduces some complexity in debugging a single query. The "only" mod

Re: Hive query starts own session for LLAP

2017-09-26 Thread Lefty Leverenz
iners disabled - so the query would fail if it cannot run in LLAP). > > From: Rajesh Narayanan on behalf of > Rajesh Narayanan > Reply-To: "user@hive.apache.org" > Date: Friday, September 22, 2017 at 11:59 > To: "user@hive.apache.org" > Subject: H

Re: Hive query starts own session for LLAP

2017-09-25 Thread Sergey Shelukhin
e: Friday, September 22, 2017 at 11:59 To: "user@hive.apache.org" Subject: Hive query starts own session for LLAP HI All, When I execute the hive query , that starts its own session and creates new yarn jobs rather than using the llap enabled job Can you please provide some suggestion? Thanks Rajesh

Hive query starts own session for LLAP

2017-09-21 Thread Rajesh Narayanan
HI All, When I execute the hive query , that starts its own session and creates new yarn jobs rather than using the llap enabled job Can you please provide some suggestion? Thanks Rajesh

CustomRecordReader in Hive query

2017-09-12 Thread Pavan Kumar Prakash Savanur
I have written a CustomRecordReader which skips records randomly. I want to write a hive query which uses my CustomRecordReader. How do i do that?

Re: Hive query on ORC table is really slow compared to Presto

2017-06-22 Thread Gopal Vijayaraghavan
> 1711647 -1032220119 Ok, so this is the hashCode skew issue, probably the one we already know about. https://github.com/apache/hive/commit/fcc737f729e60bba5a241cf0f607d44f7eac7ca4 String hashcode distribution is much better in master after that. Hopefully that fixes the distinct speed issue h

Re: Hive query on ORC table is really slow compared to Presto

2017-06-21 Thread Mich Talebzadeh
With ORC tables have you tried set hive.vectorized.execution.enabled = true; set hive.vectorized.execution.reduce.enabled = true; SET hive.exec.parallel=true; -- set hive.optimize.ppd=true; HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcP

Re: Hive query on ORC table is really slow compared to Presto

2017-06-21 Thread Premal Shah
Gopal, Thanx for the debugging steps. Here's the output *hive> select count(1) as collisions, hash(ip) from table group by hash(ip) order by collisions desc limit 10;* 4 -1432955330 4 -317748560 4 -1460629578 4 1486313154 4 -320519155 4 1875999753 4 -141

Re: Hive query on ORC table is really slow compared to Presto

2017-06-14 Thread Gopal Vijayaraghavan
> SELECT COUNT(DISTINCT ip) FROM table - 71 seconds > SELECT COUNT(DISTINCT id) FROM table - 12,399 seconds Ok, I misunderstood your gist. > While ip is more unique that id, ip runs many times faster than id. > > How can I debug this ? Nearly the same way - just replace "ip" with "id" in my exp

Re: Hive query on ORC table is really slow compared to Presto

2017-06-14 Thread Premal Shah
Hi Gopal, Thanx for the reply. I just want to clarify a few things. 1. The count distinct ip query runs fast and so it's not a problem 2. I would not expect the ip column to use DICTIONARY encoding too 3. I am more concerned about the count distinct id or count distinct master_id column which if

Re: Hive query on ORC table is really slow compared to Presto

2017-06-12 Thread Gopal Vijayaraghavan
Hi, I think this is worth fixing because this seems to be triggered by the data quality itself - so let me dig in a bit into a couple more scenarios. > hive.optimize.distinct.rewrite is True by default FYI, we're tackling the count(1) + count(distinct col) case in the Optimizer now (which came

Re: Hive query on ORC table is really slow compared to Presto

2017-06-12 Thread Michael Segel
e; And see the output row-count of Map 1. > What can be done to get the hive query to run faster in hive? Try with (see if it generates a Reducer 2 + Reducer 3, which is what the speedup comes from). set hive.optimize.distinct.rewrite=true; or try a rewrite select id from accounts group by id

Re: Hive query on ORC table is really slow compared to Presto

2017-06-12 Thread Premal Shah
; Run with > > set hive.tez.exec.print.summary=true; > > And see the output row-count of Map 1. > > > What can be done to get the hive query to run faster in hive? > > Try with (see if it generates a Reducer 2 + Reducer 3, which is what the > speedup comes from). >

Re: Hive query on ORC table is really slow compared to Presto

2017-04-04 Thread Gopal Vijayaraghavan
LAP helps a lot. A count + a count(distinct) is planned as a full shuffle of 100% of rows. Run with set hive.tez.exec.print.summary=true; And see the output row-count of Map 1. > What can be done to get the hive query to run faster in hive? Try with (see if it generates a Reducer 2 + Reduce

Hive query on ORC table is really slow compared to Presto

2017-04-04 Thread Premal Shah
at can be done to get the hive query to run faster in hive? -- Regards, Premal Shah.

Re: hive query plain has not index description

2017-01-19 Thread min zou
It's fixed; the params were not working. 2017-01-19 17:34 GMT+08:00 min zou : > hi, i have created a table hive_hbase_visitor2 in hive, and created an > index on the table,but when i execute the query plan about *select ** from > hive_hbase_visitor2 where name='knlf', the description of index wa

hive query plain has not index description

2017-01-19 Thread min zou
hi, i have created a table hive_hbase_visitor2 in hive, and created an index on the table,but when i execute the query plan about *select ** from hive_hbase_visitor2 where name='knlf', the description of index was not found, did the index not succeed? *create index hive_hbase_visitor2_index on t

Hive query fails with error "expecting dummy store operator but found: FS[26]"

2016-09-22 Thread Tale Firefly
Hello ! I send you this mail because I perform an hive query with Tez and it fails with a strange error : The error is like this : ### ERROR : Vertex failed, vertexName=Reducer 2, vertexId=vertex_1473870963805_157168_11_02, diagnostics=[Task failed, taskId=task_1473870963805_157168_11_02_35

Understanding hive query plan for Join operation

2016-09-17 Thread Nitin Kumar
Hi, I have the a query and its associated query and query for simulated data The number of rows in the table lte_data_tenmillion is 1000 The number of rows in the table subscriber data is 10 *For both tables none of

Re: hive query

2016-08-12 Thread Joanne Chan
> 10 > > > > Samsung > > 10 > > > > Iphone > > 11 > > 500 > > Nokia > > 11 > > > > Samsung > > 11 > > 300 > > Iphone > > 12 > > 1000 > > Nokia > > 12 > > 200 > > Samsung >

hive query

2016-08-12 Thread raj hive
12 200 Iphone 16 1500 I want a query to get output for 24 hours like below. I need to show the zero count if i don't have the data. Can anyone help me the hive query. *Keyword* *hour* *TotalCount* iphone 0 0 samsung 0 0 nokia 0 0 iphone 1 0 samsung 1 0 noki

Re: Yarn Application ID for Hive query

2016-07-18 Thread Gopal Vijayaraghavan
> be nice to have access to a command or API call in HiveServer2 similar >to MySQL's "SHOW PROCESSLIST" (and equivalent commands in most other >databases). There is one - if you have the HiveServer2 UI (in 2.0), that can be seen. It would take 10-15 line JSP script to export that as a JSON API

RE: Yarn Application ID for Hive query

2016-07-18 Thread Amit Bajpai
I am running hive on Tez. I am able to get the Yarn application ID for the hive query by submitting the query through Hive JDBC and using HiveStatement. Connection con = DriverManager.getConnection("jdbc:hive2://abc:1/default","xyz", ""); HiveS

RE: Yarn Application ID for Hive query

2016-07-18 Thread Gerber, Bryan W
mmands in most other databases). From: Amit Bajpai [mailto:amit.baj...@flextronics.com] Sent: Thursday, July 14, 2016 10:22 PM To: user@hive.apache.org Subject: Yarn Application ID for Hive query Hi, I am using the below python program to run a hive query. How can I get the Yarn application ID usin

Yarn Application ID for Hive query

2016-07-14 Thread Amit Bajpai
Hi, I am using the below python program to run a hive query. How can I get the Yarn application ID using the python program for the hive query execution. import pyhs2 with pyhs2.connect(host='abc.sac.com', port=1, authMechanism="PLAIN",

Hive Query Error: Cannot obtain block length

2016-06-28 Thread Arun Patel
I am trying to do log analytics on the logs created by Flume. Hive queries are failing with below error. "hadoop fs -cat" command works on all these open files. Is there a way to read these open files? My requirement is to read the data from open files too. I am using tez as execution engine.

Re: Optimize Hive Query

2016-06-27 Thread Eugene Koifman
1:11 PM To: Gopal Vijayaraghavan <gop...@apache.org> Cc: "user@hive.apache.org" <user@hive.apache.org> Subject: Re: Optimize Hive Query Thanks Gopal for your inputs For now I have create NON ACID table and loaded data see

Re: Optimize Hive Query

2016-06-27 Thread Mich Talebzadeh
Hi, Curious to see if this issue been resolved (performance) after compaction? Thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw * http://t

Re: Optimize Hive Query

2016-06-26 Thread @Sanjiv Singh
Thanks Gopal for your inputs For now I have create NON ACID table and loaded data see below from logs proper group splits happening . 2016-06-25 12:52:00,160 [INFO] [InputInitializer {Map 1} #0] |tez.HiveSplitGenerator|: Number of grouped splits: 512 On compaction issue , Compaction enab

Re: Optimize Hive Query

2016-06-24 Thread @Sanjiv Singh
Thanks Gopal for your inputs. Let me run compaction explicitly on table then see how query works. Let Regards Sanjiv Singh Mob : +091 9990-447-339 On Fri, Jun 24, 2016 at 7:53 PM, Gopal Vijayaraghavan wrote: > > > Yes for this tables, ACID enabled. it has only 256 files for each > >buckets

Re: Optimize Hive Query

2016-06-24 Thread Gopal Vijayaraghavan
> Yes for this tables, ACID enabled. it has only 256 files for each >buckets. these are create only when data initially loaded in this table. Yes, the initial load goes in as an insert DELTA too - that requires another compaction to move into base files. The fact that they haven't been automati

Re: Optimize Hive Query

2016-06-24 Thread @Sanjiv Singh
Hi Vijay, Yes for this tables, ACID enabled. it has only 256 files for each buckets. these are create only when data initially loaded in this table. There is not transaction done after that. I see that all file for buckets are also in equal size. One thing that I am not able to understand that

Re: Optimize Hive Query

2016-06-24 Thread Gopal Vijayaraghavan
> Please help me on this; let me know if you need other info. Are the ORC tables fully compacted? Looks like you're running a version of Hive-ACID, which does not perform well without compacting delta files. dfs -ls ; should tell you whether there are any delta_* files in the list. > |

Re: Optimize Hive Query

2016-06-24 Thread Mich Talebzadeh
Hi Sanjiv, Normally when it comes to this, I will try to find the section of the code which cause the largest lag SELECT > sb_gu_key, m_d_key, t_ev_st_dt, > LAG( t_ev_st_dt ) OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY > t_ev_st_dt ) AS LAG_START_DT, > a_z_key, > c_dt, > e_p_dt, > sq_nbr

Re: Optimize Hive Query

2016-06-24 Thread @Sanjiv Singh
Hi Vijay, Please help me on this; let me know if you need other info. Regards Sanjiv Singh Mob : +091 9990-447-339 On Thu, Jun 23, 2016 at 12:41 PM, @Sanjiv Singh wrote: > Hi Gopal, > > I am using Tez as execution engine. > > DAG : > > +---

Re: Optimize Hive Query

2016-06-24 Thread @Sanjiv Singh
Hi Mich, I tried the same without any luck. I don't see any improvement. Regards Sanjiv Singh Mob : +091 9990-447-339 On Thu, Jun 23, 2016 at 5:38 PM, @Sanjiv Singh wrote: > Thanks Mich. for your inputs. > > Let me try that as well. Will post response. > > >

RE: Optimize Hive Query

2016-06-23 Thread Markovitz, Dudu
Thanks, I wanted to rule out skewedness over m_d_key,sb_gu_key Dudu From: @Sanjiv Singh [mailto:sanjiv.is...@gmail.com] Sent: Thursday, June 23, 2016 11:55 PM To: user@hive.apache.org; Markovitz, Dudu ; sanjiv singh (ME) Subject: Re: Optimize Hive Query Hi Dudu, find below query response

Re: Optimize Hive Query

2016-06-23 Thread @Sanjiv Singh
Thanks Mich. for your inputs. Let me try that as well. Will post response.

Re: Optimize Hive Query

2016-06-23 Thread @Sanjiv Singh
23, 2016 at 4:01 AM, Markovitz, Dudu wrote: > Could you also add the results of the following query? > > > > Thanks > > > > Dudu > > > > > > select m_d_key > >,sb_gu_key > >,count (*) as cnt > >

Re: Optimize Hive Query

2016-06-23 Thread Mich Talebzadeh
Funny enough it is pretty close to similar ORC transactional tables I have. Standard with 256 buckets with two columns as below number of distinct value in column m_d_key : 29 > number of distinct value in column sb_gu_key : 15434343 You have also vectorised data taking 1024 rows at once. Still

Re: Optimize Hive Query

2016-06-23 Thread @Sanjiv Singh
Hi Mich , Please find below output of command. desc formatted tuning_dd_key ; +---+---+---+--+ | col_name| data_type

RE: Optimized Hive query

2016-06-23 Thread Markovitz, Dudu
Any progress on this one? Dudu From: Aviral Agarwal [mailto:aviral12...@gmail.com] Sent: Wednesday, June 15, 2016 1:04 PM To: user@hive.apache.org Subject: Re: Optimized Hive query I am OK with digging down to the AST Builder class. Can you guys point me to the right class? Meanwhile "ex

Re: Optimize Hive Query

2016-06-23 Thread Jörn Franke
The query looks a little bit too complex from what it is supposed to do. Can you reformulate and restrict the data in a where clause (highest restriction first). Another hint would be to use the Orc format (with indexes and optionally bloom filters) with snappy compression as well as sorting the

Re: Optimize Hive Query

2016-06-23 Thread Mich Talebzadeh
Do you also have the output from desc formatted tuning_dd_key and send the output please? Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw * htt

Re: Optimize Hive Query

2016-06-23 Thread @Sanjiv Singh
Hi Gopal, I am using Tez as execution engine. DAG : ++--+ | Explain | +-+--+ | Pla

RE: Optimize Hive Query

2016-06-23 Thread Markovitz, Dudu
- From: Gopal Vijayaraghavan [mailto:go...@hortonworks.com] On Behalf Of Gopal Vijayaraghavan Sent: Thursday, June 23, 2016 9:45 AM To: user@hive.apache.org Subject: Re: Optimize Hive Query > Long running query : Are you running this on MapReduce or Tez? Please post the output

Re: Optimize Hive Query

2016-06-22 Thread Gopal Vijayaraghavan
> Long running query : Are you running this on MapReduce or Tez? Please post the output of explain - if you are seeing > 1 shuffle edge in your query while having only one window for OVER(), that might be the reason. OVER ( PARTITION BY m_d_key , sb_gu_key ORDER BY t_ev_st_dt) The multipl

Optimize Hive Query

2016-06-22 Thread @Sanjiv Singh
Hi All, I am running into a performance issue with the query below. It took 2-3 hours to complete in Hive. I tried partitioning and bucketing changes on these tables, but without luck. Please help me optimize this query. What schema-level changes can be done? Any other parameter recommendations? *

Re: Optimized Hive query

2016-06-15 Thread Aviral Agarwal
I am OK with digging down to the AST Builder class. Can you guys point me to the right class? Meanwhile, "explain (rewrite | logical | extended) " all fail to flatten even a basic query of the form: select * from ( select * from ( select c from d) alias_1 ) alias_2 into select c from d Tha

Re: Optimized Hive query

2016-06-14 Thread Gopal Vijayaraghavan
> So I was hoping of using internal Hive CBO to somehow change the AST >generated for the query somehow. Hive does have an "explain rewrite" but that prints out the query before CBO runs. For CBO, you need to dig all the way down to the ASTBuilder class and work upwards from there. Perhaps add

Re: Optimized Hive query

2016-06-14 Thread Mich Talebzadeh
representation of the abstract > syntactic <https://en.wikipedia.org/wiki/Abstract_syntax> structure of source > code <https://en.wikipedia.org/wiki/Source_code> written in a programming > language <https://en.wikipedia.org/wiki/Programming_language>. > > > >

RE: Optimized Hive query

2016-06-14 Thread Markovitz, Dudu
_syntax> structure of source code<https://en.wikipedia.org/wiki/Source_code> written in a programming language<https://en.wikipedia.org/wiki/Programming_language>. From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com] Sent: Tuesday, June 14, 2016 7:58 PM To: user Subject: Re: Op

Re: Optimized Hive query

2016-06-14 Thread Mich Talebzadeh
…”) > > In no point do we have a “flattened query” > > > > Dudu > > > > *From:* Aviral Agarwal [mailto:aviral12...@gmail.com] > *Sent:* Tuesday, June 14, 2016 10:37 AM > *To:* user@hive.apache.org > *Subject:* Re: Optimized Hive query > > > > Hi,

RE: Optimized Hive query

2016-06-14 Thread Markovitz, Dudu
Subject: Re: Optimized Hive query Hi, Thanks for the replies. I already knew that the optimizer already does that. My usecase is a bit different though. I want to display the flattened query back to the user. So I was hoping of using internal Hive CBO to somehow change the AST generated for the

Re: Optimized Hive query

2016-06-14 Thread Mich Talebzadeh
I presume the user is concerned with performance? The whole use case of a CBO is to take care of queries by finding the optimum access path. otherwise we would have a RBO as is in the old days of Hive. If you are in the more recent version of Hive CBO does the job. However, you may think of mov

Re: Optimized Hive query

2016-06-14 Thread Aviral Agarwal
Hi, Thanks for the replies. I already knew that the optimizer already does that. My usecase is a bit different though. I want to display the flattened query back to the user. So I was hoping of using internal Hive CBO to somehow change the AST generated for the query somehow. Thanks, Aviral On Tu

Re: Optimized Hive query

2016-06-14 Thread Gopal Vijayaraghavan
> You can see that you get identical execution plans for the nested query >and the flatten one. Wasn't that always though. Back when I started with Hive, before Stinger, it didn't have the identity project remover. To know if your version has this fix, try looking at hive> set hive.optimize.rem

RE: Optimized Hive query

2016-06-13 Thread Markovitz, Dudu
ListSink | | | +---+--+ From: Aviral Agarwal [mailto:aviral12...@gmail.com] Sent: Monday, June 13, 2016 7:55 PM To: user@hive.apache.org Subject: Optimized Hive query Hi,

Re: Optimized Hive query

2016-06-13 Thread Aviral Agarwal
Yes I want to flatten the query. Also the Insert code is correct. Thanks, Aviral Agarwal On Tue, Jun 14, 2016 at 3:46 AM, Mich Talebzadeh wrote: > you want to flatten the query I understand. > > create temporary table tmp as select c from d; > > INSERT INTO TABLE a > SELECT c from tmp where >

Re: Optimized Hive query

2016-06-13 Thread Mich Talebzadeh
you want to flatten the query I understand. create temporary table tmp as select c from d; INSERT INTO TABLE a SELECT c from tmp where condition Is the INSERT code correct? HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Optimized Hive query

2016-06-13 Thread Aviral Agarwal
Hi, I would like to know if there is a way to convert nested hive sub-queries into optimized queries. For example : INSERT INTO TABLE a.b SELECT * FROM ( SELECT c FROM d) into INSERT INTO TABLE a.b SELECT c FROM D This is a simple example but the solution should apply if there were deeper nesti

RE: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-26 Thread Markovitz, Dudu
ndrew 2,lname, Sears From: Deepak Khandelwal [mailto:dkhandelwal@gmail.com] Sent: Saturday, April 23, 2016 9:04 AM To: user@hive.apache.org Subject: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value Hi All, I am new

RE: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-26 Thread Ryan Harris
all pairs. hope that helps From: Deepak Khandelwal [mailto:dkhandelwal@gmail.com] Sent: Tuesday, April 26, 2016 11:35 AM To: user@hive.apache.org Subject: Re: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and c

Re: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-26 Thread Deepak Khandelwal
, Andrew > > 2,lname, Sears > > > > > > *From:* Deepak Khandelwal [mailto:dkhandelwal@gmail.com > ] > *Sent:* Saturday, April 23, 2016 9:04 AM > *To:* user@hive.apache.org > > *Subject:* Hive query to split one row into many rows such that Row 1 > will have

Re: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-23 Thread Mich Talebzadeh
http://talebzadehmich.wordpress.com On 23 April 2016 at 08:07, Markovitz, Dudu wrote: > Hi Mich, it seems the request was for unpivot. > > > > Dudu > > > > *From:* Mich Talebzadeh [mailto:mich.talebza...@gmail.com] > *Sent:* Saturday, April 23, 2016 10:04 AM > *To:* user > *Subj

RE: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-23 Thread Markovitz, Dudu
Hi Mich, it seems the request was for unpivot. Dudu From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com] Sent: Saturday, April 23, 2016 10:04 AM To: user Subject: Re: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name

Re: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-23 Thread Mich Talebzadeh
le1(USER_DETAILS) > in the format shown above. I can do this using UNION ALL but I want to > avoid it as there are like 10 such columns that i need to split like above. > > Can someone suggest a efficient hive query so that i can achieve the > results shown in table 2 from data in

RE: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-23 Thread Markovitz, Dudu
ame)) t; The result will look like: Id,key,value __ 1,fname,Dudu 1,lname,Markovitz 2,fname, Andrew 2,lname, Sears From: Deepak Khandelwal [mailto:dkhandelwal@gmail.com] Sent: Saturday, April 23, 2016 9:04 AM To: user@hive.apache.org Subject: Hive query to split one row into

Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-22 Thread Deepak Khandelwal
suggest a efficient hive query so that i can achieve the results shown in table 2 from data in table 1 (Hive query to split one row of data into multiple rows like such that Row 1 will have column1 Name, column1 Value and Row 2 will have column 2 Name and column 2 value...). Thanks a lot Deepak

Re: Hive query on Tez slower than on MR (fails in some cases) ..

2016-02-19 Thread Gopal Vijayaraghavan
Hi, > Here's the Tez DAG swimlane. Haven't gotten vertex.py to work.. will >send that too soon. Pretty clear that the map-side is fine - splitting sort buffers isn't bothering this at all. We want to over-partition Reducer 7 and possibly have all of them pick the total # of reducers dynamically

Re: Hive query on Tez slower than on MR (fails in some cases) ..

2016-02-18 Thread Gopal Vijayaraghavan
> On Tez, this is run as a single DAG of M-R+ ... Can't tell which vertex is the slow one in this. More tooling for isolating which vertex is taking up time (and which task) https://github.com/apache/tez/tree/master/tez-tools/swimlanes or alternatively run https://github.com/t3rmin4t0r/tez-s

Re: Hive Query Timeout in hive-jdbc

2016-02-02 Thread Loïc Chanel
Then indeed Tez and MR timeout won't be any help, sorry. I would be very interested in your solution though. Regards, Loïc Loïc CHANEL System & virtualization engineer TO - XaaS Ind - Worldline (Villeurbanne, France) 2016-02-02 11:27 GMT+01:00 Satya Harish Appana : > Queries I am running over H

Re: Hive Query Timeout in hive-jdbc

2016-02-02 Thread Satya Harish Appana
Queries I am running over Hive JDBC are DDL statements (none of the queries are select or insert, which would result in an execution engine (tez/mr) job being launched; all the queries are create external table, drop table, and alter table add partitions). On Tue, Feb 2, 2016 at 3:54 PM, Lo

Re: Hive Query Timeout in hive-jdbc

2016-02-02 Thread Loïc Chanel
Actually, Hive doesn't support timeouts, but Tez and MapReduce do. Therefore, you can set a timeout on these tools to kill failed queries. Hope this helps, Loïc Loïc CHANEL System & virtualization engineer TO - XaaS Ind - Worldline (Villeurbanne, France) 2016-02-02 11:10 GMT+01:00 董亚军 : > hive

Re: Hive Query Timeout in hive-jdbc

2016-02-02 Thread 董亚军
Hive does not support timeout on the client side, and I think it is not recommended: if the client exits with a timeout exception, the HiveServer side may still be running the job, which will result in an inconsistent state. On Tue, Feb 2, 2016 at 4:49 PM, Satya Harish Appana < satyaharish.app...@gmail.

Hive Query Timeout in hive-jdbc

2016-02-02 Thread Satya Harish Appana
Hi Team, I am trying to connect to hiveServer via hive-jdbc. Can we configure client side timeout at each query executed inside each jdbc connection. (When I looked at HiveStatement.setQueryTimeout method it says operation unsupported). Is there any other way of timing out and cancelling the con

Re: Hive query hangs in reduce steps

2016-01-09 Thread Suresh V
Hi Gopal - actually no., the table is not partitioned/bucketed. Everyday the whole table gets cleaned up and populated with last 120 days' data... What are the other properties I can try to improve the performance of reduce steps...? Suresh V http://www.justbirds.in On Sat, Jan 9, 2016 at 8:52

Re: Hive query hangs in reduce steps

2016-01-09 Thread Suresh V
e Hive on Spark. > > > > What normally works is Hive on MR. Have you tried: > > > > set hive.execution.engine=mr; > > > > Sounds like it times out for one reason or other! > > > > *From:* Suresh V [mailto:verdi...@gmail.com] > *Sent:* 09 January 2016 1

Re: Hive query hangs in reduce steps

2016-01-09 Thread Gopal Vijayaraghavan
Hi, > The job completes fine if we reduce the # of rows processed by reducing >the # of days data being processed. > > It just gets stuck after all maps are completed. We checked the logs and >it says the containers are released. Looks like you're inserting into a bucketed & partitioned table an

RE: Hive query hangs in reduce steps

2016-01-09 Thread Mich Talebzadeh
@hive.apache.org Subject: Hive query hangs in reduce steps Dear all We have a Hive query that 'insert overwrites' from one main hive table to another table about 24million rows every day. This query was working fine so long, but lately it has started to hang at the reduce steps. It just

Hive query hangs in reduce steps

2016-01-09 Thread Suresh V
Dear all We have a Hive query that 'insert overwrites' from one main hive table to another table about 24million rows every day. This query was working fine so long, but lately it has started to hang at the reduce steps. It just gets stuck after all maps are completed. We checked the l
