Thank you, Ashutosh. That's very informative.
I appreciate that!
*Rossi*
2015-05-12 9:08 GMT+08:00 Ashutosh Chauhan :
> Hi Rossi,
>
> Historically, we used LoptOptimizeJoinRule of Calcite to do join
> reordering. This does a greedy search on join order search space to find a
> join order which
In MR, the query plan is:
Map Join Operator
condition map:
Left Outer Join0 to 1
keys:
0 ordr_code (type: string), cart_prod_id (type: bigint)
1 parnt_ordr_code (type: string), comb_prod_id (type: bigint)
outputColumnNames: _col1, _col2, _col3, _col5, _col10, _col11, _col15, _col16,
But in Tez
Hi,
You’re correct - that is not a valid rewrite.
Both tables have to be shuffled across due to the OR clause with no
reductions.
Cheers,
Gopal
On 5/11/15, 10:43 AM, "Eugene Koifman" wrote:
>This isn’t a valid rewrite.
>if a(x,y) has 1 row (1,2) and b(x,z) has 1 row (1,1) then the 1st query
>
Hi Rossi,
Historically, we used LoptOptimizeJoinRule of Calcite to do join
reordering. This does a greedy search over the join order search space to find a
join order which is at least as good as the original join order of the query.
Goodness is measured in terms of estimated cost, and is not globally optimal because
of gr
Hi,
I have enabled storage based authorization in the hive metastore by adding
the following configs to hive-site:
>
> hive.security.authorization.enabled
> true
>
>
>
> hive.security.authorization.manager
>
>
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedA
I see only 1 reducer running forever. Is it a skewed join?
r7raul1...@163.com
From: Eugene Koifman
Date: 2015-05-12 01:43
To: user
CC: r7raul1...@163.com
Subject: Re: hive sql on tez run forever
This isn’t a valid rewrite.
if a(x,y) has 1 row (1,2) and b(x,z) has 1 row (1,1) then the 1st query
will produce
This isn’t a valid rewrite.
if a(x,y) has 1 row (1,2) and b(x,z) has 1 row (1,1) then the 1st query
will produce 1 row
but the 2nd query with subselects will not.
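The counterexample above can be checked with a small sketch. It uses SQLite through Python rather than Hive, and the predicates (a.y > 5, b.z > 0) are hypothetical stand-ins chosen only so that the OR is satisfied by b alone; the queries in the original thread are not reproduced here.

```python
import sqlite3

# In-memory tables matching the example: a(x, y) = (1, 2), b(x, z) = (1, 1).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (x INTEGER, y INTEGER);
CREATE TABLE b (x INTEGER, z INTEGER);
INSERT INTO a VALUES (1, 2);
INSERT INTO b VALUES (1, 1);
""")

# 1st form: the OR predicate is evaluated after the join.
# a.y > 5 is false, b.z > 0 is true, so the joined row is kept.
or_after_join = conn.execute("""
    SELECT a.x, a.y, b.z
    FROM a LEFT OUTER JOIN b ON a.x = b.x
    WHERE a.y > 5 OR b.z > 0
""").fetchall()

# 2nd form: each half of the OR is pushed into a subselect.
# The row of a is filtered out before the join, so nothing is left to join.
pushed_down = conn.execute("""
    SELECT sa.x, sa.y, sb.z
    FROM (SELECT * FROM a WHERE y > 5) sa
    LEFT OUTER JOIN (SELECT * FROM b WHERE z > 0) sb ON sa.x = sb.x
""").fetchall()

print(len(or_after_join), len(pushed_down))
```

The two forms disagree on this data, which is why an OR across both tables cannot simply be split into per-table filters.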
On 5/11/15, 10:13 AM, "Gopal Vijayaraghavan" wrote:
>Hi,
>
>> I change the sql where condition to (where t.update_time >=
>>'2015-05-
The other option is to try UNION ALL or UNION depending on the nature of
the result set
SELECT rs.col1, rs.col2 ,
FROM
(
SELECT t.col1, t.col2, ..
FROM t WHERE t.update_time > '2015-05-04'
UNION ALL
SELECT t8.col1, t8.col2,..
FROM t8 WHERE length(t8.end_user_id) > 0
) r
Hi,
> I change the sql where condition to (where t.update_time >=
>'2015-05-04') , the sql can return result for a while. Because
>t.update_time
> >= '2015-05-04' can filter many row when table scan. But why change
>where condition to
> (where t.update_time >= '2015-05-04' or length(t8.end_user_i
My SQL has no GROUP BY.
The SQL that causes the problem:
from dw.fct_traffic_navpage_path_detl t
left outer join dw.univ_parnt_tranx_comb_detl o
on t.ordr_code = o.parnt_ordr_code
and t.cart_prod_id = o.comb_prod_id
and o.ds = '{$label}'
select ordr_code,count(*) as a from dw.fct_traffic_navpage_path_d
Maybe your one reducer is overloaded because of skewed group-by keys. If you are
using GROUP BY, try the property below and see whether the reducer data gets distributed.
set hive.groupby.skewindata=true;
Thanks
Jitendra
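As a rough illustration of the idea behind hive.groupby.skewindata (not Hive's actual implementation), the sketch below counts rows per key in two stages: rows are first spread randomly over several workers for partial counts, and the partial counts are then merged per key. The function name two_stage_count and its parameters are hypothetical.

```python
import random
from collections import Counter

def two_stage_count(keys, num_reducers=4, seed=0):
    """Count rows per key in two stages to spread a hot key over workers."""
    rng = random.Random(seed)
    # Stage 1: each row goes to a random worker, which keeps partial counts.
    partials = [Counter() for _ in range(num_reducers)]
    for key in keys:
        partials[rng.randrange(num_reducers)][key] += 1
    # Stage 2: merge the partial counts by key to get the final result.
    final = Counter()
    for partial in partials:
        final.update(partial)
    # Also report how many rows each stage-1 worker had to process.
    loads = [sum(p.values()) for p in partials]
    return final, loads

keys = ["hot"] * 1000 + ["cold"] * 10
final, loads = two_stage_count(keys)
print(dict(final), loads)
```

With a single-stage shuffle by key, all 1000 "hot" rows would land on one reducer; here no stage-1 worker sees the whole key.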
On Mon, May 11, 2015 at 12:35 PM, r7raul1...@163.com
wrote:
> Status: Running (Executing on YARN cl
Status: Running (Executing on YARN cluster with App id
application_1419300485749_1493279)
VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED