I took a closer look at the plans of query2 and query59, and there is
something odd about the semi joins that appear in them.

A possible workaround, until we fix the problem, would be to disable dynamic
semijoin reduction for these two queries:
set hive.tez.dynamic.semijoin.reduction=false;
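
As a rough sketch, assuming the queries run one after another in a single session and that the property was at its default value of true beforehand, the setting could be scoped to just the two affected queries. The comment lines are placeholders for the actual query text:

```sql
-- Turn off dynamic semijoin reduction only for the affected queries.
set hive.tez.dynamic.semijoin.reduction=false;

-- (run query2 here)
-- (run query59 here)

-- Restore the assumed previous value for the remaining queries.
set hive.tez.dynamic.semijoin.reduction=true;
```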

Best,
Stamatis

On Wed, Nov 18, 2020 at 3:29 PM Stamatis Zampetakis <zabe...@gmail.com>
wrote:

> Hi Sungwoo,
>
> As far as query14 is concerned, the problem is logged in HIVE-24167 [1].
> There is also a PR [2] for reproducing the problem so it should be feasible
> to find the offending commit with git bisect.
>
> For queries 2 and 59, I am also able to reproduce the behavior that you
> mentioned (same reducer appearing twice in the same mapper) in the same PR
> [2], and I think we should raise a JIRA ticket as well.
>
> Best,
> Stamatis
>
> [1] https://issues.apache.org/jira/browse/HIVE-24167
> [2] https://github.com/apache/hive/pull/1347
>
> On Wed, Nov 18, 2020 at 1:33 AM Sungwoo Park <glap...@gmail.com> wrote:
>
>>  > 1. With hive.optimize.shared.work.dppunion=true, query 2 and 59 fail.
>>> Please see the attachment for stack traces.
>>>
>>> Even though the exception seems to be a recurrence of the previous issue,
>>> the existing checks + HIVE-24360 should have restricted all incorrect cases.
>>> I built in some debug stuff while I made these patches, and it would
>>> help a lot to get a peek into those, but they need to be enabled by
>>> hand/etc. While I polish that a bit more, could you please share an
>>> EXPLAIN FORMATTED for one of the queries failing because of that patch?
>>>
>>
>> Please see the attachment for the result of EXPLAIN on query 12. (EXPLAIN
>> FORMATTED seems to have some problem.)  Hive tries to create two broadcast
>> edges from Reducer 8 to Map 6, thus raising an exception.
>>
>>  > 2. Query 14 fails in both cases, and it seems like another bug. Note
>>> that hive.cbo.enable is set to true when running query 14.
>>>
>>> I think you will find some cbo exception in the hive logs - explaining
>>> why it resorts to the non-cbo path.
>>>
>>
>> Indeed, it raises a RuntimeException:
>>
>> 20/11/17 13:04:22 ERROR parse.CalcitePlanner: CBO failed, skipping CBO.
>> java.lang.RuntimeException: equivalence mapping violation
>>   at org.apache.hadoop.hive.ql.plan.mapper.PlanMapper.link(PlanMapper.java:220)
>>
>>  Please see the attachment for the full stack trace.
>>
>>
>>>  > 3. For some queries, the number of rows is different between the two
>>> experiments. In most cases, it seems to be rounding errors, but the
>>> difference is rather large for
>>> some queries (e.g., query 29 and 58). Please see the attachment for the
>>> result.
>>>
>>> That's very odd. I recently fixed a bug in SWO that may have caused
>>> issues like this (HIVE-24365); I would recommend comparing the result with
>>> the whole thing off (hive.optimize.shared.work=false).
>>> If you could isolate and reproduce this in a qtest, I could also dig into
>>> it.
>>
>>
>> For now, let me report the result of testing HIVE-24366. Please see the
>> attachment for the result.
>>
>> HIVE-24366 (e9f72e654750de208227d46a22e983413b080c6c, Thu Nov 12)
>>
>> TEZ-4238 (22fec6c0ecc7ebe6f6f28800935cc6f69794dad5, Thu Oct 8)
>> guava.version=19.0 in pom.xml
>> hadoop.version=3.1.0 in pom.xml
>>
>> TPC-DS 100GB ORC
>>
>> hive.execution.engine=tez
>> hive.execution.mode=container, Tez containers are not reused across
>> queries.
>> hive.cbo.enable=true
>> hive.query.reexecution.stats.persist.scope=metastore (default value)
>>
>> 1) hive.optimize.shared.work = false
>> 2) hive.optimize.shared.work = true,
>>    hive.optimize.shared.work.dppunion = true
>> 3) hive.optimize.shared.work = true,
>>    hive.optimize.shared.work.dppunion = false
>>
>> For each case, the first column reports the execution time and the second
>> column reports the number of rows. If the number of rows is 1, it also
>> reports the sum of all values in the result.
>>
>> Cheers,
>>
>> --- Sungwoo
>>
>
