Hi Sungwoo,

As far as query 14 is concerned, the problem is logged in HIVE-24167 [1].
There is also a PR [2] for reproducing the problem so it should be feasible
to find the offending commit with git bisect.

For queries 2 and 59, I am also able to reproduce the behavior that you
mentioned (same reducer appearing twice in the same mapper) in the same PR
[2], and I think we should raise a JIRA ticket as well.

Best,
Stamatis

[1] https://issues.apache.org/jira/browse/HIVE-24167
[2] https://github.com/apache/hive/pull/1347

On Wed, Nov 18, 2020 at 1:33 AM Sungwoo Park <glap...@gmail.com> wrote:

>> > 1. With hive.optimize.shared.work.dppunion=true, queries 2 and 59 fail.
>> > Please see the attachment for stack traces.
>>
>> Even though the exception seems to be a recurrence of the previous issue,
>> the existing checks plus HIVE-24360 should have restricted all incorrect
>> cases. I built in some debug stuff while I made these patches, and it would
>> help a lot to get a peek into those, but they need to be enabled by hand,
>> etc. While I polish that a bit more, could you please share an EXPLAIN
>> FORMATTED for one of the queries failing because of that patch?
>>
>
> Please see the attachment for the result of EXPLAIN on query 12. (EXPLAIN
> FORMATTED seems to have some problem.)  Hive tries to create two broadcast
> edges from Reducer 8 to Map 6, thus raising an exception.
>
>> > 2. Query 14 fails in both cases, and it seems like another bug. Note
>> > that hive.cbo.enable is set to true when running query 14.
>>
>> I think you will find some cbo exception in the hive logs - explaining
>> why it resorts to the non-cbo path.
>>
>
> Indeed, it raises a RuntimeException:
>
> 20/11/17 13:04:22 ERROR parse.CalcitePlanner: CBO failed, skipping CBO.
> java.lang.RuntimeException: equivalence mapping violation
>   at org.apache.hadoop.hive.ql.plan.mapper.PlanMapper.link(PlanMapper.java:220)
>
>  Please see the attachment for the full stack trace.
>
>
>> > 3. For some queries, the number of rows is different between the two
>> > experiments. In most cases, it seems to be rounding errors, but the
>> > difference is rather large for some queries (e.g., query 29 and 58).
>> > Please see the attachment for the result.
>>
>> That's very odd. I've recently fixed a bug in SWO which may have caused
>> issues like this (HIVE-24365); I would recommend comparing the result with
>> the whole thing off (hive.optimize.shared.work=false).
>> If you could isolate and reproduce this in a qtest, I could also dig into
>> it.
>
>
> For now, let me report the result of testing HIVE-24366. Please see the
> attachment for the result.
>
> HIVE-24366 (e9f72e654750de208227d46a22e983413b080c6c, Thu Nov 12)
>
> TEZ-4238 (22fec6c0ecc7ebe6f6f28800935cc6f69794dad5, Thu Oct 8)
> guava.version=19.0 in pom.xml
> hadoop.version=3.1.0 in pom.xml
>
> TPC-DS 100GB ORC
>
> hive.execution.engine=tez
> hive.execution.mode=container, Tez containers are not reused across
> queries.
> hive.cbo.enable=true
> hive.query.reexecution.stats.persist.scope=metastore (default value)
>
> 1) hive.optimize.shared.work=false
> 2) hive.optimize.shared.work=true, hive.optimize.shared.work.dppunion=true
> 3) hive.optimize.shared.work=true, hive.optimize.shared.work.dppunion=false
>
> For each case, the first column reports the execution time and the second
> column reports the number of rows. If the number of rows is 1, it also
> reports the sum of all values in the result.
>
> Cheers,
>
> --- Sungwoo
>
