Hi everyone,

I would like to resume the discussion on the release of Hive 4 and the result of the TPC-DS benchmark.

Currently there are four unresolved JIRAs marked 'hive-4.0.0-must' which must be resolved before the release of Hive 4 ([1], [2], [3], [4]). The most urgent one is perhaps HIVE-26654 [1], which reports failing queries in the TPC-DS benchmark. (All these bugs were introduced after the release of Hive 3.1.2, which passes all the TPC-DS tests.)

Originally we reported 7 failing cases in HIVE-26654. Since then, 3 cases have been resolved, 2 cases have pull requests, and 2 cases don't have pull requests yet.

1. Query 17: Resolved in HIVE-26655 [6]
2. Query 16, 69, 94: Resolved in HIVE-26659 [8]
3. Query 64: Resolved in HIVE-26968 [10]

4. Query 2: Pull request available in HIVE-27006 [5]
5. Query 71: Pull request available in HIVE-26986 [9]

6. Query 14: Reported in HIVE-24167 [7]
7. Query 97: Reported in HIVE-27269 [11]

Seonggon and I (on the MR3 team) have been working on these problems, and so far we have submitted 4 pull requests. Two of them have been merged, but the other two (for query 2 and query 71) have not been reviewed yet. I'd appreciate it very much if Hive committers could review the remaining pull requests.

The remaining problems are query 14 and query 97.

For query 14, I suggest that we take a simple workaround by setting hive.optimize.cte.materialize.threshold to -1 by default, because nobody seems to be working on this JIRA. If necessary, we could try to fix it after the release of Hive 4.
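
For anyone who wants to try this before any change to the default lands, the session-level equivalent would be something like the following. This is only a sketch of the workaround as a per-session setting; the proposal above is about changing the shipped default.

```
-- A value of -1 disables CTE materialization for the current session
SET hive.optimize.cte.materialize.threshold=-1;
```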

For query 97 (which we think is the most challenging one among all the sub-JIRAs), we have a few choices:

1) Use a quick-fix solution by ignoring hive.mapjoin.hashtable.load.threads when FullOuterJoin is used
2) Fix HIVE-25583 [12], which introduced this bug
3) Fix it properly

I suggest that we take a quick-fix solution and revisit the problem after the release of Hive 4.

(We have also observed performance regressions in Hive, but I guess that is another topic to discuss after fixing the correctness issues.)

Please let us know what you think.

Thanks,

--- Sungwoo

[1] https://issues.apache.org/jira/browse/HIVE-26654
[2] https://issues.apache.org/jira/browse/HIVE-27226
[3] https://issues.apache.org/jira/browse/HIVE-26505
[4] https://issues.apache.org/jira/browse/HIVE-22636
[5] https://issues.apache.org/jira/browse/HIVE-27006
[6] https://issues.apache.org/jira/browse/HIVE-26655
[7] https://issues.apache.org/jira/browse/HIVE-24167
[8] https://issues.apache.org/jira/browse/HIVE-26659
[9] https://issues.apache.org/jira/browse/HIVE-26986
[10] https://issues.apache.org/jira/browse/HIVE-26968
[11] https://issues.apache.org/jira/browse/HIVE-27269
[12] https://issues.apache.org/jira/browse/HIVE-25583

