Hi everyone,
I would like to resume the discussion on the release of Hive 4 and
the results of the TPC-DS benchmark.
Currently there are four unresolved JIRAs marked 'hive-4.0.0-must' which must be
resolved before the release of Hive 4 ([1], [2], [3], [4]). The most urgent one
is perhaps HIVE-26654 [1] which reports failing queries in the TPC-DS benchmark.
(All of these bugs were introduced after the release of Hive 3.1.2, which passes
all the TPC-DS tests.)
Originally we reported 7 failing cases in HIVE-26654. Since then, 3 cases have
been resolved, 2 cases have pull requests, and 2 cases don't have pull requests
yet.
1. Query 17: Resolved in HIVE-26655 [6]
2. Query 16, 69, 94: Resolved in HIVE-26659 [8]
3. Query 64: Resolved in HIVE-26968 [10]
4. Query 2: Pull request available in HIVE-27006 [5]
5. Query 71: Pull request available in HIVE-26986 [9]
6. Query 14: Reported in HIVE-24167 [7]
7. Query 97: Reported in HIVE-27269 [11]
Seonggon and I (on the MR3 team) have been working on these problems, and so far
we have submitted 4 pull requests. Two of them have been merged, but the other
two (for query 2 and query 71) have not been reviewed yet. I'd appreciate it very
much if Hive committers could review the remaining pull requests.
The remaining problems are query 14 and query 97.
For query 14, I suggest that we take a simple workaround by setting
hive.optimize.cte.materialize.threshold to -1 by default because nobody seems to
be working on this JIRA. If necessary, we could try to fix it after the release
of Hive 4.
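As a minimal sketch of this workaround at the session level (assuming the
property is not restricted on the server), one could try the following in
Beeline before running query 14. Changing the default itself would be done in
HiveConf and is not shown here.

```sql
-- Disable CTE materialization for the current session only.
-- A value of -1 turns the feature off.
SET hive.optimize.cte.materialize.threshold=-1;
```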
For query 97 (which we think is the most challenging one among all the
sub-JIRAs), we have a few choices:
1) Use a quick-fix solution by ignoring hive.mapjoin.hashtable.load.threads when
FullOuterJoin is used
2) Fix HIVE-25583 [12], which introduced this bug
3) Fix it properly
I suggest that we take a quick-fix solution and revisit the problem after the
release of Hive 4.
(We have also observed a performance regression in Hive, but I guess that is
another topic to discuss after fixing the correctness issues.)
Please let us know what you think.
Thanks,
--- Sungwoo
[1] https://issues.apache.org/jira/browse/HIVE-26654
[2] https://issues.apache.org/jira/browse/HIVE-27226
[3] https://issues.apache.org/jira/browse/HIVE-26505
[4] https://issues.apache.org/jira/browse/HIVE-22636
[5] https://issues.apache.org/jira/browse/HIVE-27006
[6] https://issues.apache.org/jira/browse/HIVE-26655
[7] https://issues.apache.org/jira/browse/HIVE-24167
[8] https://issues.apache.org/jira/browse/HIVE-26659
[9] https://issues.apache.org/jira/browse/HIVE-26986
[10] https://issues.apache.org/jira/browse/HIVE-26968
[11] https://issues.apache.org/jira/browse/HIVE-27269
[12] https://issues.apache.org/jira/browse/HIVE-25583