Hi Sungwoo,
HIVE-23975 (https://issues.apache.org/jira/browse/HIVE-23975) is causing a
regression in runtime. There is a ticket open to fix it
(https://issues.apache.org/jira/browse/HIVE-24139), which is still in
progress. You might want to revert HIVE-23975 before trying again.
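
If the failures remain after the revert, the 702-commit range mentioned below
can be narrowed by bisection instead of a linear search. What follows is only a
minimal sketch of the idea, assuming a single good-to-bad transition in the
history; the function name, the commit labels and the culprit index are
hypothetical and not taken from the Hive repository. With a gap of 702 commits,
at most 10 builds are needed.

```python
from typing import Callable, Sequence, Tuple


def first_bad_commit(
    commits: Sequence[str], is_bad: Callable[[str], bool]
) -> Tuple[str, int]:
    """Binary-search an oldest-to-newest commit list for the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and that the
    failure persists once introduced. Returns the first bad commit and the
    number of commits that had to be tested.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    tested = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tested += 1
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi], tested


# Hypothetical history: one known-good base commit plus 702 later commits.
history = [f"commit-{i}" for i in range(703)]
culprit_index = 412  # made-up position of the commit that broke the queries

found, tested = first_bad_commit(
    history, lambda c: int(c.split("-")[1]) >= culprit_index
)
print(found, "found after testing", tested, "commits")
```

In practice `git bisect` performs the same search on a real repository, and
`git bisect skip` can be used for commits that cannot be built.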

On Wed, Nov 4, 2020 at 2:55 PM Stamatis Zampetakis <zabe...@gmail.com>
wrote:

> Hi Sungwoo,
>
> Personally, I would also be interested to see the results of these
> experiments if they are available somewhere.
>
> I didn't understand if the queries are failing at runtime or compile time.
> Are the above errors the only ones that you're getting?
>
> If you can reproduce the problem with a smaller dataset then I think the
> best approach would be to create unit tests and JIRAs for each query
> separately.
>
> It may not be worth going through the commits to find those that caused the
> regression because it will be time-consuming and you may bump into
> something that is not trivial to revert.
>
> Best,
> Stamatis
>
>
> On Wed, Nov 4, 2020 at 7:24 PM Sungwoo Park <glap...@gmail.com> wrote:
>
> > Hello,
> >
> > I have tested a recent commit of the master branch using the TPC-DS
> > benchmark. I used Hive on Tez (not Hive-LLAP). The way I tested is:
> >
> > 1) create a database consisting of external tables from a 100GB TPC-DS
> > text dataset
> > 2) create a database consisting of ORC tables from the previous database
> > 3) compute column statistics
> > 4) run TPC-DS queries and check the results
> >
> > Previously we tested the commit 5f47808c02816edcd4c323dfa25194536f3f20fd
> > (HIVE-23114: Insert overwrite with dynamic partitioning is not working
> > correctly with direct insert, Fri Apr 10), and all queries ran okay.
> >
> > This time I used the following commits. I made a few changes to pom.xml
> > of both Hive and Tez, but these changes should not affect the result of
> > running queries.
> >
> > 1) Hive, master, 96aacdc50043fa442c2277b7629812e69241a507 (Tue Nov
> > 3), HIVE-24314: compactor.Cleaner should not set state mark cleaned if it
> > didn't remove any files
> > 2) Tez, 0.10.0, 22fec6c0ecc7ebe6f6f28800935cc6f69794dad5 (Thu Oct
> > 8), CHANGES.txt updated with TEZ-4238
> >
> > The result is that 14 of the 99 queries fail, each during compilation
> > for one of the following two reasons.
> >
> > 1)
> > org.apache.hive.service.cli.HiveSQLException: Error while compiling
> > statement: FAILED: Execution Error, return code 1 from
> > org.apache.hadoop.hive.ql.exec.tez.TezTask. Edge [Map 12 :
> > org.apache.hadoop.hive.ql.exec.tez.MapTezProcessor] -> [Map 7 :
> > org.apache.hadoop.hive.ql.exec.tez.MapTezProcessor] ({ BROADCAST :
> > org.apache.tez.runtime.library.input.UnorderedKVInput >> PERSISTED >>
> > org.apache.tez.runtime.library.output.UnorderedKVOutput >> NullEdgeManager
> > }) already defined!
> >   at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:365)
> >   at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:241)
> >   at org.apache.hive.service.cli.operation.SQLOperation.access$500(SQLOperation.java:88)
> >   at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:325)
> >   at java.security.AccessController.doPrivileged(Native Method)
> >   at javax.security.auth.Subject.doAs(Subject.java:422)
> >   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
> >   at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:343)
> >   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >   at java.lang.Thread.run(Thread.java:745)
> > Caused by: java.lang.IllegalArgumentException: Edge [Map 12 :
> > org.apache.hadoop.hive.ql.exec.tez.MapTezProcessor] -> [Map 7 :
> > org.apache.hadoop.hive.ql.exec.tez.MapTezProcessor] ({ BROADCAST :
> > org.apache.tez.runtime.library.input.UnorderedKVInput >> PERSISTED >>
> > org.apache.tez.runtime.library.output.UnorderedKVOutput >> NullEdgeManager
> > }) already defined!
> >   at org.apache.tez.dag.api.DAG.addEdge(DAG.java:297)
> >   at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:519)
> >   at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:213)
> >   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
> >   at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
> >   at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:361)
> >   at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:334)
> >   at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:245)
> >   at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:108)
> >   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:326)
> >   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:149)
> >   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:144)
> >   at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:164)
> >   at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:228)
> >   ... 11 more
> >
> > 2)
> > Caused by: java.lang.NullPointerException
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:4491)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:4474)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:10940)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10882)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11776)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11633)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11660)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11633)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11660)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11646)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlanForSubQueryPredicate(SemanticAnalyzer.java:3386)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3484)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10830)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11776)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11633)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11636)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11636)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11660)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11633)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11660)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11646)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:12428)
> >   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:718)
> >   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12539)
> >   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:443)
> >   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
> >   at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223)
> >
> > There are 702 commits between the last commit of April 10 and the recent
> > commit, and I guess some of these commits introduce the bugs shown above.
> > I could perform a manual search to locate the commits, but it is not always
> > easy to build Hive using a particular commit (mostly because of version
> > conflicts with Tez, Hadoop, Guava, or others).
> >
> > If anybody can suggest specific commits to test, or guide me in
> > bug-hunting, please let me know. There is another bug that shows itself
> > when computing column statistics, but we can suppress it by using default
> > values in tez-site.xml.
> >
> > Thanks,
> >
> > --- Sungwoo
> >
>
