Ganesha Shreedhara created HIVE-27582:
-----------------------------------------

Summary: Do not cache HBase table input format in FetchOperator
Key: HIVE-27582
URL: https://issues.apache.org/jira/browse/HIVE-27582
Project: Hive
Issue Type: Bug
Reporter: Ganesha Shreedhara
Assignee: Ganesha Shreedhara

Caching the HBase table input format in FetchOperator causes Hive queries to fail with the following exception:

```
2023-08-08T09:43:28,800 WARN [HiveServer2-Handler-Pool: Thread-47([])]: thrift.ThriftCLIService (ThriftCLIService.java:FetchResults(809)) - Error fetching results:
org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@2ae0e353 rejected from java.util.concurrent.ThreadPoolExecutor@663dd540[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2]
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:485) ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328) ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:926) ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at sun.reflect.GeneratedMethodAccessor34.invoke(Unknown Source) ~[?:?]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_382]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_382]
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78) ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36) ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63) ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_382]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_382]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878) ~[hadoop-common-3.3.3-amzn-2.jar:?]
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at com.sun.proxy.$Proxy43.fetchResults(Unknown Source) ~[?:?]
    at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:568) ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:800) [hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1900) [hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1880) [hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38) [hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) [hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:313) [hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_382]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_382]
    at java.lang.Thread.run(Thread.java:750) [?:1.8.0_382]
Caused by: java.io.IOException: java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@2ae0e353 rejected from java.util.concurrent.ThreadPoolExecutor@663dd540[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2]
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:617) ~[hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:522) ~[hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) ~[hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2737) ~[hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) ~[hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:480) ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
    ... 24 more
```

This happens because the HBase input format is stateful. When queries run concurrently, the same HBase input format instance is reused by multiple threads. When one thread closes its HBase connection, another thread that tries to fetch data from HBase cannot submit a task to HTable's ThreadPoolExecutor, because the connection is closed and the ThreadPoolExecutor has been shut down.

Caching of the HBase input format was disabled in HiveInputFormat as part of HIVE-8808. The same fix needs to be applied in FetchOperator.
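
The failure mode described above can be reproduced in miniature with plain JDK classes. The sketch below is not Hive or HBase code: `SharedInputFormatDemo`, `StatefulInputFormat`, `cached`, and the cache key are hypothetical names chosen for illustration, and the two "queries" are interleaved sequentially on one thread so the outcome is deterministic. It only shows why handing out one cached, stateful object to several users ends in a `RejectedExecutionException` once one of them closes it, and why creating a fresh instance per user avoids the problem.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

public class SharedInputFormatDemo {

    // Hypothetical stand-in for a stateful input format that owns a thread pool.
    static final class StatefulInputFormat implements AutoCloseable {
        private final ExecutorService pool = Executors.newFixedThreadPool(2);

        // Submits work to the owned pool; throws RejectedExecutionException after close().
        String fetch(String row) throws Exception {
            return pool.submit(() -> "row:" + row).get();
        }

        @Override
        public void close() {
            pool.shutdown(); // analogous to closing the connection and its executor
        }
    }

    // Hypothetical cache that hands the same instance to every caller.
    private static final Map<String, StatefulInputFormat> CACHE = new ConcurrentHashMap<>();

    static StatefulInputFormat cached(String name) {
        return CACHE.computeIfAbsent(name, k -> new StatefulInputFormat());
    }

    // Two "queries" share the cached instance; the second fails once the first closes it.
    static boolean sharedInstanceFails() throws Exception {
        StatefulInputFormat queryA = cached("hbase");
        StatefulInputFormat queryB = cached("hbase"); // same object as queryA
        queryA.fetch("1");
        queryA.close(); // query A finishes and releases its resources
        try {
            queryB.fetch("2"); // query B still holds the closed instance
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    // Without caching, each "query" owns its instance and closing one cannot affect the other.
    static String freshInstanceWorks() throws Exception {
        try (StatefulInputFormat queryA = new StatefulInputFormat()) {
            queryA.fetch("1");
        }
        try (StatefulInputFormat queryB = new StatefulInputFormat()) {
            return queryB.fetch("2");
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("shared instance rejected: " + sharedInstanceFails());
        System.out.println("fresh instance result: " + freshInstanceWorks());
    }
}
```

In this sketch the fix amounts to bypassing the cache for the stateful type, which mirrors the approach the issue proposes for FetchOperator; the actual change in Hive may differ in detail.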