[ 
https://issues.apache.org/jira/browse/HIVE-27582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27582:
----------------------------------
    Labels: pull-request-available  (was: )

> Do not cache HBase table input format in FetchOperator
> ------------------------------------------------------
>
>                 Key: HIVE-27582
>                 URL: https://issues.apache.org/jira/browse/HIVE-27582
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ganesha Shreedhara
>            Assignee: Ganesha Shreedhara
>            Priority: Major
>              Labels: pull-request-available
>
> Caching the HBase table input format in FetchOperator causes Hive queries to
> fail with the following exception:
> ```
> 2023-08-08T09:43:28,800 WARN  [HiveServer2-Handler-Pool: Thread-47([])]: 
> thrift.ThriftCLIService (ThriftCLIService.java:FetchResults(809)) - Error 
> fetching results:
> org.apache.hive.service.cli.HiveSQLException: java.io.IOException: 
> java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: 
> Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@2ae0e353
>  rejected from java.util.concurrent.ThreadPoolExecutor@663dd540[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2]
>         at 
> org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:485)
>  ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328)
>  ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:926)
>  ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at sun.reflect.GeneratedMethodAccessor34.invoke(Unknown Source) ~[?:?]
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_382]
>         at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_382]
>         at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>  ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>  ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>  ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at java.security.AccessController.doPrivileged(Native Method) 
> ~[?:1.8.0_382]
>         at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_382]
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
>  ~[hadoop-common-3.3.3-amzn-2.jar:?]
>         at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>  ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at com.sun.proxy.$Proxy43.fetchResults(Unknown Source) ~[?:?]
>         at 
> org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:568) 
> ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:800)
>  [hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1900)
>  [hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1880)
>  [hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38) 
> [hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38) 
> [hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
>  [hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:313)
>  [hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_382]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [?:1.8.0_382]
>         at java.lang.Thread.run(Thread.java:750) [?:1.8.0_382]
> Caused by: java.io.IOException: java.lang.RuntimeException: 
> java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@2ae0e353
>  rejected from java.util.concurrent.ThreadPoolExecutor@663dd540[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2]
>         at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:617)
>  ~[hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:522) 
> ~[hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) 
> ~[hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2737) 
> ~[hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
>  ~[hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:480)
>  ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         ... 24 more
> ```
>  
> This happens because the HBase input format is stateful. When queries run
> concurrently, the same HBase input format instance is reused by multiple
> threads. When one thread closes the HBase connection, another thread that
> still needs to fetch data from HBase cannot submit tasks to HTable's
> ThreadPoolExecutor, because the connection is closed and the
> ThreadPoolExecutor has been shut down.
>  
> Caching of the HBase input format was disabled in HiveInputFormat as part of
> HIVE-8808. The same fix needs to be applied in FetchOperator.
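
The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not Hive or HBase code: `StatefulFormat`, `CACHE`, `getFormat` and `secondQuerySucceeds` are hypothetical names, and the thread pool only stands in for the executor owned by the HBase client. The two "queries" run sequentially so the outcome is deterministic; in HiveServer2 the same interleaving would arise between concurrent handler threads.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

public class InputFormatCacheSketch {

    /** Hypothetical stand-in for a stateful input format that owns a thread pool. */
    static final class StatefulFormat implements AutoCloseable {
        private final ExecutorService pool = Executors.newSingleThreadExecutor();

        String fetch() throws Exception {
            // submit() throws RejectedExecutionException once the pool is shut down.
            return pool.submit(() -> "row").get();
        }

        @Override
        public void close() {
            pool.shutdown();
        }
    }

    /** Hypothetical cache keyed by input format name. */
    private static final Map<String, StatefulFormat> CACHE = new ConcurrentHashMap<>();

    /** Returns the shared cached instance when useCache is true, otherwise a fresh one. */
    static StatefulFormat getFormat(String name, boolean useCache) {
        return useCache
            ? CACHE.computeIfAbsent(name, k -> new StatefulFormat())
            : new StatefulFormat();
    }

    /** Simulates two overlapping queries; returns true if the second fetch succeeds. */
    static boolean secondQuerySucceeds(boolean useCache) throws Exception {
        CACHE.clear();
        StatefulFormat first = getFormat("hbase", useCache);
        StatefulFormat second = getFormat("hbase", useCache);
        first.fetch();
        first.close();          // the first query finishes and releases its resources
        try {
            second.fetch();     // the second query is still reading
            return true;
        } catch (RejectedExecutionException e) {
            return false;       // the shared pool was already shut down by the first query
        } finally {
            second.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("cached:   " + secondQuerySucceeds(true));
        System.out.println("uncached: " + secondQuerySucceeds(false));
    }
}
```

With the cache enabled, both queries receive the same instance, so closing it in one query breaks the other. Bypassing the cache gives each query its own instance and its own pool, which is the behavior the issue asks FetchOperator to adopt for the HBase input format.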



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
