Quanlong Huang created IMPALA-13994:
---------------------------------------

             Summary: Thrift client hive_client shouldn't be used in multiple threads
                 Key: IMPALA-13994
                 URL: https://issues.apache.org/jira/browse/IMPALA-13994
             Project: IMPALA
          Issue Type: Bug
          Components: Test
            Reporter: Quanlong Huang


In ImpalaTestSuite, we create a ThriftHiveMetastore.Client as hive_client:
[https://github.com/apache/impala/blob/648209b17258cf610f4e73a3ed63de665216074f/tests/common/impala_test_suite.py#L255]

Unlike the other clients we create for Impala, this Thrift client is not 
thread-safe and shouldn't be shared across threads in parallel tests. See 
THRIFT-2283 and this email thread:
[https://lists.apache.org/thread/4rsjdtlpv8zrgknpf43vo5rg9q83b6wp]
{quote}The Thrift transport layer is not thread-safe. It is essentially a 
wrapper on a socket. You can't interleave writing things to a single socket 
from multiple threads without locking. You also don't know what order the 
responses will come back in.
{quote}
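If one client really has to be shared, the quoted point about locking suggests serializing each request/response pair. The following is only a minimal sketch under that assumption: SerializedClient and FakeRpcClient are hypothetical names that exist in neither Impala nor Thrift, and the fake merely counts overlapping calls instead of talking to a socket.

```python
import threading
import time


class SerializedClient(object):
    """Hypothetical proxy that lets only one thread call the client at a time."""

    def __init__(self, client):
        self._client = client
        self._lock = threading.Lock()

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        attr = getattr(self._client, name)
        if not callable(attr):
            return attr

        def locked_call(*args, **kwargs):
            # Hold the lock for the whole call so the send and the matching
            # receive cannot interleave with another thread's request.
            with self._lock:
                return attr(*args, **kwargs)

        return locked_call


class FakeRpcClient(object):
    """Stand-in for a Thrift client; records how many calls overlap."""

    def __init__(self):
        self.in_flight = 0
        self.max_in_flight = 0

    def drop_table(self, db, tbl_name, deleteData=True):
        self.in_flight += 1
        self.max_in_flight = max(self.max_in_flight, self.in_flight)
        time.sleep(0.001)  # pretend to wait for the server's reply
        self.in_flight -= 1
        return "%s.%s" % (db, tbl_name)


def run_serialized_demo(num_threads=4):
    fake = FakeRpcClient()
    shared = SerializedClient(fake)
    threads = [
        threading.Thread(target=shared.drop_table, args=("db", "tbl_%d" % i))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return fake
```

With the lock in place, the fake never sees more than one call in flight. The cost is that all metastore calls made through the shared client run one at a time.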
Here are some exceptions I hit when using it in two threads in 
https://gerrit.cloudera.org/c/22816/3:
{noformat}
Exception in thread Thread-4:
Traceback (most recent call last):
  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/quanlong/workspace/Impala/tests/metadata/test_event_processing.py", line 636, in drop_table_in_hive
    self.hive_client.drop_table(db, tbl_name, deleteData=True)
  File "/home/quanlong/workspace/Impala/shell/gen-py/impala_thrift_gen/hive_metastore/ThriftHiveMetastore.py", line 3913, in drop_table
    self.recv_drop_table()
  File "/home/quanlong/workspace/Impala/shell/gen-py/impala_thrift_gen/hive_metastore/ThriftHiveMetastore.py", line 3937, in recv_drop_table
    raise result.o1
NoSuchObjectException: NoSuchObjectException(message='null: null')
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/quanlong/workspace/Impala/tests/metadata/test_event_processing.py", line 636, in drop_table_in_hive
    self.hive_client.drop_table(db, tbl_name, deleteData=True)
  File "/home/quanlong/workspace/Impala/shell/gen-py/impala_thrift_gen/hive_metastore/ThriftHiveMetastore.py", line 3913, in drop_table
    self.recv_drop_table()
  File "/home/quanlong/workspace/Impala/shell/gen-py/impala_thrift_gen/hive_metastore/ThriftHiveMetastore.py", line 3927, in recv_drop_table
    (fname, mtype, rseqid) = iprot.readMessageBegin()
  File "/home/quanlong/workspace/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 134, in readMessageBegin
    sz = self.readI32()
  File "/home/quanlong/workspace/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 217, in readI32
    buff = self.trans.readAll(4)
  File "/home/quanlong/workspace/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 62, in readAll
    chunk = self.read(sz - have)
  File "/home/quanlong/workspace/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 164, in read
    self.__rbuf = BufferIO(self.__trans.read(max(sz, self.__rbuf_size)))
  File "/home/quanlong/workspace/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 164, in read
    raise TTransportException(message="unexpected exception", inner=e)
TTransportException: unexpected exception
{noformat}
ERROR logs on the HMS side indicate that the request data it received was corrupted:
{noformat}
2025-04-25T13:49:50,021  INFO [TThreadPoolServer WorkerProcess-188] metastore.HiveMetaStore: 203: source:127.0.0.1 drop_table : tbl=null.null.null
2025-04-25T13:49:50,021  INFO [TThreadPoolServer WorkerProcess-188] HiveMetaStore.audit: ugi=quanlong   ip=127.0.0.1    cmd=source:127.0.0.1 drop_table : tbl=null.null.null
2025-04-25T13:49:50,022  WARN [TThreadPoolServer WorkerProcess-188] metastore.ObjectStore: Falling back to ORM path due to direct SQL failure (this is not an error): null at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getDatabase(MetaStoreDirectSql.java:393) at org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:896)
2025-04-25T13:49:50,022 ERROR [TThreadPoolServer WorkerProcess-188] metastore.ObjectStore:
java.lang.NullPointerException: null
        at org.apache.hadoop.hive.metastore.utils.StringUtils.normalizeIdentifier(StringUtils.java:94) ~[hive-standalone-metastore-3.1.3000.7.3.1.0-160.jar:3.1.3000.7.3.1.0-160]
        at org.apache.hadoop.hive.metastore.ObjectStore.getMDatabase(ObjectStore.java:853) ~[hive-standalone-metastore-3.1.3000.7.3.1.0-160.jar:3.1.3000.7.3.1.0-160]
        at org.apache.hadoop.hive.metastore.ObjectStore.getJDODatabase(ObjectStore.java:911) ~[hive-standalone-metastore-3.1.3000.7.3.1.0-160.jar:3.1.3000.7.3.1.0-160]
        at org.apache.hadoop.hive.metastore.ObjectStore$1.getJdoResult(ObjectStore.java:901) ~[hive-standalone-metastore-3.1.3000.7.3.1.0-160.jar:3.1.3000.7.3.1.0-160]
        at org.apache.hadoop.hive.metastore.ObjectStore$1.getJdoResult(ObjectStore.java:893) ~[hive-standalone-metastore-3.1.3000.7.3.1.0-160.jar:3.1.3000.7.3.1.0-160]
        at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:4302) ~[hive-standalone-metastore-3.1.3000.7.3.1.0-160.jar:3.1.3000.7.3.1.0-160]
        at org.apache.hadoop.hive.metastore.ObjectStore.getDatabaseInternal(ObjectStore.java:903) ~[hive-standalone-metastore-3.1.3000.7.3.1.0-160.jar:3.1.3000.7.3.1.0-160]
        at org.apache.hadoop.hive.metastore.ObjectStore.getDatabase(ObjectStore.java:875) ~[hive-standalone-metastore-3.1.3000.7.3.1.0-160.jar:3.1.3000.7.3.1.0-160]
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) ~[?:?]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_432]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_432]
        at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) ~[hive-standalone-metastore-3.1.3000.7.3.1.0-160.jar:3.1.3000.7.3.1.0-160]
        at com.sun.proxy.$Proxy33.getDatabase(Unknown Source) ~[?:?]
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:3253) ~[hive-standalone-metastore-3.1.3000.7.3.1.0-160.jar:3.1.3000.7.3.1.0-160]
{noformat}
Another kind of ERROR log:
{noformat}
2025-04-25T13:49:50,054 ERROR [TThreadPoolServer WorkerProcess-188] server.TThreadPoolServer: Thrift Error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:254) ~[libthrift-0.16.0.jar:0.16.0]
        at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:76) ~[hive-standalone-metastore-3.1.3000.7.3.1.0-160.jar:3.1.3000.7.3.1.0-160]
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250) ~[libthrift-0.16.0.jar:0.16.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_432]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_432]
        at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_432]
{noformat}
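One way to avoid the failures above might be to give each test thread its own client instead of sharing hive_client. The sketch below only illustrates that idea: PerThreadClient and FakeClient are hypothetical names, not part of ImpalaTestSuite, and in a real test the factory would have to open a separate Thrift transport for every thread.

```python
import threading


class PerThreadClient(object):
    """Hypothetical helper that lazily builds one client per thread."""

    def __init__(self, factory):
        self._factory = factory
        self._local = threading.local()

    def get(self):
        # threading.local keeps a separate 'client' attribute per thread.
        client = getattr(self._local, "client", None)
        if client is None:
            client = self._factory()
            self._local.client = client
        return client


class FakeClient(object):
    """Stand-in for a Thrift client; a real one would own its own socket."""

    def drop_table(self, db, tbl_name, deleteData=True):
        return "%s.%s" % (db, tbl_name)


def run_per_thread_demo(num_threads=4):
    clients = PerThreadClient(FakeClient)
    seen = []
    seen_lock = threading.Lock()

    def worker(i):
        client = clients.get()
        client.drop_table("db", "tbl_%d" % i)
        # A second lookup in the same thread returns the same object.
        same = clients.get() is client
        with seen_lock:
            seen.append((client, same))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

Each worker ends up with a distinct client object, so no two threads ever write to the same transport. The trade-off is one metastore connection per thread.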


