John Sherman created HIVE-26633:
-----------------------------------

             Summary: Make thrift max message size configurable
                 Key: HIVE-26633
                 URL: https://issues.apache.org/jira/browse/HIVE-26633
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2
    Affects Versions: 4.0.0-alpha-2
            Reporter: John Sherman
            Assignee: John Sherman


Since thrift >= 0.14, thrift now enforces max message sizes through a 
TConfiguration object as described here:
[https://github.com/apache/thrift/blob/master/doc/specs/thrift-tconfiguration.md]

By default MaxMessageSize gets set to 100MB.

As a result it is possible for HMS clients not to be able to retrieve certain 
metadata for tables with a large amount of partitions or other metadata.

For example on a cluster configured with kerberos between hs2 and hms, querying 
a large table (10k partitions, 200 columns with names of 200 characters) 
results in this backtrace:
{code:java}
org.apache.thrift.transport.TTransportException: MaxMessageSize reached
at 
org.apache.thrift.transport.TEndpointTransport.countConsumedMessageBytes(TEndpointTransport.java:96)
 
at 
org.apache.thrift.transport.TMemoryInputTransport.read(TMemoryInputTransport.java:97)
 
at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:390) 
at 
org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:39)
 
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:109) 
at 
org.apache.hadoop.hive.metastore.security.TFilterTransport.readAll(TFilterTransport.java:63)
 
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:464) 
at 
org.apache.thrift.protocol.TBinaryProtocol.readByte(TBinaryProtocol.java:329) 
at 
org.apache.thrift.protocol.TBinaryProtocol.readFieldBegin(TBinaryProtocol.java:273)
 
at 
org.apache.hadoop.hive.metastore.api.FieldSchema$FieldSchemaStandardScheme.read(FieldSchema.java:461)
 
at 
org.apache.hadoop.hive.metastore.api.FieldSchema$FieldSchemaStandardScheme.read(FieldSchema.java:454)
 
at org.apache.hadoop.hive.metastore.api.FieldSchema.read(FieldSchema.java:388) 
at 
org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.read(StorageDescriptor.java:1269)
 
at 
org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.read(StorageDescriptor.java:1248)
 
at 
org.apache.hadoop.hive.metastore.api.StorageDescriptor.read(StorageDescriptor.java:1110)
 
at 
org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.read(Partition.java:1270)
 
at 
org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.read(Partition.java:1205)
 
at org.apache.hadoop.hive.metastore.api.Partition.read(Partition.java:1062) 
at 
org.apache.hadoop.hive.metastore.api.PartitionsByExprResult$PartitionsByExprResultStandardScheme.read(PartitionsByExprResult.java:420)
 
at 
org.apache.hadoop.hive.metastore.api.PartitionsByExprResult$PartitionsByExprResultStandardScheme.read(PartitionsByExprResult.java:399)
 
at 
org.apache.hadoop.hive.metastore.api.PartitionsByExprResult.read(PartitionsByExprResult.java:335)
 
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_by_expr_result$get_partitions_by_expr_resultStandardScheme.read(ThriftHiveMetastore.java)
  {code}

Making this configurable (and defaulting to a higher value) would allow these 
tables to still be accessible.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to