[ https://issues.apache.org/jira/browse/IMPALA-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Quanlong Huang resolved IMPALA-11402.
-------------------------------------
    Target Version: Impala 5.0.0
        Resolution: Fixed

Resolving this. Thanks [~kdeschle], [~daniel.becker] and [~rizaon] for the review!

> getPartialCatalogObject fails with OOM with huge number of files
> ----------------------------------------------------------------
>
>                 Key: IMPALA-11402
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11402
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>
> The response size of getPartialCatalogObject depends on the number of
> partitions in the request. Even with the optimization of IMPALA-7501, the
> response size could still exceed the 2GB byte array limit if requesting all
> partitions of a huge table. E.g.
> {noformat}
> I0224 02:30:32.183627 28707 jni-util.cc:321] java.lang.OutOfMemoryError
>         at java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
>         at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
>         at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>         at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
>         at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:197)
>         at org.apache.thrift.protocol.TBinaryProtocol.writeBinary(TBinaryProtocol.java:236)
>         at org.apache.impala.thrift.THdfsFileDesc$THdfsFileDescStandardScheme.write(THdfsFileDesc.java:450)
>         at org.apache.impala.thrift.THdfsFileDesc$THdfsFileDescStandardScheme.write(THdfsFileDesc.java:405)
>         at org.apache.impala.thrift.THdfsFileDesc.write(THdfsFileDesc.java:346)
>         at org.apache.impala.thrift.TPartialPartitionInfo$TPartialPartitionInfoStandardScheme.write(TPartialPartitionInfo.java:1647)
>         at org.apache.impala.thrift.TPartialPartitionInfo$TPartialPartitionInfoStandardScheme.write(TPartialPartitionInfo.java:1433)
>         at org.apache.impala.thrift.TPartialPartitionInfo.write(TPartialPartitionInfo.java:1265)
>         at org.apache.impala.thrift.TPartialTableInfo$TPartialTableInfoStandardScheme.write(TPartialTableInfo.java:1402)
>         at org.apache.impala.thrift.TPartialTableInfo$TPartialTableInfoStandardScheme.write(TPartialTableInfo.java:1215)
>         at org.apache.impala.thrift.TPartialTableInfo.write(TPartialTableInfo.java:1061)
>         at org.apache.impala.thrift.TGetPartialCatalogObjectResponse$TGetPartialCatalogObjectResponseStandardScheme.write(TGetPartialCatalogObjectResponse.java:1157)
>         at org.apache.impala.thrift.TGetPartialCatalogObjectResponse$TGetPartialCatalogObjectResponseStandardScheme.write(TGetPartialCatalogObjectResponse.java:1010)
>         at org.apache.impala.thrift.TGetPartialCatalogObjectResponse.write(TGetPartialCatalogObjectResponse.java:876)
>         at org.apache.thrift.TSerializer.serialize(TSerializer.java:84)
>         at org.apache.impala.service.JniCatalogOp.lambda$execAndSerialize$1(JniCatalogOp.java:91)
>         at org.apache.impala.service.JniCatalogOp.execOp(JniCatalogOp.java:58)
>         at org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:89)
>         at org.apache.impala.service.JniCatalogOp.execAndSerializeSilentStartAndFinish(JniCatalogOp.java:109)
>         at org.apache.impala.service.JniCatalog.execAndSerializeSilentStartAndFinish(JniCatalog.java:259)
>         at org.apache.impala.service.JniCatalog.getPartialCatalogObject(JniCatalog.java:436)
> {noformat}
> We should add a flag to limit the number of partitions in a single
> getPartialCatalogObject request. When more partitions are required, fetch
> them in different batches.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
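
For readers following the issue, the batching idea in the description can be illustrated with a minimal Java sketch. It is not the code from the actual fix: `PartitionBatcher`, `splitIntoBatches` and `maxPartitionsPerRequest` are hypothetical names, and the partition ids are placeholder values. The assumption is only that a caller holds the list of partition ids it needs and issues one request per bounded batch, so no single serialized response has to carry every partition of a huge table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Impala code base.
public class PartitionBatcher {

  // Splits the requested ids into consecutive batches of at most
  // maxPartitionsPerRequest elements, preserving the original order.
  public static <T> List<List<T>> splitIntoBatches(
      List<T> partitionIds, int maxPartitionsPerRequest) {
    if (maxPartitionsPerRequest <= 0) {
      throw new IllegalArgumentException("maxPartitionsPerRequest must be positive");
    }
    List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < partitionIds.size(); start += maxPartitionsPerRequest) {
      int end = Math.min(start + maxPartitionsPerRequest, partitionIds.size());
      // Copy the view so each batch is independent of the source list.
      batches.add(new ArrayList<>(partitionIds.subList(start, end)));
    }
    return batches;
  }

  public static void main(String[] args) {
    // Placeholder partition ids standing in for a large table.
    List<Long> partitionIds = new ArrayList<>();
    for (long id = 0; id < 10; id++) {
      partitionIds.add(id);
    }
    // Each batch would become its own request, so each response stays bounded.
    for (List<Long> batch : splitIntoBatches(partitionIds, 4)) {
      System.out.println("would fetch partitions " + batch);
    }
  }
}
```

A cap on the partition count bounds the response size only indirectly, since the stack trace shows the buffer growing while file descriptors are written; a partition with very many files can still produce a large response.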