[ https://issues.apache.org/jira/browse/HIVE-6262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885080#comment-13885080 ]
Gunther Hagleitner commented on HIVE-6262:
------------------------------------------

Committed to trunk. Thanks for the review, Vikram!

> Remove unnecessary copies of schema + table desc from serialized plan
> ---------------------------------------------------------------------
>
>                 Key: HIVE-6262
>                 URL: https://issues.apache.org/jira/browse/HIVE-6262
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Gunther Hagleitner
>            Assignee: Gunther Hagleitner
>             Fix For: 0.13.0
>
>         Attachments: HIVE-6262.1.patch
>
>
> Currently for a partitioned table the following are true:
> - for each partitiondesc we send a copy of the corresponding tabledesc
> - for each partitiondesc we send two copies of the schema (in different
> formats).
> Obviously we need to send different schemas if they are required by schema
> evolution, but in our case we'll always end up with multiple copies.
> The effect can be dramatic. Removing those copies on partitioned tables can
> easily reduce the plan size by 8-10x. Plans themselves can be 10s to 100s of
> MB (even with Kryo). The size difference also plays out in every task we run
> on the cluster.
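
For readers who want to see why per-partition copies inflate a serialized plan, here is a minimal sketch. It is not the code from HIVE-6262.1.patch and does not use Hive's classes: TableDesc, PartitionDesc, buildPlan and serializedSize below are hypothetical stand-ins, and java.io serialization is used in place of Kryo purely to keep the example self-contained. It only models the first bullet (one tabledesc copy per partitiondesc), not the duplicated schema formats.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;

public class PlanSizeSketch {

    /** Hypothetical stand-in for a table descriptor that carries a schema string. */
    static class TableDesc implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final String schema;

        TableDesc(String name, String schema) {
            this.name = name;
            this.schema = schema;
        }

        /** Deep copy: distinct String objects, so the serializer cannot share them. */
        TableDesc copy() {
            return new TableDesc(new String(name), new String(schema));
        }
    }

    /** Hypothetical stand-in for a partition descriptor that points at its table. */
    static class PartitionDesc implements Serializable {
        private static final long serialVersionUID = 1L;
        final String partitionSpec;
        final TableDesc table;

        PartitionDesc(String partitionSpec, TableDesc table) {
            this.partitionSpec = partitionSpec;
            this.table = table;
        }
    }

    /** Builds a toy plan; each partition gets either the shared table desc or its own copy. */
    static ArrayList<PartitionDesc> buildPlan(TableDesc table, int partitions, boolean shareTableDesc) {
        ArrayList<PartitionDesc> plan = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            TableDesc t = shareTableDesc ? table : table.copy();
            plan.add(new PartitionDesc("ds=" + i, t));
        }
        return plan;
    }

    /** Serializes the object graph in one stream and returns the byte count. */
    static int serializedSize(Object plan) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(plan);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        // Made-up schema with 50 columns, only to give the table desc some weight.
        StringBuilder schema = new StringBuilder();
        for (int c = 0; c < 50; c++) {
            schema.append("col").append(c).append(":string,");
        }
        TableDesc table = new TableDesc("sales", schema.toString());

        int copied = serializedSize(buildPlan(table, 1000, false));
        int shared = serializedSize(buildPlan(table, 1000, true));
        System.out.println("one table desc copy per partition: " + copied + " bytes");
        System.out.println("single shared table desc:          " + shared + " bytes");
    }
}
```

With a shared instance the stream writes the table desc once and refers back to it for every later partition, so the plan grows with the per-partition data only. The ratio this toy prints depends entirely on the made-up schema and partition count and should not be read as a measurement of Hive.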