[ https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16718295#comment-16718295 ]
Vineet Garg commented on HIVE-17935:
------------------------------------

[~asherman] Since this optimization is now turned on by default (HIVE-20703 & HIVE-20915), I don't believe we need this JIRA anymore. Is it OK to close it?

> Turn on hive.optimize.sort.dynamic.partition by default
> -------------------------------------------------------
>
>                 Key: HIVE-17935
>                 URL: https://issues.apache.org/jira/browse/HIVE-17935
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Andrew Sherman
>            Priority: Major
>         Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch,
> HIVE-17935.3.patch, HIVE-17935.4.patch, HIVE-17935.5.patch,
> HIVE-17935.6.patch, HIVE-17935.7.patch, HIVE-17935.8.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for
> Hive's dynamic partitioning feature. It was originally implemented in
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this
> optimization, rows are sorted on the dynamic partition columns and bucketing
> columns (in the case of bucketed tables) before being fed to the reducers.
> Because the rows arrive sorted on those columns, each reducer can keep only one
> record writer open at any time, thereby reducing the memory pressure on the
> reducers. There were some early problems with this optimization, and it was
> disabled by default in HiveConf in
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then,
> setting hive.optimize.sort.dynamic.partition=true has been used to solve
> problems where dynamic partitioning produces (1) too many small files on
> HDFS, which is bad for the cluster and can increase overhead for future Hive
> queries over those partitions, and (2) OOM issues in the map tasks because they
> are trying to write to 100 different files simultaneously.
> It now seems that the feature is probably mature enough that it can be
> enabled by default.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
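For readers unfamiliar with the mechanism in the quoted description, the following is a toy Python model of why sorting on the partition column lets a task keep a single record writer open. It is not Hive code: the row shape, the function names, and the `ds=` partition values are all hypothetical, and the 100 distinct partitions simply mirror the figure mentioned in the description.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical row shape: (partition_value, payload). Not a Hive type.
Row = Tuple[str, str]


def peak_writers_unsorted(rows: Iterable[Row]) -> int:
    """Without sorting, a writer is opened for every distinct partition
    seen and must stay open until the task finishes."""
    open_writers: Dict[str, List[str]] = {}
    peak = 0
    for part, payload in rows:
        open_writers.setdefault(part, []).append(payload)
        peak = max(peak, len(open_writers))
    return peak


def peak_writers_sorted(rows: Iterable[Row]) -> int:
    """With rows sorted by partition, the previous writer can be closed
    as soon as the partition value changes."""
    current: Optional[str] = None
    open_count = 0
    peak = 0
    for part, _payload in sorted(rows, key=lambda r: r[0]):
        if part != current:
            # Close the previous writer (if any) and open one for the new partition.
            open_count = 1
            current = part
        peak = max(peak, open_count)
    return peak


# 1000 rows spread round-robin over 100 partitions (illustrative data only).
rows = [(f"ds={i % 100}", f"row{i}") for i in range(1000)]
print(peak_writers_unsorted(rows))  # 100
print(peak_writers_sorted(rows))    # 1
```

The model ignores bucketing and the cost of the sort itself; it only shows how the peak number of simultaneously open writers drops from one per partition to one in total.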