[ https://issues.apache.org/jira/browse/HIVE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15015350#comment-15015350 ]
Lefty Leverenz commented on HIVE-11777:
---------------------------------------

Doc note: This adds configuration parameter *hive.orc.splits.directory.batch.ms* to HiveConf.java, so it needs to be documented in the ORC section of Configuration Properties.

* [Configuration Properties -- ORC File Format | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-ORCFileFormat]

> implement an option to have single ETL strategy for multiple directories
> ------------------------------------------------------------------------
>
>                 Key: HIVE-11777
>                 URL: https://issues.apache.org/jira/browse/HIVE-11777
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>              Labels: TODOC2.0
>             Fix For: 2.0.0
>
>         Attachments: HIVE-11777.01.patch, HIVE-11777.02.patch,
> HIVE-11777.03.patch, HIVE-11777.04.patch, HIVE-11777.05.patch,
> HIVE-11777.06.patch, HIVE-11777.patch
>
>
> In case of metastore footer PPD we don't want to call PPD call with all
> attendant SARG, MS and HBase overhead for each directory. If we wait for some
> time (10ms? some fraction of inputs?) we can do one call without losing
> overall perf.
> For now make it time based.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
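
For readers documenting the parameter, the time-based batching described in the issue can be illustrated with a small sketch. This is not the code from the attached patches. The function name `batch_directories`, the `(arrival_ms, directory)` input shape, the example paths, and the rule that a non-positive window disables batching are all assumptions made for illustration. The idea shown is only that directories arriving within a short window are grouped so that a single call can cover the whole group.

```python
from typing import Iterable, List, Optional, Tuple


def batch_directories(
    arrivals: Iterable[Tuple[int, str]], batch_ms: int
) -> List[List[str]]:
    """Group directories by arrival time (hypothetical helper, not Hive code).

    arrivals must be sorted by arrival time in milliseconds. A directory joins
    the current batch if it arrives within batch_ms of the first directory in
    that batch. Assumption: batch_ms <= 0 means no batching, so every
    directory forms its own batch.
    """
    batches: List[List[str]] = []
    window_start: Optional[int] = None
    for arrival_ms, directory in arrivals:
        starts_new_batch = (
            batch_ms <= 0
            or window_start is None
            or arrival_ms - window_start > batch_ms
        )
        if starts_new_batch:
            # Open a new batch and restart the time window at this arrival.
            batches.append([directory])
            window_start = arrival_ms
        else:
            # Still inside the window: reuse the batch opened earlier.
            batches[-1].append(directory)
    return batches


# Made-up partition directories with arrival times in milliseconds.
arrivals = [
    (0, "/warehouse/t/p=1"),
    (3, "/warehouse/t/p=2"),
    (9, "/warehouse/t/p=3"),
    (25, "/warehouse/t/p=4"),
]

# With a 10 ms window the first three directories share one batch.
print(batch_directories(arrivals, batch_ms=10))
# With a window of 0 each directory is handled on its own.
print(batch_directories(arrivals, batch_ms=0))
```

With a 10 ms window the four directories above need two calls instead of four, which is the trade-off the description proposes: a small wait in exchange for fewer calls.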