[ 
https://issues.apache.org/jira/browse/HIVE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15015350#comment-15015350
 ] 

Lefty Leverenz commented on HIVE-11777:
---------------------------------------

Doc note:  This adds configuration parameter 
*hive.orc.splits.directory.batch.ms* to HiveConf.java, so it needs to be 
documented in the ORC section of Configuration Properties.

* [Configuration Properties -- ORC File Format | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-ORCFileFormat]

> implement an option to have single ETL strategy for multiple directories
> ------------------------------------------------------------------------
>
>                 Key: HIVE-11777
>                 URL: https://issues.apache.org/jira/browse/HIVE-11777
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>              Labels: TODOC2.0
>             Fix For: 2.0.0
>
>         Attachments: HIVE-11777.01.patch, HIVE-11777.02.patch, 
> HIVE-11777.03.patch, HIVE-11777.04.patch, HIVE-11777.05.patch, 
> HIVE-11777.06.patch, HIVE-11777.patch
>
>
> In the case of metastore footer PPD, we don't want to make a PPD call, with 
> all the attendant SARG, MS and HBase overhead, for each directory. If we wait 
> for some time (10ms? some fraction of inputs?) we can make one call without 
> losing overall perf. 
> For now, make it time-based.
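
For readers of the doc note, the time-based batching described in the issue can be pictured roughly as below. This is only a minimal sketch, not the code from the attached patches: the class DirectoryBatcher and its methods are hypothetical names, and the batch window stands in for whatever value hive.orc.splits.directory.batch.ms is set to. A window of 0 yields one directory per batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration only; this is not Hive's actual implementation. */
public class DirectoryBatcher {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final long batchMs; // assumed to mirror hive.orc.splits.directory.batch.ms

    public DirectoryBatcher(long batchMs) {
        this.batchMs = batchMs;
    }

    /** Queues a directory for the next batch. */
    public void submit(String dir) {
        pending.add(dir);
    }

    /**
     * Blocks until one directory is available, then keeps collecting
     * directories until the batch window has elapsed. The caller can then
     * make a single call for the whole batch instead of one per directory.
     */
    public List<String> nextBatch() {
        List<String> batch = new ArrayList<>();
        try {
            batch.add(pending.take());
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(batchMs);
            while (true) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    break; // window closed (immediately so when batchMs is 0)
                }
                String next = pending.poll(remaining, TimeUnit.NANOSECONDS);
                if (next == null) {
                    break; // nothing else arrived before the deadline
                }
                batch.add(next);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return batch;
    }
}
```

The trade-off is the one the description raises: a longer window means fewer calls but adds up to that many milliseconds of latency before split generation can proceed.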



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
