[ 
https://issues.apache.org/jira/browse/HIVE-15881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866824#comment-15866824
 ] 

Sergio Peña commented on HIVE-15881:
------------------------------------

Great, thanks for the suggestions. [~poeppt] Although I like your idea of 
using 0 to mean 'maximum number allowable', I think we should keep 0 meaning 
that only one thread does the work. The rest of the thread configuration 
variables use 0 to disable the use of threads. Let's keep it consistent.

I will submit a patch with the following:
- New variable name {{hive.exec.input.listing.max.threads}} for getInputSummary 
and getInputPaths
- Mark {{mapred.dfsclient.parallelism.max}} as deprecated, but continue using 
it.
- Default the value for {{hive.exec.input.listing.max.threads}} to 0 (no 
threads or just one thread). I think we should keep it disabled because 
on HDFS there's no benefit to using threads, and we could end up opening 
multiple RPC connections to the namenode.
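
A rough sketch of how the lookup described in the list above could behave. This is not the actual patch: {{java.util.Properties}} stands in for the Hive configuration object, the class and method names are hypothetical, and the precedence (new name first, then the deprecated name) is an assumption on my part.

```java
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper; names are illustrative and not taken from the Hive code base.
final class InputListingThreads {
    static final String NEW_KEY = "hive.exec.input.listing.max.threads";
    static final String DEPRECATED_KEY = "mapred.dfsclient.parallelism.max";

    // Assumed precedence: the new variable wins, the deprecated one is still
    // honored when the new one is unset, and the default is 0.
    static int resolveThreadCount(Properties conf) {
        String value = conf.getProperty(NEW_KEY);
        if (value == null) {
            value = conf.getProperty(DEPRECATED_KEY);
        }
        if (value == null) {
            return 0;
        }
        // Negative values are treated like 0 (an assumption of this sketch).
        return Math.max(0, Integer.parseInt(value.trim()));
    }

    // 0 means "no thread pool": the caller lists inputs on its own thread.
    static ExecutorService createPool(int threads) {
        return threads > 0 ? Executors.newFixedThreadPool(threads) : null;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(DEPRECATED_KEY, "4"); // old name still works
        int threads = resolveThreadCount(conf);
        ExecutorService pool = createPool(threads);
        System.out.println("listing threads: " + threads);
        if (pool != null) {
            pool.shutdown();
        }
    }
}
```

With neither variable set, {{resolveThreadCount}} returns 0 and no pool is created, which matches the proposed default of keeping threads disabled.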

> Use new thread count variable name instead of mapred.dfsclient.parallelism.max
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-15881
>                 URL: https://issues.apache.org/jira/browse/HIVE-15881
>             Project: Hive
>          Issue Type: Task
>          Components: Query Planning
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>            Priority: Minor
>
> The Utilities class has two methods, {{getInputSummary}} and 
> {{getInputPaths}}, that use the variable {{mapred.dfsclient.parallelism.max}} 
> to get the summary of a list of input locations in parallel. These methods 
> are Hive related, but the variable name does not look like it is specific 
> to Hive. Also, the above variable is not in HiveConf and is not used 
> anywhere else. I only found a reference in the Hadoop MR1 code.
> I'd like to propose the deprecation of {{mapred.dfsclient.parallelism.max}}, 
> and use a different variable name, such as 
> {{hive.get.input.listing.num.threads}}, that reflects the intention of the 
> variable. The removal of the old variable might happen in Hive 3.x.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
