[ https://issues.apache.org/jira/browse/HIVE-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008221#comment-13008221 ]
Joydeep Sen Sarma commented on HIVE-2051:
-----------------------------------------

My bad - I thought Carl == M IS :-).

Looking at the .3 patch, I am concerned about this code:

+          result.get();
+        } catch (InterruptedException e) {
+          throw new IOException(e);

In a different block of code further down from this one, we ignore InterruptedException. It seems safer to ignore it here as well (I am just not sure whether there is any reason for the calling thread to receive a valid thread interrupt and, if so, what the thread is supposed to do in that case).

Is it necessary for the executor to terminate if all the tasks given to it have already terminated? (Trivial point, but it might reduce the code a bit.)

> getInputSummary() to call FileSystem.getContentSummary() in parallel
> --------------------------------------------------------------------
>
>                 Key: HIVE-2051
>                 URL: https://issues.apache.org/jira/browse/HIVE-2051
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Siying Dong
>            Assignee: Siying Dong
>            Priority: Minor
>         Attachments: HIVE-2051.1.patch, HIVE-2051.2.patch, HIVE-2051.3.patch
>
>
> getInputSummary() now calls FileSystem.getContentSummary() one by one, which
> can be extremely slow when the number of input paths is huge. By calling
> those functions in parallel, we can cut latency in most cases.
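
For reference, the two questions in the comment (what to do with InterruptedException around result.get(), and whether the executor needs to be waited on) can be illustrated with a minimal sketch. This is not the code from any of the HIVE-2051 patches: the class name ParallelSummarySketch, the method sumInParallel and the numThreads parameter are hypothetical, and plain Callable<Long> tasks stand in for the FileSystem.getContentSummary() calls. The sketch takes the "propagate" route but restores the interrupt flag first. Once every Future has returned there is nothing left to wait for, so it only shuts the pool down to release its worker threads.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not taken from the HIVE-2051 patches.
class ParallelSummarySketch {

  /** Runs every task on a fixed-size pool and returns the sum of the results. */
  static long sumInParallel(List<Callable<Long>> tasks, int numThreads) throws IOException {
    ExecutorService executor = Executors.newFixedThreadPool(numThreads);
    try {
      // Submit everything first so the tasks actually run concurrently.
      List<Future<Long>> results = new ArrayList<>();
      for (Callable<Long> task : tasks) {
        results.add(executor.submit(task));
      }
      long total = 0;
      for (Future<Long> result : results) {
        try {
          total += result.get();
        } catch (InterruptedException e) {
          // Restore the interrupt flag so the caller can still observe it,
          // then report the failure instead of silently dropping it.
          Thread.currentThread().interrupt();
          throw new IOException(e);
        } catch (ExecutionException e) {
          // The task itself failed; surface its cause.
          throw new IOException(e.getCause());
        }
      }
      return total;
    } finally {
      // All futures are done (or we are bailing out), so there is no need to
      // await termination; shutting down just releases the worker threads.
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) throws IOException {
    List<Callable<Long>> tasks = new ArrayList<>();
    tasks.add(() -> 10L);
    tasks.add(() -> 20L);
    System.out.println(sumInParallel(tasks, 2));
  }
}
```

Ignoring the interrupt instead would mean replacing the throw with a comment and continuing the loop. The flag should arguably still be restored in that case, so that whoever interrupted the thread is not silently overridden.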