[ https://issues.apache.org/jira/browse/HIVE-22964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17052036#comment-17052036 ]
Peter Vary commented on HIVE-22964:
-----------------------------------

Hi [~aditya-shah],
 * I am not a big fan of renaming configuration variables. They can wreak havoc when upgrading a cluster.
 * Sorry, I missed the error handling part. My bad :(. Still, this highlights why it is good practice to put the try/catch around only the part of the code where the exception can actually be thrown:
{code:java}
for (Future<MMPathInfo> pathFuture : pathFutures) {
  finalPaths.addAll(pathFuture.get().getFinalPaths());
  pathsWithFileOriginals.addAll(pathFuture.get().getPathsWithFileOriginals());
}
{code}
 * Why are we using ugi.doAs? I have checked the other file-related pool implementations and did not find any place where it is used.
 * It is usually a nightmare to keep Guava versions in sync between projects, so I prefer to use Guava only when it is really useful. According to the docs, Lists.newArrayList() should be treated as deprecated ([https://guava.dev/releases/19.0/api/docs/com/google/common/collect/Lists.html#newArrayList(])). Is there a specific reason to use it here instead of the standard Java new ArrayList()?
 * Maybe if we used lambdas for submitting the tasks, we could get rid of the ProcessForWriteIdsForMmReadCallable / MMPathInfo objects. What do you think?
 * Also, once we have the output from the Yetus run, please check the checkstyle/findbugs results for any newly introduced warnings.

Thanks,
Peter

> MM table split computation is very slow
> ---------------------------------------
>
>                 Key: HIVE-22964
>                 URL: https://issues.apache.org/jira/browse/HIVE-22964
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Aditya Shah
>            Assignee: Aditya Shah
>            Priority: Major
>         Attachments: HIVE-22964.patch
>
>
> Since for MM table we process the paths prior to inputFormat.getSplits() we
> end up doing listing on the whole table at once. This could be optimized.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
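
Two of the suggestions in the comment above (submitting the tasks as lambdas, and keeping the try/catch around only the call that can throw) might look roughly like the sketch below. It is not taken from the HIVE-22964 patch. The class NarrowTryCatchSketch, the methods listDirectory and listAll, and the plain string paths are hypothetical stand-ins for the real listing logic, and wrapping failures in IOException is only an assumption about how errors could be surfaced.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class NarrowTryCatchSketch {

  // Hypothetical stand-in for the per-directory listing work; not Hive code.
  static List<String> listDirectory(String dir) throws IOException {
    if (dir.isEmpty()) {
      throw new IOException("empty path");
    }
    return Arrays.asList(dir + "/file0", dir + "/file1");
  }

  static List<String> listAll(List<String> dirs, int threads) throws IOException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // Each task is submitted as a lambda, so no dedicated Callable class
      // or result-holder object is needed.
      List<Future<List<String>>> futures = new ArrayList<>();
      for (String dir : dirs) {
        futures.add(pool.submit(() -> listDirectory(dir)));
      }

      List<String> finalPaths = new ArrayList<>();
      for (Future<List<String>> future : futures) {
        List<String> result;
        // The try block covers only Future.get(), the one call that can throw here.
        try {
          result = future.get();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new IOException("Interrupted while listing", e);
        } catch (ExecutionException e) {
          throw new IOException("Listing failed", e.getCause());
        }
        finalPaths.addAll(result);
      }
      return finalPaths;
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(listAll(Arrays.asList("a", "b"), 2));
  }
}
```

Calling get() once per future and reusing the result also avoids the repeated pathFuture.get() calls in the quoted loop, although whether that matters for the actual patch is a judgment call for the author.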