[ 
https://issues.apache.org/jira/browse/HIVE-15879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882640#comment-15882640
 ] 

Rajesh Balamohan commented on HIVE-15879:
-----------------------------------------

>>>
it returns only an approximate number of threads, and it cannot be guaranteed 
that it always returns the exact number of active threads. This still exposes 
the method implementation to the msck hang bug in rare corner cases.
>>>
My comment on ThreadPoolExecutor.getActiveCount was about the msck-hang bug. 
That bug should not surface on the master branch. 

The proposed change should not affect the single-partition-column use case 
(e.g., a single partition column with 10K partitions would not be affected). 
However, if there are multiple partition columns, an iterative approach would 
be better than the recursive approach. In the recursive model, threads get 
blocked at the higher levels and the listing proceeds in a single-threaded 
fashion. 
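
A rough sketch of what the iterative approach could look like is below. It is 
only an illustration, not the Hive implementation: it uses java.nio.file in 
place of the Hadoop FileSystem API, and the names IterativePartitionWalker, 
listSubDirs and findLeafPartitionDirs are hypothetical. Each partition level is 
listed in parallel, and only the calling thread waits for results, so a worker 
never blocks on another task in the same pool.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical helper; java.nio.file stands in for the Hadoop FileSystem API.
final class IterativePartitionWalker {

  /** Lists the immediate subdirectories of the given directory. */
  static List<Path> listSubDirs(Path dir) throws IOException {
    List<Path> result = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
      for (Path child : stream) {
        if (Files.isDirectory(child)) {
          result.add(child);
        }
      }
    }
    return result;
  }

  /**
   * Walks partitionDepth levels below tableRoot, one level at a time.
   * Must be called from a thread that does not belong to the pool.
   */
  static List<Path> findLeafPartitionDirs(
      Path tableRoot, int partitionDepth, ExecutorService pool)
      throws InterruptedException, ExecutionException {
    List<Path> current = new ArrayList<>();
    current.add(tableRoot);
    for (int level = 0; level < partitionDepth; level++) {
      // One independent listing task per directory at this level.
      List<Callable<List<Path>>> tasks = new ArrayList<>();
      for (Path dir : current) {
        tasks.add(() -> listSubDirs(dir));
      }
      // invokeAll returns once every task at this level has completed.
      List<Path> next = new ArrayList<>();
      for (Future<List<Path>> future : pool.invokeAll(tasks)) {
        next.addAll(future.get());
      }
      current = next;
    }
    return current;
  }
}
```

With two partition columns, findLeafPartitionDirs(tableRoot, 2, pool) would 
return every second-level directory, and all pool threads stay available at 
each level regardless of how many directories the level above contained.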

> Fix HiveMetaStoreChecker.checkPartitionDirs method
> --------------------------------------------------
>
>                 Key: HIVE-15879
>                 URL: https://issues.apache.org/jira/browse/HIVE-15879
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>         Attachments: HIVE-15879.01.patch, HIVE-15879.02.patch
>
>
> HIVE-15803 fixes the msck hang issue in the 
> HiveMetaStoreChecker.checkPartitionDirs method by adding a check to see if 
> the thread pool has any spare threads. If not, it uses single-threaded 
> listing of the files.
> {noformat}
>     if (pool != null) {
>       synchronized (pool) {
>         // In case of recursive calls, it is possible to deadlock with TP. Check TP usage here.
>         if (pool.getActiveCount() < pool.getMaximumPoolSize()) {
>           useThreadPool = true;
>         }
>         if (!useThreadPool) {
>           if (LOG.isDebugEnabled()) {
>             LOG.debug("Not using threadPool as active count:" + pool.getActiveCount()
>                 + ", max:" + pool.getMaximumPoolSize());
>           }
>         }
>       }
>     }
> {noformat}
> According to the Javadoc of getActiveCount():
> bq. Returns the approximate number of threads that are actively executing tasks.
> So the method returns only an approximate number of threads, and it cannot be 
> guaranteed to return the exact number of active threads. This still exposes 
> the method implementation to the msck hang bug in rare corner cases.
> We could either:
> 1. Use an atomic counter to track exactly how many threads are actively running
> 2. Revisit the method itself to make it much simpler, e.g., look into 
> the possibility of changing the recursive implementation to an iterative 
> implementation where worker threads pick tasks from a queue until the queue 
> is empty.
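
For reference, option 1 from the description could be sketched roughly as 
follows. This is an assumption about how such a counter might be wired up, not 
code from any of the attached patches; SlotTrackingExecutor and its methods are 
hypothetical names. A slot is reserved with compareAndSet before a task is 
submitted, so the count is exact, and a caller that cannot reserve a slot runs 
the task inline instead of waiting on the pool.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper that tracks busy workers exactly with an atomic counter.
final class SlotTrackingExecutor {
  private final ExecutorService pool;
  private final int maxThreads;
  private final AtomicInteger busy = new AtomicInteger();

  SlotTrackingExecutor(ExecutorService pool, int maxThreads) {
    this.pool = pool;
    this.maxThreads = maxThreads;
  }

  /** Atomically reserves a worker slot; returns false if all slots are taken. */
  boolean tryReserve() {
    while (true) {
      int current = busy.get();
      if (current >= maxThreads) {
        return false;
      }
      if (busy.compareAndSet(current, current + 1)) {
        return true;
      }
    }
  }

  /** Gives a previously reserved slot back. */
  void release() {
    busy.decrementAndGet();
  }

  int busyCount() {
    return busy.get();
  }

  /** Submits to the pool if a slot is free, otherwise runs in the caller. */
  <T> Future<T> submitOrRunInline(Callable<T> task) throws Exception {
    if (!tryReserve()) {
      // No spare worker: do the work on the calling thread.
      return CompletableFuture.completedFuture(task.call());
    }
    try {
      return pool.submit(() -> {
        try {
          return task.call();
        } finally {
          release(); // the slot is freed when the task finishes
        }
      });
    } catch (RejectedExecutionException e) {
      release(); // the task never started, so return the slot
      throw e;
    }
  }
}
```

The sketch assumes maxThreads matches the actual size of the pool and that 
every task goes through submitOrRunInline; tasks submitted to the pool directly 
would not be counted.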



