> On Feb. 21, 2018, 3:32 p.m., Sahil Takiar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 3149-3150 (patched)
> > <https://reviews.apache.org/r/65716/diff/1/?file=1962881#file1962881line3163>
> >
> > Curious what behavior you were seeing; wondering why cancelling or interrupting the `Future`s doesn't work.
> 
> Marta Kuczora wrote:
> The idea was to iterate over the Future tasks and call their get method to wait until they finish. If an exception occurred in one task, I iterated over the tasks again and canceled them with the mayInterruptIfRunning=false flag. According to the Javadoc, if this flag is false, in-progress tasks are allowed to complete. The idea behind this was to avoid interrupting a task when it had already started creating the folder, so the situation where the folder is created but not yet put into the addedPartitions map cannot happen. Then wait for the already running tasks to complete; I assumed we were good at that point, because all tasks had either finished or never started.
> 
> Imagine a flow something like this in code:
> 
>     boolean failureOccurred = false;
>     try {
>         for (Future<Partition> partFuture : partFutures) {
>             partFuture.get();
>         }
>     } catch (InterruptedException | ExecutionException e) {
>         failureOccurred = true;
>     }
> 
>     for (Future<Partition> task : partFutures) {
>         task.cancel(false);
>     }
> 
>     for (Future<Partition> partFuture : partFutures) {
>         if (!partFuture.isCancelled()) {
>             partFuture.get();
>         }
>     }
> 
> Then I created a test to see if it works as I expected. I tried to create 40 partitions and had a counter, visible to the tasks, which threw an exception when it reached 20.
> What I noticed is that almost every time I got a ConcurrentModificationException on this line:
> 
>     for (Map.Entry<PartValEqWrapperLite, Boolean> e : addedPartitions.entrySet()) {
> 
> So there must have been some tasks still writing the addedPartitions map at that point. By the way, changing the type of the map to ConcurrentMap proved this right: no exception occurred in that case, but there were leftover folders.
> 
> So I started to debug it, mainly by logging when the call method of a task was invoked, when a task got canceled, and what the result was when the get method was called. I found that there were tasks which had started (their call method had been invoked, so they had started to create the folder), but then a cancel on them succeeded. For these tasks the get method simply throws a CancellationException, as it sees the task is no longer running (and the isCancelled method returns true). But these tasks actually created the folder, and it could happen that they had not finished by the time of the clean up.
> 
> I checked the FutureTask code: the run method checks whether the state of the task is NEW and, if it is, calls the Callable's call method, but it doesn't change the state at that point. My theory is that if cancel is called on the same task at this moment, it will also see that the state is NEW, so it will change it to CANCELLED. So I believe a task can get into this weird state. Calling cancel with mayInterruptIfRunning=true gave the same result. So I didn't find a bulletproof solution with canceling the tasks, but it may be that I missed something and there is a good way to solve this.
> 
> If you have any idea, please share it with me, any idea is welcome. :)
> 
> Sahil Takiar wrote:
> Hmm, this is odd. Did you try checking the return value of the `cancel` method? The Javadocs say it returns `true` if the cancel was successful and `false` otherwise; is it returning `true`?
> 
> I can understand that `mayInterruptIfRunning=true` won't help, because it just causes the thread to throw an `InterruptedException`, and any code running in that thread can simply catch the exception and drop it.
> 
> Could it be that the second `partFuture.get()` threw an exception, in which case the `finally` block would be triggered before all threads complete?
> 
> I think the solution you have proposed works fine. The `Future#cancel` methodology should work too and is slightly cleaner, but if you can't get it to work, no worries; don't spend too much time on it.
> 
> Marta Kuczora wrote:
> Yep, I checked it, and for these "leftover" tasks the cancel returned true. There were tasks for which the cancel returned false, but those were fine, because the get simply waited for them.
> 
> No, I didn't get an exception during the second partFuture.get(). Also, when I logged what happened with the tasks, all of them appeared in this second get loop. The ones which were canceled successfully (even though they were running) just showed up as canceled.
> 
> I agree that it would be a cleaner solution, and I spent quite some time trying to make it work, but I could not make this use case work.
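The FutureTask behavior described in the thread above can be reproduced in a small standalone program. The sketch below is illustrative only (the class name, the latches, and the `folderCreated` flag standing in for folder creation are all invented, not taken from the patch): it shows `cancel(false)` returning `true` for a task whose `call()` method is already running, with `get()` then throwing `CancellationException` even though the task's side effect still completes.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicBoolean;

// Demonstrates that Future.cancel(false) can "succeed" on a task whose
// call() method is already running, because FutureTask's state stays NEW
// until call() returns. The side effect (a flag standing in for folder
// creation) still happens even though get() reports cancellation.
public class CancelRaceDemo {

    public static boolean[] run() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch proceed = new CountDownLatch(1);
        AtomicBoolean folderCreated = new AtomicBoolean(false);

        Future<String> future = pool.submit(() -> {
            started.countDown();          // call() has begun; state is still NEW
            proceed.await();              // hold here until cancel() has been called
            folderCreated.set(true);      // side effect happens despite the cancel
            return "created";
        });

        started.await();                              // task is definitely running
        boolean cancelled = future.cancel(false);     // CAS NEW -> CANCELLED succeeds
        proceed.countDown();                          // let the task finish its work

        boolean gotCancellationException = false;
        try {
            future.get();
        } catch (CancellationException e) {
            gotCancellationException = true;          // get() hides the real outcome
        }

        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);   // wait for call() to complete
        return new boolean[] { cancelled, gotCancellationException, folderCreated.get() };
    }

    public static void main(String[] args) throws Exception {
        boolean[] r = run();
        System.out.println("cancel returned:       " + r[0]);
        System.out.println("CancellationException: " + r[1]);
        System.out.println("side effect completed: " + r[2]);
    }
}
```

The same holds for `cancel(true)`: the interrupt it delivers can be caught and dropped inside `call()`, so, as noted above, neither flavor of cancel guarantees that a task's side effect is prevented once `call()` has started.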
Ok, sounds good. The current implementation looks fine; let's just stick with this. Overall, LGTM pending the refactoring.

- Sahil


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65716/#review197829
-----------------------------------------------------------


On Feb. 20, 2018, 5:03 p.m., Marta Kuczora wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65716/
> -----------------------------------------------------------
> 
> (Updated Feb. 20, 2018, 5:03 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18696
>     https://issues.apache.org/jira/browse/HIVE-18696
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The idea behind the patch is:
> 
> 1) Separate the partition validation from starting the tasks which create the partition folders.
> Instead of doing the checks on the partitions and submitting the tasks in one loop, the validation is moved into a separate loop. First iterate through the partitions, validate the table/db names, and check for duplicates. Then, if all partitions are correct, submit the tasks to create the partition folders in a second loop. This way, if one of the partitions is incorrect, the exception is thrown in the first loop, before any tasks are submitted, so no partition folder is created if the list contains an invalid partition.
> 
> 2) Handle the exceptions which occur during the execution of the tasks differently.
> Previously, if an exception occurred in one task, the remaining tasks were canceled and the newly created partition folders were cleaned up in the finally part.
> The problem was that some tasks might still be creating their folder while the others were being cleaned up, so there could have been leftover folders. After some testing, it turned out that this situation cannot be avoided completely by canceling the tasks.
> The idea of this patch is to set a flag if an exception is thrown in one of the tasks. This flag is visible to the tasks, and if its value is true, the partition folders won't be created. Then iterate through the remaining tasks and wait for them to finish. Tasks which started before the flag was set will finish creating their partition folders. Tasks which start after the flag is set won't create the partition folders, to avoid unnecessary work. This way it is certain that all tasks have finished when entering the finally part, where the partition folders are cleaned up.
> 
> 
> Diffs
> -----
> 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 47de215 
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java f483ca8 
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java 919ba78 
> 
> 
> Diff: https://reviews.apache.org/r/65716/diff/1/
> 
> 
> Testing
> -------
> 
> Added some new test cases to the TestAddPartitions and TestAddPartitionsFromPartSpec tests.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>
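For illustration, the flag-based approach described in the review request can be sketched roughly as below. This is not the actual HiveMetaStore code; the pool size, the `createdFolders` list standing in for partition folders, and the simulated failure are placeholder details invented for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified sketch of the flag-based approach: on failure a shared flag is
// set, tasks that start afterwards skip their work, and the main thread
// waits for EVERY task (no cancel) before cleaning up in finally.
public class AddPartitionsSketch {

    public static int run(int taskCount, int failAt) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicBoolean failureOccurred = new AtomicBoolean(false);
        List<Future<Boolean>> futures = new ArrayList<>();
        List<Integer> createdFolders = new CopyOnWriteArrayList<>();

        for (int i = 0; i < taskCount; i++) {
            final int id = i;
            futures.add(pool.submit(() -> {
                if (failureOccurred.get()) {
                    return false;              // flag already set: skip the work
                }
                createdFolders.add(id);        // stands in for creating the folder
                if (id == failAt) {
                    throw new RuntimeException("simulated failure in task " + id);
                }
                return true;
            }));
        }

        try {
            for (Future<Boolean> f : futures) {
                try {
                    f.get();                   // wait for every task to finish
                } catch (ExecutionException e) {
                    failureOccurred.set(true); // later tasks will see the flag
                }
            }
        } finally {
            // Safe to clean up: all tasks have completed by this point.
            if (failureOccurred.get()) {
                createdFolders.clear();        // stands in for deleting folders
            }
            pool.shutdown();
        }
        return createdFolders.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("leftover folders: " + run(40, 20));
    }
}
```

Because the cleanup runs only after every `get()` has returned, no task can still be mid-creation when the folders are removed, which is the property the cancel-based variant could not guarantee.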