> On Jan. 8, 2019, 10:04 a.m., Peter Vary wrote:
> > Thanks for the patch!
> > Two nits below.
> > Also a bit concerned about the size calculation - seems ok, but it would be 
> > good to have a few test case which validates the contentsummary 
> > calculations (when every path is cached/only few patch is cached/no path is 
> > cached), so we can be sure that further changes will not break the 
> > functionality.
> > 
> > What do you think?
> > 
> > Peter
> 
> David Mollitor wrote:
>     Thank you Peter for the review. This functionality is unit tested 
> already. Do you have suggestions for additional unit tests?
>     
>     
>     
> https://github.com/apache/hive/blob/ae008b79b5d52ed6a38875b73025a505725828eb/ql/src/test/org/apache/hadoop/hive/ql/exec/TestUtilities.java
> 
> Peter Vary wrote:
>     I missed the ones which are testing the full getInputSummary method, and 
> found only the ones (testGetInputSummaryPool, 
> testGetInputSummaryPoolAndFailure) using the getInputSummaryWithPool.
>     Seeing those tests I feel much better. Maybe it is an edge case, but it 
> might be good to add a test where we call the getInputSummary twice. First 
> with one set of path (p1, p2) and then again with another set of path (p1, 
> p2, p3, p4) so we can check that the merge of the cached results and the 
> newly fetched ones is working.

I created a new patch to include a unit test where a file path's summary status 
is stored in the cache.  I moved all of the unit tests for getInputSummary into 
their own file because it made it easier since its functionality (and therefore 
testing) are much different than all the other Utility methods.  This allowed 
me to implement JUnit setup() and teardown() methods.


- David


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69683/#review211758
-----------------------------------------------------------


On Feb. 6, 2019, 8:34 p.m., David Mollitor wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69683/
> -----------------------------------------------------------
> 
> (Updated Feb. 6, 2019, 8:34 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Improve performance of method getInputSummary by changing data structures and 
> allowing multiple threads to do calculations.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 8937b43 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestGetInputSummary.java 
> PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestUtilities.java 90eb45b 
> 
> 
> Diff: https://reviews.apache.org/r/69683/diff/3/
> 
> 
> Testing
> -------
> 
> Unit
> 
> 
> Thanks,
> 
> David Mollitor
> 
>

Reply via email to