[ https://issues.apache.org/jira/browse/HIVE-20254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Qinghui Xu resolved HIVE-20254.
-------------------------------
       Resolution: Duplicate
    Fix Version/s: 2.3.0

> CheckNonCombinablePathCallable is buggy
> ---------------------------------------
>
>                 Key: HIVE-20254
>                 URL: https://issues.apache.org/jira/browse/HIVE-20254
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Qinghui Xu
>            Priority: Major
>             Fix For: 2.3.0
>
>
> CombineHiveInputFormat lets users avoid combining some of their inputs
> (by implementing AvoidSplitCombination).
> We spotted a problem with this when our query tried to read a lot of
> partitions (more than 100). When there are more than 100 input paths, the
> combinability check is run in parallel:
> * the input path array is divided into several chunks (each chunk holding
> no more than 100 paths)
> * each chunk is submitted to a CheckNonCombinablePathCallable
> * each CheckNonCombinablePathCallable returns a set of indices for the
> paths that must not be combined
> The problem is that CheckNonCombinablePathCallable returns a set of relative
> indices (the index inside the chunk) instead of absolute indices. The
> returned indices are therefore always smaller than 100, so all paths beyond
> the first 100 in the array are never taken into account when deciding which
> inputs to avoid combining.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
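The chunked check described in the report can be illustrated with a minimal Java sketch. This is not Hive's actual code: the class names `ChunkedIndexSketch` and `ChunkCheck`, the `absolute` flag, and the `Predicate<String>` standing in for an AvoidSplitCombination decision are all hypothetical, and the sketch always chunks the array rather than doing so only above 100 paths. It only shows how returning chunk-relative indices loses every path past the first chunk, while adding the chunk's start offset keeps them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical illustration of the relative-vs-absolute index problem.
final class ChunkedIndexSketch {
    static final int CHUNK_SIZE = 100; // chunk size taken from the report

    // Hypothetical per-chunk task; "absolute" toggles buggy vs. corrected output.
    static final class ChunkCheck implements Callable<Set<Integer>> {
        private final String[] paths;
        private final int start;
        private final int length;
        private final Predicate<String> avoidCombine;
        private final boolean absolute;

        ChunkCheck(String[] paths, int start, int length,
                   Predicate<String> avoidCombine, boolean absolute) {
            this.paths = paths;
            this.start = start;
            this.length = length;
            this.avoidCombine = avoidCombine;
            this.absolute = absolute;
        }

        @Override
        public Set<Integer> call() {
            Set<Integer> result = new TreeSet<>();
            for (int i = 0; i < length; i++) {
                if (avoidCombine.test(paths[start + i])) {
                    // Relative index i is only meaningful inside this chunk;
                    // start + i is the position in the full array.
                    result.add(absolute ? start + i : i);
                }
            }
            return result;
        }
    }

    // Splits the array into chunks, checks them in parallel, merges the results.
    static Set<Integer> nonCombinableIndices(String[] paths,
                                             Predicate<String> avoidCombine,
                                             boolean absolute) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Set<Integer>>> futures = new ArrayList<>();
            for (int start = 0; start < paths.length; start += CHUNK_SIZE) {
                int length = Math.min(CHUNK_SIZE, paths.length - start);
                futures.add(pool.submit(
                        new ChunkCheck(paths, start, length, avoidCombine, absolute)));
            }
            Set<Integer> merged = new TreeSet<>();
            for (Future<Set<Integer>> f : futures) {
                merged.addAll(f.get());
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] paths = new String[250];
        for (int i = 0; i < paths.length; i++) {
            paths[i] = "p" + i;
        }
        Predicate<String> avoid =
                p -> p.equals("p5") || p.equals("p105") || p.equals("p205");
        System.out.println(nonCombinableIndices(paths, avoid, false)); // prints [5]
        System.out.println(nonCombinableIndices(paths, avoid, true));  // prints [5, 105, 205]
    }
}
```

With 250 paths and three of them flagged (positions 5, 105 and 205), the relative-index variant reports only index 5, because all three land at offset 5 within their chunk. The absolute-index variant reports all three positions.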