[ https://issues.apache.org/jira/browse/HIVE-14199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eugene Koifman updated HIVE-14199:
----------------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 2.2.0
           Status: Resolved  (was: Patch Available)

The test failures are not related, for example: https://builds.apache.org/view/H-L/view/Hive/job/PreCommit-HIVE-MASTER-Build/950/testReport/
Committed to master.
Thanks [~saketj] for the contribution.

> Enable Bucket Pruning for ACID tables
> -------------------------------------
>
>                 Key: HIVE-14199
>                 URL: https://issues.apache.org/jira/browse/HIVE-14199
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>            Reporter: Saket Saurabh
>            Assignee: Saket Saurabh
>             Fix For: 2.2.0
>
>         Attachments: HIVE-14199.01.patch, HIVE-14199.02.patch, HIVE-14199.03.patch
>
>
> Currently, ACID tables do not benefit from the bucket pruning feature
> introduced in HIVE-11525. The reason is that bucket pruning happens at the
> split generation level, and for ACID tables the delta files were
> traditionally never split. The parallelism for ACID was therefore restricted
> to the number of buckets: there would be as many splits as buckets, and each
> worker processing one split would inevitably read all the delta files for
> that bucket, even when the query required only one of the buckets to be read.
> However, HIVE-14035 now allows the delta files to be split as well. This
> means we have enough information at the split generation level to determine
> which buckets to process for the delta files. We can therefore prune
> unnecessary buckets for delta files, which should give a good performance
> gain for a large number of selective queries on ACID tables.
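
To make the idea in the issue description concrete, here is a minimal Java sketch of bucket pruning at split generation. It is not the HIVE-14199 patch and none of these names are Hive classes: FileSplit, bucketFor, and prune are hypothetical, the file paths are illustrative, and the bucketing rule (non-negative hash modulo the bucket count, with an int key hashing to its own value) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only; these are not Hive classes.
final class BucketPruningSketch {

    // Illustrative stand-in for one split over a base or delta file of one bucket.
    static final class FileSplit {
        final String path;
        final int bucketId;

        FileSplit(String path, int bucketId) {
            this.path = path;
            this.bucketId = bucketId;
        }
    }

    // Assumed bucketing rule: clear the sign bit, then take the remainder.
    static int bucketFor(int hashCode, int numBuckets) {
        return (hashCode & Integer.MAX_VALUE) % numBuckets;
    }

    // Keep only the splits whose bucket can contain rows matching the predicate.
    static List<FileSplit> prune(List<FileSplit> splits, Set<Integer> wantedBuckets) {
        List<FileSplit> kept = new ArrayList<>();
        for (FileSplit split : splits) {
            if (wantedBuckets.contains(split.bucketId)) {
                kept.add(split);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        int numBuckets = 4;
        List<FileSplit> splits = new ArrayList<>();
        for (int b = 0; b < numBuckets; b++) {
            // One base file and one delta file per bucket (illustrative paths).
            splits.add(new FileSplit("base_0000005/bucket_0000" + b, b));
            splits.add(new FileSplit("delta_0000006_0000006/bucket_0000" + b, b));
        }
        // A predicate such as "WHERE id = 7" on an int bucketing column;
        // the hash of an int key is assumed to be the value itself.
        Set<Integer> wanted = Collections.singleton(bucketFor(7, numBuckets));
        for (FileSplit split : prune(splits, wanted)) {
            System.out.println(split.path);
        }
    }
}
```

Because the delta files appear as their own splits here, the same filter that drops base files for unneeded buckets also drops their delta files, which is the gain the description refers to.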