[ https://issues.apache.org/jira/browse/HIVE-2146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029816#comment-13029816 ]
Hudson commented on HIVE-2146:
------------------------------

Integrated in Hive-trunk-h0.20 #712 (See [https://builds.apache.org/hudson/job/Hive-trunk-h0.20/712/])

> Block Sampling should adjust number of reducers accordingly to make it useful
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-2146
>                 URL: https://issues.apache.org/jira/browse/HIVE-2146
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Siying Dong
>            Assignee: Siying Dong
>             Fix For: 0.8.0
>
>         Attachments: HIVE-2146.1.patch, HIVE-2146.2.patch
>
>
> Currently, block sampling does not adjust the number of reducers, so queries
> like:
>   select c from tab tablesample(1 percent) group by c;
> can generate a huge number of reducers even though the sampled input is
> small.
> We need to shrink the number of reducers to make block sampling more useful.
> Because the number of reducers is currently determined before the splits are
> computed, the way to do it is probably not clean, but we can make a good guess.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
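
The "good guess" described in the issue can be illustrated with a minimal sketch. This is not the code from the attached patches: the function name and parameters are hypothetical, and it simply assumes the reducer count is estimated from input size, in the spirit of Hive's `hive.exec.reducers.bytes.per.reducer` and `hive.exec.reducers.max` settings, with the input size scaled by the sampling percentage before the estimate is made.

```python
import math


def estimate_reducers(total_input_bytes, sample_percent,
                      bytes_per_reducer, max_reducers):
    """Guess a reducer count for a block-sampled query (hypothetical helper).

    Assumption: the unsampled input size is known up front, and the sampled
    size is approximated as total_input_bytes * sample_percent / 100.
    """
    if not 0 < sample_percent <= 100:
        raise ValueError("sample_percent must be in (0, 100]")

    # Scale the input size down to what the sample is expected to read.
    sampled_bytes = total_input_bytes * sample_percent / 100.0

    # One reducer per bytes_per_reducer of (estimated) input, rounded up.
    estimate = math.ceil(sampled_bytes / bytes_per_reducer)

    # Always use at least one reducer and never exceed the configured cap.
    return max(1, min(max_reducers, estimate))


if __name__ == "__main__":
    # 500 GB table, 1 GB per reducer, cap of 999 reducers.
    print(estimate_reducers(500 * 10**9, 100, 10**9, 999))  # no sampling
    print(estimate_reducers(500 * 10**9, 1, 10**9, 999))    # tablesample(1 percent)
```

Under these assumptions, a 1 percent sample of a 500 GB table is sized for about 5 GB of input rather than the full 500 GB, which is the reduction the issue asks for. Because the real split sizes are not known at this point, the result is only an estimate.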