[ https://issues.apache.org/jira/browse/HIVE-2146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siying Dong updated HIVE-2146:
------------------------------

    Status: Patch Available  (was: Open)

> Block Sampling should adjust number of reducers accordingly to make it useful
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-2146
>                 URL: https://issues.apache.org/jira/browse/HIVE-2146
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Siying Dong
>            Assignee: Siying Dong
>         Attachments: HIVE-2146.1.patch, HIVE-2146.2.patch
>
>
> Currently the number of reducers is not adjusted for block sampling, so queries like:
>   select c from tab tablesample(1 percent) group by c;
> can generate a huge number of reducers even though the sampled input is small.
> We need to shrink the number of reducers to make block sampling more useful.
> Since the number of reducers is currently determined before the splits are obtained,
> the way to do this is probably not very clean, but we can make a good guess.
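
For illustration only, the "good guess" described in the issue could be sketched as scaling the estimated input size by the sampling percentage before dividing by the per-reducer byte budget. This is not taken from the attached patches; the class, method, and parameter names below are hypothetical. The two limits loosely correspond to the Hive settings hive.exec.reducers.bytes.per.reducer and hive.exec.reducers.max.

```java
public class SampledReducerEstimate {

    /**
     * Hypothetical helper: estimates a reducer count for a block-sampled input.
     *
     * @param totalInputBytes size of the unsampled input, in bytes
     * @param samplePercent   the percentage given in tablesample(N percent)
     * @param bytesPerReducer byte budget per reducer (assumed positive)
     * @param maxReducers     upper bound on the reducer count (assumed >= 1)
     */
    static int estimateReducers(long totalInputBytes, double samplePercent,
                                long bytesPerReducer, int maxReducers) {
        // Scale the input size down to the sampled fraction.
        long sampledBytes = (long) Math.ceil(totalInputBytes * samplePercent / 100.0);
        // Ceiling division: one reducer per bytesPerReducer of sampled input.
        long reducers = (sampledBytes + bytesPerReducer - 1) / bytesPerReducer;
        // Always keep at least one reducer, and never exceed the configured cap.
        reducers = Math.max(1L, reducers);
        return (int) Math.min((long) maxReducers, reducers);
    }

    public static void main(String[] args) {
        long totalInputBytes = 100L * 1_000_000_000L; // 100 GB of input (made-up figure)
        long bytesPerReducer = 1_000_000_000L;        // 1 GB per reducer (made-up figure)
        int maxReducers = 999;

        System.out.println("unsampled:  "
                + estimateReducers(totalInputBytes, 100.0, bytesPerReducer, maxReducers));
        System.out.println("1 percent:  "
                + estimateReducers(totalInputBytes, 1.0, bytesPerReducer, maxReducers));
    }
}
```

Because the estimate is made before the splits are known, the sampled size is only an approximation; the actual amount of data read depends on how block sampling rounds to whole blocks.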