[ https://issues.apache.org/jira/browse/HIVE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ashutosh Chauhan updated HIVE-5916:
-----------------------------------
    Attachment: HIVE-5916.4.patch

Patch is ready for review. [~navis]
The major change in this patch is that it changes the aggKey so that no aggregation is required on the client side in StatsTask. This also shortens the key so that it won't run into length limits. Now we just use dbName.TblName/p1=v1/ instead of long, elaborate filesystem paths.
This patch also lowers the number of unique counters required. Earlier it was number of partitions * number of tasks * number of stats; now it is just number of partitions * number of tasks.

This will conflict with HIVE-5936 in a major way. Since it has a bunch of additional improvements, do you think it makes sense to get this one in first?

> No need to aggregate statistics collected via counter mechanism
> ----------------------------------------------------------------
>
>                 Key: HIVE-5916
>                 URL: https://issues.apache.org/jira/browse/HIVE-5916
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>    Affects Versions: 0.13.0
>            Reporter: Ashutosh Chauhan
>            Assignee: Ashutosh Chauhan
>         Attachments: HIVE-5916.2.patch, HIVE-5916.3.patch, HIVE-5916.4.patch, HIVE-5916.patch
>
>
> This results in unnecessary computation and a waste of cluster resources, which is not required since aggregation of counters is done by the JobTracker anyway.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
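
For illustration only, the shortened key format and the reduced counter count described in the comment above could be sketched roughly as follows. This is not code from the patch: the class AggKeySketch, its methods, and the sample numbers are all hypothetical, and the handling of a table without partitions is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from HIVE-5916.4.patch.
public class AggKeySketch {

    // Builds a key of the form dbName.TblName/p1=v1/ from the table identity
    // and the partition spec, instead of using a filesystem path.
    // Assumption: with an empty partition spec the key is just dbName.TblName/
    static String buildAggKey(String dbName, String tblName, Map<String, String> partSpec) {
        StringBuilder sb = new StringBuilder();
        sb.append(dbName).append('.').append(tblName).append('/');
        for (Map.Entry<String, String> e : partSpec.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('/');
        }
        return sb.toString();
    }

    // Unique counters before the change: partitions * tasks * stats.
    static long countersBefore(long partitions, long tasks, long stats) {
        return partitions * tasks * stats;
    }

    // Unique counters after the change: partitions * tasks.
    static long countersAfter(long partitions, long tasks) {
        return partitions * tasks;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps partition columns in insertion order.
        Map<String, String> partSpec = new LinkedHashMap<>();
        partSpec.put("p1", "v1");

        System.out.println(buildAggKey("dbName", "TblName", partSpec)); // dbName.TblName/p1=v1/
        System.out.println(countersBefore(100, 50, 4));                 // 20000
        System.out.println(countersAfter(100, 50));                     // 5000
    }
}
```

With the made-up figures of 100 partitions, 50 tasks and 4 stats, the number of unique counters drops from 20000 to 5000, that is, by a factor equal to the number of stats.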