[ https://issues.apache.org/jira/browse/HIVE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809819#comment-13809819 ]
Sergey Shelukhin commented on HIVE-5657:
----------------------------------------

(responding to FB comment) Yeah, it would be nice to have a simpler fix to make the merge easier, esp. since many of these changes (e.g. setting the writable without duplicating code from BinaryComparable, and some other stuff) are already coming.
What I still don't understand is what it does for distinct. Where can it decide to exclude? It seems the same forward-all behavior can be achieved by removing support for any distinct from the optimizer.

> TopN produces incorrect results with count(distinct)
> ----------------------------------------------------
>
>                 Key: HIVE-5657
>                 URL: https://issues.apache.org/jira/browse/HIVE-5657
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Navis
>            Priority: Critical
>         Attachments: D13797.1.patch, example.patch, HIVE-5657.1.patch.txt
>
>
> Attached patch illustrates the problem.
> limit_pushdown test has various other cases of aggregations and distincts,
> incl. count-distinct, that work correctly (that said, src dataset is bad for
> testing these things because every count, for example, produces one record
> only), so something must be special about this.
> I am not very familiar with the distinct code and these nuances; if someone
> knows a quick fix feel free to take this, otherwise I will probably start
> looking next week.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
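
For readers following the "where can it decide to exclude?" question, here is a toy sketch of one way a map-side top-N filter could interact badly with count(distinct). It assumes, and this is not confirmed anywhere in this thread, that the key used for the shuffle includes the distinct column in addition to the group-by key. All names below (`rows`, `LIMIT`, `topn_on_full_key`, `topn_on_group_key`) are hypothetical and are not Hive APIs.

```python
import heapq

# Toy input: (group_key, distinct_value) pairs, as if produced for
#   SELECT key, count(DISTINCT value) FROM t GROUP BY key LIMIT 2
rows = [("a", 1), ("a", 2), ("a", 3), ("b", 1), ("c", 1)]
LIMIT = 2


def count_distinct(pairs):
    """Reduce side: count distinct values per group, keep the first LIMIT groups."""
    groups = {}
    for key, value in pairs:
        groups.setdefault(key, set()).add(value)
    return [(k, len(groups[k])) for k in sorted(groups)[:LIMIT]]


def topn_on_full_key(pairs, n):
    """Assumed faulty filter: keep the n smallest composite (key, value) pairs."""
    return heapq.nsmallest(n, set(pairs))


def topn_on_group_key(pairs, n):
    """Alternative filter: keep every pair whose group key is among the n smallest."""
    kept = set(heapq.nsmallest(n, {k for k, _ in pairs}))
    return [p for p in pairs if p[0] in kept]


expected = count_distinct(rows)
broken = count_distinct(topn_on_full_key(rows, LIMIT))
fixed = count_distinct(topn_on_group_key(rows, LIMIT))

print("no pushdown:       ", expected)
print("top-N on full key: ", broken)
print("top-N on group key:", fixed)
```

In this toy, filtering on the composite key discards ("a", 3) and all of group "b", so the count for "a" comes out too low and "b" disappears. Filtering on the group-by part alone excludes only whole groups and reproduces the unfiltered result. Disabling the filter whenever a distinct is present (the "forward-all" option raised above) would also be correct, at the cost of losing the pushdown for those queries.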