[ https://issues.apache.org/jira/browse/HIVE-25905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483123#comment-17483123 ]
Alessandro Solimando commented on HIVE-25905: --------------------------------------------- I have only one doubt left before resolving the ticket as "Not a Bug". In [HiveParser.g#L866|https://github.com/apache/hive/blob/b92763e8e193076815f4ed6ba299c873d770aec1/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g#L866] we have the following code: {code:java} protected boolean nullsLast() { if(hiveConf == null){ return false; } return HiveConf.getBoolVar(hiveConf, HiveConf.ConfVars.HIVE_DEFAULT_NULLS_LAST); } {code} so when no hive config is available, we return "false", while I think we should return the property default, something like: {code:java} protected boolean nullsLast() { if(hiveConf == null){ return HiveConf.ConfVars.HIVE_DEFAULT_NULLS_LAST.defaultBoolVal; } return HiveConf.getBoolVar(hiveConf, HiveConf.ConfVars.HIVE_DEFAULT_NULLS_LAST); } {code} WDYT? > ORDER BY colName DESC does not honour 'hive.default.nulls.last' property in > absence of NULLS_LAST/FIRST > ------------------------------------------------------------------------------------------------------- > > Key: HIVE-25905 > URL: https://issues.apache.org/jira/browse/HIVE-25905 > Project: Hive > Issue Type: Bug > Components: Parser > Affects Versions: 4.0.0 > Reporter: Alessandro Solimando > Assignee: Alessandro Solimando > Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Consider the following table and data: > {noformat} > create table test1 > ( > a string, > b timestamp, > c timestamp > ); > INSERT INTO TABLE test1 VALUES > ('John Doe', '1990-05-10 00:00:00.0', '2022-01-10 00:00:00.0'), > ('John Doe', '1990-05-10 00:00:00.0', '2021-12-10 00:00:00.0'), > ('John Doe', '1990-05-10 00:00:00.0', '2021-11-10 00:00:00.0'), > ('John Doe', '1990-05-10 00:00:00.0', '2021-10-10 00:00:00.0'), > ('John Doe', '1990-05-10 00:00:00.0', '2021-09-10 00:00:00.0'), > ('John Doe', '1987-05-10 00:00:00.0', '2022-01-10 00:00:00.0'), > ('John Doe', '1987-05-10 00:00:00.0', '2021-12-10 00:00:00.0'), > ('John Doe', '1987-05-10 00:00:00.0', '2021-11-10 00:00:00.0'), > ('John Doe', '1987-05-10 00:00:00.0', '2021-10-10 00:00:00.0'), > ('John Doe', '1987-05-10 00:00:00.0', null);{noformat} > Consider also the following query: > {noformat} > SELECT a, b, c, row_number() OVER (PARTITION BY a, b ORDER BY b DESC, c DESC) > FROM test1; > {noformat} > The output is: > {noformat} > John Doe 10/05/1990 00:00 10/01/2022 00:00 1 > John Doe 10/05/1990 00:00 10/12/2021 00:00 2 > John Doe 10/05/1990 00:00 10/11/2021 00:00 3 > John Doe 10/05/1990 00:00 10/10/2021 00:00 4 > John Doe 10/05/1990 00:00 10/09/2021 00:00 5 > John Doe 10/05/1987 00:00 NULL 1 > John Doe 10/05/1987 00:00 10/01/2022 00:00 2 > John Doe 10/05/1987 00:00 10/12/2021 00:00 3 > John Doe 10/05/1987 00:00 10/11/2021 00:00 4 > John Doe 10/05/1987 00:00 10/10/2021 00:00 5{noformat} > While the expected output should be: > {noformat} > John Doe 10/05/1990 00:00 10/01/2022 00:00 1 > John Doe 10/05/1990 00:00 10/12/2021 00:00 2 > John Doe 10/05/1990 00:00 10/11/2021 00:00 3 > John Doe 10/05/1990 00:00 10/10/2021 00:00 4 > John Doe 10/05/1990 00:00 10/09/2021 00:00 5 > John Doe 10/05/1987 00:00 10/01/2022 00:00 1 > John Doe 10/05/1987 00:00 10/12/2021 00:00 2 > John Doe 10/05/1987 00:00 10/11/2021 00:00 3 > John Doe 10/05/1987 00:00 10/10/2021 00:00 4 > John Doe 10/05/1987 00:00 NULL 5{noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)