[ 
https://issues.apache.org/jira/browse/HIVE-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182826#comment-13182826
 ] 

Phabricator commented on HIVE-2279:
-----------------------------------

zhenxiao has commented on the revision "HIVE-2279 [jira] Implement sort(array) UDF".

  sort() would be a better name than sort_array(), but the parser/semantic
analyzer currently seems to have a problem accepting a reserved keyword as a
UDF name.

  I tried the following changes in Hive.g:

  [~/Code/hive]git diff ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
  diff --git a/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 
b/ql/src/java/org/apache/hadoop/h
  index 888bf47..ec256de 100644
  --- a/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
  +++ b/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
  @@ -1816,7 +1816,7 @@ functionName
   @init { msgs.push("function name"); }
   @after { msgs.pop(); }
       : // Keyword IF is also a function name
  -    Identifier | KW_IF | KW_ARRAY | KW_MAP | KW_STRUCT | KW_UNIONTYPE
  +    Identifier | KW_IF | KW_ARRAY | KW_MAP | KW_STRUCT | KW_UNIONTYPE | KW_SORT
       ;

   castExpression
  @@ -2091,6 +2091,7 @@ sysFuncNames
       | KW_MAP
       | KW_STRUCT
       | KW_UNIONTYPE
  +    | KW_SORT
       | EQUAL
       | NOTEQUAL
       | LESSTHANOREQUALTO

  However, the test case always fails during semantic analysis with an argument length mismatch:

  -- Evaluate function against STRING valued keys
  EXPLAIN
  SELECT sort(array("b", "d", "c", "a")) FROM src LIMIT 1
  2012-01-09 11:31:55,134 INFO  parse.ParseDriver (ParseDriver.java:parse(426)) - Parsing command:

  -- Evaluate function against STRING valued keys
  EXPLAIN
  SELECT sort(array("b", "d", "c", "a")) FROM src LIMIT 1
  2012-01-09 11:31:55,146 INFO  parse.ParseDriver (ParseDriver.java:parse(443)) - Parse Completed
  2012-01-09 11:31:55,147 INFO  parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(7445)) - Starting Semantic Analysis
  2012-01-09 11:31:55,148 INFO  parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(7475)) - Completed phase 1 of Semantic Analysis
  2012-01-09 11:31:55,148 INFO  parse.SemanticAnalyzer (SemanticAnalyzer.java:getMetaData(942)) - Get metadata for source tables
  2012-01-09 11:31:55,149 INFO  metastore.HiveMetaStore (HiveMetaStore.java:logInfo(528)) - 0: get_table : db=default tbl=src
  2012-01-09 11:31:55,200 INFO  hive.log (MetaStoreUtils.java:getDDLFromFieldSchema(457)) - DDL: struct src { string key, string value}
  2012-01-09 11:31:55,200 DEBUG lazy.LazySimpleSerDe (LazySimpleSerDe.java:initialize(195)) - org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe initialized with: columnNames=[key, value] columnTypes=[string, string] separator=[[B@3bb20e65] nullstring=\N lastColumnTakesRest=false
  2012-01-09 11:31:55,200 INFO  parse.SemanticAnalyzer (SemanticAnalyzer.java:getMetaData(1021)) - Get metadata for subqueries
  2012-01-09 11:31:55,201 INFO  parse.SemanticAnalyzer (SemanticAnalyzer.java:getMetaData(1035)) - Get metadata for destination tables
  2012-01-09 11:31:55,201 INFO  parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(7478)) - Completed getting MetaData in Semantic Analysis
  2012-01-09 11:31:55,203 INFO  hive.log (MetaStoreUtils.java:getDDLFromFieldSchema(457)) - DDL: struct src { string key, string value}
  2012-01-09 11:31:55,203 DEBUG lazy.LazySimpleSerDe (LazySimpleSerDe.java:initialize(195)) - org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe initialized with: columnNames=[key, value] columnTypes=[string, string] separator=[[B@12e84396] nullstring=\N lastColumnTakesRest=false
  2012-01-09 11:31:55,222 DEBUG parse.SemanticAnalyzer (SemanticAnalyzer.java:genTablePlan(6598)) - Created Table Plan for src org.apache.hadoop.hive.ql.exec.TableScanOperator@5e9ea579
  2012-01-09 11:31:55,223 DEBUG parse.SemanticAnalyzer (SemanticAnalyzer.java:genSelectPlan(2117)) - tree: (TOK_SELECT (TOK_SELEXPR (TOK_FUNCTION sort (TOK_FUNCTION array "b" "d" "c" "a"))))
  2012-01-09 11:31:55,225 DEBUG parse.SemanticAnalyzer (SemanticAnalyzer.java:genSelectPlan(2222)) - genSelectPlan: input = src{(key,key: string)(value,value: string)(block__offset__inside__file,BLOCK__OFFSET__INSIDE__FILE: bigint)(input__file__name,INPUT__FILE__NAME: string)}
  2012-01-09 11:31:55,234 ERROR ql.Driver (SessionState.java:printError(380)) - FAILED: Error in semantic analysis: Line 5:7 Arguments length mismatch 'sort': The function SORT(array(obj1, obj2,...)) needs one argument.
  org.apache.hadoop.hive.ql.parse.SemanticException: Line 5:7 Arguments length mismatch 'sort': The function SORT(array(obj1, obj2,...)) needs one argument.
      at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:810)
      at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
      at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
      at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:125)
      at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
      at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:161)
      at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7708)
      at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2301

  The same thing happened when I tried this with format().
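
  For reference, the result the test case above expects from
sort(array("b", "d", "c", "a")) can be sketched in plain Java. This is only a
standalone illustration of the intended behaviour (ascending natural order),
not the UDF code from the patch; the class and method names are hypothetical
and nothing here touches the Hive API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical standalone sketch; not the UDF implementation from the patch.
class SortArraySketch {

    // Returns a new list with the elements in ascending natural order,
    // leaving the input list untouched.
    static List<String> sortArray(List<String> input) {
        List<String> sorted = new ArrayList<>(input);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        // Same values as the test query: sort(array("b", "d", "c", "a"))
        System.out.println(sortArray(Arrays.asList("b", "d", "c", "a")));
    }
}
```

  Running main prints [a, b, c, d], which is the ordering the STRING test case
should produce once the query gets past semantic analysis.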

REVISION DETAIL
  https://reviews.facebook.net/D1125

                
> Implement sort(array) UDF
> -------------------------
>
>                 Key: HIVE-2279
>                 URL: https://issues.apache.org/jira/browse/HIVE-2279
>             Project: Hive
>          Issue Type: New Feature
>          Components: UDF
>            Reporter: Carl Steinbach
>            Assignee: Zhenxiao Luo
>         Attachments: HIVE-2279.D1059.1.patch, HIVE-2279.D1101.1.patch, 
> HIVE-2279.D1107.1.patch, HIVE-2279.D1125.1.patch
>
>

