[ https://issues.apache.org/jira/browse/HIVE-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588515#comment-13588515 ]
Phabricator commented on HIVE-4080: ----------------------------------- ashutoshc has requested changes to the revision "HIVE-4080 [jira] Add Lead & Lag UDAFs". Some comments. INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:533 Use LEAD_FUNC_NAME and LAG_FUNC_NAME here to be consistent. ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:871 This null check should be done outside of nesting if() block. ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:875 I don't see a case where fInfo is != null but udafResolver could be. If so, than this null check is redundant and should be removed. ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:890 We are already know that we are dealing with lead/lag udaf, it must be of type GenericUDAFResolver2.. no ? ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:866 I am wondering if this function can be rewritten as following: WindowFunctionInfo finfo = windowFunctions.get(name.toLowerCase()); if (finfo == null) { return null;} f ( !name.toLowerCase().equals(LEAD_FUNC_NAME) && !name.toLowerCase().equals(LAG_FUNC_NAME) ) { return getGenericUDAFEvaluator(name, argumentOIs, isDistinct, isAllColumns); } // this must be lead/lag UDAF GenericUDAFResolver udafResolver = finfo.getfInfo().getGenericUDAFResolver(); GenericUDAFParameterInfo paramInfo = new SimpleGenericUDAFParameterInfo( argumentOIs.toArray(), isDistinct, isAllColumns); return ((GenericUDAFResolver2) udafResolver).getEvaluator(paramInfo); If not, than I have specific questions. See next few comments. ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:1467 It will be good to comments for this new boolean registerAsUDAF. Something like following There are certain UDAFs like lead/lag which we want as windowing functions, but don't want them to appear in mFunctions. Why? Because.... ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLag.java:29 This and GenericUDAFLead share lots of common code. It might be good to have an abstract class for these two, just the way you have it in GenericUDFLeadLag. ql/src/test/queries/clientpositive/leadlag_queries.q:20 I think currently we don't support over clause on expressions. Once we do, it will be good to add test like: select p_retailprice - lead (p_retail) over (partition by p_mfgr) from part; ql/src/test/queries/clientpositive/leadlag_queries.q:35 It will be good to add a test which has both lead and lag in same query. REVISION DETAIL https://reviews.facebook.net/D8961 BRANCH HIVE-4080 ARCANIST PROJECT hive To: JIRA, ashutoshc, hbutani > Add Lead & Lag UDAFs > -------------------- > > Key: HIVE-4080 > URL: https://issues.apache.org/jira/browse/HIVE-4080 > Project: Hive > Issue Type: Bug > Components: PTF-Windowing > Reporter: Harish Butani > Assignee: Harish Butani > Attachments: HIVE-4080.1.patch.txt, HIVE-4080.D8961.1.patch > > > Currently we support Lead/Lag as navigation UDFs usable with Windowing. > To be standard compliant we need to support Lead & Lag UDAFs. > Will continue to support Lead/Lag UDFs as arguments to UDAFs when Windowing > is in play. > Currently allow Lead/Lag expressions to appear in SelectLists even when they > are not arguments to UDAFs. Support for this feature will probably be > removed. Causes ambiguities when Query contains different partition clauses. > Will provide more details with associated Jira to remove this feature. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira