[ 
https://issues.apache.org/jira/browse/HIVE-15978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15904117#comment-15904117
 ] 

Zoltan Haindrich commented on HIVE-15978:
-----------------------------------------

[~pxiong] I see multiple ways this could be achieved, and I'm not sure which 
one to take :)

Most of these functions could (more or less) be translated into calls to 
existing UDAFs. It needs some tweaking, but it can be done. I don't really 
want to reimplement all of those things; I think it would be better to 
reuse them.

# I could create 'cover' UDAF evaluators for each of these functions and 
evaluate them inside the new evaluator. That could work, but it would mean 
quite a few very similar classes.
# The other alternative is to add slightly extended versions of some 
existing UDAFs (like count and variance) and somehow rewrite the 
{{regr_sxx(y,x)}} invocations to 
{{extended_COUNT(x, y) * extended_VAR_POP(y)}}.
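
To make the first alternative more concrete, here is a minimal sketch of one possible reading of it, written in Python rather than against Hive's actual Java UDAF API. Every name in it ({{VarPopAccumulator}}, {{RegrSxxCover}}, {{iterate}}, {{terminate}}, {{regr_sxx}}) is hypothetical and only stands in for an existing variance evaluator and a thin wrapper around it. It assumes the SQL-standard definition, where {{regr_sxx(y, x)}} is the number of non-null pairs times the population variance of the second argument.

```python
from typing import Iterable, Optional, Tuple


class VarPopAccumulator:
    """Stand-in for an existing population-variance UDAF (Welford's method)."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def iterate(self, value: Optional[float]) -> None:
        if value is None:  # skip NULL input, as SQL aggregates usually do
            return
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def terminate(self) -> Optional[float]:
        return self.m2 / self.count if self.count else None


class RegrSxxCover:
    """Hypothetical cover evaluator: filters rows, then delegates the math."""

    def __init__(self) -> None:
        self.inner = VarPopAccumulator()

    def iterate(self, y: Optional[float], x: Optional[float]) -> None:
        # Forward x only when the whole (y, x) pair is non-NULL.
        if y is not None and x is not None:
            self.inner.iterate(x)

    def terminate(self) -> Optional[float]:
        var_pop = self.inner.terminate()
        if var_pop is None:
            return None
        # count of non-NULL pairs times var_pop(x) over those pairs
        return self.inner.count * var_pop


def regr_sxx(rows: Iterable[Tuple[Optional[float], Optional[float]]]) -> Optional[float]:
    """Drive the cover evaluator over (y, x) rows."""
    agg = RegrSxxCover()
    for y, x in rows:
        agg.iterate(y, x)
    return agg.terminate()
```

A call such as {{regr_sxx([(1.0, 1.0), (None, 100.0), (3.0, 3.0)])}} only ever feeds the x values 1.0 and 3.0 to the inner accumulator. The downside mentioned above is visible here too: each regr_* function would need its own, nearly identical wrapper class.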

My guess is that the first alternative may give slightly, but not 
significantly, better runtimes, while in the second case the "original" 
evaluators would do the real work.

As for why I need to change the existing UDAFs a bit: all of these regr_* 
functions are required to do work only on rows where neither {{x}} nor {{y}} 
is null (e.g. {{regr_sxx(y,x)}}).
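
A small sketch of why the unmodified aggregates are not enough, again in Python with hypothetical helper names ({{naive_sxx}}, {{pairwise_sxx}}) and made-up sample rows. It assumes the SQL-standard behaviour that a row contributes only when both arguments are non-null. A plain count or variance over {{x}} alone would still include rows where {{y}} is null.

```python
from statistics import pvariance
from typing import List, Optional, Tuple

Row = Tuple[Optional[float], Optional[float]]  # (y, x)

# Illustrative data only: the third row has a NULL y but a non-NULL x.
rows: List[Row] = [(1.0, 1.0), (2.0, 2.0), (None, 9.0), (3.0, 3.0)]


def naive_sxx(data: List[Row]) -> Optional[float]:
    """What unmodified COUNT(x) * VAR_POP(x) would compute: looks at x only."""
    xs = [x for _, x in data if x is not None]
    return len(xs) * pvariance(xs) if xs else None


def pairwise_sxx(data: List[Row]) -> Optional[float]:
    """Pair-aware variant: a row counts only if both y and x are non-NULL."""
    xs = [x for y, x in data if y is not None and x is not None]
    return len(xs) * pvariance(xs) if xs else None


if __name__ == "__main__":
    print("naive:   ", naive_sxx(rows))
    print("pairwise:", pairwise_sxx(rows))
```

The two results differ because the naive version keeps the x value 9.0 from the row whose y is null, which is the reason the existing UDAFs would need a pair-aware extension.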

> Support regr_* functions
> ------------------------
>
>                 Key: HIVE-15978
>                 URL: https://issues.apache.org/jira/browse/HIVE-15978
>             Project: Hive
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Carter Shanklin
>            Assignee: Zoltan Haindrich
>
> Support the standard regr_* functions, regr_slope, regr_intercept, regr_r2, 
> regr_sxx, regr_syy, regr_sxy, regr_avgx, regr_avgy, regr_count. SQL reference 
> section 10.9



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
