[ 
https://issues.apache.org/jira/browse/HIVE-23598?focusedWorklogId=447935&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447935
 ]

ASF GitHub Bot logged work on HIVE-23598:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Jun/20 17:12
            Start Date: 18/Jun/20 17:12
    Worklog Time Spent: 10m 
      Work Description: jcamachor commented on a change in pull request #1126:
URL: https://github.com/apache/hive/pull/1126#discussion_r442379240



##########
File path: 
ql/src/test/results/clientpositive/llap/sketches_rewrite_cume_dist_partition_by.q.out
##########
@@ -26,76 +26,42 @@ POSTHOOK: Input: _dummy_database@_dummy_table
 POSTHOOK: Output: default@sketch_input
 POSTHOOK: Lineage: sketch_input.category SCRIPT []
 POSTHOOK: Lineage: sketch_input.id SCRIPT []
-PREHOOK: query: select id,category,cume_dist() over (partition by category 
order by id) from sketch_input order by category,id
-PREHOOK: type: QUERY
-PREHOOK: Input: default@sketch_input
-#### A masked pattern was here ####
-POSTHOOK: query: select id,category,cume_dist() over (partition by category 
order by id) from sketch_input order by category,id
-POSTHOOK: type: QUERY
-POSTHOOK: Input: default@sketch_input
-#### A masked pattern was here ####
-1      a       0.18181818181818182
-1      a       0.18181818181818182
-2      a       0.2727272727272727
-3      a       0.36363636363636365
-4      a       0.45454545454545453
-5      a       0.5454545454545454
-6      a       0.6363636363636364
-7      a       0.7272727272727273
-8      a       0.8181818181818182
-9      a       0.9090909090909091
-10     a       1.0
-6      b       0.18181818181818182
-6      b       0.18181818181818182
-7      b       0.2727272727272727
-8      b       0.36363636363636365
-9      b       0.45454545454545453
-10     b       0.5454545454545454
-11     b       0.6363636363636364
-12     b       0.7272727272727273
-13     b       0.8181818181818182
-14     b       0.9090909090909091
-15     b       1.0
-1      NULL    0.25
-2      NULL    0.5
-10     NULL    0.75
-13     NULL    1.0
-PREHOOK: query: select id,category,cume_dist() over (partition by category 
order by id),1.0-ds_kll_cdf(ds, CAST(-id AS FLOAT))[0]
+PREHOOK: query: select id,category,cume_dist() over (partition by category 
order by id),ds_kll_cdf(ds, CAST(id AS FLOAT))[0]
 from sketch_input
-join ( select category as c,ds_kll_sketch(cast(-id as float)) as ds from 
sketch_input group by category) q on (q.c=category)
+join ( select category as c,ds_kll_sketch(cast(id as float)) as ds from 
sketch_input group by category) q on (q.c=category)
 order by category,id
 PREHOOK: type: QUERY
 PREHOOK: Input: default@sketch_input
 #### A masked pattern was here ####
-POSTHOOK: query: select id,category,cume_dist() over (partition by category 
order by id),1.0-ds_kll_cdf(ds, CAST(-id AS FLOAT))[0]
+POSTHOOK: query: select id,category,cume_dist() over (partition by category 
order by id),ds_kll_cdf(ds, CAST(id AS FLOAT))[0]
 from sketch_input
-join ( select category as c,ds_kll_sketch(cast(-id as float)) as ds from 
sketch_input group by category) q on (q.c=category)
+join ( select category as c,ds_kll_sketch(cast(id as float)) as ds from 
sketch_input group by category) q on (q.c=category)
 order by category,id
 POSTHOOK: type: QUERY
 POSTHOOK: Input: default@sketch_input
 #### A masked pattern was here ####
-1      a       0.18181818181818182     0.18181818181818177
-1      a       0.18181818181818182     0.18181818181818177
-2      a       0.2727272727272727      0.2727272727272727
-3      a       0.36363636363636365     0.36363636363636365
-4      a       0.45454545454545453     0.4545454545454546
-5      a       0.5454545454545454      0.5454545454545454
-6      a       0.6363636363636364      0.6363636363636364
-7      a       0.7272727272727273      0.7272727272727273
-8      a       0.8181818181818182      0.8181818181818181
-9      a       0.9090909090909091      0.9090909090909091
-10     a       1.0     1.0
-6      b       0.18181818181818182     0.18181818181818177
-6      b       0.18181818181818182     0.18181818181818177
-7      b       0.2727272727272727      0.2727272727272727
-8      b       0.36363636363636365     0.36363636363636365
-9      b       0.45454545454545453     0.4545454545454546
-10     b       0.5454545454545454      0.5454545454545454
-11     b       0.6363636363636364      0.6363636363636364
-12     b       0.7272727272727273      0.7272727272727273
-13     b       0.8181818181818182      0.8181818181818181
-14     b       0.9090909090909091      0.9090909090909091
-15     b       1.0     1.0
+1      a       0.18181818181818182     0.0

Review comment:
       {quote}
   As a matter of fact; I think after submitting a patch to alter the udf a bit 
it could get back to the more accurate results again.
   {quote}
   Yes, I think we should do that. Thanks




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 447935)
    Time Spent: 40m  (was: 0.5h)

> Add option to rewrite NTILE and RANK to sketch functions
> --------------------------------------------------------
>
>                 Key: HIVE-23598
>                 URL: https://issues.apache.org/jira/browse/HIVE-23598
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to