[ 
https://issues.apache.org/jira/browse/KUDU-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230392#comment-17230392
 ] 

ASF subversion and git services commented on KUDU-1644:
-------------------------------------------------------

Commit 6a7cadc7eddeaaa374971d5ba16fec8422e33db9 in kudu's branch 
refs/heads/master from ningw
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=6a7cadc ]

KUDU-1644 hash-partition based in-list predicate optimization

Hash-based pruning for IN-list queries on a single hash key: reduce the
predicate values by matching them against each tablet's hash partition.
This patch reduces the IN-list predicate values pushed down to each tablet
without changing the content that is returned.

The table has P hash partitions and N records. The IN-list predicate has V values.

Before:
For each tablet, the time complexity of a hash-key based IN-list query is:
LOG(V) * N

After:
Complexity becomes:
LOG(V/P) * N
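
To make the estimate concrete, here is a rough calculation in Python. It
assumes each scanned row is checked against a sorted IN-list by binary search
and that the V values spread evenly over the P partitions. Both are
assumptions made for illustration; the commit message gives only the formulas.
The function name and the sample numbers are hypothetical.

```python
import math

def probes_per_row(num_values: int) -> int:
    """Worst-case binary-search probes over a sorted list of num_values items."""
    return math.ceil(math.log2(num_values + 1))

V = 1000  # values in the IN-list (illustrative)
P = 10    # hash partitions (illustrative)

before = probes_per_row(V)       # every tablet receives all V values
after = probes_per_row(V // P)   # each tablet receives about V/P values

print(f"before: {before} probes per row, after: {after} probes per row")
```

The per-row saving is logarithmic, so the gain grows with the number of rows
scanned rather than with the length of the list alone.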

E.g.
Table 'profile' is hash-partitioned on id into 3 partitions. For simplicity,
assume the hash function is mod.
select * from profile where id in (1,2,3,4,5,6,7,8,9,10)

Before:
Tablet 1: id in (1,2,3,4,5,6,7,8,9,10)
Tablet 2: id in (1,2,3,4,5,6,7,8,9,10)
Tablet 3: id in (1,2,3,4,5,6,7,8,9,10)

After:
Tablet 1: id in (3,6,9)
Tablet 2: id in (1,4,7,10)
Tablet 3: id in (2,5,8)
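
The split above can be reproduced with a short Python sketch. It is only an
illustration of the idea, not the code from the patch: the function name
prune_in_list is hypothetical, and plain mod stands in for the hash function
as in the example (Kudu's real hash function is different).

```python
from collections import defaultdict

def prune_in_list(values, num_partitions, hash_fn=lambda v: v):
    """Group IN-list values by the hash bucket that could contain them.

    hash_fn defaults to the identity, so the bucket is simply v mod
    num_partitions, mirroring the simplified example.
    """
    per_bucket = defaultdict(list)
    for v in values:
        per_bucket[hash_fn(v) % num_partitions].append(v)
    # Return an entry for every bucket, even those with no matching value.
    return {b: sorted(per_bucket.get(b, [])) for b in range(num_partitions)}

pruned = prune_in_list(range(1, 11), 3)
for bucket, vals in pruned.items():
    print(f"bucket {bucket}: id in {tuple(vals)}")
```

Each tablet then evaluates only the values that hash to its own bucket, and a
tablet whose bucket ends up with an empty list cannot match any row.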

Change-Id: I202001535669a72de7fbb9e766dbc27db48e0aa2
Reviewed-on: http://gerrit.cloudera.org:8080/16674
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <aw...@cloudera.com>


> Simplify IN-list predicate values based on tablet partition key or rowset PK 
> bounds
> -----------------------------------------------------------------------------------
>
>                 Key: KUDU-1644
>                 URL: https://issues.apache.org/jira/browse/KUDU-1644
>             Project: Kudu
>          Issue Type: Sub-task
>          Components: perf, tablet
>            Reporter: Dan Burkert
>            Priority: Major
>         Attachments: image-2019-12-05-14-52-05-846.png, 
> image-2019-12-05-14-52-18-487.png, image-2019-12-05-14-53-51-175.png, 
> image-2019-12-05-14-53-57-741.png, image-2019-12-05-14-54-03-485.png
>
>
> When new scans are optimized by the tablet, the tablet's partition key bounds 
> aren't taken into account in order to remove predicates from the scan.  One 
> of the most important such optimizations is that IN-list predicates could 
> remove values based on the tablet's constraints.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
