[ 
https://issues.apache.org/jira/browse/HUDI-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Y Ethan Guo updated HUDI-9164:
------------------------------
    Summary: Improve file group sharing strategy of secondary index in MDT  
(was: Partition strategy of MDT secondary index does not match the query 
pattern it serves)

> Improve file group sharing strategy of secondary index in MDT
> -------------------------------------------------------------
>
>                 Key: HUDI-9164
>                 URL: https://issues.apache.org/jira/browse/HUDI-9164
>             Project: Apache Hudi
>          Issue Type: Improvement
>            Reporter: Davis Zhang
>            Priority: Blocker
>             Fix For: 1.1.0
>
>
> h3. MDT secondary index file layout does not support efficient lookup/join
> For the MDT join with an incoming pruning set RDD[InternalRow], the 
> existing secondary index data layout does not favor batch prefix lookup.
>  
> h4. MDT index layout
> The MDT secondary index stores key-value pairs, where the key uses the scheme
> <data column value><separator><record key value>
> and the value is the file group id.
>  
> So every key in the index is prefixed with the column value.
>  
> It adopts {*}hash-based partitioning{*}, which means it takes the full key 
> <data col value><record key value>, hashes it, and uses the hash to decide 
> which file group the record belongs to.
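
The layout described above can be illustrated with a minimal sketch. The separator character, the file group count, and the use of CRC32 as the hash are assumptions made for illustration only; they are not taken from the Hudi code base.

```python
import zlib

SEPARATOR = "$"       # hypothetical separator, not necessarily Hudi's
NUM_FILE_GROUPS = 8   # hypothetical number of MDT file groups


def secondary_index_key(column_value: str, record_key: str) -> str:
    """Build a key of the form <data column value><separator><record key value>."""
    return f"{column_value}{SEPARATOR}{record_key}"


def file_group_for(full_key: str, num_file_groups: int = NUM_FILE_GROUPS) -> int:
    """Hash the FULL key to pick a file group (CRC32 is a stand-in hash)."""
    return zlib.crc32(full_key.encode("utf-8")) % num_file_groups


# 100 index entries that all share the same data column value "SF".
keys = [secondary_index_key("SF", f"rk-{i}") for i in range(100)]
groups = {file_group_for(k) for k in keys}
print(f"entries with prefix 'SF' land in {len(groups)} file groups")
```

Although every key shares the same prefix, the entries are spread over several file groups because the record key takes part in the hash.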
>  
> h3. The query pattern it serves
>  
> In a nutshell, the data layout is hash partitioning while the query pattern 
> is prefix lookup; the two do not match at all.
>  
> There are 2 types of query pattern against the index:
>  * Point lookup given only a secondary index column value: only 
> {{<data column value>}} is given and we need to look up all associated 
> file group ids.
>  * Join with a large number of column values: this is how a secondary index 
> join would work. When joining tableGeneratingPruningSet and 
> tableWithIdxTobePruned, tableGeneratingPruningSet generates an RDD of 
> values for data column C1, and we join this RDD with the MDT C1 secondary 
> index to figure out the file group ids of interest. This is a join 
> between the RDD and the MDT at a large scale.
>  
> Because we only know the {{<data column value>}} from the input, which is 
> only the prefix of the secondary index key, we cannot tell which bucket the 
> potential MDT records belong to. As a result, even a point lookup needs 
> to load the full MDT, and the complexity is O(n).
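
A small sketch of why the lookup degenerates into a full scan, under the same illustrative assumptions (hypothetical separator, hypothetical file group count, CRC32 as a stand-in hash, an in-memory dict in place of MDT file groups):

```python
import zlib
from collections import defaultdict

SEPARATOR = "$"       # hypothetical separator
NUM_FILE_GROUPS = 4   # hypothetical number of MDT file groups


def bucket_of(text: str, n: int = NUM_FILE_GROUPS) -> int:
    return zlib.crc32(text.encode("utf-8")) % n


def build_index(entries):
    """entries: iterable of (column_value, record_key, data_file_group_id)."""
    index = defaultdict(dict)
    for column_value, record_key, file_group_id in entries:
        full_key = f"{column_value}{SEPARATOR}{record_key}"
        # The bucket depends on the full key, including the record key.
        index[bucket_of(full_key)][full_key] = file_group_id
    return index


def prefix_lookup(index, column_value):
    """Only the prefix is known, so every bucket has to be scanned."""
    prefix = f"{column_value}{SEPARATOR}"
    matches, buckets_scanned = set(), 0
    for bucket in range(NUM_FILE_GROUPS):
        buckets_scanned += 1
        for full_key, file_group_id in index.get(bucket, {}).items():
            if full_key.startswith(prefix):
                matches.add(file_group_id)
    return matches, buckets_scanned


entries = [("SF", f"rk-{i}", f"fg-{i % 3}") for i in range(30)]
entries += [("NYC", f"rk-{i}", "fg-9") for i in range(30, 40)]
index = build_index(entries)
matches, scanned = prefix_lookup(index, "SF")
print(sorted(matches), "buckets scanned:", scanned)
```

The lookup returns the right file group ids, but only after visiting every bucket, which is the O(n) behavior described above.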
>  
> This is not scalable.
>  
> The partition scheme needs to be improved to handle prefix-based search at 
> a large scale.
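
One possible direction, shown purely as an assumption and not as the fix chosen for this issue, is to derive the bucket from the data column value alone, so that all entries for one value are co-located. The sketch below uses a hypothetical separator, a hypothetical file group count, and CRC32 as a stand-in hash.

```python
import zlib
from collections import defaultdict

SEPARATOR = "$"       # hypothetical separator
NUM_FILE_GROUPS = 4   # hypothetical number of MDT file groups


def bucket_of(text: str, n: int = NUM_FILE_GROUPS) -> int:
    return zlib.crc32(text.encode("utf-8")) % n


def build_prefix_partitioned_index(entries):
    """entries: iterable of (column_value, record_key, data_file_group_id)."""
    index = defaultdict(dict)
    for column_value, record_key, file_group_id in entries:
        full_key = f"{column_value}{SEPARATOR}{record_key}"
        # Assumption: hash only the prefix (the data column value).
        index[bucket_of(column_value)][full_key] = file_group_id
    return index


def prefix_lookup(index, column_value):
    """The prefix alone determines the bucket, so one bucket is read."""
    prefix = f"{column_value}{SEPARATOR}"
    bucket = bucket_of(column_value)
    matches = {
        file_group_id
        for full_key, file_group_id in index.get(bucket, {}).items()
        if full_key.startswith(prefix)
    }
    return matches, 1  # exactly one bucket was scanned


entries = [("SF", f"rk-{i}", f"fg-{i % 3}") for i in range(30)]
entries += [("NYC", f"rk-{i}", "fg-9") for i in range(30, 40)]
index = build_prefix_partitioned_index(entries)
matches, scanned = prefix_lookup(index, "SF")
print(sorted(matches), "buckets scanned:", scanned)
```

A trade-off of this sketch is that a very frequent column value concentrates all of its entries in a single bucket, so skew would have to be handled separately.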



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
