[ 
https://issues.apache.org/jira/browse/HUDI-7833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17943273#comment-17943273
 ] 

Lin Liu commented on HUDI-7833:
-------------------------------

After switching to a nested record key, the MOR table no longer generates log 
files and keeps generating only base files, as observed in 
`TestFileGroupReaderBase.testReadFileGroupInMergeOnReadTable`.



{code:java}
➜  junit7444571774815248205 ls -al 2016/03/15/
total 5328
drwxr-xr-x@ 16 linliu  staff     512 Apr 10 08:17 .
drwxr-xr-x@  3 linliu  staff      96 Apr 10 08:17 ..
-rw-r--r--@  1 linliu  staff      12 Apr 10 08:17 ..hoodie_partition_metadata.crc
-rw-r--r--@  1 linliu  staff    3500 Apr 10 08:17 .6a8e662a-5d6f-47f5-bcab-34a4207331cb-0_0-169-446_20250410081743865.parquet.crc
-rw-r--r--@  1 linliu  staff    3496 Apr 10 08:17 .71202b5d-d928-4ce8-8e8e-6dbcf4736c0c-0_0-139-356_20250410081741630.parquet.crc
-rw-r--r--@  1 linliu  staff    3500 Apr 10 08:17 .b7c1e5db-cf45-4f77-8dd4-a90055256e7c-0_0-109-269_20250410081739740.parquet.crc
-rw-r--r--@  1 linliu  staff    3500 Apr 10 08:17 .d12f47a5-5682-4926-8729-d7f5e1eb9fc3-0_0-49-104_20250410081735639.parquet.crc
-rw-r--r--@  1 linliu  staff    3496 Apr 10 08:17 .d4abf013-047e-4a5d-883a-55c286dff4d5-0_0-79-185_20250410081737693.parquet.crc
-rw-r--r--@  1 linliu  staff    3500 Apr 10 08:17 .d8f07823-5c95-4c1a-84c2-012fdb9f9be1-0_0-22-26_20250410081729468.parquet.crc
-rw-r--r--@  1 linliu  staff      96 Apr 10 08:17 .hoodie_partition_metadata
-rw-r--r--@  1 linliu  staff  446638 Apr 10 08:17 6a8e662a-5d6f-47f5-bcab-34a4207331cb-0_0-169-446_20250410081743865.parquet
-rw-r--r--@  1 linliu  staff  446148 Apr 10 08:17 71202b5d-d928-4ce8-8e8e-6dbcf4736c0c-0_0-139-356_20250410081741630.parquet
-rw-r--r--@  1 linliu  staff  446606 Apr 10 08:17 b7c1e5db-cf45-4f77-8dd4-a90055256e7c-0_0-109-269_20250410081739740.parquet
-rw-r--r--@  1 linliu  staff  446826 Apr 10 08:17 d12f47a5-5682-4926-8729-d7f5e1eb9fc3-0_0-49-104_20250410081735639.parquet
-rw-r--r--@  1 linliu  staff  446147 Apr 10 08:17 d4abf013-047e-4a5d-883a-55c286dff4d5-0_0-79-185_20250410081737693.parquet
-rw-r--r--@  1 linliu  staff  446611 Apr 10 08:17 d8f07823-5c95-4c1a-84c2-012fdb9f9be1-0_0-22-26_20250410081729468.parquet
{code}

> Validate that fg reader works with nested column as record key
> --------------------------------------------------------------
>
>                 Key: HUDI-7833
>                 URL: https://issues.apache.org/jira/browse/HUDI-7833
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Jonathan Vexler
>            Assignee: Lin Liu
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.0.2
>
>   Original Estimate: 2h
>          Time Spent: 1h
>  Remaining Estimate: 1h
>
> Ensure that fg reader works if the record key is a nested column
>  
> Progress:
> Created a PR to reproduce the problem: 
> [https://github.com/apache/hudi/pull/12253]
> In the PR, we turn the fg reader on and off, run write operations 
> (insert, update, and delete), and then read. We want to test whether update 
> and delete succeed on a map-typed key column.
> The test results show that the test fails for a map-typed key with or 
> without the fg reader enabled. We can conclude that nested keys are not 
> supported in Hudi so far.
> I investigated the root cause for this specific test: 
> in `BuiltinKeyGenerator.combineRecordKeyInternal`, the hash of the 
> `UnsafeMapData` object is returned without regard to the content of the map.
> To fix it, we need to create a utility function that deserializes these map 
> objects and generates the hash based on their content.
> We should also do this for all other nested data types.
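
The root cause described in the issue above (a key derived from the map object rather than from its entries) can be illustrated with a small, self-contained sketch. This is not Hudi or Spark code: `NestedKeySketch`, `OpaqueMap`, `identityKey`, and `contentKey` are hypothetical names, and `OpaqueMap` is only a stand-in for a container that does not override `toString()`/`hashCode()`. It assumes string keys and values and does no delimiter escaping, which a real utility would need.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class NestedKeySketch {

    // Hypothetical stand-in for an opaque map container: it does not override
    // toString()/hashCode(), so both fall back to object identity.
    public static final class OpaqueMap {
        public final Map<String, String> entries;

        public OpaqueMap(Map<String, String> entries) {
            this.entries = entries;
        }
    }

    // Identity-based key: depends on the object instance, not on its entries.
    // Two instances with equal content will typically produce different keys.
    public static String identityKey(OpaqueMap m) {
        return String.valueOf(m); // Object.toString(): ClassName@identityHash
    }

    // Content-based key: copy the entries into a TreeMap so they are ordered
    // by map key, then join them. Equal content yields an equal key regardless
    // of insertion order. (No escaping of ':' or ',' in this sketch.)
    public static String contentKey(Map<String, String> entries) {
        return new TreeMap<>(entries).entrySet().stream()
                .map(e -> e.getKey() + ":" + e.getValue())
                .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(String[] args) {
        Map<String, String> first = new LinkedHashMap<>();
        first.put("a", "1");
        first.put("b", "2");

        Map<String, String> second = new LinkedHashMap<>();
        second.put("b", "2");
        second.put("a", "1");

        OpaqueMap m1 = new OpaqueMap(first);
        OpaqueMap m2 = new OpaqueMap(second);

        // Same logical record key, but the identity-based keys are unrelated.
        System.out.println("identity keys equal: "
                + identityKey(m1).equals(identityKey(m2)));

        // The content-based keys match, so an update or delete would find
        // the record written by the earlier insert.
        System.out.println("content keys equal:  "
                + contentKey(m1.entries).equals(contentKey(m2.entries)));
        System.out.println("content key:         " + contentKey(m1.entries));
    }
}
```

If the record key is not stable across writes, an update or delete cannot be matched to the previously inserted record, which would be consistent with the failures reported for map-typed keys. The same content-based approach would need to be extended to arrays and structs to cover the other nested types.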



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
