[ https://issues.apache.org/jira/browse/HUDI-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
rex xiong updated HUDI-3818:
----------------------------
    Description:
When a bytes (binary) column is used as the primary key, Hudi generates a fixed hoodie key, so upserts insert only one row.
{code:java}
scala> sql("desc extended binary_test1").show()
+--------------------+--------------------+-------+
|            col_name|           data_type|comment|
+--------------------+--------------------+-------+
| _hoodie_commit_time|              string|   null|
|_hoodie_commit_seqno|              string|   null|
|  _hoodie_record_key|              string|   null|
|_hoodie_partition...|              string|   null|
|   _hoodie_file_name|              string|   null|
|                  id|              binary|   null|
|                name|              string|   null|
|                  dt|              string|   null|
|                    |                    |       |
|# Detailed Table ...|                    |       |
|            Database|             default|       |
|               Table|        binary_test1|       |
|               Owner|                root|       |
|        Created Time|Sat Apr 02 13:28:...|       |
|         Last Access|             UNKNOWN|       |
|          Created By|         Spark 3.2.0|       |
|                Type|             MANAGED|       |
|            Provider|                hudi|       |
|    Table Properties|[last_commit_time...|       |
|          Statistics|        435194 bytes|       |
+--------------------+--------------------+-------+

scala> sql("select * from binary_test1").show()
+-------------------+--------------------+--------------------+----------------------+--------------------+--------------------+---------+--------+
|_hoodie_commit_time|_hoodie_commit_seqno|  _hoodie_record_key|_hoodie_partition_path|   _hoodie_file_name|                  id|     name|      dt|
+-------------------+--------------------+--------------------+----------------------+--------------------+--------------------+---------+--------+
|  20220402132927590|20220402132927590...|id:java.nio.HeapB...|                      |1a06106e-5e7a-4e6...|[03 45 6A 00 00 0...|Mary Jane|20220401|
+-------------------+--------------------+--------------------+----------------------+--------------------+--------------------+---------+--------+
{code}

  was:
{code:java}
scala> sql("desc extended binary_test1").show(false)
+----------------------------+--------------------------------------------------------------------------------------+-------+
|col_name                    |data_type                                                                             |comment|
+----------------------------+--------------------------------------------------------------------------------------+-------+
|_hoodie_commit_time         |string                                                                                |null   |
|_hoodie_commit_seqno        |string                                                                                |null   |
|_hoodie_record_key          |string                                                                                |null   |
|_hoodie_partition_path      |string                                                                                |null   |
|_hoodie_file_name           |string                                                                                |null   |
|id                          |binary                                                                                |null   |
|name                        |string                                                                                |null   |
|dt                          |string                                                                                |null   |
|                            |                                                                                      |       |
|# Detailed Table Information|                                                                                      |       |
|Database                    |default                                                                               |       |
|Table                       |binary_test1                                                                          |       |
|Owner                       |root                                                                                  |       |
|Created Time                |Sat Apr 02 13:28:29 CST 2022                                                          |       |
|Last Access                 |UNKNOWN                                                                               |       |
|Created By                  |Spark 3.2.0                                                                           |       |
|Type                        |MANAGED                                                                               |       |
|Provider                    |hudi                                                                                  |       |
|Table Properties            |[last_commit_time_sync=20220402132927590, preCombineField=id, primaryKey=id, type=cow]|       |
|Statistics                  |435194 bytes                                                                          |       |
+----------------------------+--------------------------------------------------------------------------------------+-------+
only showing top 20 rows

scala> sql("select * from binary_test1").show(false)
+-------------------+---------------------+-----------------------------------------------+----------------------+--------------------------------------------------------------------------+-------------------------------------------------+---------+--------+
|_hoodie_commit_time|_hoodie_commit_seqno |_hoodie_record_key                             |_hoodie_partition_path|_hoodie_file_name                                                         |id                                               |name     |dt      |
+-------------------+---------------------+-----------------------------------------------+----------------------+--------------------------------------------------------------------------+-------------------------------------------------+---------+--------+
|20220402132927590  |20220402132927590_0_1|id:java.nio.HeapByteBuffer[pos=0 lim=16 cap=16]|                      |1a06106e-5e7a-4e68-9ebb-a0dceab70d87-0_0-12-1005_20220402132927590.parquet|[03 45 6A 00 00 00 00 00 00 00 00 00 00 00 00 00]|Mary Jane|20220401|
+-------------------+---------------------+-----------------------------------------------+----------------------+--------------------------------------------------------------------------+-------------------------------------------------+---------+--------+
{code}

> hudi doesn't support bytes column as primary key
> ------------------------------------------------
>
>                 Key: HUDI-3818
>                 URL: https://issues.apache.org/jira/browse/HUDI-3818
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: rex xiong
>            Assignee: rex xiong
>            Priority: Minor
>
> When a bytes (binary) column is used as the primary key, Hudi generates a fixed hoodie key,
> so upserts insert only one row.
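
The record key reported in this issue, id:java.nio.HeapByteBuffer[pos=0 lim=16 cap=16], has the same shape as the string that java.nio.ByteBuffer.toString() returns. That string contains only the buffer's position, limit and capacity, never its bytes, so two different 16-byte ids would yield the same key. The following is a minimal, standalone Java sketch of that behaviour. It is not Hudi code, and the class name ByteBufferKeyDemo and the helper hexKey are hypothetical names used only for illustration:

```java
import java.nio.ByteBuffer;

public class ByteBufferKeyDemo {

    // Hypothetical helper (not a Hudi API): renders the buffer's remaining
    // bytes as upper-case hex, so the resulting string depends on content.
    static String hexKey(ByteBuffer buf) {
        ByteBuffer view = buf.duplicate(); // leave the caller's position untouched
        StringBuilder sb = new StringBuilder();
        while (view.hasRemaining()) {
            sb.append(String.format("%02X", view.get()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two 16-byte ids with different content.
        byte[] first = new byte[16];
        byte[] second = new byte[16];
        first[0] = 0x03;
        first[1] = 0x45;
        first[2] = 0x6A;
        second[0] = 0x07;

        ByteBuffer a = ByteBuffer.wrap(first);
        ByteBuffer b = ByteBuffer.wrap(second);

        // toString() reports only position/limit/capacity, so both lines match.
        System.out.println("id:" + a);
        System.out.println("id:" + b);
        System.out.println(a.toString().equals(b.toString())); // true

        // A content-based rendering keeps the two ids distinct.
        System.out.println("id:" + hexKey(a));
        System.out.println("id:" + hexKey(b));
    }
}
```

Hex is only one possible content-based encoding. The sketch makes no claim about how Hudi's key generators should handle binary columns.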