TarunMootala commented on issue #4914:
URL: https://github.com/apache/hudi/issues/4914#issuecomment-1058226404


   Tested with OverwriteNonDefaultsWithLatestAvroPayload, but it's not working
as expected. When a field in the middle of the schema is missing from the
incoming batch, the values are written by position instead of by field name.
OverwriteWithLatestAvroPayload behaves the same way.
   
   ```
   # Missing the 2nd field 'name' 
   inputDF = spark.createDataFrame(
       [
           ("111", "2022-01-01", "2022-01-01T12:15:00.512679Z", "2022-01-01", "2022-01-01"),
           ("112", "2015-01-01", "2015-01-01T12:15:00.512679Z", "2015-01-01", "2015-01-01"),
           ("113", "2015-01-01", "2015-01-01T13:51:42.248818Z", "2015-01-01", "2015-01-01"),
           ("114", "2015-01-01", "2015-01-01T13:51:42.248818Z", "2015-01-01", "2015-01-01"),
           ("115", "2015-01-01", "2015-01-01T13:51:42.248818Z", "2015-01-01", "2015-01-01"),
           ("116", "2015-01-01", "2015-01-01T13:51:42.248818Z", "2015-01-01", "2015-01-01"),
           ("117", "2015-01-01", "2015-01-01T13:51:42.248818Z", "2015-01-01", "2015-01-01")
       ],
       ["id", "creation_date", "last_update_time", "creation_date1", "creation_date2"]
   )
   
   hudiOptions = {
       'hoodie.table.name': table_name,
       'hoodie.datasource.write.recordkey.field': 'id',
       'hoodie.datasource.write.precombine.field': 'last_update_time',
       'hoodie.datasource.write.reconcile.schema': 'true',
       'hoodie.datasource.hive_sync.enable': 'true',
       'hoodie.datasource.hive_sync.database': 'streaming_dev',
       'hoodie.datasource.hive_sync.partition_extractor_class': 'org.apache.hudi.hive.NonPartitionedExtractor',
       'hoodie.datasource.write.payload.class': 'org.apache.hudi.common.model.OverwriteNonDefaultsWithLatestAvroPayload'
   }
   
   print(table_name, table_path)
   
   inputDF.write\
   .format('hudi')\
   .option('hoodie.datasource.write.operation', 'upsert')\
   .options(**hudiOptions)\
   .mode('append')\
   .save(table_path)
   ```
   
   Output:
   ```
   +-------------------+--------------------+------------------+----------------------+--------------------+---+----------+--------------------+--------------------+--------------+--------------+
   |_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key|_hoodie_partition_path|   _hoodie_file_name| id|      name|       creation_date|    last_update_time|creation_date1|creation_date2|
   +-------------------+--------------------+------------------+----------------------+--------------------+---+----------+--------------------+--------------------+--------------+--------------+
   |     20220301154618|  20220301154618_0_1|               100|                      |a23a3468-f505-4e1...|100|       AAA|          2015-01-01|2015-01-01T13:51:...|          null|          null|
   |     20220301154618|  20220301154618_0_2|               101|                      |a23a3468-f505-4e1...|101|       BBB|          2015-01-01|2015-01-01T12:14:...|          null|          null|
   |     20220301154618|  20220301154618_0_3|               102|                      |a23a3468-f505-4e1...|102|       CCC|          2015-01-01|2015-01-01T13:51:...|          null|          null|
   |     20220301154618|  20220301154618_0_4|               103|                      |a23a3468-f505-4e1...|103|       DDD|          2015-01-01|2015-01-01T13:51:...|          null|          null|
   |     20220301154618|  20220301154618_0_5|               104|                      |a23a3468-f505-4e1...|104|       EEE|          2015-01-01|2015-01-01T12:15:...|          null|          null|
   |     20220301154618|  20220301154618_0_6|               105|                      |a23a3468-f505-4e1...|105|       FFF|          2015-01-01|2015-01-01T13:51:...|          null|          null|
   |     20220301154640|  20220301154640_0_3|               106|                      |a23a3468-f505-4e1...|106|       AAA|          2015-01-01|2015-01-01T13:51:...|    2015-01-01|    2015-01-01|
   |     20220301154640|  20220301154640_0_4|               107|                      |a23a3468-f505-4e1...|107|       BBB|          2015-01-01|2015-01-01T12:14:...|    2015-01-01|    2015-01-01|
   |     20220301154640|  20220301154640_0_5|               108|                      |a23a3468-f505-4e1...|108|       CCC|          2015-01-01|2015-01-01T13:51:...|    2015-01-01|    2015-01-01|
   |     20220301154640|  20220301154640_0_6|               109|                      |a23a3468-f505-4e1...|109|       DDD|          2015-01-01|2015-01-01T13:51:...|    2015-01-01|    2015-01-01|
   |     20220301154640|  20220301154640_0_1|               110|                      |a23a3468-f505-4e1...|110|       EEE|          2015-01-01|2015-01-01T12:15:...|    2015-01-01|    2015-01-01|
   |     20220303161634|  20220303161634_0_7|               111|                      |a23a3468-f505-4e1...|111|2022-01-01|2022-01-01T12:15:...|          2022-01-01|    2022-01-01|          null|
   |     20220303161634|  20220303161634_0_8|               112|                      |a23a3468-f505-4e1...|112|2015-01-01|2015-01-01T12:15:...|          2015-01-01|    2015-01-01|          null|
   |     20220303161634|  20220303161634_0_9|               113|                      |a23a3468-f505-4e1...|113|2015-01-01|2015-01-01T13:51:...|          2015-01-01|    2015-01-01|          null|
   |     20220303161634| 20220303161634_0_10|               114|                      |a23a3468-f505-4e1...|114|2015-01-01|2015-01-01T13:51:...|          2015-01-01|    2015-01-01|          null|
   |     20220303161634| 20220303161634_0_11|               115|                      |a23a3468-f505-4e1...|115|2015-01-01|2015-01-01T13:51:...|          2015-01-01|    2015-01-01|          null|
   |     20220303161634| 20220303161634_0_12|               116|                      |a23a3468-f505-4e1...|116|2015-01-01|2015-01-01T13:51:...|          2015-01-01|    2015-01-01|          null|
   |     20220303161634| 20220303161634_0_13|               117|                      |a23a3468-f505-4e1...|117|2015-01-01|2015-01-01T13:51:...|          2015-01-01|    2015-01-01|          null|
   +-------------------+--------------------+------------------+----------------------+--------------------+---+----------+--------------------+--------------------+--------------+--------------+
   ```
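
To make the mismatch concrete, here is a small plain-Python sketch of the two mapping strategies applied to record 111 from the batch above. It only illustrates positional versus name-based field matching; it is not Hudi's or Avro's actual code path, and the helper names `map_by_position` and `map_by_name` are made up for this example.

```python
# Illustration only: plain Python, not Hudi's or Avro's implementation.

# Column order of the existing table, taken from the output above.
TABLE_COLUMNS = ["id", "name", "creation_date", "last_update_time",
                 "creation_date1", "creation_date2"]

# Incoming batch from this comment: the 'name' column is absent.
INCOMING_COLUMNS = ["id", "creation_date", "last_update_time",
                    "creation_date1", "creation_date2"]
INCOMING_ROW = ("111", "2022-01-01", "2022-01-01T12:15:00.512679Z",
                "2022-01-01", "2022-01-01")


def map_by_position(table_columns, row):
    """Pair values with table columns in order and pad the tail with None."""
    padded = list(row) + [None] * (len(table_columns) - len(row))
    return dict(zip(table_columns, padded))


def map_by_name(table_columns, incoming_columns, row):
    """Pair values with table columns by field name; absent fields become None."""
    incoming = dict(zip(incoming_columns, row))
    return {col: incoming.get(col) for col in table_columns}


positional = map_by_position(TABLE_COLUMNS, INCOMING_ROW)
by_name = map_by_name(TABLE_COLUMNS, INCOMING_COLUMNS, INCOMING_ROW)

print(positional)  # every value after 'id' shifts one column to the left
print(by_name)     # 'name' is None, the other fields keep their own values
```

The positional mapping reproduces what the table shows for ids 111 to 117: a date in `name`, the timestamp in `creation_date`, and null in `creation_date2`. The name-based mapping is the behavior I expected.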



