robbik commented on issue #12931: URL: https://github.com/apache/hudi/issues/12931#issuecomment-2707008191
Hi, I did an initial data insert from one database to S3 using Spark and Hudi on EMR. The S3 path was empty before the insert. After the insert succeeded, I ran a delete and an upsert with Spark and Hudi from the same table (which by then had some removed, updated, and new rows) to the same S3 path. The delete succeeded, but the upsert failed with the error above. I retried several times and got the same error every time, even when I skipped the delete step. I debugged the Spark job and saw that the record lookup worked for the other record index files but failed for that one particular record index file. Debugging deeper into Hudi, I found that in that record index file at least one index block key pointed to a different key in the data block. I'm not sure whether this is caused by the delete operation or whether it is expected.
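
For anyone trying to reproduce this, the sequence described above (insert, then delete, then upsert to the same path) could be sketched roughly as follows in PySpark. This is only an illustration, not the actual job: the table name, key columns, and the helper names `hudi_write_options` and `write_with_hudi` are hypothetical, and enabling the record index through these options is an assumption based on the record index files mentioned in the comment.

```python
# Hypothetical values: the comment does not state the table name,
# the key columns, or the S3 path that were actually used.
TABLE_NAME = "example_table"
RECORD_KEY_FIELD = "id"
PRECOMBINE_FIELD = "updated_at"

SUPPORTED_OPERATIONS = {"insert", "upsert", "delete"}


def hudi_write_options(operation):
    """Build Hudi writer options for one step of the sequence.

    Assumption: the table uses the metadata-table record index, since the
    comment talks about record index files.
    """
    if operation not in SUPPORTED_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    return {
        "hoodie.table.name": TABLE_NAME,
        "hoodie.datasource.write.recordkey.field": RECORD_KEY_FIELD,
        "hoodie.datasource.write.precombine.field": PRECOMBINE_FIELD,
        "hoodie.datasource.write.operation": operation,
        # Record index lives in the metadata table, so both are enabled.
        "hoodie.metadata.enable": "true",
        "hoodie.metadata.record.index.enable": "true",
        "hoodie.index.type": "RECORD_INDEX",
    }


def write_with_hudi(df, base_path, operation, mode="append"):
    """Write a Spark DataFrame to a Hudi table at base_path."""
    (
        df.write.format("hudi")
        .options(**hudi_write_options(operation))
        .mode(mode)
        .save(base_path)
    )
```

With DataFrames for the initial rows, the removed rows, and the changed rows, the three steps would then be `write_with_hudi(initial_df, path, "insert")`, `write_with_hudi(removed_df, path, "delete")`, and `write_with_hudi(changed_df, path, "upsert")`, all against the same `path`.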