robbik commented on issue #12931: URL: https://github.com/apache/hudi/issues/12931#issuecomment-2710086485
Hi @rangareddy, I am afraid it depends on the data itself. I ran the same code against a different dataset (starting from an empty S3 bucket, without the existing record index files) and couldn't reproduce the issue. However, when the existing record index files are used with the same incremental dataset, the issue is reproducible. The record index files are quite big (~1 GB), and the incremental dataset is around 100 thousand rows. I can share the record index files, but I can't share the incremental dataset since it is our production data.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org