xushiyan commented on issue #3975:
URL: https://github.com/apache/hudi/issues/3975#issuecomment-968090220


   @dmenin What I described is the worst-case scenario, in which each delete op 
routes to a different file. Deletes on the same file are consolidated into one 
rewrite. I highlighted the worst case to show that this can be slow compared to 
your inserts, which involve no rewriting at all. So this is expected for a COW 
table. If performance is a concern, try converting to MOR, where updates/deletes 
are appended to log files. If you also configure async compaction to run, there 
is no write amplification on ingestion. You may also consider partitioning on 
immutable fields to avoid records moving across partitions, or on near-immutable 
fields, since occasional partition updates are perfectly manageable.
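
To make the two points above concrete, here is a rough sketch. The helper 
`cow_file_rewrites` and the file IDs are hypothetical and only model the idea 
that a COW table rewrites each touched file once per commit. The option keys 
are Hudi write configs as I understand them, so please check them against the 
configuration reference for your Hudi version before relying on them.

```python
def cow_file_rewrites(delete_targets):
    """Toy model: number of base files a COW commit has to rewrite.

    delete_targets holds, for each deleted record, the (hypothetical) ID of
    the file that record lives in. Deletes landing in the same file share a
    single rewrite, so the cost is the number of distinct files touched.
    """
    return len(set(delete_targets))


# Worst case: every delete routes to a different file.
worst_case = ["file-1", "file-2", "file-3", "file-4"]
# Best case: all deletes land in the same file.
same_file = ["file-1", "file-1", "file-1", "file-1"]

print(cow_file_rewrites(worst_case))
print(cow_file_rewrites(same_file))

# Assumed writer options for a MOR table with compaction kept off the
# ingestion path. Verify the keys and their applicability (the async flag
# targets streaming writers) for your Hudi version.
mor_options = {
    "hoodie.datasource.write.table.type": "MERGE_ON_READ",
    "hoodie.compact.inline": "false",
    "hoodie.datasource.compaction.async.enable": "true",
}
```

In this toy model the worst case costs four file rewrites and the same-file 
case costs one, whereas inserts would cost none. With MOR, those deletes would 
instead be appended to log files and merged later by compaction.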



