parisni commented on issue #6373:
URL: https://github.com/apache/hudi/issues/6373#issuecomment-1218625262

   Yeah, KEEP_LATEST_COMMITS. Since cleaning never finds files to delete, it 
always falls back to getPartitionPathsForFullCleaning().
   That method lists partition paths on disk, but the file groups to delete 
are then looked up in the metadata table.
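
   To make the fallback concrete, here is a toy Python model of how the
partitions to clean get selected, as described in this thread. It is only a
sketch, not Hudi's actual code: the function choose_partition_listing and its
parameters are hypothetical names, and the branch for KEEP_LATEST_COMMITS
with incremental mode disabled is an assumption, since the thread does not
cover it.

```python
def choose_partition_listing(policy: str,
                             incremental_mode: bool,
                             has_prior_clean: bool) -> str:
    """Toy model (hypothetical, not Hudi code) of the partition selection.

    Returns "full" when getPartitionPathsForFullCleaning() would be used,
    and "incremental" when only partitions touched since the last clean
    would be considered.
    """
    if policy == "KEEP_LATEST_FILE_VERSIONS":
        # Per the thread: this policy always goes through full cleaning.
        return "full"
    if policy == "KEEP_LATEST_COMMITS":
        if incremental_mode and has_prior_clean:
            return "incremental"
        if incremental_mode and not has_prior_clean:
            # Per the thread: no prior clean ever, so fall back to full.
            return "full"
        # Assumption: incremental mode disabled also means full cleaning.
        return "full"
    raise ValueError(f"unknown cleaning policy: {policy}")


# The case discussed here: cleaning never finds files to delete, so no
# clean ever completes and every run takes the full-cleaning path.
print(choose_partition_listing("KEEP_LATEST_COMMITS", True, False))
```

   In this model, a table whose clean never completes keeps
has_prior_clean=False forever, so it never leaves the full-cleaning path.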
   
   Also, I suspect there is a problem with using incremental cleaning together with 
KEEP_LATEST_COMMITS, which leads to some partitions never being cleaned after a first 
clean, but I will open a separate issue for that one. Incremental cleaning should 
be used together with KEEP_LATEST_FILE_VERSIONS only.
   
   On August 17, 2022 10:45:32 PM UTC, Sivabalan Narayanan ***@***.***> wrote:
   >May I know what cleaning policy you are using? I see that for 
KEEP_LATEST_FILE_VERSIONS, we call getPartitionPathsForFullCleaning(), within 
which we use file system based listing and not metadata table based listing. 
   >
   >And if you are using KEEP_LATEST_COMMITS with incremental clean mode 
enabled, and there has never been a prior clean, we trigger 
getPartitionPathsForFullCleaning() (within which we use file system based 
listing and not metadata table based listing). 
   >
   >Outside of these cases, we should be hitting only metadata based listing. Can you 
confirm which of the above is your case?
   >
   >
   >-- 
   >Reply to this email directly or view it on GitHub:
   >https://github.com/apache/hudi/issues/6373#issuecomment-1218565541


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
