[ 
https://issues.apache.org/jira/browse/HUDI-349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-349:
----------------------------
    Sprint: Cont' improve -  2022/02/07

> Make cleaner retention based on time period to account for higher deviations 
> in ingestion runs
> ----------------------------------------------------------------------------------------------
>
>                 Key: HUDI-349
>                 URL: https://issues.apache.org/jira/browse/HUDI-349
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: cleaning, writer-core
>            Reporter: Balaji Varadarajan
>            Assignee: sivabalan narayanan
>            Priority: Major
>              Labels: core-flow-ds, new-to-hudi, pull-request-available, 
> sev:high
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Cleaning by commits is based on the number of commits to be retained. Ingestion
> time can vary across runs due to various factors. To bound the maximum running
> time of a query and to provide a consistent retention period, it is better to
> use a retention config based on time (e.g. 12h).
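
To illustrate the difference described above, here is a minimal sketch that compares count-based and time-based retention on an irregular ingestion schedule. It is not Hudi's cleaner implementation: the function names `retained_by_count` and `retained_by_hours`, the sample timestamps, and the 12-hour window are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

def retained_by_count(commit_times, num_commits):
    """Hypothetical helper: keep the newest `num_commits` commits, whatever their age."""
    if num_commits <= 0:
        return []
    return sorted(commit_times)[-num_commits:]

def retained_by_hours(commit_times, now, hours):
    """Hypothetical helper: keep every commit made within the last `hours` hours."""
    cutoff = now - timedelta(hours=hours)
    return [t for t in sorted(commit_times) if t >= cutoff]

# Assumed sample data: three slow runs, then four runs 15 minutes apart.
now = datetime(2022, 2, 7, 12, 0)
hours_ago = (13, 11, 9, 1.0, 0.75, 0.5, 0.25)
commits = [now - timedelta(hours=h) for h in hours_ago]

by_count = retained_by_count(commits, 3)
by_hours = retained_by_hours(commits, now, 12)

# How far back each policy reaches depends on the ingestion cadence
# for the count-based policy, but not for the time-based one.
print("count-based window:", now - by_count[0])
print("time-based window: ", now - by_hours[0])
```

With these assumed timestamps, retaining 3 commits covers only the burst of recent runs, so a query that started a few hours ago could lose the file versions it is reading. The time-based policy keeps everything inside the window regardless of how many commits landed in it.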



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
