Balaji Varadarajan created HUDI-308:
---------------------------------------
Summary: Avoid Renames for tracking state transitions of all
actions on dataset
Key: HUDI-308
URL: https://issues.apache.org/jira/browse/HUDI-308
Project: Apache Hudi (incubating)
Issue Type: Improvement
Components: Common Core
Reporter: Balaji Varadarajan
Fix For: 0.5.1
Currently, we employ renames when transitioning between the states (REQUESTED,
INFLIGHT, COMPLETED) of all actions in Hudi.
The idea is to always create a new file for each state of an action
(commit, compaction, clean, ...) that is being performed, and have the Timeline
management resolve conflicts when loading them from the .hoodie folder. The
archiving logic will clean up transient state files and archive terminal state
files.
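As a rough illustration of the proposal above, the sketch below shows how a
timeline loader might pick the most advanced state when several state files
exist for the same action. It is not Hudi's implementation: the file naming
(<instant>.<action>.requested, <instant>.<action>.inflight, and
<instant>.<action> for completed) and the helper names parse_state_file,
resolve_timeline and transient_files are assumptions made for this example.

```python
from enum import IntEnum
from typing import Dict, Iterable, List, Tuple


class State(IntEnum):
    """Action states, ordered so that a larger value is further along."""
    REQUESTED = 0
    INFLIGHT = 1
    COMPLETED = 2


def parse_state_file(name: str) -> Tuple[str, str, State]:
    """Split a state file name into (instant, action, state).

    Assumed (hypothetical) naming: '<instant>.<action>' means COMPLETED,
    '<instant>.<action>.<suffix>' carries a 'requested' or 'inflight' suffix.
    """
    parts = name.split(".")
    if len(parts) == 2:
        return parts[0], parts[1], State.COMPLETED
    if len(parts) == 3 and parts[2] in ("requested", "inflight"):
        return parts[0], parts[1], State[parts[2].upper()]
    raise ValueError(f"unrecognized state file name: {name}")


def resolve_timeline(file_names: Iterable[str]) -> List[Tuple[str, str, State]]:
    """Keep only the most advanced state seen for each (instant, action)."""
    latest: Dict[Tuple[str, str], State] = {}
    for name in file_names:
        instant, action, state = parse_state_file(name)
        key = (instant, action)
        if key not in latest or state > latest[key]:
            latest[key] = state
    return [(instant, action, state)
            for (instant, action), state in sorted(latest.items())]


def transient_files(file_names: Iterable[str]) -> List[str]:
    """List non-terminal files whose action already has a COMPLETED file.

    These are the files an archiving step could delete in this sketch.
    """
    names = list(file_names)
    completed = {(i, a) for i, a, s in resolve_timeline(names)
                 if s is State.COMPLETED}
    result = []
    for name in names:
        instant, action, state = parse_state_file(name)
        if (instant, action) in completed and state is not State.COMPLETED:
            result.append(name)
    return result
```

Because every transition only adds a file, a reader that lists the folder
halfway through a transition still sees a consistent, if slightly older,
state, and no rename is required.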
This handling will be done consistently for all kinds of actions on datasets.
As part of this project, we will clean up unnecessary fields in the metadata,
version them, and standardize on avro/json.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)