Prashant Wason created HUDI-1717:
------------------------------------
Summary: Metadata Table reader does not show correct view of the
metadata
Key: HUDI-1717
URL: https://issues.apache.org/jira/browse/HUDI-1717
Project: Apache Hudi
Issue Type: Bug
Reporter: Prashant Wason
Dataset timeline: C1 C2 C3 Compaction.inflight C4 C5
Metadata timeline: DC1 DC2 DC3. (DC=deltaCommit)
Assume the dataset timeline has some completed commits (C1, C2 ... C5) and an
async compaction operation in progress. Also assume that the metadata table is
synced only up to C3.
The MetadataTableWriter will not sync any more instants to the Metadata Table
because the next instant on the dataset timeline (Compaction.inflight) is
incomplete.
The MetadataReader uses the same sync logic to perform its in-memory merge of
the timeline. Hence, the reader will also not consider C4 and C5, and so it
provides an incorrect, older view of the FileSlices and FileGroups.
Any future ingestion into this table MAY insert data into older versions of the
FileSlices, which will appear as data loss when queried.
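
The behaviour described above can be illustrated with a small, self-contained
sketch. This is not Hudi code: the Instant class and the instants_to_sync
function are hypothetical names invented for this example, and the only
assumption taken from the report is that syncing stops at the first incomplete
instant after the last synced one.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Instant:
    """Hypothetical stand-in for a timeline instant (not a Hudi class)."""
    name: str
    completed: bool


def instants_to_sync(timeline: List[Instant], last_synced: str) -> List[Instant]:
    """Return the completed instants after `last_synced`, stopping at the
    first incomplete instant (the assumed sync rule from the report)."""
    names = [instant.name for instant in timeline]
    start = names.index(last_synced) + 1
    result = []
    for instant in timeline[start:]:
        if not instant.completed:
            break  # an inflight instant blocks everything after it
        result.append(instant)
    return result


# Dataset timeline from the report: C1 C2 C3 Compaction.inflight C4 C5
dataset_timeline = [
    Instant("C1", True),
    Instant("C2", True),
    Instant("C3", True),
    Instant("Compaction", False),  # async compaction still inflight
    Instant("C4", True),
    Instant("C5", True),
]

# The metadata table is synced up to C3 (DC1 DC2 DC3).
synced = ["C1", "C2", "C3"]

# Writer and reader share the same rule, so both see nothing new to merge.
pending = instants_to_sync(dataset_timeline, last_synced="C3")
reader_view = synced + [instant.name for instant in pending]
print(reader_view)
```

Under this rule the reader's merged view stops at C3 even though C4 and C5 are
complete on the dataset timeline, which is the stale view the report describes.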