Jingsong Lee created FLINK-28244:
------------------------------------

             Summary: Introduce changelog file for DataFile
                 Key: FLINK-28244
                 URL: https://issues.apache.org/jira/browse/FLINK-28244
             Project: Flink
          Issue Type: New Feature
          Components: Table Store
            Reporter: Jingsong Lee
             Fix For: table-store-0.2.0


When using TableStore to support stream consumption, there are two requirements.
 * Downstream gets all changelogs
 * The order of stream consumption is the order of input

For an append-only table, it is easy to meet both.

But for a primary key table, the files are all sorted and de-duplicated by 
pk, making it impossible to meet the above requirements.

We can output an additional ChangelogFile whenever a DataFile is flushed, and 
the stream can read it directly.
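
A minimal sketch of this idea, under stated assumptions: FlushSketch, Change and FlushResult are hypothetical names for illustration only, not Table Store classes, and in-memory lists stand in for the real files. On flush, the buffered changes are written twice: once sorted and de-duplicated by pk (the DataFile), and once untouched in input order (the ChangelogFile).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not the actual Table Store writer.
class FlushSketch {

    // One change record: kind is e.g. "+I", "+U" or "-D"; pk is the primary key.
    static final class Change {
        final String kind;
        final int pk;
        final String value;

        Change(String kind, int pk, String value) {
            this.kind = kind;
            this.pk = pk;
            this.value = value;
        }

        @Override
        public String toString() {
            return kind + "(" + pk + ", " + value + ")";
        }
    }

    // What a single flush produces in this sketch.
    static final class FlushResult {
        final List<Change> dataFile;      // sorted and de-duplicated by pk
        final List<Change> changelogFile; // every change, in input order

        FlushResult(List<Change> dataFile, List<Change> changelogFile) {
            this.dataFile = dataFile;
            this.changelogFile = changelogFile;
        }
    }

    private final List<Change> buffer = new ArrayList<>();

    void write(Change change) {
        buffer.add(change);
    }

    FlushResult flush() {
        // TreeMap keeps one entry per pk (the latest wins) and iterates in pk order.
        Map<Integer, Change> latestByPk = new TreeMap<>();
        for (Change change : buffer) {
            latestByPk.put(change.pk, change);
        }
        FlushResult result =
                new FlushResult(new ArrayList<>(latestByPk.values()), new ArrayList<>(buffer));
        buffer.clear();
        return result;
    }

    public static void main(String[] args) {
        FlushSketch writer = new FlushSketch();
        writer.write(new Change("+I", 2, "a"));
        writer.write(new Change("+I", 1, "b"));
        writer.write(new Change("+U", 2, "c"));
        FlushResult result = writer.flush();
        System.out.println("data file:      " + result.dataFile);
        System.out.println("changelog file: " + result.changelogFile);
    }
}
```

In this sketch the data file ends up with two records (pk 1, then pk 2 with its latest value), so the first insert for pk 2 and the original input order are lost there. The changelog copy keeps all three changes in the order they arrived, which is what a streaming reader needs.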

We can modify DataFileMeta:
{code:java}
class DataFileMeta {
    String fileName;
    .....
    // Stores the suffixes of extra files. Extra files include changelog_file,
    // primary_key_index_file, secondary_index_file, etc.
    List<String> extraFiles;
}
{code}
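
One way a reader might resolve extra files from such metadata is sketched below. This is only an illustration: the class name DataFileMetaSketch, the ".changelog" suffix and the rule "extra file name = data file name + suffix" are all assumptions, since the issue does not specify the naming scheme.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the proposed metadata, not the actual DataFileMeta.
class DataFileMetaSketch {

    // Assumed suffix for changelog files; the real value is not given in the issue.
    static final String CHANGELOG_SUFFIX = ".changelog";

    final String fileName;
    final List<String> extraFiles; // suffixes only, as proposed above

    DataFileMetaSketch(String fileName, List<String> extraFiles) {
        this.fileName = fileName;
        this.extraFiles = new ArrayList<>(extraFiles);
    }

    // Assumption: an extra file is named by appending its suffix to the data file name.
    List<String> extraFileNames() {
        List<String> names = new ArrayList<>();
        for (String suffix : extraFiles) {
            names.add(fileName + suffix);
        }
        return names;
    }

    // Returns the changelog file name, or empty if this data file has none.
    Optional<String> changelogFileName() {
        return extraFiles.contains(CHANGELOG_SUFFIX)
                ? Optional.of(fileName + CHANGELOG_SUFFIX)
                : Optional.empty();
    }
}
```

Storing only suffixes keeps the metadata small, and a data file written without a changelog simply has an empty list, so the streaming reader can check for the changelog before deciding how to read.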



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
