[
https://issues.apache.org/jira/browse/HUDI-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lin Liu updated HUDI-7102:
--------------------------
Description:
Issue:
# Given a TIMESTAMP_AS_OF value, the query returns a list of file slices.
However, the slices are selected by their base file timestamp only, so a
returned slice may still contain log files whose timestamps are later than the
provided timestamp.
# As a result, when the logs are merged in reverse order, these unqualified log
files may be encountered first. That triggers the "break" operation, and no
merging is done.
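The failure described above can be reproduced with a toy model. This is only a sketch and not Hudi's reader code: `LogFile` and `merge_reverse_with_break` are hypothetical names, and instants are simplified to plain integers.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LogFile:
    name: str
    instant: int  # simplified commit timestamp (assumption: an integer)


def merge_reverse_with_break(logs: List[LogFile], as_of: int) -> List[str]:
    """Scan logs newest-first and stop at the first log newer than as_of."""
    merged = []
    for log in sorted(logs, key=lambda l: l.instant, reverse=True):
        if log.instant > as_of:
            break  # models the "break" operation from the issue
        merged.append(log.name)
    return merged


# The slice was picked by its base file timestamp, so it still carries a
# log file (instant 130) that is newer than the requested timestamp (120).
slice_logs = [LogFile("log.1", 110), LogFile("log.2", 120), LogFile("log.3", 130)]
result = merge_reverse_with_break(slice_logs, as_of=120)
print(result)  # [] : the newest log is seen first, so nothing is merged
```

In this model the qualified logs `log.1` and `log.2` are never merged, because the scan stops on the first (newest) log file.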
Solution:
# The first solution is to filter the log files, as well as the base files, by
the provided timestamp when building the file slices.
# The second solution is to skip these unqualified log files and keep merging.
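The two options can be compared in a small sketch. It is illustrative only: `filter_then_merge` and `skip_and_merge` are hypothetical helpers, logs are modeled as `(name, instant)` tuples with integer instants, and nothing here reflects Hudi's actual classes.

```python
from typing import List, Tuple

Log = Tuple[str, int]  # (file name, simplified integer instant)


def filter_then_merge(logs: List[Log], as_of: int) -> List[str]:
    """Option 1: drop logs newer than as_of before the reverse scan starts."""
    qualified = [log for log in logs if log[1] <= as_of]
    ordered = sorted(qualified, key=lambda log: log[1], reverse=True)
    return [name for name, _ in ordered]


def skip_and_merge(logs: List[Log], as_of: int) -> List[str]:
    """Option 2: keep the reverse scan, but skip unqualified logs."""
    merged = []
    for name, instant in sorted(logs, key=lambda log: log[1], reverse=True):
        if instant > as_of:
            continue  # skip instead of break, then keep merging
        merged.append(name)
    return merged


logs = [("log.1", 110), ("log.2", 120), ("log.3", 130)]
option_1 = filter_then_merge(logs, as_of=120)
option_2 = skip_and_merge(logs, as_of=120)
print(option_1, option_2)
```

In this toy model both options merge the same qualified logs. The difference is where the check lives: option 1 changes what a file slice contains, while option 2 changes the scan loop.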
Risk:
* It is unclear whether changing the current behavior would introduce new bugs.
was:
The issue is:
# Given a TIMESTAMP_AS_OF value, the query returns a list of file slices.
However, the slices are selected by their base file timestamp only, so a
returned slice may still contain log files whose timestamps are later than the
provided timestamp.
# As a result, when the logs are merged in reverse order, these unqualified log
files may be encountered first. That triggers the "break" operation, and no
merging is done.
Solution:
# The first solution is to filter the log files as well as the base files for
the file slices. It is unclear whether any other logic would be affected.
# The second solution is to skip these unqualified log files and keep
merging. It is unclear whether any existing processing logic depends on this
"break" logic.
> A bug for the time travel queries for MOR tables
> ------------------------------------------------
>
> Key: HUDI-7102
> URL: https://issues.apache.org/jira/browse/HUDI-7102
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Lin Liu
> Assignee: Lin Liu
> Priority: Major
> Fix For: 1.0.0
>