[ https://issues.apache.org/jira/browse/BEAM-14267?focusedWorklogId=753696&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-753696 ]

ASF GitHub Bot logged work on BEAM-14267:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Apr/22 21:17
            Start Date: 06/Apr/22 21:17
    Worklog Time Spent: 10m
      Work Description:
asf-ci commented on PR #17305:
URL: https://github.com/apache/beam/pull/17305#issuecomment-1090817476

   Can one of the admins verify this patch?


Issue Time Tracking
-------------------

    Worklog Id:     (was: 753696)
    Time Spent: 0.5h  (was: 20m)

> Update watchForNewFiles to allow reading already read files with a new timestamp
> --------------------------------------------------------------------------------
>
>                 Key: BEAM-14267
>                 URL: https://issues.apache.org/jira/browse/BEAM-14267
>             Project: Beam
>          Issue Type: New Feature
>          Components: io-java-files
>            Reporter: Yi Hu
>            Assignee: Yi Hu
>            Priority: P2
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In TextIO and AvroIO, we have a configuration option called watchForNewFiles,
> and in FileIO.MatchConfiguration, we have an option called watchInterval.
> Right now, these match any files according to the filtering criteria, and
> then periodically check for new files. A file is determined to be new if it
> has a different filename than a file that has already been read.
> We want to add an option to choose to consider a file new if it has a
> different timestamp from an existing file, even if the file itself has the
> same name.
> See the following design doc for more detail:
> [https://docs.google.com/document/d/1xnacyLGNh6rbPGgTAh5D1gZVR8rHUBsMMRV3YkvlL08/edit?usp=sharing&resourcekey=0-be0uF-DdmwAz6Vg4Li9FNw]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
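
For illustration only, the difference between the current behavior (a file is new if its filename has not been seen) and the proposed option (a file is also new if its timestamp differs from the one last seen under that name) can be sketched in plain Java. This is not Beam code and does not reflect the actual change in PR #17305: the names NewFileDetector, FileInfo, matchUpdatedFiles and poll are hypothetical, chosen only for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the two "is this file new?" rules; not part of Beam. */
public class NewFileDetector {

    /** Hypothetical stand-in for one entry of a directory listing. */
    static final class FileInfo {
        final String name;
        final long lastModifiedMillis;

        FileInfo(String name, long lastModifiedMillis) {
            this.name = name;
            this.lastModifiedMillis = lastModifiedMillis;
        }
    }

    // false: new means "unseen filename" (current behavior described in the issue).
    // true:  new also means "same filename, different timestamp" (proposed option).
    private final boolean matchUpdatedFiles;

    // Filename -> timestamp recorded when the file was last emitted.
    private final Map<String, Long> lastSeen = new HashMap<>();

    NewFileDetector(boolean matchUpdatedFiles) {
        this.matchUpdatedFiles = matchUpdatedFiles;
    }

    /** Returns the files from one periodic listing that count as new. */
    List<FileInfo> poll(List<FileInfo> listing) {
        List<FileInfo> fresh = new ArrayList<>();
        for (FileInfo f : listing) {
            Long previous = lastSeen.get(f.name);
            boolean isNew = previous == null
                || (matchUpdatedFiles && previous.longValue() != f.lastModifiedMillis);
            if (isNew) {
                lastSeen.put(f.name, f.lastModifiedMillis);
                fresh.add(f);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        NewFileDetector detector = new NewFileDetector(true);
        // First listing: a.txt is seen for the first time.
        System.out.println(detector.poll(Arrays.asList(new FileInfo("a.txt", 1000L))).size());
        // Second listing: same name, newer timestamp.
        System.out.println(detector.poll(Arrays.asList(new FileInfo("a.txt", 2000L))).size());
    }
}
```

With matchUpdatedFiles set to false, the second listing in main would yield nothing, because only the filename is compared. Keeping a map from filename to last-seen timestamp, rather than a set of names, is one simple way to support both rules with the same state.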