xiearthur commented on issue #12661:
URL: https://github.com/apache/hudi/issues/12661#issuecomment-2599509180

   Yes, we did execute the job in streaming mode by setting:
   
   ```java
   options.put(FlinkOptions.READ_AS_STREAMING.key(), "true");
   options.put(FlinkOptions.READ_START_COMMIT.key(), "20240116000000"); // specific timestamp
   options.put(FlinkOptions.READ_STREAMING_CHECK_INTERVAL.key(), "5");
   ```
   
   We've found that:
   1. With `earliest`: the job continuously reads both historical and new data.
   2. With a specific timestamp: the job only reads data written up to the Flink job start time and misses new data written after that.
   
   We tried several configurations but still could not get continuous reads to work with a specific timestamp. The job only picks up new data when `READ_START_COMMIT` is set to `earliest`. Is there any other configuration we should set to make it work with a specific timestamp?
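
For reference, the two configurations compared above can be sketched side by side as below. This is only a minimal, self-contained illustration: the class `StreamingReadOptions` and its `build` helper are hypothetical names, and the string keys are assumed to be the values behind `FlinkOptions.READ_AS_STREAMING`, `FlinkOptions.READ_START_COMMIT` and `FlinkOptions.READ_STREAMING_CHECK_INTERVAL`. In the real job the `FlinkOptions` constants are used directly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hudi: builds the read options used in the two test runs.
class StreamingReadOptions {
    // Assumed string keys of the FlinkOptions constants used in the job.
    static final String READ_AS_STREAMING = "read.streaming.enabled";
    static final String READ_START_COMMIT = "read.start-commit";
    static final String READ_STREAMING_CHECK_INTERVAL = "read.streaming.check-interval";

    // Returns the streaming-read options for a given start commit
    // ("earliest" or a commit timestamp such as "20240116000000").
    static Map<String, String> build(String startCommit) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put(READ_AS_STREAMING, "true");
        options.put(READ_START_COMMIT, startCommit);
        options.put(READ_STREAMING_CHECK_INTERVAL, "5");
        return options;
    }

    public static void main(String[] args) {
        // Case 1: historical and new data are both read continuously.
        System.out.println(build("earliest"));
        // Case 2: only data written before the Flink job start is read.
        System.out.println(build("20240116000000"));
    }
}
```

The only difference between the two runs is the value passed for the start commit; every other option is identical.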


