I have a Dataflow pipeline that reads data from JDBC and Pub/Sub. My ideal pipeline backfills its state and output from historical data via the JDBC input, and then continues processing new elements arriving via Pub/Sub. Conceptually this seems easy to do: filter each source against a specific cutoff instant, keeping JDBC elements from before the cutoff and Pub/Sub elements from on or after it.
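For concreteness, the shape of the pipeline is roughly the following. The connection settings, query, subscription name, and the eventTimeOf helper are placeholders standing in for the real ones:

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.coders.StringUtf8Coder;
    import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
    import org.apache.beam.sdk.io.jdbc.JdbcIO;
    import org.apache.beam.sdk.transforms.Filter;
    import org.apache.beam.sdk.transforms.Flatten;
    import org.apache.beam.sdk.values.PCollection;
    import org.apache.beam.sdk.values.PCollectionList;
    import org.joda.time.Instant;

    public class BackfillThenStream {
      // Cutoff separating historical (JDBC) data from live (Pub/Sub) data.
      private static final Instant CUTOFF = Instant.parse("2020-01-01T00:00:00Z");

      // Placeholder: the real code parses the event timestamp out of the payload.
      private static Instant eventTimeOf(String payload) {
        return Instant.parse(payload.substring(0, 20));
      }

      public static void main(String[] args) {
        Pipeline p = Pipeline.create();

        // Backfill branch: bounded historical read, keep rows before the cutoff.
        PCollection<String> historical =
            p.apply("ReadJdbc",
                    JdbcIO.<String>read()
                        .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
                            "org.postgresql.Driver", "jdbc:postgresql://host/db"))
                        .withQuery("SELECT payload FROM events")
                        .withRowMapper(rs -> rs.getString("payload"))
                        .withCoder(StringUtf8Coder.of()))
                .apply("BeforeCutoff", Filter.by(s -> eventTimeOf(s).isBefore(CUTOFF)));

        // Live branch: unbounded read, keep messages at or after the cutoff.
        PCollection<String> live =
            p.apply("ReadPubsub",
                    PubsubIO.readStrings().fromSubscription(
                        "projects/my-project/subscriptions/my-sub"))
                .apply("AfterCutoff", Filter.by(s -> !eventTimeOf(s).isBefore(CUTOFF)));

        // Merge the branches; the downstream logic consumes the union.
        PCollection<String> merged =
            PCollectionList.of(historical).and(live)
                .apply("FlattenSources", Flatten.pCollections());

        // ... keyed stateful DoFn with looping timers, then sinks ...
        p.run();
      }
    }

The transforms downstream of merged (the keyed stateful DoFn with looping timers, then the sinks) are identical in every variant described below.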
However, when I add Pub/Sub into the pipeline, it runs in streaming mode and no longer produces the expected results: all of the outputs that should be driven by the looping timers are missing. I first suspected the Flatten that merges the two inputs, but after taking Pub/Sub out of the equation entirely and running the exact same JDBC-only pipeline in batch vs. streaming mode, the streaming run still produces the same partial results.
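In case it helps, here is a stripped-down version of the comparison I ran. The looping-timer DoFn is heavily simplified (the real one also keeps per-key state), and the Create input, key, interval, and stop bound are placeholders standing in for the JDBC read:

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.options.StreamingOptions;
    import org.apache.beam.sdk.state.TimeDomain;
    import org.apache.beam.sdk.state.Timer;
    import org.apache.beam.sdk.state.TimerSpec;
    import org.apache.beam.sdk.state.TimerSpecs;
    import org.apache.beam.sdk.transforms.Create;
    import org.apache.beam.sdk.transforms.DoFn;
    import org.apache.beam.sdk.transforms.ParDo;
    import org.apache.beam.sdk.transforms.WithKeys;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.TimestampedValue;
    import org.joda.time.Duration;
    import org.joda.time.Instant;

    public class StreamingModeRepro {

      // Placeholder bound so the timer stops re-arming after input ends.
      private static final Instant STOP = Instant.parse("2020-01-01T01:00:00Z");

      // Simplified looping timer: emits one tick per key per minute of event time.
      static class LoopingTimerFn extends DoFn<KV<String, String>, String> {
        @TimerId("loop")
        private final TimerSpec loopSpec = TimerSpecs.timer(TimeDomain.EVENT_TIME);

        @ProcessElement
        public void process(ProcessContext c, @TimerId("loop") Timer loop) {
          // Each element (re)arms the timer one minute past its timestamp.
          loop.set(c.timestamp().plus(Duration.standardMinutes(1)));
        }

        @OnTimer("loop")
        public void onLoop(OnTimerContext c, @TimerId("loop") Timer loop) {
          c.output("tick@" + c.timestamp());
          Instant next = c.timestamp().plus(Duration.standardMinutes(1));
          if (next.isBefore(STOP)) {
            loop.set(next);  // keep looping until the stop bound
          }
        }
      }

      public static void main(String[] args) {
        StreamingOptions options =
            PipelineOptionsFactory.fromArgs(args).withValidation().as(StreamingOptions.class);
        // Toggle this (or pass --streaming=true/false) to run the otherwise
        // identical bounded pipeline in streaming vs. batch mode.
        options.setStreaming(true);

        Pipeline p = Pipeline.create(options);
        // Create stands in for the JDBC read; timestamps mimic historical rows.
        p.apply(Create.timestamped(
                TimestampedValue.of("a", Instant.parse("2020-01-01T00:00:00Z")),
                TimestampedValue.of("b", Instant.parse("2020-01-01T00:05:00Z"))))
            .apply(WithKeys.of("k"))  // state and timers require a keyed PCollection
            .apply(ParDo.of(new LoopingTimerFn()));
        p.run().waitUntilFinish();
      }
    }

In batch mode this emits the expected ticks; in streaming mode most of them never appear. What could be happening?

Regards,
Raman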