Ok, that might explain it. I have a topic with very few messages so it
might have empty partitions actually. Thanks!
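
For what it's worth, here is the rough mental model I now have of why one empty partition would matter. It is only a sketch and assumes the watermark seen downstream is the minimum over all inputs, which is how I understand the Beam model for something like Flatten. The class, the method and the NOT_STARTED sentinel are made up for illustration and are not KafkaIO or TextIO code:

```java
// Simplified model only, not Beam's actual implementation: the watermark
// seen downstream is assumed to be the minimum over all input watermarks.
final class WatermarkSketch {
    // Hypothetical stand-in for "this input has not reported a watermark yet".
    static final long NOT_STARTED = Long.MIN_VALUE;

    // Returns the smallest of the given watermarks (epoch millis).
    // With no inputs at all this returns Long.MAX_VALUE.
    static long combined(long... inputWatermarks) {
        long min = Long.MAX_VALUE;
        for (long w : inputWatermarks) {
            min = Math.min(min, w);
        }
        return min;
    }

    public static void main(String[] args) {
        long partition0 = 1_519_150_000_000L; // partition that has records
        long partition1 = NOT_STARTED;        // empty partition, never advances
        long textFiles = 1_519_150_060_000L;  // the file-watching input

        // The single input that never advances holds everything back.
        System.out.println(combined(partition0, partition1, textFiles) == NOT_STARTED);
    }
}
```

If that model is right, event time timers downstream would never fire as long as one input stays at its initial value.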

I’m really curious about the TextIO watermark as well though. I can’t find
the implementation either, so if anyone knows where to look I’d love to get a
pointer.

// Vilhelm

On Tue, 20 Feb 2018 at 19:31, Raghu Angadi <[email protected]> wrote:

> Both sources have to provide a watermark.
>
> - KafkaIO: It can have its watermark stuck if you don't have any records
> in one of the partitions. That is a bug. The management of timestamps and
> watermarks in KafkaIO is being updated in
> https://github.com/apache/beam/pull/4680.
> - TextIO.watchForNewFiles() - I am not sure how the watermark is handled
> by TextIO. I didn't notice any mention of it in the implementation.
>
> On Tue, Feb 20, 2018 at 10:13 AM, Vilhelm von Ehrenheim <
> [email protected]> wrote:
>
>> Hi all!
>> I have a somewhat complicated stateful DoFn that I would like to add an
>> event time timer on. My goal with the timer is to not output anything until
>> a sufficient amount of state has been built up in a Global window.
>>
>> In doing this I realize that the watermark doesn’t seem to progress at
>> all (regardless of the timer) and in Google Dataflow the displayed
>> watermark is just “-” when clicking on the ParDo(DoFn) node.
>>
>> The DoFn is reading flattened input from a TextIO.watchForNewFiles and
>> KafkaIO. The Flatten element had a watermark set.
>>
>> I have written tests for my DoFn that all pass using TestStream, but since
>> I explicitly set the watermark progression there, all is fine.
>>
>> What can I do to look into why there is no watermark progression for a
>> specific PTransform?
>>
>> Regards,
>> Vilhelm von Ehrenheim
>>
>
>
