In case of reading from input files, at the EOF event, readers will send
Watermark(Long.MAX_VALUE) on all of the output edges and those watermarks will
be propagated accordingly. So your ACK operator will get
Watermark(Long.MAX_VALUE) only when it gets it from ALL of it’s input edges.
When read
Hi,
If an operator has multiple inputs, it’s watermark will be the minimum of all
of the inputs. Thus your hypothetical “ACK Operator” will get
Watermark(Long.MAX_VALUE) only when of the preceding operators report
Watermark(Long.MAX_VALUE).
Yes, instead of simply adding sink, you would have t
Hi Piotrek,
Thank you for your detailed answer.
Yes, I want to generate the ack when all the records of the file are
written to DB.
So to understand what you are saying , we will receive a single EOF
watermark value at the ack operator when all the downstream operator
process all the records of
Hi,
As you figured out, some dummy EOF record is one solution, however you might
try to achieve it also by wrapping an existing CSV function. Your wrapper could
emit this dummy EOF record. Another (probably better) idea is to use
Watermark(Long.MAX_VALUE) for the EOF marker. Stream source and/o
Hi Guys,
Following is how my pipeline looks (DataStream API) :
[1] Read the data from the csv file
[2] KeyBy it by some id
[3] Do the enrichment and write it to DB
[1] reads the data in sequence as it has single parallelism and then I have
default parallelism for the other operators.
I want to