Hello,

I have a streaming event-processing job that looks like this:

*Source - ProcessFn(3 in total) - Sink*

I am seeing a delay of 50 ms to 250 ms between operators (possibly due to
buffering or serialization/deserialization), which makes end-to-end
processing slow. What could be the reason for such high latency?

Some more details:
- The source operator receives events continuously at a rate of 200 to 300
events per minute in controlled tests.
- A DataStream<POJO> is passed between the operators. The POJO has
simple-typed fields plus the input payload received from the source as a
byte[] field. The payload is currently a few KB in size.
- The same events are also processed by another Flink job that looks like
*source-processFn(mongoWriter)-sink*. There the end-to-end processing time
is less than 5 ms, and a similar DataStream<POJO> is passed between the
operators.
- The original (problematic) pipeline has extraction, validation, and
transformation process functions, but each of these steps completes within
a couple of ms. I calculate the processing time inside these process
functions with *endTime - startTime* logic in the Java code, and the
average time an event spends inside an operator is just 1 ms.
- No backpressure is shown in the Flink UI for these operators.
- Input events flow continuously from the source at a very high rate
without any delays, so waiting on the buffer can be ruled out.
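
The *endTime - startTime* measurement described above can be sketched
roughly as follows. This is a minimal, Flink-free illustration and not the
actual job code: LatencyProbe, Event, runStage and the field names are
hypothetical, and the stage logic is a no-op placeholder. It separates the
time spent inside a stage from the gap between stages, assuming both ends
of a hand-over run in the same JVM.

```java
import java.util.function.UnaryOperator;

public class LatencyProbe {

    /** Hypothetical stand-in for the POJO carried between operators. */
    public static class Event {
        public byte[] payload;
        public long lastEmitNanos;     // when the previous stage handed the event on
        public long totalInsideNanos;  // accumulated time spent inside stage logic
        public long totalBetweenNanos; // accumulated time spent between stages

        public Event(byte[] payload) {
            this.payload = payload;
            this.lastEmitNanos = System.nanoTime(); // treated as "emitted by the source"
        }
    }

    /**
     * Runs one stage and records two durations: the gap since the previous
     * stage emitted the event, and the time spent in this stage's own logic.
     * System.nanoTime() values are only comparable within a single JVM, so
     * stages running in different processes would need wall-clock timestamps
     * and synchronized clocks instead (an assumption of this sketch).
     */
    public static Event runStage(Event e, UnaryOperator<byte[]> logic) {
        long startTime = System.nanoTime();
        e.totalBetweenNanos += startTime - e.lastEmitNanos;

        e.payload = logic.apply(e.payload);

        long endTime = System.nanoTime();
        e.totalInsideNanos += endTime - startTime;
        e.lastEmitNanos = endTime;
        return e;
    }

    public static void main(String[] args) {
        Event e = new Event(new byte[2048]); // payload of a few KB
        runStage(e, p -> p); // extraction (placeholder logic)
        runStage(e, p -> p); // validation (placeholder logic)
        runStage(e, p -> p); // transformation (placeholder logic)
        System.out.printf("inside=%d us, between=%d us%n",
                e.totalInsideNanos / 1_000, e.totalBetweenNanos / 1_000);
    }
}
```

Comparing the two totals shows whether the latency sits in the process
functions themselves or in the hand-over between them.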

Regards,
Kartik
