Hi Flink community,

I’m encountering an issue with PyFlink where a FlatMap operator invokes an 
external service (using a PyTorch model to generate embedding vectors). The 
operator processes data very slowly, which leads to an extremely long start 
delay for the initial checkpoint and eventually causes checkpoint failures. 
The external service has strict concurrency limits and cannot handle more 
parallel requests, so increasing the parallelism of the operator did not 
improve performance. In addition, the operator seems to process data faster 
on Flink 1.20.0 than on Flink 2.0.0. Does anyone have any clue? Thank you 
for your insights!
