Hi Sigalit,

First of all, did you make sure that each source operator uses a different consumer group id for its Kafka source? Do the Flink jobs share the same data, or does each job consume the data independently?
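For reference, a minimal sketch of giving each job its own Kafka consumer group, assuming the newer KafkaSource connector (with the legacy FlinkKafkaConsumer the group is set via "group.id" in the consumer Properties instead). The broker address, topic, and group name below are just placeholders:

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class DistinctGroupIdExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Each job gets its own group.id, so every job reads the full topic
        // independently instead of splitting the partitions with the other jobs.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")          // hypothetical broker address
                .setTopics("input-topic")                   // hypothetical topic
                .setGroupId("pipeline-a-consumer-group")    // distinct per job
                .setStartingOffsets(OffsetsInitializer.latest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> stream =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source");

        stream.print();
        env.execute("pipeline-a");
    }
}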
Moreover, is your job back pressured? You might need to break the operator chain (see the sketch at the end of this message) to check whether the sink is back-pressuring the source and limiting its throughput. Last but not least, is your source already at 100% CPU usage? If so, the source operator has already reached its maximum throughput.

Best
Yun Tang
________________________________
From: Sigalit Eliazov <e.siga...@gmail.com>
Sent: Thursday, April 7, 2022 19:12
To: user <user@flink.apache.org>
Subject: flink pipeline handles very small amount of messages in a second (only 500)

Hi all,

I would appreciate some help understanding the pipeline behaviour.

We deployed a standalone Flink cluster. The pipelines are deployed via the JobManager REST API. We have 5 task managers with 1 slot each. In total I am deploying 5 pipelines, which mainly read from Kafka, perform a simple object conversion, and either write back to Kafka, publish to GCP Pub/Sub, or save to the DB. These jobs run "forever", and basically each task manager runs a specific job (this is how Flink scheduled it).

We have a test that sends 10K messages per second to Kafka, but according to the metrics exposed by Flink, the relevant job handles only 500 messages per second. I would expect all 10K to be handled, so I guess the setup is not correct.

The messages are in Avro format. Currently we are not using checkpoints at all.

Any suggestions are welcome.

Thanks a lot,
Sigalit
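To illustrate the chain-breaking suggestion above: a minimal sketch that disables operator chaining so each operator runs as its own task and back pressure shows up per operator in the web UI. The source, map logic, and names here are placeholders, not the actual pipeline:

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ChainBreakExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Option 1: break every operator chain in the job, so each operator
        // is a separate task and reports its own back-pressure metrics.
        env.disableOperatorChaining();

        DataStream<String> input = env.fromElements("a", "b", "c"); // placeholder source

        // Option 2: break the chain only around one operator.
        input.map(new MapFunction<String, String>() {
                    @Override
                    public String map(String value) {
                        return value.toUpperCase();
                    }
                })
                .disableChaining() // this map runs in its own task, not chained to the source or the sink
                .print();

        env.execute("chain-break-example");
    }
}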