Hi all,

I’ve been working with StateFun for a while for my university project. I am now trying to increase the number of StateFun workers and the parallelism, but this barely seems to increase the throughput of my system.
I have 5000 function instances in my system during my tests. When I increase the workers from 1 to 3 I notice a significant increase in throughput, but from 3 to 5 (or even to 7) I notice no increase. I run all workers with 4 CPUs and made sure that Kafka and my deployed colocated functions are not causing any bottlenecks. I also have many partitions for the ingress topics. I attached my flink-conf.yaml below.

Is this expected behaviour for StateFun, or am I missing some configuration that could improve my performance? Also, if this is expected for StateFun, what could be causing it?

Best regards,
Martijn

jobmanager.rpc.address: statefun-master
taskmanager.numberOfTaskSlots: 1
blob.server.port: 6124
jobmanager.rpc.port: 6123
taskmanager.rpc.port: 6122
classloader.parent-first-patterns.additional: org.apache.flink.statefun;org.apache.kafka;com.google.protobuf
state.checkpoints.dir: file:///checkpoint-dir
state.backend: rocksdb
state.backend.rocksdb.timer-service.factory: ROCKSDB
state.backend.incremental: true
execution.checkpointing.interval: 10sec
execution.checkpointing.mode: EXACTLY_ONCE
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 2147483647
restart-strategy.fixed-delay.delay: 1sec
jobmanager.memory.process.size: 1g
taskmanager.memory.process.size: 1g
parallelism.default: 5