StateFun scalability

Martijn de Heus Thu, 04 Feb 2021 12:51:41 -0800

Hi all,

I’ve been working with StateFun for a while on my university project. I am now 
trying to increase the number of StateFun workers and the parallelism, but 
this barely seems to increase the throughput of my system.


I have 5000 function instances in my system during my tests. When I increase 
the workers from 1 to 3, I see a significant increase in throughput, but going 
from 3 to 5 (or even to 7) gives no further increase. I run all workers with 
4 CPUs and made sure that Kafka and my deployed colocated functions are not 
causing any bottlenecks. I also have many partitions for the ingress topics.

I attached my flink-conf.yaml below. Is this expected behaviour for StateFun, or 
am I missing some configuration that could improve performance? Also, if this 
is expected for StateFun, what could be causing it?

Best regards,

Martijn


jobmanager.rpc.address: statefun-master
taskmanager.numberOfTaskSlots: 1
blob.server.port: 6124
jobmanager.rpc.port: 6123
taskmanager.rpc.port: 6122
classloader.parent-first-patterns.additional: org.apache.flink.statefun;org.apache.kafka;com.google.protobuf
state.checkpoints.dir: file:///checkpoint-dir
state.backend: rocksdb
state.backend.rocksdb.timer-service.factory: ROCKSDB
state.backend.incremental: true
execution.checkpointing.interval: 10sec
execution.checkpointing.mode: EXACTLY_ONCE
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 2147483647
restart-strategy.fixed-delay.delay: 1sec
jobmanager.memory.process.size: 1g
taskmanager.memory.process.size: 1g
parallelism.default: 5
