2020-07-27 09:15:54 UTC - Jatin Bansal: Hi all, I am bursting the controller with 1 million requests, but it gives a server-down error after some requests. Can anyone help with what the ideal controller count would be to handle such a load? https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595841354152200?thread_ts=1595841354.152200&cid=C3TPCAQG1 ---- 2020-07-27 09:18:29 UTC - Dominic Kim: Are you sending 1 million requests per second? https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595841509152300?thread_ts=1595841354.152200&cid=C3TPCAQG1 ---- 2020-07-27 12:43:00 UTC - Jatin Bansal: nope, a burst of 1000 requests https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595853780152500?thread_ts=1595841354.152200&cid=C3TPCAQG1 ---- 2020-07-27 12:43:13 UTC - Jatin Bansal: ```The server is currently unavailable (because it is overloaded or down for maintenance```
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595853793152700?thread_ts=1595841354.152200&cid=C3TPCAQG1 ---- 2020-07-27 12:43:28 UTC - Jatin Bansal: getting this after some time https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595853808152900?thread_ts=1595841354.152200&cid=C3TPCAQG1 ---- 2020-07-27 13:16:31 UTC - Dominic Kim: Generally to support high RPS, you need more invokers and Kafka nodes. https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595855791153100?thread_ts=1595841354.152200&cid=C3TPCAQG1 ---- 2020-07-27 13:17:01 UTC - Dominic Kim: I observed around 14000 RPS with 10 invoker VMs. https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595855821153500?thread_ts=1595841354.152200&cid=C3TPCAQG1 ---- 2020-07-27 13:17:26 UTC - Dominic Kim: Each VM has 10GB memory for runtime containers. https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595855846153900?thread_ts=1595841354.152200&cid=C3TPCAQG1 ---- 2020-07-27 13:18:07 UTC - Dominic Kim: Depending on the TPS you want to achieve, the bottleneck points can vary. https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595855887154100?thread_ts=1595841354.152200&cid=C3TPCAQG1 ---- 2020-07-27 13:18:52 UTC - Dominic Kim: For example, for relatively low TPS such as 10K~20K, I think 3~5 Kafka nodes would be enough, while you need many more invoker nodes. https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595855932154300?thread_ts=1595841354.152200&cid=C3TPCAQG1 ---- 2020-07-27 13:19:15 UTC - Dominic Kim: But for higher TPS, Kafka will become a bottleneck and you need to add more Kafka nodes as well. https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595855955154500?thread_ts=1595841354.152200&cid=C3TPCAQG1 ---- 2020-07-27 13:19:46 UTC - Dominic Kim: I could achieve such TPS with a relatively small number of controllers. 
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595855986154700?thread_ts=1595841354.152200&cid=C3TPCAQG1 ---- 2020-07-27 13:20:00 UTC - Dominic Kim: I used around 3 controllers. https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595856000154900?thread_ts=1595841354.152200&cid=C3TPCAQG1 ---- 2020-07-27 13:21:07 UTC - Dominic Kim: Since the results can vary greatly depending on your server spec, the number of servers, etc., I recommend testing with different numbers of servers and sets of components. https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595856067155100?thread_ts=1595841354.152200&cid=C3TPCAQG1 ---- 2020-07-27 15:03:25 UTC - Jatin Bansal: Can anyone please explain the flow of how the controller checks for healthy invokers? What is the basis for that? https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595862205155500?thread_ts=1595862205.155500&cid=C3TPCAQG1 ---- 2020-07-27 15:04:32 UTC - Jatin Bansal: Can you please explain how you increased Kafka nodes? https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595862272155600?thread_ts=1595841354.152200&cid=C3TPCAQG1 ---- 2020-07-27 15:46:46 UTC - Rodric Rabbah: each invoker sends a periodic ping - this informs the controller that it exists <https://github.com/apache/openwhisk/blob/ef33823a1d22179133999f7cd628202cd0498a5a/core/controller/src/main/scala/org/apache/openwhisk/core/loadBalancer/InvokerSupervision.scala#L89-L99> An invoker is represented by a state machine. When a new invoker registers, it is tested for health and recorded as usable/healthy or not accordingly. See <https://github.com/apache/openwhisk/blob/ef33823a1d22179133999f7cd628202cd0498a5a/core/controller/src/main/scala/org/apache/openwhisk/core/loadBalancer/InvokerSupervision.scala#L52-L66> https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595864806155800?thread_ts=1595862205.155500&cid=C3TPCAQG1 ---- 2020-07-27 16:00:22 UTC - Rodric Rabbah: This allows a system to scale the compute capacity, by adding invokers for example. 
Also, if an invoker goes offline, requests are routed to another one in the system. https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595865622156100?thread_ts=1595862205.155500&cid=C3TPCAQG1 ---- 2020-07-27 17:13:19 UTC - Jatin Bansal: Can you also explain how invokerHealthTestAction works, and after how long the pods it creates get destroyed? My invoker instances number around 30, but the health test action pods are reaching 350. https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595869999156700?thread_ts=1595862205.155500&cid=C3TPCAQG1 ---- 2020-07-27 17:14:21 UTC - Jatin Bansal: yes I am deploying on kube https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595870061157000?thread_ts=1594975386.119300&cid=C3TPCAQG1 ---- 2020-07-27 17:15:37 UTC - Jatin Bansal: Is there a way to implement auto scaling for Kafka as well? https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595870137157200?thread_ts=1594975386.119300&cid=C3TPCAQG1 ---- 2020-07-27 17:21:33 UTC - Jatin Bansal: Also, if the invoker is pinging the controller periodically, why is invokerHealthTestAction getting created? https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1595870493157400?thread_ts=1595862205.155500&cid=C3TPCAQG1 ----
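Note: the ping-plus-health-test flow Rodric describes in the thread can be illustrated with a minimal Python sketch. This is not the OpenWhisk implementation (that lives in the linked InvokerSupervision.scala). The class names, the three state names, the `ping_timeout` value, and the `health_test_result` method are all assumptions made up for illustration. It only shows the idea that a ping tells the controller an invoker exists, while a separate health test decides whether it is recorded as usable.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    # Hypothetical state names, chosen for this sketch only.
    OFFLINE = "offline"      # no recent ping seen
    UNHEALTHY = "unhealthy"  # pinging, but health test not (yet) passed
    HEALTHY = "healthy"      # pinging and health test passed


@dataclass
class Invoker:
    state: State = State.OFFLINE
    last_ping: float = float("-inf")


@dataclass
class InvokerPool:
    ping_timeout: float = 10.0  # assumed value in seconds, not taken from OpenWhisk
    invokers: dict = field(default_factory=dict)

    def ping(self, invoker_id, now):
        """Record a ping; an unknown or offline invoker becomes UNHEALTHY until tested."""
        inv = self.invokers.setdefault(invoker_id, Invoker())
        inv.last_ping = now
        if inv.state is State.OFFLINE:
            inv.state = State.UNHEALTHY

    def health_test_result(self, invoker_id, ok):
        """Apply the outcome of a health test to an invoker that is still pinging."""
        inv = self.invokers[invoker_id]
        if inv.state is not State.OFFLINE:
            inv.state = State.HEALTHY if ok else State.UNHEALTHY

    def tick(self, now):
        """Mark invokers OFFLINE when their last ping is older than the timeout."""
        for inv in self.invokers.values():
            if now - inv.last_ping > self.ping_timeout:
                inv.state = State.OFFLINE

    def usable(self):
        """Invokers that requests may be routed to."""
        return sorted(i for i, inv in self.invokers.items()
                      if inv.state is State.HEALTHY)
```

In this sketch a freshly registered invoker is not usable until `health_test_result(..., True)` is recorded, and an invoker that stops pinging drops out of `usable()` on the next `tick`, so requests go to the remaining healthy invokers.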