2020-10-02 13:28:49 UTC - Emil Chitas: Hi all, I am struggling to use the `docker` container factory on EKS. After deployment, the invoker daemon set creates the pods on all the labeled nodes and the invokers are registered by the controller: https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601645329004000?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 13:33:22 UTC - Emil Chitas: There's no entry in the invoker logs as to an action being invoked. https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601645602004500?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 13:40:11 UTC - Dave Grove: The docker container factory will only work if Kubernetes is really using docker (as opposed to containerd) as the underlying container runtime. I don’t know what EKS does, but many cloud-provided Kubernetes services do not use docker. https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601646011004700?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 13:47:55 UTC - Emil Chitas: I would expect that, but what bugs me is that the invoker never receives any message to start a container https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601646475004900?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 13:49:25 UTC - Dave Grove: I would assume that the invoker is not healthy, because it was not able to use docker to create and run the health action. 
Look in the controller logs to see how many healthy invokers it thinks it has https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601646565005100?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 13:50:10 UTC - Dave Grove: if my guess is right that EKS doesn’t use docker, there will be 0 healthy invokers https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601646610005300?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 13:50:48 UTC - Emil Chitas: ``` [2020-10-02T13:06:58.833Z] [INFO] [#tid_sid_invokerHealth] [] [InvokerActor] invoker3 is up [2020-10-02T13:06:58.833Z] [INFO] [#tid_sid_invokerHealth] [] [InvokerPool] invoker status changed to 0 -> Healthy, 1 -> Healthy, 2 -> Healthy, 3 -> Healthy```
https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601646648005500?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 13:51:31 UTC - Emil Chitas: that's the entry for one of the invokers in the controller log. Wouldn't that mean the invoker is registered as healthy? https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601646691005700?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 13:53:55 UTC - Emil Chitas: but you are right nevertheless, the nodes most likely use containerd. https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601646835005900?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 13:54:26 UTC - Dave Grove: it does look like the controller believes it has healthy invokers, which is suspicious. https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601646866006100?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 13:55:07 UTC - Emil Chitas: exactly, that's my big surprise. I checked the nodes and they are definitely running containerd https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601646907006300?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 13:56:54 UTC - Dave Grove: That sounds like a bug in the health protocol then. Most likely on the invoker side in the docker-specific code, but could be anywhere I guess. 
https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601647014006500?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 14:02:10 UTC - Emil Chitas: yes, most likely: ```[2020-10-02T13:06:53.322Z] [INFO] [#tid_sid_invokerHealth] [] [InvokerReactive] [marker:invoker_activation_start:23693] [2020-10-02T13:06:53.329Z] [WARN] [#tid_sid_invokerHealth] [] [InvokerReactive] revision was not provided for whisk.system/invokerHealthTestAction0 [2020-10-02T13:06:53.483Z] [INFO] [#tid_sid_invokerHealth] [] [CouchDbRestStore] [GET] 'test_whisks' finding document: 'id: whisk.system/invokerHealthTestAction0' [marker:database_getDocument_start:23854] [2020-10-02T13:06:53.497Z] [INFO] [#tid_sid_invokerHealth] [] [CouchDbRestStore] [marker:database_getDocument_finish:23868:14] [2020-10-02T13:06:53.599Z] [INFO] [#tid_sid_invokerHealth] [] [ContainerPool] containerStart containerState: cold container: None activations: 1 of max 1 action: invokerHealthTestAction0 namespace: whisk.system activationId: 99847b6c63c04ba5847b6c63c05ba5d2 [marker:invoker_containerStart.cold_counter:23970] [2020-10-02T13:06:53.656Z] [INFO] [#tid_sid_invokerHealth] [] [DockerClientWithFileAccess] running /usr/bin/docker run -d --cpu-shares 11 --memory 128m --memory-swap 128m --network bridge -e __OW_API_HOST=<http://openwhisk-nginx.ow.svc.cluster.local> -e __OW_ALLOW_CONCURRENT=false .................. 
--ulimit nofile=1024:1024 --pids-limit 1024 --log-driver json-file openwhisk/action-nodejs-v10:nightly (timeout: 1 minute) [marker:invoker_docker.run_start:24027] [2020-10-02T13:06:54.400Z] [INFO] [#tid_sid_invokerHealth] [] [DockerClientWithFileAccess] [marker:invoker_docker.run_finish:24770:743] [2020-10-02T13:06:54.415Z] [INFO] [#tid_sid_invokerHealth] [] [DockerClientWithFileAccess] running /usr/bin/docker inspect --format {{.NetworkSettings.Networks.bridge.IPAddress}} f108336bd3f03509bc36e94a958e18e8b523cf70e2b5636327f013b7b2d97d42 (timeout: 1 minute) [marker:invoker_docker.inspect_start:24786] [2020-10-02T13:06:54.536Z] [INFO] [#tid_sid_invokerHealth] [] [DockerClientWithFileAccess] [marker:invoker_docker.inspect_finish:24907:121] [2020-10-02T13:06:54.538Z] [INFO] [#tid_sid_invokerHealth] [] [DockerClientWithFileAccess] running /usr/bin/docker rm -f f108336bd3f03509bc36e94a958e18e8b523cf70e2b5636327f013b7b2d97d42 (timeout: 1 minute) [marker:invoker_docker.rm_start:24907] [2020-10-02T13:06:54.601Z] [INFO] [#tid_sid_invokerHealth] [] [CouchDbRestStore] [PUT] 'test_activations' saving document: 'id: whisk.system/99847b6c63c04ba5847b6c63c05ba5d2, rev: null' [marker:database_saveDocument_start:24971] [2020-10-02T13:06:54.604Z] [INFO] [#tid_sid_dbBatcher] [] [CouchDbRestStore] 'test_activations' saving 1 documents [marker:database_saveDocumentBulk_start:4907] [2020-10-02T13:06:54.640Z] [INFO] [#tid_sid_invokerHealth] [] [MessagingActiveAck] posted combined of activation 99847b6c63c04ba5847b6c63c05ba5d2 [2020-10-02T13:06:54.653Z] [INFO] [#tid_sid_dbBatcher] [] [CouchDbRestStore] [marker:database_saveDocumentBulk_finish:4957:48] [2020-10-02T13:06:54.663Z] [INFO] [#tid_sid_invokerHealth] [] [CouchDbRestStore] [marker:database_saveDocument_finish:25034:62] [2020-10-02T13:06:54.861Z] [INFO] [#tid_sid_invokerHealth] [] [DockerClientWithFileAccess] [marker:invoker_docker.rm_finish:25231:323]``` 
https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601647330006700?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 14:03:56 UTC - Emil Chitas: thank you for the clarifications https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601647436007100?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 14:05:50 UTC - Dave Grove: ok if I grab that log section from the invoker to put in an issue? We really should have detected that the health action didn’t run correctly https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601647550007300?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 14:06:41 UTC - Emil Chitas: if you want, I can provide you the full logs from that try https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601647601007500?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 14:07:41 UTC - Dave Grove: cool. thanks. if you don’t mind opening an issue and posting the log in there, it would help me remember to take a look at this. +1 : Emil Chitas https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601647661007700?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 14:07:54 UTC - Emil Chitas: will do, thanks https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601647674008000?thread_ts=1601645329.004000&cid=C4J3R7JFL ---- 2020-10-02 18:30:01 UTC - Lixiang Ao: makes sense. thank you! https://openwhisk-team.slack.com/archives/C4J3R7JFL/p1601663401008400?thread_ts=1601494447.001000&cid=C4J3R7JFL ----
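Editor's note on the EKS thread above: Dave's suggestion to check the controller logs for the number of healthy invokers can be scripted. The following is a minimal sketch, assuming the `[InvokerPool] invoker status changed to ...` line has exactly the shape Emil pasted; the function names `parse_invoker_status` and `healthy_count` are hypothetical helpers, not part of OpenWhisk. As the thread shows, this count only reflects what the controller believes, so a result of 4 does not prove the invokers can actually run containers.

```python
import re

# Matches the status summary that the controller logs, as quoted in the thread.
# Assumption: the summary is the last thing on the log line.
STATUS_RE = re.compile(r"\[InvokerPool\] invoker status changed to (.*)$")


def parse_invoker_status(line):
    """Return {invoker_index: state} for an InvokerPool status line, else {}."""
    match = STATUS_RE.search(line)
    if match is None:
        return {}
    statuses = {}
    for entry in match.group(1).split(","):
        # Each entry looks like "0 -> Healthy".
        index, _, state = entry.partition("->")
        statuses[int(index.strip())] = state.strip()
    return statuses


def healthy_count(statuses):
    """Count the invokers that the controller reports as Healthy."""
    return sum(1 for state in statuses.values() if state == "Healthy")


# Log line copied from the controller log excerpt in the thread.
line = (
    "[2020-10-02T13:06:58.833Z] [INFO] [#tid_sid_invokerHealth] [] [InvokerPool] "
    "invoker status changed to 0 -> Healthy, 1 -> Healthy, 2 -> Healthy, 3 -> Healthy"
)

if __name__ == "__main__":
    statuses = parse_invoker_status(line)
    print(statuses, healthy_count(statuses))
```

To use it on a real deployment, feed it the lines of the controller log and look at the most recent non-empty result.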