2019-10-15 21:07:36 UTC - Quentin ADAM: @Quentin ADAM has joined the channel ---- 2019-10-15 22:40:02 UTC - Addison Higham: hrm, getting an interesting issue with the k8s functions worker when using a "source" function, the command for the pod is: ``` /pulsar/bin/pulsar-admin --admin-url https://10.11.12.154:8443 functions download --tenant restructure-test --namespace cg-logs2 --name cg-logs-iad --destination-file /pulsar/ && SHARD_ID=${POD_NAME##*-} && echo shardId=${SHARD_ID} && exec java -cp /pulsar/instances/java-instance.jar:/pulsar/instances/deps/* -Dpulsar.functions.extra.dependencies.dir=/pulsar/instances/deps -Dpulsar.functions.instance.classpath=/pulsar/conf:::/pulsar/lib/*:/pulsar/offloaders/* -Dlog4j.configurationFile=kubernetes_instance_log4j2.xml -Dpulsar.function.log.dir=logs/functions/restructure-test/cg-logs2/cg-logs-iad -Dpulsar.function.log.file=cg-logs-iad-$SHARD_ID -Xmx1073741824 org.apache.pulsar.functions.instance.JavaInstanceMain --jar /pulsar/ --instance_id $SHARD_ID ... ``` ---- 2019-10-15 22:42:21 UTC - Addison Higham: notice that the `download` just goes to `/pulsar/` and then the `--jar /pulsar/` argument has no file name. The error I think I get from the running function is basically that `/pulsar/` is a directory, and it immediately crashes ---- 2019-10-15 22:44:09 UTC - Addison Higham: tracing the code, it looks like that path comes from the `pulsarRootDir` + `FunctionMetaData.packageLocation.originalFileName`, so I assume with a source the `originalFileName` isn't working because it is a builtin source? ---- 2019-10-15 22:44:21 UTC - Addison Higham: @Jerry Peng ^^ if you have any insight ---- 2019-10-15 22:51:14 UTC - Addison Higham: happy to help get a fix out... 
just want to make sure it isn't something dumb I am doing ---- 2019-10-16 00:14:15 UTC - Jerry Peng: @Addison Higham built-in sources and sinks should be handled correctly ---- 2019-10-16 00:14:31 UTC - Jerry Peng: for the kubernetes runtime ---- 2019-10-16 00:21:02 UTC - Jerry Peng: @Addison Higham what is the filename of the built-in source you are trying to start? ---- 2019-10-16 00:21:42 UTC - Jerry Peng: also run ./bin/pulsar-admin sources get --name NAME --namespace NS --tenant TENANT ---- 2019-10-16 00:21:48 UTC - Jerry Peng: to get the metadata for the source ---- 2019-10-16 00:23:00 UTC - Jerry Peng: there should be a field called “originalFileName” ---- 2019-10-16 00:38:36 UTC - Addison Higham: Afk for a bit, will look in a minute, it is a built-in source ---- 2019-10-16 01:17:48 UTC - Addison Higham: ``` $ pulsar-admin sources get --tenant restructure-test --namespace cg-logs2 --name cg-logs-iad { "tenant": "restructure-test", "namespace": "cg-logs2", "name": "cg-logs-iad", "className": "org.apache.pulsar.io.kinesis.KinesisSource", "topicName": "persistent://restructure-test/cg-logs2/iad", "configs": { ... }, "parallelism": 3, "processingGuarantees": "ATLEAST_ONCE", "resources": { "cpu": 1.0, "ram": 1073741824, "disk": 10737418240 }, "archive": "builtin://kinesis" } ``` ---- 2019-10-16 01:21:06 UTC - Addison Higham: although that doesn't look to be the top level `FunctionMetaData` proto, so not sure if it will be included in that? ---- 2019-10-16 01:28:09 UTC - Addison Higham: trying to pull the raw record off the `metadata` topic ---- 2019-10-16 01:34:27 UTC - Jerry Peng: btw the originalFileName should be something like “pulsar-io-datagenerator-0.0.1.nar”, i.e. the name of the NAR file for the source/sink in your connectors directory ---- 2019-10-16 01:46:28 UTC - Addison Higham: I recreated the source and perhaps that fixed it. 
That actually makes sense now looking at the code, I had created it prior to flipping over to the k8s function worker ---- 2019-10-16 01:47:18 UTC - Addison Higham: okay yeah, it now has a valid value set for the `originalFileName`. Still not working yet, getting an unauthorized error, but I can figure out what is going on there ---- 2019-10-16 05:32:20 UTC - Addison Higham: okay, got auth working, but there is a bug in the RBAC definitions in the docs: they don't include the secrets permissions you need ---- 2019-10-16 05:35:48 UTC - Addison Higham: but now the problem: I mount my CA cert in via a volume, and when the functions worker goes to run, it obviously doesn't have that volume mounted and so it can't get the CA cert.
I have this code, <https://github.com/instructure/pulsar/commit/bc2e8d4a348ffa4a4d04be8d9f350c767e92be09>, in progress, that is specifically to handle stuff like this... but I am wondering if fetching the CA and sending it to the client would make sense.. you could do it via a secrets provider if that supported files instead of just env vars ---- 2019-10-16 05:36:37 UTC - Addison Higham: I think for now, I may just use that new interface to add an init-container that will fetch the CA, trying to avoid baking the CAs into images ---- 2019-10-16 05:57:13 UTC - Jerry Peng: @Addison Higham I think the best approach is to add another implementation for: <https://github.com/apache/pulsar/blob/master/pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/auth/KubernetesFunctionAuthProvider.java> that supports authentication with a CA instead of tokens. ---- 2019-10-16 05:57:46 UTC - Jerry Peng: Currently, the FunctionAuthProvider is hard coded but all the interfaces are there to make it pluggable ---- 2019-10-16 06:23:40 UTC - Addison Higham: I am using tokens, but I also have TLS enabled and am using a custom CA ---- 2019-10-16 06:24:41 UTC - Addison Higham: So I need both the tokens and the CA cert, so that it can validate the broker cert ---- 2019-10-16 06:26:22 UTC - Addison Higham: Maybe the token function auth provider should also just see if you have a trust cert path set and copy that into the function auth data :thinking_face: ----
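Editor's sketch of the idea in the last message above (a token auth provider that also checks for a trust cert path and copies the cert into the function auth data). This is only an illustration in Python, not Pulsar's actual `FunctionAuthProvider` code: the function name `build_function_auth_data`, the dict layout, and the `token` / `ca.crt` keys are all assumptions made up for the example.

```python
import base64
import os
from typing import Dict, Optional


def build_function_auth_data(token: str, trust_cert_path: Optional[str]) -> Dict[str, str]:
    """Hypothetical helper: bundle the auth token and, if configured, the CA cert.

    The returned dict stands in for whatever "function auth data" the runtime
    would ship to the function pod (for example as a Kubernetes secret).
    """
    # The token is always included (hypothetical key name).
    auth_data = {"token": token}

    # Only copy the CA cert when a trust cert path is set and the file exists.
    if trust_cert_path and os.path.isfile(trust_cert_path):
        with open(trust_cert_path, "rb") as cert_file:
            # Base64-encode the raw bytes so they can travel as a string value.
            auth_data["ca.crt"] = base64.b64encode(cert_file.read()).decode("ascii")

    return auth_data
```

With something like this, the function pod would receive the CA cert together with the token and could validate the broker cert without the worker's volume mount or a CA baked into the image. When no trust cert path is set, the result contains only the token.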