GitHub user raunakagrawal47 edited a discussion: Not able to run pulsar function locally
I have tried everything to package and run my Python function locally. I tried running Pulsar standalone on my laptop (v2.9.1) as well as running Pulsar in Docker (latest image).

Here is the repo: https://github.com/raunakagrawal47/pulsar-test

I am trying to run a basic Python function with some dependencies attached to it. Steps I followed to run Pulsar locally:

1. Generate whl files for all required dependencies: `pip3 download --only-binary :all: -r requirements.txt -d deps`
2. Zip the contents of the folder: `zip -r format-phone-number.zip . -x test/**\* -x venv/**\* -x .idea/**\*`
3. Run the Pulsar function: `bin/pulsar-admin functions localrun --tenant public --namespace default --py format-phone-number.zip --classname TestEtl.TestEtl --inputs persistent://public/default/in --output persistent://public/default/out`

I tried running the function locally on my laptop against Pulsar standalone, and I also tried copying the zip file into a Pulsar Docker container and running it there:

```
docker cp format-phone-number.zip bbcba9a3f42b:/pulsar
docker exec -it bbcba9a3f42b /bin/bash
```

When running, I get the error below:

```
2022-11-20T14:58:51,194+0000 [main] INFO org.apache.pulsar.functions.runtime.process.ProcessRuntimeFactory - Java instance jar location is not defined, using the location defined in system environment : /pulsar/instances/java-instance.jar
2022-11-20T14:58:51,197+0000 [main] INFO org.apache.pulsar.functions.runtime.process.ProcessRuntimeFactory - Python instance file location is not defined using the location defined in system environment : /pulsar/instances/python-instance/python_instance_main.py
2022-11-20T14:58:51,197+0000 [main] INFO org.apache.pulsar.functions.runtime.process.ProcessRuntimeFactory - No extra dependencies location is defined in either function worker config or system environment
2022-11-20T14:58:51,265+0000 [main] INFO org.apache.pulsar.functions.runtime.RuntimeSpawner - public/default/TestEtl-0 RuntimeSpawner starting function
2022-11-20T14:58:51,266+0000 [main] INFO org.apache.pulsar.functions.runtime.process.ProcessRuntime - Creating function log directory /pulsar/logs/functions/public/default/TestEtl
2022-11-20T14:58:51,267+0000 [main] INFO org.apache.pulsar.functions.runtime.process.ProcessRuntime - Created or found function log directory /pulsar/logs/functions/public/default/TestEtl
2022-11-20T14:58:51,268+0000 [main] INFO org.apache.pulsar.functions.runtime.process.ProcessRuntime - ProcessBuilder starting the process with args python /pulsar/instances/python-instance/python_instance_main.py --py format-phone-number.zip --logging_directory /pulsar/logs/functions --logging_file TestEtl --logging_config_file /pulsar/conf/functions-logging/logging_config.ini --instance_id 0 --function_id 44d3a24d-a564-45ed-a817-9b55d3212351 --function_version 98cd871d-14ee-40d6-8272-fdc411b834d3 --function_details '{"tenant":"public","namespace":"default","name":"TestEtl","className":"TestEtl.TestEtl","runtime":"PYTHON","autoAck":true,"parallelism":1,"source":{"inputSpecs":{"persistent://public/default/in":{}},"cleanupSubscription":true},"sink":{"topic":"persistent://public/default/out","forwardSourceMessageProperty":true},"resources":{"cpu":1.0,"ram":"1073741824","disk":"10737418240"},"componentType":"FUNCTION"}' --pulsar_serviceurl pulsar://localhost:6650 --use_tls false --tls_allow_insecure false --hostname_verification_enabled false --max_buffered_tuples 1024 --port 33167 --metrics_port 42809 --expected_healthcheck_interval 30 --secrets_provider secretsprovider.ClearTextSecretsProvider --cluster_name local
2022-11-20T14:58:51,280+0000 [main] INFO org.apache.pulsar.functions.runtime.process.ProcessRuntime - Started process successfully
2022-11-20 14:58:51.526 INFO [140611343988544] Client:88 | Subscribing on Topic :persistent://public/default/in
2022-11-20 14:58:51.526 INFO [140611343988544] ClientConnection:189 | [<none> -> pulsar://localhost:6650] Create ClientConnection, timeout=10000
2022-11-20 14:58:51.526 INFO [140611343988544] ConnectionPool:96 | Created connection for pulsar://localhost:6650
2022-11-20 14:58:51.526 INFO [140611281073920] ExecutorService:41 | Run io_service in a single thread
2022-11-20 14:58:51.527 INFO [140611281073920] ClientConnection:375 | [127.0.0.1:59164 -> 127.0.0.1:6650] Connected to broker
2022-11-20 14:58:51.535 INFO [140611281073920] HandlerBase:64 | [persistent://public/default/in, public/default/TestEtl, 0] Getting connection from pool
2022-11-20 14:58:51.535 INFO [140611055253248] ExecutorService:41 | Run io_service in a single thread
2022-11-20 14:58:51.540 INFO [140611281073920] ConsumerImpl:224 | [persistent://public/default/in, public/default/TestEtl, 0] Created consumer on broker [127.0.0.1:59164 -> 127.0.0.1:6650]
2022-11-20T14:59:21,662+0000 [function-timer-thread-1-1] ERROR org.apache.pulsar.functions.runtime.process.ProcessRuntime - Health check failed for TestEtl-0
java.util.concurrent.ExecutionException: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) ~[?:?]
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) ~[?:?]
    at org.apache.pulsar.functions.runtime.process.ProcessRuntime.lambda$start$1(ProcessRuntime.java:184) ~[org.apache.pulsar-pulsar-functions-runtime-2.10.1.jar:2.10.1]
    at org.apache.pulsar.common.util.Runnables$CatchingAndLoggingRunnable.run(Runnables.java:54) [org.apache.pulsar-pulsar-common-2.10.1.jar:2.10.1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.77.Final.jar:4.1.77.Final]
    at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
    at io.grpc.Status.asRuntimeException(Status.java:535) ~[io.grpc-grpc-api-1.45.1.jar:1.45.1]
    at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:533) ~[io.grpc-grpc-stub-1.45.1.jar:1.45.1]
    at io.grpc.internal.DelayedClientCall$DelayedListener$3.run(DelayedClientCall.java:463) ~[io.grpc-grpc-core-1.45.1.jar:1.45.1]
    at io.grpc.internal.DelayedClientCall$DelayedListener.delayOrExecute(DelayedClientCall.java:427) ~[io.grpc-grpc-core-1.45.1.jar:1.45.1]
    at io.grpc.internal.DelayedClientCall$DelayedListener.onClose(DelayedClientCall.java:460) ~[io.grpc-grpc-core-1.45.1.jar:1.45.1]
    at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:562) ~[io.grpc-grpc-core-1.45.1.jar:1.45.1]
    at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70) ~[io.grpc-grpc-core-1.45.1.jar:1.45.1]
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:743) ~[io.grpc-grpc-core-1.45.1.jar:1.45.1]
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:722) ~[io.grpc-grpc-core-1.45.1.jar:1.45.1]
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) ~[io.grpc-grpc-core-1.45.1.jar:1.45.1]
    at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133) ~[io.grpc-grpc-core-1.45.1.jar:1.45.1]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    ... 1 more
Caused by: io.grpc.netty.shaded.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: /127.0.0.1:33167
Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
    at io.grpc.netty.shaded.io.netty.channel.unix.Errors.newConnectException0(Errors.java:155) ~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
    at io.grpc.netty.shaded.io.netty.channel.unix.Errors.handleConnectErrno(Errors.java:128) ~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
    at io.grpc.netty.shaded.io.netty.channel.unix.Socket.finishConnect(Socket.java:320) ~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
    at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:710) ~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
    at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:687) ~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
    at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:567) ~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
    at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:470) ~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
    at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) ~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
    at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) ~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
    at io.grpc.netty.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
    at io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
    ... 1 more
2022-11-20T14:59:21,676+0000 [function-timer-thread-1-1] ERROR org.apache.pulsar.functions.runtime.process.ProcessRuntime - Extracted Process death exception
java.lang.RuntimeException:
    at org.apache.pulsar.functions.runtime.process.ProcessRuntime.tryExtractingDeathException(ProcessRuntime.java:400) ~[org.apache.pulsar-pulsar-functions-runtime-2.10.1.jar:2.10.1]
    at org.apache.pulsar.functions.runtime.process.ProcessRuntime.isAlive(ProcessRuntime.java:387) ~[org.apache.pulsar-pulsar-functions-runtime-2.10.1.jar:2.10.1]
    at org.apache.pulsar.functions.runtime.RuntimeSpawner.lambda$start$0(RuntimeSpawner.java:88) ~[org.apache.pulsar-pulsar-functions-runtime-2.10.1.jar:2.10.1]
    at org.apache.pulsar.common.util.Runnables$CatchingAndLoggingRunnable.run(Runnables.java:54) [org.apache.pulsar-pulsar-common-2.10.1.jar:2.10.1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.77.Final.jar:4.1.77.Final]
    at java.lang.Thread.run(Thread.java:829) [?:?]
2022-11-20T14:59:21,677+0000 [function-timer-thread-1-1] ERROR org.apache.pulsar.functions.runtime.RuntimeSpawner - public/default/TestEtl Function Container is dead with following exception. Restarting.
java.lang.RuntimeException:
    at org.apache.pulsar.functions.runtime.process.ProcessRuntime.tryExtractingDeathException(ProcessRuntime.java:400) ~[org.apache.pulsar-pulsar-functions-runtime-2.10.1.jar:2.10.1]
    at org.apache.pulsar.functions.runtime.process.ProcessRuntime.isAlive(ProcessRuntime.java:387) ~[org.apache.pulsar-pulsar-functions-runtime-2.10.1.jar:2.10.1]
    at org.apache.pulsar.functions.runtime.RuntimeSpawner.lambda$start$0(RuntimeSpawner.java:88) ~[org.apache.pulsar-pulsar-functions-runtime-2.10.1.jar:2.10.1]
    at org.apache.pulsar.common.util.Runnables$CatchingAndLoggingRunnable.run(Runnables.java:54) [org.apache.pulsar-pulsar-common-2.10.1.jar:2.10.1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.77.Final.jar:4.1.77.Final]
    at java.lang.Thread.run(Thread.java:829) [?:?]
2022-11-20T14:59:21,684+0000 [function-timer-thread-1-1] INFO org.apache.pulsar.functions.runtime.process.ProcessRuntime - Creating function log directory /pulsar/logs/functions/public/default/TestEtl
2022-11-20T14:59:21,684+0000 [function-timer-thread-1-1] INFO org.apache.pulsar.functions.runtime.process.ProcessRuntime - Created or found function log directory /pulsar/logs/functions/public/default/TestEtl
2022-11-20T14:59:21,684+0000 [function-timer-thread-1-1] INFO org.apache.pulsar.functions.runtime.process.ProcessRuntime - ProcessBuilder starting the process with args python /pulsar/instances/python-instance/python_instance_main.py --py format-phone-number.zip --logging_directory /pulsar/logs/functions --logging_file TestEtl --logging_config_file /pulsar/conf/functions-logging/logging_config.ini --instance_id 0 --function_id 44d3a24d-a564-45ed-a817-9b55d3212351 --function_version 98cd871d-14ee-40d6-8272-fdc411b834d3 --function_details '{"tenant":"public","namespace":"default","name":"TestEtl","className":"TestEtl.TestEtl","runtime":"PYTHON","autoAck":true,"parallelism":1,"source":{"inputSpecs":{"persistent://public/default/in":{}},"cleanupSubscription":true},"sink":{"topic":"persistent://public/default/out","forwardSourceMessageProperty":true},"resources":{"cpu":1.0,"ram":"1073741824","disk":"10737418240"},"componentType":"FUNCTION"}' --pulsar_serviceurl pulsar://localhost:6650 --use_tls false --tls_allow_insecure false --hostname_verification_enabled false --max_buffered_tuples 1024 --port 33167 --metrics_port 42809 --expected_healthcheck_interval 30 --secrets_provider secretsprovider.ClearTextSecretsProvider --cluster_name local
2022-11-20T14:59:21,685+0000 [Timer-0] INFO org.apache.pulsar.functions.LocalRunner - { "failureException": "UNAVAILABLE: io exception", "instanceId": "0" }
```

I tried looking at the logs inside pulsar/logs/functions/public/default/TestEtl. Here is the output:

```
[2022-11-20 15:25:25 +0000] [ERROR] log.py: Could not import User Function Module TestEtl.TestEtl
[2022-11-20 15:25:55 +0000] [INFO] python_instance_main.py: Starting Python instance with Namespace(client_auth_params=None, client_auth_plugin=None, cluster_name='local', dependency_repository=None, expected_healthcheck_interval=30, extra_dependency_repository=None, function_details='{"tenant":"public","namespace":"default","name":"TestEtl","className":"TestEtl.TestEtl","runtime":"PYTHON","autoAck":true,"parallelism":1,"source":{"inputSpecs":{"persistent://public/default/in":{}},"cleanupSubscription":true},"sink":{"topic":"persistent://public/default/out","forwardSourceMessageProperty":true},"resources":{"cpu":1.0,"ram":"1073741824","disk":"10737418240"},"componentType":"FUNCTION"}', function_id='73da1f7a-97bc-4274-942a-377dd0e4727e', function_version='575e5fd1-4dce-4f8c-81d7-62e02c68ec99', hostname_verification_enabled='false', install_usercode_dependencies=None, instance_id='0', logging_config_file='/pulsar/conf/functions-logging/logging_config.ini', logging_directory='/pulsar/logs/functions', logging_file='TestEtl', max_buffered_tuples='1024', metrics_port=35879, port=34267, pulsar_serviceurl='pulsar://localhost:6650', py='format-phone-number.zip', secrets_provider='secretsprovider.ClearTextSecretsProvider', secrets_provider_config=None, state_storage_serviceurl=None, tls_allow_insecure_connection='false', tls_trust_cert_path=None, use_tls='false')
[2022-11-20 15:25:55 +0000] [INFO] util.py: Failed to import class TestEtl.TestEtl from path
[2022-11-20 15:25:55 +0000] [INFO] util.py: No module named 'TestEtl'
Traceback (most recent call last):
  File "/pulsar/instances/python-instance/util.py", line 41, in import_class
    return import_class_from_path(from_path, full_class_name)
  File "/pulsar/instances/python-instance/util.py", line 64, in import_class_from_path
    mod = importlib.import_module(classname_path)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'TestEtl'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/pulsar/instances/python-instance/util.py", line 46, in import_class
    return import_class_from_path(api_dir, full_class_name)
  File "/pulsar/instances/python-instance/util.py", line 64, in import_class_from_path
    mod = importlib.import_module(classname_path)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'TestEtl'
[2022-11-20 15:25:55 +0000] [CRITICAL] python_instance.py: Could not import User Function Module TestEtl.TestEtl
[2022-11-20 15:25:55 +0000] [ERROR] log.py: Traceback (most recent call last):
[2022-11-20 15:25:55 +0000] [ERROR] log.py: File "/pulsar/instances/python-instance/python_instance_main.py", line 218, in <module>
[2022-11-20 15:25:55 +0000] [ERROR] log.py: main()
[2022-11-20 15:25:55 +0000] [ERROR] log.py: File "/pulsar/instances/python-instance/python_instance_main.py", line 199, in main
[2022-11-20 15:25:55 +0000] [ERROR] log.py: pyinstance.run()
[2022-11-20 15:25:55 +0000] [ERROR] log.py: File "/pulsar/instances/python-instance/python_instance.py", line 199, in run
[2022-11-20 15:25:55 +0000] [ERROR] log.py: raise NameError("Could not import User Function Module %s" % self.instance_config.function_details.className)
[2022-11-20 15:25:55 +0000] [ERROR] log.py: NameError
[2022-11-20 15:25:55 +0000] [ERROR] log.py: :
[2022-11-20 15:25:55 +0000] [ERROR] log.py: Could not import User Function Module TestEtl.TestEtl
```

Please help me out with this.

GitHub link: https://github.com/apache/pulsar/discussions/18552

----
This is an automatically sent email for dev@pulsar.apache.org.
To unsubscribe, please send an email to: dev-unsubscr...@pulsar.apache.org
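
Since the function log reports `No module named 'TestEtl'`, one way to narrow the problem down is to check where the module file actually sits inside the zip passed to `--py`. The sketch below is not part of Pulsar: `find_module_entries` and `build_demo_zip` are hypothetical helper names, and the demo archive layout (including the `phonenumbers` requirement) is made up for illustration. Whether a given location is importable depends on the archive layout the Pulsar Python instance expects, which should be checked against the Pulsar packaging documentation.

```python
import io
import zipfile


def find_module_entries(zip_source, classname):
    """Return archive paths that could provide the module part of `classname`.

    `classname` is given as "module.Class" (e.g. "TestEtl.TestEtl"); the
    module part is everything before the last dot.
    """
    module = classname.rsplit(".", 1)[0]
    # A module "a.b" can live in "a/b.py" or in the package "a/b/__init__.py".
    module_path = module.replace(".", "/")
    candidates = (module_path + ".py", module_path + "/__init__.py")
    nested = tuple("/" + candidate for candidate in candidates)
    with zipfile.ZipFile(zip_source) as archive:
        return [
            name
            for name in archive.namelist()
            if name in candidates or name.endswith(nested)
        ]


def build_demo_zip():
    """Build a small in-memory archive with a hypothetical layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("requirements.txt", "phonenumbers\n")  # placeholder
        archive.writestr("src/TestEtl.py", "class TestEtl:\n    pass\n")
    buffer.seek(0)
    return buffer


demo_hits = find_module_entries(build_demo_zip(), "TestEtl.TestEtl")
print(demo_hits)  # ['src/TestEtl.py']
```

Against the real archive, the same helper could be called as `find_module_entries("format-phone-number.zip", "TestEtl.TestEtl")`. An empty result would mean the module file is not in the zip at all, while a non-empty result shows which directory would have to be on the import path.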