Great to hear!

Regards,
Dian
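For anyone who finds this thread later: below is a minimal sketch of the kind of test discussed here, a scalar UDF exercised against a local TableEnvironment. It assumes PyFlink 1.14 installed via `pip install apache-flink==1.14.4` in a clean virtual environment, exactly as in the steps further down; the sample rows and the `add_one` UDF are illustrative and are not taken from the pyflink-faq repo.

```
import unittest

from pyflink.table import DataTypes, EnvironmentSettings, TableEnvironment
from pyflink.table.expressions import col
from pyflink.table.udf import udf


class TableTests(unittest.TestCase):

    def setUp(self):
        # A local TableEnvironment backed by an embedded mini cluster.
        self.t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
        # Single-threaded execution keeps the test small and deterministic.
        self.t_env.get_config().get_configuration().set_string("parallelism.default", "1")

    def test_scalar_function(self):
        # A trivial scalar UDF; a real project would import the function under test instead.
        add_one = udf(lambda i: i + 1, result_type=DataTypes.BIGINT())

        table = self.t_env.from_elements([(1,), (2,), (3,)], ['a'])
        with table.select(add_one(col('a'))).execute().collect() as results:
            actual = sorted(row[0] for row in results)

        self.assertEqual(actual, [2, 3, 4])


if __name__ == '__main__':
    unittest.main()
```

If the file is saved as `test_table_api.py`, it can be run with `python3 -m unittest test_table_api.TableTests.test_scalar_function`, which matches the command used in the steps below.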
On Tue, Apr 26, 2022 at 4:11 AM John Tipper <john_tip...@hotmail.com> wrote:

> Hi Dian,
>
> I've tried this and it works nicely, on both MacOS and Windows, thank you
> very much indeed for your help.
>
> Kind regards,
>
> John
> ------------------------------
> *From:* Dian Fu <dian0511...@gmail.com>
> *Sent:* 25 April 2022 02:42
> *To:* John Tipper <john_tip...@hotmail.com>
> *Cc:* user@flink.apache.org <user@flink.apache.org>
> *Subject:* Re: Unit testing PyFlink SQL project
>
> Hi John,
>
> I'm also using MacOS. These are the steps I'm following, which I have run
> successfully:
> 1) python3 -m venv .venv
> 2) source .venv/bin/activate
> 3) pip install apache-flink==1.14.4
> 4) python -c "import pyflink;import os;print(os.path.dirname(os.path.abspath(pyflink.__file__))+'/log')"
> It will print something like this:
> "/Users/dianfu/code/src/github/pyflink-faq/testing/.venv/lib/python3.8/site-packages/pyflink/log"
> 5) check the structure of the installed package:
> ```
> (.venv) (base) dianfu@B-7174MD6R-1908 testing % ls -lh /Users/dianfu/code/src/github/pyflink-faq/testing/.venv/lib/python3.8/site-packages/pyflink/
> total 136
> -rw-r--r-- 1 dianfu staff 1.3K Apr 25 09:26 README.txt
> -rw-r--r-- 1 dianfu staff 1.9K Apr 25 09:26 __init__.py
> drwxr-xr-x 11 dianfu staff 352B Apr 25 09:26 __pycache__
> drwxr-xr-x 25 dianfu staff 800B Apr 25 09:26 bin
> drwxr-xr-x 21 dianfu staff 672B Apr 25 09:26 common
> drwxr-xr-x 13 dianfu staff 416B Apr 25 09:26 conf
> drwxr-xr-x 20 dianfu staff 640B Apr 25 09:26 datastream
> drwxr-xr-x 4 dianfu staff 128B Apr 25 09:26 examples
> -rw-r--r-- 1 dianfu staff 3.2K Apr 25 09:26 find_flink_home.py
> drwxr-xr-x 25 dianfu staff 800B Apr 25 09:26 fn_execution
> -rw-r--r-- 1 dianfu staff 9.1K Apr 25 09:26 gen_protos.py
> -rw-r--r-- 1 dianfu staff 7.6K Apr 25 09:26 java_gateway.py
> drwxr-xr-x 11 dianfu staff 352B Apr 25 09:26 lib
> drwxr-xr-x 28 dianfu staff 896B Apr 25 09:26 licenses
> drwxr-xr-x 4 dianfu staff 128B Apr 25 09:26 log
> drwxr-xr-x 5 dianfu staff 160B Apr 25 09:26 metrics
> drwxr-xr-x 4 dianfu staff 128B Apr 25 09:26 opt
> drwxr-xr-x 11 dianfu staff 352B Apr 25 09:26 plugins
> -rw-r--r-- 1 dianfu staff 1.3K Apr 25 09:26 pyflink_callback_server.py
> -rw-r--r-- 1 dianfu staff 12K Apr 25 09:26 pyflink_gateway_server.py
> -rw-r--r-- 1 dianfu staff 5.3K Apr 25 09:26 serializers.py
> -rw-r--r-- 1 dianfu staff 7.9K Apr 25 09:26 shell.py
> drwxr-xr-x 31 dianfu staff 992B Apr 25 09:26 table
> drwxr-xr-x 6 dianfu staff 192B Apr 25 09:26 util
> -rw-r--r-- 1 dianfu staff 1.1K Apr 25 09:26 version.py
> ```
> 6) Execute command `python3 -m unittest test_table_api.TableTests.test_scalar_function`
> The output is as follows and you can see that it executes successfully:
> ```
> (.venv) (base) dianfu@B-7174MD6R-1908 testing % python3 -m unittest test_table_api.TableTests.test_scalar_function
> Using %s as FLINK_HOME...
> /Users/dianfu/code/src/github/pyflink-faq/testing/.venv/lib/python3.8/site-packages/pyflink
> Skipped download /Users/dianfu/code/src/github/pyflink-faq/testing/flink-python_2.11-1.14.4-tests.jar since it already exists.
> /Users/dianfu/miniconda3/lib/python3.8/subprocess.py:946: ResourceWarning: subprocess 71018 is still running
>   _warn("subprocess %s is still running" % self.pid,
> ResourceWarning: Enable tracemalloc to get the object allocation traceback
> Downloading jar org.apache.flink:flink-table-planner_2.11:1.14.4:jar:tests
> /Users/dianfu/code/src/github/pyflink-faq/testing/.venv/lib/python3.8/site-packages/pyflink/table/table_environment.py:538: DeprecationWarning: Deprecated in 1.10. Use create_table instead.
>   warnings.warn("Deprecated in 1.10. Use create_table instead.", DeprecationWarning)
> .
> ----------------------------------------------------------------------
> Ran 1 test in 32.746s
>
> OK
> ```
>
> I have also tried your commands and run into the same error. I believe the
> difference comes from `python setup.py install` vs `pip install
> apache-flink==1.14.4`. When installing with the command `python setup.py
> install`, the structure of the installed package is a little different
> from the one produced by `pip install apache-flink==1.14.4`. I will dig
> into this and share the results when I have some findings.
>
> Before that, could you try to create a new clean virtual environment and
> see whether the steps I'm following work for you?
>
> Regards,
> Dian
>
> On Mon, Apr 25, 2022 at 6:04 AM John Tipper <john_tip...@hotmail.com>
> wrote:
>
> And now, when I add further dependencies to the classpath to remove all
> ClassNotFound exceptions, I get a different error that I don't understand
> (*TypeError: Could not found the Java class 'EnvironmentSettings.inStreamingMode'.*),
> see the logs below:
>
> $ python test_table_api.py TableTests.test_scalar_function
>
> Using %s as FLINK_HOME...
> /Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink
>
> Skipped download /Users/john/PycharmProjects/pyflink-faq/testing/flink-python_2.11-1.14.4-tests.jar since it already exists.
>
> Skipped download /Users/john/PycharmProjects/pyflink-faq/testing/flink-python_2.11-1.14.4.jar since it already exists.
>
> Skipped download /Users/john/PycharmProjects/pyflink-faq/testing/flink-shaded-guava-30.1.1-jre-15.0.jar since it already exists.
> > > /Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-shaded-guava-30.1.1-jre-15.0.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-java-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-table-planner_2.11-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-table-common-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/jsr305-1.3.9.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-python_2.11-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-python_2.11-1.14.4-tests.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-clients_2.11-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-shaded-force-shading-14.0.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-table-runtime_2.11-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/commons-compress-1.21.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/slf4j-api-1.7.15.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-streaming-java_2.11-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-core-1.14.4.jar > > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". > > SLF4J: Defaulting to no-operation (NOP) logger implementation > > SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further > details. 
>
> /usr/local/Cellar/python@3.8/3.8.13/Frameworks/Python.framework/Versions/3.8/lib/python3.8/subprocess.py:946: ResourceWarning: subprocess 45753 is still running
>   _warn("subprocess %s is still running" % self.pid,
> ResourceWarning: Enable tracemalloc to get the object allocation traceback
>
> E
>
> ======================================================================
> ERROR: test_scalar_function (__main__.TableTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/Users/john/PycharmProjects/pyflink-faq/testing/test_utils.py", line 135, in setUp
>     self.t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
>   File "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/table/environment_settings.py", line 267, in in_streaming_mode
>     get_gateway().jvm.EnvironmentSettings.inStreamingMode())
>   File "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/util/exceptions.py", line 185, in wrapped_call
>     raise TypeError(
> *TypeError: Could not found the Java class 'EnvironmentSettings.inStreamingMode'.* The Java dependencies could be specified via command line argument '--jarfile' or the config option 'pipeline.jars'
>
> ----------------------------------------------------------------------
> Ran 1 test in 0.401s
>
> FAILED (errors=1)
>
> sys:1: ResourceWarning: unclosed file <_io.BufferedWriter name=4>
>
> ------------------------------
> *From:* John Tipper <john_tip...@hotmail.com>
> *Sent:* 24 April 2022 20:48
> *To:* Dian Fu <dian0511...@gmail.com>
> *Cc:* user@flink.apache.org <user@flink.apache.org>
> *Subject:* Re: Unit testing PyFlink SQL project
>
> I get a bit further when I add all of the transitive dependencies to the
> classpath, where I download these by calling mvn twice:
>
> mkdir -p out
>
> mvn org.apache.maven.plugins:maven-dependency-plugin:2.10:copy -Dartifact=org.apache.flink:flink-python_2.11:1.14.4:pom -DoutputDirectory=$(pwd)/out
>
> mvn org.apache.maven.plugins:maven-dependency-plugin:2.10:copy-dependencies -Dartifact=org.apache.flink:flink-python_2.11:1.14.4 -DoutputDirectory=$(pwd)/out -f out/flink-python_2.11-1.14.4.pom
>
> I then copied and modified your `construct_test_classpath()` to add all of
> these resulting jars to the classpath. I now get a further exception:
>
> $ python test_table_api.py TableTests.test_scalar_function
>
> Using %s as FLINK_HOME...
> /Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink
>
> Skipped download /Users/john/PycharmProjects/pyflink-faq/testing/flink-python_2.11-1.14.4-tests.jar since it already exists.
>
> Skipped download /Users/john/PycharmProjects/pyflink-faq/testing/flink-python_2.11-1.14.4.jar since it already exists.
> > > /Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-runtime-1.14.4-tests.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-java-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-table-planner_2.11-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-table-common-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/mockito-core-2.21.0.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/jsr305-1.3.9.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/log4j-api-2.17.1.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-python_2.11-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-table-planner_2.11-1.14.4-tests.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/snappy-java-1.1.8.3.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-test-utils-junit-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/log4j-1.2-api-2.17.1.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/powermock-api-mockito2-2.0.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-clients_2.11-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/log4j-core-2.17.1.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/log4j-slf4j-impl-2.17.1.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-core-1.14.4-tests.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-shaded-force-shading-14.0.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/powermock-module-junit4-2.0.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-table-runtime_2.11-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-streaming-java_2.11-1.14.4-tests.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/op
t/hamcrest-all-1.3.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/commons-compress-1.21.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/junit-jupiter-5.7.2.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/junit-vintage-engine-5.7.2.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/slf4j-api-1.7.15.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-streaming-java_2.11-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-core-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/opt/flink-table-runtime_2.11-1.14.4-tests.jar > > /usr/local/Cellar/python@3.8/3.8.13/Frameworks/Python.framework/Versions/3.8/lib/python3.8/subprocess.py:946: > ResourceWarning: subprocess 44523 is still running > > _warn("subprocess %s is still running" % self.pid, > > ResourceWarning: Enable tracemalloc to get the object allocation traceback > > Exception in thread "Thread-3" java.lang.NoClassDefFoundError: > org/apache/flink/shaded/guava30/com/google/common/cache/CacheLoader > > at java.base/java.lang.Class.forName0(Native Method) > > at java.base/java.lang.Class.forName(Class.java:398) > > at > org.apache.flink.api.python.shaded.py4j.reflection.CurrentThreadClassLoadingStrategy.classForName(CurrentThreadClassLoadingStrategy.java:40) > > at > org.apache.flink.api.python.shaded.py4j.reflection.ReflectionUtil.classForName(ReflectionUtil.java:51) > > at > org.apache.flink.api.python.shaded.py4j.Protocol.getPythonProxy(Protocol.java:423) > > at > org.apache.flink.api.python.shaded.py4j.Protocol.getObject(Protocol.java:311) > > at > org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.getArguments(AbstractCommand.java:82) > > at > org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:77) > > at > org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238) > > at java.base/java.lang.Thread.run(Thread.java:834) > > Caused by: java.lang.ClassNotFoundException: > org.apache.flink.shaded.guava30.com.google.common.cache.CacheLoader > > at > java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581) > > at > java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) > > at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522) > > ... 10 more > > ERROR:root:Exception while sending command. 
> > Traceback (most recent call last): > > File > "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/py4j/java_gateway.py", > line 1188, in send_command > > raise Py4JNetworkError("Answer from Java side is empty") > > py4j.protocol.Py4JNetworkError: Answer from Java side is empty > > > During handling of the above exception, another exception occurred: > > > Traceback (most recent call last): > > File > "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/py4j/java_gateway.py", > line 1014, in send_command > > response = connection.send_command(command) > > File > "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/py4j/java_gateway.py", > line 1192, in send_command > > raise Py4JNetworkError( > > py4j.protocol.Py4JNetworkError: Error while receiving > > E > > ====================================================================== > > ERROR: test_scalar_function (__main__.TableTests) > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > File "/Users/john/PycharmProjects/pyflink-faq/testing/test_utils.py", > line 133, in setUp > > self.t_env = > TableEnvironment.create(EnvironmentSettings.in_streaming_mode()) > > File > "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/table/environment_settings.py", > line 267, in in_streaming_mode > > get_gateway().jvm.EnvironmentSettings.inStreamingMode()) > > File > "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/java_gateway.py", > line 75, in get_gateway > > _gateway.entry_point.put("PythonFunctionFactory", > PythonFunctionFactory()) > > File > "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/py4j/java_gateway.py", > line 1285, in __call__ > > return_value = get_return_value( > > File > "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/util/exceptions.py", > line 146, in deco > > return f(*a, **kw) > > File > "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/py4j/protocol.py", > line 334, in get_return_value > > raise Py4JError( > > py4j.protocol.Py4JError: An error occurred while calling t.put > > > ---------------------------------------------------------------------- > > Ran 1 test in 1.089s > > > FAILED (errors=1) > > sys:1: ResourceWarning: unclosed file <_io.BufferedWriter name=4> > > (.venv) Johns-MacBook-Pro:testing john$ Unable to get the Python watchdog > object, now exit. > > > ------------------------------ > *From:* John Tipper <john_tip...@hotmail.com> > *Sent:* 24 April 2022 20:30 > *To:* Dian Fu <dian0511...@gmail.com> > *Cc:* user@flink.apache.org <user@flink.apache.org> > *Subject:* Re: Unit testing PyFlink SQL project > > Hi Dian, > > Thank you very much, that's very helpful. I'm seeing a couple of errors > when I try to run the example though (Python 3.8 on Mac OS). > > > 1. I create a fresh Python virtual env: `python -m venv .venv` > 2. `source .venv/bin/activate` > 3. When I tried to configure the project by running `python setup.py > install` I got errors about Cython not being installed even though it was. > I then just had to do a `pip install apache-flink==1.14.4` to install the > requirements and be able to move forward. Not sure what the issue here is. > 4. 
> > $ python test_table_api.py TableTests.test_scalar_function > > Using %s as FLINK_HOME... > > /Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink > > Skipped download > > /Users/john/PycharmProjects/pyflink-faq/testing/flink-python_2.11-1.14.4-tests.jar > since it already exists. > > The flink-python jar is not found in the opt folder of the FLINK_HOME: > > /Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink > > > > /Users/john/PycharmProjects/pyflink-faq/testing/flink-python_2.11-1.14.4-tests.jar > > Error: Could not find or load main class > org.apache.flink.client.python.PythonGatewayServer > > Caused by: java.lang.ClassNotFoundException: > org.apache.flink.client.python.PythonGatewayServer > > > E/usr/local/Cellar/python@3.8/3.8.13/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/case.py:704: > ResourceWarning: unclosed file <_io.BufferedWriter name=4> > > outcome.errors.clear() > > ResourceWarning: Enable tracemalloc to get the object allocation > traceback > > > ====================================================================== > > ERROR: test_scalar_function (__main__.TableTests) > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > File > "/Users/john/PycharmProjects/pyflink-faq/testing/test_utils.py", line 123, > in setUp > > self.t_env = > TableEnvironment.create(EnvironmentSettings.in_streaming_mode()) > > File > > "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/table/environment_settings.py", > line 267, in in_streaming_mode > > get_gateway().jvm.EnvironmentSettings.inStreamingMode()) > > File > > "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/java_gateway.py", > line 62, in get_gateway > > _gateway = launch_gateway() > > File > > "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/java_gateway.py", > line 112, in launch_gateway > > raise Exception("Java gateway process exited before sending its > port number") > > Exception: Java gateway process exited before sending its port number > > > ---------------------------------------------------------------------- > > Ran 1 test in 0.333s > > > FAILED (errors=1) > > > 5. I then added `("org.apache.flink", "flink-python_2.11", "1.14.4", > None)` to the testing_jars list so that the regular Flink jar would be > downloaded, created an `opt` directory to the FLINK_HOME directory and > copied into it the regular Flink jar. > 6. > 7. > > $ python test_table_api.py TableTests.test_scalar_function > > Using %s as FLINK_HOME... > > /Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink > > Skipped download > > /Users/john/PycharmProjects/pyflink-faq/testing/flink-python_2.11-1.14.4-tests.jar > since it already exists. > > Skipped download > > /Users/john/PycharmProjects/pyflink-faq/testing/flink-python_2.11-1.14.4.jar > since it already exists. 
> > > > /Users/john/PycharmProjects/pyflink-faq/testing/flink-python_2.11-1.14.4.jar:/Users/john/PycharmProjects/pyflink-faq/testing/flink-python_2.11-1.14.4-tests.jar > > Exception in thread "main" java.lang.NoClassDefFoundError: > org/slf4j/LoggerFactory > > at > > org.apache.flink.client.python.PythonEnvUtils.<clinit>(PythonEnvUtils.java:77) > > at > > org.apache.flink.client.python.PythonGatewayServer.main(PythonGatewayServer.java:46) > > Caused by: java.lang.ClassNotFoundException: org.slf4j.LoggerFactory > > at > > java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581) > > at > > java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) > > at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522) > > ... 2 more > > > E/usr/local/Cellar/python@3.8/3.8.13/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/case.py:704: > ResourceWarning: unclosed file <_io.BufferedWriter name=4> > > outcome.errors.clear() > > ResourceWarning: Enable tracemalloc to get the object allocation > traceback > > > ====================================================================== > > ERROR: test_scalar_function (__main__.TableTests) > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > File > "/Users/john/PycharmProjects/pyflink-faq/testing/test_utils.py", line 131, > in setUp > > self.t_env = > TableEnvironment.create(EnvironmentSettings.in_streaming_mode()) > > File > > "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/table/environment_settings.py", > line 267, in in_streaming_mode > > get_gateway().jvm.EnvironmentSettings.inStreamingMode()) > > File > > "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/java_gateway.py", > line 62, in get_gateway > > _gateway = launch_gateway() > > File > > "/Users/john/PycharmProjects/pyflink-faq/.venv/lib/python3.8/site-packages/apache_flink-1.14.4-py3.8-macosx-11-x86_64.egg/pyflink/java_gateway.py", > line 112, in launch_gateway > > raise Exception("Java gateway process exited before sending its > port number") > > Exception: Java gateway process exited before sending its port number > > > ---------------------------------------------------------------------- > > Ran 1 test in 0.344s > > > FAILED (errors=1) > 8. > Now it looks like the code needs all of the transitive dependencies on > the classpath? > > > Have you managed to get your example tests to run in a completely clean > virtual environment? It looks like if it's working on your computer that > your computer perhaps has Java and Python dependencies already downloaded > into particular locations. > > Many thanks, > > John > > ------------------------------ > *From:* Dian Fu <dian0511...@gmail.com> > *Sent:* 24 April 2022 06:21 > *To:* John Tipper <john_tip...@hotmail.com> > *Cc:* user@flink.apache.org <user@flink.apache.org> > *Subject:* Re: Unit testing PyFlink SQL project > > Hi John, > > I have written an example on how to write unit tests of Flink > functionalities with PyFlink in [1]. Hope it is helpful for you. Feel free > to let me know if there are any problems. > > Regards, > Dian > > [1] https://github.com/dianfu/pyflink-faq/tree/main/testing > > On Sun, Apr 24, 2022 at 9:25 AM Dian Fu <dian0511...@gmail.com> wrote: > > Hi John, > > >> I don't know how to fix this. 
I've tried adding `flink-table-planner` > and `flink-table-planner-blink` dependencies with `<type>test-jar</type>` > to my dummy pom.xml, but it still fails. > What's the failure after doing this? The flink-table-planner*-tests.jar > should be available in maven repository[1]. > > >> This is starting to feel like a real pain to do something that should > be trivial: basic TDD of a PyFlink project. Is there a real-world example > of a Python project that shows how to set up a testing environment for unit > testing SQL with PyFlink? > I'm not aware of such a project, however I agree that this may be a very > important aspect which should be improved. I will look into this. > > Regards, > Dian > > [1] > https://repo1.maven.org/maven2/org/apache/flink/flink-table-planner_2.11/1.13.6/ > > > On Sun, Apr 24, 2022 at 4:44 AM John Tipper <john_tip...@hotmail.com> > wrote: > > Hi all, > > Is there an example of a self-contained repository showing how to perform > SQL unit testing of PyFlink (specifically 1.13.x if possible)? I have > cross-posted the question to Stack Overflow here: > https://stackoverflow.com/questions/71983434/is-there-an-example-of-pyflink-sql-unit-testing-in-a-self-contained-repo > > > There is a related SO question ( > https://stackoverflow.com/questions/69937520/pyflink-sql-local-test), > where it is suggested to use some of the tests from PyFlink itself. The > issue I'm running into is that the PyFlink repo assumes that a bunch of > things are on the Java classpath and that some Python utility classes are > available (they're not distributed via PyPi apache-flink). > > I have done the following: > > > 1. Copied `test_case_utils.py` and `source_sink_utils.py` from PyFlink > ( > > https://github.com/apache/flink/tree/f8172cdbbc27344896d961be4b0b9cdbf000b5cd/flink-python/pyflink/testing) > into my project. > 2. Copy an example unit test ( > > https://github.com/apache/flink/blob/f8172cdbbc27344896d961be4b0b9cdbf000b5cd/flink-python/pyflink/table/tests/test_sql.py#L39) > as suggested by the related SO question. > 3. > > When I try to run the test, I get an error because the test case cannot > determine what version of Avro jars to download (`download_apache_avro()` > fails, because pyflink_gateway_server.py tries to evaluate the value of > `avro.version` by running `mvn help:evaluate -Dexpression=avro.version`) > > I then added a dummy `pom.xml` defining a Maven property of `avro.version` > (with a value of `1.10.0`) and my unit test case is loaded. > > I now get a new error and my test is skipped: > > 'flink-table-planner*-tests.jar' is not available. Will skip the > related tests. > > I don't know how to fix this. I've tried adding `flink-table-planner` and > `flink-table-planner-blink` dependencies with `<type>test-jar</type>` to my > dummy pom.xml, but it still fails. > > This is starting to feel like a real pain to do something that should be > trivial: basic TDD of a PyFlink project. Is there a real-world example of > a Python project that shows how to set up a testing environment for unit > testing SQL with PyFlink? > > Many thanks, > > John > >
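A closing note on the `pipeline.jars` option named in the TypeError above: once the Python gateway itself starts correctly (i.e. after installing with `pip install apache-flink==1.14.4` in a clean virtual environment, as in the steps earlier in the thread), additional Java dependencies for a test can be handed to the TableEnvironment explicitly rather than being placed on the gateway classpath by hand. A rough sketch, assuming the jars have already been fetched into a local `out/` directory with the maven-dependency-plugin commands shown above; the helper name and the directory are illustrative, not part of the pyflink-faq repo:

```
import glob
import os
import pathlib

from pyflink.table import EnvironmentSettings, TableEnvironment


def create_test_table_env(jar_dir="out"):
    """Create a TableEnvironment whose job classpath includes every jar in jar_dir."""
    t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
    # 'pipeline.jars' expects a semicolon-separated list of URLs, e.g. file:///path/to/a.jar
    jar_urls = ";".join(
        pathlib.Path(jar).resolve().as_uri()
        for jar in sorted(glob.glob(os.path.join(jar_dir, "*.jar")))
    )
    if jar_urls:
        t_env.get_config().get_configuration().set_string("pipeline.jars", jar_urls)
    return t_env
```

Note that this only covers dependencies needed when the job is submitted and executed; it does not fix a gateway that fails to start because flink-python or its transitive dependencies are missing from the JVM classpath, which is what the errors above ultimately traced back to.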