Hello dear friends,
I hope everyone is doing fine and staying safe.

This query is for Spark 3.0.1. The following works:

> pyspark --py-files s3://gourav-bucket/spark_nlp_display-1.7-py3.7.egg
> >>> import sparknlp_display
> >>>

But when I start python and then create a Spark session, the import gives an error, even if I do not add the configuration spark.yarn.dist.pyFiles:

> >>> spark = SparkSession.builder.master("yarn") \
>     .config("spark.submit.pyFiles",
>             "s3://gourav-bucket/spark_nlp_display-1.7-py3.7.egg") \
>     .config("spark.yarn.dist.pyFiles",
>             "s3://gourav-bucket/spark_nlp_display-1.7-py3.7.egg") \
>     .getOrCreate()
> >>> import sparknlp_display
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<frozen importlib._bootstrap>", line 983, in _find_and_load
>   File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
>   File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
>   File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
>   File "/mnt/tmp/spark-692956ec-cc89-4b11-a1de-ced1a164ef1b/userFiles-ed25968c-cc16-4900-b910-75d933db3afb/spark_nlp_display-1.7-py3.7.egg/sparknlp_display/__init__.py", line 16, in <module>
>     __version__ = get_version()
>   File "/mnt/tmp/spark-692956ec-cc89-4b11-a1de-ced1a164ef1b/userFiles-ed25968c-cc16-4900-b910-75d933db3afb/spark_nlp_display-1.7-py3.7.egg/sparknlp_display/__init__.py", line 12, in get_version
>     with open(os.path.join(here, "VERSION"), "r") as fh:
> NotADirectoryError: [Errno 20] Not a directory: '/mnt/tmp/spark-692956ec-cc89-4b11-a1de-ced1a164ef1b/userFiles-ed25968c-cc16-4900-b910-75d933db3afb/spark_nlp_display-1.7-py3.7.egg/sparknlp_display/VERSION'

When I do ls, I can see that the following is present:

/mnt/tmp/spark-692956ec-cc89-4b11-a1de-ced1a164ef1b/userFiles-ed25968c-cc16-4900-b910-75d933db3afb/spark_nlp_display-1.7-py3.7.egg

When I unzip the egg file, I do see the following files under the sparknlp_display folder:

> VERSION
> __init__.py
> __pycache__
> assertion.py
> dep_updates.py
> dependency_parser.py
> entity_resolution.py
> fonts
> label_colors
> ner.py
> re_updates.py
> relation_extraction.py
> retemp.py
> style.css
> style_utils.py

I will be grateful if someone could kindly let me know what I am doing wrong here.

Regards,
Gourav Sengupta
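
P.S. In case it helps with diagnosis, here is a minimal sketch that I believe reproduces the same class of error outside Spark. It rests on an assumption I have not confirmed: that in the failing case the egg is imported as a zipped archive (a single file) and not as an unpacked directory, so a plain open() on a path inside it cannot work. The package name demo_pkg, the egg file name, and all helper functions are hypothetical and are not part of sparknlp_display.

```python
import os
import tempfile
import zipfile


def build_demo_egg(directory):
    """Create a tiny zip archive laid out like an egg (hypothetical demo_pkg)."""
    egg_path = os.path.join(directory, "demo_pkg-0.1-py3.egg")
    with zipfile.ZipFile(egg_path, "w") as archive:
        archive.writestr("demo_pkg/__init__.py", "")
        archive.writestr("demo_pkg/VERSION", "0.1\n")
    return egg_path


def read_version_with_open(egg_path):
    """Mimic the pattern in the traceback: open() on a path inside the egg."""
    here = os.path.join(egg_path, "demo_pkg")
    with open(os.path.join(here, "VERSION"), "r") as fh:
        return fh.read().strip()


def read_version_from_archive(egg_path):
    """Read the same member through zipfile, which understands the archive."""
    with zipfile.ZipFile(egg_path) as archive:
        return archive.read("demo_pkg/VERSION").decode("utf-8").strip()


with tempfile.TemporaryDirectory() as tmp:
    egg = build_demo_egg(tmp)
    try:
        read_version_with_open(egg)
        open_error = None
    except OSError as exc:
        # The egg is a regular file, so the OS refuses to treat it as a directory.
        open_error = type(exc).__name__
    archive_version = read_version_from_archive(egg)

print("open() inside the egg failed with:", open_error)
print("zipfile read returned:", archive_version)
```

On Linux I would expect the first read to fail in the same way as in my traceback, while the zipfile read succeeds. That would explain why the VERSION file is visible after unzipping but not through open(). It does not explain why the pyspark --py-files route behaves differently.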