Yep, the latest PySpark is 3.2. You can easily install it from the available packages.
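
For example, after pip install pyspark (or adding it through PyCharm's package list), a quick local smoke test would look roughly like the sketch below; the master setting, app name and sample data are only illustrative:

    from pyspark.sql import SparkSession

    # Purely local session; the pip package bundles the Spark runtime,
    # so no separate Spark download is required.
    spark = (SparkSession.builder
             .master("local[*]")
             .appName("pycharm-smoke-test")
             .getOrCreate())

    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
    df.show()
    spark.stop()

If that runs from PyCharm, the interpreter and package setup are fine.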
My LinkedIn profile: https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/

On Wed, 17 Nov 2021 at 11:07, Khalid Mammadov <khalidmammad...@gmail.com> wrote:

> Hi Anil,
>
> You don't need to download and install Spark. It's enough to add
> pyspark to PyCharm as a package for your environment and start
> developing and testing locally. The point is that PySpark includes a
> local Spark which is installed as part of pip install.
>
> As for your particular issue: I believe you are using Windows, and
> there is a well-known issue with these Hadoop libraries there. You
> will need to install a few executables to solve it. Check the web for
> that particular error.
>
> Regards,
> Khalid
>
> On Wed, 17 Nov 2021, 08:35 Gourav Sengupta, <gourav.sengu...@gmail.com> wrote:
>
>> Hi Anil,
>>
>> I generally create an anaconda environment, install pyspark in it,
>> and then configure the interpreter to point to that particular
>> environment. I have never faced an issue with this approach.
>>
>> Regards,
>> Gourav Sengupta
>>
>> On Wed, Nov 17, 2021 at 7:39 AM Anil Kulkarni <anil...@gmail.com> wrote:
>>
>>> Hi Spark community,
>>>
>>> I am having a hard time setting up PyCharm to work with pyspark.
>>> Can any of you point me to any available documentation?
>>>
>>> Things I have tried so far:
>>>
>>> 1. Download and install Apache Spark.
>>> 2. Add the pyspark package in PyCharm.
>>> 3. Add the SPARK_HOME, PYTHONPATH and HADOOP_HOME env variables to
>>> the Run config.
>>>
>>> The error I am getting:
>>>
>>> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
>>> Setting default log level to "WARN".
>>> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
>>> 21/11/16 23:26:28 WARN NativeCodeLoader: Unable to load native-hadoop
>>> library for your platform... using builtin-java classes where applicable
>>> Traceback (most recent call last):
>>>
>>> --
>>> Cheers,
>>> Anil Kulkarni
>>> https://anilkulkarni.com/
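
On the Windows point Khalid mentions above: the usual fix is to download a winutils.exe matching your Hadoop version and point HADOOP_HOME at its parent directory. A rough sketch, with purely hypothetical paths (the same variables can instead be set in the PyCharm Run configuration):

    import os

    # Hypothetical location -- adjust to wherever winutils.exe was unpacked.
    # HADOOP_HOME must be set before the SparkSession (and its JVM) starts.
    os.environ["HADOOP_HOME"] = r"C:\hadoop"
    os.environ["PATH"] += os.pathsep + r"C:\hadoop\bin"

    from pyspark.sql import SparkSession
    spark = SparkSession.builder.master("local[*]").getOrCreate()
    print(spark.version)

Note that the NativeCodeLoader line in Anil's output is only a warning; the actual failure is in the (truncated) traceback that follows it.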
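And the conda-based route Gourav describes is essentially the following; the environment name and Python version are arbitrary:

    conda create -n pyspark-env python=3.9
    conda activate pyspark-env
    pip install pyspark

Then point PyCharm's Project Interpreter at the python executable inside that environment, so the bundled local Spark is picked up automatically.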