Hi Anil,

You don't need to download and install Spark separately.
It's enough to add pyspark to your PyCharm environment as a package and
start developing and testing locally. The point is that the pyspark
package installed via pip already bundles a local Spark distribution.
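For example, a minimal sanity check (the app name and sample data here are
just placeholders) could look like this:

    from pyspark.sql import SparkSession

    # Start a purely local Spark session using the Spark bundled with the
    # pip-installed pyspark package - no separate download needed.
    spark = (
        SparkSession.builder
        .master("local[*]")          # run locally on all available cores
        .appName("pycharm-smoke-test")
        .getOrCreate()
    )

    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "letter"])
    df.show()

    spark.stop()

If that runs from PyCharm, the environment itself is set up correctly.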

As for your particular issue: I believe you are using Windows, and there is
a well-known issue with these "hadoop libraries" there. You will need to
install a few executables (typically winutils.exe, and sometimes hadoop.dll)
to resolve it. Searching the web for that particular error will turn up the
details.
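As a rough sketch of the usual workaround (the path below is a placeholder,
assuming you have downloaded a winutils.exe matching the Hadoop version your
PySpark build expects):

    import os

    # Assumption: winutils.exe lives in C:\hadoop\bin (placeholder location).
    os.environ["HADOOP_HOME"] = r"C:\hadoop"
    os.environ["PATH"] = os.environ["HADOOP_HOME"] + r"\bin;" + os.environ["PATH"]

    from pyspark.sql import SparkSession
    spark = SparkSession.builder.master("local[*]").getOrCreate()

You can also set HADOOP_HOME once in the PyCharm run configuration instead
of in code.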

Regards
Khalid


On Wed, 17 Nov 2021, 08:35 Gourav Sengupta, <gourav.sengu...@gmail.com>
wrote:

> Hi Anil,
>
> I generally create an anaconda environment, and then install pyspark in
> it, and then configure the interpreter to point to that particular
> environment. Never faced an issue with my approach.
>
>
> Regards,
> Gourav Sengupta
>
> On Wed, Nov 17, 2021 at 7:39 AM Anil Kulkarni <anil...@gmail.com> wrote:
>
>> Hi Spark community,
>>
>> I am having a hard time setting up PyCharm to work with pyspark. Can
>> any of you point me to the available documentation?
>>
>> Things I have tried till now :
>>
>>    1. Download and install Apache Spark
>>    2. Add the pyspark package in PyCharm.
>>    3. Add SPARK_HOME, PYTHONPATH, HADOOP_HOME env variables to the Run config
>>
>> Error I am getting :
>>
>> Using Spark's default log4j profile:
>> org/apache/spark/log4j-defaults.properties
>> Setting default log level to "WARN".
>> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use
>> setLogLevel(newLevel).
>> 21/11/16 23:26:28 WARN NativeCodeLoader: Unable to load native-hadoop
>> library for your platform... using builtin-java classes where applicable
>> Traceback (most recent call last):
>>
>>
>> --
>> Cheers,
>> Anil Kulkarni
>> https://anilkulkarni.com/
>>
>>
