Personally I like Jupyter notebooks for my interactive work and then once
I’ve done my exploration I switch back to emacs with either scala-metals or
Python mode.

I think the main takeaway is: do what feels best for you. There is no one
true way to develop in Spark.

On Fri, Oct 1, 2021 at 1:28 AM Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Thanks guys for your comments.
>
> I agree with you, Florian, that opening a terminal in, say, VSC allows you
> to run a shell script (an sh file) to submit your Spark code. However, this
> really makes sense if your IDE is running on a Linux host submitting a job
> to a Kubernetes cluster or YARN cluster.
>
> For Python, I will go with PyCharm, which is specific to the Python world.
> With Spark, I have used IntelliJ with the Spark plugin on a Mac for
> development work. I then created a JAR file, tarred and gzipped the whole
> project, scp'd it to an IBM sandbox, untarred it and ran it with a
> pre-prepared shell script with an environment plugin for dev, test, staging etc.
>
> An IDE is also useful for looking at CSV or TSV files, or for converting
> JSON from one form to another. For JSON validation, especially if the file
> is large, you may be restricted from loading it into a web-based JSON
> validator because of the risk of exposing proprietary data. There is a tool
> called jq <https://stedolan.github.io/jq/> (a lightweight and flexible
> command-line JSON processor) that comes in pretty handy for validating JSON.
> Download and install it on your OS and run it as
>
> tar -xzOf <json_file>.tgz | jq .
>
> That will validate the whole tarred and gzipped JSON file. Otherwise, most
> of these IDE tools come with add-on plugins for various needs. My
> preference would be to use the best available IDE for the job. I would
> consider VSC a general-purpose tool. If all else fails, one can always use
> OS tools like vi, vim, sed, awk etc 🤔
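
For anyone who would rather stay in Python than install jq, the same kind of local check (no upload to a web validator) can be sketched with the standard library alone. This is only an illustration: the function name validate_json_members is made up for this example, and it assumes the archive is a gzipped tar whose regular files are each meant to be one JSON document.

```python
import json
import tarfile


def validate_json_members(archive_path):
    """Check every regular file inside a .tgz/.tar.gz archive as JSON.

    Returns a dict mapping member name to None (valid JSON) or to an
    error message (invalid JSON or undecodable bytes).
    """
    results = {}
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, etc.
            with archive.extractfile(member) as handle:
                try:
                    # json.load reads the whole member into memory.
                    json.load(handle)
                    results[member.name] = None
                except (json.JSONDecodeError, UnicodeDecodeError) as exc:
                    results[member.name] = str(exc)
    return results
```

Calling validate_json_members("data.tgz") (a placeholder file name) returns one entry per file in the archive. Note that this parses each file fully in memory, so for very large files a streaming tool may still be the better choice.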
>
>
> Cheers
>
>
>    view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Fri, 1 Oct 2021 at 06:55, Florian CASTELAIN <
> florian.castel...@redlab.io> wrote:
>
>> Hello.
>>
>> Any "evolved" code editor allows you to create tasks (or builds, or
>> whatever they are called in the IDE you chose). If you do not find anything
>> that packages by default all you need, you could just create your own tasks.
>>
>>
>> *For yarn, one needs to open a terminal and submit from there. *
>>
>> You can create task(s) that launch your yarn commands.
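
As a rough illustration of what such a task could call, here is a small Python wrapper around spark-submit. It is a sketch under assumptions, not a recommended setup: the helper names build_spark_submit and submit and the file name my_job.py are invented for this example, and it assumes spark-submit is on the PATH and that HADOOP_CONF_DIR or YARN_CONF_DIR points at the cluster configuration.

```python
import shlex
import subprocess


def build_spark_submit(app_path, master="yarn", deploy_mode="cluster", app_args=()):
    """Build the argument list for a spark-submit invocation."""
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", deploy_mode,
        app_path,
        *app_args,
    ]


def submit(app_path, dry_run=True, **kwargs):
    """Print the command (dry run) or execute it and return its exit code."""
    command = build_spark_submit(app_path, **kwargs)
    if dry_run:
        # Show a shell-safe rendering of the command without running it.
        print(shlex.join(command))
        return 0
    return subprocess.run(command, check=False).returncode


if __name__ == "__main__":
    # Hypothetical job file; swap dry_run to False to actually submit.
    submit("my_job.py", dry_run=True)
```

An editor task (in VSC or any other IDE) would then only need to run this script, so the same entry point works from a terminal as well.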
>>
>>
>> *With VSC, you get stuff for working with json files but I am not sure
>> with a plugin for Python *
>>
>> In your json task configuration, you can launch whatever you want:
>> python, shell. I bet you could launch your favorite video game (just make a
>> task called "let's have a break" 😉)
>>
>> Just to say, if you want everything exactly the way you want, I do not
>> think you will find an IDE that does it. You will have to customize it.
>> (correct me if wrong, of course).
>>
>> Have a good day.
>>
>>
>> *Florian CASTELAIN *
>> *Software Engineer*
>>
>> 72 Rue de la République, 76140 Le Petit-Quevilly
>> m: +33 616 530 226
>> e: florian.castel...@redlab.io w: www.redlab.io
>>
>> ------------------------------
>> *From:* Jeff Zhang <zjf...@gmail.com>
>> *Sent:* Thursday, 30 September 2021 13:57
>> *To:* Mich Talebzadeh <mich.talebza...@gmail.com>
>> *Cc:* user @spark <user@spark.apache.org>
>> *Subject:* Re: Choice of IDE for Spark
>>
>> IIRC, you want an IDE for PySpark on YARN?
>>
>> Mich Talebzadeh <mich.talebza...@gmail.com> wrote on Thu, Sep 30, 2021 at 7:00 PM:
>>
>> Hi,
>>
>> This may look like a redundant question, but it comes about because of the
>> advent of cloud workstations such as Amazon WorkSpaces and others.
>>
>> With IntelliJ you are OK with Spark & Scala. With PyCharm you are fine
>> with PySpark and the virtual environment. Mind you as far as I know PyCharm
>> only executes spark-submit in local mode. For yarn, one needs to open a
>> terminal and submit from there.
>>
>> However, on an Amazon workstation, you get Visual Studio Code
>> <https://code.visualstudio.com/> (VSC, an MS product) and OpenOffice
>> installed. With VSC, you get stuff for working with JSON files, but I am
>> not sure whether, with a plugin for Python etc., it will be as good as
>> PyCharm. Has anyone used VSC in anger for Spark and, if so, what was the
>> experience?
>>
>> Thanks
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
> --
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
