Personally, I like Jupyter notebooks for my interactive work; once I’ve done my exploration, I switch back to Emacs with either scala-metals or Python mode.
I think the main takeaway is: do what feels best for you. There is no one true way to develop in Spark.

On Fri, Oct 1, 2021 at 1:28 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Thanks guys for your comments.
>
> I agree with you Florian that opening a terminal, say in VSC, allows you to
> run a shell script (an sh file) to submit your Spark code. However, this
> really makes sense if your IDE is running on a Linux host submitting a job
> to a Kubernetes cluster or YARN cluster.
>
> For Python, I will go with PyCharm, which is specific to the Python world.
> With Spark, I have used IntelliJ with the Spark plugin on a Mac for development
> work. I then created a JAR file, gzipped the whole project, scp'd it to an
> IBM sandbox, untarred it and ran it with a pre-prepared shell with
> environment plugin for dev, test, staging etc.
>
> An IDE is also useful for looking at csv and tsv type files or converting json
> from one form to another. For json validation, especially if the file is too
> large, you may be restricted from loading the file into a web json validator
> because of the risk of proprietary data being exposed. There is a tool
> called jq <https://stedolan.github.io/jq/> (a lightweight and flexible
> command-line JSON processor) that comes in pretty handy for validating json.
> Download and install it on your OS and run it as
>
> tar -xzOf <json_file>.tgz | jq .
>
> That will validate the whole tarred and gzipped json file. Otherwise most
> of these IDE tools come with add-on plugins for various needs. My
> preference would be to use the best available IDE for the job. VSC I would
> consider a general purpose tool. If all else fails, one can always use OS
> stuff like vi, vim, sed, awk etc 🤔
>
> Cheers
>
> view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On Fri, 1 Oct 2021 at 06:55, Florian CASTELAIN <florian.castel...@redlab.io> wrote:
>
>> Hello.
>>
>> Any "evolved" code editor allows you to create tasks (or builds, or
>> whatever they are called in the IDE you chose). If you do not find anything
>> that packages by default all you need, you could just create your own tasks.
>>
>> *For yarn, one needs to open a terminal and submit from there.*
>>
>> You can create task(s) that launch your yarn commands.
>>
>> *With VSC, you get stuff for working with json files but I am not sure
>> with a plugin for Python*
>>
>> In your json task configuration, you can launch whatever you want:
>> python, shell. I bet you could launch your favorite video game (just make a
>> task called "let's have a break" 😉)
>>
>> Just to say, if you want everything exactly the way you want, I do not
>> think you will find an IDE that does it. You will have to customize it.
>> (correct me if wrong, of course).
>>
>> Have a good day.
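Florian's point about a task that launches the yarn submission could look roughly like the following. This is only a sketch: it assumes VS Code's `tasks.json` format (version 2.0.0 with a `shell` task), and the task label, the script name `my_job.py` and the choice of `--deploy-mode client` are hypothetical placeholders, not anything stated in this thread.

```python
import json
from pathlib import Path


def build_spark_task(script: str, master: str = "yarn", deploy_mode: str = "client") -> dict:
    """Return a VS Code tasks.json document holding one spark-submit task."""
    return {
        "version": "2.0.0",
        "tasks": [
            {
                # The label is what shows up in the editor's "Run Task" list.
                "label": f"spark-submit ({master})",
                "type": "shell",
                "command": "spark-submit",
                "args": ["--master", master, "--deploy-mode", deploy_mode, script],
                "problemMatcher": [],
            }
        ],
    }


def write_tasks_file(workspace: Path, config: dict) -> Path:
    """Write the config to <workspace>/.vscode/tasks.json and return its path."""
    target = workspace / ".vscode" / "tasks.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    # "my_job.py" is a hypothetical PySpark script name.
    print(json.dumps(build_spark_task("my_job.py"), indent=2))
```

Running such a task from the editor only executes the same command you would otherwise type in a terminal, so the host running the IDE still needs a working `spark-submit` and access to the cluster.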
>>
>> *Florian CASTELAIN*
>> *Software Engineer*
>>
>> 72 Rue de la République, 76140 Le Petit-Quevilly
>> m: +33 616 530 226
>> e: florian.castel...@redlab.io w: www.redlab.io
>>
>> ------------------------------
>> *From:* Jeff Zhang <zjf...@gmail.com>
>> *Sent:* Thursday, 30 September 2021 13:57
>> *To:* Mich Talebzadeh <mich.talebza...@gmail.com>
>> *Cc:* user @spark <user@spark.apache.org>
>> *Subject:* Re: Choice of IDE for Spark
>>
>> IIRC, you want an IDE for pyspark on yarn?
>>
>> Mich Talebzadeh <mich.talebza...@gmail.com> wrote on Thu, 30 Sep 2021 at 7:00 PM:
>>
>> Hi,
>>
>> This may look like a redundant question but it comes about because of the
>> advent of Cloud workstation usage like Amazon WorkSpaces and others.
>>
>> With IntelliJ you are OK with Spark & Scala. With PyCharm you are fine
>> with PySpark and the virtual environment. Mind you, as far as I know PyCharm
>> only executes spark-submit in local mode. For yarn, one needs to open a
>> terminal and submit from there.
>>
>> However, in Amazon workstation, you get Visual Studio Code
>> <https://code.visualstudio.com/> (VSC, an MS product) and OpenOffice
>> installed. With VSC, you get stuff for working with json files, but I am not
>> sure whether, with a plugin for Python etc, it will be as good as PyCharm. Has
>> anyone used VSC in anger for Spark and if so what is the experience?
>>
>> Thanks
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>

--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
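
For the JSON validation discussed earlier in the thread, the same kind of local check can be done without jq. The sketch below uses only Python's standard `tarfile` and `json` modules; the function name `validate_json_in_tgz` is a hypothetical helper, and it assumes every regular file inside the archive is meant to be JSON.

```python
import json
import tarfile


def validate_json_in_tgz(archive_path: str) -> dict:
    """Map each regular file in a .tgz archive to None if it parses as JSON,
    otherwise to the text of the parse error."""
    results = {}
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links and other non-file entries
            with archive.extractfile(member) as handle:
                try:
                    json.load(handle)  # parses the member's bytes, raises on bad JSON
                    results[member.name] = None
                except ValueError as exc:  # JSONDecodeError and decode errors
                    results[member.name] = str(exc)
    return results
```

Calling `validate_json_in_tgz("data.tgz")` (with `data.tgz` standing in for your own archive) returns one entry per file, and nothing leaves the machine. Note that `json.load` reads each member fully into memory, so for very large files a streaming tool may still be the better fit.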