foreach definitely works :) This is not a streaming question. The error says that the JVM worker died for some reason. You'd have to look at its logs to see why.
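For illustration, here is a minimal, Spark-free sketch of the per-record function being discussed. `process_logs` is the poster's own function name, but its body and the `log_path` field are hypothetical stand-ins (the real function downloads from cloud storage); the commented-out Spark calls show where it would plug in:

```python
# Hypothetical per-record processor, standing in for the poster's real
# process_logs. The download step is stubbed so the sketch runs anywhere.
def process_logs(record):
    path = record["log_path"]            # assumed field name, for illustration
    data = f"downloaded:{path}"          # stand-in for the cloud-storage download
    return f"processed:{data}"

# In the actual job, Spark would invoke this on each executor:
#
#     dataframe.foreach(lambda record: process_logs(record))
#
# or, per Mich's suggestion, once per micro-batch in a streaming query:
#
#     query = (df.writeStream
#                .foreachBatch(sendToSink)
#                .trigger(processingTime="2 seconds")
#                .start())
#
# One common cause of the "Java gateway process exited" error seen in the
# thread is the worker-side function trying to create its own SparkSession
# or SparkContext on the executor - worth ruling out in process_logs.

print(process_logs({"log_path": "gs://bucket/app.log"}))
```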
On Fri, May 7, 2021 at 11:03 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi,
>
> I am not convinced foreach works even in 3.1.1.
> Try doing the same with foreachBatch:
>
>     foreachBatch(sendToSink). \
>         trigger(processingTime='2 seconds'). \
>
> and see if it works.
>
> HTH
>
> view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On Fri, 7 May 2021 at 16:07, rajat kumar <kumar.rajat20...@gmail.com> wrote:
>
>> Hi Team,
>>
>> I am using Spark 2.4.4 with Python.
>>
>> I am using the line below:
>>
>>     dataframe.foreach(lambda record: process_logs(record))
>>
>> My use case is: process_logs will download the file from cloud storage
>> using Python code and then save the processed data.
>>
>> I am getting the following error:
>>
>>     File "/opt/spark/python/lib/pyspark.zip/pyspark/java_gateway.py", line 46, in launch_gateway
>>       return _launch_gateway(conf)
>>     File "/opt/spark/python/lib/pyspark.zip/pyspark/java_gateway.py", line 108, in _launch_gateway
>>       raise Exception("Java gateway process exited before sending its port number")
>>     Exception: Java gateway process exited before sending its port number
>>
>> Can anyone pls suggest what can be done?
>>
>> Thanks
>> Rajat