Hey all, I've created a Spark job that runs successfully, but if I don't call sc.stop() at the end, the job hangs. It logs a few "cleaned accumulator 0" messages but never finishes.
I intend to use these jobs in production via spark-submit, scheduled through cron. Is calling sc.stop() the best practice, or is there something else I'm missing? One interesting point: if I run the job on only 100 lines of input, it finishes completely without sc.stop(), but when I run it on the actual data (millions of rows) it hangs. I've waited more than 24 hours, but it never releases the prompt, and in the UI the job still appears as RUNNING. Appreciate any help. Thanks!
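For reference, here is a simplified sketch of how the job is structured (the object name, input path argument, and the count step are placeholders standing in for the real logic):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object MyJob {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("MyJob")
    val sc = new SparkContext(conf)
    try {
      // Placeholder work: read the input and count its lines.
      val lines = sc.textFile(args(0))
      println(s"Line count: ${lines.count()}")
    } finally {
      // Stop the context even if the job above throws,
      // so the driver JVM can shut down cleanly.
      sc.stop()
    }
  }
}
```

With sc.stop() in a finally block like this, the job exits cleanly; my question is whether that's the expected way to do it, or whether the hang without it points to something else wrong in my setup.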