In PySpark, is there a way to get the status of a currently running job? My 
use case is that I have a long-running job, and users may not know whether 
it is still running. It would be nice to have some idea of whether the job 
is progressing, even if the information isn't very granular.
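
To make it concrete, this is roughly the polling loop I'd like to be able to 
write (just a sketch, assuming the sc.statusTracker() API that newer Spark 
releases (1.2+) ship with; the toy count() job is only for illustration):

    import threading
    import time
    from pyspark import SparkContext

    sc = SparkContext(appName="long-running-job")

    # Run the long action in a background thread so the driver can poll.
    worker = threading.Thread(
        target=lambda: sc.parallelize(range(10 ** 7)).count())
    worker.start()

    tracker = sc.statusTracker()
    while worker.is_alive():
        for stage_id in tracker.getActiveStageIds():
            info = tracker.getStageInfo(stage_id)
            if info:  # the stage may have finished between the two calls
                print("stage %d: %d/%d tasks done"
                      % (stage_id, info.numCompletedTasks, info.numTasks))
        time.sleep(1)
    worker.join()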

I've looked into the application detail UI, which has per-stage information 
(but unfortunately not in JSON format), but even then I don't necessarily 
know which stages correspond to the job I started.
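
What I'd really want is that same data in machine-readable form. I 
understand newer Spark releases (1.4+) expose the UI data as JSON under 
/api/v1 on the UI port; if that's right, it would look something like this 
(a sketch assuming that endpoint and the default driver UI port 4040):

    import json
    from urllib.request import urlopen

    base = "http://localhost:4040/api/v1"  # default driver UI port

    # Resolve the application id, then ask for active stages as JSON.
    app_id = json.load(urlopen(base + "/applications"))[0]["id"]
    stages = json.load(urlopen(
        "%s/applications/%s/stages?status=active" % (base, app_id)))
    for stage in stages:
        print(stage["stageId"], stage["name"], stage["status"])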

So I guess my main questions are:

  1.  How do I get the job id of a job started in Python?
  2.  If possible, how do I get the stages which correspond to that job? 
      (See the sketch after this list for both of these.)
  3.  Is there any way to get information about currently running stages 
      without parsing the Stage UI HTML page?
  4.  Has anyone approached this problem in a different way?
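
For questions 1 and 2, the sketch below shows the kind of thing I'm hoping 
for: tag the work with a job group of my own choosing, then look up the job 
ids and their stage ids by that group (again assuming the 1.2+ status API; 
the group name "my-group" and the toy job are made up):

    import threading
    import time
    from pyspark import SparkContext

    sc = SparkContext(appName="tagged-job")

    def run_job():
        # Job groups are tied to the submitting thread, so set it here.
        sc.setJobGroup("my-group", "long-running user job")
        sc.parallelize(range(10 ** 6)).map(lambda x: x * 2).count()

    worker = threading.Thread(target=run_job)
    worker.start()
    time.sleep(1)  # give the job a moment to be submitted

    tracker = sc.statusTracker()
    for job_id in tracker.getJobIdsForGroup("my-group"):  # question 1
        info = tracker.getJobInfo(job_id)
        if info:
            print("job %d is %s, stages: %s"
                  % (job_id, info.status, list(info.stageIds)))  # question 2
    worker.join()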
