The status API is accessed through the `statusTracker` field on SparkContext.

*Scala*:

https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.SparkStatusTracker

*Java*:

https://spark.apache.org/docs/latest/api/java/org/apache/spark/api/java/JavaSparkStatusTracker.html

Don't create new instances of this yourself; instead, use sc.statusTracker
to obtain the current instance.
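
For example, here's a minimal sketch against the 1.2 Scala API that prints
the progress of whatever stages are currently running:

    val tracker = sc.statusTracker
    for (stageId <- tracker.getActiveStageIds()) {
      // getStageInfo returns an Option; None if the stage's info
      // is no longer being tracked.
      tracker.getStageInfo(stageId).foreach { stage =>
        println(s"stage $stageId [${stage.name}]: " +
          s"${stage.numCompletedTasks}/${stage.numTasks} tasks complete " +
          s"(${stage.numActiveTasks} active, ${stage.numFailedTasks} failed)")
      }
    }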

This API is missing a bunch of stuff that's available in the web UI, but it
was designed so that we can add new methods without breaking binary
compatibility. Although it would technically be a new feature, I'd hope
that we can backport some additions to 1.2.1 since it's just adding a
facade / stable interface in front of JobProgressListener and thus has
little to no risk of introducing new bugs elsewhere in Spark.

On Mon, Dec 29, 2014 at 3:08 AM, Aniket Bhatnagar <
aniket.bhatna...@gmail.com> wrote:

> Hi Josh
>
> Is there documentation available for the status API? I would like to use it.
>
> Thanks,
> Aniket
>
>
> On Sun Dec 28 2014 at 02:37:32 Josh Rosen <rosenvi...@gmail.com> wrote:
>
>> The console progress bars are implemented on top of a new stable "status
>> API" that was added in Spark 1.2.  It's possible to query job progress
>> using this interface (in older versions of Spark, you could implement a
>> custom SparkListener and maintain the counts of completed / running /
>> failed tasks / stages yourself).
>>
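>> For anyone stuck on an older release, here's a rough sketch of that
>> do-it-yourself listener approach (it relies on the developer-API
>> scheduler hooks, so it may change between versions):
>>
>>     import java.util.concurrent.atomic.AtomicInteger
>>     import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}
>>
>>     class TaskCounter extends SparkListener {
>>       val completed = new AtomicInteger(0)
>>       val failed = new AtomicInteger(0)
>>       // Invoked on the driver's listener bus for every finished task.
>>       override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
>>         if (taskEnd.taskInfo.successful) completed.incrementAndGet()
>>         else failed.incrementAndGet()
>>       }
>>     }
>>
>>     val counter = new TaskCounter
>>     sc.addSparkListener(counter)
>>     // ...then read counter.completed.get() from a polling thread.
>>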
>> There are actually several subtleties involved in implementing
>> "job-level" progress bars which behave in an intuitive way; there's a
>> pretty extensive discussion of the challenges at
>> https://github.com/apache/spark/pull/3009.  Also, check out the pull
>> request for the console progress bars for an interesting design discussion
>> around how they handle parallel stages:
>> https://github.com/apache/spark/pull/3029.
>>
>> I'm not sure about the plumbing that would be necessary to display live
>> progress updates in the IPython notebook UI, though.  The general pattern
>> would probably involve a mapping to relate notebook cells to Spark jobs
>> (you can do this with job groups, I think), plus some periodic timer that
>> polls the driver for the status of the current job in order to update the
>> progress bar.
>>
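>> Something along those lines, sketched in Scala against the 1.2 tracker
>> (the polling helper below is hypothetical, and hooking its output into
>> an actual notebook widget is left out):
>>
>>     // Tag work submitted from one notebook cell with a job group...
>>     sc.setJobGroup("cell-42", "notebook cell 42")
>>
>>     // ...then poll the tracker periodically to drive a progress bar.
>>     def printGroupProgress(groupId: String): Unit = {
>>       val tracker = sc.statusTracker
>>       for {
>>         jobId <- tracker.getJobIdsForGroup(groupId)
>>         job <- tracker.getJobInfo(jobId)
>>         stageId <- job.stageIds()
>>         stage <- tracker.getStageInfo(stageId)
>>       } println(s"job $jobId [${job.status}] stage $stageId: " +
>>           s"${stage.numCompletedTasks}/${stage.numTasks} tasks")
>>     }
>>
>> (setJobGroup is per-thread, which is what makes the cell-to-job mapping
>> work.)
>>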
>> For Spark 1.3, I'm working on designing a REST interface to access this
>> type of job / stage / task progress information, as well as expanding the
>> types of information exposed through the stable status API.
>>
>> - Josh
>>
>> On Thu, Dec 25, 2014 at 10:01 AM, Eric Friedman <
>> eric.d.fried...@gmail.com> wrote:
>>
>>> Spark 1.2.0 is SO much more usable than previous releases -- many thanks
>>> to the team for this release.
>>>
>>> A question about progress of actions.  I can see how things are
>>> progressing using the Spark UI.  I can also see the nice ASCII art
>>> animation on the Spark driver console.
>>>
>>> Has anyone come up with a way to accomplish something similar in an
>>> IPython notebook using PySpark?
>>>
>>> Thanks
>>> Eric
>>>
>>
>>
