https://issues.apache.org/jira/browse/SPARK-1081?jql=project%20%3D%20SPARK%20AND%20text%20~%20Annotate
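
For anyone skimming the thread, here is a rough sketch of what the @Beta-style marker discussed in the quoted message might look like. This is only an illustration under my own assumptions: the names BetaDemo, Beta, ProgressSnapshot and betaMethods are hypothetical and are not taken from Spark or from Guava. I chose runtime retention purely so the reflective check at the end can see the marker.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BetaDemo {

    /** Hypothetical marker: the annotated API may change or disappear without notice. */
    @Documented
    @Retention(RetentionPolicy.RUNTIME) // assumption: runtime retention, so reflection can find it
    @Target({ElementType.TYPE, ElementType.METHOD, ElementType.CONSTRUCTOR, ElementType.FIELD})
    public @interface Beta {}

    /** Hypothetical stand-in for a semi-internal class; not Spark's JobProgressListener. */
    @Beta
    public static class ProgressSnapshot {
        private final int completedTasks;
        private final int totalTasks;

        public ProgressSnapshot(int completedTasks, int totalTasks) {
            this.completedTasks = completedTasks;
            this.totalTasks = totalTasks;
        }

        public int completedTasks() {
            return completedTasks;
        }

        /** Marked unstable on its own, independently of the class-level marker. */
        @Beta
        public double fractionDone() {
            return totalTasks == 0 ? 0.0 : (double) completedTasks / totalTasks;
        }
    }

    /** Returns the sorted names of methods declared on the given type that carry @Beta. */
    public static List<String> betaMethods(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Beta.class)) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println("class is beta: "
                + ProgressSnapshot.class.isAnnotationPresent(Beta.class));
        System.out.println("beta methods: " + betaMethods(ProgressSnapshot.class));
    }
}
```

Note that an annotation like this produces no compiler warning by itself. Flagging or rejecting uses of marked APIs would need something extra, such as an annotation processor or a build-time check along the lines of betaMethods above.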


On Thu, Apr 3, 2014 at 9:24 AM, Philip Ogren <philip.og...@oracle.com> wrote:

>  I can appreciate the reluctance to expose something like the
> JobProgressListener as a public interface.  It's exactly the sort of thing
> that you want to deprecate as soon as something better comes along, and it
> can be a real pain when trying to maintain the level of backwards
> compatibility that we all expect from commercial-grade software.  Instead
> of simply marking it private and therefore unavailable to Spark developers,
> it might be worth incorporating something like a @Beta annotation
> <http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/annotations/Beta.html>
> which you could sprinkle liberally throughout Spark that communicates "hey,
> use this if you want to, because it's here now" and "don't come crying if we
> rip it out or change it later."  This might be better than simply marking
> so many useful functions/classes as private.  I bet such an annotation
> could generate a compile warning/error for those who don't want to risk
> using them.
>
>
> On 04/02/2014 06:40 PM, Patrick Wendell wrote:
>
> Hey Philip,
>
>  Right now there is no mechanism for this. You have to go in through the
> low level listener interface.
>
>  We could consider exposing the JobProgressListener directly - I think
> it's been factored nicely so it's fairly decoupled from the UI. The concern
> is that this is a semi-internal piece of functionality whose API we might,
> e.g., want to change over time.
>
>  - Patrick
>
>
> On Wed, Apr 2, 2014 at 3:39 PM, Philip Ogren <philip.og...@oracle.com> wrote:
>
>> What I'd like is a way to capture the information provided on the stages
>> page (i.e. cluster:4040/stages via IndexPage).  Looking through the Spark
>> code, it doesn't seem like it is possible to directly query for specific
>> facts such as how many tasks have succeeded or how many total tasks there
>> are for a given active stage.  Instead, it looks like all the data for the
>> page is generated at once using information from the JobProgressListener.
>> It doesn't seem like I have any way to programmatically access this
>> information myself.  I can't even instantiate my own JobProgressListener
>> because it is spark package private.  I could implement my own SparkListener
>> and gather up the information myself.  It feels a bit awkward since classes
>> like Task and TaskInfo are also spark package private.  It does seem
>> possible to gather up what I need, but it seems like this sort of
>> information should just be available without implementing a custom
>> SparkListener (or worse, screen-scraping the HTML generated by StageTable!).
>>
>> I was hoping that I would find the answer in MetricsServlet which is
>> turned on by default.  It seems that when I visit
>> http://cluster:4040/metrics/json/ I should be able to get everything I
>> want but I don't see the basic stage/task progress information I would
>> expect.  Are there special metrics properties that I should set to get this
>> info?  I think this would be the best solution - just give it the right URL
>> and parse the resulting JSON - but I can't seem to figure out how to do
>> this or if it is possible.
>>
>> Any advice is appreciated.
>>
>> Thanks,
>> Philip
>>
>>
>>
>> On 04/01/2014 09:43 AM, Philip Ogren wrote:
>>
>>> Hi DB,
>>>
>>> Just wondering if you ever got an answer to your question about
>>> monitoring progress - either offline or through your own investigation.
>>>  Any findings would be appreciated.
>>>
>>> Thanks,
>>> Philip
>>>
>>> On 01/30/2014 10:32 PM, DB Tsai wrote:
>>>
>>>> Hi guys,
>>>>
>>>> When we're running a very long job, we would like to show users the
>>>> current progress of the map and reduce jobs. After looking at the API
>>>> documentation, I didn't find anything for this. However, in the Spark UI,
>>>> I can see the progress of the tasks. Is there anything I missed?
>>>>
>>>> Thanks.
>>>>
>>>> Sincerely,
>>>>
>>>> DB Tsai
>>>> Machine Learning Engineer
>>>> Alpine Data Labs
>>>> --------------------------------------
>>>> Web: http://alpinenow.com/
>>>>
>>>
>>>
>>
>
>
