The more I look into this, the more I think that the Samza ApplicationMaster
is making things difficult by putting its REST interface on the RPC port rather
than somewhere under the tracking URL.  From reading the Hadoop ResourceManager
webapp code, it does not look like they intend to expose the RPC port via the
REST interface.  The prevailing thinking seems to be that REST interfaces
should be served via the tracking URL, with the RPC port left available for
‘other’ protocols.  That split seems rather arbitrary, though.
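
To make the tracking-URL approach concrete for a non-Java client, here is a
minimal Python sketch.  It assumes the ResourceManager's REST application
endpoint (/ws/v1/cluster/apps/{appid}) returns JSON with an "app" object that
carries a "trackingUrl" field.  The host names and the REST path joined onto
the tracking URL are hypothetical placeholders; nothing here implies that
Samza currently serves anything at that location.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def app_info_url(rm_base, app_id):
    """Build the ResourceManager REST URL for a single application."""
    return urljoin(rm_base.rstrip("/") + "/", "ws/v1/cluster/apps/" + app_id)


def tracking_url(app_json):
    """Pull trackingUrl out of the ResourceManager's JSON response body."""
    return json.loads(app_json)["app"]["trackingUrl"]


def am_rest_url(tracking, rest_path):
    """Join a REST path onto the proxied tracking URL.

    rest_path is an assumption: it is whatever path the application
    master chooses to serve under its tracking UI.
    """
    return urljoin(tracking.rstrip("/") + "/", rest_path.lstrip("/"))


def fetch_tracking_url(rm_base, app_id):
    """Ask a live ResourceManager for the application's tracking URL."""
    with urlopen(app_info_url(rm_base, app_id)) as resp:
        return tracking_url(resp.read().decode("utf-8"))
```

With something like this, a client only needs the ResourceManager address and
the application id; the RPC port never enters the picture.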

Richard

> On Sep 19, 2015, at 4:36 PM, Richard Lee <rd...@tivo.com> wrote:
> 
> Hmm.. what if you aren’t using Java?  I don’t see the RPC port in the REST 
> ResourceManager application endpoint… just the proxied tracking URL.
> 
> It appears that other ApplicationMasters (such as MapReduce) put the REST
> endpoint on the same port as the proxied Tracking UI (under a /ws/v1/… path).
> That way, the trackingUrl returned from the ResourceManager can be used to
> find the REST endpoints of the ApplicationMaster without having to resort to
> proprietary Java-based protocols.
> 
> Short of a change to the YARN resource manager to expose both ports via REST, 
> this seems like a better approach for Samza.
> 
> Richard
> 
>> On Sep 19, 2015, at 4:23 PM, Gian Merlino <gianmerl...@gmail.com> wrote:
>> 
>> Hey Richard,
>> 
>> The ApplicationReport returned by YarnClient.getApplications() or
>> getApplicationReport(appId) includes the AM host and rpc port.
>> 
>> https://hadoop.apache.org/docs/r2.7.0/api/org/apache/hadoop/yarn/client/api/YarnClient.html#getApplications()
>> https://hadoop.apache.org/docs/r2.7.0/api/org/apache/hadoop/yarn/client/api/YarnClient.html#getApplicationReport(org.apache.hadoop.yarn.api.records.ApplicationId)
>> 
>> On Sat, Sep 19, 2015 at 4:09 PM, Richard Lee <rd...@tivo.com> wrote:
>> 
>>> I suppose it would be possible to add a custom ‘job.version’ field to the
>>> Samza job properties file, and then query for it via the REST /config
>>> endpoint on the ApplicationMaster, but it’s unclear to me how to find the
>>> RPC port for the ApplicationMaster from the ResourceManager.  The
>>> ResourceManager seems to list only the UI tracker endpoint proxy in its
>>> application REST endpoint.
>>> 
>>> I suppose I could scrape the port out of the HTML, but it seems like there
>>> should be a better way.
>>> 
>>> Richard
>>> 
>>>> On Sep 19, 2015, at 3:25 PM, Richard Lee <rd...@tivo.com> wrote:
>>>> 
>>>> Hi there-
>>>> 
>>>> How do people track which version of a samza job is running in yarn?
>>>> The job name and job id can’t be used, as they are used to create the
>>>> checkpoint topic, etc.  I’m looking for a way of determining if the current
>>>> job running in yarn is the latest version, and if not, kill it and launch a
>>>> newer version, picking up where the previous version left off.
>>>> 
>>>> There seems to be no ‘job version’ field anywhere obvious in either
>>>> samza or yarn.
>>>> 
>>>> Is there another approach I should use?
>>>> 
>>>> Richard
>>>> 
>>> 
>>> 
> 
