Re: job versioning

2015-09-22 Thread Abdollahian Noghabi, Shadi
Using “job.version”+1, Samza would automatically assign the job version. However, as Navina pointed out, this is probably not a good approach, since you might want to skip a version. So I think you are right in this regard. Sorry if I caused any confusion. > On Sep 22, 2015, at 4:18 AM, Richard Lee

Re: job versioning

2015-09-22 Thread Richard Lee
I am not following what this ‘job.version’ + 1 business is about. I have a build id for my code, likely a build timestamp or similar. I want to be able to determine whether the currently running Samza job has that version of the code. If not, I want to kill it and launch a new job, with the same jo
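
The check described in this message can be sketched as a small decision function. This is only an illustration: it assumes the build id of the running job can be obtained somehow (the thread does not settle how), and the function name and action labels are hypothetical, not Samza commands.

```python
from typing import List, Optional


def plan_deploy(desired_build_id: str, running_build_id: Optional[str]) -> List[str]:
    """Return the ordered actions needed so that desired_build_id is running.

    running_build_id is None when no instance of the job is running.
    The action names are hypothetical labels, not Samza commands.
    """
    if running_build_id is None:
        # Nothing is running yet, so there is nothing to kill.
        return ["launch"]
    if running_build_id == desired_build_id:
        # The running job already has this build; leave it alone.
        return []
    # A different build is running: stop it, then start the new one.
    return ["kill", "launch"]
```

Comparing for equality rather than ordering means a build timestamp, a hash, or any other opaque build id would work equally well.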

Re: job versioning

2015-09-20 Thread Navina Ramesh
Hi Richard, We assume that job versioning is controlled by how you manage the config and deployment process. A job version is treated as semantically different from a job id. At LinkedIn, for example, we generate 2 tars: one for the job and one for the config. When we deploy a Samza job, we specify the versio

Re: job versioning

2015-09-20 Thread Jocke Eriksson
Maybe you could store the version number in Kafka by publishing it to a log-compacted topic. You could then consume the messages and do a version comparison. But to me, such a simple task should be made easier. 2015-09-20 0:25 GMT+02:00 Richard Lee : > Hi there- > > How do people track whi
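
A rough sketch of the compacted-topic idea, without a real Kafka client. The record stream is simulated as a list of (key, value) pairs, and keying the records by job name is an assumption, not something stated in the thread. It relies only on the compaction semantics that the latest value per key is retained and that a null value acts as a tombstone.

```python
from typing import Dict, Iterable, Optional, Tuple

Record = Tuple[str, Optional[str]]


def compact(records: Iterable[Record]) -> Dict[str, str]:
    """Reduce a record stream to the latest value per key.

    A value of None is treated as a tombstone and removes the key,
    mirroring how a log-compacted topic treats null values.
    """
    latest: Dict[str, str] = {}
    for key, value in records:
        if value is None:
            latest.pop(key, None)
        else:
            latest[key] = value
    return latest


def needs_redeploy(records: Iterable[Record], job_name: str, desired_version: str) -> bool:
    """True when the last published version for job_name differs from desired_version."""
    return compact(records).get(job_name) != desired_version
```

In a real deployment the list of records would be replaced by whatever a consumer reads from the topic from the beginning; the comparison itself stays the same.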

Re: job versioning

2015-09-19 Thread Richard Lee
The more I look into this, the more I think that the Samza ApplicationMaster is making things difficult by putting its REST interface on the RPC port rather than somewhere under the tracking URL. From reading the Hadoop ResourceManager webapp code, it does not look like they expect to expose t

Re: job versioning

2015-09-19 Thread Richard Lee
> On Sep 19, 2015, at 4:44 PM, Abdollahian Noghabi, Shadi wrote: > As far as I know there is no notion of job version, and you should not run two instances of a job with the same (job name, job id) pair since it will mess up the checkpoint and etc. The job id is used to run the same

Re: job versioning

2015-09-19 Thread Abdollahian Noghabi, Shadi
As far as I know there is no notion of a job version, and you should not run two instances of a job with the same (job name, job id) pair, since that will mess up the checkpoints, among other things. The job id is used to run multiple instances of the same job at the same time. However I think it might be u

Re: job versioning

2015-09-19 Thread Richard Lee
Hmm... what if you aren’t using Java? I don’t see the RPC port in the REST ResourceManager application endpoint, just the proxied tracking URL. It appears that other ApplicationMasters (such as MapReduce) put the REST endpoint on the same port as the proxied Tracking UI (under a /ws/v1/… path).

Re: job versioning

2015-09-19 Thread Gian Merlino
Hey Richard, The ApplicationReport returned by YarnClient.getApplications() or getApplicationReport(appId) includes the AM host and RPC port. https://hadoop.apache.org/docs/r2.7.0/api/org/apache/hadoop/yarn/client/api/YarnClient.html#getApplications() https://hadoop.apache.org/docs/r2.7.0/api/org

Re: job versioning

2015-09-19 Thread Richard Lee
I suppose it would be possible to add a custom ‘job.version’ field to the Samza job properties file, and then query for it via the REST /config endpoint on the ApplicationMaster, but I’m unclear how I find the RPC port for the ApplicationMaster from the ResourceManager. The ResourceManager s
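
To illustrate the custom-field idea, here is a minimal sketch that reads a ‘job.version’ value out of properties-style text. It assumes ‘job.version’ is a key you add yourself (it is not a built-in Samza setting according to this thread), the sample values are made up, and the parser is deliberately simplified: it handles only `key=value` lines and comments, not the full Java properties syntax.

```python
from typing import Dict, Optional


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # or ! comments.

    Simplified on purpose: no line continuations, escapes, or ':' separators.
    """
    props: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def job_version(text: str) -> Optional[str]:
    """Return the custom job.version value, or None if the key is absent."""
    return parse_properties(text).get("job.version")


# Hypothetical properties content; job.version is the custom field.
SAMPLE = """
# deployment settings
job.name=my-job
job.id=1
job.version=20150919-1644
"""
```

The same lookup would apply to a config map fetched from a running job, once the question of how to reach that job's endpoint is resolved.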