Ah, OK, so we need to pass --host. If I recall correctly, the command-line
help says jobmanager.sh <host>. I'll have to go check tomorrow...

On Tue., Jun. 18, 2019, 10:05 p.m. PoolakkalMukkath, Shakir, <
shakir_poolakkalmukk...@comcast.com> wrote:

> Hi Nick,
>
>
>
> It works that way by explicitly setting --host. I was misled by the
> *“only”* word in the docs and did not try it. Thanks for the help.
>
>
>
> Thanks,
>
> Shakir
>
> *From: *"Martin, Nick" <nick.mar...@ngc.com>
> *Date: *Tuesday, June 18, 2019 at 6:31 PM
> *To: *"PoolakkalMukkath, Shakir" <shakir_poolakkalmukk...@comcast.com>,
> Till Rohrmann <trohrm...@apache.org>, John Smith <java.dev....@gmail.com>
> *Cc: *user <user@flink.apache.org>
> *Subject: *RE: [EXTERNAL] Re: How to restart/recover on reboot?
>
>
>
> jobmanager.sh takes an optional argument for the hostname to bind to, and
> start-cluster.sh uses it. If you leave it blank, the script will use
> whatever is in flink-conf.yaml (localhost is the default value that ships
> with Flink).
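
A minimal sketch of the point above, assuming the optional hostname is passed as the argument right after `start` (check the usage line your version of the script prints). The function name `jobmanager_start_cmd`, the path `/opt/flink`, and the host `jm1.example.com` are hypothetical and used only for illustration. The function only prints the command, so it can be inspected before it is run on a node.

```shell
#!/usr/bin/env bash
# Sketch only: build the per-node JobManager start command with an
# explicit bind host, instead of relying on the flink-conf.yaml default.

jobmanager_start_cmd() {
  local flink_home="$1"                    # Flink install dir (hypothetical: /opt/flink)
  local bind_host="${2:-$(hostname -f)}"   # fall back to this node's FQDN if not given
  # Assumption: the hostname is the positional argument after "start".
  printf '%s/bin/jobmanager.sh start %s\n' "$flink_home" "$bind_host"
}

# Example (prints the command; pipe to a shell or run it by hand):
#   jobmanager_start_cmd /opt/flink jm1.example.com
```

Passing the host explicitly on each node is what lets a JobManager started on its own advertise an address the other nodes can reach, rather than localhost.
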
>
>
>
> The dockerized version of Flink runs pretty much the way you’re trying to
> operate (i.e. each node starts itself), so the entrypoint script from
> that image is probably a good source of information about how to set it up.
>
>
>
> *From:* PoolakkalMukkath, Shakir [mailto:
> shakir_poolakkalmukk...@comcast.com]
> *Sent:* Tuesday, June 18, 2019 2:15 PM
> *To:* Till Rohrmann <trohrm...@apache.org>; John Smith <
> java.dev....@gmail.com>
> *Cc:* user <user@flink.apache.org>
> *Subject:* EXT :Re: [EXTERNAL] Re: How to restart/recover on reboot?
>
>
>
> Hi Till, John,
>
>
>
> I do agree with the issue John mentioned and have the same problem.
>
>
>
> We can only *start* a standalone HA cluster with the ./start-cluster.sh
> script. Then, when there are failures, we can *restart* those
> components individually by calling jobmanager.sh/taskmanager.sh. This
> works great.
>
> But, like John mentioned, if we want to start the cluster initially by
> running jobmanager.sh on each JobManager node, it does not work. It binds
> to localhost and does not form the HA cluster.
>
>
>
> Thanks,
>
> Shakir
>
>
>
> *From: *Till Rohrmann <trohrm...@apache.org>
> *Date: *Tuesday, June 18, 2019 at 4:23 PM
> *To: *John Smith <java.dev....@gmail.com>
> *Cc: *user <user@flink.apache.org>
> *Subject: *[EXTERNAL] Re: How to restart/recover on reboot?
>
>
>
> I guess it should work if you installed a systemd service which simply
> calls `jobmanager.sh start` or `taskmanager.sh start`.
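
A rough sketch of what such a service could look like, written as a shell function that prints a unit file. Everything beyond the `jobmanager.sh start` / `taskmanager.sh start` calls is an assumption: the function name `render_flink_unit`, the install path, the `flink` user, and `Type=forking` (chosen because the `start` command returns after launching the daemon). The optional hostname is appended for the JobManager only, since the later replies in this thread report that it otherwise binds to localhost.

```shell
#!/usr/bin/env bash
# Sketch only: print a systemd unit for one Flink component to stdout.
# Hypothetical usage: render_flink_unit <jobmanager|taskmanager> <flink_home> [bind_host]

render_flink_unit() {
  local component="$1"       # "jobmanager" or "taskmanager"
  local flink_home="$2"      # Flink install dir (hypothetical: /opt/flink)
  local bind_host="${3:-}"   # only used for the jobmanager
  local start_cmd="${flink_home}/bin/${component}.sh start"

  # Pass the bind host explicitly so the JobManager does not fall back
  # to the localhost default from flink-conf.yaml.
  if [ "$component" = "jobmanager" ] && [ -n "$bind_host" ]; then
    start_cmd="${start_cmd} ${bind_host}"
  fi

  cat <<EOF
[Unit]
Description=Flink ${component} (standalone)
After=network-online.target
Wants=network-online.target

[Service]
# Assumption: "start" daemonizes, so systemd should treat it as forking.
Type=forking
User=flink
ExecStart=${start_cmd}
ExecStop=${flink_home}/bin/${component}.sh stop
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Example: render_flink_unit jobmanager /opt/flink jm1.example.com > flink-jobmanager.service
```

The generated file would then be installed under /etc/systemd/system and enabled with `systemctl enable`, so the process comes back after a reboot without an admin having to remember a manual start step.
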
>
>
>
> Cheers,
>
> Till
>
>
>
> On Tue, Jun 18, 2019 at 4:29 PM John Smith <java.dev....@gmail.com> wrote:
>
> Yes, that is understood. But I don't see why we cannot call jobmanager.sh
> and taskmanager.sh to build the cluster and have them run as systemd units.
>
> I looked at start-cluster.sh and all it does is SSH and call jobmanager.sh,
> which then cascades to taskmanager.sh. I just have to pinpoint what's
> missing to get the systemd service working. In fact, calling jobmanager.sh
> as a systemd service actually sees the shared masters, slaves and
> flink-conf.yaml. But it binds to localhost.
>
>
>
> Maybe one way to do it would be to bootstrap the cluster with
> ./start-cluster.sh and then install systemd services for jobmanager.sh and
> taskmanager.sh.
>
>
>
> Like I said, I don't want to have some process in place to remind admins
> that they need to manually start a node every time they patch or a host
> goes down for whatever reason.
>
>
>
> On Tue, 18 Jun 2019 at 04:31, Till Rohrmann <trohrm...@apache.org> wrote:
>
> When a single machine fails you should rather call `taskmanager.sh
> start`/`jobmanager.sh start` to start a single process. `start-cluster.sh`
> will start multiple processes on different machines.
>
>
>
> Cheers,
>
> Till
>
>
>
> On Mon, Jun 17, 2019 at 4:30 PM John Smith <java.dev....@gmail.com> wrote:
>
> Well some reasons, machine reboots/maintenance etc... Host/VM crashes and
> restarts. And same goes for the job manager. I don't want/need to have to
> document/remember some start process for sys admins/devops.
>
> So far I have looked at ./start-cluster.sh and all it seems to do is SSH
> into all the specified nodes and start the processes using the jobmanager
> and taskmanager scripts. I don't see anything special in any of the sh
> scripts.
> I configured passwordless SSH through Terraform and all of that works
> great. It is only when trying to do the manual start through systemd that
> I may be missing something...
>
>
>
> On Mon, 17 Jun 2019 at 09:41, Till Rohrmann <trohrm...@apache.org> wrote:
>
> Hi John,
>
>
>
> I don't have much experience with setting Flink up via systemd services.
> Why do you want to do it like that?
>
>
>
> 1. In standalone mode, Flink won't automatically restart TaskManagers.
> This only works on Yarn and Mesos atm.
>
> 2. In case of a lost TaskManager, you should run `taskmanager.sh start`.
> This script simply starts a new TaskManager process.
>
> 3. I guess you could use systemd to bring up a Flink TaskManager process
> on start up.
>
>
>
> Cheers,
>
> Till
>
>
>
> On Fri, Jun 14, 2019 at 5:56 PM John Smith <java.dev....@gmail.com> wrote:
>
> I looked into start-cluster.sh and I don't see anything special. So
> technically it should be as easy as installing systemd services to run
> jobmanager.sh and taskmanager.sh respectively?
>
>
>
> On Wed, 12 Jun 2019 at 13:02, John Smith <java.dev....@gmail.com> wrote:
>
> The installation instructions do not indicate how to create systemd
> services.
>
>
>
> 1- When task nodes fail, will the job leader detect this and ssh and
> restart the task node? From my testing it doesn't seem like it.
>
> 2- How do we recover a lost node? Do we simply go back to the master node
> and run start-cluster.sh and the script is smart enough to figure out what
> is missing?
>
> 3- Or do we need to create systemd services, and if so, which command
> should the service run?
>
>