Related source code: https://github.com/apache/flink/blob/master/flink-dist/src/main/flink-bin/bin/start-cluster.sh#L40
On Wed, Mar 7, 2018 at 2:11 AM, Yesheng Ma <kimi.y...@gmail.com> wrote: > Hi Nico, > > Thanks for your reply. My major concern is actually the `-l` argument. > The command I executed is: `nohup /bin/bash -x -l > "/state/partition1/ysma/flink-1.4.1/bin/jobmanager.sh" start cluster > dell-01.epcc 8091`, with and without the `-l` argument (the script in > Flink's bin directory uses the `-l` argument). > > 1) with the `-l` argument: the log is quite messy, but there are some > clue, the last executed command starts a zsh shell: > ``` > + . /home/ysma/.bashrc > ++ case $- in > ++ return > + PATH=/home/ysma/bin:/home/ysma/.local/bin:/state/ > partition1/ysma/redis-4.0.8/../bin:/home/ysma/env/jdk1.8.0_ > 151/bin:/home/ysma/env/maven/bin:/home/ysma/bin:/home/ysma/ > .local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/ > bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin > + '[' -f /bin/zsh ']' > + exec /bin/zsh -l > ``` > I guess the bash -l arguments detects the user's login shell and then logs > in a zsh shell (which I'm currently using) and never back. > > 2) without the `-l` argument, everything just goes fine. > > Therefore I suspect there might be something wrong with the `-l` argument, > or something wrong with my bash config? Any ideas? Thanks! > > > On Wed, Mar 7, 2018 at 12:20 AM, Nico Kruber <n...@data-artisans.com> > wrote: > >> Hi Yesheng, >> `nohup /bin/bash -l bin/jobmanager.sh start cluster ...` looks a bit >> strange since it should (imho) be an absolute path towards flink. >> >> What you could do to diagnose further, is to try to run the ssh command >> manually, i.e. figure out what is being executed by calling >> bash -x ./bin/start-cluster.sh >> and then run the ssh command without "-n" and not in background "&". >> Then you should also see the JobManager stdout to diagnose further. >> >> If that does not help yet, please log into the master manually and >> execute the "nohup /bin/bash..." command there to see what is going on. >> >> Depending on where the failure was, there may even be logs on the master >> machine. >> >> >> Nico >> >> On 04/03/18 15:52, Yesheng Ma wrote: >> > Hi all, >> > >> > When I execute bin/start-cluster.sh on the master machine, actually >> > the command `nohup /bin/bash -l bin/jobmanager.sh start cluster ...` is >> > exexuted, which does not open the job manager properly. >> > >> > I think there might be something wrong with the `-l` argument, since >> > when I use the `bin/jobmanager.sh start` command, everything is fine. >> > Kindly point out if I've done any configuration wrong. Thanks! >> > >> > Best, >> > Yesheng >> > >> > >> >> >