Hi Greg,

I guess I restarted the cluster too fast, combined with high CPU load inside the cluster. I tested it again a few minutes ago and there was no issue! With "$ jps" I checked whether any Java processes were still running -> there weren't.
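For reference, the check on each node looks like this (the PID is illustrative); a fully stopped machine should list nothing but jps itself, while a leftover Flink daemon would show up under its main class name, e.g. TaskManager:

    $ jps
    24017 Jps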
But if the master doesn't know slave5, how can slave5 reconnect to the JobManager? That would mean the JobManager "adopts a child".

Marc

> On 11.08.2017 at 20:27, Greg Hogan <c...@greghogan.com> wrote:
>
> Hi Marc,
>
> By chance did you edit the slaves file before shutting down the cluster? If so, then the removed worker would not be stopped and would reconnect to the restarted JobManager.
>
> Greg
>
>
>> On Aug 11, 2017, at 11:25 AM, Kaepke, Marc <marc.kae...@haw-hamburg.de> wrote:
>>
>> Hi,
>>
>> I have a cluster of 4 dedicated machines (no VMs). My previous config was: 1 master and 3 slaves. Each machine ran either a task manager or the job manager.
>>
>> Now I want to reduce my cluster to 1 master and 3 slaves, where one machine runs the JobManager and a TaskManager in parallel. I changed all conf/slaves files. When I start my cluster, everything looks fine for 2 seconds -> one JM and 3 TMs with 8 cores/slots each. Two seconds later I see 4 task managers and one JM. I can also run a job with 32 slots (4 TM * 8 slots) without any errors.
>>
>> Why does my cluster have 4 task managers?! All slaves files are cleaned up and contain 3 entries.
>>
>>
>> Thanks!
>>
>> Marc
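To illustrate Greg's point about ordering: bin/stop-cluster.sh reads conf/slaves to find the workers it has to stop, so the file must still list every running TaskManager at shutdown time. A minimal sketch of the safe resize sequence (run from the Flink directory on the master; "slave5" stands in for the removed host and the path is hypothetical):

    $ bin/stop-cluster.sh      # stop while conf/slaves still lists all workers
    $ vi conf/slaves           # only now remove slave5 from the list
    $ bin/start-cluster.sh     # start with the reduced set of workers

    # if a TaskManager is already orphaned on the removed host, stop it there:
    $ ssh slave5 'cd /path/to/flink && bin/taskmanager.sh stop'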