Questions about standalone cluster configuration:

1. Is it considered bad practice to co-locate standby JobManagers on the same machines as TaskManagers?

2. Is it considered bad practice to install ZooKeeper on the same machines as the leader and standby JobManagers? (The docs say "In production setups, it is recommended to manage your own ZooKeeper installation.", but I'm assuming it's still okay to co-locate ZK with the JobManagers?)

3. In another thread, I read that the rule of thumb is taskmanager.numberOfTaskSlots = number of cores. Doesn't this ignore cases where threads spend a high proportion of their time idle (e.g. waiting on an I/O call)? If the total number of task slots limits my degree of parallelism, but most parallel copies of a subtask are idle at any given time, it seems that I would want the number of task slots to be some multiple of the number of cores.
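To make the arithmetic behind question 3 concrete, here is a rough sketch of the sizing I have in mind. It is not a Flink API or an official recommendation: the function name `suggested_task_slots` and the timing numbers are hypothetical, and it borrows the generic thread-pool heuristic of scaling the core count by (1 + wait time / compute time).

```python
def suggested_task_slots(cores: int, compute_ms: int, wait_ms: int) -> int:
    """Hypothetical heuristic: cores * (1 + wait/compute), rounded up.

    compute_ms and wait_ms are assumed average per-record times spent
    on the CPU and blocked on I/O, respectively. Neither is something
    Flink reports under these names.
    """
    if cores < 1 or compute_ms < 1 or wait_ms < 0:
        raise ValueError("cores and compute_ms must be >= 1, wait_ms >= 0")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-cores * (compute_ms + wait_ms) // compute_ms)


# Assumed example: 8 cores, 10 ms of CPU work and 30 ms of I/O wait per record.
slots = suggested_task_slots(cores=8, compute_ms=10, wait_ms=30)
print(f"taskmanager.numberOfTaskSlots: {slots}")
```

With no I/O wait the result collapses to the number of cores, which matches the rule of thumb; with the assumed 3:1 wait-to-compute ratio it gives 4x the core count. Whether such a multiple is actually advisable is exactly what I am asking.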
Thanks, Edward