Hi Claude,

I agree that you should be able to restart individual pods with a changed memory configuration. Can you share the full JobManager log of the failed restart attempt?
I don't think that the log statement you've posted explains a start failure.

Regards,
Robert

On Tue, Nov 3, 2020 at 2:33 AM Claude M <claudemur...@gmail.com> wrote:

> Hello,
>
> I have Flink 1.10.2 installed in a Kubernetes cluster.
> Anytime I make a change to the flink.conf, the Flink jobmanager pod fails
> to restart.
> For example, I modified the following memory setting in the flink.conf:
> jobmanager.memory.flink.size.
> After I deploy the change, the pod fails to restart and the following is
> seen in the log:
>
> WARN
> org.apache.flink.runtime.webmonitor.retriever.impl.RpcGatewayRetriever -
> Error while retrieving the leader gateway. Retrying to connect to
> akka.tcp://flink@flink-jobmanager:50010/user/dispatcher.
>
> The pod can be restored by doing one of the following but these are not
> acceptable solutions:
>
> - Revert the changes made to the flink.conf to the previous settings
> - Remove the Flink Kubernetes deployment before doing a deployment
> - Delete the flink cluster folder in Zookeeper
>
> I don't understand why making any changes in the flink.conf causes this
> problem.
> Any help is appreciated.
>
>
> Thank You