Hi all, we are currently on a very old version of Flink (1.4.0), and it has worked pretty well. Lately, however, we have been facing checkpoint timeout issues. We would like to go ahead with a migration while keeping changes to the current pipelines to a minimum. With that in mind, our first pick was to migrate to 1.5.6 and move to a newer version later.
Do you guys think a more recent version like 1.6 or 1.7 might work? We did try 1.8 but it requires some changes in the pipelines. When we tried 1.5.6 with docker compose we were unable to get the task manager attached to jobmanager. Are there some specific configurations required for newer versions?

Logs:

8-28 07:36:30.834 [main] INFO org.apache.flink.runtime.util.LeaderRetrievalUtils - TaskManager will try to connect for 10000 milliseconds before falling back to heuristics
2020-08-28 07:36:30.853 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Retrieved new target address jobmanager/172.21.0.8:6123.
2020-08-28 07:36:31.279 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Trying to connect to address jobmanager/172.21.0.8:6123
2020-08-28 07:36:31.280 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address 'e6f9104cdc61/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.281 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.281 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.282 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/127.0.0.1': Invalid argument (connect failed)
2020-08-28 07:36:31.283 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.284 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/127.0.0.1': Invalid argument (connect failed)
2020-08-28 07:36:31.684 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Trying to connect to address jobmanager/172.21.0.8:6123
2020-08-28 07:36:31.686 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address 'e6f9104cdc61/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.687 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.688 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.688 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/127.0.0.1': Invalid argument (connect failed)
2020-08-28 07:36:31.689 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.690 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/127.0.0.1': Invalid argument (connect failed)
2020-08-28 07:36:32.490 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Trying to connect to address jobmanager/172.21.0.8:6123
2020-08-28 07:36:32.491 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address 'e6f9104cdc61/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:32.493 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:32.494 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:32.495 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/127.0.0.1': Invalid argument (connect failed)
2020-08-28 07:36:32.496 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:32.497 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/127.0.0.1': Invalid argument (connect failed)
2020-08-28 07:36:34.099 [main] INFO org.apache.flink.runtime.net.ConnectionUtils - Trying to connect to address jobmanager/172.21.0.8:6123
2020-08-28 07:36:34.100 [main] INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - TaskManager will use hostname/address 'e6f9104cdc61' (172.21.0.9) for communication.

Flink Conf:

jobmanager.rpc.address: jobmanager
rest.address: jobmanager

Thanks