Hi All,

We are currently on a very old version of Flink, 1.4.0, and it has worked
pretty well. But lately we have been facing checkpoint timeout issues, so we
would like to migrate while minimizing changes to the current pipelines.
With that said, our first pick was to migrate to 1.5.6 and later move to a
newer version.

Do you think a more recent version such as 1.6 or 1.7 might work? We did
try 1.8, but it requires some changes to the pipelines.

When we tried 1.5.6 with Docker Compose, we were unable to get the
TaskManager to attach to the JobManager. Are there specific configurations
required for newer versions?

Logs:

2020-08-28 07:36:30.834 [main] INFO
org.apache.flink.runtime.util.LeaderRetrievalUtils  - TaskManager will try
to connect for 10000 milliseconds before falling back to heuristics

2020-08-28 07:36:30.853 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Retrieved new target
address jobmanager/172.21.0.8:6123.

2020-08-28 07:36:31.279 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Trying to connect to
address jobmanager/172.21.0.8:6123

2020-08-28 07:36:31.280 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address 'e6f9104cdc61/172.21.0.9': Connection refused (Connection refused)

2020-08-28 07:36:31.281 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address '/172.21.0.9': Connection refused (Connection refused)

2020-08-28 07:36:31.281 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address '/172.21.0.9': Connection refused (Connection refused)

2020-08-28 07:36:31.282 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address '/127.0.0.1': Invalid argument (connect failed)

2020-08-28 07:36:31.283 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address '/172.21.0.9': Connection refused (Connection refused)

2020-08-28 07:36:31.284 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address '/127.0.0.1': Invalid argument (connect failed)

2020-08-28 07:36:31.684 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Trying to connect to
address jobmanager/172.21.0.8:6123

2020-08-28 07:36:31.686 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address 'e6f9104cdc61/172.21.0.9': Connection refused (Connection refused)

2020-08-28 07:36:31.687 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address '/172.21.0.9': Connection refused (Connection refused)

2020-08-28 07:36:31.688 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address '/172.21.0.9': Connection refused (Connection refused)

2020-08-28 07:36:31.688 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address '/127.0.0.1': Invalid argument (connect failed)

2020-08-28 07:36:31.689 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address '/172.21.0.9': Connection refused (Connection refused)

2020-08-28 07:36:31.690 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address '/127.0.0.1': Invalid argument (connect failed)

2020-08-28 07:36:32.490 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Trying to connect to
address jobmanager/172.21.0.8:6123

2020-08-28 07:36:32.491 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address 'e6f9104cdc61/172.21.0.9': Connection refused (Connection refused)

2020-08-28 07:36:32.493 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address '/172.21.0.9': Connection refused (Connection refused)

2020-08-28 07:36:32.494 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address '/172.21.0.9': Connection refused (Connection refused)

2020-08-28 07:36:32.495 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address '/127.0.0.1': Invalid argument (connect failed)

2020-08-28 07:36:32.496 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address '/172.21.0.9': Connection refused (Connection refused)

2020-08-28 07:36:32.497 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from
address '/127.0.0.1': Invalid argument (connect failed)

2020-08-28 07:36:34.099 [main] INFO
org.apache.flink.runtime.net.ConnectionUtils  - Trying to connect to
address jobmanager/172.21.0.8:6123

2020-08-28 07:36:34.100 [main] INFO
org.apache.flink.runtime.taskexecutor.TaskManagerRunner  - TaskManager will
use hostname/address 'e6f9104cdc61' (172.21.0.9) for communication.


Flink Conf:

jobmanager.rpc.address: jobmanager

rest.address: jobmanager


Thanks
