Could you also check the jobmanager logs to see whether the Flink Akka endpoint is bound to and listening on the hostname "prod-bigd-dn11"? Otherwise, all the packets from the taskmanager will be discarded.
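
A minimal sketch of that log check, assuming the jobmanager log has been saved to a local file. It only extracts addresses of the form `akka.tcp://flink@<host>:<port>`, as they appear in the fragment quoted below. The function names are hypothetical, and which log line reports the bound address depends on the Flink version, so treat the output as a hint rather than proof.

```python
import re

# Matches addresses of the form akka.tcp://flink@<host>:<port>, as they
# appear in the log fragment quoted in this thread.
AKKA_ADDR = re.compile(r"akka\.tcp://flink@([^:/\s]+):(\d+)")


def akka_endpoints(log_text):
    """Return the set of (host, port) pairs mentioned in the log text."""
    return {(host, int(port)) for host, port in AKKA_ADDR.findall(log_text)}


def mentions_only_host(log_text, expected_host):
    """True if the log has Akka addresses and all of them use expected_host."""
    endpoints = akka_endpoints(log_text)
    return bool(endpoints) and all(host == expected_host for host, _ in endpoints)


# Example (hypothetical file name):
#   text = open("jobmanager.log", encoding="utf-8", errors="replace").read()
#   print(sorted(akka_endpoints(text)))
#   print(mentions_only_host(text, "prod-bigd-dn11"))
```

If the log mentions both a short name and a fully qualified name for the same machine, `mentions_only_host` returns False. That mismatch is the case worth looking at, since the address the sender uses has to match the one the endpoint is bound to.
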
Best,
Yang

Vitaliy Semochkin <vitaliy...@gmail.com> wrote on Fri, Mar 27, 2020 at 3:35 PM:

> Hello Zhu,
>
> The host can be resolved and there are no firewalls in the cluster, so
> all ports are open.
>
> Regards,
> Vitaliy
>
> On Fri, Mar 27, 2020 at 8:32 AM Zhu Zhu <reed...@gmail.com> wrote:
>
>> Hi Vitaliy,
>>
>> >> *Cannot serve slot request, no ResourceManager connected*
>> This is not a problem. It just means the JM needs the RM to be connected
>> before it can send slot requests.
>>
>> >> *Could not resolve ResourceManager address
>> akka.tcp://flink@prod-bigd-dn11:43757/user/resourcemanager*
>> This should be the root cause. Would you check whether the hostname
>> *prod-bigd-dn11* is resolvable, and whether port 43757 on that
>> machine is accessible?
>>
>> Thanks,
>> Zhu Zhu
>>
>> Vitaliy Semochkin <vitaliy...@gmail.com> wrote on Fri, Mar 27, 2020 at 1:54 AM:
>>
>>> Hi,
>>>
>>> I'm facing an issue similar to
>>> https://issues.apache.org/jira/browse/FLINK-14074
>>> The job starts and then the YARN logs report "*Could not resolve
>>> ResourceManager address akka.tcp://flink*"
>>>
>>> A fragment from the YARN logs looks like this:
>>>
>>> LazyFromSourcesSchedulingStrategy]
>>> 16:54:21,279 INFO
>>> org.apache.flink.runtime.executiongraph.ExecutionGraph - Job Flink
>>> Java Job at Thu Mar 26 16:54:09 CET 2020 (9817283f911d83a6d278cc39d17d6b11)
>>> switched from state CREATED to RUNNING.
>>> 16:54:21,287 INFO
>>> org.apache.flink.runtime.executiongraph.ExecutionGraph - CHAIN
>>> DataSource (MailEvent; EMC; 2019-12-01 - 2020-01-01; null - 1578182400000)
>>> -> FlatMap (SplitDuplicate) -> FlatMap (Create MailEvent) -> Filter
>>> (EventDateTimeRangeFilter) -> Filter (TrackingStatusesFilter) -> FlatMap
>>> (Get mail item by EMC event) -> Map (Map IntraregionalVolumeItem data set
>>> from EMC events) (1/3) (5482b0e6ae1d64d9b0918ec15599211f) switched from
>>> CREATED to SCHEDULED.
>>> 16:54:21,287 INFO
>>> org.apache.flink.runtime.executiongraph.ExecutionGraph - CHAIN
>>> DataSource (MailEvent; EMC; 2019-12-01 - 2020-01-01; null - 1578182400000)
>>> -> FlatMap (SplitDuplicate) -> FlatMap (Create MailEvent) -> Filter
>>> (EventDateTimeRangeFilter) -> Filter (TrackingStatusesFilter) -> FlatMap
>>> (Get mail item by EMC event) -> Map (Map IntraregionalVolumeItem data set
>>> from EMC events) (2/3) (5c993710423eea47ae66f833b2999530) switched from
>>> CREATED to SCHEDULED.
>>> 16:54:21,287 INFO
>>> org.apache.flink.runtime.executiongraph.ExecutionGraph - CHAIN
>>> DataSource (MailEvent; EMC; 2019-12-01 - 2020-01-01; null - 1578182400000)
>>> -> FlatMap (SplitDuplicate) -> FlatMap (Create MailEvent) -> Filter
>>> (EventDateTimeRangeFilter) -> Filter (TrackingStatusesFilter) -> FlatMap
>>> (Get mail item by EMC event) -> Map (Map IntraregionalVolumeItem data set
>>> from EMC events) (3/3) (23cfa30fba857b2c75ba76a21c7d4972) switched from
>>> CREATED to SCHEDULED.
>>> 16:54:21,287 INFO
>>> org.apache.flink.runtime.executiongraph.ExecutionGraph - CHAIN
>>> DataSource (MailEvent; EMD; 2019-12-01 - 2020-01-01; null - 1578182400000)
>>> -> FlatMap (SplitDuplicate) -> FlatMap (Create MailEvent) -> Filter
>>> (EventDateTimeRangeFilter) -> Filter (TrackingStatusesFilter) -> FlatMap
>>> (Get mail item by EMD event) -> Map (Map IntraregionalVolumeItem data set
>>> from EMD events) (1/3) (7cc8a395b87e82000184724eb1698ace) switched from
>>> CREATED to SCHEDULED.
>>> 16:54:21,288 INFO
>>> org.apache.flink.runtime.executiongraph.ExecutionGraph - CHAIN
>>> DataSource (MailEvent; EMD; 2019-12-01 - 2020-01-01; null - 1578182400000)
>>> -> FlatMap (SplitDuplicate) -> FlatMap (Create MailEvent) -> Filter
>>> (EventDateTimeRangeFilter) -> Filter (TrackingStatusesFilter) -> FlatMap
>>> (Get mail item by EMD event) -> Map (Map IntraregionalVolumeItem data set
>>> from EMD events) (2/3) (5edfe3d1f509856d17fa0da078cb3f7e) switched from
>>> CREATED to SCHEDULED.
>>> 16:54:21,288 INFO
>>> org.apache.flink.runtime.executiongraph.ExecutionGraph - CHAIN
>>> DataSource (MailEvent; EMD; 2019-12-01 - 2020-01-01; null - 1578182400000)
>>> -> FlatMap (SplitDuplicate) -> FlatMap (Create MailEvent) -> Filter
>>> (EventDateTimeRangeFilter) -> Filter (TrackingStatusesFilter) -> FlatMap
>>> (Get mail item by EMD event) -> Map (Map IntraregionalVolumeItem data set
>>> from EMD events) (3/3) (dd3397f889a3fad1acf4c59f59a93d92) switched from
>>> CREATED to SCHEDULED.
>>> 16:54:21,297 INFO
>>> org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Cannot
>>> serve slot request, no ResourceManager connected. Adding as pending request
>>> [SlotRequestId{b4c6e7357e4620bf2e997c46d7723eb1}]
>>> 16:54:21,301 INFO
>>> org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Cannot
>>> serve slot request, no ResourceManager connected. Adding as pending request
>>> [SlotRequestId{841bbb79b01b5e0d9ae749a03f65c303}]
>>> 16:54:21,301 INFO
>>> org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Cannot
>>> serve slot request, no ResourceManager connected. Adding as pending request
>>> [SlotRequestId{496120465d541ea9fd2ffcec89e2ac3b}]
>>> 16:54:21,304 INFO org.apache.flink.runtime.jobmaster.JobMaster
>>> - Connecting to ResourceManager akka.tcp://
>>> fl...@prod-bigd-dn11.net:43757/user/resourcemanager(00000000000000000000000000000000)
>>> 16:54:21,307 INFO org.apache.flink.runtime.jobmaster.JobMaster
>>> - Could not resolve ResourceManager address
>>> akka.tcp://flink@prod-bigd-dn11:43757/user/resourcemanager, retrying in
>>> 10000 ms: Could not connect to rpc endpoint under address akka.tcp://
>>> fl...@prod-bigd-dn11.net:43757/user/resourcemanager..
>>> 16:54:31,322 INFO org.apache.flink.runtime.jobmaster.JobMaster
>>> - Could not resolve ResourceManager address
>>> akka.tcp://flink@prod-bigd-dn11:43757/user/resourcemanager, retrying in
>>> 10000 ms: Could not connect to rpc endpoint under address
>>> akka.tcp://flink@prod-bigd-dn11:43757/user/resourcemanager..
>>>
>>> What can cause the following problems?
>>> *Cannot serve slot request, no ResourceManager connected*
>>> *Could not resolve ResourceManager address
>>> akka.tcp://flink@prod-bigd-dn11:43757*
>>>
>>> Regards,
>>> Vitaliy
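
For reference, the two checks Zhu Zhu asks for in this thread (whether the hostname resolves, and whether the port accepts connections) could be run from the jobmanager machine with something like the sketch below. It uses only the Python standard library. The function name and the timeout value are assumptions, not anything Flink provides.

```python
import socket


def check_endpoint(host, port, timeout=3.0):
    """Return (resolved_ips, reachable) for host:port.

    resolved_ips is empty when the name does not resolve.
    reachable is True only if a TCP connection could be opened.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return [], False
    ips = sorted({info[4][0] for info in infos})
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return ips, True
    except OSError:
        return ips, False


# Example, using the address from the log fragment above:
#   print(check_endpoint("prod-bigd-dn11", 43757))
```

A successful TCP connection only shows that something is listening on that port. It does not show that the Akka endpoint accepts messages addressed to that hostname, so this complements the jobmanager log check rather than replacing it.
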