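Hi Vitaliy,

>> *Cannot serve slot request, no ResourceManager connected*

This is not a problem in itself. It only means that the JM (JobMaster) must be connected to the RM (ResourceManager) before it can send slot requests; until then, the requests are kept as pending.
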
>> *Could not resolve ResourceManager address akka.tcp://flink@prod-bigd-dn11:43757/user/resourcemanager*

This should be the root cause. Could you check whether the hostname *prod-bigd-dn11* is resolvable, and whether port 43757 on that machine is accessible?

Thanks,
Zhu Zhu

Vitaliy Semochkin <vitaliy...@gmail.com> wrote on Fri, Mar 27, 2020 at 1:54 AM:

> Hi,
>
> I'm facing an issue similar to
> https://issues.apache.org/jira/browse/FLINK-14074
> Job starts and then yarn logs report "*Could not resolve ResourceManager
> address akka.tcp://flink*"
>
> A fragment from yarn logs looks like this:
>
> LazyFromSourcesSchedulingStrategy]
> 16:54:21,279 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Job Flink Java Job at Thu Mar 26 16:54:09 CET 2020
> (9817283f911d83a6d278cc39d17d6b11) switched from state CREATED to RUNNING.
> 16:54:21,287 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
> - CHAIN DataSource (MailEvent; EMC; 2019-12-01 - 2020-01-01; null -
> 1578182400000) -> FlatMap (SplitDuplicate) -> FlatMap (Create MailEvent) ->
> Filter (EventDateTimeRangeFilter) -> Filter (TrackingStatusesFilter) ->
> FlatMap (Get mail item by EMC event) -> Map (Map IntraregionalVolumeItem
> data set from EMC events) (1/3) (5482b0e6ae1d64d9b0918ec15599211f) switched
> from CREATED to SCHEDULED.
> 16:54:21,287 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
> - CHAIN DataSource (MailEvent; EMC; 2019-12-01 - 2020-01-01; null -
> 1578182400000) -> FlatMap (SplitDuplicate) -> FlatMap (Create MailEvent) ->
> Filter (EventDateTimeRangeFilter) -> Filter (TrackingStatusesFilter) ->
> FlatMap (Get mail item by EMC event) -> Map (Map IntraregionalVolumeItem
> data set from EMC events) (2/3) (5c993710423eea47ae66f833b2999530) switched
> from CREATED to SCHEDULED.
> 16:54:21,287 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
> - CHAIN DataSource (MailEvent; EMC; 2019-12-01 - 2020-01-01; null -
> 1578182400000) -> FlatMap (SplitDuplicate) -> FlatMap (Create MailEvent) ->
> Filter (EventDateTimeRangeFilter) -> Filter (TrackingStatusesFilter) ->
> FlatMap (Get mail item by EMC event) -> Map (Map IntraregionalVolumeItem
> data set from EMC events) (3/3) (23cfa30fba857b2c75ba76a21c7d4972) switched
> from CREATED to SCHEDULED.
> 16:54:21,287 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
> - CHAIN DataSource (MailEvent; EMD; 2019-12-01 - 2020-01-01; null -
> 1578182400000) -> FlatMap (SplitDuplicate) -> FlatMap (Create MailEvent) ->
> Filter (EventDateTimeRangeFilter) -> Filter (TrackingStatusesFilter) ->
> FlatMap (Get mail item by EMD event) -> Map (Map IntraregionalVolumeItem
> data set from EMD events) (1/3) (7cc8a395b87e82000184724eb1698ace) switched
> from CREATED to SCHEDULED.
> 16:54:21,288 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
> - CHAIN DataSource (MailEvent; EMD; 2019-12-01 - 2020-01-01; null -
> 1578182400000) -> FlatMap (SplitDuplicate) -> FlatMap (Create MailEvent) ->
> Filter (EventDateTimeRangeFilter) -> Filter (TrackingStatusesFilter) ->
> FlatMap (Get mail item by EMD event) -> Map (Map IntraregionalVolumeItem
> data set from EMD events) (2/3) (5edfe3d1f509856d17fa0da078cb3f7e) switched
> from CREATED to SCHEDULED.
> 16:54:21,288 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
> - CHAIN DataSource (MailEvent; EMD; 2019-12-01 - 2020-01-01; null -
> 1578182400000) -> FlatMap (SplitDuplicate) -> FlatMap (Create MailEvent) ->
> Filter (EventDateTimeRangeFilter) -> Filter (TrackingStatusesFilter) ->
> FlatMap (Get mail item by EMD event) -> Map (Map IntraregionalVolumeItem
> data set from EMD events) (3/3) (dd3397f889a3fad1acf4c59f59a93d92) switched
> from CREATED to SCHEDULED.
> 16:54:21,297 INFO
> org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Cannot
> serve slot request, no ResourceManager connected. Adding as pending request
> [SlotRequestId{b4c6e7357e4620bf2e997c46d7723eb1}]
> 16:54:21,301 INFO
> org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Cannot
> serve slot request, no ResourceManager connected. Adding as pending request
> [SlotRequestId{841bbb79b01b5e0d9ae749a03f65c303}]
> 16:54:21,301 INFO
> org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Cannot
> serve slot request, no ResourceManager connected. Adding as pending request
> [SlotRequestId{496120465d541ea9fd2ffcec89e2ac3b}]
> 16:54:21,304 INFO org.apache.flink.runtime.jobmaster.JobMaster
> - Connecting to ResourceManager akka.tcp://
> fl...@prod-bigd-dn11.net:43757/user/resourcemanager(00000000000000000000000000000000)
> 16:54:21,307 INFO org.apache.flink.runtime.jobmaster.JobMaster
> - Could not resolve ResourceManager address
> akka.tcp://flink@prod-bigd-dn11:43757/user/resourcemanager, retrying in
> 10000 ms: Could not connect to rpc endpoint under address akka.tcp://
> fl...@prod-bigd-dn11.net:43757/user/resourcemanager..
> 16:54:31,322 INFO org.apache.flink.runtime.jobmaster.JobMaster
> - Could not resolve ResourceManager address
> akka.tcp://flink@prod-bigd-dn11:43757/user/resourcemanager, retrying in
> 10000 ms: Could not connect to rpc endpoint under address
> akka.tcp://flink@prod-bigd-dn11:43757/user/resourcemanager..
>
> What can cause following problems?
> *Cannot serve slot request, no ResourceManager connected*
> *Could not resolve ResourceManager address
> akka.tcp://flink@prod-bigd-dn11:43757*
>
> Regards,
> Vitaliy
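
One quick way to run the two checks suggested in this thread (is the hostname resolvable, is the port accessible) is a small script like the one below. It is only a sketch: the function name `check_endpoint` and the shape of the returned dict are made up for illustration, and the result only reflects the machine the script runs on, so it should be run from the host where the JobMaster container is running.

```python
import socket


def check_endpoint(host: str, port: int, timeout: float = 3.0) -> dict:
    """Report whether `host` resolves and whether `port` accepts TCP connections.

    Hypothetical helper for illustration; not part of Flink or YARN.
    """
    result = {"host": host, "port": port, "resolved": [],
              "reachable": False, "error": None}

    # Step 1: name resolution (the same kind of lookup a client needs
    # before it can open a connection to the ResourceManager address).
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        result["error"] = f"DNS resolution failed: {exc}"
        return result
    result["resolved"] = sorted({info[4][0] for info in infos})

    # Step 2: plain TCP connect, to see whether the port is open and
    # not blocked by a firewall between the two machines.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["reachable"] = True
    except OSError as exc:
        result["error"] = f"TCP connect failed: {exc}"
    return result
```

For the address in the logs above, the call would be `check_endpoint("prod-bigd-dn11", 43757)`. An error starting with "DNS resolution failed" points at the hostname (DNS or /etc/hosts on that node), while "TCP connect failed" points at the port (firewall, or the RM process no longer listening on it).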