Heart Zhou created FLINK-37607:
----------------------------------

             Summary: Blocklist timeout check may be lost
                 Key: FLINK-37607
                 URL: https://issues.apache.org/jira/browse/FLINK-37607
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Coordination
    Affects Versions: 1.20.1
            Reporter: Heart Zhou


The blocklist timeout check may be scheduled before the RPC server starts.



The blocklist timeout check is scheduled on the mainThreadExecutor in the 
DefaultBlocklistHandler constructor.
{code:java}
DefaultBlocklistHandler(xxx,
        Duration timeoutCheckInterval,
        ComponentMainThreadExecutor mainThreadExecutor,
        xxx) {
    xxx
    this.timeoutCheckInterval = checkNotNull(timeoutCheckInterval);
    this.mainThreadExecutor = checkNotNull(mainThreadExecutor);
    xxx

    scheduleTimeoutCheck();
} {code}
 

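When the scheduled check fires, the 
org.apache.flink.runtime.rpc.RpcEndpoint#start method may not have been called 
yet, although it will be called very soon. As far as I can tell, an endpoint 
that has not been started discards the messages it receives, so the runAsync 
call that carries the check is dropped.

Therefore, the check function might be lost. If each check schedules the next 
one only from within its own run, which appears to be the case, a single 
dropped check means the timeout check never runs again.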

 
{code:java}
public ScheduledFuture<?> schedule(Runnable command, long delay, TimeUnit unit) {
    final long delayMillis = TimeUnit.MILLISECONDS.convert(delay, unit);
    FutureTask<Void> ft = new FutureTask<>(command, null);
    if (mainScheduledExecutor.isShutdown()) {
        log.warn(
                "The scheduled executor service is shutdown and ignores the command {}",
                command);
    } else {
        mainScheduledExecutor.schedule(
                () -> gateway.runAsync(ft), delayMillis, TimeUnit.MILLISECONDS);
    }
    return new ScheduledFutureAdapter<>(ft, delayMillis, TimeUnit.MILLISECONDS);
}{code}
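
The race can be illustrated with a small toy model. This is only a sketch and 
not Flink code: ToyEndpoint, ToyHandler and simulate are hypothetical names, 
and the model rests on two assumptions, namely that an endpoint drops runAsync 
messages until start() has been called, and that each check re-arms the timer 
only when it actually runs. Timers are fired by hand so the result does not 
depend on real timing.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model only; none of these classes are Flink classes.
final class LostTimeoutCheckDemo {

    /** Hypothetical endpoint, assumed to discard messages until start() is called. */
    static final class ToyEndpoint {
        private final Queue<Runnable> mailbox = new ArrayDeque<>();
        private boolean started;

        void start() {
            started = true;
        }

        void runAsync(Runnable message) {
            if (started) {
                mailbox.add(message);
            }
            // Not started yet: the message is silently dropped (assumption).
        }

        void drain() {
            Runnable message;
            while ((message = mailbox.poll()) != null) {
                message.run();
            }
        }
    }

    /** Hypothetical handler that schedules its first check in the constructor. */
    static final class ToyHandler {
        private final ToyEndpoint endpoint;
        private final Queue<Runnable> timers;
        int checksRun;

        ToyHandler(ToyEndpoint endpoint, Queue<Runnable> timers) {
            this.endpoint = endpoint;
            this.timers = timers;
            scheduleTimeoutCheck();
        }

        private void scheduleTimeoutCheck() {
            // When the timer fires it hands the check to the endpoint; the check
            // re-arms the timer only if it actually runs (assumption).
            timers.add(() -> endpoint.runAsync(() -> {
                checksRun++;
                scheduleTimeoutCheck();
            }));
        }
    }

    /** Fires pending timers for the given number of ticks and returns the checks that ran. */
    static int simulate(boolean startBeforeFirstTimer, int ticks) {
        ToyEndpoint endpoint = new ToyEndpoint();
        Queue<Runnable> timers = new ArrayDeque<>();
        ToyHandler handler = new ToyHandler(endpoint, timers);
        if (startBeforeFirstTimer) {
            endpoint.start();
        }
        for (int i = 0; i < ticks; i++) {
            Runnable timer = timers.poll();
            if (timer != null) {
                timer.run();
            }
            endpoint.start(); // start() arrives "very soon" in either case
            endpoint.drain();
        }
        return handler.checksRun;
    }

    public static void main(String[] args) {
        System.out.println("start first: " + simulate(true, 3) + " checks");
        System.out.println("timer first: " + simulate(false, 3) + " checks");
    }
}
```

In this model, starting the endpoint before the first timer fires yields one 
check per tick, while letting the timer fire first yields no checks at all, 
even though the endpoint is started immediately afterwards.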
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
