Hey Ken,

Regarding Rufus, I know he can be a bit eager to change lines ;) If
you want to ignore his changes in git blame, please take a look at [1].

For the main issue, do you mind creating a ticket? I hope someone will
be able to pick it up.
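
In the meantime, if it helps to tell this case apart from a genuine job
failure, the cause chain of the exception thrown by execute() can be
checked for a ConnectException. The following is only a minimal sketch,
not a fix and not Flink API: the class RootCauseCheck and the method
hasCause are hypothetical names, and a plain RuntimeException stands in
for Flink's RetryException so that the example has no dependencies.

```java
import java.net.ConnectException;
import java.util.concurrent.ExecutionException;

public class RootCauseCheck {

    // Walks the cause chain and reports whether any link is an instance of `type`.
    // (Hypothetical helper for illustration; not part of Flink.)
    static boolean hasCause(Throwable t, Class<? extends Throwable> type) {
        // Bound the walk so that a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 64; depth++) {
            if (type.isInstance(t)) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for the exception in the stack trace: RuntimeException
        // replaces Flink's RetryException (an assumption made to keep the
        // sketch dependency-free).
        Throwable failure = new ExecutionException(
                new RuntimeException("Could not complete the operation.",
                        new ConnectException("Connection refused")));

        System.out.println(hasCause(failure, ConnectException.class));
    }
}
```

In the real program the check would go into a catch block around the
execute() call. A match only says that the client could not reach the
Job Manager, so the final job status would still have to be confirmed
some other way, for example from YARN.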

Best,

Dawid


[1] https://nightlies.apache.org/flink/flink-docs-master/docs/flinkdev/ide_setup/#ignoring-refactoring-commits

On 01/10/2021 02:10, Ken Krugler wrote:
> Hi all,
>
> We’ve upgraded from Flink 1.11 to 1.13, and our workflows are now
> sometimes failing with an exception, even though the job has succeeded.
>
> The stack trace for this bit of the exception is:
>
> java.util.concurrent.ExecutionException: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Number of retries has been exhausted.
>     at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>     at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>     at org.apache.flink.client.program.ContextEnvironment.getJobExecutionResult(ContextEnvironment.java:117)
>     at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:74)
>     at my.program.execute.workflow...
>
> The root cause is "java.net.ConnectException: Connection refused",
> returned from the YARN node where the Job Manager is (was) running.
>
> ContextEnvironment.java line 117 is:
>
> jobExecutionResult = jobExecutionResultFuture.get();
>
> This looks like a race condition, where YARN is terminating the Job
> Manager, and this sometimes completes before the main program has
> retrieved all of the job status information.
>
> I’m wondering if this is a side effect of recent changes to make
> execution async/non-blocking.
>
> Is this a known issue? Anything we can do to work around it?
>
> Thanks,
>
> — Ken
>
> PS - The last two people working on this area of the code were Aljoscha and
> Robert (really wish git blame didn’t show most lines as being modified
> by “Rufus Refactor”…sigh)
>
> --------------------------
> Ken Krugler
> http://www.scaleunlimited.com
> Custom big data solutions
> Flink, Pinot, Solr, Elasticsearch
>
