Hello,

I'm trying to solve a problem using async-io-api. I'm running a flinkjob
which has a sleep time of 120sec during restart. After every deployment i
see around 10K file load time out. I suspect it is because of the TimeOut
parameter being *30 sec*. I played with that number and found out that
150sec gives 0 file read timeout. I tried changing Capacity as well (from
100 to 1000) but that didn't make any difference. That's surprising to me
since I thought that when the flinkjob restarts (from savepoint) it has a
backlog (of 120 sec) of file read requests and increasing Capacity will
make file read faster but that's not the case.

I have read this :
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/asyncio.html#async-io-api
but i'm still not clear.

I have a question
a) for the async-io-api, why is changing TimeOut is working instead of
Capacity?

Here is the Configuration of AsyncDataSteam.

Before

DataStream<RawData> fileStream = AsyncDataStream
        .unorderedWait(datasetStream,
                new AsyncS3Load(),
                30,
                TimeUnit.SECONDS,
                100)


After

DataStream<RawData> fileStream = AsyncDataStream
        .unorderedWait(datasetStream,
                new AsyncS3Load(),
                150,
                TimeUnit.SECONDS,
                100)


Thanks!

Reply via email to