The issue has been resolved, as I said in the previous email. It was caused
by the async function: every record being processed by the async function is
held as state in the async operator, and my record type (UserAccessLog)
contains a map.
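To illustrate the failure mode for anyone finding this thread later, here is a minimal sketch in plain Java, with no Flink involved. It assumes the snapshot iterates the record's map while the async callback writes into the same map. The AccessLog class, the method names, and the sample values are hypothetical stand-ins, not the real UserAccessLog or any Flink API. The second method shows one possible way to avoid the problem: leave the input record untouched and emit an enriched copy instead.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class SnapshotMutationSketch {

    /** Hypothetical stand-in for UserAccessLog: a record holding a mutable map. */
    public static final class AccessLog {
        private final Map<String, String> d;

        public AccessLog(Map<String, String> d) {
            this.d = d;
        }

        public Map<String, String> getD() {
            return d;
        }
    }

    private static AccessLog sampleRecord() {
        Map<String, String> d = new HashMap<>();
        d.put("uid", "42");
        d.put("ip", "10.0.0.1");
        return new AccessLog(d);
    }

    /**
     * Simulates a snapshot iterating the record's map while a callback adds a
     * key to that same map. Returns true if the iteration fails.
     */
    public static boolean snapshotFailsWhenMutatedInPlace() {
        AccessLog input = sampleRecord();
        try {
            for (Map.Entry<String, String> ignored : input.getD().entrySet()) {
                // Stand-in for the async callback writing into the live map.
                input.getD().put("ipLabel", "office");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Builds a new record with a copied map instead of mutating the input. */
    public static AccessLog enrichedCopy(AccessLog input, String label) {
        Map<String, String> copy = new HashMap<>(input.getD());
        copy.put("ipLabel", label);
        return new AccessLog(copy);
    }

    /**
     * Same simulated snapshot, but the callback only touches a copy, so the
     * map being iterated is never modified. Returns true if that holds.
     */
    public static boolean snapshotSucceedsWithCopy() {
        AccessLog input = sampleRecord();
        AccessLog output = null;
        for (Map.Entry<String, String> ignored : input.getD().entrySet()) {
            output = enrichedCopy(input, "office");
        }
        return output != null
                && output.getD().containsKey("ipLabel")
                && !input.getD().containsKey("ipLabel");
    }

    public static void main(String[] args) {
        System.out.println("in-place mutation breaks snapshot: "
                + snapshotFailsWhenMutatedInPlace());
        System.out.println("copy keeps snapshot intact: "
                + snapshotSucceedsWithCopy());
    }
}
```

In the real job the mutation happens on a different thread, so the failure is intermittent; the single-threaded loop above only makes it reproducible.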
On Mon, Aug 23, 2021 at 11:26 PM Arvid Heise wrote:
I don't see anything suspicious in your code. The stacktrace is also for a
MapSerializer. Do you have another operator where you put Map into a custom
state?
On Fri, Aug 20, 2021 at 6:43 PM yidan zhao wrote:
But I do not know why this leads to the job's failure and recovery,
since I have set the tolerable failed checkpoints to Integer.MAX_VALUE.
After the failure, the task cancellation timed out, and about 80% of my
task managers went down because of that cancel timeout.
yidan zhao 于202
OK, thanks. I have some findings and would like you to confirm them. Here
is the code in question:
The async function's implementation. It performs an async Redis query and
fills some data back into the record.
In the code [ currentBatch.get(i).getD().put("ipLabel",
objects.getResponses().get(i)); ], getD() returns a map attribute in
Origin
Essentially this exception means that the state was modified while a
snapshot was being taken.
We usually see this when users hold on to some state value beyond a
single call to a user-defined function, particularly from different threads.
We may be able to pinpoint the issue if you were to p
The Flink web UI shows the exception as follows.
In the task (ual_transform_UserLogBlackUidJudger ->
ual_transform_IpLabel), the first operator is a broadcast process
function and the second is an async function. I do not know
whether the issue is related to that.
And the issues not occurred b