look into when I run into similar
situations
Feel free to get back to the mailing list for further clarifications …
Thias
From: Caizhi Weng
Sent: Thursday, 2 September 2021 04:24
To: Daniel Vol
Cc: user
Subject: Re: Flink restarts on Checkpoint failure
Hi!
There are a ton of possible reasons for a checkpoint failure. The most
likely reasons are:
* The JVM is busy with garbage collection when performing the checkpoints.
This can be checked by looking into the GC logs of a task manager.
* The state suddenly becomes quite large due to some
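For reference, a minimal sketch (assuming a reasonably recent Flink 1.x release; all values are illustrative, not recommendations) of checkpoint settings that address these two cases: a longer checkpoint timeout for GC pauses or state-size spikes, and tolerance for a few failed checkpoints so the job is not restarted on the first failure.

import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointTuningSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000); // trigger a checkpoint every 60 s

        CheckpointConfig config = env.getCheckpointConfig();
        // Give checkpoints more time before they are declared failed, e.g. when the
        // JVM is busy with garbage collection or the state has grown unexpectedly.
        config.setCheckpointTimeout(10 * 60 * 1000);
        // Tolerate a few failed checkpoints instead of failing (and restarting) the job.
        config.setTolerableCheckpointFailureNumber(3);

        env.fromElements(1, 2, 3).print(); // placeholder pipeline
        env.execute("Checkpoint tuning sketch");
    }
}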
Hello,
I see the following error in my jobmanager log (Flink on EMR):
Checking the cluster logs I see:
2021-08-21 17:17:30,489 [Checkpoint Timer] INFO
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering
checkpoint 1 (type=CHECKPOINT) @ 1629566250303 for job
c513e9ebbea4ab72d80b133
It's after a checkpoint failure. I don't know if that includes a restore
from a checkpoint.
I'll take some screenshots when the jobs hit the failure again. All of my
currently running jobs are healthy right now and haven't hit a checkpoint
failure.
On Sun, Jul 18, 20
ink job hits a checkpoint failure (e.g. timeout) and
> then has successful checkpoints, the flink job appears to be in a bad
> state. E.g. some of the operators that previously had a watermark
> start showing "no watermark". The jobs proceed very slowly.
>
> Is there docum
Hi!
This does not sound like an expected behavior. Could you share your code /
SQL and flink configuration so that others can help diagnose the issue?
Dan Hill wrote on Monday, July 19, 2021 at 1:41 PM:
> After my dev flink job hits a checkpoint failure (e.g. timeout) and then
> has successful checkpoint
After my dev flink job hits a checkpoint failure (e.g. timeout) and then
has successful checkpoints, the flink job appears to be in a bad state.
E.g. some of the operators that previously had a watermark start showing
"no watermark". The jobs proceed very slowly.
Is there documentatio
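For what it's worth, one common reason an operator shows "no watermark" is an idle source split or partition holding the watermark back. A minimal sketch (the event type is hypothetical, and this assumes Flink 1.11+ with the WatermarkStrategy API) of declaring idleness handling:

import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;

public class WatermarkSketch {

    // Hypothetical event type with a millisecond timestamp field.
    public static class MyEvent {
        public long timestampMillis;
        public String payload;
    }

    // With idleness handling, a split that stops emitting records is marked idle
    // after one minute instead of holding back the operator watermark (which is
    // what shows up as "No Watermark" in the web UI).
    public static WatermarkStrategy<MyEvent> strategy() {
        return WatermarkStrategy
                .<MyEvent>forBoundedOutOfOrderness(Duration.ofSeconds(30))
                .withIdleness(Duration.ofMinutes(1))
                .withTimestampAssigner((event, previousTimestamp) -> event.timestampMillis);
    }
}

Whether this explains the behaviour after a checkpoint failure is a separate question, but it is a quick thing to rule out.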
Hi, Fabian
Thanks for replying. I created this ticket. It describes how to reproduce the issue
using code in the flink-example package:
https://issues.apache.org/jira/browse/FLINK-22326
Best
Lu
On Fri, Apr 16, 2021 at 1:25 AM Fabian Paul
wrote:
> Hi Lu,
>
> Can you provide some more detailed logs of what
Hi Lu,
Can you provide some more detailed logs of what happened during the
checkpointing phase? If possible, please enable debug logging.
It would also be great to know whether you have implemented your own Iterator
Operator or what kind of Flink program you are trying to execute.
Best,
Hi, Flink Users
When we migrate from Flink 1.9.1 to Flink 1.11, we notice the job will always
fail on checkpoints if it uses an Iterator Operator, no matter whether we use
unaligned checkpoints or not. Those jobs don't have checkpoint issues in
1.9. Is this a known issue? Thank you!
Best
Lu
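For readers unfamiliar with the setup being discussed, here is a minimal sketch (not the FLINK-22326 reproducer) of a streaming job that combines an iteration with checkpointing. Note that iterative jobs have no exactly-once guarantees, and in the 1.9/1.11 line checkpointing can only be enabled for them through the deprecated "force" flag:

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.IterativeStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class IterationCheckpointSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Deprecated "force" flag: required to checkpoint an iterative job at all.
        env.enableCheckpointing(10_000, CheckpointingMode.EXACTLY_ONCE, true);

        DataStream<Long> source = env.generateSequence(0, 1_000);

        // Elements flow through the loop body; positive results are fed back.
        IterativeStream<Long> iteration = source.iterate(5_000); // wait up to 5 s for feedback
        DataStream<Long> minusOne = iteration.map(v -> v - 1).returns(Types.LONG);

        iteration.closeWith(minusOne.filter(v -> v > 0));
        minusOne.filter(v -> v <= 0).print();

        env.execute("Iteration + checkpointing sketch");
    }
}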
state; if the Flink task always fails checkpoints, is the keyed state still cleared
> by the timer?
> Thanks for your reply.
>
Hi community, I am using Flink SQL and I have set the state retention time. As far
as I know, Flink sets a timer per key to clear that key's state. If the Flink task
always fails checkpoints, is the keyed state still cleared by the timer?
Thanks for your reply.
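For context, this is roughly how the retention time in question is configured through the Table API (a sketch; the exact method names and bridge package vary across Flink versions, and newer releases replace this with a single-Duration setIdleStateRetention):

import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class RetentionSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

        // State for a key that has been idle for longer than the minimum retention
        // time becomes eligible for cleanup; Flink registers per-key timers for this.
        tableEnv.getConfig().setIdleStateRetentionTime(Time.hours(12), Time.hours(24));
    }
}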
> *Sent:* Tuesday, August 27, 2019 15:01
> *To:* user
> *Subject:* Re: checkpoint failure suddenly even state size less than 1 mb
>
> Hi team,
> Can anyone help or offer a suggestion? We have now stopped all input in Kafka; there
> is no processing and no sink, but checkpointing is f
Kafka source shows high back pressure.
> 2. Sudden checkpoint failure for entire day until restart.
>
> My job does following thing,
> a. Read from Kafka
> b. Asyncio to external system
> c. Dumping in Cassandra, Elasticsearch
>
> Checkpointing is using file system.
> This
09 pengcheng...@bonc.com.cn, <
>> pengcheng...@bonc.com.cn> wrote:
>>
>>> Hi, what's your checkpoint config?
>>>
>>> --
>>> pengcheng...@bonc.com.cn
>>>
>>>
>>> *From:* Sushant Sawant
Subject: Re: checkpoint failure suddenly even state size less than 1 mb
Hi team,
Can anyone help or offer a suggestion? We have now stopped all input in Kafka; there is no
processing and no sink, but checkpointing is still failing.
Is it the case that once a checkpoint fails, it keeps failing forever until the job restarts?
Help appreciated.
T
p.m., "Sushant Sawant"
wrote:
Hi all,
I'm facing two issues which I believe are correlated.
1. Kafka source shows high back pressure.
2. Sudden checkpoint failure for entire day until restart.
My job does following thing,
a. Read from Kafka
b. Asyncio to external system
c. Dumpin
Hi all,
I'm facing two issues which I believe are correlated.
1. Kafka source shows high back pressure.
2. Sudden checkpoint failures for an entire day until restart.
My job does the following:
a. Read from Kafka
b. Async I/O to an external system
c. Dumping into Cassandra and Elasticsearch
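As a point of reference, here is a sketch of that pipeline shape (the class, host, and port are hypothetical; the Kafka source and the Cassandra/Elasticsearch sinks are replaced by stand-ins). The async timeout and the in-flight request capacity are the knobs that usually couple back pressure to checkpoint duration:

import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class AsyncEnrichmentSketch {

    // Hypothetical enrichment call against an external system.
    public static class EnrichFunction extends RichAsyncFunction<String, String> {
        @Override
        public void asyncInvoke(String input, ResultFuture<String> resultFuture) {
            CompletableFuture
                    .supplyAsync(() -> input + ":enriched") // stand-in for the real async client call
                    .thenAccept(result -> resultFuture.complete(Collections.singleton(result)));
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000); // filesystem checkpoint storage is configured in flink-conf.yaml

        DataStream<String> source = env.socketTextStream("localhost", 9999); // stand-in for the Kafka source

        // Unordered async I/O: at most 100 requests in flight, 30 s timeout per request.
        DataStream<String> enriched = AsyncDataStream.unorderedWait(
                source, new EnrichFunction(), 30, TimeUnit.SECONDS, 100);

        enriched.print(); // stand-in for the Cassandra / Elasticsearch sinks
        env.execute("Async enrichment sketch");
    }
}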
the performance of state backend, etc.
Navneeth Krishnan wrote on Sunday, July 14, 2019 at 5:01 AM:
> Hi All,
>
> Any pointers on the below checkpoint failure scenario. Appreciate all the
> help. Thanks
>
> Thanks
>
> On Sun, Jul 7, 2019 at 9:23 PM Navneeth Krishnan
> wrote:
>
Hi All,
Any pointers on the checkpoint failure scenario below? Appreciate all the
help. Thanks.
On Sun, Jul 7, 2019 at 9:23 PM Navneeth Krishnan
wrote:
> Hi All,
>
> Occasionally I run into failed checkpoints error where 2 or 3 consecutive
> checkpoints fails after running
Hi All,
Occasionally I run into failed checkpoints where 2 or 3 consecutive
checkpoints fail after running for a minute, and then it recovers. This
causes a delay in processing the incoming data, since a huge amount of
data is buffered during the failed checkpoints. I don't see any erro
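A small sketch (values illustrative, assuming Flink 1.9+) of two settings that often help when consecutive checkpoints time out while data backs up: a minimum pause between attempts so the job can drain the buffered data, and no overlapping checkpoints:

import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ConsecutiveFailureSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000);

        CheckpointConfig config = env.getCheckpointConfig();
        // Leave breathing room between checkpoint attempts so the job can catch up
        // on data that was buffered while a slow checkpoint was in flight.
        config.setMinPauseBetweenCheckpoints(30_000);
        // Never run overlapping checkpoints.
        config.setMaxConcurrentCheckpoints(1);

        env.fromElements("a", "b", "c").print(); // placeholder pipeline
        env.execute("Consecutive checkpoint failure sketch");
    }
}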
Can you provide us with the TaskManager logs?
On 05.06.2018 12:30, James (Jian Wu) [FDS Data Platform] wrote:
Hi:
I am using Flink streaming continuous query.
Scenario:
Kafka-connector to consume a topic, and streaming incremental
calculate 24 hours window data. And use processingTime a
Hi:
I am using a Flink streaming continuous query.
Scenario:
A Kafka connector consumes a topic, and the stream incrementally calculates
24-hour window data, using processingTime as the TimeCharacteristic. I am using
RocksDB as the StateBackend, the file system is HDFS, and the checkpoint interval is 5
minu
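A sketch of that setup (the HDFS path and port are hypothetical; the Kafka source is replaced by a socket source and the aggregation is a placeholder), assuming a Flink version where RocksDBStateBackend and setStreamTimeCharacteristic are available and flink-statebackend-rocksdb is on the classpath. The reduce keeps the 24-hour window state incremental, one record per key, instead of buffering the whole day:

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class DailyWindowSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime);
        // Incremental RocksDB checkpoints to HDFS (path hypothetical).
        env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints", true));
        env.enableCheckpointing(5 * 60 * 1000); // 5-minute checkpoint interval

        // Stand-in for the Kafka source in the original job.
        DataStream<String> events = env.socketTextStream("localhost", 9999);

        events.keyBy(value -> value)
                .window(TumblingProcessingTimeWindows.of(Time.hours(24)))
                .reduce((a, b) -> a.length() >= b.length() ? a : b) // placeholder incremental aggregation
                .print();

        env.execute("24h processing-time window sketch");
    }
}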