I will try that.
Thanks for your help, David.
Best regards
Rainie
On Sat, Mar 20, 2021 at 9:46 AM David Anderson wrote:
You should increase the kafka transaction timeout --
transaction.max.timeout.ms -- to something much larger than the default,
which I believe is 15 minutes. Suitable values are more on the order of a
few hours to a few days -- long enough to allow for any conceivable outage.
This way, if a request is delayed by a failure or the job is down for a
while, the in-flight transactions won't time out and their data won't be
lost.
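For illustration, the producer side of this could look like the sketch
below; it assumes the FlinkKafkaProducer from the universal Kafka connector
with EXACTLY_ONCE semantics, and the broker address and topic name are
placeholders. Note that transaction.max.timeout.ms is a broker-side setting,
so it has to be raised in the broker configuration; the producer then
requests its own timeout via transaction.timeout.ms, which must not exceed
the broker's cap.

import java.nio.charset.StandardCharsets;
import java.util.Properties;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

// Broker side (server.properties), not settable from the client:
//   transaction.max.timeout.ms=86400000   (24 hours, for example)

Properties props = new Properties();
props.setProperty("bootstrap.servers", "kafka-broker:9092"); // placeholder
// Request a transaction timeout well above any conceivable outage;
// it must stay <= the broker's transaction.max.timeout.ms.
props.setProperty("transaction.timeout.ms", String.valueOf(4 * 60 * 60 * 1000)); // 4 hours

FlinkKafkaProducer<String> producer = new FlinkKafkaProducer<>(
    "output-topic", // placeholder
    (KafkaSerializationSchema<String>) (element, timestamp) ->
        new ProducerRecord<>("output-topic", element.getBytes(StandardCharsets.UTF_8)),
    props,
    FlinkKafkaProducer.Semantic.EXACTLY_ONCE);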
Hi Arvid,
After increasing producer.kafka.request.timeout.ms from 9 to 12, the job
ran fine for almost 5 days, but one of the tasks recently failed again with
the same timeout error (stack trace attached below).
Should I keep increasing the producer.kafka.request.timeout.ms value?
Thanks for the suggestion, Arvid.
Currently my job is using producer.kafka.request.timeout.ms=9.
I will try increasing it to 12.
Best regards
Rainie
On Thu, Mar 11, 2021 at 3:58 AM Arvid Heise wrote:
Hi Rainie,
This looks like the record batching in the Kafka producer timed out. At
this point, the respective records are lost forever. You probably want to
tweak your Kafka settings [1].
Usually, Flink should fail and restart at this point and recover without
data loss. However, if the transactions time out before the restarted job
can commit them, the data they contain is lost.
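The batching-related producer settings [1] can be adjusted along these
lines; the values below are illustrative assumptions, not recommendations,
and delivery.timeout.ms should be at least linger.ms + request.timeout.ms:

import java.util.Properties;

Properties producerProps = new Properties();
// How long a single produce request may wait for a broker response.
producerProps.setProperty("request.timeout.ms", "120000");
// Upper bound on the total time a record may spend between send() and
// success or failure, covering batching, awaiting sends, and retries.
producerProps.setProperty("delivery.timeout.ms", "600000");
// How long the producer waits to fill a batch before sending it.
producerProps.setProperty("linger.ms", "100");
// Retry transient failures instead of expiring the batch immediately.
producerProps.setProperty("retries", String.valueOf(Integer.MAX_VALUE));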
Thanks for the info, David.
The job has checkpointing.
I saw some tasks failed due to
"org.apache.flink.streaming.connectors.kafka.FlinkKafkaException: Failed to
send data to Kafka"
Here is the stack trace from the JM log:
container_e17_1611597945897_8007_01_000240 @ worker-node-host
(dataPort=42321).
Rainie,
A restart after a failure can cause data loss if you aren't using
checkpointing, or if you experience a transaction timeout.
A manual restart can also lead to data loss, depending on how you manage
the offsets, transactions, and other state during the restart. What
happened in this case?
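For reference, enabling checkpointing is a one-liner on the execution
environment; the 60-second interval below is just an example:

import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Take a checkpoint every 60 seconds with exactly-once guarantees.
env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);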
Thanks Yun and David.
There were some tasks that got restarted. We configured a restart policy,
so the job itself didn't fail.
Will task restarts cause data loss?
Thanks
Rainie
On Mon, Mar 8, 2021 at 10:42 AM David Anderson wrote:
Rainie,
Were there any failures/restarts, or is this discrepancy observed without
any disruption to the processing?
Regards,
David
On Mon, Mar 8, 2021 at 10:14 AM Rainie Li wrote:
Hi Rainie,
From the code, it seems the current job does not use time-related
functionality like windows or timers? If so, the problem would be
independent of the time type used.
Also, it is not likely due to rebalance(), since the network layer checks
sequence numbers. If there are still records missing after ruling these
out, the filter logic would be worth a closer look.
Thanks for the quick response, Smile.
I don't use window operators or flatmap.
Here is the core logic of my filter; it only iterates over the filters list.
Will rebalance() cause it?
Thanks again.
Best regards
Rainie
SingleOutputStreamOperator<...> matchedRecordsStream =
    eventStream
        .rebalance()
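Spelled out, the pipeline shape described above would look roughly like the
sketch below, where Event and EventFilter are assumed names standing in for
the truncated snippet:

import java.util.List;
import org.apache.flink.api.common.functions.FilterFunction;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;

// Stand-in for the job's real, configured filter list.
List<EventFilter> filters = java.util.Collections.emptyList();

SingleOutputStreamOperator<Event> matchedRecordsStream =
    eventStream
        .rebalance()
        .filter((FilterFunction<Event>) event -> {
            // Keep the event if any configured filter matches it.
            for (EventFilter f : filters) {
                if (f.matches(event)) {
                    return true;
                }
            }
            return false;
        });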
Hi Rainie,
Could you please provide more information about your processing logic?
Do you use window operators?
If there's no time-based operator in your logic, late-arriving data won't
be dropped by default, and there might be something wrong with your flatmap
or filter operator. Otherwise, you can check whether late records are being
dropped by the time-based operators.