Hi,

I'm running an app on Kafka Streams 1.0.0, and over the past day a lot of
nodes have been failing, and I'm seeing the error below in the logs.

These appear to be failures when attempting to update the changelog.  Any
ideas on how I can work around this?  Should I configure separate retry and
timeout settings for the changelog producer?  If so, how do I do that?
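For reference, this is roughly what I was thinking of trying -- just a
sketch, and I'm assuming the "producer."-prefixed overrides in StreamsConfig
are also picked up by the internal producer that writes the changelog topics
(the broker address and the override values below are placeholders, not what
we actually run):

import java.util.Properties;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.streams.StreamsConfig;

public class StreamsProducerOverrides {

    public static Properties buildConfig() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "dp-insightstrack-staging1");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder

        // Producer overrides via the "producer." prefix; my assumption is that
        // these apply to the producer used for changelog topics as well as
        // for sink topics.
        props.put(StreamsConfig.producerPrefix(ProducerConfig.RETRIES_CONFIG), 10);
        props.put(StreamsConfig.producerPrefix(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG), 60000);

        return props;
    }
}

I'd then pass these Properties into the KafkaStreams constructor as usual.
Is that the right way to do it, or is there a separate knob for the changelog
producer specifically?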


org.apache.kafka.streams.errors.StreamsException: stream-thread [dp-insightstrack-staging1-9cc75371-000c-420a-b9d7-b46b2b4bab5c-StreamThread-3] Failed to rebalance.
    at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:860)
    at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:808)
    at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
    at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744)
Caused by: org.apache.kafka.streams.errors.StreamsException: stream-thread [dp-insightstrack-staging1-9cc75371-000c-420a-b9d7-b46b2b4bab5c-StreamThread-3] failed to suspend stream tasks
    at org.apache.kafka.streams.processor.internals.TaskManager.suspendTasksAndState(TaskManager.java:204)
    at org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsRevoked(StreamThread.java:291)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare(ConsumerCoordinator.java:418)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:359)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:295)
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1138)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1103)
    at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:851)
    ... 3 common frames omitted
Caused by: org.apache.kafka.streams.errors.ProcessorStateException: task [0_18] Failed to flush state store summarykey-to-summary
    at org.apache.kafka.streams.processor.internals.ProcessorStateManager.flush(ProcessorStateManager.java:248)
    at org.apache.kafka.streams.processor.internals.AbstractTask.flushState(AbstractTask.java:196)
    at org.apache.kafka.streams.processor.internals.StreamTask.flushState(StreamTask.java:324)
    at org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:304)
    at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:208)
    at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:299)
    at org.apache.kafka.streams.processor.internals.StreamTask.suspend(StreamTask.java:414)
    at org.apache.kafka.streams.processor.internals.StreamTask.suspend(StreamTask.java:396)
    at org.apache.kafka.streams.processor.internals.AssignedTasks.suspendTasks(AssignedTasks.java:219)
    at org.apache.kafka.streams.processor.internals.AssignedTasks.suspend(AssignedTasks.java:184)
    at org.apache.kafka.streams.processor.internals.TaskManager.suspendTasksAndState(TaskManager.java:197)
    ... 11 common frames omitted
Caused by: org.apache.kafka.streams.errors.StreamsException: task [0_18] Abort sending since an error caught with a previous record (key ComparableSummaryKey(SummaryKey(GnipStream(172819),RuleId(951574687410094106),Some(TweetId(954107387677433858)))) value TweetSummaryCountsWindow(SummaryKey(GnipStream(172819),RuleId(951574687410094106),Some(TweetId(954107387677433858))),168,1516311902337,WindowCounts(1516742462337,LongHyperLogLogPlus(com.clearspring.analytics.stream.cardinality.HyperLogLogPlus@2e76a045),LongHyperLogLogPlus(com.clearspring.analytics.stream.cardinality.HyperLogLogPlus@8610f558),None,-1,false,MetricMap(true,[I@74cc9b69),com.twitter.datainsights.insightstrack.domain.InteractionSourceSummary@5653a508),None,false) timestamp 1516742492445) to topic dp-insightstrack-staging1-summarykey-to-summary-changelog due to org.apache.kafka.common.errors.TimeoutException: Expiring 7 record(s) for dp-insightstrack-staging1-summarykey-to-summary-changelog-18: 30011 ms has passed since last append.
    at org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:118)
    at org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:204)
    at org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:187)
    at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:627)
    at org.apache.kafka.clients.producer.internals.Sender.sendProducerData(Sender.java:287)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:238)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.common.errors.TimeoutException: Expiring 7 record(s) for dp-insightstrack-staging1-summarykey-to-summary-changelog-18: 30011 ms has passed since last append
