[ https://issues.apache.org/jira/browse/KAFKA-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15434408#comment-15434408 ]
ASF GitHub Bot commented on KAFKA-4042:
---------------------------------------

GitHub user shikhar opened a pull request:

    https://github.com/apache/kafka/pull/1778

    KAFKA-4042: Contain connector & task start/stop failures within the Worker

    Invoke the statusListener.onFailure() callback on start failures so that
    the statusBackingStore is updated. This required a fix to putSafe(), which
    previously prevented any update that was not preceded by a (non-safe)
    put() from completing; that is the situation here, when a connector or
    task transitions directly to FAILED.

    Worker start methods can still throw if the same connector name or task
    ID is already registered with the worker, as this condition should not
    happen.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shikhar/kafka distherder-stayup-take4

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1778.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1778

----
commit 050b80331f63ec71f16a644e7fa8006823c94ecc
Author: Shikhar Bhushan <shik...@confluent.io>
Date:   2016-08-23T23:00:10Z

    KAFKA-4042: Contain connector & task start/stop failures within the Worker

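For readers of the archive, the containment idea described in the pull request can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the class ContainedStartSketch, its nested Worker, Connector, StatusListener and RecordingListener types, and the boolean return value are all hypothetical stand-ins. Only the behaviour is taken from the description above: a start failure is reported through an onFailure() callback instead of propagating, while a duplicate registration still throws.

```java
import java.util.HashMap;
import java.util.Map;

public class ContainedStartSketch {

    /** Hypothetical stand-in for the status listener named in the description. */
    interface StatusListener {
        void onStartup(String connName);
        void onFailure(String connName, Throwable cause);
    }

    /** Hypothetical stand-in for a connector whose start() may throw. */
    interface Connector {
        void start();
    }

    /** Simplified worker: contains start failures, still throws on duplicate names. */
    static class Worker {
        private final Map<String, Connector> connectors = new HashMap<>();
        private final StatusListener listener;

        Worker(StatusListener listener) {
            this.listener = listener;
        }

        boolean startConnector(String connName, Connector connector) {
            // A duplicate registration should never happen, so it is still
            // allowed to throw rather than being reported as a FAILED status.
            if (connectors.containsKey(connName)) {
                throw new IllegalStateException("Connector already registered: " + connName);
            }
            try {
                connector.start();
            } catch (Throwable t) {
                // Broad catch so that nothing escapes to the calling thread;
                // the failure is reported through the listener instead.
                listener.onFailure(connName, t);
                return false;
            }
            connectors.put(connName, connector);
            listener.onStartup(connName);
            return true;
        }
    }

    /** Listener that records the last reported state per connector. */
    static class RecordingListener implements StatusListener {
        final Map<String, String> states = new HashMap<>();

        @Override
        public void onStartup(String connName) {
            states.put(connName, "RUNNING");
        }

        @Override
        public void onFailure(String connName, Throwable cause) {
            states.put(connName, "FAILED");
        }
    }

    public static void main(String[] args) {
        RecordingListener listener = new RecordingListener();
        Worker worker = new Worker(listener);
        boolean good = worker.startConnector("good", () -> { });
        boolean bad = worker.startConnector("bad", () -> {
            throw new RuntimeException("bad class name");
        });
        System.out.println("good started: " + good + ", bad started: " + bad);
        System.out.println("reported states: " + listener.states);
    }
}
```

In this sketch the caller never sees the exception from the failing connector; it only sees a false return value and a FAILED state recorded by the listener, which is the property that keeps a calling thread alive.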
----

> DistributedHerder thread can die because of connector & task lifecycle exceptions
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-4042
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4042
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>            Reporter: Shikhar Bhushan
>            Assignee: Shikhar Bhushan
>             Fix For: 0.10.1.0
>
>
> As one example, there isn't exception handling in
> {{DistributedHerder.startConnector()}} or the call-chain for it originating
> in the {{tick()}} on the herder thread, and it can throw an exception because
> of a bad class name in the connector config. (report of issue in wild:
> https://groups.google.com/d/msg/confluent-platform/EnleFnXpZCU/3B_gRxsRAgAJ)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)