[ https://issues.apache.org/jira/browse/KAFKA-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277404#comment-15277404 ]
ASF GitHub Bot commented on KAFKA-2479:
---------------------------------------

GitHub user Ishiihara closed the pull request at:

    https://github.com/apache/kafka/pull/279

> Add CopycatExceptions to indicate transient and permanent errors in a
> connector/task
> ------------------------------------------------------------------------------------
>
>                 Key: KAFKA-2479
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2479
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: KafkaConnect
>            Reporter: Ewen Cheslack-Postava
>            Assignee: Liquan Pei
>             Fix For: 0.10.1.0, 0.10.0.0
>
>
> Sometimes the connector will need to indicate to the framework that an error
> occurred, but the framework could respond to that error in more than one way.
> For source connectors, there's not much they need to indicate since they can
> block indefinitely. They probably only need to indicate permanent errors for
> correctness, though we may want them to indicate transient errors so we can
> report health of the task in a metric.
> For sink connectors, there are at least a couple of scenarios:
> 1. A task encounters some error while processing a {{put(records)}} call and
> was unable to fully process it, but thinks it could be resolved in the
> future. The task doesn't want to see any new records until the issue is
> resolved, but will need to see the same set of records again. (It would be
> nice if the task doesn't have to deal with saving these to a buffer itself.)
> 2. A task encounters some error while processing data, but it has
> enqueued/handled the data passed into the {{put(records)}} call. For example,
> it may have passed it to some library which buffers it, but then the library
> indicated that it is having some connection issues. The connector might be
> able to accept more data, but the task is not in a healthy state.
> 3. The task encounters some error that it decides is unrecoverable. This
> might just be transient errors that repeat for long enough that the task
> thinks it's time to give up. Unclear what to do here, but one option is
> relocating the task to another worker, hoping that the issue is specific to
> the worker.
> Note that it is not, generally, safe for sink tasks to do their own backoff
> or we'd potentially starve the consumer, which needs to poll() in order to
> heartbeat. So we need to make sure whatever mechanism we implement encourages
> the user to throw an exception and pass control back to us instead.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
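
For illustration only, sink scenarios 1 and 3 from the description could be sketched roughly as below. Everything here is hypothetical and is not the Kafka Connect API: `TransientTaskException`, `PermanentTaskException`, `SinkTaskLike` and `deliver` are made-up names, and records are plain strings. The idea is that the task throws instead of backing off itself, and the framework keeps the batch and redelivers the same records until `put` succeeds or a retry limit is reached.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RedeliverySketch {
    /** Hypothetical: the batch could not be processed now, but might be later. */
    static class TransientTaskException extends RuntimeException {
        TransientTaskException(String msg) { super(msg); }
    }

    /** Hypothetical: the error is considered unrecoverable. */
    static class PermanentTaskException extends RuntimeException {
        PermanentTaskException(String msg) { super(msg); }
    }

    /** Hypothetical stand-in for a sink task's put(records). */
    interface SinkTaskLike {
        void put(List<String> records);
    }

    /**
     * Framework-side delivery of one batch. On a transient error the framework
     * keeps the batch and hands the same records to the task again, so the task
     * does not need its own buffer. Returns the number of put() calls made.
     */
    static int deliver(SinkTaskLike task, List<String> batch, int maxAttempts, long backoffMs) {
        for (int attempt = 1; ; attempt++) {
            try {
                task.put(batch);
                return attempt;
            } catch (TransientTaskException e) {
                if (attempt >= maxAttempts) {
                    // Scenario 3: repeated transient failures are treated as permanent.
                    throw new PermanentTaskException(
                        "gave up after " + attempt + " attempts: " + e.getMessage());
                }
                try {
                    // The backoff happens in the framework, not in the task. A real
                    // framework would keep polling the consumer here so that it
                    // still heartbeats; this sketch only sleeps.
                    Thread.sleep(backoffMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new PermanentTaskException("interrupted during backoff");
                }
            }
        }
    }

    public static void main(String[] args) {
        List<List<String>> seen = new ArrayList<>();
        int[] calls = {0};
        // A task that fails twice with a transient error, then succeeds.
        SinkTaskLike flaky = records -> {
            seen.add(new ArrayList<>(records));
            if (++calls[0] < 3) {
                throw new TransientTaskException("downstream unavailable");
            }
        };
        int attempts = deliver(flaky, Arrays.asList("r1", "r2"), 5, 10L);
        System.out.println("attempts=" + attempts + " deliveries=" + seen);
    }
}
```

In this sketch the flaky task is handed the same two records on each of its three `put` calls. Scenario 2, where the data was accepted but the task is unhealthy, would need a separate signal and is not covered here.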