[jira] [Commented] (KAFKA-5330) Use per-task converters in Connect

ASF GitHub Bot (JIRA) Thu, 01 Jun 2017 13:03:21 -0700

    [ https://issues.apache.org/jira/browse/KAFKA-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033578#comment-16033578 ]


ASF GitHub Bot commented on KAFKA-5330:
---------------------------------------

GitHub user tbcdns opened a pull request:

    https://github.com/apache/kafka/pull/3196

    KAFKA-5330: Use per-task converters

    Instead of sharing the same converter instance within the worker, use a
    converter per task.
    
    More details:
    - https://github.com/confluentinc/schema-registry/issues/514
    - https://issues.apache.org/jira/browse/KAFKA-5330

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tbcdns/kafka KAFKA-5330

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/3196.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3196
    
----
commit 43f38a5a4e0a7eeff233afe90a7893f252c6edac
Author: Thibaud Chardonnens <thibaud.chardonn...@swisscom.com>
Date:   2017-06-01T19:38:16Z

    KAFKA-5330: Use per-task converters

----


> Use per-task converters in Connect
> ----------------------------------
>
>                 Key: KAFKA-5330
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5330
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>    Affects Versions: 0.11.0.0
>            Reporter: Ewen Cheslack-Postava
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Because Connect started with a worker-wide model of data formats, we 
> currently allocate a single Converter per worker and only allocate an 
> independent one when the user overrides the converter.
> This can lead to performance problems when the worker-level default converter 
> is used by a large number of tasks, because converters need to be thread-safe 
> to support this model and they may spend a lot of time just on 
> synchronization.
> We could, instead, simply allocate one converter per task. There is some 
> overhead involved, but generally it shouldn't be that large. For example, 
> Confluent's Avro converters will each have their own schema cache and have to 
> make their own calls to the schema registry API, but these costs are relatively 
> small, likely inconsequential compared to the normal overhead we would 
> already have for creating and managing each task.
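
As a rough illustration of the idea described above, the following is a minimal
sketch of a worker that builds a fresh converter for every task instead of
handing out one shared instance. It does not reproduce the code in the pull
request or the real Connect API: the names Converter, CountingConverter, Task
and Worker are hypothetical stand-ins, and the counter only plays the role of
per-converter state such as a schema cache.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

// Sketch only: all nested types are hypothetical stand-ins, not Kafka Connect classes.
class PerTaskConverterSketch {

    // Simplified stand-in for a converter: turns a value into bytes for a topic.
    interface Converter {
        byte[] fromConnectData(String topic, String value);
    }

    // Holds unsynchronized mutable state (a counter standing in for a cache).
    // This is only safe because each instance is owned by exactly one task.
    static final class CountingConverter implements Converter {
        private int conversions = 0;

        @Override
        public byte[] fromConnectData(String topic, String value) {
            conversions++;
            return (topic + ":" + value).getBytes(StandardCharsets.UTF_8);
        }

        int conversions() {
            return conversions;
        }
    }

    // A task keeps the converter it was given at creation time.
    static final class Task {
        final Converter converter;

        Task(Converter converter) {
            this.converter = converter;
        }
    }

    // The worker stores a factory rather than a single converter instance,
    // so every task gets its own converter.
    static final class Worker {
        private final Supplier<Converter> converterFactory;

        Worker(Supplier<Converter> converterFactory) {
            this.converterFactory = converterFactory;
        }

        Task startTask() {
            return new Task(converterFactory.get());
        }
    }

    public static void main(String[] args) {
        Worker worker = new Worker(CountingConverter::new);
        Task first = worker.startTask();
        Task second = worker.startTask();

        first.converter.fromConnectData("orders", "a");
        first.converter.fromConnectData("orders", "b");

        // The two tasks hold distinct converters, so state is not shared.
        System.out.println(first.converter != second.converter);
        System.out.println(((CountingConverter) first.converter).conversions());
        System.out.println(((CountingConverter) second.converter).conversions());
    }
}
```

The trade-off matches the one in the description: each converter carries its
own state, which costs some memory and duplicated work, but no task has to
synchronize with another task to convert a record.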



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
