[ 
https://issues.apache.org/jira/browse/KAFKA-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15896321#comment-15896321
 ] 

ASF GitHub Bot commented on KAFKA-4843:
---------------------------------------

GitHub user enothereska opened a pull request:

    https://github.com/apache/kafka/pull/2643

    KAFKA-4843: More efficient round-robin scheduler

    - Improves Streams throughput by more than 200K requests/second (small
      100-byte requests)
    - Brings Streams throughput very close to that of a pure consumer (see results at
      https://jenkins.confluent.io/job/system-test-kafka-branch-builder/746/console)
    - Maintains the same fairness across tasks
    - Schedules all records in the queue between poll() calls, not just one
      per task

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/enothereska/kafka minor-schedule-round-robin

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2643.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2643
    
----
commit c3f9a1756e7b7f0e2869853bde2e249fb3f1f6d9
Author: Eno Thereska <e...@confluent.io>
Date:   2017-03-05T08:09:51Z

    More efficient round robin

commit 138a491f743d5ed7017c415c9f50f974f16c8567
Author: Eno Thereska <e...@confluent.io>
Date:   2017-03-05T10:05:38Z

    Tighter loop

commit caba483760eb47304e50589f66d396a2afdf0f4e
Author: Eno Thereska <e...@confluent.io>
Date:   2017-03-05T14:16:33Z

    Increased records further

commit aaa14d1c95bfea0f7681f1b5686e0bc6736b13ee
Author: Eno Thereska <e...@confluent.io>
Date:   2017-03-05T15:00:06Z

    Temporary reduce number of tests for quick branch builder turnaround

commit 6c616addbc86f23e9f7311f6ac1cc2e5c92152ee
Author: Eno Thereska <e...@confluent.io>
Date:   2017-03-05T15:24:29Z

    Re-enable full tests

----


> Stream round-robin scheduler is inefficient
> -------------------------------------------
>
>                 Key: KAFKA-4843
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4843
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 0.10.2.0
>            Reporter: Eno Thereska
>            Assignee: Eno Thereska
>             Fix For: 0.11.0.0
>
>
> Currently StreamThread.runloop() uses a simple round-robin scheduler: a single
> request is taken from each task for processing, followed by a poll(), and then
> the same process repeats. For example, for an app that has just 2 tasks, each
> with 3 records ready to be processed, we'd have the following sequence:
>
>     poll() -> process 1 request for task T1 -> process 1 request for task T2
>     -> poll() -> process 1 request for task T1 -> process 1 request for task T2
>     -> poll() -> process 1 request for task T1 -> process 1 request for task T2
>     -> poll()
>
> This is quite inefficient. Instead, a better round-robin scheduler would do:
>
>     poll() -> process all 3 requests for task T1 -> process all 3 requests for
>     task T2 -> poll()
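
The two schedules described above can be compared with a small simulation. This is only a sketch, not the code from the pull request: the class RoundRobinSketch, its helper methods, and the use of plain in-memory queues in place of real stream tasks are all assumptions made for illustration. It counts how many poll() calls each strategy needs for the 2-task, 3-record example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical simulation; none of these names come from Kafka Streams.
public class RoundRobinSketch {

    // Stand-in for stream tasks: each task is just a queue of buffered records.
    static List<Deque<String>> newTasks(int tasks, int recordsPerTask) {
        List<Deque<String>> queues = new ArrayList<>();
        for (int t = 1; t <= tasks; t++) {
            Deque<String> q = new ArrayDeque<>();
            for (int r = 1; r <= recordsPerTask; r++) {
                q.add("T" + t + "-r" + r);
            }
            queues.add(q);
        }
        return queues;
    }

    static boolean anyBuffered(List<Deque<String>> tasks) {
        for (Deque<String> q : tasks) {
            if (!q.isEmpty()) {
                return true;
            }
        }
        return false;
    }

    // Schedule from the issue description: one record per task, then poll again.
    static int pollsOneRecordPerTask(List<Deque<String>> tasks) {
        int polls = 1; // the initial poll() that filled the queues
        while (anyBuffered(tasks)) {
            for (Deque<String> q : tasks) {
                q.poll(); // process at most one record for this task
            }
            polls++; // poll() after every pass over the tasks
        }
        return polls;
    }

    // Proposed schedule: drain every task's queue before the next poll().
    static int pollsDrainBetweenPolls(List<Deque<String>> tasks) {
        int polls = 1; // the initial poll() that filled the queues
        while (anyBuffered(tasks)) {
            for (Deque<String> q : tasks) {
                while (q.poll() != null) {
                    // process all buffered records for this task
                }
            }
            polls++; // a single poll() once everything is processed
        }
        return polls;
    }

    public static void main(String[] args) {
        System.out.println("one record per task: "
                + pollsOneRecordPerTask(newTasks(2, 3)) + " polls");
        System.out.println("drain between polls: "
                + pollsDrainBetweenPolls(newTasks(2, 3)) + " polls");
    }
}
```

For 2 tasks with 3 records each, the first strategy makes 4 poll() calls and the second makes 2, matching the sequences in the description. The sketch assumes no new records arrive during processing and ignores how the real scheduler interleaves tasks to preserve fairness.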



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
