[jira] [Updated] (KAFKA-2092) New partitioning for better load balancing

2015-08-17 Thread Gianmarco De Francisci Morales (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gianmarco De Francisci Morales updated KAFKA-2092: -- Attachment: KAFKA-2092-v3.patch Updated formatting to pass

Re: [DISCUSS] Partitioning in Kafka

2015-08-06 Thread Gianmarco De Francisci Morales
> each key on the partitions it consumes and write them to a separate topic. For example, if you are writing log messages to a "logs" topic with the hostname as the

Re: [DISCUSS] Partitioning in Kafka

2015-07-28 Thread Gianmarco De Francisci Morales
> compute total aggregates based on the two intermediate aggregates. The benefit is that you are generally going to get better load balancing across partitions than if you used the default partitioner. (Please correct me if my

[jira] [Commented] (KAFKA-2092) New partitioning for better load balancing

2015-07-27 Thread Gianmarco De Francisci Morales (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642427#comment-14642427 ] Gianmarco De Francisci Morales commented on KAFKA-

[DISCUSS] Partitioning in Kafka

2015-07-22 Thread Gianmarco De Francisci Morales
Hello folks, I'd like to ask the community about its opinion on the partitioning functions in Kafka. With KAFKA-2091 integrated we are now able to have custom partitioners in the producer. The question now becomes *which* partitioners should ship

[jira] [Comment Edited] (KAFKA-2092) New partitioning for better load balancing

2015-07-21 Thread Gianmarco De Francisci Morales (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634927#comment-14634927 ] Gianmarco De Francisci Morales edited comment on KAFKA-2092 at 7/21/15 10:4

[jira] [Commented] (KAFKA-2092) New partitioning for better load balancing

2015-07-21 Thread Gianmarco De Francisci Morales (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634927#comment-14634927 ] Gianmarco De Francisci Morales commented on KAFKA-2092: --- [hachi

Re: Review Request 35524: KAFKA-2092: New partitioning for better load balancing

2015-07-07 Thread Gianmarco De Francisci Morales
--- Thanks, Gianmarco De Francisci Morales

[jira] [Updated] (KAFKA-2092) New partitioning for better load balancing

2015-07-07 Thread Gianmarco De Francisci Morales (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gianmarco De Francisci Morales updated KAFKA-2092: -- Attachment: KAFKA-2092-v2.patch Added explanation and example

[jira] [Commented] (KAFKA-2092) New partitioning for better load balancing

2015-07-03 Thread Gianmarco De Francisci Morales (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613028#comment-14613028 ] Gianmarco De Francisci Morales commented on KAFKA-

[jira] [Commented] (KAFKA-2092) New partitioning for better load balancing

2015-07-01 Thread Gianmarco De Francisci Morales (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609701#comment-14609701 ] Gianmarco De Francisci Morales commented on KAFKA-2092: ---

[jira] [Comment Edited] (KAFKA-2092) New partitioning for better load balancing

2015-06-22 Thread Gianmarco De Francisci Morales (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589634#comment-14589634 ] Gianmarco De Francisci Morales edited comment on KAFKA-2092 at 6/22/15 8:4

[jira] [Commented] (KAFKA-2092) New partitioning for better load balancing

2015-06-22 Thread Gianmarco De Francisci Morales (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595510#comment-14595510 ] Gianmarco De Francisci Morales commented on KAFKA-2092: --- Any

[jira] [Commented] (KAFKA-2092) New partitioning for better load balancing

2015-06-17 Thread Gianmarco De Francisci Morales (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589634#comment-14589634 ] Gianmarco De Francisci Morales commented on KAFKA-2092: --- Thanks

Review Request 35524: KAFKA-2092: New partitioning for better load balancing

2015-06-16 Thread Gianmarco De Francisci Morales
/internals/PKGPartitioner.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/utils/Utils.java f73eedb Diff: https://reviews.apache.org/r/35524/diff/ Testing --- Thanks, Gianmarco De Francisci Morales

[jira] [Updated] (KAFKA-2092) New partitioning for better load balancing

2015-06-13 Thread Gianmarco De Francisci Morales (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gianmarco De Francisci Morales updated KAFKA-2092: -- Status: Patch Available (was: Open) First attempt at a patch

[jira] [Updated] (KAFKA-2092) New partitioning for better load balancing

2015-06-13 Thread Gianmarco De Francisci Morales (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gianmarco De Francisci Morales updated KAFKA-2092: -- Attachment: KAFKA-2092-v1.patch > New partitioning for bet

Re: [KIP-DISCUSSION] KIP-22 Expose a Partitioner interface in the new producer

2015-05-18 Thread Gianmarco De Francisci Morales
> think we need to have the partitioner.metadata property. Our reason for using string properties is exactly to make config extensible at runtime. So a given partitioner can add whate

Re: [KIP-DISCUSSION] KIP-22 Expose a Partitioner interface in the new producer

2015-05-04 Thread Gianmarco De Francisci Morales
> On Fri, Apr 24, 2015, at 02:15 AM, Gianmarco De Francisci Morales wrote: > > Hi, > > > Here are the questions I think we should consider: > > > 1. Do we need this at all given that we have the partition argument in ProducerRecord which gi

Re: [KIP-DISCUSSION] KIP-22 Expose a Partitioner interface in the new producer

2015-04-24 Thread Gianmarco De Francisci Morales
Hi, Here are the questions I think we should consider: > 1. Do we need this at all given that we have the partition argument in ProducerRecord which gives full control? I think we do need it because this is a way to plug in a different partitioning strategy at run time and do it in a fairly

[jira] [Commented] (KAFKA-2091) Expose a Partitioner interface in the new producer

2015-04-23 Thread Gianmarco De Francisci Morales (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508649#comment-14508649 ] Gianmarco De Francisci Morales commented on KAFKA-2091: --- H

[jira] [Commented] (KAFKA-2091) Expose a Partitioner interface in the new producer

2015-04-08 Thread Gianmarco De Francisci Morales (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484926#comment-14484926 ] Gianmarco De Francisci Morales commented on KAFKA-2091: --- Looks

Re: [DISCUSS] New partitioning for better load balancing

2015-04-07 Thread Gianmarco De Francisci Morales
> I am wondering if those two issues can be resolved with the PKG framework? > > Guozhang > > On Sun, Apr 5, 2015 at 12:19 AM, Gianmarco De Francisci Morales <g...@apache.org> wrote: > > > Hi Jay, > > > > Thanks, that sounds a necessary step. I guess I exp

[jira] [Updated] (KAFKA-2092) New partitioning for better load balancing

2015-04-05 Thread Gianmarco De Francisci Morales (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gianmarco De Francisci Morales updated KAFKA-2092: -- Description: We have recently studied the problem of load

Re: [DISCUSS] New partitioning for better load balancing

2015-04-05 Thread Gianmarco De Francisci Morales
>> Gianmarco, >> I am coming from storm community. I think PKG is a very interesting and we can provide an implementation of Partitioner for PKG. >> Can you open a JIRA for this. >> >> -- >> Harsha >> Sent with Airmail

[jira] [Created] (KAFKA-2092) New partitioning for better load balancing

2015-04-05 Thread Gianmarco De Francisci Morales (JIRA)
Gianmarco De Francisci Morales created KAFKA-2092: - Summary: New partitioning for better load balancing Key: KAFKA-2092 URL: https://issues.apache.org/jira/browse/KAFKA-2092 Project

[DISCUSS] New partitioning for better load balancing

2015-04-03 Thread Gianmarco De Francisci Morales
Hi, We have recently studied the problem of load balancing in distributed stream processing systems such as Samza [1]. In particular, we focused on what happens when the key distribution of the stream is skewed when using key grouping. We developed a new stream partitioning scheme (which we call P