Re: Random Partitioning Issue

2013-09-15 Thread Jay Kreps
I just took a look at this change. I agree with Joe, not to put to fine a
point on it, but this is a confusing hack.

Jun, I don't think wanting to minimizing the number of TCP connections is
going to be a very common need for people with less than 10k producers. I
also don't think people are going to get very good load balancing out of
this because most people don't have a ton of producers. I think instead we
will spend the next year explaining this behavior which 99% of people will
think is a bug (because it is crazy, non-intuitive, and breaks their usage).

Why was this done by adding special default behavior in the null key case
instead of as a partitioner? The argument that the partitioner interface
doesn't have sufficient information to choose a partition is not a good
argument for hacking in changes to the default, it is an argument for *
improving* the partitioner interface.

The whole point of a partitioner interface is to make it possible to plug
in non-standard behavior like this, right?

-Jay


On Sat, Sep 14, 2013 at 8:15 PM, Jun Rao  wrote:

> Joe,
>
> Thanks for bringing this up. I want to clarify this a bit.
>
> 1. Currently, the producer side logic is that if the partitioning key is
> not provided (i.e., it is null), the partitioner won't be called. We did
> that because we want to select a random and "available" partition to send
> messages so that if some partitions are temporarily unavailable (because of
> broker failures), messages can still be sent to other partitions. Doing
> this in the partitioner is difficult since the partitioner doesn't know
> which partitions are currently available (the DefaultEventHandler does).
>
> 2. As Joel said, the common use case in production is that there are many
> more producers than #partitions in a topic. In this case, sticking to a
> partition for a few minutes is not going to cause too much imbalance in the
> partitions and has the benefit of reducing the # of socket connections. My
> feeling is that this will benefit most production users. In fact, if one
> uses a hardware load balancer for producing data in 0.7, it behaves in
> exactly the same way (a producer will stick to a broker until the reconnect
> interval is reached).
>
> 3. It is true that If one is testing a topic with more than one partition
> (which is not the default value), this behavior can be a bit weird.
> However, I think it can be mitigated by running multiple test producer
> instances.
>
> 4. Someone reported in the mailing list that all data shows in only one
> partition after a few weeks. This is clearly not the expected behavior. We
> can take a closer look to see if this is real issue.
>
> Do you think these address your concerns?
>
> Thanks,
>
> Jun
>
>
>
> On Sat, Sep 14, 2013 at 11:18 AM, Joe Stein  wrote:
>
> > How about creating a new class called RandomRefreshPartioner and copy the
> > DefaultPartitioner code to it and then revert the DefaultPartitioner
> code.
> >  I appreciate this is a one time burden for folks using the existing
> > 0.8-beta1 bumping into KAFKA-1017 in production having to switch to the
> > RandomRefreshPartioner and when folks deploy to production will have to
> > consider this property change.
> >
> > I make this suggestion keeping in mind the new folks that on board with
> > Kafka and when everyone is in development and testing mode for the first
> > time their experience would be as expected from how it would work in
> > production this way.  In dev/test when first using Kafka they won't have
> so
> > many producers for partitions but would look to parallelize their
> consumers
> > IMHO.
> >
> > The random broker change sounds like maybe a bigger change now this late
> > in the release cycle if we can accommodate folks trying Kafka for the
> first
> > time and through their development and testing along with full blown
> > production deploys.
> >
> > /***
> >  Joe Stein
> >  Founder, Principal Consultant
> >  Big Data Open Source Security LLC
> >  http://www.stealth.ly
> >  Twitter: @allthingshadoop
> > /
> >
> >
> > On Sep 14, 2013, at 8:17 AM, Joel Koshy  wrote:
> >
> > >>
> > >>
> > >> Thanks for bringing this up - it is definitely an important point to
> > >> discuss. The underlying issue of KAFKA-1017 was uncovered to some
> > degree by
> > >> the fact that in our deployment we did not significantly increase the
> > total
> > >> number of partitions over 0.7 - i.e., in 0.7 we had say four
> partitions
> > per
> > >> broker, now we are using (say) eight partitions across the cluster. So
> > with
> > >> random partitioning every producer would end up connecting to nearly
> > every
> > >> broker (unlike 0.7 in which we would connect to only one broker within
> > each
> > >> reconnect interval). In a production-scale deployment that causes the
> > high
> > >> number of connections that KAFKA-1017 addresses.
> > >>
> > >> You are right that the fix of stick

Re: Random Partitioning Issue

2013-09-15 Thread Jay Kreps
Let me ask another question which I think is more objective. Let's say 100
random, smart infrastructure specialists try Kafka, of these 100 how many
do you believe will
1. Say that this behavior is what they expected to happen?
2. Be happy with this behavior?
I am not being facetious I am genuinely looking for a numerical estimate. I
am trying to figure out if nobody thought about this or if my estimate is
just really different. For what it is worth my estimate is 0 and 5
respectively.

This would be fine expect that we changed it from the good behavior to the
bad behavior to fix an issue that probably only we have.

-Jay


On Sun, Sep 15, 2013 at 8:37 AM, Jay Kreps  wrote:

> I just took a look at this change. I agree with Joe, not to put to fine a
> point on it, but this is a confusing hack.
>
> Jun, I don't think wanting to minimizing the number of TCP connections is
> going to be a very common need for people with less than 10k producers. I
> also don't think people are going to get very good load balancing out of
> this because most people don't have a ton of producers. I think instead we
> will spend the next year explaining this behavior which 99% of people will
> think is a bug (because it is crazy, non-intuitive, and breaks their usage).
>
> Why was this done by adding special default behavior in the null key case
> instead of as a partitioner? The argument that the partitioner interface
> doesn't have sufficient information to choose a partition is not a good
> argument for hacking in changes to the default, it is an argument for *
> improving* the partitioner interface.
>
> The whole point of a partitioner interface is to make it possible to plug
> in non-standard behavior like this, right?
>
> -Jay
>
>
> On Sat, Sep 14, 2013 at 8:15 PM, Jun Rao  wrote:
>
>> Joe,
>>
>> Thanks for bringing this up. I want to clarify this a bit.
>>
>> 1. Currently, the producer side logic is that if the partitioning key is
>> not provided (i.e., it is null), the partitioner won't be called. We did
>> that because we want to select a random and "available" partition to send
>> messages so that if some partitions are temporarily unavailable (because
>> of
>> broker failures), messages can still be sent to other partitions. Doing
>> this in the partitioner is difficult since the partitioner doesn't know
>> which partitions are currently available (the DefaultEventHandler does).
>>
>> 2. As Joel said, the common use case in production is that there are many
>> more producers than #partitions in a topic. In this case, sticking to a
>> partition for a few minutes is not going to cause too much imbalance in
>> the
>> partitions and has the benefit of reducing the # of socket connections. My
>> feeling is that this will benefit most production users. In fact, if one
>> uses a hardware load balancer for producing data in 0.7, it behaves in
>> exactly the same way (a producer will stick to a broker until the
>> reconnect
>> interval is reached).
>>
>> 3. It is true that If one is testing a topic with more than one partition
>> (which is not the default value), this behavior can be a bit weird.
>> However, I think it can be mitigated by running multiple test producer
>> instances.
>>
>> 4. Someone reported in the mailing list that all data shows in only one
>> partition after a few weeks. This is clearly not the expected behavior. We
>> can take a closer look to see if this is real issue.
>>
>> Do you think these address your concerns?
>>
>> Thanks,
>>
>> Jun
>>
>>
>>
>> On Sat, Sep 14, 2013 at 11:18 AM, Joe Stein  wrote:
>>
>> > How about creating a new class called RandomRefreshPartioner and copy
>> the
>> > DefaultPartitioner code to it and then revert the DefaultPartitioner
>> code.
>> >  I appreciate this is a one time burden for folks using the existing
>> > 0.8-beta1 bumping into KAFKA-1017 in production having to switch to the
>> > RandomRefreshPartioner and when folks deploy to production will have to
>> > consider this property change.
>> >
>> > I make this suggestion keeping in mind the new folks that on board with
>> > Kafka and when everyone is in development and testing mode for the first
>> > time their experience would be as expected from how it would work in
>> > production this way.  In dev/test when first using Kafka they won't
>> have so
>> > many producers for partitions but would look to parallelize their
>> consumers
>> > IMHO.
>> >
>> > The random broker change sounds like maybe a bigger change now this late
>> > in the release cycle if we can accommodate folks trying Kafka for the
>> first
>> > time and through their development and testing along with full blown
>> > production deploys.
>> >
>> > /***
>> >  Joe Stein
>> >  Founder, Principal Consultant
>> >  Big Data Open Source Security LLC
>> >  http://www.stealth.ly
>> >  Twitter: @allthingshadoop
>> > /
>> >
>> >
>> > On Sep 14, 2013, at 8:17 AM, Joel Koshy  wrote

Re: Review Request 14091: Patch for KAFKA-1053

2013-09-15 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14091/
---

(Updated Sept. 15, 2013, 4:40 p.m.)


Review request for kafka.


Bugs: KAFKA-1053
https://issues.apache.org/jira/browse/KAFKA-1053


Repository: kafka


Description (updated)
---

Addressed Swapnil's review comments - 1) Closed the file handle for the git 
command 2) Removed the --version command line option and automated version to 
be the timestamp


Included Guozhang's suggestion about making --version mandatory if --rb is 
specified


Included Guozhang's suggestion about making --version mandatory if --rb is 
specified


Attaching file using jira-python package


Changed the script to use the jira-python package, part 2


Changed the script to use the jira-python package


Changed the creating a reviewboard comment as suggested by Tejas


Included Tejas's comments


Changed the post-review tool to publish the rb automatically. Updated the tool 
to add a link to the rb in the JIRA


publish automatically


Patch review tool


Diffs (updated)
-

  .reviewboardrc PRE-CREATION 
  kafka-patch-review.py PRE-CREATION 

Diff: https://reviews.apache.org/r/14091/diff/


Testing
---


Thanks,

Neha Narkhede



Re: Review Request 14091: Patch for KAFKA-1053

2013-09-15 Thread Neha Narkhede


> On Sept. 14, 2013, 7:07 p.m., Swapnil Ghike wrote:
> > kafka-patch-review.py, lines 50-51
> > 
> >
> > Do you intend to save the patch file in current directory? I wonder if 
> > we should delete it, because it is more of a side effect with this tool.

I don't think we need to cleanup after this tool since patch files are very 
small in size. Users can easily do that.


> On Sept. 14, 2013, 7:07 p.m., Swapnil Ghike wrote:
> > kafka-patch-review.py, line 77
> > 
> >
> > Not sure if there should be a space after -r, or if should be --rb=

Every command line option has a short-hand version that can be specified using 
one "-". This is true of most command line tools. You can read the short and 
full versions of all options using --help


> On Sept. 14, 2013, 7:07 p.m., Swapnil Ghike wrote:
> > kafka-patch-review.py, lines 41-43
> > 
> >
> > This is definitely useful. But I can specify versions out of order and 
> > confuse the reviewer. Is it possible to read the versions attached on the 
> > JIRA, and auto-increment the latest version by one?
> > 
> > Wondering if there is a jira.get_attachments() function..

Now that I think more about version, I doubt it is useful for the user to 
specify it. If it is not completely automated like you said, users could make 
mistakes and get the versions incorrect. Auto incrementing by reading from the 
jira is overly complicated. I think a -.patch is a better 
approach. That way it is completely automated and easy to reason about


> On Sept. 14, 2013, 7:07 p.m., Swapnil Ghike wrote:
> > kafka-patch-review.py, line 22
> > 
> >
> > Should we make it default to the current branch? 
> >

I think with multiple branches in progress, it is better for the branch field 
to be explicit. That way fewer people will make mistakes while uploading 
patches using the tool. If it is automated, users might make mistakes and 
upload incorrect patches to JIRA and reviewboard and not realize it.


- Neha


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14091/#review26115
---


On Sept. 15, 2013, 4:40 p.m., Neha Narkhede wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14091/
> ---
> 
> (Updated Sept. 15, 2013, 4:40 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1053
> https://issues.apache.org/jira/browse/KAFKA-1053
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Addressed Swapnil's review comments - 1) Closed the file handle for the git 
> command 2) Removed the --version command line option and automated version to 
> be the timestamp
> 
> 
> Included Guozhang's suggestion about making --version mandatory if --rb is 
> specified
> 
> 
> Included Guozhang's suggestion about making --version mandatory if --rb is 
> specified
> 
> 
> Attaching file using jira-python package
> 
> 
> Changed the script to use the jira-python package, part 2
> 
> 
> Changed the script to use the jira-python package
> 
> 
> Changed the creating a reviewboard comment as suggested by Tejas
> 
> 
> Included Tejas's comments
> 
> 
> Changed the post-review tool to publish the rb automatically. Updated the 
> tool to add a link to the rb in the JIRA
> 
> 
> publish automatically
> 
> 
> Patch review tool
> 
> 
> Diffs
> -
> 
>   .reviewboardrc PRE-CREATION 
>   kafka-patch-review.py PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/14091/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Neha Narkhede
> 
>



[jira] [Updated] (KAFKA-1053) Kafka patch review tool

2013-09-15 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1053:
-

Attachment: KAFKA-1053-2013-09-15_09:40:04.patch

> Kafka patch review tool
> ---
>
> Key: KAFKA-1053
> URL: https://issues.apache.org/jira/browse/KAFKA-1053
> Project: Kafka
>  Issue Type: New Feature
>  Components: tools
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
> Attachments: KAFKA-1053-2013-09-15_09:40:04.patch, 
> KAFKA-1053-followup2.patch, KAFKA-1053-followup.patch, KAFKA-1053-v1.patch, 
> KAFKA-1053-v1.patch, KAFKA-1053-v1.patch, KAFKA-1053-v2.patch, 
> KAFKA-1053-v3.patch
>
>
> Created a new patch review tool that will integrate JIRA and reviewboard - 
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+patch+review+tool

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-1053) Kafka patch review tool

2013-09-15 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767840#comment-13767840
 ] 

Neha Narkhede commented on KAFKA-1053:
--

Updated reviewboard https://reviews.apache.org/r/14091/


> Kafka patch review tool
> ---
>
> Key: KAFKA-1053
> URL: https://issues.apache.org/jira/browse/KAFKA-1053
> Project: Kafka
>  Issue Type: New Feature
>  Components: tools
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
> Attachments: KAFKA-1053-2013-09-15_09:40:04.patch, 
> KAFKA-1053-followup2.patch, KAFKA-1053-followup.patch, KAFKA-1053-v1.patch, 
> KAFKA-1053-v1.patch, KAFKA-1053-v1.patch, KAFKA-1053-v2.patch, 
> KAFKA-1053-v3.patch
>
>
> Created a new patch review tool that will integrate JIRA and reviewboard - 
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+patch+review+tool

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-1053) Kafka patch review tool

2013-09-15 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767841#comment-13767841
 ] 

Neha Narkhede commented on KAFKA-1053:
--

Thanks for the review comments, [~swapnilghike]. Were you able to setup the 
tool correctly and use it to upload a patch and rb?

> Kafka patch review tool
> ---
>
> Key: KAFKA-1053
> URL: https://issues.apache.org/jira/browse/KAFKA-1053
> Project: Kafka
>  Issue Type: New Feature
>  Components: tools
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
> Attachments: KAFKA-1053-2013-09-15_09:40:04.patch, 
> KAFKA-1053-followup2.patch, KAFKA-1053-followup.patch, KAFKA-1053-v1.patch, 
> KAFKA-1053-v1.patch, KAFKA-1053-v1.patch, KAFKA-1053-v2.patch, 
> KAFKA-1053-v3.patch
>
>
> Created a new patch review tool that will integrate JIRA and reviewboard - 
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+patch+review+tool

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-1053) Kafka patch review tool

2013-09-15 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767843#comment-13767843
 ] 

Neha Narkhede commented on KAFKA-1053:
--

I think this tool will be easier to use if it is checked in. That will also 
simplify the wiki. Can I get a +1 from one of the committers?

> Kafka patch review tool
> ---
>
> Key: KAFKA-1053
> URL: https://issues.apache.org/jira/browse/KAFKA-1053
> Project: Kafka
>  Issue Type: New Feature
>  Components: tools
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
> Attachments: KAFKA-1053-2013-09-15_09:40:04.patch, 
> KAFKA-1053-followup2.patch, KAFKA-1053-followup.patch, KAFKA-1053-v1.patch, 
> KAFKA-1053-v1.patch, KAFKA-1053-v1.patch, KAFKA-1053-v2.patch, 
> KAFKA-1053-v3.patch
>
>
> Created a new patch review tool that will integrate JIRA and reviewboard - 
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+patch+review+tool

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-1011) Decompression and re-compression on MirrorMaker could result in messages being dropped in the pipeline

2013-09-15 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767850#comment-13767850
 ] 

Neha Narkhede commented on KAFKA-1011:
--

I think using the SyncProducer in the MirrorMaker might solve this issue until 
the client redesign is complete. The only extra work that is involved in making 
this work is batching. This will resolve the unintuitive behavior that we see 
today where a message is accepted at the source but gets dropped at one or more 
of the intermediate pipelines. I think we can attempt this is in trunk instead 
of waiting for the client rewrite since the work is not significantly large and 
the client rewrite release is many months away.

> Decompression and re-compression on MirrorMaker could result in messages 
> being dropped in the pipeline
> --
>
> Key: KAFKA-1011
> URL: https://issues.apache.org/jira/browse/KAFKA-1011
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.8.1
>
> Attachments: KAFKA-1011.v1.patch
>
>
> The way MirrorMaker works today is that its consumers could use deep iterator 
> to decompress messages received from the source brokers and its producers 
> could re-compress the messages while sending them to the target brokers. 
> Since MirrorMakers use a centralized data channel for its consumers to pipe 
> messages to its producers, and since producers would compress messages with 
> the same topic within a batch as a single produce request, this could result 
> in messages accepted at the front end of the pipeline being dropped at the 
> target brokers of the MirrorMaker due to MesageSizeTooLargeException if it 
> happens that one batch of messages contain too many messages of the same 
> topic in MirrorMaker's producer. If we can use shallow iterator at the 
> MirrorMaker's consumer side to directly pipe compressed messages this issue 
> can be fixed. 
> Also as Swapnil pointed out, currently if the MirrorMaker lags and there are 
> large messages in the MirrorMaker queue (large after decompression), it can 
> run into an OutOfMemoryException. Shallow iteration will be very helpful in 
> avoiding this exception.
> The proposed solution of this issue is also related to KAFKA-527.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (KAFKA-1003) ConsumerFetcherManager should pass clientId as metricsPrefix to AbstractFetcherManager

2013-09-15 Thread Swapnil Ghike (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swapnil Ghike updated KAFKA-1003:
-

Attachment: KAFKA-1003.patch

> ConsumerFetcherManager should pass clientId as metricsPrefix to 
> AbstractFetcherManager
> --
>
> Key: KAFKA-1003
> URL: https://issues.apache.org/jira/browse/KAFKA-1003
> Project: Kafka
>  Issue Type: Bug
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
> Fix For: 0.8
>
> Attachments: kafka-1003.patch, KAFKA-1003.patch
>
>
> For consistency. We use clientId in the metric names elsewhere on clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-1003) ConsumerFetcherManager should pass clientId as metricsPrefix to AbstractFetcherManager

2013-09-15 Thread Swapnil Ghike (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767961#comment-13767961
 ] 

Swapnil Ghike commented on KAFKA-1003:
--

Created reviewboard 

> ConsumerFetcherManager should pass clientId as metricsPrefix to 
> AbstractFetcherManager
> --
>
> Key: KAFKA-1003
> URL: https://issues.apache.org/jira/browse/KAFKA-1003
> Project: Kafka
>  Issue Type: Bug
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
> Fix For: 0.8
>
> Attachments: kafka-1003.patch, KAFKA-1003.patch
>
>
> For consistency. We use clientId in the metric names elsewhere on clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-1003) ConsumerFetcherManager should pass clientId as metricsPrefix to AbstractFetcherManager

2013-09-15 Thread Swapnil Ghike (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767962#comment-13767962
 ] 

Swapnil Ghike commented on KAFKA-1003:
--

Created reviewboard 

> ConsumerFetcherManager should pass clientId as metricsPrefix to 
> AbstractFetcherManager
> --
>
> Key: KAFKA-1003
> URL: https://issues.apache.org/jira/browse/KAFKA-1003
> Project: Kafka
>  Issue Type: Bug
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
> Fix For: 0.8
>
> Attachments: kafka-1003.patch, KAFKA-1003.patch, KAFKA-1003.patch, 
> KAFKA-1003.patch
>
>
> For consistency. We use clientId in the metric names elsewhere on clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (KAFKA-1003) ConsumerFetcherManager should pass clientId as metricsPrefix to AbstractFetcherManager

2013-09-15 Thread Swapnil Ghike (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swapnil Ghike updated KAFKA-1003:
-

Attachment: KAFKA-1003.patch

> ConsumerFetcherManager should pass clientId as metricsPrefix to 
> AbstractFetcherManager
> --
>
> Key: KAFKA-1003
> URL: https://issues.apache.org/jira/browse/KAFKA-1003
> Project: Kafka
>  Issue Type: Bug
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
> Fix For: 0.8
>
> Attachments: kafka-1003.patch, KAFKA-1003.patch, KAFKA-1003.patch, 
> KAFKA-1003.patch
>
>
> For consistency. We use clientId in the metric names elsewhere on clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (KAFKA-1003) ConsumerFetcherManager should pass clientId as metricsPrefix to AbstractFetcherManager

2013-09-15 Thread Swapnil Ghike (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swapnil Ghike updated KAFKA-1003:
-

Attachment: KAFKA-1003.patch

> ConsumerFetcherManager should pass clientId as metricsPrefix to 
> AbstractFetcherManager
> --
>
> Key: KAFKA-1003
> URL: https://issues.apache.org/jira/browse/KAFKA-1003
> Project: Kafka
>  Issue Type: Bug
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
> Fix For: 0.8
>
> Attachments: kafka-1003.patch, KAFKA-1003.patch, KAFKA-1003.patch, 
> KAFKA-1003.patch
>
>
> For consistency. We use clientId in the metric names elsewhere on clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-1003) ConsumerFetcherManager should pass clientId as metricsPrefix to AbstractFetcherManager

2013-09-15 Thread Swapnil Ghike (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767963#comment-13767963
 ] 

Swapnil Ghike commented on KAFKA-1003:
--

Created reviewboard 

> ConsumerFetcherManager should pass clientId as metricsPrefix to 
> AbstractFetcherManager
> --
>
> Key: KAFKA-1003
> URL: https://issues.apache.org/jira/browse/KAFKA-1003
> Project: Kafka
>  Issue Type: Bug
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
> Fix For: 0.8
>
> Attachments: kafka-1003.patch, KAFKA-1003.patch, KAFKA-1003.patch, 
> KAFKA-1003.patch
>
>
> For consistency. We use clientId in the metric names elsewhere on clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-1003) ConsumerFetcherManager should pass clientId as metricsPrefix to AbstractFetcherManager

2013-09-15 Thread Swapnil Ghike (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767966#comment-13767966
 ] 

Swapnil Ghike commented on KAFKA-1003:
--

Created reviewboard 

> ConsumerFetcherManager should pass clientId as metricsPrefix to 
> AbstractFetcherManager
> --
>
> Key: KAFKA-1003
> URL: https://issues.apache.org/jira/browse/KAFKA-1003
> Project: Kafka
>  Issue Type: Bug
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
> Fix For: 0.8
>
> Attachments: kafka-1003.patch, KAFKA-1003.patch, KAFKA-1003.patch, 
> KAFKA-1003.patch, KAFKA-1003.patch
>
>
> For consistency. We use clientId in the metric names elsewhere on clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-1053) Kafka patch review tool

2013-09-15 Thread Swapnil Ghike (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767967#comment-13767967
 ] 

Swapnil Ghike commented on KAFKA-1053:
--

Hmm, tried setting up the tool according to the instruction for RHEL. Ran into 
this:
~/kafka/kafka$ python kafka-patch-review.py --help
Traceback (most recent call last):
  File "kafka-patch-review.py", line 3, in 
import argparse
ImportError: No module named argparse

Does the easy_install think work only on Mac? (jira-python and RBTools are 
installed using easy_install)?

On the Mac, I got this:
~/kafka-local/kafka$ echo $JIRA_CMDLINE_HOME 
.
~/kafka-local/kafka$ python kafka-patch-review.py -b 0.8 -j KAFKA-1003 -db
Jira Home= .
git diff 0.8 > KAFKA-1003.patch
Creating diff against 0.8 and uploading patch to JIRA KAFKA-1003
Creating a new reviewboard
post-review --publish --tracking-branch 0.8 --target-groups=kafka 
--bugs-closed=KAFKA-1003 --summary "Patch for KAFKA-1003"
There don't seem to be any diffs!

rb url= 


> Kafka patch review tool
> ---
>
> Key: KAFKA-1053
> URL: https://issues.apache.org/jira/browse/KAFKA-1053
> Project: Kafka
>  Issue Type: New Feature
>  Components: tools
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
> Attachments: KAFKA-1053-2013-09-15_09:40:04.patch, 
> KAFKA-1053-followup2.patch, KAFKA-1053-followup.patch, KAFKA-1053-v1.patch, 
> KAFKA-1053-v1.patch, KAFKA-1053-v1.patch, KAFKA-1053-v2.patch, 
> KAFKA-1053-v3.patch
>
>
> Created a new patch review tool that will integrate JIRA and reviewboard - 
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+patch+review+tool

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (KAFKA-1003) ConsumerFetcherManager should pass clientId as metricsPrefix to AbstractFetcherManager

2013-09-15 Thread Swapnil Ghike (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swapnil Ghike updated KAFKA-1003:
-

Attachment: KAFKA-1003.patch

> ConsumerFetcherManager should pass clientId as metricsPrefix to 
> AbstractFetcherManager
> --
>
> Key: KAFKA-1003
> URL: https://issues.apache.org/jira/browse/KAFKA-1003
> Project: Kafka
>  Issue Type: Bug
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
> Fix For: 0.8
>
> Attachments: kafka-1003.patch, KAFKA-1003.patch, KAFKA-1003.patch, 
> KAFKA-1003.patch, KAFKA-1003.patch
>
>
> For consistency. We use clientId in the metric names elsewhere on clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (KAFKA-1053) Kafka patch review tool

2013-09-15 Thread Swapnil Ghike (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767967#comment-13767967
 ] 

Swapnil Ghike edited comment on KAFKA-1053 at 9/16/13 12:38 AM:


Hmm, tried setting up the tool according to the instruction for RHEL. Ran into 
this:
~/kafka/kafka$ python kafka-patch-review.py --help
Traceback (most recent call last):
  File "kafka-patch-review.py", line 3, in 
import argparse
ImportError: No module named argparse

Does the easy_install think work only on Mac? (jira-python and RBTools are 
installed using easy_install)?

On the Mac, I got this:
~/kafka-local/kafka$ echo $JIRA_CMDLINE_HOME 
.
~/kafka-local/kafka$ python kafka-patch-review.py -b 0.8 -j KAFKA-1003 -db
Jira Home= .
git diff 0.8 > KAFKA-1003.patch
Creating diff against 0.8 and uploading patch to JIRA KAFKA-1003
Creating a new reviewboard
post-review --publish --tracking-branch 0.8 --target-groups=kafka 
--bugs-closed=KAFKA-1003 --summary "Patch for KAFKA-1003"
There don't seem to be any diffs!

rb url= 

If you take a look at KAFKA-1003, it has appended my diffs, it just did not 
crete a review board. I guess this is expected.


  was (Author: swapnilghike):
Hmm, tried setting up the tool according to the instruction for RHEL. Ran 
into this:
~/kafka/kafka$ python kafka-patch-review.py --help
Traceback (most recent call last):
  File "kafka-patch-review.py", line 3, in 
import argparse
ImportError: No module named argparse

Does the easy_install think work only on Mac? (jira-python and RBTools are 
installed using easy_install)?

On the Mac, I got this:
~/kafka-local/kafka$ echo $JIRA_CMDLINE_HOME 
.
~/kafka-local/kafka$ python kafka-patch-review.py -b 0.8 -j KAFKA-1003 -db
Jira Home= .
git diff 0.8 > KAFKA-1003.patch
Creating diff against 0.8 and uploading patch to JIRA KAFKA-1003
Creating a new reviewboard
post-review --publish --tracking-branch 0.8 --target-groups=kafka 
--bugs-closed=KAFKA-1003 --summary "Patch for KAFKA-1003"
There don't seem to be any diffs!

rb url= 

  
> Kafka patch review tool
> ---
>
> Key: KAFKA-1053
> URL: https://issues.apache.org/jira/browse/KAFKA-1053
> Project: Kafka
>  Issue Type: New Feature
>  Components: tools
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
> Attachments: KAFKA-1053-2013-09-15_09:40:04.patch, 
> KAFKA-1053-followup2.patch, KAFKA-1053-followup.patch, KAFKA-1053-v1.patch, 
> KAFKA-1053-v1.patch, KAFKA-1053-v1.patch, KAFKA-1053-v2.patch, 
> KAFKA-1053-v3.patch
>
>
> Created a new patch review tool that will integrate JIRA and reviewboard - 
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+patch+review+tool

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-1053) Kafka patch review tool

2013-09-15 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767974#comment-13767974
 ] 

Neha Narkhede commented on KAFKA-1053:
--

Thanks for giving it a spin, though I'm not sure you are using the latest 
version of the tool. 

1. For the argparse error, I'm not too sure if it is because argparse requires 
Python 2.7.x ? Could you try upgrading Python to 2.7.x and see if it works?
2. To upload a patch using the tool, the code must be committed to the local 
branch else the diff is empty. But I think that points to a potential 
improvement to the tool. If the diff is empty, it can skip uploading the patch 
and creating the rb.

> Kafka patch review tool
> ---
>
> Key: KAFKA-1053
> URL: https://issues.apache.org/jira/browse/KAFKA-1053
> Project: Kafka
>  Issue Type: New Feature
>  Components: tools
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
> Attachments: KAFKA-1053-2013-09-15_09:40:04.patch, 
> KAFKA-1053-followup2.patch, KAFKA-1053-followup.patch, KAFKA-1053-v1.patch, 
> KAFKA-1053-v1.patch, KAFKA-1053-v1.patch, KAFKA-1053-v2.patch, 
> KAFKA-1053-v3.patch
>
>
> Created a new patch review tool that will integrate JIRA and reviewboard - 
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+patch+review+tool

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (KAFKA-1003) ConsumerFetcherManager should pass clientId as metricsPrefix to AbstractFetcherManager

2013-09-15 Thread Swapnil Ghike (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swapnil Ghike updated KAFKA-1003:
-

Attachment: KAFKA-1003.patch

> ConsumerFetcherManager should pass clientId as metricsPrefix to 
> AbstractFetcherManager
> --
>
> Key: KAFKA-1003
> URL: https://issues.apache.org/jira/browse/KAFKA-1003
> Project: Kafka
>  Issue Type: Bug
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
> Fix For: 0.8
>
> Attachments: kafka-1003.patch, KAFKA-1003.patch, KAFKA-1003.patch, 
> KAFKA-1003.patch, KAFKA-1003.patch, KAFKA-1003.patch
>
>
> For consistency. We use clientId in the metric names elsewhere on clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-1003) ConsumerFetcherManager should pass clientId as metricsPrefix to AbstractFetcherManager

2013-09-15 Thread Swapnil Ghike (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767976#comment-13767976
 ] 

Swapnil Ghike commented on KAFKA-1003:
--

Created reviewboard 

> ConsumerFetcherManager should pass clientId as metricsPrefix to 
> AbstractFetcherManager
> --
>
> Key: KAFKA-1003
> URL: https://issues.apache.org/jira/browse/KAFKA-1003
> Project: Kafka
>  Issue Type: Bug
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
> Fix For: 0.8
>
> Attachments: kafka-1003.patch, KAFKA-1003.patch, KAFKA-1003.patch, 
> KAFKA-1003.patch, KAFKA-1003.patch, KAFKA-1003.patch
>
>
> For consistency. We use clientId in the metric names elsewhere on clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-1053) Kafka patch review tool

2013-09-15 Thread Swapnil Ghike (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767978#comment-13767978
 ] 

Swapnil Ghike commented on KAFKA-1053:
--

1. RHEL machine is on python 2.7.2. Maybe the libraries are not in the standard 
place or something.
2. Made a local commit, tried again, same error. I am using the latest diff 
(the one with the timestamps). Is JIRA_CMDLINE_HOME pointing to a wrong 
directory? I set it to .

> Kafka patch review tool
> ---
>
> Key: KAFKA-1053
> URL: https://issues.apache.org/jira/browse/KAFKA-1053
> Project: Kafka
>  Issue Type: New Feature
>  Components: tools
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
> Attachments: KAFKA-1053-2013-09-15_09:40:04.patch, 
> KAFKA-1053-followup2.patch, KAFKA-1053-followup.patch, KAFKA-1053-v1.patch, 
> KAFKA-1053-v1.patch, KAFKA-1053-v1.patch, KAFKA-1053-v2.patch, 
> KAFKA-1053-v3.patch
>
>
> Created a new patch review tool that will integrate JIRA and reviewboard - 
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+patch+review+tool

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request 14091: Patch for KAFKA-1053

2013-09-15 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14091/
---

(Updated Sept. 16, 2013, 3:28 a.m.)


Review request for kafka.


Bugs: KAFKA-1053
https://issues.apache.org/jira/browse/KAFKA-1053


Repository: kafka


Description (updated)
---

1) Removed reference to JIRA_CMDLINE_HOME as it is not required anymore 2) 
Improved error handling to handle the case if the patch is empty 3) Prevent 
uploading patch and updating/creating reviewboard if patch is empty or diff is 
not checked into local branch


Use underscore to separate jira, date and time in patch name


Addressed Swapnil's review comments - 1) Closed the file handle for the git 
command 2) Removed the --version command line option and automated version to 
be the timestamp


Included Guozhang's suggestion about making --version mandatory if --rb is 
specified


Included Guozhang's suggestion about making --version mandatory if --rb is 
specified


Attaching file using jira-python package


Changed the script to use the jira-python package, part 2


Changed the script to use the jira-python package


Changed the creating a reviewboard comment as suggested by Tejas


Included Tejas's comments


Changed the post-review tool to publish the rb automatically. Updated the tool 
to add a link to the rb in the JIRA


publish automatically


Patch review tool


Diffs (updated)
-

  .reviewboardrc PRE-CREATION 
  kafka-patch-review.py PRE-CREATION 

Diff: https://reviews.apache.org/r/14091/diff/


Testing
---


Thanks,

Neha Narkhede



[jira] [Updated] (KAFKA-1053) Kafka patch review tool

2013-09-15 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1053:
-

Attachment: KAFKA-1053_2013-09-15_20:28:01.patch

> Kafka patch review tool
> ---
>
> Key: KAFKA-1053
> URL: https://issues.apache.org/jira/browse/KAFKA-1053
> Project: Kafka
>  Issue Type: New Feature
>  Components: tools
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
> Attachments: KAFKA-1053-2013-09-15_09:40:04.patch, 
> KAFKA-1053_2013-09-15_20:28:01.patch, KAFKA-1053-followup2.patch, 
> KAFKA-1053-followup.patch, KAFKA-1053-v1.patch, KAFKA-1053-v1.patch, 
> KAFKA-1053-v1.patch, KAFKA-1053-v2.patch, KAFKA-1053-v3.patch
>
>
> Created a new patch review tool that will integrate JIRA and reviewboard - 
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+patch+review+tool

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-1053) Kafka patch review tool

2013-09-15 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768036#comment-13768036
 ] 

Neha Narkhede commented on KAFKA-1053:
--

Updated reviewboard https://reviews.apache.org/r/14091/


> Kafka patch review tool
> ---
>
> Key: KAFKA-1053
> URL: https://issues.apache.org/jira/browse/KAFKA-1053
> Project: Kafka
>  Issue Type: New Feature
>  Components: tools
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
> Attachments: KAFKA-1053-2013-09-15_09:40:04.patch, 
> KAFKA-1053_2013-09-15_20:28:01.patch, KAFKA-1053-followup2.patch, 
> KAFKA-1053-followup.patch, KAFKA-1053-v1.patch, KAFKA-1053-v1.patch, 
> KAFKA-1053-v1.patch, KAFKA-1053-v2.patch, KAFKA-1053-v3.patch
>
>
> Created a new patch review tool that will integrate JIRA and reviewboard - 
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+patch+review+tool

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira