[jira] [Updated] (KAFKA-1147) Consumer socket timeout should be greater than fetch max wait

2013-12-02 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1147:
-

Affects Version/s: 0.8.1, 0.8

> Consumer socket timeout should be greater than fetch max wait
> -
>
> Key: KAFKA-1147
> URL: https://issues.apache.org/jira/browse/KAFKA-1147
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8, 0.8.1
>Reporter: Joel Koshy
> Fix For: 0.8.1
>
>
> From the mailing list:
> The consumer-config documentation states that "The actual timeout set
> will be max.fetch.wait + socket.timeout.ms." - however, that change
> seems to have been lost in the code a while ago - we should either fix the 
> doc or re-introduce the addition.
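
As a rough illustration of the arithmetic the documentation describes, here is a minimal Java sketch. The class and method names are hypothetical and not part of Kafka; the two inputs are assumed to correspond to the consumer settings socket.timeout.ms and fetch.wait.max.ms, and the numbers in main are example values only.

```java
/** Hypothetical helper, not part of Kafka: models the documented timeout arithmetic. */
final class ConsumerTimeouts {
    private ConsumerTimeouts() {}

    /**
     * Timeout the docs describe: the configured socket timeout plus the
     * longest time the broker may hold a fetch request open.
     */
    static int effectiveSocketTimeoutMs(int socketTimeoutMs, int fetchWaitMaxMs) {
        if (socketTimeoutMs < 0 || fetchWaitMaxMs < 0) {
            throw new IllegalArgumentException("timeouts must be non-negative");
        }
        // addExact throws ArithmeticException instead of silently overflowing.
        return Math.addExact(socketTimeoutMs, fetchWaitMaxMs);
    }

    /** True when a fetch that waits the full fetchWaitMaxMs cannot outlive the socket read. */
    static boolean socketOutlivesFetch(int socketTimeoutMs, int fetchWaitMaxMs) {
        return socketTimeoutMs > fetchWaitMaxMs;
    }

    public static void main(String[] args) {
        // Example values only (assumed, in milliseconds).
        System.out.println(effectiveSocketTimeoutMs(30000, 100));
    }
}
```

Without the addition, a socket timeout at or below the fetch wait would let the socket read time out while the broker is still legitimately holding the fetch open, which is the situation the issue title warns about.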



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (KAFKA-1153) typos in documentation

2013-12-02 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1153:
-

Assignee: Joe Stein

> typos in documentation
> --
>
> Key: KAFKA-1153
> URL: https://issues.apache.org/jira/browse/KAFKA-1153
> Project: Kafka
>  Issue Type: Bug
>Reporter: Joe Stein
>Assignee: Joe Stein
> Attachments: KAFKA-1153.patch
>
>
> Dan Hoffman hoffman...@gmail.com via kafka.apache.org 
> 9:45 AM (1 hour ago)
> to users 
> *'Not that partitioning means Kafka only provides a total order over
> messages within a partition. This combined with the ability to partition
> data by key is sufficient for the vast majority of applications. However,
> if you require a total order over messages this can be achieved with a
> topic that has only one partition, though this will mean only one consumer
> process.'*
> The first word should say *NOTE*, right?  Otherwise, I don't understand the
> meaning.
> ...
> Marc Labbe via kafka.apache.org 
> 12:57 PM (12 minutes ago)
> to users 
> while we're at it... I noticed the following typos in
> section 4.1 Motivation (
> http://kafka.apache.org/documentation.html#majordesignelements)
> "we knew" instead of "we new"
> 
> Finally in cases where the stream is fed into other data systems for
> serving we new the system would have to be able to guarantee
> fault-tolerance in the presence of machine failures.
> 
> "led us" instead of "led use"
> 
> Supporting these uses led use to a design with a number of unique elements,
> more akin to a database log then a traditional messaging system. We will
> outline some elements of the design in the following sections.





[jira] [Closed] (KAFKA-1153) typos in documentation

2013-12-02 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede closed KAFKA-1153.








[jira] [Updated] (KAFKA-1151) The Hadoop consumer API doc is not referencing the contrib consumer

2013-12-02 Thread Joe Stein (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Stein updated KAFKA-1151:
-

Status: Patch Available  (was: Open)

> The Hadoop consumer API doc is not referencing the contrib consumer
> ---
>
> Key: KAFKA-1151
> URL: https://issues.apache.org/jira/browse/KAFKA-1151
> Project: Kafka
>  Issue Type: Bug
>Reporter: Joe Stein
> Fix For: 0.8.1
>
> Attachments: KAFKA-1151.patch
>
>
> http://kafka.apache.org/documentation.html#kafkahadoopconsumerapi
> it is pointing to https://github.com/linkedin/camus/tree/camus-kafka-0.8/
> if we are still supporting the contrib/hadoop-consumer then we should point 
> to the read me (maybe this link instead 
> https://github.com/apache/kafka/tree/0.8/contrib/hadoop-consumer)
> thoughts?





[jira] [Updated] (KAFKA-1151) The Hadoop consumer API doc is not referencing the contrib consumer

2013-12-02 Thread Joe Stein (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Stein updated KAFKA-1151:
-

Attachment: KAFKA-1151.patch

I took a stab at what I think would preserve the spirit of the Apache contrib 
code and the progress that Camus has also brought.  I am not sure whether the 
Hadoop consumer can benefit from some of the Camus upstream changes (without 
having to take on and require Avro).  That is probably a discussion for the 
list or another JIRA, but I figured I would start to ask about it here (or in 
a subproject or something... dunno).






Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread David Arthur
Seems like most people are verifying the src, so I'll pick on the 
binaries and Maven stuff ;)


A few problems I see:

There are some vestigial Git files in the src download: an empty .git 
and .gitignore


In the source download, I see the SBT license in LICENSE which seems 
correct (since we distribute an SBT binary), but in the binary download 
I see the same license. Don't we need the Scala license 
(http://www.scala-lang.org/license.html) in the binary distribution?


I created a simple Ant+Ivy project to test resolving the artifacts 
published to Apache staging repo: https://github.com/mumrah/kafka-ivy. 
This will fetch Kafka libs from the Apache staging area and other things 
from Maven Central. It will fetch the jars into lib/ivy/{conf} and 
generate a report of the dependencies, conflicts, and licenses into 
ivy-report. Notice I had to add three exclusions to get things working. 
Maybe we should add these to our pom?


I think I'll have to -1 the release due to the missing Scala license in 
the binary dist. We should check the other licenses as well (see 
ivy-report from my little Ant project).


-David

On 11/26/13 5:34 PM, Joe Stein wrote:

This is the fifth candidate for release of Apache Kafka 0.8.0.   This
release candidate is now built from JDK 6 as RC4 was built with JDK 7.

Release Notes for the 0.8.0 release
http://people.apache.org/~joestein/kafka-0.8.0-candidate5/RELEASE_NOTES.html

*** Please download, test and vote by Monday, December 2nd, 12pm PDT

Kafka's KEYS file containing PGP keys we use to sign the release:
http://svn.apache.org/repos/asf/kafka/KEYS in addition to the md5 and sha1
checksum

* Release artifacts to be voted upon (source and binary):
http://people.apache.org/~joestein/kafka-0.8.0-candidate5/

* Maven artifacts to be voted upon prior to release:
https://repository.apache.org/content/groups/staging/

(i.e. in sbt land this can be added to the build.sbt to use Kafka
resolvers += "Apache Staging" at "https://repository.apache.org/content/groups/staging/"
libraryDependencies += "org.apache.kafka" % "kafka_2.10" % "0.8.0"
)

* The tag to be voted upon (off the 0.8 branch) is the 0.8.0 tag
https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2c20a71a010659e25af075a024cbd692c87d4c89

/***
  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  http://www.stealth.ly
  Twitter: @allthingshadoop 
/





[jira] [Commented] (KAFKA-1154) replicas may not have consistent data after becoming follower

2013-12-02 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836683#comment-13836683
 ] 

Neha Narkhede commented on KAFKA-1154:
--

We used to do this; it looks like this was a regression introduced in KAFKA-1001. 

1. ReplicaManager
Do we still need this "TODO: the above may need to be fixed later" ?

2. We had added the ability for a special consumer to read the replica log for 
troubleshooting. This patch takes that convenience away. We should probably 
look for another way to prevent the replica verification tool from giving false 
negatives. Can it use a different consumer id?


> replicas may not have consistent data after becoming follower
> -
>
> Key: KAFKA-1154
> URL: https://issues.apache.org/jira/browse/KAFKA-1154
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.1
>Reporter: Jun Rao
>Assignee: Jun Rao
> Fix For: 0.8.1
>
> Attachments: KAFKA-1154.patch
>
>
> This is an issue introduced in KAFKA-1001. The issue is that in 
> ReplicaManager.makeFollowers(), we truncate the log before marking the 
> replica as the follower. New messages from the producer can still be added to 
> the log after the log is truncated, but before the replica is marked as the 
> follower. Those newly produced messages can actually be committed, which 
> implies those truncated messages are also committed. However, the new leader 
> is not guaranteed to have those truncated messages.
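
To make the ordering problem concrete, here is a minimal Java sketch of one way to close the window described above: flip the replica to follower before truncating, under the same lock that guards appends. This is an assumption for illustration only. It is not taken from the attached patch, and ReplicaSketch and its methods are hypothetical names, not Kafka classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of one replica; names are illustrative, not Kafka's classes. */
final class ReplicaSketch {
    private final Object lock = new Object();
    private final List<String> log = new ArrayList<>();
    private boolean leader = true;

    /** Producer path: appends are accepted only while this replica is the leader. */
    boolean append(String message) {
        synchronized (lock) {
            if (!leader) {
                return false;
            }
            log.add(message);
            return true;
        }
    }

    /** Become follower: stop accepting appends first, then truncate the log. */
    void makeFollower(int truncateTo) {
        synchronized (lock) {
            leader = false; // no producer append can land after this point
            int target = Math.max(0, truncateTo);
            while (log.size() > target) {
                log.remove(log.size() - 1);
            }
        }
    }

    int logSize() {
        synchronized (lock) {
            return log.size();
        }
    }
}
```

With the reverse order (truncate, then mark as follower) and no shared lock, an append could slip in between the two steps, which is the window the description refers to.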





Re: Review Request 15938: replicas may not have consistent data after becoming follower

2013-12-02 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15938/#review29586
---



core/src/main/scala/kafka/server/KafkaApis.scala


We had added the ability for a special consumer to read the replica log for 
troubleshooting. This patch takes that convenience away. We should probably 
look for another way to prevent the replica verification tool from giving false 
negatives. Can it use a different consumer id?



core/src/main/scala/kafka/server/ReplicaManager.scala


Do we still need this "TODO: the above may need to be fixed later" ?


- Neha Narkhede


On Dec. 1, 2013, 11:33 p.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15938/
> ---
> 
> (Updated Dec. 1, 2013, 11:33 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1154
> https://issues.apache.org/jira/browse/KAFKA-1154
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> kafka-1154; fix 1
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/api/FetchRequest.scala 
> fb2a2306003ac64a8a3b2fc5fc999e0be273f48d 
>   core/src/main/scala/kafka/api/RequestOrResponse.scala 
> b62330be6241c8ff4acd21f0fa7e80b7636e0d42 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 80a70f1e5e3a7670b2238fe63b8d9e0eac6b46ac 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 54f6e1674255f62eba9d90aab0db371c82baf749 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala 
> f1f139e485d98e42be17cdcc327961420cd8c012 
> 
> Diff: https://reviews.apache.org/r/15938/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>



Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread Kostya Golikov
Talking about the binary release, do we really need to include bin/run-rat.sh?
As far as I understand, it is only used for license checks and is
quite redundant in an already baked release.

Next, I am not quite sure, but it probably makes sense to drop
libs/scala-compiler.jar -- Kafka does not compile anything at runtime,
and this step will trim some fat from the resulting release (from 17 MB
down to 9.5 MB*).

I managed to satisfy Maven with only two exclusions, but yes, it would be
good to see them in the original pom.

* By the way, using the best possible compression method (-9 instead of the
default -6) plus dropping the compiler lib gave me the very same result: 9.5 MB


2013/12/2 David Arthur 

> Seems like most people are verifying the src, so I'll pick on the binaries
> and Maven stuff ;)
>
> A few problems I see:
>
> There are some vestigial Git files in the src download: an empty .git and
> .gitignore
>
> In the source download, I see the SBT license in LICENSE which seems
> correct (since we distribute an SBT binary), but in the binary download I
> see the same license. Don't we need the Scala license (
> http://www.scala-lang.org/license.html) in the binary distribution?
>
> I created a simple Ant+Ivy project to test resolving the artifacts
> published to Apache staging repo: https://github.com/mumrah/kafka-ivy.
> This will fetch Kafka libs from the Apache staging area and other things
> from Maven Central. It will fetch the jars into lib/ivy/{conf} and generate
> a report of the dependencies, conflicts, and licenses into ivy-report.
> Notice I had to add three exclusions to get things working. Maybe we should
> add these to our pom?
>
> I think I'll have to -1 the release due to the missing Scala license in
> the binary dist. We should check the other licenses as well (see ivy-report
> from my little Ant project).
>
> -David
>
> On 11/26/13 5:34 PM, Joe Stein wrote:
>
>> This is the fifth candidate for release of Apache Kafka 0.8.0.   This
>> release candidate is now built from JDK 6 as RC4 was built with JDK 7.
>>
>> Release Notes for the 0.8.0 release
>> http://people.apache.org/~joestein/kafka-0.8.0-candidate5/RELEASE_NOTES.html
>>
>> *** Please download, test and vote by Monday, December 2nd, 12pm PDT
>>
>> Kafka's KEYS file containing PGP keys we use to sign the release:
>> http://svn.apache.org/repos/asf/kafka/KEYS in addition to the md5 and
>> sha1
>> checksum
>>
>> * Release artifacts to be voted upon (source and binary):
>> http://people.apache.org/~joestein/kafka-0.8.0-candidate5/
>>
>> * Maven artifacts to be voted upon prior to release:
>> https://repository.apache.org/content/groups/staging/
>>
>> (i.e. in sbt land this can be added to the build.sbt to use Kafka
>> resolvers += "Apache Staging" at "https://repository.apache.org/content/groups/staging/"
>> libraryDependencies += "org.apache.kafka" % "kafka_2.10" % "0.8.0"
>> )
>>
>> * The tag to be voted upon (off the 0.8 branch) is the 0.8.0 tag
>> https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2c20a71a010659e25af075a024cbd692c87d4c89
>>
>> /***
>>   Joe Stein
>>   Founder, Principal Consultant
>>   Big Data Open Source Security LLC
>>   http://www.stealth.ly
>>   Twitter: @allthingshadoop 
>> /
>>
>>
>


Re: Review Request 15901: Patch for KAFKA-1152

2013-12-02 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15901/#review29588
---



core/src/main/scala/kafka/server/ReplicaManager.scala


The check should probably be leaderId >= 0. The "leaders" field in the 
LeaderAndIsrRequest is misleading, cannot be trusted, and needs to be deprecated.



core/src/main/scala/kafka/server/ReplicaManager.scala


This format statement is broken. We need parentheses surrounding the 
entire trace statement.


- Neha Narkhede


On Nov. 29, 2013, 6:41 a.m., Swapnil Ghike wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15901/
> ---
> 
> (Updated Nov. 29, 2013, 6:41 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1152
> https://issues.apache.org/jira/browse/KAFKA-1152
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> ReplicaManager's handling of the leaderAndIsrRequest should gracefully handle 
> leader == -1
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 161f58134f20f9335dbd2bee6ac3f71897cbef7c 
> 
> Diff: https://reviews.apache.org/r/15901/diff/
> 
> 
> Testing
> ---
> 
> Builds with all scala versions; unit tests pass
> 
> 
> Thanks,
> 
> Swapnil Ghike
> 
>



[jira] [Commented] (KAFKA-1036) Unable to rename replication offset checkpoint in windows

2013-12-02 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836698#comment-13836698
 ] 

Neha Narkhede commented on KAFKA-1036:
--

[~jantxu] Are you sure this is required? If we always delete the destination 
file and then execute renameTo, it should work in all cases, no? [~sriramsub] 
What do you think?

> Unable to rename replication offset checkpoint in windows
> -
>
> Key: KAFKA-1036
> URL: https://issues.apache.org/jira/browse/KAFKA-1036
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1
> Environment: windows
>Reporter: Timothy Chen
>Priority: Critical
>  Labels: windows
> Fix For: 0.8.1
>
> Attachments: filelock.patch.diff
>
>
> Although there was a fix for checkpoint file renaming on Windows that tries 
> to delete the existing checkpoint file if the rename failed, I'm still seeing 
> renaming errors on Windows even though the destination file doesn't exist.
> A bit of investigation shows that the rename fails because the Kafka JVM 
> still holds a file lock on the tmp file. 
> Attaching a patch that calls an explicit writer.close so that the lock is 
> released and the file can be renamed.
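
For readers following along, here is a minimal Java sketch of the sequence discussed in this issue: write the temporary file, close the writer so the JVM releases its handle, then rename, deleting the destination only if the first rename fails. It is not the attached patch; CheckpointSwap and writeAndSwap are hypothetical names, and the ".tmp" suffix is an assumption.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical helper, not Kafka code: swap in a freshly written checkpoint file. */
final class CheckpointSwap {
    private CheckpointSwap() {}

    static void writeAndSwap(File target, String contents) {
        File tmp = new File(target.getPath() + ".tmp");
        // try-with-resources closes the writer before the rename below,
        // so no open handle is left on the temporary file.
        try (Writer writer = new BufferedWriter(new FileWriter(tmp))) {
            writer.write(contents);
        } catch (IOException e) {
            throw new UncheckedIOException("could not write " + tmp, e);
        }
        if (!tmp.renameTo(target)) {
            // Some platforms refuse to rename onto an existing file:
            // remove the old checkpoint and retry once.
            target.delete();
            if (!tmp.renameTo(target)) {
                throw new UncheckedIOException(
                        new IOException("could not rename " + tmp + " to " + target));
            }
        }
    }
}
```

File.renameTo reports failure only through its boolean return value, so the sketch checks it explicitly on both attempts.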





Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread Neha Narkhede
I think we should maintain a wiki describing the release process in detail,
so we save the turnaround time on a release. We can have a VOTE thread to
agree on the release guidelines and follow them. Having said that, it is
worth having the correct .pom file at the very least, since the release is
not very useful if people cannot consume it without pain.

Thanks,
Neha


On Mon, Dec 2, 2013 at 8:59 AM, Joe Stein  wrote:

> General future thought comment first: lets be careful please to raising
> issues as show stoppers that have been there previously (especially if
> greater than one version previous release back also has the problem) and
> can get fixed in a subsequent release and is only now more pressing because
> we know about them... seeing something should not necessarily always create
> priority (sometimes sure, of course but not always that is not the best way
> to manage changes).  The VOTE thread should be to artifacts and what we are
> releasing as proper and correct per Apache guidelines... and to make sure
> that the person doing the release doesn't do something incorrect ... like
> using the wrong version of JDK to build =8^/.  If we are not happy with
> release as ready to ship then lets not call a VOTE and save the prolonged
> weeks that drag out with so many release candidates.  The community suffers
> from this.
>
> ok, now on to RC5 ...lets extend the vote until 12pm PT tomorrow ...
> hopefully a few more hours for other folks to comment and discuss the
> issues you raised with my $0.02852425 included below and follow-ups as they
> become necessary... I am also out of pocket in a few hours until tomorrow
> morning so if it passed I would not be able to publish and announce or if
> failed look towards RC6 anyways =8^)
>
> /***
>  Joe Stein
>  Founder, Principal Consultant
>  Big Data Open Source Security LLC
>  http://www.stealth.ly
>  Twitter: @allthingshadoop 
> /
>
>
> On Mon, Dec 2, 2013 at 11:00 AM, David Arthur  wrote:
>
> > Seems like most people are verifying the src, so I'll pick on the
> binaries
> > and Maven stuff ;)
> >
> > A few problems I see:
> >
> > There are some vestigial Git files in the src download: an empty .git and
> > .gitignore
> >
>
> Ok, I can do a better job with 0.8.1 but I am not sure this is very
> different than beta1 and not necessarily a show stopper for 0.8.0 requiring
> another release candidate, is it?  I think updating the release docs and
> rmdir .git after the rm -fr and rm .gitignore moving forward makes sense.
>
>
> >
> > In the source download, I see the SBT license in LICENSE which seems
> > correct (since we distribute an SBT binary), but in the binary download I
> > see the same license. Don't we need the Scala license (
> > http://www.scala-lang.org/license.html) in the binary distribution?
> >
>
> I fixed this already not only in the binary release
> https://issues.apache.org/jira/browse/KAFKA-1131 but also in the JAR files
> that are published to Maven
> https://issues.apache.org/jira/browse/KAFKA-1133 are you checking from
> http://people.apache.org/~joestein/kafka-0.8.0-candidate5/ because I just
> downloaded again and it looks alright to me.  If not then definitely this
> RC should be shot down because it does not do what we are saying it is
> doing.. but if it is wrong can you be more specific and create a JIRA with
> the fix because I thought I got it right already... but if not then lets
> get it right because that is why we pulled the release in RC3
>
>
> >
> > I created a simple Ant+Ivy project to test resolving the artifacts
> > published to Apache staging repo: https://github.com/mumrah/kafka-ivy.
> > This will fetch Kafka libs from the Apache staging area and other things
> > from Maven Central. It will fetch the jars into lib/ivy/{conf} and
> generate
> > a report of the dependencies, conflicts, and licenses into ivy-report.
> > Notice I had to add three exclusions to get things working. Maybe we
> should
> > add these to our pom?
> >
>
> I don't think this is a showstopper is it?  can't this wait for 0.8.1 and
> not hold up the 0.8.0 release?
>
> I didn't have this issue with java maven pom or scala sbt so maybe
> something more ivy ant specific causing this?  folks use gradle too so I
> expect some feedback at some point to that working or not perhaps in 0.8.1
> or even 0.9 we can try to cover every way everyone uses and make sure they
> are all good to go moving forward... perhaps even some vagrant, docker,
> puppet and chef love too (which I can contribute if folks are interested)
> =8^)
>
> In any case can you create a JIRA and throw a patch up on it please,
> thanks! IMHO this is for 0.8.1 though ... what are thoughts here...
>
>
> >
> > I think I'll have to -1 the release due to the missing Scala license in
> > the binary dist. We should check the other licenses as well (see
> ivy-report
> > from my litt

Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread Joe Stein
General future thought comment first: let's please be careful about raising
issues as show stoppers when they have been there previously (especially if
a release more than one version back also has the problem), can get fixed
in a subsequent release, and are only now more pressing because
we know about them... seeing something should not necessarily always create
priority (sometimes, sure, of course, but not always; that is not the best way
to manage changes).  The VOTE thread should be about the artifacts and whether
what we are releasing is proper and correct per Apache guidelines... and to
make sure that the person doing the release doesn't do something incorrect ...
like using the wrong version of JDK to build =8^/.  If we are not happy with
the release as ready to ship, then let's not call a VOTE and save the prolonged
weeks that drag out with so many release candidates.  The community suffers
from this.

ok, now on to RC5 ... let's extend the vote until 12pm PT tomorrow ...
hopefully a few more hours for other folks to comment and discuss the
issues you raised, with my $0.02852425 included below and follow-ups as they
become necessary... I am also out of pocket in a few hours until tomorrow
morning, so if it passed I would not be able to publish and announce, or if it
failed look towards RC6, anyways =8^)

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop 
/


On Mon, Dec 2, 2013 at 11:00 AM, David Arthur  wrote:

> Seems like most people are verifying the src, so I'll pick on the binaries
> and Maven stuff ;)
>
> A few problems I see:
>
> There are some vestigial Git files in the src download: an empty .git and
> .gitignore
>

Ok, I can do a better job with 0.8.1 but I am not sure this is very
different than beta1 and not necessarily a show stopper for 0.8.0 requiring
another release candidate, is it?  I think updating the release docs and
rmdir .git after the rm -fr and rm .gitignore moving forward makes sense.


>
> In the source download, I see the SBT license in LICENSE which seems
> correct (since we distribute an SBT binary), but in the binary download I
> see the same license. Don't we need the Scala license (
> http://www.scala-lang.org/license.html) in the binary distribution?
>

I fixed this already, not only in the binary release
https://issues.apache.org/jira/browse/KAFKA-1131 but also in the JAR files
that are published to Maven
https://issues.apache.org/jira/browse/KAFKA-1133. Are you checking from
http://people.apache.org/~joestein/kafka-0.8.0-candidate5/ ? I just
downloaded again and it looks alright to me.  If not, then this RC should
definitely be shot down because it does not do what we are saying it is
doing... but if it is wrong, can you be more specific and create a JIRA with
the fix? I thought I got it right already... but if not, then let's
get it right, because that is why we pulled the release in RC3.


>
> I created a simple Ant+Ivy project to test resolving the artifacts
> published to Apache staging repo: https://github.com/mumrah/kafka-ivy.
> This will fetch Kafka libs from the Apache staging area and other things
> from Maven Central. It will fetch the jars into lib/ivy/{conf} and generate
> a report of the dependencies, conflicts, and licenses into ivy-report.
> Notice I had to add three exclusions to get things working. Maybe we should
> add these to our pom?
>

I don't think this is a showstopper is it?  can't this wait for 0.8.1 and
not hold up the 0.8.0 release?

I didn't have this issue with java maven pom or scala sbt so maybe
something more ivy ant specific causing this?  folks use gradle too so I
expect some feedback at some point to that working or not perhaps in 0.8.1
or even 0.9 we can try to cover every way everyone uses and make sure they
are all good to go moving forward... perhaps even some vagrant, docker,
puppet and chef love too (which I can contribute if folks are interested)
=8^)

In any case can you create a JIRA and throw a patch up on it please,
thanks! IMHO this is for 0.8.1 though ... what are thoughts here...


>
> I think I'll have to -1 the release due to the missing Scala license in
> the binary dist. We should check the other licenses as well (see ivy-report
> from my little Ant project).
>

it would break my heart to have lots of binding +1 votes and 2 non-binding
votes one +1 and one -1, I still haven't cast my vote yet was hoping
everyone would get their voices and everything in before calling the VOTE
closed or canceled.  I really don't mind preparing a release candidate 6
that is not the issue at all but I think we need to be thoughtful about
using the release candidates to fix things that should be fixed and part
of the releases themselves where the release candidates are to make sure
that the preparation of the build is not wrong (like it was in RC4 where I

Re: Review Request 15659: Incorporate Joel/Jun's comments, MM system test passed, rebased

2013-12-02 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15659/#review29593
---



core/src/main/scala/kafka/consumer/ZookeeperTopicEventWatcher.scala


Can we improve this WARN? It implies that we will not shutdown the client, 
but we proceed with the shutdown anyways :)


- Neha Narkhede


On Nov. 21, 2013, 7:22 p.m., Guozhang Wang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15659/
> ---
> 
> (Updated Nov. 21, 2013, 7:22 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1103
> https://issues.apache.org/jira/browse/KAFKA-1103
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1103.v2
> 
> 
> Dummy
> 
> 
> KAFKA-1103.v1
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/consumer/TopicFilter.scala 
> cf3853b223095e1fe0921175c407a906828b8113 
>   core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
> 6d0cfa665e90a168a70501a81f10fa4d3c7a7f22 
>   core/src/main/scala/kafka/consumer/ZookeeperTopicEventWatcher.scala 
> a67c193df9f7cbfc52f75dc1b71dc017de1b5fe2 
>   core/src/test/scala/unit/kafka/consumer/TopicFilterTest.scala 
> 40a2bf7a9277eb5f94bc07b40d7726d81860cefc 
>   system_test/migration_tool_testsuite/0.7/config/test-log4j.properties 
> a3ae33f20e4b7cff87d8cf8368d0639b8bea73a6 
> 
> Diff: https://reviews.apache.org/r/15659/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Guozhang Wang
> 
>



[jira] [Updated] (KAFKA-1133) LICENSE and NOTICE files need to get into META-INF when jars are built before they're signed for publishing to maven

2013-12-02 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1133:
-

Assignee: Joe Stein

> LICENSE and NOTICE files need to get into  META-INF when jars are built 
> before they're signed for publishing to maven
> -
>
> Key: KAFKA-1133
> URL: https://issues.apache.org/jira/browse/KAFKA-1133
> Project: Kafka
>  Issue Type: Bug
>Reporter: Joe Stein
>Assignee: Joe Stein
> Fix For: 0.8, 0.8.1
>
> Attachments: KAFKA-1133.patch
>
>
> This needs to happen in our Build.scala; see the sbt package docs 
> http://www.scala-sbt.org/release/docs/Howto/package.html. It is probably a 
> straightforward line of code, or ten, or whatever it takes to do this.





Re: Review Request 15901: Patch for KAFKA-1152

2013-12-02 Thread Swapnil Ghike


> On Dec. 2, 2013, 5:06 p.m., Neha Narkhede wrote:
> > core/src/main/scala/kafka/server/ReplicaManager.scala, line 358
> > 
> >
> > the check should probably be leaderId >= 0. The "leaders" in the 
> > LeaderAndIsrRequest is misleading, cannot be trusted and needs to be 
> > deprecated.

On the controller, leaders exclude shutdown brokers. 

val leaders = controllerContext.liveOrShuttingDownBrokers.filter(b => leaderIds.contains(b.id))

On the broker, should we not check whether the leader that it is being asked to 
follow is alive or not?
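
To make the two checks under discussion concrete, here is a minimal sketch that combines the suggested leaderId >= 0 test with the liveness check asked about above. The names LeaderCheck, Decision and decide are hypothetical and only for illustration; this is not the ReplicaManager code.

```scala
// Hypothetical sketch; the names below are illustrative, not ReplicaManager's API.
object LeaderCheck {
  sealed trait Decision
  case object Follow extends Decision          // valid, live leader
  case object NoLeader extends Decision        // leader == -1 (or any negative id)
  case object LeaderNotAlive extends Decision  // valid id, but broker is not live

  // Decide how a broker should react to the leader id named in a request.
  def decide(leaderId: Int, liveBrokerIds: Set[Int]): Decision =
    if (leaderId < 0) NoLeader
    else if (!liveBrokerIds.contains(leaderId)) LeaderNotAlive
    else Follow
}
```

With this shape, leader == -1 becomes an ordinary outcome rather than an error path, and whether the liveness check should stay is the open question in this thread.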


- Swapnil


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15901/#review29588
---


On Nov. 29, 2013, 6:41 a.m., Swapnil Ghike wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15901/
> ---
> 
> (Updated Nov. 29, 2013, 6:41 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1152
> https://issues.apache.org/jira/browse/KAFKA-1152
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> ReplicaManager's handling of the leaderAndIsrRequest should gracefully handle 
> leader == -1
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 161f58134f20f9335dbd2bee6ac3f71897cbef7c 
> 
> Diff: https://reviews.apache.org/r/15901/diff/
> 
> 
> Testing
> ---
> 
> Builds with all scala versions; unit tests pass
> 
> 
> Thanks,
> 
> Swapnil Ghike
> 
>



Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread Joe Stein
Neha, as far as the release process goes, is this what you had in mind:
https://cwiki.apache.org/confluence/display/KAFKA/Release+Process or
did you want different or additional content?

Per the POM, I was able to use the artifacts from the Maven repository
without having to do anything more than specifying the artifacts with
sbt.

resolvers += "Apache Staging" at "https://repository.apache.org/content/groups/staging/"

libraryDependencies ++= Seq(
...,
"org.apache.kafka" % "kafka_2.10" % "0.8.0",

)

and on the pure maven side


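<repository>
  <id>ApacheStaging</id>
  <url>https://repository.apache.org/content/groups/staging/</url>
</repository>
...
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka_2.9.2</artifactId>
  <version>0.8.0</version>
  <exclusions>
    <exclusion>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
    </exclusion>
  </exclusions>
</dependency>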

which very closely mirrors what David was talking about with Ivy as well.
I didn't really think much of it; it is just a matter of XML we can document
(there is actually no documentation on using Maven on the site at all, which
we should correct in any case, TBD post release), but if folks find it to
be a pain then we should definitely fix it. Off the top of my
head I don't see how to do that in Build.scala, but I really don't
expect it to be too difficult to figure out. The question is whether we hold
it off for 0.8.1, since technically nothing is breaking (unlike the null
pointer exceptions we had for the bonked pom in beta1 that I shipped to
Maven Central).

Before canceling the vote, can we at least get consensus on what we are
canceling and exactly what fixes should be in RC6, or agree to ship RC5
and hold whatever is left for 0.8.1?

I am totally fine with working on RC6 (actually just cancelled my plans for
the evening because of a whole slew of client work that hit my plate) but I
want to make sure we have everything covered that everyone that is voting
expects to be in there.

David, a few items below don't make sense to me; I sent another email on the
thread regarding the LICENSE.


/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop 
/


On Mon, Dec 2, 2013 at 12:19 PM, Neha Narkhede wrote:

> I think we should maintain a wiki describing the release process in detail,
> so we save the turnaround time on a release. We can have a VOTE thread to
> agree on the release guidelines and follow it. Having  said that, it is
> worth having the correct .pom file at the very least, since the release is
> not very useful if people cannot consume it without pain.
>
> Thanks,
> Neha
>
>
> On Mon, Dec 2, 2013 at 8:59 AM, Joe Stein  wrote:
>
> > General future thought comment first: lets be careful please to raising
> > issues as show stoppers that have been there previously (especially if
> > greater than one version previous release back also has the problem) and
> > can get fixed in a subsequent release and is only now more pressing because
> > we know about them... seeing something should not necessarily always create
> > priority (sometimes sure, of course but not always that is not the best way
> > to manage changes).  The VOTE thread should be to artifacts and what we are
> > releasing as proper and correct per Apache guidelines... and to make sure
> > that the person doing the release doesn't do something incorrect ... like
> > using the wrong version of JDK to build =8^/.  If we are not happy with
> > release as ready to ship then lets not call a VOTE and save the prolonged
> > weeks that drag out with so many release candidates.  The community suffers
> > from this.
> >
> > ok, now on to RC5 ...lets extend the vote until 12pm PT tomorrow ...
> > hopefully a few more hours for other folks to comment and discuss the
> > issues you raised with my $0.02852425 included below and follow-ups as they
> > become necessary... I am also out of pocket in a few hours until tomorrow
> > morning so if it passed I would not be able to publish and announce or if
> > failed look towards RC6 anyways =8^)
> >
> > /***
> >  Joe Stein
> >  Founder, Principal Consultant
> >  Big Data Open Source Security LLC
> >  http://www.stealth.ly
> >  Twitter: @allthingshadoop 
> > /
> >
> >
> > On Mon, Dec 2, 2013 at 11:00 AM, David Arthur  wrote:
> >
> > > Seems like most people are verifying the src, so I'll pick on the
> > binaries
> > > and Maven stuff ;)
> > >
> > > A few problems I see:
> > >
> > > There are some vestigial Git files in the src download: an empty .git and
> > > .gitignore
> > >
> >
> > Ok, I can do a better job with 0.8.1 but I am not sure this is very
> > different than beta1 and not necessarily a show stopper for 0.8.0
> requir

Re: Review Request 15711: Patch for KAFKA-930

2013-12-02 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15711/#review29597
---



core/src/main/scala/kafka/controller/KafkaController.scala


instead of hardcoding this to 5 seconds, how about delaying it by 
leaderImbalanceCheckIntervalSeconds?



core/src/main/scala/kafka/controller/KafkaController.scala


this API is now a little awkward due to the updateZK parameter. Do we 
really need it? Another way is for the partition-rebalance-thread to always 
ensure creating the path and let this API delete it. This will keep the API 
clean.



core/src/main/scala/kafka/controller/KafkaController.scala


it seems we only need the preferred replica per partition, not the entire 
set of replicas right? In that case, we can simplify 
preferredReplicasForTopicsByBrokers to Map[Int, Map[TopicAndPartition, Int]] 
and call it preferredReplicaForPartitionsByBrokers
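
A minimal sketch of what the simplified structure could look like, assuming the preferred replica is the first entry in each partition's replica list. TopicAndPartition is redefined here only to keep the example self-contained, and PreferredReplicas is a hypothetical holder object, not code from the patch.

```scala
// Hypothetical sketch; TopicAndPartition is redefined to keep this self-contained.
final case class TopicAndPartition(topic: String, partition: Int)

object PreferredReplicas {
  // Assumption: the first replica in the assignment is the preferred one.
  // Result: broker id -> (partition -> preferred replica id on that broker).
  def preferredReplicaForPartitionsByBrokers(
      assignment: Map[TopicAndPartition, Seq[Int]]): Map[Int, Map[TopicAndPartition, Int]] =
    assignment
      // Keep only the preferred replica per partition, skipping empty assignments.
      .collect { case (tp, replicas) if replicas.nonEmpty => tp -> replicas.head }
      // Group the partitions by the broker that hosts the preferred replica.
      .groupBy { case (_, preferred) => preferred }
}
```

Keeping a single Int per partition instead of the full replica set is what allows the value type to shrink to Map[TopicAndPartition, Int].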



core/src/main/scala/kafka/controller/KafkaController.scala


It seems we don't need the brokerIds variable since it is never reused 
beyond the check in the if statement



core/src/main/scala/kafka/controller/KafkaController.scala


we also don't have semicolons as a coding convention. Difficult to switch 
between java and scala, eh? :)



core/src/main/scala/kafka/controller/KafkaController.scala


"trigger a leader rebalance for partitions that should have a leader on 
this broker" ?



core/src/main/scala/kafka/controller/KafkaController.scala


could we rename topicPartition to replicasPerPartition?



core/src/main/scala/kafka/server/KafkaConfig.scala


do we need this config option? It seems that the same could be achieved by 
setting a very high value for leader.imbalance.check.interval.seconds.


- Neha Narkhede


On Nov. 21, 2013, 5:42 p.m., Sriram Subramanian wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15711/
> ---
> 
> (Updated Nov. 21, 2013, 5:42 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-930
> https://issues.apache.org/jira/browse/KAFKA-930
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> trunk
> 
> 
> commit missing code
> 
> 
> some more changes
> 
> 
> fix merge conflicts
> 
> 
> Add auto leader rebalance support
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> trunk
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> trunk
> 
> Conflicts:
>   core/src/main/scala/kafka/admin/AdminUtils.scala
>   core/src/main/scala/kafka/admin/TopicCommand.scala
> 
> change comments
> 
> 
> commit the remaining changes
> 
> 
> Move AddPartitions into TopicCommand
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/controller/KafkaController.scala 
> 4c319aba97655e7c4ec97fac2e34de4e28c9f5d3 
>   core/src/main/scala/kafka/server/KafkaConfig.scala 
> b324344d0a383398db8bfe2cbeec2c1378fe13c9 
> 
> Diff: https://reviews.apache.org/r/15711/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sriram Subramanian
> 
>



Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread David Arthur

Inline:

On 12/2/13 11:59 AM, Joe Stein wrote:

General future thought comment first: lets be careful please to raising
issues as show stoppers that have been there previously (especially if
greater than one version previous release back also has the problem) and
can get fixed in a subsequent release and is only now more pressing because
we know about them... seeing something should not necessarily always create
priority (sometimes sure, of course but not always that is not the best way
to manage changes).  The VOTE thread should be to artifacts and what we are
releasing as proper and correct per Apache guidelines... and to make sure
that the person doing the release doesn't do something incorrect ... like
using the wrong version of JDK to build =8^/.  If we are not happy with
release as ready to ship then lets not call a VOTE and save the prolonged
weeks that drag out with so many release candidates.  The community suffers
from this.
+1 If we can get most of this release preparation stuff automated, then 
we can iterate on it in a release branch before tagging and voting.

ok, now on to RC5 ...lets extend the vote until 12pm PT tomorrow ...
hopefully a few more hours for other folks to comment and discuss the
issues you raised with my $0.02852425 included below and follow-ups as they
become necessary... I am also out of pocket in a few hours until tomorrow
morning so if it passed I would not be able to publish and announce or if
failed look towards RC6 anyways =8^)

/***
  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  http://www.stealth.ly
  Twitter: @allthingshadoop 
/


On Mon, Dec 2, 2013 at 11:00 AM, David Arthur  wrote:


Seems like most people are verifying the src, so I'll pick on the binaries
and Maven stuff ;)

A few problems I see:

There are some vestigial Git files in the src download: an empty .git and
.gitignore


Ok, I can do a better job with 0.8.1 but I am not sure this is very
different than beta1 and not necessarily a show stopper for 0.8.0 requiring
another release candidate, is it?  I think updating the release docs and
rmdir .git after the rm -fr and rm .gitignore moving forward makes sense.

Agreed, not a show stopper.




In the source download, I see the SBT license in LICENSE which seems
correct (since we distribute an SBT binary), but in the binary download I
see the same license. Don't we need the Scala license (
http://www.scala-lang.org/license.html) in the binary distribution?


I fixed this already not only in the binary release
https://issues.apache.org/jira/browse/KAFKA-1131 but also in the JAR files
that are published to Maven
https://issues.apache.org/jira/browse/KAFKA-1133 are you checking from
http://people.apache.org/~joestein/kafka-0.8.0-candidate5/ because I just
downloaded again and it looks alright to me.  If not then definitely this
RC should be shot down because it does not do what we are saying it is
doing.. but if it is wrong can you be more specific and create a JIRA with
the fix because I thought I got it right already... but if not then lets
get it right because that is why we pulled the release in RC3
The LICENSE file in both the src and binary downloads includes "SBT 
LICENSE" at the end. I could be wrong, but I think the src download 
should include the SBT license and the binary download should include 
the Scala license. Since we have released in the past without proper 
licensing, it's probably not a huge deal to do it again (but we should 
fix it).



I created a simple Ant+Ivy project to test resolving the artifacts
published to Apache staging repo: https://github.com/mumrah/kafka-ivy.
This will fetch Kafka libs from the Apache staging area and other things
from Maven Central. It will fetch the jars into lib/ivy/{conf} and generate
a report of the dependencies, conflicts, and licenses into ivy-report.
Notice I had to add three exclusions to get things working. Maybe we should
add these to our pom?


I don't think this is a showstopper is it?  can't this wait for 0.8.1 and
not hold up the 0.8.0 release?
No I don't think it's a show stopper. But to Neha's point, a painless 
Maven/Ivy/SBT/Gradle integration is important since this is how most 
users interface with Kafka. That said, ZooKeeper is what's pulling in 
these troublesome deps and it doesn't stop people from using ZooKeeper. 
I can live with this.


I didn't have this issue with java maven pom or scala sbt so maybe
something more ivy ant specific causing this?
No clue... maybe? I run into these deps all the time when dealing with 
ZooKeeper.

folks use gradle too so I
expect some feedback at some point to that working or not perhaps in 0.8.1
or even 0.9 we can try to cover every way everyone uses and make sure they
are all good to go moving forward... perhaps even some vagrant, docker,
puppet and chef love too (which I can contribute if f

[jira] [Created] (KAFKA-1155) Kafka server can miss zookeeper watches during long zkclient callbacks

2013-12-02 Thread Neha Narkhede (JIRA)
Neha Narkhede created KAFKA-1155:


 Summary: Kafka server can miss zookeeper watches during long 
zkclient callbacks
 Key: KAFKA-1155
 URL: https://issues.apache.org/jira/browse/KAFKA-1155
 Project: Kafka
  Issue Type: Bug
  Components: controller
Affects Versions: 0.8, 0.8.1
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Priority: Critical


On getting a zookeeper watch, zkclient invokes the blocking user callback and 
only re-registers the watch after the callback returns. This leaves a possibly 
large window of time when Kafka has not registered for watches on the desired 
zookeeper paths and hence can miss important state changes (on the controller). 
In any case, it is worth noting that even though zookeeper has a 
read-and-set-watch API, there can always be a window of time between the watch 
being fired, the callback and the read-and-set-watch API call. Due to the 
zkclient wrapper, it is difficult to handle this properly in the Kafka code 
unless we directly use the zookeeper client. One way of getting around this 
issue is to use timestamps on the paths and when a watch fires, check if the 
timestamp in zk is different from the one in the callback handler.
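
One way to read the timestamp idea above, as a minimal sketch: the handler remembers the timestamp of the last state it processed and, whenever a watch fires, re-reads the path and compares. PathState, WatchHandler and readFromZk are hypothetical names, not zkclient or Kafka APIs, and the ZooKeeper read is stubbed out as a plain function.

```scala
// Hypothetical sketch: detect state changes that happened between a watch
// firing and the watch being re-registered, by comparing timestamps.
final case class PathState(data: String, timestamp: Long)

final class WatchHandler(readFromZk: () => PathState) {
  // Timestamp of the last state this handler has processed.
  private var lastSeen: Long = -1L

  // Called when a watch fires. Returns the current state if its timestamp
  // differs from the last processed one, or None if nothing changed.
  def onWatchFired(): Option[PathState] = synchronized {
    val current = readFromZk()
    if (current.timestamp != lastSeen) {
      lastSeen = current.timestamp
      Some(current)
    } else None
  }
}
```

The same comparison could also run once right after the watch is re-registered, so that a change landing in the unregistered window is picked up without waiting for the next watch event.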





Re: Review Request 15711: Patch for KAFKA-930

2013-12-02 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15711/#review29606
---



core/src/main/scala/kafka/server/KafkaConfig.scala


can we disable this feature until 
https://issues.apache.org/jira/browse/KAFKA-1155 is solved?


- Neha Narkhede


On Nov. 21, 2013, 5:42 p.m., Sriram Subramanian wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15711/
> ---
> 
> (Updated Nov. 21, 2013, 5:42 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-930
> https://issues.apache.org/jira/browse/KAFKA-930
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> trunk
> 
> 
> commit missing code
> 
> 
> some more changes
> 
> 
> fix merge conflicts
> 
> 
> Add auto leader rebalance support
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> trunk
> 
> 
> Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
> trunk
> 
> Conflicts:
>   core/src/main/scala/kafka/admin/AdminUtils.scala
>   core/src/main/scala/kafka/admin/TopicCommand.scala
> 
> change comments
> 
> 
> commit the remaining changes
> 
> 
> Move AddPartitions into TopicCommand
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/controller/KafkaController.scala 
> 4c319aba97655e7c4ec97fac2e34de4e28c9f5d3 
>   core/src/main/scala/kafka/server/KafkaConfig.scala 
> b324344d0a383398db8bfe2cbeec2c1378fe13c9 
> 
> Diff: https://reviews.apache.org/r/15711/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sriram Subramanian
> 
>



Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread Neha Narkhede
Thanks for creating
https://cwiki.apache.org/confluence/display/KAFKA/Release+Process. That was
what I was looking for. It will be worth updating it right after the 0.8
release and keeping it updated as we change the guidelines. Thanks again!

-Neha


On Mon, Dec 2, 2013 at 10:19 AM, David Arthur  wrote:

> Inline:
>
>
> On 12/2/13 11:59 AM, Joe Stein wrote:
>
>> General future thought comment first: lets be careful please to raising
>> issues as show stoppers that have been there previously (especially if
>> greater than one version previous release back also has the problem) and
>> can get fixed in a subsequent release and is only now more pressing because
>> we know about them... seeing something should not necessarily always create
>> priority (sometimes sure, of course but not always that is not the best way
>> to manage changes).  The VOTE thread should be to artifacts and what we are
>> releasing as proper and correct per Apache guidelines... and to make sure
>> that the person doing the release doesn't do something incorrect ... like
>> using the wrong version of JDK to build =8^/.  If we are not happy with
>> release as ready to ship then lets not call a VOTE and save the prolonged
>> weeks that drag out with so many release candidates.  The community suffers
>> from this.
>>
> +1 If we can get most of this release preparation stuff automated, then we
> can iterate on it in a release branch before tagging and voting.
>
>> ok, now on to RC5 ...lets extend the vote until 12pm PT tomorrow ...
>> hopefully a few more hours for other folks to comment and discuss the
>> issues you raised with my $0.02852425 included below and follow-ups as they
>> become necessary... I am also out of pocket in a few hours until tomorrow
>> morning so if it passed I would not be able to publish and announce or if
>> failed look towards RC6 anyways =8^)
>>
>> /***
>>   Joe Stein
>>   Founder, Principal Consultant
>>   Big Data Open Source Security LLC
>>   http://www.stealth.ly
>>   Twitter: @allthingshadoop 
>> /
>>
>>
>> On Mon, Dec 2, 2013 at 11:00 AM, David Arthur  wrote:
>>
>>  Seems like most people are verifying the src, so I'll pick on the
>>> binaries
>>> and Maven stuff ;)
>>>
>>> A few problems I see:
>>>
>>> There are some vestigial Git files in the src download: an empty .git and
>>> .gitignore
>>>
>>>  Ok, I can do a better job with 0.8.1 but I am not sure this is very
>> different than beta1 and not necessarily a show stopper for 0.8.0
>> requiring
>> another release candidate, is it?  I think updating the release docs and
>> rmdir .git after the rm -fr and rm .gitignore moving forward makes sense.
>>
> Agreed, not a show stopper.
>
>
>>
>>  In the source download, I see the SBT license in LICENSE which seems
>>> correct (since we distribute an SBT binary), but in the binary download I
>>> see the same license. Don't we need the Scala license (
>>> http://www.scala-lang.org/license.html) in the binary distribution?
>>>
>>>  I fixed this already not only in the binary release
>> https://issues.apache.org/jira/browse/KAFKA-1131 but also in the JAR
>> files
>> that are published to Maven
>> https://issues.apache.org/jira/browse/KAFKA-1133 are you checking from
>> http://people.apache.org/~joestein/kafka-0.8.0-candidate5/ because I just
>> downloaded again and it looks alright to me.  If not then definitely this
>> RC should be shot down because it does not do what we are saying it is
>> doing.. but if it is wrong can you be more specific and create a JIRA with
>> the fix because I thought I got it right already... but if not then lets
>> get it right because that is why we pulled the release in RC3
>>
> The LICENSE file in both the src and binary downloads includes "SBT
> LICENSE" at the end. I could be wrong, but I think the src download should
> include the SBT license and the binary download should include the Scala
> license. Since we have released in the past without proper licensing, it's
> probably not a huge deal to do it again (but we should fix it).
>
>
>>> I created a simple Ant+Ivy project to test resolving the artifacts
>>> published to Apache staging repo: https://github.com/mumrah/kafka-ivy.
>>> This will fetch Kafka libs from the Apache staging area and other things
>>> from Maven Central. It will fetch the jars into lib/ivy/{conf} and
>>> generate
>>> a report of the dependencies, conflicts, and licenses into ivy-report.
>>> Notice I had to add three exclusions to get things working. Maybe we
>>> should
>>> add these to our pom?
>>>
>>>  I don't think this is a showstopper is it?  can't this wait for 0.8.1
>> and
>> not hold up the 0.8.0 release?
>>
> No I don't think it's a show stopper. But to Neha's point, a painless
> Maven/Ivy/SBT/Gradle integration is important since this is how most users
> interface with Kafka. That said, ZooKeeper is what's pulling i

[jira] [Updated] (KAFKA-1074) Reassign partitions should delete the old replicas from disk

2013-12-02 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1074:
-

Assignee: Jun Rao

> Reassign partitions should delete the old replicas from disk
> 
>
> Key: KAFKA-1074
> URL: https://issues.apache.org/jira/browse/KAFKA-1074
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8
>Reporter: Jun Rao
>Assignee: Jun Rao
> Fix For: 0.8.1
>
> Attachments: KAFKA-1074.patch
>
>
> Currently, after reassigning replicas to other brokers, the old replicas are 
> not removed from disk and have to be deleted manually.





Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread David Arthur

Joe, another thing I noticed in the staging repo:

The POM for 2.8.2, 2.9.1, 2.9.2, and 2.10 include scala-compiler and 
some other stuff that is not included in 2.8.0


2.8.0 
https://repository.apache.org/content/groups/staging/org/apache/kafka/kafka_2.8.0/0.8.0/kafka_2.8.0-0.8.0.pom
2.8.2 
https://repository.apache.org/content/groups/staging/org/apache/kafka/kafka_2.8.2/0.8.0/kafka_2.8.2-0.8.0.pom


Here's a diff of those two: 
https://gist.github.com/mumrah/7bd6bd8e2805210d5d9d/revisions


I think maybe the 2.8.0 POM is missing some stuff it needs (zkclient, 
snappy, yammer metrics). And there is a duplicate ZK entry for the POMs 
later than 2.8.0.


-David



On 12/2/13 12:57 PM, Joe Stein wrote:

Neha, as far as the release process is this what you had in mind
https://cwiki.apache.org/confluence/display/KAFKA/Release+Process or
different content or more of something or such?

Per the POM, I was able to use the artifacts from the maven repository
without having to-do anything more than just specifying the artifacts with
sbt.

resolvers += "Apache Staging" at "https://repository.apache.org/content/groups/staging/"

libraryDependencies ++= Seq(
 ...,
"org.apache.kafka" % "kafka_2.10" % "0.8.0",
 
)

and on the pure maven side

 
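<repository>
  <id>ApacheStaging</id>
  <url>https://repository.apache.org/content/groups/staging/</url>
</repository>
...
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka_2.9.2</artifactId>
  <version>0.8.0</version>
  <exclusions>
    <exclusion>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
    </exclusion>
  </exclusions>
</dependency>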

which very closely mirrors what David was talking about with ivy as well...
I didn't really think much of it just a matter of XML we can document
(there is actually no using maven documentation on the site at all we
should correct that in any case TBD post release) but if folks find it to
be a pain then we should definitely fix it for sure.  off the top of my
head I don't see how to-do that in the Build.scala but I really don't
expect it to be too difficult to figure out... the question is do we hold
it off for 0.8.1 since technically nothing is breaking (like the null
pointer exceptions we had for the bonked pom in beta1 that I shipped to
maven central).

Before canceling the vote can we at least get consensus to what we are
canceling and exactly what fixes should be in RC6 or ... agree to ship RC5
and hold whatever is left for 0.8.1

I am totally fine with working on RC6 (actually just cancelled my plans for
the evening because of a whole slew of client work that hit my plate) but I
want to make sure we have everything covered that everyone that is voting
expects to be in there.

David, a few items below don't make sense I sent another email on the
thread in regards to the LICENSE


/***
  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  http://www.stealth.ly
  Twitter: @allthingshadoop 
/


On Mon, Dec 2, 2013 at 12:19 PM, Neha Narkhede wrote:


I think we should maintain a wiki describing the release process in detail,
so we save the turnaround time on a release. We can have a VOTE thread to
agree on the release guidelines and follow it. Having  said that, it is
worth having the correct .pom file at the very least, since the release is
not very useful if people cannot consume it without pain.

Thanks,
Neha


On Mon, Dec 2, 2013 at 8:59 AM, Joe Stein  wrote:


General future thought comment first: lets be careful please to raising
issues as show stoppers that have been there previously (especially if
greater than one version previous release back also has the problem) and
can get fixed in a subsequent release and is only now more pressing because
we know about them... seeing something should not necessarily always create
priority (sometimes sure, of course but not always that is not the best way
to manage changes).  The VOTE thread should be to artifacts and what we are
releasing as proper and correct per Apache guidelines... and to make sure
that the person doing the release doesn't do something incorrect ... like
using the wrong version of JDK to build =8^/.  If we are not happy with
release as ready to ship then lets not call a VOTE and save the prolonged
weeks that drag out with so many release candidates.  The community suffers
from this.

ok, now on to RC5 ...lets extend the vote until 12pm PT tomorrow ...
hopefully a few more hours for other folks to comment and discuss the
issues you raised with my $0.02852425 included below and follow-ups as they
become necessary... I am also out of pocket in a few hours until tomorrow
morning so if it passed I would not be able to publish and announce or if
failed look towards RC6 anyways =8^)

/***
  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  http://www.stea

Re: Review Request 15938: replicas may not have consistent data after becoming follower

2013-12-02 Thread Jun Rao


> On Dec. 2, 2013, 4:55 p.m., Neha Narkhede wrote:
> > core/src/main/scala/kafka/server/ReplicaManager.scala, line 329
> > 
> >
> > Do we still need this "TODO: the above may need to be fixed later" ?

Yes, this can be removed.


> On Dec. 2, 2013, 4:55 p.m., Neha Narkhede wrote:
> > core/src/main/scala/kafka/server/KafkaApis.scala, line 396
> > 
> >
> > We had added the ability for a special consumer to read the replica log 
> > for troubleshooting. This patch takes that convenience away. We should 
> > probably look for another way to prevent the replica verification tool from 
> > giving false negatives. Can it use a different consumer id?

We could add another debugging consumer mode so that it can read beyond HW. 
This will complicate the broker side logic a bit though. Also, reading beyond 
HW always has the danger that the fetched data is garbage since it's truncated. 
Perhaps we can wait and see if this new mode is really needed?
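
For illustration only, the rule being discussed can be sketched as a single bound on what a fetch may read, assuming follower fetches are the only ones allowed past the high watermark (HW). FetchLimit and maxReadableOffset are hypothetical names, not the KafkaApis code.

```scala
// Hypothetical sketch of capping reads at the high watermark (HW).
object FetchLimit {
  // Followers replicate up to the log end offset; every other reader,
  // including a debugging consumer, is capped at the high watermark.
  def maxReadableOffset(isFollowerFetch: Boolean,
                        highWatermark: Long,
                        logEndOffset: Long): Long =
    if (isFollowerFetch) logEndOffset
    else math.min(highWatermark, logEndOffset)
}
```

A separate debugging mode would add a third branch here, which is the extra broker-side logic mentioned above.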


- Jun


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15938/#review29586
---


On Dec. 1, 2013, 11:33 p.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15938/
> ---
> 
> (Updated Dec. 1, 2013, 11:33 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1154
> https://issues.apache.org/jira/browse/KAFKA-1154
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> kafka-1154; fix 1
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/api/FetchRequest.scala 
> fb2a2306003ac64a8a3b2fc5fc999e0be273f48d 
>   core/src/main/scala/kafka/api/RequestOrResponse.scala 
> b62330be6241c8ff4acd21f0fa7e80b7636e0d42 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 80a70f1e5e3a7670b2238fe63b8d9e0eac6b46ac 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 54f6e1674255f62eba9d90aab0db371c82baf749 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala 
> f1f139e485d98e42be17cdcc327961420cd8c012 
> 
> Diff: https://reviews.apache.org/r/15938/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>



Re: Review Request 15938: replicas may not have consistent data after becoming follower

2013-12-02 Thread Neha Narkhede


> On Dec. 2, 2013, 4:55 p.m., Neha Narkhede wrote:
> > core/src/main/scala/kafka/server/KafkaApis.scala, line 396
> > 
> >
> > We had added the ability for a special consumer to read the replica log 
> > for troubleshooting. This patch takes that convenience away. We should 
> > probably look for another way to prevent the replica verification tool from 
> > giving false negatives. Can it use a different consumer id?
> 
> Jun Rao wrote:
> We could add another debugging consumer mode so that it can read beyond 
> HW. This will complicate the broker side logic a bit though. Also, reading 
> beyond HW always has the danger that the fetched data is garbage since it's 
> truncated. Perhaps we can wait and see if this new mode is really needed?

Yes, we can probably wait. So, if the debugging consumer also reads up to the 
HW, just like a normal consumer, do we need to have a special "debugging 
consumer"?


- Neha


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15938/#review29586
---


On Dec. 1, 2013, 11:33 p.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15938/
> ---
> 
> (Updated Dec. 1, 2013, 11:33 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1154
> https://issues.apache.org/jira/browse/KAFKA-1154
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> kafka-1154; fix 1
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/api/FetchRequest.scala 
> fb2a2306003ac64a8a3b2fc5fc999e0be273f48d 
>   core/src/main/scala/kafka/api/RequestOrResponse.scala 
> b62330be6241c8ff4acd21f0fa7e80b7636e0d42 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 80a70f1e5e3a7670b2238fe63b8d9e0eac6b46ac 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 54f6e1674255f62eba9d90aab0db371c82baf749 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala 
> f1f139e485d98e42be17cdcc327961420cd8c012 
> 
> Diff: https://reviews.apache.org/r/15938/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>



[jira] [Assigned] (KAFKA-1050) Support for "no data loss" mode

2013-12-02 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede reassigned KAFKA-1050:


Assignee: Neha Narkhede

> Support for "no data loss" mode
> ---
>
> Key: KAFKA-1050
> URL: https://issues.apache.org/jira/browse/KAFKA-1050
> Project: Kafka
>  Issue Type: Task
>Reporter: Justin SB
>Assignee: Neha Narkhede
>
> I'd love to use Apache Kafka, but for my application data loss is not 
> acceptable, even at the expense of availability (i.e. I need C not A in CAP).
> I think there are two things that I need to change to get a quorum model:
> 1) Make sure I set request.required.acks to 2 (for a 3 node cluster) or 3 
> (for a 5 node cluster) on every request, so that I can only write if a quorum 
> is active.
> 2) Prevent the behaviour where a non-ISR can become the leader if all ISRs 
> die.  I think this is as easy as tweaking 
> core/src/main/scala/kafka/controller/PartitionLeaderSelector.scala, 
> essentially to throw an exception around line 64 in the "data loss" case.
> I haven't yet implemented / tested this.  I'd love to get some input from the 
> Kafka-experts on whether my plan is:
>  (a) correct - will this work?
>  (b) complete - have I missed any cases?
>  (c) recommended - is this a terrible idea :-)
> Thanks for any pointers!
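
The ack counts in point 1 of the description (2 for a 3 node cluster, 3 for a 5 node cluster) are the strict-majority sizes for those clusters. A minimal sketch of that arithmetic follows; `quorum_acks` is a hypothetical helper for illustration only, not a Kafka API, and it assumes the replica count equals the cluster size.

```python
def quorum_acks(replicas: int) -> int:
    """Smallest number of acknowledgements that forms a strict majority."""
    if replicas < 1:
        raise ValueError("replicas must be positive")
    return replicas // 2 + 1


# The two cluster sizes mentioned in the issue description.
for n in (3, 5):
    print(n, "replicas ->", quorum_acks(n), "acks")
```

The resulting value is what the reporter proposes passing as request.required.acks on every request.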





Re: Review Request 15674: Reassign partitions should delete the old replicas from disk

2013-12-02 Thread Neha Narkhede


> On Nov. 19, 2013, 6:40 p.m., Jay Kreps wrote:
> > What happens if I am doing a read or write concurrently with a delete?
> > 
> > Would it be simpler just to have the delete log work like the segment 
> > delete, where rather than trying to lock we remove it from the segment 
> > list and then just enqueue a delete in 60 seconds? My concern is just that 
> > reasoning about the various locking strategies in the log is getting 
> > increasingly difficult.
> 
> Jun Rao wrote:
> Yes, we could try deleting the log asynchronously. The issues there are:
> 
> 1. The same partition could be moved back to this broker during the 
> delayed window.
> 2. It's not clear if 60 secs (or any value) is good enough since the time 
> that an ongoing scheduled flush takes is unbounded.
> 
> The following is how this patch handles outstanding reads/writes on the 
> deleted data.
> 
> 1. All read operations are ok since we already handle unexpected 
> exceptions in KafkaApis. The caller will get an error.
> 2. Currently, if we hit an IOException while writing to the log by the 
> producer request, the replica fetcher or the log flusher, we halt the broker. 
> We need to make sure that the deletion of a log doesn't cause the halt. This 
> is achieved by preventing those operations on the log once it's deleted.
> 2.1 For producer requests, the delete partition operation will 
> synchronize on the leaderAndIsrUpdate lock.
> 2.2 For replica fetcher, this is already handled since the fetcher is 
> removed before the log is deleted.
> 2.3 For log flusher, the flush and the delete will now synchronize on a 
> delete lock.
> 
> I agree that this approach uses more locks, which potentially makes the 
> code harder to understand. However, my feeling is that this is probably a 
> less hacky approach than the async delete one.

At least until the various locks are cleaned up, the current approach used in 
the patch seems safer compared to an async delete. Will take a closer look at 
the patch sometime today.
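
Point 2.3 in the quoted text (the flush and the delete synchronizing on a delete lock) can be sketched roughly as below. This is only an illustration of the idea under stated assumptions: `SketchLog`, `_delete_lock` and `flush_count` are hypothetical names, and the real patch is Scala code in Log.scala and LogManager.scala that may be structured differently.

```python
import threading


class SketchLog:
    """Toy stand-in for a log; not Kafka's Log class."""

    def __init__(self):
        self._delete_lock = threading.Lock()  # hypothetical "delete lock"
        self._deleted = False
        self.flush_count = 0

    def flush(self) -> bool:
        """Flush unless the log is gone; returns whether a flush happened."""
        with self._delete_lock:
            if self._deleted:
                # Skip quietly instead of touching files that no longer exist.
                return False
            self.flush_count += 1
            return True

    def delete(self) -> None:
        """Mark the log deleted; blocks until any in-progress flush is done."""
        with self._delete_lock:
            self._deleted = True


log = SketchLog()
log.flush()    # flush before the delete proceeds normally
log.delete()
log.flush()    # flush after the delete is skipped
```

Because both operations take the same lock, a flush can never observe a half-deleted log, which is the property that keeps a late flush from raising the I/O error that would halt the broker.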


- Neha


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15674/#review29123
---


On Nov. 19, 2013, 4:28 p.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15674/
> ---
> 
> (Updated Nov. 19, 2013, 4:28 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1074
> https://issues.apache.org/jira/browse/KAFKA-1074
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> kafka-1074; fix 3
> 
> 
> kafka-1074; fix 2
> 
> 
> kafka-1074
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/cluster/Partition.scala 
> 02ccc17c79b6d44c75f9bb6ca7cda8c51ae6f6fb 
>   core/src/main/scala/kafka/log/Log.scala 
> 1883a53de112ad08449dc73a2ca08208c11a2537 
>   core/src/main/scala/kafka/log/LogManager.scala 
> 81be88aa618ed5614703d45a0556b77c97290085 
>   core/src/main/scala/kafka/log/LogSegment.scala 
> 0d6926ea105a99c9ff2cfc9ea6440f2f2d37bde8 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 161f58134f20f9335dbd2bee6ac3f71897cbef7c 
>   core/src/test/scala/unit/kafka/admin/AdminTest.scala 
> c30069e837e54fb91bf1d5b75b133282a28dedf8 
> 
> Diff: https://reviews.apache.org/r/15674/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>



[jira] Subscription: outstanding kafka patches

2013-12-02 Thread jira
Issue Subscription
Filter: outstanding kafka patches (75 issues)
The list of outstanding kafka patches
Subscriber: kafka-mailing-list

Key Summary
KAFKA-1154  replicas may not have consistent data after becoming follower
https://issues.apache.org/jira/browse/KAFKA-1154
KAFKA-1151  The Hadoop consumer API doc is not referencing the contrib consumer
https://issues.apache.org/jira/browse/KAFKA-1151
KAFKA-1145  Broker fail to sync after restart
https://issues.apache.org/jira/browse/KAFKA-1145
KAFKA-1144  commitOffsets can be passed the offsets to commit
https://issues.apache.org/jira/browse/KAFKA-1144
KAFKA-1142  Patch review tool should take diff with origin from last divergent 
point
https://issues.apache.org/jira/browse/KAFKA-1142
KAFKA-1130  "log.dirs" is a confusing property name
https://issues.apache.org/jira/browse/KAFKA-1130
KAFKA-1116  Need to upgrade sbt-assembly to compile on scala 2.10.2
https://issues.apache.org/jira/browse/KAFKA-1116
KAFKA-1110  Unable to produce messages with snappy/gzip compression
https://issues.apache.org/jira/browse/KAFKA-1110
KAFKA-1109  Need to fix GC log configuration code, not able to override 
KAFKA_GC_LOG_OPTS
https://issues.apache.org/jira/browse/KAFKA-1109
KAFKA-1106  HighwaterMarkCheckpoint failure puting broker into a bad state
https://issues.apache.org/jira/browse/KAFKA-1106
KAFKA-1093  Log.getOffsetsBefore(t, …) does not return the last confirmed 
offset before t
https://issues.apache.org/jira/browse/KAFKA-1093
KAFKA-1086  Improve GetOffsetShell to find metadata automatically
https://issues.apache.org/jira/browse/KAFKA-1086
KAFKA-1082  zkclient dies after UnknownHostException in zk reconnect
https://issues.apache.org/jira/browse/KAFKA-1082
KAFKA-1079  Liars in PrimitiveApiTest that promise to test api in compression 
mode, but don't do this actually
https://issues.apache.org/jira/browse/KAFKA-1079
KAFKA-1074  Reassign partitions should delete the old replicas from disk
https://issues.apache.org/jira/browse/KAFKA-1074
KAFKA-1049  Encoder implementations are required to provide an undocumented 
constructor.
https://issues.apache.org/jira/browse/KAFKA-1049
KAFKA-1032  Messages sent to the old leader will be lost on broker GC resulted 
failure
https://issues.apache.org/jira/browse/KAFKA-1032
KAFKA-1020  Remove getAllReplicasOnBroker from KafkaController
https://issues.apache.org/jira/browse/KAFKA-1020
KAFKA-1012  Implement an Offset Manager and hook offset requests to it
https://issues.apache.org/jira/browse/KAFKA-1012
KAFKA-1011  Decompression and re-compression on MirrorMaker could result in 
messages being dropped in the pipeline
https://issues.apache.org/jira/browse/KAFKA-1011
KAFKA-1005  kafka.perf.ConsumerPerformance not shutting down consumer
https://issues.apache.org/jira/browse/KAFKA-1005
KAFKA-998   Producer should not retry on non-recoverable error codes
https://issues.apache.org/jira/browse/KAFKA-998
KAFKA-997   Provide a strict verification mode when reading configuration 
properties
https://issues.apache.org/jira/browse/KAFKA-997
KAFKA-996   Capitalize first letter for log entries
https://issues.apache.org/jira/browse/KAFKA-996
KAFKA-984   Avoid a full rebalance in cases when a new topic is discovered but 
container/broker set stay the same
https://issues.apache.org/jira/browse/KAFKA-984
KAFKA-976   Order-Preserving Mirror Maker Testcase
https://issues.apache.org/jira/browse/KAFKA-976
KAFKA-967   Use key range in ProducerPerformance
https://issues.apache.org/jira/browse/KAFKA-967
KAFKA-917   Expose zk.session.timeout.ms in console consumer
https://issues.apache.org/jira/browse/KAFKA-917
KAFKA-885   sbt package builds two kafka jars
https://issues.apache.org/jira/browse/KAFKA-885
KAFKA-881   Kafka broker not respecting log.roll.hours
https://issues.apache.org/jira/browse/KAFKA-881
KAFKA-873   Consider replacing zkclient with curator (with zkclient-bridge)
https://issues.apache.org/jira/browse/KAFKA-873
KAFKA-868   System Test - add test case for rolling controlled shutdown
https://issues.apache.org/jira/browse/KAFKA-868
KAFKA-863   System Test - update 0.7 version of kafka-run-class.sh for 
Migration Tool test cases
https://issues.apache.org/jira/browse/KAFKA-863
KAFKA-859   support basic auth protection of mx4j console
https://issues.apache.org/jira/browse/KAFKA-859
KAFKA-855   Ant+Ivy build for Kafka
https://issues.apache.org/jira/browse/KAFKA-855
KAFKA-854   Upgrade dependencies for 0.8
https://issues.apache.org/jira/browse/KAFKA-854
KAFKA-815   Improve SimpleConsumerShell to take in a max messages config option
   

Re: Review Request 15938: replicas may not have consistent data after becoming follower

2013-12-02 Thread Neha Narkhede


> On Dec. 2, 2013, 4:55 p.m., Neha Narkhede wrote:
> > core/src/main/scala/kafka/server/KafkaApis.scala, line 396
> > 
> >
> > We had added the ability for a special consumer to read the replica log 
> > for troubleshooting. This patch takes that convenience away. We should 
> > probably look for another way to prevent the replica verification tool from 
> > giving false negatives. Can it use a different consumer id?
> 
> Jun Rao wrote:
> We could add another debugging consumer mode so that it can read beyond 
> HW. This will complicate the broker side logic a bit though. Also, reading 
> beyond HW always has the danger that the fetched data is garbage since it's 
> truncated. Perhaps we can wait and see if this new mode is really needed?
> 
> Neha Narkhede wrote:
> Yes, we can probably wait. So, if the debugging consumer also reads up to 
> the HW, just like a normal consumer, do we need to have a special "debugging 
> consumer"?

Hmm.. so the debugging consumer will be useful for reading from replicas, which 
ordinary consumers can't do. We can probably address the debugging consumer 
properly in the future if/when we find a use for reading beyond the HW. The rest 
of the patch looks good.


- Neha


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15938/#review29586
---


On Dec. 1, 2013, 11:33 p.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15938/
> ---
> 
> (Updated Dec. 1, 2013, 11:33 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1154
> https://issues.apache.org/jira/browse/KAFKA-1154
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> kafka-1154; fix 1
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/api/FetchRequest.scala 
> fb2a2306003ac64a8a3b2fc5fc999e0be273f48d 
>   core/src/main/scala/kafka/api/RequestOrResponse.scala 
> b62330be6241c8ff4acd21f0fa7e80b7636e0d42 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 80a70f1e5e3a7670b2238fe63b8d9e0eac6b46ac 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 54f6e1674255f62eba9d90aab0db371c82baf749 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala 
> f1f139e485d98e42be17cdcc327961420cd8c012 
> 
> Diff: https://reviews.apache.org/r/15938/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>



Re: Review Request 15938: replicas may not have consistent data after becoming follower

2013-12-02 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15938/#review29622
---

Ship it!

- Neha Narkhede


On Dec. 1, 2013, 11:33 p.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15938/
> ---
> 
> (Updated Dec. 1, 2013, 11:33 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1154
> https://issues.apache.org/jira/browse/KAFKA-1154
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> kafka-1154; fix 1
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/api/FetchRequest.scala 
> fb2a2306003ac64a8a3b2fc5fc999e0be273f48d 
>   core/src/main/scala/kafka/api/RequestOrResponse.scala 
> b62330be6241c8ff4acd21f0fa7e80b7636e0d42 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 80a70f1e5e3a7670b2238fe63b8d9e0eac6b46ac 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 54f6e1674255f62eba9d90aab0db371c82baf749 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala 
> f1f139e485d98e42be17cdcc327961420cd8c012 
> 
> Diff: https://reviews.apache.org/r/15938/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>



[jira] [Resolved] (KAFKA-1154) replicas may not have consistent data after becoming follower

2013-12-02 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-1154.


Resolution: Fixed

Thanks for the review. Committed to trunk after addressing the minor review 
comments.

> replicas may not have consistent data after becoming follower
> -
>
> Key: KAFKA-1154
> URL: https://issues.apache.org/jira/browse/KAFKA-1154
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.1
>Reporter: Jun Rao
>Assignee: Jun Rao
> Fix For: 0.8.1
>
> Attachments: KAFKA-1154.patch
>
>
> This is an issue introduced in KAFKA-1001. The issue is that in 
> ReplicaManager.makeFollowers(), we truncate the log before marking the 
> replica as the follower. New messages from the producer can still be added to 
> the log after the log is truncated, but before the replica is marked as the 
> follower. Those newly produced messages can actually be committed, which 
> implies those truncated messages are also committed. However, the new leader 
> is not guaranteed to have those truncated messages.
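
The ordering problem described above can be shown with a small deterministic simulation. This is a sketch under assumptions: `SketchReplica`, `buggy_order` and `safer_order` are hypothetical names, and `safer_order` is just one plausible remedy (stop accepting producer writes before truncating), not necessarily what the committed patch does.

```python
class SketchReplica:
    """Toy replica; names are illustrative, not Kafka's ReplicaManager."""

    def __init__(self, log, is_leader=True):
        self.log = list(log)
        self.is_leader = is_leader

    def append(self, msg) -> bool:
        """Producer writes are accepted only while this replica is leader."""
        if not self.is_leader:
            return False
        self.log.append(msg)
        return True

    def truncate(self, offset: int) -> None:
        """Drop every message at or after the given offset."""
        del self.log[offset:]


def buggy_order(replica, hw, racing_msg):
    """Truncate first, mark follower second: a write can slip in between."""
    replica.truncate(hw)
    replica.append(racing_msg)   # producer request arrives in the window
    replica.is_leader = False


def safer_order(replica, hw, racing_msg):
    """Mark follower first, so the racing write is rejected."""
    replica.is_leader = False
    replica.append(racing_msg)   # rejected: no longer the leader
    replica.truncate(hw)


a = SketchReplica(["m0", "m1", "m2"])
buggy_order(a, 2, "m3")
b = SketchReplica(["m0", "m1", "m2"])
safer_order(b, 2, "m3")
print(a.log, b.log)
```

In the buggy ordering the new message lands at the offset that the truncated message used to occupy, which is how the log can end up inconsistent with the new leader.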





[jira] [Created] (KAFKA-1156) Improve reassignment tool to output the existing assignment to facilitate rollbacks

2013-12-02 Thread Neha Narkhede (JIRA)
Neha Narkhede created KAFKA-1156:


 Summary: Improve reassignment tool to output the existing 
assignment to facilitate rollbacks
 Key: KAFKA-1156
 URL: https://issues.apache.org/jira/browse/KAFKA-1156
 Project: Kafka
  Issue Type: Bug
  Components: tools
Affects Versions: 0.8.1
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Priority: Critical


It is useful for the partition reassignment tool to output the current 
partition assignment as part of the dry run. This will make rollbacks easier if 
the reassignment does not work out.
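
A minimal sketch of what "output the current assignment" could look like: serialize the existing assignment before applying a new one, so the saved text can be fed back in as a rollback plan. `rollback_json` is a hypothetical helper, and the JSON layout (a version field plus a list of topic/partition/replicas entries) is an assumption about the reassignment tool's input format rather than something stated in this issue.

```python
import json


def rollback_json(current):
    """Serialize {(topic, partition): [replica ids]} as a reassignment plan."""
    partitions = [
        {"topic": topic, "partition": partition, "replicas": list(replicas)}
        for (topic, partition), replicas in sorted(current.items())
    ]
    return json.dumps({"version": 1, "partitions": partitions})


# Hypothetical current assignment, captured during the dry run.
existing = {("foo", 0): [1, 2], ("foo", 1): [2, 3]}
print(rollback_json(existing))
```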





[jira] [Created] (KAFKA-1157) Clean up Per-topic Configuration from Kafka properties

2013-12-02 Thread Guozhang Wang (JIRA)
Guozhang Wang created KAFKA-1157:


 Summary: Clean up Per-topic Configuration from Kafka properties
 Key: KAFKA-1157
 URL: https://issues.apache.org/jira/browse/KAFKA-1157
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang


After KAFKA-554, per-topic configurations could be removed from kafka 
properties.





Review Request 15950: Patch for KAFKA-1157

2013-12-02 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15950/
---

Review request for kafka.


Bugs: KAFKA-1157
https://issues.apache.org/jira/browse/KAFKA-1157


Repository: kafka


Description
---

KAFKA-1157.v1


Diffs
-

  core/src/main/scala/kafka/server/KafkaConfig.scala 
b324344d0a383398db8bfe2cbeec2c1378fe13c9 

Diff: https://reviews.apache.org/r/15950/diff/


Testing
---


Thanks,

Guozhang Wang



[jira] [Updated] (KAFKA-1157) Clean up Per-topic Configuration from Kafka properties

2013-12-02 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1157:
-

Attachment: KAFKA-1157.patch

> Clean up Per-topic Configuration from Kafka properties
> --
>
> Key: KAFKA-1157
> URL: https://issues.apache.org/jira/browse/KAFKA-1157
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Attachments: KAFKA-1157.patch
>
>
> After KAFKA-554, per-topic configurations could be removed from kafka 
> properties.





[jira] [Commented] (KAFKA-1157) Clean up Per-topic Configuration from Kafka properties

2013-12-02 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837131#comment-13837131
 ] 

Guozhang Wang commented on KAFKA-1157:
--

Created reviewboard https://reviews.apache.org/r/15950/
 against branch origin/trunk

> Clean up Per-topic Configuration from Kafka properties
> --
>
> Key: KAFKA-1157
> URL: https://issues.apache.org/jira/browse/KAFKA-1157
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Attachments: KAFKA-1157.patch
>
>
> After KAFKA-554, per-topic configurations could be removed from kafka 
> properties.





Review Request 15953: Patch for KAFKA-1134

2013-12-02 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15953/
---

Review request for kafka.


Bugs: KAFKA-1134
https://issues.apache.org/jira/browse/KAFKA-1134


Repository: kafka


Description
---

KAFKA-1134.v1


Diffs
-

  core/src/main/scala/kafka/controller/KafkaController.scala 
4c319aba97655e7c4ec97fac2e34de4e28c9f5d3 

Diff: https://reviews.apache.org/r/15953/diff/


Testing
---


Thanks,

Guozhang Wang



[jira] [Updated] (KAFKA-1134) onControllerFailover function should be synchronized with other functions

2013-12-02 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1134:
-

Attachment: KAFKA-1134.patch

> onControllerFailover function should be synchronized with other functions
> -
>
> Key: KAFKA-1134
> URL: https://issues.apache.org/jira/browse/KAFKA-1134
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8, 0.8.1
>Reporter: Guozhang Wang
> Attachments: KAFKA-1134.patch
>
>
> Otherwise race conditions could happen. For example, handleNewSession will 
> close all sockets with brokers while the handleStateChange in 
> onControllerFailover tries to send requests to them.





[jira] [Commented] (KAFKA-1134) onControllerFailover function should be synchronized with other functions

2013-12-02 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837172#comment-13837172
 ] 

Guozhang Wang commented on KAFKA-1134:
--

Created reviewboard https://reviews.apache.org/r/15953/
 against branch origin/trunk

> onControllerFailover function should be synchronized with other functions
> -
>
> Key: KAFKA-1134
> URL: https://issues.apache.org/jira/browse/KAFKA-1134
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8, 0.8.1
>Reporter: Guozhang Wang
> Attachments: KAFKA-1134.patch
>
>
> Otherwise race conditions could happen. For example, handleNewSession will 
> close all sockets with brokers while the handleStateChange in 
> onControllerFailover tries to send requests to them.





Re: Review Request 15953: Patch for KAFKA-1134

2013-12-02 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15953/#review29636
---



core/src/main/scala/kafka/controller/KafkaController.scala


It seems that onControllerFailover is already protected by the 
controllerLock. The elect() API of ZookeeperLeaderElector is invoked in 3 
places and each of those acquires the controllerLock.


- Neha Narkhede


On Dec. 3, 2013, 12:58 a.m., Guozhang Wang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15953/
> ---
> 
> (Updated Dec. 3, 2013, 12:58 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1134
> https://issues.apache.org/jira/browse/KAFKA-1134
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1134.v1
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/controller/KafkaController.scala 
> 4c319aba97655e7c4ec97fac2e34de4e28c9f5d3 
> 
> Diff: https://reviews.apache.org/r/15953/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Guozhang Wang
> 
>



Re: Review Request 15953: Patch for KAFKA-1134

2013-12-02 Thread Guozhang Wang


> On Dec. 3, 2013, 1:30 a.m., Neha Narkhede wrote:
> > core/src/main/scala/kafka/controller/KafkaController.scala, line 235
> > 
> >
> > it seems that onControllerFailover is already protected by the 
> > controllerLock. The elect() API of ZookeeperLeaderElector is invoked in 3 
> > places and each of those acquires the controllerLock

You are right. The real issue is not that onControllerFailover is 
unsynchronized, but that sendRequest is asynchronous. onControllerFailover 
just puts the request on the queue, and by the time the send thread wakes up 
to send the message, the socket may already have been closed by the 
handleNewSession procedure.

I think the correct fix is that 
ControllerChannelManager.removeExistingBroker should also clear the request 
queue.
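
The proposed fix can be sketched as follows. This is an illustration only: `SketchChannelManager` and its methods are hypothetical Python stand-ins, loosely named after ControllerChannelManager, and the sketch assumes one pending-request queue per broker.

```python
from collections import deque


class SketchChannelManager:
    """Toy model of per-broker request queues; not Kafka's implementation."""

    def __init__(self):
        self.queues = {}    # broker id -> pending requests
        self.open = set()   # broker ids whose socket is still open

    def add_broker(self, broker_id):
        self.queues[broker_id] = deque()
        self.open.add(broker_id)

    def send_request(self, broker_id, request):
        """Asynchronous send: the request is only enqueued here."""
        self.queues[broker_id].append(request)

    def remove_existing_broker(self, broker_id):
        """Close the socket and drop anything still waiting to be sent."""
        self.open.discard(broker_id)
        self.queues[broker_id].clear()   # the proposed extra step

    def drain(self, broker_id):
        """What the send thread would do; returns the requests actually sent."""
        sent = []
        pending = self.queues[broker_id]
        while pending:
            if broker_id not in self.open:
                raise ConnectionError("socket already closed")
            sent.append(pending.popleft())
        return sent


manager = SketchChannelManager()
manager.add_broker(1)
manager.send_request(1, "state-change")   # enqueued during failover
manager.remove_existing_broker(1)         # session expiry closes the socket
print(manager.drain(1))                   # nothing is left to send
```

Without the `clear()` call, `drain` would find a queued request for a closed socket and raise, which mirrors the failure described above.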


- Guozhang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15953/#review29636
---


On Dec. 3, 2013, 12:58 a.m., Guozhang Wang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15953/
> ---
> 
> (Updated Dec. 3, 2013, 12:58 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1134
> https://issues.apache.org/jira/browse/KAFKA-1134
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1134.v1
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/controller/KafkaController.scala 
> 4c319aba97655e7c4ec97fac2e34de4e28c9f5d3 
> 
> Diff: https://reviews.apache.org/r/15953/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Guozhang Wang
> 
>



[jira] [Resolved] (KAFKA-1157) Clean up Per-topic Configuration from Kafka properties

2013-12-02 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-1157.


   Resolution: Fixed
Fix Version/s: 0.8.1

Thanks for the patch. +1 and committed to trunk.

> Clean up Per-topic Configuration from Kafka properties
> --
>
> Key: KAFKA-1157
> URL: https://issues.apache.org/jira/browse/KAFKA-1157
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.8.1
>
> Attachments: KAFKA-1157.patch
>
>
> After KAFKA-554, per-topic configurations could be removed from kafka 
> properties.





Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread Jun Rao
The release voting is based on lazy majority (
https://cwiki.apache.org/confluence/display/KAFKA/Bylaws#Bylaws-Voting). So
a -1 doesn't kill the release. The question is whether those issues are
really show stoppers.

Thanks,

Jun




On Mon, Dec 2, 2013 at 10:19 AM, David Arthur  wrote:

> Inline:
>
>
> On 12/2/13 11:59 AM, Joe Stein wrote:
>
>> General future thought comment first: let's please be careful about
>> raising issues as show stoppers when they have been there previously
>> (especially if a release more than one version back also has the problem),
>> can be fixed in a subsequent release, and are only now more pressing
>> because we know about them... seeing something should not necessarily
>> always create priority (sometimes, sure, of course, but not always; that
>> is not the best way to manage changes).  The VOTE thread should be about
>> the artifacts and whether what we are releasing is proper and correct per
>> Apache guidelines... and to make sure that the person doing the release
>> doesn't do something incorrect ... like using the wrong version of JDK to
>> build =8^/.  If we are not happy with the release as ready to ship, then
>> let's not call a VOTE and save the prolonged weeks that drag out with so
>> many release candidates.  The community suffers from this.
>>
> +1 If we can get most of this release preparation stuff automated, then we
> can iterate on it in a release branch before tagging and voting.
>
>> ok, now on to RC5 ... let's extend the vote until 12pm PT tomorrow ...
>> hopefully a few more hours for other folks to comment and discuss the
>> issues you raised with my $0.02852425 included below and follow-ups as
>> they become necessary... I am also out of pocket in a few hours until
>> tomorrow morning so if it passed I would not be able to publish and
>> announce or if failed look towards RC6 anyways =8^)
>>
>> /***
>>   Joe Stein
>>   Founder, Principal Consultant
>>   Big Data Open Source Security LLC
>>   http://www.stealth.ly
>>   Twitter: @allthingshadoop 
>> /
>>
>>
>> On Mon, Dec 2, 2013 at 11:00 AM, David Arthur  wrote:
>>
>>> Seems like most people are verifying the src, so I'll pick on the
>>> binaries and Maven stuff ;)
>>>
>>> A few problems I see:
>>>
>>> There are some vestigial Git files in the src download: an empty .git and
>>> .gitignore
>>>
>> Ok, I can do a better job with 0.8.1 but I am not sure this is very
>> different than beta1 and not necessarily a show stopper for 0.8.0
>> requiring another release candidate, is it?  I think updating the release
>> docs and rmdir .git after the rm -fr and rm .gitignore moving forward
>> makes sense.
>>
> Agreed, not a show stopper.
>
>
>>
>>> In the source download, I see the SBT license in LICENSE which seems
>>> correct (since we distribute an SBT binary), but in the binary download I
>>> see the same license. Don't we need the Scala license (
>>> http://www.scala-lang.org/license.html) in the binary distribution?
>>>
>> I fixed this already not only in the binary release
>> https://issues.apache.org/jira/browse/KAFKA-1131 but also in the JAR
>> files that are published to Maven
>> https://issues.apache.org/jira/browse/KAFKA-1133 are you checking from
>> http://people.apache.org/~joestein/kafka-0.8.0-candidate5/ because I just
>> downloaded again and it looks alright to me.  If not then definitely this
>> RC should be shot down because it does not do what we are saying it is
>> doing.. but if it is wrong can you be more specific and create a JIRA with
>> the fix because I thought I got it right already... but if not then let's
>> get it right because that is why we pulled the release in RC3
>>
> The LICENSE file in both the src and binary downloads includes "SBT
> LICENSE" at the end. I could be wrong, but I think the src download should
> include the SBT license and the binary download should include the Scala
> license. Since we have released in the past without proper licensing, it's
> probably not a huge deal to do it again (but we should fix it).
>
>
>>> I created a simple Ant+Ivy project to test resolving the artifacts
>>> published to Apache staging repo: https://github.com/mumrah/kafka-ivy.
>>> This will fetch Kafka libs from the Apache staging area and other things
>>> from Maven Central. It will fetch the jars into lib/ivy/{conf} and
>>> generate a report of the dependencies, conflicts, and licenses into
>>> ivy-report. Notice I had to add three exclusions to get things working.
>>> Maybe we should add these to our pom?
>>>
>> I don't think this is a showstopper is it?  can't this wait for 0.8.1
>> and not hold up the 0.8.0 release?
>>
> No I don't think it's a show stopper. But to Neha's point, a painless
> Maven/Ivy/SBT/Gradle integration is important since this is how most users
> interface with Kafka. That said, ZooKeeper is what's pulling in these
> trou