[GitHub] kafka pull request: Kafka 2015 - Enable ConsoleConsumer to use new...

2015-08-17 Thread benstopford
GitHub user benstopford opened a pull request:

https://github.com/apache/kafka/pull/144

Kafka 2015 - Enable ConsoleConsumer to use new consumer

This extends the original patch done by GZ to provide console access to 
both the new and old consumer APIs. The code follows a pattern similar to the 
one already used in ConsoleProducer. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/benstopford/kafka KAFKA-2015

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/144.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #144


commit 68dc02dcdff89ed20c6b5fab9d24b618ad25c714
Author: Ben Stopford 
Date:   2015-08-13T12:36:13Z

KAFKA-2015

commit 6735a6af16d25fa256f8cf91890694316d1e6950
Author: Ben Stopford 
Date:   2015-08-13T12:41:43Z

KAFKA-2015

commit 6e23964d608a19509aa9880d8d2944ad795bacd6
Author: Ben Stopford 
Date:   2015-08-13T12:42:17Z

KAFKA-2015

commit 68ce8d789a98705e3ee82fbb6c7727db6c8434e2
Author: Ben Stopford 
Date:   2015-08-13T17:00:57Z

KAFKA-2015 - implemented base case keeping the existing BaseConsumer design 
(which in truth I'm not super-fond of, but it retains consistency with 
BaseProducer). This check-in still uses TopicPartition subscription, rather 
than topic subscription.

commit dc540f25d0b4708e93c79cbbadf851bc83b8fe5c
Author: Ben Stopford 
Date:   2015-08-13T17:29:42Z

KAFKA-2015 - changed to use topic subscription rather than TopicPartition 
subscription

commit b793f0e9ce0e6496f6fee8322668c2d8149ef573
Author: Ben Stopford 
Date:   2015-08-13T18:01:46Z

KAFKA-2015 - tidied leftovers from original Jira Patch from GZ

commit cb5681d27d387a7c428ead3d962bef8acf6b5ff0
Author: Ben Stopford 
Date:   2015-08-13T18:15:17Z

KAFKA-2015 - fix indentation

commit 3072e5825e2715ae28cd48fb2bda2b51468d2373
Author: Ben Stopford 
Date:   2015-08-14T15:42:39Z

KAFKA-2015 - added some limited testing

commit ba25f099adcc50ef8b5d77c8fe5c1ba137ae02fd
Author: Ben Stopford 
Date:   2015-08-14T18:18:14Z

KAFKA-2015 - couple more tests

commit 3d0638f5da2e0d35c74ad6081ac12786103d1d92
Author: Ben Stopford 
Date:   2015-08-14T18:20:20Z

KAFKA-2015 - couple more tests

commit 9d01e8ad292039abcc8201de74467b4b4b8af601
Author: Ben Stopford 
Date:   2015-08-14T18:41:47Z

KAFKA-2015 - added formatters back in

commit cb0ede65b5c343b6396992a147fc10510e65dbdf
Author: Ben Stopford 
Date:   2015-08-14T18:52:09Z

KAFKA-2015 - formatting only

commit 2e9ade93a88641bcdbf1e736dd9d4874a47e854b
Author: Ben Stopford 
Date:   2015-08-15T18:03:39Z

KAFKA-2015 - added back in previous change to remove randomly created group 
ids from ZK in the shutdown hook

commit de06323b7b75aa4450df16e5315c0db70de06fe0
Author: Ben Stopford 
Date:   2015-08-17T07:53:02Z

KAFKA-2015 - whitespace only




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2388) subscribe(topic)/unsubscribe(topic) should either take a callback to allow user to handle exceptions or it should be synchronous.

2015-08-17 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699166#comment-14699166
 ] 

Onur Karaman commented on KAFKA-2388:
-

Hey [~hachikuji]. I'll try to take a look at it later today.

> subscribe(topic)/unsubscribe(topic) should either take a callback to allow 
> user to handle exceptions or it should be synchronous.
> -
>
> Key: KAFKA-2388
> URL: https://issues.apache.org/jira/browse/KAFKA-2388
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jiangjie Qin
>Assignee: Jason Gustafson
>
> According to the mailing list discussion on the consumer interface, we'll 
> replace:
> {code}
> public void subscribe(String... topics);
> public void subscribe(TopicPartition... partitions);
> public Set<TopicPartition> subscriptions();
> {code}
> with:
> {code}
> void subscribe(List<String> topics, RebalanceCallback callback);
> void assign(List<TopicPartition> partitions);
> List<String> subscriptions();
> List<TopicPartition> assignments();
> {code}
> We don't need the unsubscribe APIs anymore.
> The RebalanceCallback would look like:
> {code}
> interface RebalanceCallback {
>   void onAssignment(List<TopicPartition> partitions);
>   void onRevocation(List<TopicPartition> partitions);
>   // handle non-existing topics, etc.
>   void onError(Exception e);
> }
> {code}
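For illustration, here is a minimal, self-contained sketch of how a client might implement the proposed callback. The `TopicPartition` stand-in and the `CommitOnRevokeCallback` name are hypothetical, defined here only so the example compiles on its own; the real types would come from the Kafka clients package.

```java
import java.util.List;

// Stand-in for org.apache.kafka.common.TopicPartition (an assumption, kept
// minimal so this example is self-contained).
class TopicPartition {
    final String topic;
    final int partition;
    TopicPartition(String topic, int partition) {
        this.topic = topic;
        this.partition = partition;
    }
}

// The callback interface as proposed in the discussion above.
interface RebalanceCallback {
    void onAssignment(List<TopicPartition> partitions);
    void onRevocation(List<TopicPartition> partitions);
    void onError(Exception e);
}

// One plausible implementation: track ownership and hook commit/seek logic
// around rebalances.
class CommitOnRevokeCallback implements RebalanceCallback {
    int owned = 0;

    public void onAssignment(List<TopicPartition> partitions) {
        owned = partitions.size();   // e.g. seek to last committed offsets here
    }

    public void onRevocation(List<TopicPartition> partitions) {
        owned = 0;                   // e.g. commit offsets before losing ownership
    }

    public void onError(Exception e) {
        // e.g. surface a non-existent topic to the application
    }
}
```

The synchronous alternative from the issue title would instead have `subscribe` block until the rebalance completes and throw on failure.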



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2072) Add StopReplica request/response to o.a.k.common.requests and replace the usage in core module

2015-08-17 Thread David Jacot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699279#comment-14699279
 ] 

David Jacot commented on KAFKA-2072:


As discussed with [~ijuma], it would be better to wait until KAFKA-2411 is 
completed and rebase this patch on top of it.

> Add StopReplica request/response to o.a.k.common.requests and replace the 
> usage in core module
> --
>
> Key: KAFKA-2072
> URL: https://issues.apache.org/jira/browse/KAFKA-2072
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Gwen Shapira
>Assignee: David Jacot
> Fix For: 0.8.3
>
>
> Add StopReplica request/response to o.a.k.common.requests and replace the 
> usage in core module





Re: Review Request 33620: Patch for KAFKA-1690

2015-08-17 Thread Ismael Juma


> On July 27, 2015, 1:32 p.m., Ismael Juma wrote:
> > clients/src/main/java/org/apache/kafka/common/config/SSLConfigs.java, line 
> > 29
> > 
> >
> > SSL is deprecated 
> > (https://ma.ttias.be/rfc-7568-ssl-3-0-is-now-officially-deprecated/), 
> > should we be supporting it at all? If we do support it, then we should at 
> > least include a warning.
> 
> Ismael Juma wrote:
> I noticed that SSLv3 is actually disabled in newer JDK 8 releases by 
> default.
> 
> Sriharsha Chintalapani wrote:
> I know SSL 3.0 is deprecated. As you can see from SSLConfigs.java, we 
> enable DEFAULT_ENABLED_PROTOCOLS = "TLSv1.2,TLSv1.1,TLSv1", i.e. only TLS, not SSL.

Yes, I understand. I was asking whether we should support those protocols at 
all. I'll drop this from this review though, it can be discussed separately.


> On July 27, 2015, 1:32 p.m., Ismael Juma wrote:
> > clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java,
> >  line 34
> > 
> >
> > Why not create the principal at this point instead of in the 
> > `principal()` method? Is it a costly operation?
> 
> Sriharsha Chintalapani wrote:
> At this point we haven't gone through the handshake needed to establish 
> the principal.

I think it would be nice to have a comment for things like this, but I'll drop 
it.


> On July 27, 2015, 1:32 p.m., Ismael Juma wrote:
> > core/src/main/scala/kafka/api/FetchResponse.scala, line 82
> > 
> >
> > Casts are to be avoided in Scala, pattern matching is a better way to 
> > do this:
> > 
> > `channel match {
> >   case tl: TransportLayer => pending = tl.hasPendingWrites
> >   case _ =>
> > }`
> > 
> > However, I see that this pattern is repeated in many classes, which is 
> > not good. Assuming that we can't change `Send.writeTo` to take a 
> > `TransportLayer` (either for compatibility or because there are 
> > implementations that don't use a `TransportLayer`), we should consider 
> > introducing a utility method `hasPendingWrites(channel)` that calls 
> > `hasPendingWrites` or returns false.
> > 
> > What do you think?
> 
> Sriharsha Chintalapani wrote:
> The reason for doing this is that the channel here can be a 
> GatheringByteChannel or some other SocketChannel. As I mentioned in one of 
> the comments, once KafkaChannel is used across the system, especially when 
> we deprecate/remove BlockingChannel, we can go ahead and call 
> channel.hasPendingWrites directly.
> 
> Ismael Juma wrote:
> That makes sense. Still, having a utility method would make it easier to 
> make that change when that time comes, right?

Dropping it from this review.
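The suggested utility method could look roughly like this (Java here, with a stand-in `TransportLayer` interface so the sketch is self-contained; the real interface lives in `o.a.k.common.network`):

```java
// Stand-in for the TransportLayer interface (an assumption, for a
// self-contained example).
interface TransportLayer {
    boolean hasPendingWrites();
}

final class ChannelUtils {
    private ChannelUtils() {}

    // Centralizes the repeated type check: returns the transport layer's
    // pending-writes state when available, and false for plain channels
    // (e.g. those behind BlockingChannel).
    static boolean hasPendingWrites(Object channel) {
        if (channel instanceof TransportLayer)
            return ((TransportLayer) channel).hasPendingWrites();
        return false;
    }
}
```

Once BlockingChannel is removed and everything goes through KafkaChannel, call sites can drop the utility and call `channel.hasPendingWrites` directly.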


> On July 27, 2015, 1:32 p.m., Ismael Juma wrote:
> > core/src/main/scala/kafka/server/KafkaConfig.scala, line 867
> > 
> >
> > Why are we using a Java Map here? Is it used in Java code? If not, then 
> > why not use Scala's Map.apply to make the code more concise and idiomatic? 
> > i.e. `Map(key1 -> value1, key2 -> value2...)`
> 
> Sriharsha Chintalapani wrote:
> because we need to pass these channelConfigs to SSLFactory.java

Using `asJava` from `JavaConverters` would probably be nicer, but I'll drop it 
from this review.


- Ismael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/#review93110
---


On Aug. 17, 2015, 3:41 a.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33620/
> ---
> 
> (Updated Aug. 17, 2015, 3:41 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1690
> https://issues.apache.org/jira/browse/KAFKA-1690
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Added 
> PrincipalBuilder.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Fixed minor 
> issues with the patch.
> 
> 
> KAFKA-1690. new 

Re: Review Request 33620: Patch for KAFKA-1690

2015-08-17 Thread Ismael Juma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/#review95582
---



clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java 
(line 244)


The string `channelId` should come after `NEED_UNWRAP` for consistency with 
the other log messages.



clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java 
(line 325)


The string `channelId` should come after `FINISHED` for consistency with 
other log messages. Or we should remove it from all of them.



clients/src/main/java/org/apache/kafka/common/network/Selector.java (line 260)


It seems like there were some issues with the merge conflict resolution 
here: see `### Ancestor`, `end`, `variant B` in the javadoc.

The reason for the conflict is that there was a commit that fixed the 
reference to the `poll` method, the corrected description is:

```
* lists will be cleared at the beginning of each {@link #poll(long)} call and repopulated by the call if there is
* any completed I/O.
```


- Ismael Juma


On Aug. 17, 2015, 3:41 a.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33620/
> ---
> 
> (Updated Aug. 17, 2015, 3:41 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1690
> https://issues.apache.org/jira/browse/KAFKA-1690
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Added 
> PrincipalBuilder.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Fixed minor 
> issues with the patch.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Fixed minor 
> issues with the patch.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Broker side ssl changes.
> 
> 
> KAFKA-1684. SSL for socketServer.
> 
> 
> KAFKA-1690. Added SSLProducerSendTest and fixes to get right port for SSL.
> 
> 
> Merge branch 'trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Post merge fixes.
> 
> 
> KAFKA-1690. Added SSLProducerSendTest.
> 
> 
> KAFKA-1690. Minor fixes based on patch review comments.
> 
> 
> Merge commit
> 
> 
> KAFKA-1690. Added SSL Consumer Test.
> 
> 
> KAFKA-1690. SSL Support.
> 
> 
> KAFKA-1690. Addressing reviews.
> 
> 
> Merge branch 'trunk' into KAFKA-1690-V1
> 
> 
> Merge branch 'trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Addressing reviews. Removed interestOps from SSLTransportLayer.
> 
> 
> KAFKA-1690. Addressing reviews.
> 
> 
> KAFKA-1690. added staged receives to selector.
> 
> 
> KAFKA-1690. Addressing reviews.
> 
> 
> Merge branch 'trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Addressing reviews.
> 
> 
> KAFKA-1690. Add SSL support to broker, producer and consumer.
> 
> 
> Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Add SSL support to Kafka Broker, Producer & Client.
> 
> 
> Diffs
> -
> 
>   build.gradle c7f66be00d86146b2bff6c690471690b9c4f46f8 
>   checkstyle/import-control.xml e3f4f84c6becfd9087627f018690e1e2fc2b3bba 
>   clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
> 0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> 2c421f42ed3fc5d61cf9c87a7eaa7bb23e26f63b 
>   clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
> 0e51d7bd461d253f4396a5b6ca7cd391658807fa 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> d35b421a515074d964c7fccb73d260b847ea5f00 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> be46b6c213ad8c6c09ad22886a5f36175ab0e13a 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 03b8dd23df63a8d8a117f02eabcce4a2d48c44f7 
>   clients/src/main/j

[jira] [Commented] (KAFKA-1387) Kafka getting stuck creating ephemeral node it has already created when two zookeeper sessions are established in a very short period of time

2015-08-17 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699514#comment-14699514
 ] 

Flavio Junqueira commented on KAFKA-1387:
-

There are two high-level problems described here: ZK losing ephemerals, and 
ephemerals not going away. I haven't been able to reproduce the former, but 
I have found one potential problem that could be causing it.

I started by finding it suspicious that the ZK listeners were not dealing with 
session events at all:

{code}
def handleStateChanged(state: KeeperState) {
  // do nothing, since zkclient will do reconnect for us.
}
{code}

 It is quite typical with ZK that you wait for the connected event before 
making progress. Looking at the ZkClient implementation, I realized that it 
retries operations in the case of connection loss or session expiration until 
they go through. There is a race here, though. Say you submit a create, but 
instead of getting OK as a response, you get connection loss. ZkClient in this 
case will say "well, need to retry" and will get a node exists exception, which 
the code currently treats as a znode from a previous session. This znode will 
never go away because it belongs to the current session!

Now let's say we get rid of such corner cases. It is still possible that when 
the client recovers it finds a znode from a previous session. It can happen 
because the lease (session) corresponding to the znode is still valid, so ZK 
can't get rid of it. Revoking leases in general is a bit complicated, but it 
sounds ok in this case if there is no risk of having multiple incarnations of 
the same element (a broker) running concurrently.
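One way to close the first corner case is to disambiguate the NodeExists result by comparing the existing znode's `ephemeralOwner` (available from ZooKeeper's `Stat`) with the client's current session id. A sketch of that decision logic, with illustrative return labels (not actual Kafka code):

```java
final class EphemeralNodeCheck {
    // On NodeExists after a retried create: if the existing znode's
    // ephemeralOwner equals our session id, the earlier create actually
    // succeeded and the retry can be treated as a success; otherwise the
    // znode really does belong to an older session and we must wait for
    // (or force) that session's expiry.
    static String handleNodeExists(long ephemeralOwner, long currentSessionId) {
        if (ephemeralOwner == currentSessionId)
            return "treat-as-success";
        return "wait-for-expiry";
    }
}
```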

> Kafka getting stuck creating ephemeral node it has already created when two 
> zookeeper sessions are established in a very short period of time
> -
>
> Key: KAFKA-1387
> URL: https://issues.apache.org/jira/browse/KAFKA-1387
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.1
>Reporter: Fedor Korotkiy
>Priority: Blocker
>  Labels: newbie, patch, zkclient-problems
> Attachments: kafka-1387.patch
>
>
> Kafka broker re-registers itself in zookeeper every time handleNewSession() 
> callback is invoked.
> https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaHealthcheck.scala
>  
> Now imagine the following sequence of events.
> 1) Zookeeper session reestablishes. handleNewSession() callback is queued by 
> the zkClient, but not invoked yet.
> 2) Zookeeper session reestablishes again, queueing callback second time.
> 3) First callback is invoked, creating /broker/[id] ephemeral path.
> 4) Second callback is invoked and it tries to create /broker/[id] path using 
> createEphemeralPathExpectConflictHandleZKBug() function. But the path is 
> already exists, so createEphemeralPathExpectConflictHandleZKBug() is getting 
> stuck in the infinite loop.
> Seems like the controller election code has the same issue.
> I'm able to reproduce this issue on the 0.8.1 branch from github using the 
> following configs.
> # zookeeper
> tickTime=10
> dataDir=/tmp/zk/
> clientPort=2101
> maxClientCnxns=0
> # kafka
> broker.id=1
> log.dir=/tmp/kafka
> zookeeper.connect=localhost:2101
> zookeeper.connection.timeout.ms=100
> zookeeper.sessiontimeout.ms=100
> Just start kafka and zookeeper and then pause zookeeper several times using 
> Ctrl-Z.





question in regard to KIP-30

2015-08-17 Thread Sergiy Yevtushenko
Hi,

Are there any plans to work on this improvement?
As a possible core for the default implementation, it might be worth considering
https://github.com/belaban/jgroups-raft .
It already contains a Raft consensus algorithm implementation. Adding
something like a distributed hash map for shared metadata should not be an
issue.

Regards,
Sergiy.


Re: question in regard to KIP-30

2015-08-17 Thread Joe Stein
I don't think it makes sense to change the core default implementation with
KIP-30. There is too much risk, both in stability and in increasing the time to
get it done and available for folks who want to try Kafka without
Zookeeper.

It would be interesting to see how that implementation would work alongside
the others too. If that is something folks want to support in their
environment, I encourage it.

I will be sending out a discuss thread on KIP-30, hopefully in the next day
or so, so we can go back and forth on the motivations and purposes behind it
and, of course, whatever technical details are required.

~ Joe Stein

On Mon, Aug 17, 2015 at 9:29 AM, Sergiy Yevtushenko <
sergiy.yevtushe...@gmail.com> wrote:

> Hi,
>
> Are there any plans to work on this improvement?
> As a possible core for default implementation it might worth to consider
> https://github.com/belaban/jgroups-raft .
> It already contains RAFT consensus algorithm implementation. Adding
> something like distributed hash map for shared metadata should not be an
> issue.
>
> Regards,
> Sergiy.
>


[jira] [Updated] (KAFKA-2092) New partitioning for better load balancing

2015-08-17 Thread Gianmarco De Francisci Morales (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gianmarco De Francisci Morales updated KAFKA-2092:
--
Attachment: KAFKA-2092-v3.patch

Updated formatting to pass checkstyle checks.

> New partitioning for better load balancing
> --
>
> Key: KAFKA-2092
> URL: https://issues.apache.org/jira/browse/KAFKA-2092
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Reporter: Gianmarco De Francisci Morales
>Assignee: Jun Rao
> Attachments: KAFKA-2092-v1.patch, KAFKA-2092-v2.patch, 
> KAFKA-2092-v3.patch
>
>
> We have recently studied the problem of load balancing in distributed stream 
> processing systems such as Samza [1].
> In particular, we focused on what happens when the key distribution of the 
> stream is skewed when using key grouping.
> We developed a new stream partitioning scheme (which we call Partial Key 
> Grouping). It achieves better load balancing than hashing while being more 
> scalable than round robin in terms of memory.
> In the paper we show a number of mining algorithms that are easy to implement 
> with partial key grouping, and whose performance can benefit from it. We 
> think that it might also be useful for a larger class of algorithms.
> PKG has already been integrated in Storm [2], and I would like to be able to 
> use it in Samza as well. As far as I understand, Kafka producers are the ones 
> that decide how to partition the stream (or Kafka topic).
> I do not have experience with Kafka, however partial key grouping is very 
> easy to implement: it requires just a few lines of code in Java when 
> implemented as a custom grouping in Storm [3].
> I believe it should be very easy to integrate.
> For all these reasons, I believe it will be a nice addition to Kafka/Samza. 
> If the community thinks it's a good idea, I will be happy to offer support in 
> the porting.
> References:
> [1] 
> https://melmeric.files.wordpress.com/2014/11/the-power-of-both-choices-practical-load-balancing-for-distributed-stream-processing-engines.pdf
> [2] https://issues.apache.org/jira/browse/STORM-632
> [3] https://github.com/gdfm/partial-key-grouping
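The core of Partial Key Grouping is small enough to sketch: each key hashes to two candidate partitions, and each message goes to the currently less loaded of the two. The hashing scheme and load counter below are illustrative choices, not the paper's exact implementation:

```java
// Minimal sketch of Partial Key Grouping ("power of both choices"): a key
// maps to two candidate partitions via two hash functions, and each message
// is routed to whichever candidate has received fewer messages so far.
final class PartialKeyGrouper {
    private final int numPartitions;
    private final long[] load;   // messages sent per partition so far

    PartialKeyGrouper(int numPartitions) {
        this.numPartitions = numPartitions;
        this.load = new long[numPartitions];
    }

    // Two cheap hash variants, distinguished by seed (illustrative only).
    private int hash(String key, int seed) {
        return Math.floorMod(key.hashCode() * 31 + seed, numPartitions);
    }

    int partition(String key) {
        int a = hash(key, 0);
        int b = hash(key, 1);
        int chosen = load[a] <= load[b] ? a : b;   // pick the less loaded candidate
        load[chosen]++;
        return chosen;
    }
}
```

For a skewed key, traffic is spread over two partitions instead of one, which bounds the worst-case imbalance while keeping per-key state on only two workers.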





[jira] [Commented] (KAFKA-2435) More optimally balanced partition assignment strategy

2015-08-17 Thread Andrew Olson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699565#comment-14699565
 ] 

Andrew Olson commented on KAFKA-2435:
-

[~becket_qin] Thanks for the info, I'll send a pull request. This new strategy 
specifically targets the case where the consumer topic subscriptions are not 
all identical, which I don't believe KAFKA-2019 addresses -- it looks like 
it would probably reduce the potential for a large imbalance, but only if there 
is more than one thread per consumer.

> More optimally balanced partition assignment strategy
> -
>
> Key: KAFKA-2435
> URL: https://issues.apache.org/jira/browse/KAFKA-2435
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Andrew Olson
>Assignee: Andrew Olson
> Attachments: KAFKA-2435.patch
>
>
> While the roundrobin partition assignment strategy is an improvement over the 
> range strategy, when the consumer topic subscriptions are not identical 
> (previously disallowed but will be possible as of KAFKA-2172) it can produce 
> heavily skewed assignments. As suggested 
> [here|https://issues.apache.org/jira/browse/KAFKA-2172?focusedCommentId=14530767&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14530767]
>  it would be nice to have a strategy that attempts to assign an equal number 
> of partitions to each consumer in a group, regardless of how similar their 
> individual topic subscriptions are. We can accomplish this by tracking the 
> number of partitions assigned to each consumer, and having the partition 
> assignment loop assign each partition to a consumer interested in that topic 
> with the least number of partitions assigned. 
> Additionally, we can optimize the distribution fairness by adjusting the 
> partition assignment order:
> * Topics with fewer consumers are assigned first.
> * In the event of a tie for least consumers, the topic with more partitions 
> is assigned first.
> The general idea behind these two rules is to keep the most flexible 
> assignment choices available as long as possible by starting with the most 
> constrained partitions/consumers.





Re: question in regard to KIP-30

2015-08-17 Thread Sergiy Yevtushenko
Well, from my point of view it at least makes sense to abstract out the
implementation. Right now, any attempt to replace ZK with something else leads
to significant effort.
So even providing a pluggable interface for consensus and metadata
would be extremely helpful, and would enable other people to provide a
replacement for the ZK-based implementation. There are many situations where ZK
is too inconvenient. For example, ZK is very inconvenient for embedding.
This in turn makes embedding Kafka very inconvenient too (while Kafka
itself is quite convenient in this regard).

Also, jgroups-raft is only one implementation of Raft. There are several
others that could potentially be used for this purpose.

2015-08-17 16:42 GMT+03:00 Joe Stein :

> I don't think it makes sense to change the core default implementation with
> KIP-30. Too much risk both in stability and in increasing the time to
> getting it done and available for folks that want to try Kafka without
> Zookeeper.
>
> It would be interesting to see how that implementation would work along
> with the others too if that would be something folks wanted to support in
> their environment I encourage that.
>
> I will be sending out a discuss thread on KIP-30 hopefully in the next day
> or so we can go back and forth on the motivations and purposes behind it
> and whatever technical details are required too of course.
>
> ~ Joe Stein
>
> On Mon, Aug 17, 2015 at 9:29 AM, Sergiy Yevtushenko <
> sergiy.yevtushe...@gmail.com> wrote:
>
> > Hi,
> >
> > Are there any plans to work on this improvement?
> > As a possible core for default implementation it might worth to consider
> > https://github.com/belaban/jgroups-raft .
> > It already contains RAFT consensus algorithm implementation. Adding
> > something like distributed hash map for shared metadata should not be an
> > issue.
> >
> > Regards,
> > Sergiy.
> >
>


Re: Review Request 33620: Patch for KAFKA-1690

2015-08-17 Thread Sriharsha Chintalapani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/
---

(Updated Aug. 17, 2015, 3:13 p.m.)


Review request for kafka.


Bugs: KAFKA-1690
https://issues.apache.org/jira/browse/KAFKA-1690


Repository: kafka


Description (updated)
---

KAFKA-1690. new java producer needs ssl support as a client.


KAFKA-1690. new java producer needs ssl support as a client.


KAFKA-1690. new java producer needs ssl support as a client.


KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.


KAFKA-1690. new java producer needs ssl support as a client. Added 
PrincipalBuilder.


KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews.


KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews.


KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews.


KAFKA-1690. new java producer needs ssl support as a client. Fixed minor issues 
with the patch.


KAFKA-1690. new java producer needs ssl support as a client. Fixed minor issues 
with the patch.


KAFKA-1690. new java producer needs ssl support as a client.


KAFKA-1690. new java producer needs ssl support as a client.


Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1


KAFKA-1690. Broker side ssl changes.


KAFKA-1684. SSL for socketServer.


KAFKA-1690. Added SSLProducerSendTest and fixes to get right port for SSL.


Merge branch 'trunk' into KAFKA-1690-V1


KAFKA-1690. Post merge fixes.


KAFKA-1690. Added SSLProducerSendTest.


KAFKA-1690. Minor fixes based on patch review comments.


Merge commit


KAFKA-1690. Added SSL Consumer Test.


KAFKA-1690. SSL Support.


KAFKA-1690. Addressing reviews.


Merge branch 'trunk' into KAFKA-1690-V1


Merge branch 'trunk' into KAFKA-1690-V1


KAFKA-1690. Addressing reviews. Removed interestOps from SSLTransportLayer.


KAFKA-1690. Addressing reviews.


KAFKA-1690. added staged receives to selector.


KAFKA-1690. Addressing reviews.


Merge branch 'trunk' into KAFKA-1690-V1


KAFKA-1690. Addressing reviews.


KAFKA-1690. Add SSL support to broker, producer and consumer.


Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1


KAFKA-1690. Add SSL support to Kafka Broker, Producer & Client.


KAFKA-1690. Add SSL support for Kafka Brokers, Producers and Consumers.


Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1


Diffs (updated)
-

  build.gradle 983587fd0b7604c3a26fcbb6a1d63e5e470d23fe 
  checkstyle/import-control.xml e3f4f84c6becfd9087627f018690e1e2fc2b3bba 
  clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
  clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
2c421f42ed3fc5d61cf9c87a7eaa7bb23e26f63b 
  clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
0e51d7bd461d253f4396a5b6ca7cd391658807fa 
  clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
d35b421a515074d964c7fccb73d260b847ea5f00 
  clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
be46b6c213ad8c6c09ad22886a5f36175ab0e13a 
  clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
03b8dd23df63a8d8a117f02eabcce4a2d48c44f7 
  clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
aa264202f2724907924985a5ecbe74afc4c6c04b 
  clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 
6c317480a181678747bfb6b77e315b08668653c5 
  clients/src/main/java/org/apache/kafka/common/config/SSLConfigs.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/Authenticator.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/ByteBufferSend.java 
df0e6d5105ca97b7e1cb4d334ffb7b443506bd0b 
  clients/src/main/java/org/apache/kafka/common/network/ChannelBuilder.java 
PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/KafkaChannel.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/NetworkReceive.java 
3ca0098b8ec8cfdf81158465b2d40afc47eb6f80 
  
clients/src/main/java/org/apache/kafka/common/network/PlaintextChannelBuilder.java
 PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/network/PlaintextTransportLayer.java
 PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/Selectable.java 
618a0fa53848ae6befea7eba39c2f3285b734494 
  clients/src/main/java/org/apache/kafka/common/network/Selector.java 
ce20111ac434eb8c74585e9c63757bb9d60a832f 
  clients/src/main/java/org/

[jira] [Commented] (KAFKA-1690) new java producer needs ssl support as a client

2015-08-17 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699660#comment-14699660
 ] 

Sriharsha Chintalapani commented on KAFKA-1690:
---

Updated reviewboard https://reviews.apache.org/r/33620/diff/
 against branch origin/trunk

> new java producer needs ssl support as a client
> ---
>
> Key: KAFKA-1690
> URL: https://issues.apache.org/jira/browse/KAFKA-1690
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Sriharsha Chintalapani
> Fix For: 0.8.3
>
> Attachments: KAFKA-1690.patch, KAFKA-1690.patch, 
> KAFKA-1690_2015-05-10_23:20:30.patch, KAFKA-1690_2015-05-10_23:31:42.patch, 
> KAFKA-1690_2015-05-11_16:09:36.patch, KAFKA-1690_2015-05-12_16:20:08.patch, 
> KAFKA-1690_2015-05-15_07:18:21.patch, KAFKA-1690_2015-05-20_14:54:35.patch, 
> KAFKA-1690_2015-05-21_10:37:08.patch, KAFKA-1690_2015-06-03_18:52:29.patch, 
> KAFKA-1690_2015-06-23_13:18:20.patch, KAFKA-1690_2015-07-20_06:10:42.patch, 
> KAFKA-1690_2015-07-20_11:59:57.patch, KAFKA-1690_2015-07-25_12:10:55.patch, 
> KAFKA-1690_2015-08-16_20:41:02.patch, KAFKA-1690_2015-08-17_08:12:50.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1690) new java producer needs ssl support as a client

2015-08-17 Thread Sriharsha Chintalapani (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriharsha Chintalapani updated KAFKA-1690:
--
Attachment: KAFKA-1690_2015-08-17_08:12:50.patch

> new java producer needs ssl support as a client
> ---
>
> Key: KAFKA-1690
> URL: https://issues.apache.org/jira/browse/KAFKA-1690
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Sriharsha Chintalapani
> Fix For: 0.8.3
>
> Attachments: KAFKA-1690.patch, KAFKA-1690.patch, 
> KAFKA-1690_2015-05-10_23:20:30.patch, KAFKA-1690_2015-05-10_23:31:42.patch, 
> KAFKA-1690_2015-05-11_16:09:36.patch, KAFKA-1690_2015-05-12_16:20:08.patch, 
> KAFKA-1690_2015-05-15_07:18:21.patch, KAFKA-1690_2015-05-20_14:54:35.patch, 
> KAFKA-1690_2015-05-21_10:37:08.patch, KAFKA-1690_2015-06-03_18:52:29.patch, 
> KAFKA-1690_2015-06-23_13:18:20.patch, KAFKA-1690_2015-07-20_06:10:42.patch, 
> KAFKA-1690_2015-07-20_11:59:57.patch, KAFKA-1690_2015-07-25_12:10:55.patch, 
> KAFKA-1690_2015-08-16_20:41:02.patch, KAFKA-1690_2015-08-17_08:12:50.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-17 Thread Grant Henke
+dev

Adding dev list back in. Somehow it got dropped.


On Mon, Aug 17, 2015 at 10:16 AM, Grant Henke  wrote:

> Below is a list of candidate bug fix jiras marked fixed for 0.8.3. I don't
> suspect all of these will (or should) make it into the release but this
> should be a relatively complete list to work from:
>
>- KAFKA-2114 : Unable
>to change min.insync.replicas default
>- KAFKA-1702 :
>Messages silently Lost by producer
>- KAFKA-2012 :
>Broker should automatically handle corrupt index files
>- KAFKA-2406 : ISR
>propagation should be throttled to avoid overwhelming controller.
>- KAFKA-2336 :
>Changing offsets.topic.num.partitions after the offset topic is created
>breaks consumer group partition assignment
>- KAFKA-2337 : Verify
>that metric names will not collide when creating new topics
>- KAFKA-2393 :
>Correctly Handle InvalidTopicException in KafkaApis.getTopicMetadata()
>- KAFKA-2189 : Snappy
>compression of message batches less efficient in 0.8.2.1
>- KAFKA-2308 : New
>producer + Snappy face un-compression errors after broker restart
>- KAFKA-2042 : New
>producer metadata update always get all topics.
>- KAFKA-1367 : Broker
>topic metadata not kept in sync with ZooKeeper
>- KAFKA-972 : 
> MetadataRequest
>returns stale list of brokers
>- KAFKA-1867 : liveBroker
>list not updated on a cluster with no topics
>- KAFKA-1650 : Mirror
>Maker could lose data on unclean shutdown.
>- KAFKA-2009 : Fix
>UncheckedOffset.removeOffset synchronization and trace logging issue in
>mirror maker
>- KAFKA-2407 : Only
>create a log directory when it will be used
>- KAFKA-2327 :
>broker doesn't start if config defines advertised.host but not
>advertised.port
>- KAFKA-1788: producer record can stay in RecordAccumulator forever if
>leader is not available
>- KAFKA-2234 :
>Partition reassignment of a nonexistent topic prevents future reassignments
>- KAFKA-2096 :
>Enable keepalive socket option for broker to prevent socket leak
>- KAFKA-1057 : Trim
>whitespaces from user specified configs
>- KAFKA-1641 : Log
>cleaner exits if last cleaned offset is lower than earliest offset
>- KAFKA-1648 : Round
>robin consumer balance throws an NPE when there are no topics
>- KAFKA-1724 :
>Errors after reboot in single node setup
>- KAFKA-1758 :
>corrupt recovery file prevents startup
>- KAFKA-1866 :
>LogStartOffset gauge throws exceptions after log.delete()
>- KAFKA-1883 : 
> NullPointerException
>in RequestSendThread
>- KAFKA-1896 :
>Record size function of record in mirror maker hit NPE when the message
>value is null.
>- KAFKA-2101 :
>Metric metadata-age is reset on a failed update
>- KAFKA-2112 : make
>overflowWheel volatile
>- KAFKA-2117 :
>OffsetManager uses incorrect field for metadata
>- KAFKA-2164 :
>ReplicaFetcherThread: suspicious log message on reset offset
>- KAFKA-1668 :
>TopicCommand doesn't warn if --topic argument doesn't match any topics
>- KAFKA-2198 :
>kafka-topics.sh exits with 0 status on failures
>- KAFKA-2235 :
>LogCleaner offset map overflow
>- K

[jira] [Updated] (KAFKA-1690) Add SSL support to Kafka Broker, Producer and Consumer

2015-08-17 Thread Sriharsha Chintalapani (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriharsha Chintalapani updated KAFKA-1690:
--
Summary: Add SSL support to Kafka Broker, Producer and Consumer  (was: new 
java producer needs ssl support as a client)

> Add SSL support to Kafka Broker, Producer and Consumer
> --
>
> Key: KAFKA-1690
> URL: https://issues.apache.org/jira/browse/KAFKA-1690
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Sriharsha Chintalapani
> Fix For: 0.8.3
>
> Attachments: KAFKA-1690.patch, KAFKA-1690.patch, 
> KAFKA-1690_2015-05-10_23:20:30.patch, KAFKA-1690_2015-05-10_23:31:42.patch, 
> KAFKA-1690_2015-05-11_16:09:36.patch, KAFKA-1690_2015-05-12_16:20:08.patch, 
> KAFKA-1690_2015-05-15_07:18:21.patch, KAFKA-1690_2015-05-20_14:54:35.patch, 
> KAFKA-1690_2015-05-21_10:37:08.patch, KAFKA-1690_2015-06-03_18:52:29.patch, 
> KAFKA-1690_2015-06-23_13:18:20.patch, KAFKA-1690_2015-07-20_06:10:42.patch, 
> KAFKA-1690_2015-07-20_11:59:57.patch, KAFKA-1690_2015-07-25_12:10:55.patch, 
> KAFKA-1690_2015-08-16_20:41:02.patch, KAFKA-1690_2015-08-17_08:12:50.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-2435: Fair consumer partition assignment...

2015-08-17 Thread noslowerdna
GitHub user noslowerdna opened a pull request:

https://github.com/apache/kafka/pull/146

KAFKA-2435: Fair consumer partition assignment strategy



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/noslowerdna/kafka KAFKA-2435

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/146.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #146


commit 080e265f742e1e23f84b86c7d99c45dce807d7c8
Author: Andrew Olson 
Date:   2015-08-17T15:03:13Z

KAFKA-2435: Fair consumer partition assignment strategy




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request: KAFKA-2434: Remove identical topic subscriptio...

2015-08-17 Thread noslowerdna
GitHub user noslowerdna opened a pull request:

https://github.com/apache/kafka/pull/145

KAFKA-2434: Remove identical topic subscription constraint for roundrobin 
strategy in old consumer API



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/noslowerdna/kafka KAFKA-2434

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/145.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #145


commit d5b93efd9eef20f3f5836d779e6140d5e4a6fc65
Author: Andrew Olson 
Date:   2015-08-17T15:01:44Z

KAFKA-2434: Remove identical topic subscription constraint for roundrobin 
strategy in old consumer API




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2434) remove roundrobin identical topic constraint in consumer coordinator (old API)

2015-08-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699668#comment-14699668
 ] 

ASF GitHub Bot commented on KAFKA-2434:
---

GitHub user noslowerdna opened a pull request:

https://github.com/apache/kafka/pull/145

KAFKA-2434: Remove identical topic subscription constraint for roundrobin 
strategy in old consumer API



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/noslowerdna/kafka KAFKA-2434

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/145.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #145


commit d5b93efd9eef20f3f5836d779e6140d5e4a6fc65
Author: Andrew Olson 
Date:   2015-08-17T15:01:44Z

KAFKA-2434: Remove identical topic subscription constraint for roundrobin 
strategy in old consumer API




> remove roundrobin identical topic constraint in consumer coordinator (old API)
> --
>
> Key: KAFKA-2434
> URL: https://issues.apache.org/jira/browse/KAFKA-2434
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Andrew Olson
>Assignee: Andrew Olson
> Attachments: KAFKA-2434.patch
>
>
> The roundrobin strategy algorithm improvement made in KAFKA-2196 should be 
> applied to the original high-level consumer API as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2435) More optimally balanced partition assignment strategy

2015-08-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699669#comment-14699669
 ] 

ASF GitHub Bot commented on KAFKA-2435:
---

GitHub user noslowerdna opened a pull request:

https://github.com/apache/kafka/pull/146

KAFKA-2435: Fair consumer partition assignment strategy



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/noslowerdna/kafka KAFKA-2435

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/146.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #146


commit 080e265f742e1e23f84b86c7d99c45dce807d7c8
Author: Andrew Olson 
Date:   2015-08-17T15:03:13Z

KAFKA-2435: Fair consumer partition assignment strategy




> More optimally balanced partition assignment strategy
> -
>
> Key: KAFKA-2435
> URL: https://issues.apache.org/jira/browse/KAFKA-2435
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Andrew Olson
>Assignee: Andrew Olson
> Attachments: KAFKA-2435.patch
>
>
> While the roundrobin partition assignment strategy is an improvement over the 
> range strategy, when the consumer topic subscriptions are not identical 
> (previously disallowed but will be possible as of KAFKA-2172) it can produce 
> heavily skewed assignments. As suggested 
> [here|https://issues.apache.org/jira/browse/KAFKA-2172?focusedCommentId=14530767&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14530767]
>  it would be nice to have a strategy that attempts to assign an equal number 
> of partitions to each consumer in a group, regardless of how similar their 
> individual topic subscriptions are. We can accomplish this by tracking the 
> number of partitions assigned to each consumer, and having the partition 
> assignment loop assign each partition to a consumer interested in that topic 
> with the least number of partitions assigned. 
> Additionally, we can optimize the distribution fairness by adjusting the 
> partition assignment order:
> * Topics with fewer consumers are assigned first.
> * In the event of a tie for least consumers, the topic with more partitions 
> is assigned first.
> The general idea behind these two rules is to keep the most flexible 
> assignment choices available as long as possible by starting with the most 
> constrained partitions/consumers.
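
The description above is concrete enough to sketch. A minimal Java illustration of the proposed strategy (the `FairAssignor` name, the input maps, and the `topic-partition` string encoding are mine for illustration, not taken from the attached patch):

```java
import java.util.*;

/** Illustrative sketch of the KAFKA-2435 "fair" assignment idea, not the actual patch. */
public class FairAssignor {
    /**
     * @param topicConsumers  topic -> consumers subscribed to that topic
     * @param topicPartitions topic -> number of partitions
     * @return consumer -> assigned partitions, encoded as "topic-partition" strings
     */
    public static Map<String, List<String>> assign(Map<String, List<String>> topicConsumers,
                                                   Map<String, Integer> topicPartitions) {
        Map<String, List<String>> assignment = new HashMap<>();
        for (List<String> cs : topicConsumers.values())
            for (String c : cs)
                assignment.putIfAbsent(c, new ArrayList<>());

        // Ordering rules: topics with fewer consumers first; ties broken by more partitions first.
        List<String> topics = new ArrayList<>(topicConsumers.keySet());
        topics.sort(Comparator.comparingInt((String t) -> topicConsumers.get(t).size())
                              .thenComparing(t -> -topicPartitions.get(t)));

        for (String topic : topics) {
            for (int p = 0; p < topicPartitions.get(topic); p++) {
                // Track per-consumer load: give each partition to the interested
                // consumer that currently owns the fewest partitions.
                String least = topicConsumers.get(topic).stream()
                        .min(Comparator.comparingInt(c -> assignment.get(c).size()))
                        .get();
                assignment.get(least).add(topic + "-" + p);
            }
        }
        return assignment;
    }
}
```

The sort up front implements the two ordering rules from the description, and the `min` inside the loop implements the least-loaded-consumer rule.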



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33620: Patch for KAFKA-1690

2015-08-17 Thread Ismael Juma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/#review95589
---

Ship it!


Thanks for the various fixes. There is still a small issue in the javadoc for 
`Selector.poll` (due to the merge from trunk). I haven't spotted any other 
issues and hopefully Jun will also be happy and this can be merged very soon. :)

I ran the test suite 3 times locally via `./gradlew test` and all the tests passed.


clients/src/main/java/org/apache/kafka/common/network/Selector.java (line 238)


The diff that exists on these two lines should not be present (it's 
basically reverting the change from another commit due to a mismerge). Note the 
incorrect link to `poll` and that the word `any` was removed.


- Ismael Juma


On Aug. 17, 2015, 3:13 p.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33620/
> ---
> 
> (Updated Aug. 17, 2015, 3:13 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1690
> https://issues.apache.org/jira/browse/KAFKA-1690
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Added 
> PrincipalBuilder.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Fixed minor 
> issues with the patch.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Fixed minor 
> issues with the patch.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Broker side ssl changes.
> 
> 
> KAFKA-1684. SSL for socketServer.
> 
> 
> KAFKA-1690. Added SSLProducerSendTest and fixes to get right port for SSL.
> 
> 
> Merge branch 'trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Post merge fixes.
> 
> 
> KAFKA-1690. Added SSLProducerSendTest.
> 
> 
> KAFKA-1690. Minor fixes based on patch review comments.
> 
> 
> Merge commit
> 
> 
> KAFKA-1690. Added SSL Consumer Test.
> 
> 
> KAFKA-1690. SSL Support.
> 
> 
> KAFKA-1690. Addressing reviews.
> 
> 
> Merge branch 'trunk' into KAFKA-1690-V1
> 
> 
> Merge branch 'trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Addressing reviews. Removed interestOps from SSLTransportLayer.
> 
> 
> KAFKA-1690. Addressing reviews.
> 
> 
> KAFKA-1690. added staged receives to selector.
> 
> 
> KAFKA-1690. Addressing reviews.
> 
> 
> Merge branch 'trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Addressing reviews.
> 
> 
> KAFKA-1690. Add SSL support to broker, producer and consumer.
> 
> 
> Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Add SSL support to Kafka Broker, Producer & Client.
> 
> 
> KAFKA-1690. Add SSL support for Kafka Brokers, Producers and Consumers.
> 
> 
> Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1
> 
> 
> Diffs
> -
> 
>   build.gradle 983587fd0b7604c3a26fcbb6a1d63e5e470d23fe 
>   checkstyle/import-control.xml e3f4f84c6becfd9087627f018690e1e2fc2b3bba 
>   clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
> 0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> 2c421f42ed3fc5d61cf9c87a7eaa7bb23e26f63b 
>   clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
> 0e51d7bd461d253f4396a5b6ca7cd391658807fa 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> d35b421a515074d964c7fccb73d260b847ea5f00 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> be46b6c213ad8c6c09ad22886a5f36175ab0e13a 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 03b8dd23df63a8d8a117f02eabcce4a2d48c44f7 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> aa264202f2724907924985a5ecbe74afc4c6c04b 
>   clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 
> 6c317480a181678747bfb6b77e315b08668653c5 
>   clients/src/main/java/org/apache/kafka/com

[jira] [Assigned] (KAFKA-2431) Test SSL/TLS impact on performance

2015-08-17 Thread Ben Stopford (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Stopford reassigned KAFKA-2431:
---

Assignee: Ben Stopford

> Test SSL/TLS impact on performance
> --
>
> Key: KAFKA-2431
> URL: https://issues.apache.org/jira/browse/KAFKA-2431
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
>Assignee: Ben Stopford
> Fix For: 0.8.3
>
>
> Test new Producer and new Consumer performance with and without SSL/TLS once 
> the SSL/TLS branch is integrated.
> The ideal scenario is that SSL/TLS would not have an impact if disabled. When 
> enabled, there will be some overhead (encryption and the inability to use 
> `SendFile`) and it will be good to quantify it. The encryption overhead is 
> reduced if recent JDKs are used with CPUs that support AES-specific 
> instructions (https://en.wikipedia.org/wiki/AES_instruction_set).
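
For reference, on the client side the feature under test is driven purely by configuration; the comparison runs would differ in roughly these properties (names follow the SSLConfigs/CommonClientConfigs classes in the patch under review, so treat them as provisional until the branch lands):

```properties
# TLS-enabled run; baseline run would use security.protocol=PLAINTEXT
security.protocol=SSL
ssl.truststore.location=/var/private/ssl/client.truststore.jks
ssl.truststore.password=changeit
```

Running the same producer/consumer perf workload under both values of `security.protocol` isolates the TLS overhead described above.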



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-17 Thread Stevo Slavić
Instead of cherry-picking, why not just make 0.8.2.2 off of current trunk,
with new consumer API appropriately annotated/documented as unstable?

On Mon, Aug 17, 2015, 17:17 Grant Henke  wrote:

> +dev
>
> Adding dev list back in. Somehow it got dropped.
>
>

Re: [copycat] How to upgrade connectors

2015-08-17 Thread Jay Kreps
Yeah I meant this purely as an approach for the connectors themselves to
implement not that we would automatically suppress configs in the
framework.

-Jay

On Fri, Aug 14, 2015 at 7:48 PM, Gwen Shapira  wrote:

> The "ignore new configs" plan is good, IMO. I just don't know if it's
> feasible in current Copycat:
> I'm not sure CC can ignore configuration on the connector's behalf. Also,
> this is something we will want to log very clearly and I'm not sure this is
> doable outside the connector.
>
> Regarding complex implementation for connector developers: all version 1.0
> connectors will have the following:
>
> int getVersion() {
> return 1;
> }
>
> Configuration upgradeConfig(Configuration config) {
> return config;
> }
>
> Not that bad :) We can even have a base class to avoid boilerplate.
>
> The complexity shows up when upgradeConfig has to actually upgrade configs,
> but that's our whole point, IMO.
>
> Gwen
>
> On Fri, Aug 14, 2015 at 10:48 AM, Jay Kreps  wrote:
>
> > I think this is a great area to think about.
> >
> > What about the simpler informal scheme of just having CC ignore config it
> > doesn't know about and instructing connector developers to maintain
> > compatibility when renaming params (i.e. support both old and new) as we
> do
> > for Kafka config.
> >
> > The steps would then be:
> > 1. CC connector developer adds new config support, possibly deprecating
> > older configs but not removing them.
> > 2. CC user then upgrades their config, then the connector implementation
> > 3. CC user can always roll back if this doesn't work in which case the
> new
> > configs added are ignored
> >
> > If you want to totally refactor your configs in a way where compatibility
> > is just not doable then you should rename the connector.
> >
> > This is one where a system test that tests last release versus this
> release
> > would probably do a lot to help ensure this stuff actually worked as
> > expected.
> >
> > Not sure if this is better or worse than the hard versioning scheme.
> >
> > -Jay
> >
> > On Fri, Aug 14, 2015 at 12:52 AM, Ewen Cheslack-Postava <
> e...@confluent.io
> > >
> > wrote:
> >
> > > Yes, I think that makes sense. As I see it, the tradeoffs are:
> > >
> > > 1. Complexity - adding these APIs increases the number of things
> > connector
> > > developers need to implement just to get started. In a lot of cases,
> the
> > > first version of a connector might literally only have some connection
> > > string as a setting and that setting will never be removed/changed. So
> it
> > > would be, in some sense, unnecessary added complexity.
> > > 2. Relies on correct modification of version number by connector
> > developer
> > > - of course any compatibility relies on the connector developer noting
> > > incompatibilities and handling them properly, but baking versioning
> into
> > > the APIs suggests we're guaranteeing compatibility. However, we know
> from
> > > experience that this is a lot harder than just adding version numbers.
> > > Seemingly small changes may change the semantics but not the format of
> > the
> > > config such that there is an unexpected incompatibility.
> > > 3. Force developers to think about compatibility up front - I think
> this
> > is
> > > the main argument for this approach. By including those APIs, it's a
> big
> > > red flag to connector developers that they need to think about
> different
> > > versions of their config and how they will handle any changes.
> > >
> > > I'm ok with including the versioning or not. I don't feel too strongly
> > > because I think including the versions is really a band-aid, not a real
> > > solution, to the compatibility problem. (I don't think there is a
> > solution
> > > for it; it's just a really hard problem...). I am wary of adding more
> > > methods to the APIs because keeping the connector APIs as simple as
> > > possible is really important for getting non-Kafka devs to develop
> > > connectors.
> > >
> > > By the way, this probably extends to the Tasks as well -- they have the
> > > same basic configuration compatibility problems as Connectors do. So I
> > > think adopting this approach implies that we'd do the same to the Task
> > > interfaces.
> > >
> > >
> > > On Thu, Aug 13, 2015 at 10:16 PM, Gwen Shapira 
> > wrote:
> > >
> > > > Hi Team Kafka,
> > > >
> > > > This may be a slightly premature discussion, but forgetting about
> > > upgrades
> > > > is a newbie mistake I'd like us to avoid :)
> > > >
> > > > So, we have connector-plugins, and users use them to create
> > > > connector-instances. This requires some configuration - database
> > > connection
> > > > string, HDFS namenode, etc.
> > > >
> > > > Now suppose we have a SystemX connector-plugin, and it takes just
> > > > "connection string" as argument. And people happily use it. Now
> imagine
> > > > that the guy who wrote the plugin wants to release
> SystemX-plugin-2.0.
> > > >
> > > > Ideally, we'd want users to be able to drop 2.0 jar instead
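
Gwen's `getVersion`/`upgradeConfig` fragment above, fleshed out as a compilable sketch. Everything here (`VersionedConnector`, `SystemXConnector`, the `connection.string` to `connection.url` rename) is hypothetical illustration of the proposal, not the actual Copycat API:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical base class for the versioned-config proposal in this thread;
 * names and signatures are illustrative, not the real Copycat interfaces.
 */
abstract class VersionedConnector {
    /** Config version this connector expects; bumped on incompatible changes. */
    public int getVersion() {
        return 1;
    }

    /** Default behavior: a version-1 connector passes configs through unchanged. */
    public Map<String, String> upgradeConfig(Map<String, String> config) {
        return config;
    }
}

/** Example 2.0 connector that migrates a renamed key during upgrade. */
class SystemXConnector extends VersionedConnector {
    @Override
    public int getVersion() {
        return 2;
    }

    @Override
    public Map<String, String> upgradeConfig(Map<String, String> config) {
        Map<String, String> upgraded = new HashMap<>(config);
        // Suppose v1 used "connection.string" and v2 renames it to "connection.url".
        if (upgraded.containsKey("connection.string"))
            upgraded.put("connection.url", upgraded.remove("connection.string"));
        return upgraded;
    }
}
```

The boilerplate cost for simple connectors is just the two defaults; only connectors that actually change their config surface need a non-trivial `upgradeConfig`.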

Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-17 Thread Gwen Shapira
The network refactoring portion was not tested well enough yet for me to
feel comfortable pushing it into a bugfix release. The new purgatory and
MirrorMaker changes are also pretty big.

The whole goal of bugfix releases is to make sure it is more stable than
the previous releases.

On Mon, Aug 17, 2015 at 8:54 AM, Stevo Slavić  wrote:

> Instead of cherry-picking, why not just make 0.8.2.2 off of current trunk,
> with new consumer API appropriately annotated/documented as unstable?
>
> On Mon, Aug 17, 2015, 17:17 Grant Henke  wrote:
>
> > +dev
> >
> > Adding dev list back in. Somehow it got dropped.
> >
> >
Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-17 Thread Gwen Shapira
but +1 for 0.8.2 patch that marks the new consumer API as unstable (or
unimplemented ;)


On Mon, Aug 17, 2015 at 9:12 AM, Gwen Shapira  wrote:

> The network refactoring portion was not tested well enough yet for me to
> feel comfortable pushing it into a bugfix release. The new purgatory and
> MirrorMaker changes are also pretty big.
>
> The whole goal of bugfix releases is to make sure it is more stable than
> the previous releases.
>
> On Mon, Aug 17, 2015 at 8:54 AM, Stevo Slavić  wrote:
>
>> Instead of cherry-picking, why not just make 0.8.2.2 off of current trunk,
>> with new consumer API appropriately annotated/documented as unstable?
>>
>> On Mon, Aug 17, 2015, 17:17 Grant Henke  wrote:
>>
>> > +dev
>> >
>> > Adding dev list back in. Somehow it got dropped.
>> >
>> >
>> > On Mon, Aug 17, 2015 at 10:16 AM, Grant Henke 
>> wrote:
>> >
>> > > Below is a list of candidate bug fix jiras marked fixed for 0.8.3. I
>> > don't
>> > > suspect all of these will (or should) make it into the release but
>> this
>> > > should be a relatively complete list to work from:
>> > >
>> > >- KAFKA-2114 :
>> > Unable
>> > >to change min.insync.replicas default
>> > >- KAFKA-1702 :
>> > >Messages silently Lost by producer
>> > >- KAFKA-2012 :
>> > >Broker should automatically handle corrupt index files
>> > >- KAFKA-2406 :
>> ISR
>> > >propagation should be throttled to avoid overwhelming controller.
>> > >- KAFKA-2336 :
>> > >Changing offsets.topic.num.partitions after the offset topic is
>> > created
>> > >breaks consumer group partition assignment
>> > >- KAFKA-2337 :
>> > Verify
>> > >that metric names will not collide when creating new topics
>> > >- KAFKA-2393 :
>> > >Correctly Handle InvalidTopicException in
>> KafkaApis.getTopicMetadata()
>> > >- KAFKA-2189 :
>> > Snappy
>> > >compression of message batches less efficient in 0.8.2.1
>> > >- KAFKA-2308 :
>> New
>> > >producer + Snappy face un-compression errors after broker restart
>> > >- KAFKA-2042 :
>> New
>> > >producer metadata update always get all topics.
>> > >- KAFKA-1367 :
>> > Broker
>> > >topic metadata not kept in sync with ZooKeeper
>> > >- KAFKA-972 :
>> > MetadataRequest
>> > >returns stale list of brokers
>> > >- KAFKA-1867 :
>> > liveBroker
>> > >list not updated on a cluster with no topics
>> > >- KAFKA-1650 :
>> > Mirror
>> > >Maker could lose data on unclean shutdown.
>> > >- KAFKA-2009 :
>> Fix
>> > >UncheckedOffset.removeOffset synchronization and trace logging
>> issue
>> > in
>> > >mirror maker
>> > >- KAFKA-2407 :
>> Only
>> > >create a log directory when it will be used
>> > >- KAFKA-2327 :
>> > >broker doesn't start if config defines advertised.host but not
>> > >advertised.port
>> > >- KAFKA-1788: producer record can stay in RecordAccumulator
>> forever if
>> > >leader is not available
>> > >- KAFKA-2234 :
>> > >Partition reassignment of a nonexistent topic prevents future
>> > reassignments
>> > >- KAFKA-2096 :
>> > >Enable keepalive socket option for broker to prevent socket leak
>> > >- KAFKA-1057 :
>> Trim
>> > >whitespaces from user specified configs
>> > >- KAFKA-1641 :
>> Log
>> > >cleaner exits if last cleaned offset is lower than earliest offset
>> > >- KAFKA-1648 :
>> > Round
>> > >robin consumer balance throws an NPE when there are no topics
>> > >- KAFKA-1724 :
>> > >Errors after reboot in single node setup
>> > >- KAFKA-1758 :
>> > >corrupt recovery file prevents startup
>> > >- KAFKA-1866 :
>> > >LogStartOffset gauge throws exceptions after log.delete()
>> > >- KAFKA-1883 

[jira] [Commented] (KAFKA-2092) New partitioning for better load balancing

2015-08-17 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699763#comment-14699763
 ] 

Jason Gustafson commented on KAFKA-2092:


[~gwenshap] [~guozhang] Can either of you take a look at this patch? Since most 
people on the mailing list seemed to recognize the value and didn't have any 
strong objections to including it in Kafka, I suggest we commit it.

> New partitioning for better load balancing
> --
>
> Key: KAFKA-2092
> URL: https://issues.apache.org/jira/browse/KAFKA-2092
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Reporter: Gianmarco De Francisci Morales
>Assignee: Jun Rao
> Attachments: KAFKA-2092-v1.patch, KAFKA-2092-v2.patch, 
> KAFKA-2092-v3.patch
>
>
> We have recently studied the problem of load balancing in distributed stream 
> processing systems such as Samza [1].
> In particular, we focused on what happens when the key distribution of the 
> stream is skewed when using key grouping.
> We developed a new stream partitioning scheme (which we call Partial Key 
> Grouping). It achieves better load balancing than hashing while being more 
> scalable than round robin in terms of memory.
> In the paper we show a number of mining algorithms that are easy to implement 
> with partial key grouping, and whose performance can benefit from it. We 
> think that it might also be useful for a larger class of algorithms.
> PKG has already been integrated in Storm [2], and I would like to be able to 
> use it in Samza as well. As far as I understand, Kafka producers are the ones 
> that decide how to partition the stream (or Kafka topic).
> I do not have experience with Kafka, however partial key grouping is very 
> easy to implement: it requires just a few lines of code in Java when 
> implemented as a custom grouping in Storm [3].
> I believe it should be very easy to integrate.
> For all these reasons, I believe it will be a nice addition to Kafka/Samza. 
> If the community thinks it's a good idea, I will be happy to offer support in 
> the porting.
> References:
> [1] 
> https://melmeric.files.wordpress.com/2014/11/the-power-of-both-choices-practical-load-balancing-for-distributed-stream-processing-engines.pdf
> [2] https://issues.apache.org/jira/browse/STORM-632
> [3] https://github.com/gdfm/partial-key-grouping



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-17 Thread Flavio Junqueira
It is pretty typical that Apache projects have a release manager for each 
release:

http://www.apache.org/dev/release-publishing.html 


It doesn't have to be the same person every time, though, not even for the same 
branch.

-Flavio


> On 17 Aug 2015, at 05:01, Gwen Shapira  wrote:
> 
> BTW. I think it will be great for Apache Kafka to have a 0.8.2 "release
> manager" who's role is to cherrypick low-risk bug-fixes into the 0.8.2
> branch and once enough bug fixes happened (or if sufficiently critical
> fixes happened) to roll out a new maintenance release (with every 3 months
> as a reasonable bugfix release target).



Re: [copycat] How to upgrade connectors

2015-08-17 Thread Gwen Shapira
I see our "make life easy" priorities as follows: users > connector
developers > core copycat developers.
In this case I think that best user experience will be consistent behavior
for all copycat connectors, so they'll know what to expect during upgrades.

I'm not sure how much we can enforce from the framework side, but I suspect
we can do better than "advise connector developers to ignore unknown
configs"... even if that means making the framework or even connector
development slightly more difficult.
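The explicit-versioning approach under discussion could look roughly like the following sketch. All names here (`get_version`, `upgrade_config`, the JDBC connector, the `url` key rename) are hypothetical illustrations, not Copycat's actual API:

```python
class Connector:
    """Base class: version-1 connectors inherit no-op upgrade behavior,
    matching the boilerplate-free default described in the thread."""

    def get_version(self):
        return 1

    def upgrade_config(self, config, from_version):
        return config


class JdbcSourceConnector(Connector):
    """Hypothetical v2 connector that renamed the 'url' config key
    to 'connection.url'."""

    def get_version(self):
        return 2

    def upgrade_config(self, config, from_version):
        if from_version < 2 and "url" in config:
            config = dict(config)
            config["connection.url"] = config.pop("url")
        return config


def framework_load(connector, stored_config, stored_version):
    """Sketch of the framework side: upgrade a stored config to the
    connector's current version before handing it over."""
    if stored_version < connector.get_version():
        stored_config = connector.upgrade_config(stored_config, stored_version)
    return stored_config
```

The framework-side hook is where consistent behavior (and clear logging of upgrades) could be enforced for all connectors, rather than leaving it to each connector's discretion.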


On Mon, Aug 17, 2015 at 9:08 AM, Jay Kreps  wrote:

> Yeah I meant this purely as an approach for the connectors themselves to
> implement not that we would automatically suppress configs in the
> framework.
>
> -Jay
>
> On Fri, Aug 14, 2015 at 7:48 PM, Gwen Shapira  wrote:
>
> > The "ignore new configs" plan is good, IMO. I just don't know if its
> > feasible in current Copycat:
> > I'm not sure CC can ignore configuration on the connector's behalf. Also,
> > this is something we will want to log very clearly and I'm not sure this
> is
> > doable outside the connector.
> >
> > Regarding complex implementation for connector developers. All version1.0
> > connectors will have the following:
> >
> > int getVersion() {
> > return 1;
> > }
> >
> > Configuration upgradeConfig(Configuration config) {
> > return config;
> > }
> >
> > Not that bad :) We can even have a base class to avoid boilerplate.
> >
> > The complexity shows up when upgradeConfig has to actually upgrade
> configs,
> > but that's our whole point, IMO.
> >
> > Gwen
> >
> > On Fri, Aug 14, 2015 at 10:48 AM, Jay Kreps  wrote:
> >
> > > I think this is a great area to think about.
> > >
> > > What about the simpler informal scheme of just having cc ignore config
> it
> > > doesn't know about and instructing connector developers to maintain
> > > compatibility when renaming params (i.e. support both old and new) as
> we
> > do
> > > for Kafka config.
> > >
> > > The steps would then be:
> > > 1. CC connector developer adds new config support, possibly deprecating
> > > older configs but not removing them.
> > > 2. CC user then upgrades their config, then the connector
> implementation
> > > 3. CC user can always roll back if this doesn't work in which case the
> > new
> > > configs added are ignored
> > >
> > > If you want to totally refactor your configs in a way where
> compatibility
> > > is just not doable then you should rename the connector.
> > >
> > > This is one where a system test that tests last release versus this
> > release
> > > would probably do a lot to help ensure this stuff actually worked as
> > > expected.
> > >
> > > Not sure if this is better or worse than the hard versioning scheme.
> > >
> > > -Jay
> > >
> > > On Fri, Aug 14, 2015 at 12:52 AM, Ewen Cheslack-Postava <
> > e...@confluent.io
> > > >
> > > wrote:
> > >
> > > > Yes, I think that makes sense. As I see it, the tradeoffs are:
> > > >
> > > > 1. Complexity - adding these APIs increases the number of things
> > > connector
> > > > developers need to implement just to get started. In a lot of cases,
> > the
> > > > first version of a connector might literally only have some
> connection
> > > > string as a setting and that setting will never be removed/changed.
> So
> > it
> > > > would be, in some sense, unnecessary added complexity.
> > > > 2. Relies on correct modification of version number by connector
> > > developer
> > > > - of course any compatibility relies on the connector developer
> noting
> > > > incompatibilities and handling them properly, but baking versioning
> > into
> > > > the APIs suggests we're guaranteeing compatibility. However, we know
> > from
> > > > experience that this is a lot harder than just adding version
> numbers.
> > > > Seemingly small changes may change the semantics but not the format
> of
> > > the
> > > > config such that there is an unexpected incompatibility.
> > > > 3. Force developers to think about compatibility up front - I think
> > this
> > > is
> > > > the main argument for this approach. By including those APIs, it's a
> > big
> > > > red flag to connector developers that they need to think about
> > different
> > > > versions of their config and how they will handle any changes.
> > > >
> > > > I'm ok with including the versioning or not. I don't feel too
> strongly
> > > > because I think including the versions is really a band-aid, not a
> real
> > > > solution, to the compatibility problem. (I don't think there is a
> > > solution
> > > > for it; it's just a really hard problem...). I am wary of adding more
> > > > methods to the APIs because keeping the connector APIs as simple as
> > > > possible is really important for getting non-Kafka devs to develop
> > > > connectors.
> > > >
> > > > By the way, this probably extends to the Tasks as well -- they have
> the
> > > > same basic configuration compatibility problems as Connectors do. So
> I
> > > > think adopting this approach implies that we'd do the sam

[jira] [Commented] (KAFKA-1690) Add SSL support to Kafka Broker, Producer and Consumer

2015-08-17 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699773#comment-14699773
 ] 

Sriharsha Chintalapani commented on KAFKA-1690:
---

Instructions on enabling SSL for Kafka are posted here: 
https://cwiki.apache.org/confluence/display/KAFKA/Deploying+SSL+for+Kafka

> Add SSL support to Kafka Broker, Producer and Consumer
> --
>
> Key: KAFKA-1690
> URL: https://issues.apache.org/jira/browse/KAFKA-1690
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Sriharsha Chintalapani
> Fix For: 0.8.3
>
> Attachments: KAFKA-1690.patch, KAFKA-1690.patch, 
> KAFKA-1690_2015-05-10_23:20:30.patch, KAFKA-1690_2015-05-10_23:31:42.patch, 
> KAFKA-1690_2015-05-11_16:09:36.patch, KAFKA-1690_2015-05-12_16:20:08.patch, 
> KAFKA-1690_2015-05-15_07:18:21.patch, KAFKA-1690_2015-05-20_14:54:35.patch, 
> KAFKA-1690_2015-05-21_10:37:08.patch, KAFKA-1690_2015-06-03_18:52:29.patch, 
> KAFKA-1690_2015-06-23_13:18:20.patch, KAFKA-1690_2015-07-20_06:10:42.patch, 
> KAFKA-1690_2015-07-20_11:59:57.patch, KAFKA-1690_2015-07-25_12:10:55.patch, 
> KAFKA-1690_2015-08-16_20:41:02.patch, KAFKA-1690_2015-08-17_08:12:50.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33620: Patch for KAFKA-1690

2015-08-17 Thread Sriharsha Chintalapani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/
---

(Updated Aug. 17, 2015, 4:28 p.m.)


Review request for kafka.


Bugs: KAFKA-1690
https://issues.apache.org/jira/browse/KAFKA-1690


Repository: kafka


Description (updated)
---

KAFKA-1690. new java producer needs ssl support as a client.


KAFKA-1690. new java producer needs ssl support as a client.


KAFKA-1690. new java producer needs ssl support as a client.


KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.


KAFKA-1690. new java producer needs ssl support as a client. Added 
PrincipalBuilder.


KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews.


KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews.


KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews.


KAFKA-1690. new java producer needs ssl support as a client. Fixed minor issues 
with the patch.


KAFKA-1690. new java producer needs ssl support as a client. Fixed minor issues 
with the patch.


KAFKA-1690. new java producer needs ssl support as a client.


KAFKA-1690. new java producer needs ssl support as a client.


Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1


KAFKA-1690. Broker side ssl changes.


KAFKA-1684. SSL for socketServer.


KAFKA-1690. Added SSLProducerSendTest and fixes to get right port for SSL.


Merge branch 'trunk' into KAFKA-1690-V1


KAFKA-1690. Post merge fixes.


KAFKA-1690. Added SSLProducerSendTest.


KAFKA-1690. Minor fixes based on patch review comments.


Merge commit


KAFKA-1690. Added SSL Consumer Test.


KAFKA-1690. SSL Support.


KAFKA-1690. Addressing reviews.


Merge branch 'trunk' into KAFKA-1690-V1


Merge branch 'trunk' into KAFKA-1690-V1


KAFKA-1690. Addressing reviews. Removed interestOps from SSLTransportLayer.


KAFKA-1690. Addressing reviews.


KAFKA-1690. added staged receives to selector.


KAFKA-1690. Addressing reviews.


Merge branch 'trunk' into KAFKA-1690-V1


KAFKA-1690. Addressing reviews.


KAFKA-1690. Add SSL support to broker, producer and consumer.


Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1


KAFKA-1690. Add SSL support to Kafka Broker, Producer & Client.


KAFKA-1690. Add SSL support for Kafka Brokers, Producers and Consumers.


Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1


KAFKA-1690. Add SSL support for Kafka Brokers, Producers and Consumers.


Diffs (updated)
-

  build.gradle 983587fd0b7604c3a26fcbb6a1d63e5e470d23fe 
  checkstyle/import-control.xml e3f4f84c6becfd9087627f018690e1e2fc2b3bba 
  clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
  clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
2c421f42ed3fc5d61cf9c87a7eaa7bb23e26f63b 
  clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
0e51d7bd461d253f4396a5b6ca7cd391658807fa 
  clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
d35b421a515074d964c7fccb73d260b847ea5f00 
  clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
be46b6c213ad8c6c09ad22886a5f36175ab0e13a 
  clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
03b8dd23df63a8d8a117f02eabcce4a2d48c44f7 
  clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
aa264202f2724907924985a5ecbe74afc4c6c04b 
  clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 
6c317480a181678747bfb6b77e315b08668653c5 
  clients/src/main/java/org/apache/kafka/common/config/SSLConfigs.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/Authenticator.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/ByteBufferSend.java 
df0e6d5105ca97b7e1cb4d334ffb7b443506bd0b 
  clients/src/main/java/org/apache/kafka/common/network/ChannelBuilder.java 
PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/KafkaChannel.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/NetworkReceive.java 
3ca0098b8ec8cfdf81158465b2d40afc47eb6f80 
  
clients/src/main/java/org/apache/kafka/common/network/PlaintextChannelBuilder.java
 PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/network/PlaintextTransportLayer.java
 PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/Selectable.java 
618a0fa53848ae6befea7eba39c2f3285b734494 
  clients/src/main/java/org/apache/kafka/common/network/Selector.ja

[jira] [Updated] (KAFKA-1690) Add SSL support to Kafka Broker, Producer and Consumer

2015-08-17 Thread Sriharsha Chintalapani (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriharsha Chintalapani updated KAFKA-1690:
--
Attachment: KAFKA-1690_2015-08-17_09:28:52.patch

> Add SSL support to Kafka Broker, Producer and Consumer
> --
>
> Key: KAFKA-1690
> URL: https://issues.apache.org/jira/browse/KAFKA-1690
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Sriharsha Chintalapani
> Fix For: 0.8.3
>
> Attachments: KAFKA-1690.patch, KAFKA-1690.patch, 
> KAFKA-1690_2015-05-10_23:20:30.patch, KAFKA-1690_2015-05-10_23:31:42.patch, 
> KAFKA-1690_2015-05-11_16:09:36.patch, KAFKA-1690_2015-05-12_16:20:08.patch, 
> KAFKA-1690_2015-05-15_07:18:21.patch, KAFKA-1690_2015-05-20_14:54:35.patch, 
> KAFKA-1690_2015-05-21_10:37:08.patch, KAFKA-1690_2015-06-03_18:52:29.patch, 
> KAFKA-1690_2015-06-23_13:18:20.patch, KAFKA-1690_2015-07-20_06:10:42.patch, 
> KAFKA-1690_2015-07-20_11:59:57.patch, KAFKA-1690_2015-07-25_12:10:55.patch, 
> KAFKA-1690_2015-08-16_20:41:02.patch, KAFKA-1690_2015-08-17_08:12:50.patch, 
> KAFKA-1690_2015-08-17_09:28:52.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1690) Add SSL support to Kafka Broker, Producer and Consumer

2015-08-17 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699783#comment-14699783
 ] 

Sriharsha Chintalapani commented on KAFKA-1690:
---

Updated reviewboard https://reviews.apache.org/r/33620/diff/
 against branch origin/trunk

> Add SSL support to Kafka Broker, Producer and Consumer
> --
>
> Key: KAFKA-1690
> URL: https://issues.apache.org/jira/browse/KAFKA-1690
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Sriharsha Chintalapani
> Fix For: 0.8.3
>
> Attachments: KAFKA-1690.patch, KAFKA-1690.patch, 
> KAFKA-1690_2015-05-10_23:20:30.patch, KAFKA-1690_2015-05-10_23:31:42.patch, 
> KAFKA-1690_2015-05-11_16:09:36.patch, KAFKA-1690_2015-05-12_16:20:08.patch, 
> KAFKA-1690_2015-05-15_07:18:21.patch, KAFKA-1690_2015-05-20_14:54:35.patch, 
> KAFKA-1690_2015-05-21_10:37:08.patch, KAFKA-1690_2015-06-03_18:52:29.patch, 
> KAFKA-1690_2015-06-23_13:18:20.patch, KAFKA-1690_2015-07-20_06:10:42.patch, 
> KAFKA-1690_2015-07-20_11:59:57.patch, KAFKA-1690_2015-07-25_12:10:55.patch, 
> KAFKA-1690_2015-08-16_20:41:02.patch, KAFKA-1690_2015-08-17_08:12:50.patch, 
> KAFKA-1690_2015-08-17_09:28:52.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2092) New partitioning for better load balancing

2015-08-17 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699794#comment-14699794
 ] 

Gwen Shapira commented on KAFKA-2092:
-

I'm -1 on putting this into Apache Kafka.

We currently have a single partitioner in the project, which makes 
decision-making pretty easy when using the Producer. I don't want to add 
complexity without very clear value.

As discussed in the mailing list, the use-case for this partitioner is marginal 
enough that adding this to Apache Kafka will create more confusion than 
usefulness.

IMO, this can live in a GitHub project and get mentioned in our Ecosystem page 
so users with this use-case can find it.




> New partitioning for better load balancing
> --
>
> Key: KAFKA-2092
> URL: https://issues.apache.org/jira/browse/KAFKA-2092
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Reporter: Gianmarco De Francisci Morales
>Assignee: Jun Rao
> Attachments: KAFKA-2092-v1.patch, KAFKA-2092-v2.patch, 
> KAFKA-2092-v3.patch
>
>
> We have recently studied the problem of load balancing in distributed stream 
> processing systems such as Samza [1].
> In particular, we focused on what happens when the key distribution of the 
> stream is skewed when using key grouping.
> We developed a new stream partitioning scheme (which we call Partial Key 
> Grouping). It achieves better load balancing than hashing while being more 
> scalable than round robin in terms of memory.
> In the paper we show a number of mining algorithms that are easy to implement 
> with partial key grouping, and whose performance can benefit from it. We 
> think that it might also be useful for a larger class of algorithms.
> PKG has already been integrated in Storm [2], and I would like to be able to 
> use it in Samza as well. As far as I understand, Kafka producers are the ones 
> that decide how to partition the stream (or Kafka topic).
> I do not have experience with Kafka, however partial key grouping is very 
> easy to implement: it requires just a few lines of code in Java when 
> implemented as a custom grouping in Storm [3].
> I believe it should be very easy to integrate.
> For all these reasons, I believe it will be a nice addition to Kafka/Samza. 
> If the community thinks it's a good idea, I will be happy to offer support in 
> the porting.
> References:
> [1] 
> https://melmeric.files.wordpress.com/2014/11/the-power-of-both-choices-practical-load-balancing-for-distributed-stream-processing-engines.pdf
> [2] https://issues.apache.org/jira/browse/STORM-632
> [3] https://github.com/gdfm/partial-key-grouping



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-17 Thread Ashish Singh
+1 for 0.8.2.2 release with fixes for KAFKA-2189, 2114 and 2308.

On Mon, Aug 17, 2015 at 9:16 AM, Flavio Junqueira  wrote:

> It is pretty typical that Apache projects have a release manager for each
> release:
>
> http://www.apache.org/dev/release-publishing.html
>
> It doesn't have to be the same person every time, though, not even for the
> same branch.
>
> -Flavio
>
>
> > On 17 Aug 2015, at 05:01, Gwen Shapira  wrote:
> >
> > BTW. I think it will be great for Apache Kafka to have a 0.8.2 "release
> > manager" who's role is to cherrypick low-risk bug-fixes into the 0.8.2
> > branch and once enough bug fixes happened (or if sufficiently critical
> > fixes happened) to roll out a new maintenance release (with every 3 months
> > as a reasonable bugfix release target).
>
>


-- 

Regards,
Ashish


[jira] [Created] (KAFKA-2440) Use `NetworkClient` instead of `SimpleConsumer` to fetch data from replica

2015-08-17 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-2440:
--

 Summary: Use `NetworkClient` instead of `SimpleConsumer` to fetch 
data from replica
 Key: KAFKA-2440
 URL: https://issues.apache.org/jira/browse/KAFKA-2440
 Project: Kafka
  Issue Type: Sub-task
Reporter: Ismael Juma
 Fix For: 0.8.3


This is necessary for SSL/TLS support for inter-broker communication as 
`SimpleConsumer` will not be updated to support SSL/TLS.

As explained by [~junrao] in KAFKA-2411: we need to be a bit careful since the 
follower fetcher thread doesn't need to refresh metadata itself. Instead, the 
information about the leader is propagated from the controller.

This work was originally described in KAFKA-2411, which was then updated to be 
more narrowly focused on replacing `BlockingChannel` with `Selector`.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-2440) Use `NetworkClient` instead of `SimpleConsumer` to fetch data from replica

2015-08-17 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reassigned KAFKA-2440:
--

Assignee: Ismael Juma

> Use `NetworkClient` instead of `SimpleConsumer` to fetch data from replica
> --
>
> Key: KAFKA-2440
> URL: https://issues.apache.org/jira/browse/KAFKA-2440
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
>Assignee: Ismael Juma
> Fix For: 0.8.3
>
>
> This is necessary for SSL/TLS support for inter-broker communication as 
> `SimpleConsumer` will not be updated to support SSL/TLS.
> As explained by [~junrao] in KAFKA-2411: we need to be a bit careful since 
> the follower fetcher thread doesn't need to refresh metadata itself. Instead, 
> the information about the leader is propagated from the controller.
> This work was originally described in KAFKA-2411, which was then updated to 
> be more narrowly focused on replacing `BlockingChannel` with `Selector`.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (KAFKA-2440) Use `NetworkClient` instead of `SimpleConsumer` to fetch data from replica

2015-08-17 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-2440 started by Ismael Juma.
--
> Use `NetworkClient` instead of `SimpleConsumer` to fetch data from replica
> --
>
> Key: KAFKA-2440
> URL: https://issues.apache.org/jira/browse/KAFKA-2440
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Ismael Juma
>Assignee: Ismael Juma
> Fix For: 0.8.3
>
>
> This is necessary for SSL/TLS support for inter-broker communication as 
> `SimpleConsumer` will not be updated to support SSL/TLS.
> As explained by [~junrao] in KAFKA-2411: we need to be a bit careful since 
> the follower fetcher thread doesn't need to refresh metadata itself. Instead, 
> the information about the leader is propagated from the controller.
> This work was originally described in KAFKA-2411, which was then updated to 
> be more narrowly focused on replacing `BlockingChannel` with `Selector`.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2106) Partition balance tool between brokers

2015-08-17 Thread chenshangan (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenshangan updated KAFKA-2106:
---
Attachment: KAFKA-2106.3

Implemented a global balance across the cluster. This discards the original 
partition assignment, but the cluster eventually reaches a balanced state.

It first divides all topics into two groups: big topics and small topics. A 
big topic has more partitions than there are brokers, while a small topic has 
fewer. The tool guarantees that partitions of big topics are distributed 
evenly across the cluster; for small topics it does not enforce an absolutely 
even distribution.
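The big-topic/small-topic split described in the comment can be sketched as follows. This is illustrative only, not the attached patch; function names and the round-robin placement are assumptions:

```python
def classify_topics(topic_partitions, num_brokers):
    """Split topics into 'big' (more partitions than brokers) and 'small'
    (at most as many partitions as brokers)."""
    big = {t: n for t, n in topic_partitions.items() if n > num_brokers}
    small = {t: n for t, n in topic_partitions.items() if n <= num_brokers}
    return big, small


def assign_big_topic(num_partitions, num_brokers):
    """Round-robin a big topic's partitions over the brokers so each
    broker gets an (almost) equal share -- the even-distribution
    guarantee for big topics."""
    return {p: p % num_brokers for p in range(num_partitions)}
```

Small topics would then be placed with a looser heuristic (e.g. least-loaded broker first), since an absolutely even spread is not enforced for them.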

> Partition balance tool between brokers
> --
>
> Key: KAFKA-2106
> URL: https://issues.apache.org/jira/browse/KAFKA-2106
> Project: Kafka
>  Issue Type: New Feature
>  Components: admin
>Affects Versions: 0.8.3
>Reporter: chenshangan
> Attachments: KAFKA-2106.3, KAFKA-2106.patch, KAFKA-2106.patch.2
>
>
> The default partition assignment algorithm works well in a static Kafka 
> cluster (where the number of brokers seldom changes). In production, however, 
> the number of brokers keeps increasing as business data grows. When new 
> brokers are added to the cluster, it's better to provide a tool that can help 
> move existing data to them. Currently, users need to choose topics or 
> partitions manually and use the Reassign Partitions Tool 
> (kafka-reassign-partitions.sh) to achieve this. It's a time-consuming 
> task when there are a lot of topics in the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2411) remove usage of BlockingChannel in the broker

2015-08-17 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-2411:
---
Description: 
In KAFKA-1690, we are adding the SSL support at Selector. However, there are 
still a few places where we use BlockingChannel for inter-broker communication. 
We need to replace those usage with Selector/NetworkClient to enable 
inter-broker communication over SSL. Specifically, BlockingChannel is currently 
used in the following places.
1. ControllerChannelManager: for the controller to propagate metadata to the 
brokers.
2. KafkaServer: for the broker to send controlled shutdown request to the 
controller.
3. -AbstractFetcherThread: for the follower to fetch data from the leader 
(through SimpleConsumer)- moved to KAFKA-2440

  was:
In KAFKA-1690, we are adding the SSL support at Selector. However, there are 
still a few places where we use BlockingChannel for inter-broker communication. 
We need to replace those usage with Selector/NetworkClient to enable 
inter-broker communication over SSL. Specifically, BlockingChannel is currently 
used in the following places.
1. ControllerChannelManager: for the controller to propagate metadata to the 
brokers.
2. KafkaServer: for the broker to send controlled shutdown request to the 
controller.
3. AbstractFetcherThread: for the follower to fetch data from the leader 
(through SimpleConsumer).


> remove usage of BlockingChannel in the broker
> -
>
> Key: KAFKA-2411
> URL: https://issues.apache.org/jira/browse/KAFKA-2411
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Jun Rao
>Assignee: Ismael Juma
> Fix For: 0.8.3
>
>
> In KAFKA-1690, we are adding the SSL support at Selector. However, there are 
> still a few places where we use BlockingChannel for inter-broker 
> communication. We need to replace those usage with Selector/NetworkClient to 
> enable inter-broker communication over SSL. Specifically, BlockingChannel is 
> currently used in the following places.
> 1. ControllerChannelManager: for the controller to propagate metadata to the 
> brokers.
> 2. KafkaServer: for the broker to send controlled shutdown request to the 
> controller.
> 3. -AbstractFetcherThread: for the follower to fetch data from the leader 
> (through SimpleConsumer)- moved to KAFKA-2440



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [copycat] How to upgrade connectors

2015-08-17 Thread Jay Kreps
Totally agree with the priorities, and I'm actually not even advocating the
alternative I threw out. I think, ultimately, the connector developer is
the one who has to support compatibility, though, irrespective of the
mechanism. So I guess it is really just a question of whether it is easier
for people to work with explicit versioning or the kind of loosey-goosey
mechanism I described. I don't know the answer to that question. Internally
Kafka and related things have always used explicit versioning and I kind of
prefer that too, but I have observed that people struggle with it.

-Jay

On Mon, Aug 17, 2015 at 9:21 AM, Gwen Shapira  wrote:

> I see our "make life easy" priorities as follows: users > connector
> developers > core copycat developers.
> In this case I think that best user experience will be consistent behavior
> for all copycat connectors, so they'll know what to expect during upgrades.
>
> I'm not sure how much we can enforce from the framework side, but I suspect
> we can do better than "advise connector developers to ignore unknown
> configs"... even if that means making the framework or even connector
> development slightly more difficult.
>
>
> On Mon, Aug 17, 2015 at 9:08 AM, Jay Kreps  wrote:
>
> > Yeah, I meant this purely as an approach for the connectors themselves to
> > implement, not that we would automatically suppress configs in the
> > framework.
> >
> > -Jay
> >
> > On Fri, Aug 14, 2015 at 7:48 PM, Gwen Shapira  wrote:
> >
> > > The "ignore new configs" plan is good, IMO. I just don't know if it's
> > > feasible in current Copycat:
> > > I'm not sure CC can ignore configuration on the connector's behalf. Also,
> > > this is something we will want to log very clearly, and I'm not sure this
> > > is doable outside the connector.
> > >
> > > Regarding complex implementation for connector developers: all
> > > version-1.0 connectors will have the following:
> > >
> > > int getVersion() {
> > > return 1;
> > > }
> > >
> > > Configuration upgradeConfig(Configuration config) {
> > > return config;
> > > }
> > >
> > > Not that bad :) We can even have a base class to avoid boilerplate.
> > >
> > > The complexity shows up when upgradeConfig has to actually upgrade
> > > configs, but that's our whole point, IMO.
> > >
> > > Gwen
> > >
> > > On Fri, Aug 14, 2015 at 10:48 AM, Jay Kreps  wrote:
> > >
> > > > I think this is a great area to think about.
> > > >
> > > > What about the simpler informal scheme of just having CC ignore config
> > > > it doesn't know about and instructing connector developers to maintain
> > > > compatibility when renaming params (i.e. support both old and new), as
> > > > we do for Kafka config.
> > > >
> > > > The steps would then be:
> > > > 1. CC connector developer adds new config support, possibly deprecating
> > > > older configs but not removing them.
> > > > 2. CC user then upgrades their config, then the connector
> > > > implementation.
> > > > 3. CC user can always roll back if this doesn't work, in which case the
> > > > new configs added are ignored.
> > > >
> > > > If you want to totally refactor your configs in a way where
> > > > compatibility is just not doable, then you should rename the connector.
> > > >
> > > > This is one where a system test that tests the last release versus this
> > > > release would probably do a lot to help ensure this stuff actually
> > > > worked as expected.
> > > >
> > > > Not sure if this is better or worse than the hard versioning scheme.
> > > >
> > > > -Jay
> > > >
> > > > On Fri, Aug 14, 2015 at 12:52 AM, Ewen Cheslack-Postava
> > > > <e...@confluent.io> wrote:
> > > >
> > > > > Yes, I think that makes sense. As I see it, the tradeoffs are:
> > > > >
> > > > > 1. Complexity - adding these APIs increases the number of things
> > > > > connector developers need to implement just to get started. In a lot
> > > > > of cases, the first version of a connector might literally only have
> > > > > some connection string as a setting, and that setting will never be
> > > > > removed/changed. So it would be, in some sense, unnecessary added
> > > > > complexity.
> > > > > 2. Relies on correct modification of the version number by the
> > > > > connector developer - of course any compatibility relies on the
> > > > > connector developer noting incompatibilities and handling them
> > > > > properly, but baking versioning into the APIs suggests we're
> > > > > guaranteeing compatibility. However, we know from experience that
> > > > > this is a lot harder than just adding version numbers. Seemingly
> > > > > small changes may change the semantics but not the format of the
> > > > > config, such that there is an unexpected incompatibility.
> > > > > 3. Force developers to think about compatibility up front - I think
> > > > > this is the main argument for this approach. By including those
> > > > > APIs, it's a big
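The versioned-config idea sketched in this thread might look like the following. This is a hypothetical illustration, not the actual Copycat API; the class names and the `url`/`connection.url` settings are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical base class capturing the thread's proposal: every connector
// declares a config version and can upgrade an older config in place.
abstract class VersionedConnector {
    int getVersion() { return 1; }  // version-1.0 connectors keep the defaults

    Map<String, String> upgradeConfig(Map<String, String> config) {
        return config;  // default: nothing to upgrade
    }
}

// A hypothetical v2 connector that renamed the "url" setting to "connection.url".
class ExampleSourceConnector extends VersionedConnector {
    @Override
    int getVersion() { return 2; }

    @Override
    Map<String, String> upgradeConfig(Map<String, String> config) {
        Map<String, String> upgraded = new HashMap<>(config);
        if (upgraded.containsKey("url") && !upgraded.containsKey("connection.url")) {
            // Migrate the old key so v1 configs keep working after the upgrade.
            upgraded.put("connection.url", upgraded.remove("url"));
        }
        return upgraded;
    }
}

public class UpgradeDemo {
    public static void main(String[] args) {
        Map<String, String> v1Config = new HashMap<>();
        v1Config.put("url", "jdbc:example");
        Map<String, String> v2Config = new ExampleSourceConnector().upgradeConfig(v1Config);
        System.out.println(v2Config.get("connection.url"));  // jdbc:example
        System.out.println(v2Config.containsKey("url"));     // false
    }
}
```

As the thread notes, the boilerplate for connectors that never change their config is trivial; the real complexity only appears when upgradeConfig has actual migrations to perform, which is exactly when it earns its keep.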

[jira] [Created] (KAFKA-2441) SSL/TLS in official docs

2015-08-17 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-2441:
--

 Summary: SSL/TLS in official docs
 Key: KAFKA-2441
 URL: https://issues.apache.org/jira/browse/KAFKA-2441
 Project: Kafka
  Issue Type: Sub-task
Reporter: Ismael Juma
 Fix For: 0.8.3


We need to add a section in the official documentation regarding SSL/TLS:

http://kafka.apache.org/documentation.html

There is already a wiki page where some of the information is present:

https://cwiki.apache.org/confluence/display/KAFKA/Deploying+SSL+for+Kafka





[jira] [Commented] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2015-08-17 Thread chenshangan (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699875#comment-14699875
 ] 

chenshangan commented on KAFKA-1792:


you guys can take a look at https://issues.apache.org/jira/browse/KAFKA-2106.

I really do not think the generate/execute/verify process is friendly for 
maintenance; an interactive or background-running process would be better.

> change behavior of --generate to produce assignment config with fair replica 
> distribution and minimal number of reassignments
> -
>
> Key: KAFKA-1792
> URL: https://issues.apache.org/jira/browse/KAFKA-1792
> Project: Kafka
>  Issue Type: Sub-task
>  Components: tools
>Reporter: Dmitry Pekar
>Assignee: Dmitry Pekar
> Fix For: 0.8.3
>
> Attachments: KAFKA-1792.patch, KAFKA-1792_2014-12-03_19:24:56.patch, 
> KAFKA-1792_2014-12-08_13:42:43.patch, KAFKA-1792_2014-12-19_16:48:12.patch, 
> KAFKA-1792_2015-01-14_12:54:52.patch, KAFKA-1792_2015-01-27_19:09:27.patch, 
> KAFKA-1792_2015-02-13_21:07:06.patch, KAFKA-1792_2015-02-26_16:58:23.patch, 
> generate_alg_tests.txt, rebalance_use_cases.txt
>
>
> The current implementation produces a fair replica distribution between a 
> specified list of brokers. Unfortunately, it doesn't take 
> into account the current replica assignment.
> So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth 
> broker id=3, 
> generate will create an assignment config which will redistribute replicas 
> fairly across brokers [0..3], 
> in the same way as if those partitions were created from scratch. It will not 
> take into consideration the current replica 
> assignment and accordingly will not try to minimize the number of replica 
> moves between brokers.
> As proposed by [~charmalloc], this should be improved. The new output of the 
> improved --generate algorithm should suit the following requirements:
> - fairness of replica distribution - every broker will have R or R+1 replicas 
> assigned;
> - minimum of reassignments - the number of replica moves between brokers will 
> be minimal.
> Example.
> Consider the following replica distribution across brokers [0..3] (we just 
> added brokers 2 and 3):
> - broker - 0, 1, 2, 3
> - replicas - 7, 6, 0, 0
> The new algorithm will produce the following assignment:
> - broker - 0, 1, 2, 3
> - replicas - 4, 3, 3, 3
> - moves - -3, -3, +3, +3
> It will be fair, and the number of moves will be 6, which is minimal for the 
> specified initial distribution.
> The scope of this issue is:
> - design an algorithm matching the above requirements;
> - implement this algorithm and unit tests;
> - test it manually using different initial assignments;
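The fairness target (R or R+1 replicas per broker) and the minimal move count from the example in the issue can be sketched as follows. This illustrates only the counting argument, not the actual --generate implementation, and the class name is invented.

```java
import java.util.Arrays;
import java.util.Comparator;

public class FairAssignment {
    // Given current replica counts per broker, return fair target counts:
    // every broker ends with R or R+1 replicas. The +1 goes to the brokers
    // that already hold the most replicas, so fewer replicas have to move.
    static int[] fairTargets(int[] current) {
        int total = Arrays.stream(current).sum();
        int n = current.length;
        int r = total / n;
        int extra = total % n;  // this many brokers get R+1

        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // Sort broker indices by current load, most loaded first.
        Arrays.sort(order, Comparator.comparingInt((Integer i) -> current[i]).reversed());

        int[] target = new int[n];
        for (int i = 0; i < n; i++) target[order[i]] = r + (i < extra ? 1 : 0);
        return target;
    }

    // Number of replica moves = total excess that must leave over-full brokers.
    static int moves(int[] current, int[] target) {
        int m = 0;
        for (int i = 0; i < current.length; i++) m += Math.max(0, current[i] - target[i]);
        return m;
    }

    public static void main(String[] args) {
        int[] current = {7, 6, 0, 0};  // brokers 2 and 3 just added
        int[] target = fairTargets(current);
        System.out.println(Arrays.toString(target));   // [4, 3, 3, 3]
        System.out.println(moves(current, target));    // 6
    }
}
```

Running this on the issue's example reproduces the stated result: targets [4, 3, 3, 3] and 6 moves, which is the minimum for that initial distribution.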





Re: [DISCUSS] Client-side Assignment for New Consumer

2015-08-17 Thread Jason Gustafson
Hey Becket,

Thanks for raising these concerns. I think the rolling upgrade problem is
handled quite a bit better with the new consumer if Onur's LeaveGroup patch
(or a variant) is accepted. Without it, the coordinator has to wait for the
full session timeout to detect that a node has left the group, which can
significantly delay rebalancing.

-Jason

On Sun, Aug 16, 2015 at 11:42 AM, Neha Narkhede  wrote:

> Hey Becket,
>
> These are all fair points. Regarding running Kafka as a service, it will be
> good for everyone to know some numbers around topic creation and changes
> around # of partitions. I don't think the test usage is a good one since no
> one should be creating and deleting topics in a loop on a production
> cluster.
>
> Regarding your point about mirror maker, I think we should definitely
> ensure that a mirror maker with a large topic subscription list should be
> well supported. The reason for the half an hour stalls in LinkedIn mirror
> makers is due to the fact that it still uses the old consumer. The 3 major
> reasons for long rebalance operations in the mirror maker using the old
> consumer are -
> 1. Every rebalance operation involves many writes to ZK. For large consumer
> groups and large number of partitions, this adds up to several seconds
> 2. The view of the members in a group is inconsistent and hence there are
> typically several rebalance attempts. Note that this is fixed in the new
> design where the group membership is always consistently communicated to
> all members.
> 3. The rebalance backoff time is high (of the order of seconds).
>
> The good news is that the new design has none of these problems. As we
> said, we will run some tests to ensure that the mirror maker case is
> handled well. Thank you for the feedback!
>
> Thanks,
> Neha
>
> On Sat, Aug 15, 2015 at 6:17 PM, Jiangjie Qin 
> wrote:
>
> > Neha, Ewen and Jason,
> >
> > Maybe I am overly concerned, and I agree that it does depend on the
> > metadata change frequency. As Neha said, a few tests will be helpful. We
> > can see how it goes.
> >
> > What worries me is that at LinkedIn we are in the process of running Kafka
> > as a service. That means users will have the ability to create/delete
> > topics and update their topic configurations programmatically or through a
> > UI using LinkedIn's internal cloud service platform. And some automated
> > tests will also be constantly running to create topics, produce/consume
> > some data, then clean up the topics. This brings us some new challenges
> > for mirror maker, because we need to make sure it actually copies data
> > instead of spending too much time on rebalancing. What we see now is that
> > each rebalance will probably take 30 seconds or more to finish, even with
> > some tricks and tuning.
> >
> > Another related use case I want to bring up is that today, when we want to
> > do a rolling upgrade of mirror maker (26 nodes), there will be two rounds
> > of rebalance to bounce each node, and each rebalance takes about 30
> > seconds. So bouncing 26 nodes takes roughly half an hour. Awkwardly,
> > during the rolling bounce, because mirror maker keeps rebalancing, it
> > actually does not really copy any data! So the pipeline will literally
> > stall for half an hour! Since we are designing the new protocol, it will
> > also be good if we make sure this use case is addressed.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Sat, Aug 15, 2015 at 10:50 AM, Neha Narkhede wrote:
> >
> > > Becket,
> > >
> > > This is a clever approach to ensure that only one thing communicates
> > > the metadata, so even if it is stale, the entire group has the same
> > > view. However, the big assumption this makes is that the coordinator is
> > > the one process that has the ability to know the metadata for group
> > > members, which does not work for any non-consumer use case.
> > >
> > > I wonder if we may be complicating the design of the 95% use case for
> > > the remaining 5%. For instance, how often do people create and remove
> > > topics or even add partitions? We operated LinkedIn clusters for a long
> > > time, and this wasn't a frequent event that would need us to optimize
> > > this design for it.
> > >
> > > Also, this is something we can easily validate by running a few tests
> > > on the patch, and I suggest we wait for that.
> > >
> > > Thanks,
> > > Neha
> > >
> > > On Sat, Aug 15, 2015 at 9:14 AM, Jason Gustafson 
> > > wrote:
> > >
> > > > Hey Jiangjie,
> > > >
> > > > I was thinking about the same problem. When metadata is changing
> > > > frequently, the clients may never be able to find agreement on the
> > > > current state. The server doesn't have this problem, as you say,
> > > > because it can just take a snapshot and send that to the clients.
> > > > Adding a dampening setting to the client would help if churn is
> > > > sporadic, but not if it is steady. I think the metadata would have to
> > > > be changin

[jira] [Commented] (KAFKA-2436) log.retention.hours should be honored by LogManager

2015-08-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699986#comment-14699986
 ] 

ASF GitHub Bot commented on KAFKA-2436:
---

GitHub user lindong28 reopened a pull request:

https://github.com/apache/kafka/pull/142

KAFKA-2436; log.retention.hours should be honored by LogManager



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-2436

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/142.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #142


commit a713d45ad4ed59440be020cc6c74efbeb2bbe54b
Author: Dong Lin 
Date:   2015-08-16T23:11:58Z

KAFKA-2436; log.retention.hours should be honored by LogManager

commit a8436f778bbdc498cdae741a74c33296065d0b21
Author: Dong Lin 
Date:   2015-08-17T15:30:14Z

A few other configurations should also be propagated to LogManager

commit ea44479abf305aa6926a1f6b2592c52216aa334f
Author: Dong Lin 
Date:   2015-08-17T17:31:47Z

remove unnecessary type conversion

commit d634339fbb4daa0dfe112c6beea0751878613fa2
Author: Dong Lin 
Date:   2015-08-17T17:57:43Z

move logRetentionTimeMillis to the group of Log Configuration in code




> log.retention.hours should be honored by LogManager
> ---
>
> Key: KAFKA-2436
> URL: https://issues.apache.org/jira/browse/KAFKA-2436
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
>
> Currently, log.retention.hours is used to calculate 
> KafkaConfig.logRetentionTimeMillis, but it is not used in LogManager to 
> decide when to delete a log. LogManager only uses log.retention.ms from 
> the broker configuration.
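A simplified sketch of the intended precedence (log.retention.ms over log.retention.minutes over log.retention.hours, the order KafkaConfig uses when computing logRetentionTimeMillis). This is an illustration of the precedence rule only; the real KafkaConfig also performs validation and handles many more properties.

```java
import java.util.Map;
import java.util.concurrent.TimeUnit;

public class RetentionConfig {
    // Resolve the effective retention in milliseconds from broker properties.
    // The most specific unit wins: ms, then minutes, then hours.
    static long retentionMillis(Map<String, String> props) {
        if (props.containsKey("log.retention.ms"))
            return Long.parseLong(props.get("log.retention.ms"));
        if (props.containsKey("log.retention.minutes"))
            return TimeUnit.MINUTES.toMillis(Long.parseLong(props.get("log.retention.minutes")));
        // log.retention.hours defaults to 168 (7 days).
        long hours = Long.parseLong(props.getOrDefault("log.retention.hours", "168"));
        return TimeUnit.HOURS.toMillis(hours);
    }

    public static void main(String[] args) {
        // Only hours set: honored (this is the behavior the bug restores).
        System.out.println(retentionMillis(Map.of("log.retention.hours", "24")));   // 86400000
        // ms set as well: ms takes precedence over hours.
        System.out.println(retentionMillis(Map.of("log.retention.hours", "24",
                                                  "log.retention.ms", "1000")));    // 1000
    }
}
```

The bug here was that LogManager consulted only the ms-level value, so a broker configured solely with log.retention.hours never had that setting applied to deletion decisions.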





[GitHub] kafka pull request: KAFKA-2436; log.retention.hours should be hono...

2015-08-17 Thread lindong28
GitHub user lindong28 reopened a pull request:

https://github.com/apache/kafka/pull/142

KAFKA-2436; log.retention.hours should be honored by LogManager



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-2436

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/142.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #142


commit a713d45ad4ed59440be020cc6c74efbeb2bbe54b
Author: Dong Lin 
Date:   2015-08-16T23:11:58Z

KAFKA-2436; log.retention.hours should be honored by LogManager

commit a8436f778bbdc498cdae741a74c33296065d0b21
Author: Dong Lin 
Date:   2015-08-17T15:30:14Z

A few other configurations should also be propagated to LogManager

commit ea44479abf305aa6926a1f6b2592c52216aa334f
Author: Dong Lin 
Date:   2015-08-17T17:31:47Z

remove unnecessary type conversion

commit d634339fbb4daa0dfe112c6beea0751878613fa2
Author: Dong Lin 
Date:   2015-08-17T17:57:43Z

move logRetentionTimeMillis to the group of Log Configuration in code




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request: KAFKA-2436; log.retention.hours should be hono...

2015-08-17 Thread lindong28
Github user lindong28 closed the pull request at:

https://github.com/apache/kafka/pull/142




[jira] [Commented] (KAFKA-2436) log.retention.hours should be honored by LogManager

2015-08-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699985#comment-14699985
 ] 

ASF GitHub Bot commented on KAFKA-2436:
---

Github user lindong28 closed the pull request at:

https://github.com/apache/kafka/pull/142


> log.retention.hours should be honored by LogManager
> ---
>
> Key: KAFKA-2436
> URL: https://issues.apache.org/jira/browse/KAFKA-2436
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
>
> Currently log.retention.hours is used to calculate 
> KafkaConfig.logRetentionTimeMillis. But it is not used in LogManager to 
> decide when to delete a log. LogManager is only using the log.retention.ms in 
> the broker configuration.





Re: Review Request 33620: Patch for KAFKA-1690

2015-08-17 Thread Rajini Sivaram

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/#review95619
---



clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java 
(line 209)


Shouldn't this be adding WRITE rather than overriding with WRITE-only:
```
key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
```
At the moment, during renegotiation when READ is on, there doesn't seem to 
be any code that turns READ back on if it gets turned off here. I am seeing 
a very intermittent (one in 100) failure in the renegotiation unit test 
due to this.
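The difference between overriding and OR-ing the interest set is plain bitmask arithmetic, illustrated below with static helpers (the class and method names are just for the demo):

```java
import java.nio.channels.SelectionKey;

public class InterestOpsDemo {
    // Overriding the interest set: any previously registered ops (e.g. OP_READ)
    // are silently lost.
    static int overrideWithWrite(int interestOps) {
        return SelectionKey.OP_WRITE;
    }

    // OR-ing adds OP_WRITE while preserving whatever was already registered,
    // as the review comment suggests.
    static int addWrite(int interestOps) {
        return interestOps | SelectionKey.OP_WRITE;
    }

    public static void main(String[] args) {
        int ops = SelectionKey.OP_READ;  // a key currently interested in reads
        System.out.println((overrideWithWrite(ops) & SelectionKey.OP_READ) != 0);  // false: READ dropped
        System.out.println((addWrite(ops) & SelectionKey.OP_READ) != 0);           // true: READ kept
        System.out.println((addWrite(ops) & SelectionKey.OP_WRITE) != 0);          // true: WRITE added
    }
}
```

If nothing later re-adds OP_READ, the override silently stops read events for that channel, which matches the intermittent renegotiation failure described.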


- Rajini Sivaram


On Aug. 17, 2015, 4:28 p.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33620/
> ---
> 
> (Updated Aug. 17, 2015, 4:28 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1690
> https://issues.apache.org/jira/browse/KAFKA-1690
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Added 
> PrincipalBuilder.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Fixed minor 
> issues with the patch.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Fixed minor 
> issues with the patch.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Broker side ssl changes.
> 
> 
> KAFKA-1684. SSL for socketServer.
> 
> 
> KAFKA-1690. Added SSLProducerSendTest and fixes to get right port for SSL.
> 
> 
> Merge branch 'trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Post merge fixes.
> 
> 
> KAFKA-1690. Added SSLProducerSendTest.
> 
> 
> KAFKA-1690. Minor fixes based on patch review comments.
> 
> 
> Merge commit
> 
> 
> KAFKA-1690. Added SSL Consumer Test.
> 
> 
> KAFKA-1690. SSL Support.
> 
> 
> KAFKA-1690. Addressing reviews.
> 
> 
> Merge branch 'trunk' into KAFKA-1690-V1
> 
> 
> Merge branch 'trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Addressing reviews. Removed interestOps from SSLTransportLayer.
> 
> 
> KAFKA-1690. Addressing reviews.
> 
> 
> KAFKA-1690. added staged receives to selector.
> 
> 
> KAFKA-1690. Addressing reviews.
> 
> 
> Merge branch 'trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Addressing reviews.
> 
> 
> KAFKA-1690. Add SSL support to broker, producer and consumer.
> 
> 
> Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Add SSL support to Kafka Broker, Producer & Client.
> 
> 
> KAFKA-1690. Add SSL support for Kafka Brokers, Producers and Consumers.
> 
> 
> Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1
> 
> 
> KAFKA-1690. Add SSL support for Kafka Brokers, Producers and Consumers.
> 
> 
> Diffs
> -
> 
>   build.gradle 983587fd0b7604c3a26fcbb6a1d63e5e470d23fe 
>   checkstyle/import-control.xml e3f4f84c6becfd9087627f018690e1e2fc2b3bba 
>   clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
> 0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> 2c421f42ed3fc5d61cf9c87a7eaa7bb23e26f63b 
>   clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
> 0e51d7bd461d253f4396a5b6ca7cd391658807fa 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> d35b421a515074d964c7fccb73d260b847ea5f00 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> be46b6c213ad8c6c09ad22886a5f36175ab0e13a 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 03b8dd23df63a8d8a117f02eabcce4a2d48c44f7 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> aa264202f2724907924985a5ecbe74afc4c6c04b 
>   clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 
> 6c317480a181678747bfb6b77e315b08668653c5 
>   clients/src/main/java/org/apache/kafka/common/config/SSLConfigs.java 
> PRE-CREATION 
>   cli

Re: Review Request 33620: Patch for KAFKA-1690

2015-08-17 Thread Sriharsha Chintalapani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/
---

(Updated Aug. 17, 2015, 7:21 p.m.)


Review request for kafka.


Bugs: KAFKA-1690
https://issues.apache.org/jira/browse/KAFKA-1690


Repository: kafka


Description (updated)
---

KAFKA-1690. new java producer needs ssl support as a client.


KAFKA-1690. new java producer needs ssl support as a client.


KAFKA-1690. new java producer needs ssl support as a client.


KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.


KAFKA-1690. new java producer needs ssl support as a client. Added 
PrincipalBuilder.


KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews.


KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews.


KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews.


KAFKA-1690. new java producer needs ssl support as a client. Fixed minor issues 
with the patch.


KAFKA-1690. new java producer needs ssl support as a client. Fixed minor issues 
with the patch.


KAFKA-1690. new java producer needs ssl support as a client.


KAFKA-1690. new java producer needs ssl support as a client.


Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1


KAFKA-1690. Broker side ssl changes.


KAFKA-1684. SSL for socketServer.


KAFKA-1690. Added SSLProducerSendTest and fixes to get right port for SSL.


Merge branch 'trunk' into KAFKA-1690-V1


KAFKA-1690. Post merge fixes.


KAFKA-1690. Added SSLProducerSendTest.


KAFKA-1690. Minor fixes based on patch review comments.


Merge commit


KAFKA-1690. Added SSL Consumer Test.


KAFKA-1690. SSL Support.


KAFKA-1690. Addressing reviews.


Merge branch 'trunk' into KAFKA-1690-V1


Merge branch 'trunk' into KAFKA-1690-V1


KAFKA-1690. Addressing reviews. Removed interestOps from SSLTransportLayer.


KAFKA-1690. Addressing reviews.


KAFKA-1690. added staged receives to selector.


KAFKA-1690. Addressing reviews.


Merge branch 'trunk' into KAFKA-1690-V1


KAFKA-1690. Addressing reviews.


KAFKA-1690. Add SSL support to broker, producer and consumer.


Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1


KAFKA-1690. Add SSL support to Kafka Broker, Producer & Client.


KAFKA-1690. Add SSL support for Kafka Brokers, Producers and Consumers.


Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1


KAFKA-1690. Add SSL support for Kafka Brokers, Producers and Consumers.


KAFKA-1690. Add SSL Support Kafka Broker, Producer and Consumer.


Diffs (updated)
-

  build.gradle 983587fd0b7604c3a26fcbb6a1d63e5e470d23fe 
  checkstyle/import-control.xml e3f4f84c6becfd9087627f018690e1e2fc2b3bba 
  clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
  clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
2c421f42ed3fc5d61cf9c87a7eaa7bb23e26f63b 
  clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
0e51d7bd461d253f4396a5b6ca7cd391658807fa 
  clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
d35b421a515074d964c7fccb73d260b847ea5f00 
  clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
be46b6c213ad8c6c09ad22886a5f36175ab0e13a 
  clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
03b8dd23df63a8d8a117f02eabcce4a2d48c44f7 
  clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
aa264202f2724907924985a5ecbe74afc4c6c04b 
  clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 
6c317480a181678747bfb6b77e315b08668653c5 
  clients/src/main/java/org/apache/kafka/common/config/SSLConfigs.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/Authenticator.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/ByteBufferSend.java 
df0e6d5105ca97b7e1cb4d334ffb7b443506bd0b 
  clients/src/main/java/org/apache/kafka/common/network/ChannelBuilder.java 
PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/KafkaChannel.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/NetworkReceive.java 
3ca0098b8ec8cfdf81158465b2d40afc47eb6f80 
  
clients/src/main/java/org/apache/kafka/common/network/PlaintextChannelBuilder.java
 PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/network/PlaintextTransportLayer.java
 PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/Selectable.java 
618a0fa53848ae6befea7eba39c2f3285b734494 

[jira] [Updated] (KAFKA-1690) Add SSL support to Kafka Broker, Producer and Consumer

2015-08-17 Thread Sriharsha Chintalapani (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriharsha Chintalapani updated KAFKA-1690:
--
Attachment: KAFKA-1690_2015-08-17_12:20:53.patch

> Add SSL support to Kafka Broker, Producer and Consumer
> --
>
> Key: KAFKA-1690
> URL: https://issues.apache.org/jira/browse/KAFKA-1690
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Sriharsha Chintalapani
> Fix For: 0.8.3
>
> Attachments: KAFKA-1690.patch, KAFKA-1690.patch, 
> KAFKA-1690_2015-05-10_23:20:30.patch, KAFKA-1690_2015-05-10_23:31:42.patch, 
> KAFKA-1690_2015-05-11_16:09:36.patch, KAFKA-1690_2015-05-12_16:20:08.patch, 
> KAFKA-1690_2015-05-15_07:18:21.patch, KAFKA-1690_2015-05-20_14:54:35.patch, 
> KAFKA-1690_2015-05-21_10:37:08.patch, KAFKA-1690_2015-06-03_18:52:29.patch, 
> KAFKA-1690_2015-06-23_13:18:20.patch, KAFKA-1690_2015-07-20_06:10:42.patch, 
> KAFKA-1690_2015-07-20_11:59:57.patch, KAFKA-1690_2015-07-25_12:10:55.patch, 
> KAFKA-1690_2015-08-16_20:41:02.patch, KAFKA-1690_2015-08-17_08:12:50.patch, 
> KAFKA-1690_2015-08-17_09:28:52.patch, KAFKA-1690_2015-08-17_12:20:53.patch
>
>






[jira] [Commented] (KAFKA-1690) Add SSL support to Kafka Broker, Producer and Consumer

2015-08-17 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700077#comment-14700077
 ] 

Sriharsha Chintalapani commented on KAFKA-1690:
---

Updated reviewboard https://reviews.apache.org/r/33620/diff/
 against branch origin/trunk

> Add SSL support to Kafka Broker, Producer and Consumer
> --
>
> Key: KAFKA-1690
> URL: https://issues.apache.org/jira/browse/KAFKA-1690
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Sriharsha Chintalapani
> Fix For: 0.8.3
>
> Attachments: KAFKA-1690.patch, KAFKA-1690.patch, 
> KAFKA-1690_2015-05-10_23:20:30.patch, KAFKA-1690_2015-05-10_23:31:42.patch, 
> KAFKA-1690_2015-05-11_16:09:36.patch, KAFKA-1690_2015-05-12_16:20:08.patch, 
> KAFKA-1690_2015-05-15_07:18:21.patch, KAFKA-1690_2015-05-20_14:54:35.patch, 
> KAFKA-1690_2015-05-21_10:37:08.patch, KAFKA-1690_2015-06-03_18:52:29.patch, 
> KAFKA-1690_2015-06-23_13:18:20.patch, KAFKA-1690_2015-07-20_06:10:42.patch, 
> KAFKA-1690_2015-07-20_11:59:57.patch, KAFKA-1690_2015-07-25_12:10:55.patch, 
> KAFKA-1690_2015-08-16_20:41:02.patch, KAFKA-1690_2015-08-17_08:12:50.patch, 
> KAFKA-1690_2015-08-17_09:28:52.patch, KAFKA-1690_2015-08-17_12:20:53.patch
>
>






[jira] [Commented] (KAFKA-1387) Kafka getting stuck creating ephemeral node it has already created when two zookeeper sessions are established in a very short period of time

2015-08-17 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700113#comment-14700113
 ] 

Guozhang Wang commented on KAFKA-1387:
--

I thought that when the previous session has ended (e.g. expired), its 
ephemeral node would be "eventually" removed? Does ZooKeeper itself have a 
leasing mechanism?

> Kafka getting stuck creating ephemeral node it has already created when two 
> zookeeper sessions are established in a very short period of time
> -
>
> Key: KAFKA-1387
> URL: https://issues.apache.org/jira/browse/KAFKA-1387
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.1
>Reporter: Fedor Korotkiy
>Priority: Blocker
>  Labels: newbie, patch, zkclient-problems
> Attachments: kafka-1387.patch
>
>
> Kafka broker re-registers itself in zookeeper every time handleNewSession() 
> callback is invoked.
> https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaHealthcheck.scala
>  
> Now imagine the following sequence of events.
> 1) Zookeeper session reestablishes. handleNewSession() callback is queued by 
> the zkClient, but not invoked yet.
> 2) Zookeeper session reestablishes again, queueing callback second time.
> 3) First callback is invoked, creating /broker/[id] ephemeral path.
> 4) Second callback is invoked and tries to create the /broker/[id] path using 
> the createEphemeralPathExpectConflictHandleZKBug() function. But the path 
> already exists, so createEphemeralPathExpectConflictHandleZKBug() gets 
> stuck in an infinite loop.
> The controller election code seems to have the same issue.
> I'm able to reproduce this issue on the 0.8.1 branch from github using the 
> following configs.
> # zookeeper
> tickTime=10
> dataDir=/tmp/zk/
> clientPort=2101
> maxClientCnxns=0
> # kafka
> broker.id=1
> log.dir=/tmp/kafka
> zookeeper.connect=localhost:2101
> zookeeper.connection.timeout.ms=100
> zookeeper.sessiontimeout.ms=100
> Just start kafka and zookeeper and then pause zookeeper several times using 
> Ctrl-Z.





[jira] [Commented] (KAFKA-1387) Kafka getting stuck creating ephemeral node it has already created when two zookeeper sessions are established in a very short period of time

2015-08-17 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700187#comment-14700187
 ] 

Flavio Junqueira commented on KAFKA-1387:
-

bq. I thought that when the previous session has ended (e.g. expired), its 
ephemeral node will be "eventually" removed?

If the session ends cleanly, by the client submitting a closeSession request, 
then the session closes and the ephemerals are deleted with the request. But, 
if the client crashes and the server simply stops hearing from the client, then 
the session has to time out and expire so it takes some time.

bq. Does ZooKeeper itself have a leasing mechanism?

I'm referring to the fact that the ephemeral represents a lease that is revoked 
when the session times out.

I'm not sure if this is clear, but one of the problems I'm pointing out is that 
zkclient might end up creating the ephemeral znode in your *current* session. 
In this case, the znode won't go away. Here is actually another problem I found 
along the same lines. The createEphemeral call in ZkClient ends up calling 
retryUntilConnected, which retries even when the session expires:

{code}
try {
    return callable.call();
} catch (ConnectionLossException e) {
    // give the event thread some time to update the status to 'Disconnected'
    Thread.yield();
    waitForRetry();
} catch (SessionExpiredException e) {
    // give the event thread some time to update the status to 'Expired'
    Thread.yield();
    waitForRetry();
}
{code}

In this case, say that one call to createEphemeral via handleNewSession happens 
during a given session, but the session expires before the operation goes 
through. The client will retry with the new session. When the consumer tries 
again, it will fail because the znode is there and won't go away. This is 
another case in which the znode won't go away because it has been created in 
the current session.
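To make the distinction concrete, here is a minimal, self-contained sketch of one way out of the loop (an illustration only, not the actual Kafka or ZkClient code): on a NodeExists failure, compare the existing znode's ephemeral owner (in real ZooKeeper this is available via Stat.getEphemeralOwner()) with the current session id. If they match, the znode was created by this session's own earlier retry and the create can be treated as successful; if they differ, the node belongs to a dying session and will eventually be removed.

```java
final class EphemeralCheck {
    /** Decide what to do when create() fails with NodeExists (sketch). */
    static String onNodeExists(long ephemeralOwner, long currentSessionId) {
        if (ephemeralOwner == currentSessionId) {
            // The znode was created earlier in this same session (e.g. a
            // retried create that actually succeeded): stop retrying.
            return "already-ours";
        }
        // The znode belongs to a previous session: it will be removed when
        // that session expires, so back off instead of looping forever.
        return "wait-for-expiry";
    }

    public static void main(String[] args) {
        System.out.println(EphemeralCheck.onNodeExists(7L, 7L));
        System.out.println(EphemeralCheck.onNodeExists(7L, 8L));
    }
}
```

The method name and return values here are made up for illustration; the point is only that the owner comparison lets the caller distinguish "my own node" from "a stale node awaiting expiry".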






Kafka KIP meeting Aug. 18

2015-08-17 Thread Jun Rao
Hi, Everyone,

We will have a Kafka KIP meeting tomorrow at 11am PST. If you plan to
attend but haven't received an invite, please let me know. The following is
the agenda.

Agenda
1. Discuss the proposal on client-side assignment strategy (
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Client-side+Assignment+Proposal
)
2. Discuss comments on CopyCat data api (KAFKA-2367)
3. Kafka 0.8.2.2 release
4. Review Backlog
https://issues.apache.org/jira/issues/?jql=project%20%3D%20KAFKA%20AND%20status%20%3D%20%22Patch%20Available%22%20ORDER%20BY%20updated%20DESC


Thanks,

Jun


Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-17 Thread Gwen Shapira
Thanks for creating a list, Grant!

I placed it on the wiki with a quick evaluation of the content and whether
it should be in 0.8.2.2:
https://cwiki.apache.org/confluence/display/KAFKA/Proposed+patches+for+0.8.2.2

I'm attempting to only cherry-pick fixes that are both important to a large
number of users (or very critical to some users) and very safe (mostly
judged by the size of the change, but not only).

If your favorite bugfix is missing from the list, or is there but marked
"No", please let us know (in this thread) what we are missing and why it is
both important and safe.
Also, if I accidentally included something you consider unsafe, speak up!

Gwen

On Mon, Aug 17, 2015 at 8:17 AM, Grant Henke  wrote:

> +dev
>
> Adding dev list back in. Somehow it got dropped.
>
>
> On Mon, Aug 17, 2015 at 10:16 AM, Grant Henke  wrote:
>
> > Below is a list of candidate bug fix jiras marked fixed for 0.8.3. I
> don't
> > suspect all of these will (or should) make it into the release but this
> > should be a relatively complete list to work from:
> >
> >- KAFKA-2114 :
> Unable
> >to change min.insync.replicas default
> >- KAFKA-1702 :
> >Messages silently Lost by producer
> >- KAFKA-2012 :
> >Broker should automatically handle corrupt index files
> >- KAFKA-2406 : ISR
> >propagation should be throttled to avoid overwhelming controller.
> >- KAFKA-2336 :
> >Changing offsets.topic.num.partitions after the offset topic is
> created
> >breaks consumer group partition assignment
> >- KAFKA-2337 :
> Verify
> >that metric names will not collide when creating new topics
> >- KAFKA-2393 :
> >Correctly Handle InvalidTopicException in KafkaApis.getTopicMetadata()
> >- KAFKA-2189 :
> Snappy
> >compression of message batches less efficient in 0.8.2.1
> >- KAFKA-2308 : New
> >producer + Snappy face un-compression errors after broker restart
> >- KAFKA-2042 : New
> >producer metadata update always get all topics.
> >- KAFKA-1367 :
> Broker
> >topic metadata not kept in sync with ZooKeeper
> >- KAFKA-972 :
> MetadataRequest
> >returns stale list of brokers
> >- KAFKA-1867 :
> liveBroker
> >list not updated on a cluster with no topics
> >- KAFKA-1650 :
> Mirror
> >Maker could lose data on unclean shutdown.
> >- KAFKA-2009 : Fix
> >UncheckedOffset.removeOffset synchronization and trace logging issue
> in
> >mirror maker
> >- KAFKA-2407 : Only
> >create a log directory when it will be used
> >- KAFKA-2327 :
> >broker doesn't start if config defines advertised.host but not
> >advertised.port
> >- KAFKA-1788: producer record can stay in RecordAccumulator forever if
> >leader is not available
> >- KAFKA-2234 :
> >Partition reassignment of a nonexistent topic prevents future
> reassignments
> >- KAFKA-2096 :
> >Enable keepalive socket option for broker to prevent socket leak
> >- KAFKA-1057 : Trim
> >whitespaces from user specified configs
> >- KAFKA-1641 : Log
> >cleaner exits if last cleaned offset is lower than earliest offset
> >- KAFKA-1648 :
> Round
> >robin consumer balance throws an NPE when there are no topics
> >- KAFKA-1724 :
> >Errors after reboot in single node setup
> >- KAFKA-1758 :
> >corrupt recovery file prevents startup
> >- KAFKA-1866 :
> >LogStartOffset gauge throws exceptions after log.delete()
> >- KAFKA-1883 :
> NullPointerException
> >in RequestSendThread
> >- KAFKA-1896 :
> >Record size function of record in mirror maker hit NPE when the
> message
> >valu

[jira] [Commented] (KAFKA-1690) Add SSL support to Kafka Broker, Producer and Consumer

2015-08-17 Thread Dong Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700417#comment-14700417
 ] 

Dong Lin commented on KAFKA-1690:
-

[~harsha_ch] Can you tell me the hash that I can use to apply the latest patch 
to trunk?







[jira] [Commented] (KAFKA-1690) Add SSL support to Kafka Broker, Producer and Consumer

2015-08-17 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700431#comment-14700431
 ] 

Sriharsha Chintalapani commented on KAFKA-1690:
---

[~lindong] you should be able to apply against trunk and hash is 
63b89658bcb5fc2d95e10d28987337c3d971163f







[jira] [Commented] (KAFKA-1690) Add SSL support to Kafka Broker, Producer and Consumer

2015-08-17 Thread Dong Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700469#comment-14700469
 ] 

Dong Lin commented on KAFKA-1690:
-

[~sriharsha] I am not able to apply it against 
63b89658bcb5fc2d95e10d28987337c3d971163f -- there is too much conflict. Could 
you help verify it?







[jira] [Commented] (KAFKA-1387) Kafka getting stuck creating ephemeral node it has already created when two zookeeper sessions are established in a very short period of time

2015-08-17 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700483#comment-14700483
 ] 

Guozhang Wang commented on KAFKA-1387:
--

[~fpj] That makes sense. So it seems the right fix belongs in the ZkClient 
layer, not in Kafka itself?






[jira] [Commented] (KAFKA-1690) Add SSL support to Kafka Broker, Producer and Consumer

2015-08-17 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700488#comment-14700488
 ] 

Sriharsha Chintalapani commented on KAFKA-1690:
---

[~lindong] Strange, I was able to apply the latest diff against trunk. Are you 
using the latest revision from RB?







Re: [DISCUSS] Client-side Assignment for New Consumer

2015-08-17 Thread Ewen Cheslack-Postava
Becket,

Just to clarify, when topics are being created/resized for automated tests,
are those excluded by mirrormaker? I assume you don't want to bother with
copying them since they are just for tests. If they are excluded, then they
don't need to affect the metadata used by the consumer and so wouldn't
trigger rebalancing in the scheme proposed -- the hash being reported can
be restricted to the set of topics that will be processed by the consumer
group. Consumers should only ever need to rejoin the group if they detect a
metadata change that would affect the group's behavior.

-Ewen
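
To make the restriction concrete, a minimal sketch (hypothetical code, not the proposal's actual implementation; the class and method names are made up): hash only the subscribed topics, in a canonical order, so that churn in excluded topics (e.g. the automated-test topics mirrormaker skips) never changes the hash and therefore never triggers a rebalance.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class GroupMetadataHash {
    /** Hash the partition counts of subscribed topics only, in sorted order. */
    static String hashFor(Map<String, Integer> partitionCounts,
                          Set<String> subscribed) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        // TreeMap gives a canonical (sorted) iteration order, so every member
        // of the group computes the same hash for the same metadata.
        for (Map.Entry<String, Integer> e : new TreeMap<>(partitionCounts).entrySet()) {
            if (subscribed.contains(e.getKey())) {
                md.update((e.getKey() + ":" + e.getValue() + ";")
                        .getBytes(StandardCharsets.UTF_8));
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) sb.append(String.format("%02x", b));
        return sb.toString();
    }
}
```

With this shape, creating or deleting a topic outside the subscription leaves the hash unchanged, while a partition-count change on a subscribed topic produces a new hash and a rebalance.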

On Mon, Aug 17, 2015 at 11:14 AM, Jason Gustafson 
wrote:

> Hey Becket,
>
> Thanks for raising these concerns. I think the rolling upgrade problem is
> handled quite a bit better with the new consumer if Onur's LeaveGroup patch
> (or a variant) is accepted. Without it, the coordinator has to wait for the
> full session timeout to detect that a node has left the group, which can
> significantly delay rebalancing.
>
> -Jason
>
> On Sun, Aug 16, 2015 at 11:42 AM, Neha Narkhede  wrote:
>
> > Hey Becket,
> >
> > These are all fair points. Regarding running Kafka as a service, it will
> be
> > good for everyone to know some numbers around topic creation and changes
> > around # of partitions. I don't think the test usage is a good one since
> no
> > one should be creating and deleting topics in a loop on a production
> > cluster.
> >
> > Regarding your point about mirror maker, I think we should definitely
> > ensure that a mirror maker with a large topic subscription list should be
> > well supported. The reason for the half an hour stalls in LinkedIn mirror
> > makers is due to the fact that it still uses the old consumer. The 3
> major
> > reasons for long rebalance operations in the mirror maker using the old
> > consumer are -
> > 1. Every rebalance operation involves many writes to ZK. For large
> consumer
> > groups and large number of partitions, this adds up to several seconds
> > 2. The view of the members in a group is inconsistent and hence there are
> > typically several rebalance attempts. Note that this is fixed in the new
> > design where the group membership is always consistently communicated to
> > all members.
> > 3. The rebalance backoff time is high (of the order of seconds).
> >
> > The good news is that the new design has none of these problems. As we
> > said, we will run some tests to ensure that the mirror maker case is
> > handled well. Thank you for the feedback!
> >
> > Thanks,
> > Neha
> >
> > On Sat, Aug 15, 2015 at 6:17 PM, Jiangjie Qin  >
> > wrote:
> >
> > > Neha, Ewen and Jason,
> > >
> > > Maybe I am over concerning and I agree that it does depend on the
> > metadata
> > > change frequency. As Neha said, a few tests will be helpful. We can see
> > how
> > > it goes.
> > >
> > > What worries me is that in LinkedIn we are in the progress of running
> > Kafka
> > > as a service. That means user will have the ability to create/delete
> > topics
> > > and update their topic configurations programmatically or through UI
> > using
> > > LinkedIn internal cloud service platform. And some automated test will
> > also
> > > be consistently running to create topic, produce/consume some data,
> then
> > > clean up the topics. This brings us some new challenges for mirror
> maker
> > > because we need to make sure it actually copies data instead of
> spending
> > > too much time on rebalancing. What we see now is that each rebalance will
> > > probably take 30 seconds or more to finish, even with some tricks and
> > > tuning.
> > >
> > > Another related use case I want to bring up is that today when we want
> to
> > > do a rolling upgrade of mirror maker (26 nodes), there will be two
> rounds
> > > of rebalance to bounce each node, each rebalance takes about 30
> seconds.
> > So
> > > bouncing 26 nodes takes roughly half an hour. Awkwardly, during the
> > rolling
> > > bounce, because mirror maker is keeping rebalancing, it actually does
> not
> > > really copy any data! So the pipeline will literally stall for half an
> > > hour! Since we are designing the new protocol, it will also be good if we
> > > make sure this use case is addressed.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Sat, Aug 15, 2015 at 10:50 AM, Neha Narkhede 
> > wrote:
> > >
> > > > Becket,
> > > >
> > > > This is a clever approach for to ensure that only one thing
> > communicates
> > > > the metadata so even if it is stale, the entire group has the same
> > view.
> > > > However, the big assumption this makes is that the coordinator is
> that
> > > one
> > > > process that has the ability to know the metadata for group members,
> > > which
> > > > does not work for any non-consumer use case.
> > > >
> > > > I wonder if we may be complicating the design of 95% use cases for
> the
> > > > remaining 5%. For instance, how many times do people create and
> remove
> > > > topics or even add partitions? We op

[jira] [Comment Edited] (KAFKA-1690) Add SSL support to Kafka Broker, Producer and Consumer

2015-08-17 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700488#comment-14700488
 ] 

Sriharsha Chintalapani edited comment on KAFKA-1690 at 8/18/15 12:08 AM:
-

[~lindong] Strange, I was able to apply the latest diff against trunk. Are you 
using the latest revision from RB? The latest trunk hash is 
786867c2e18f79fa17be120f78a253bb9822a861


was (Author: sriharsha):
[~lindong] strange I was able to apply the latest diff against trunk are you 
using the latest revision from rb







[jira] [Commented] (KAFKA-1690) Add SSL support to Kafka Broker, Producer and Consumer

2015-08-17 Thread Dong Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700491#comment-14700491
 ] 

Dong Lin commented on KAFKA-1690:
-

[~sriharsha] The patch KAFKA-1690_2015-08-17_12-20-53.patch from this JIRA 
ticket doesn't apply, but the patch from Review Board works well. Thanks!







[jira] [Commented] (KAFKA-1690) Add SSL support to Kafka Broker, Producer and Consumer

2015-08-17 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700495#comment-14700495
 ] 

Ismael Juma commented on KAFKA-1690:


If you apply the patch in Review Board via `git apply`, it should work fine. 
The JIRA patch on the other hand didn't work for me (maybe this is the issue 
you're seeing?).







[jira] [Commented] (KAFKA-1690) Add SSL support to Kafka Broker, Producer and Consumer

2015-08-17 Thread Dong Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700499#comment-14700499
 ] 

Dong Lin commented on KAFKA-1690:
-

[~ijuma] Yes. I observed this as well.







Re: [Copycat] How will copycat serialize its metadata

2015-08-17 Thread Ewen Cheslack-Postava
@Neha, not sure what you mean by using base64 encoded strings. base64
encoding takes bytes and gives you ASCII text. We need to go from
arbitrarily structured offsets data to bytes (e.g. user has given us a
record (with schema they have defined) containing db name + table name for
the key, and another record (with schema they have defined) containing a
timestamp and value from an auto-incrementing column).

By the way, offset serialization comes with all the same schema challenges
as regular data does, although admittedly they may be less likely to see
upgrades to the schema. You need to either ship schemas with the data, have
a separate registry and just ship an ID, or have some way for the plugins
to provide their schemas to deserializers (perhaps with IDs, acting as an
embedded-in-code registry). It may be fine to choose JSON for all offset
data, but then that either requires committing to this ugly envelope
approach to include the schema with messages (and accept all that overhead)
or make separate offset deserializers APIs where you can pass in the schema
of the data to be decoded. Alternatively, you ignore schemas and hope the
connectors behave well (which was described as the loosey-goosey approach
in the other thread about handling connector config upgrades).

-Ewen
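
For illustration, the "envelope" approach described above amounts to shipping each payload together with its schema; a toy sketch (the field names are illustrative, not Copycat's actual wire format):

```java
final class OffsetEnvelope {
    /** Wrap connector-specific offset data together with its schema, so a
     *  reader that knows nothing about the connector can still parse it.
     *  This is the per-message overhead the envelope approach accepts. */
    static String envelope(String schemaJson, String payloadJson) {
        return "{\"schema\":" + schemaJson + ",\"payload\":" + payloadJson + "}";
    }

    public static void main(String[] args) {
        // e.g. a JDBC-style offset: table name plus auto-increment value
        System.out.println(envelope(
                "{\"type\":\"struct\",\"fields\":[\"table\",\"id\"]}",
                "{\"table\":\"users\",\"id\":123}"));
    }
}
```

The alternatives discussed in the thread trade this overhead away: a schema registry replaces the inline schema with an ID, and the loosey-goosey approach drops the schema entirely.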

On Sat, Aug 15, 2015 at 9:00 PM, Gwen Shapira  wrote:

> Yeah, I agree that if we have the ser/de we can do anything :)
>
> I'd actually feel more comfortable if the users *have* to go through our
> APIs to get to the metadata (which again, is kind of internal to Copycat).
> If they start writing their own code that depends on this data, who knows
> what we may accidentally break?
>
> I'd prefer a more well-defined contract here.
>
> The JSON shop should still feel fairly comfortable using REST APIs... most
> of them are :)
>
> On Fri, Aug 14, 2015 at 8:14 PM, Ewen Cheslack-Postava 
> wrote:
>
> > On Fri, Aug 14, 2015 at 6:35 PM, Gwen Shapira  wrote:
> >
> > > Yeah, I missed the option to match serialization of offsets to data,
> > which
> > > solves the configuration overhead.
> > >
> > > It still doesn't give us the ability to easily evolve the metadata
> > messages
> > > or to use them in monitoring tools.
> > >
> > > And I am still not clear of the benefits of using user-defined
> > > serialization for the offsets.
> > >
> >
> > If we can get at the serialization config (e.g. via the REST API), then
> we
> > can decode the data regardless of the format. The main drawback is that
> > then any tool that needs to decode them needs the serializer jars on its
> > classpath. I think the benefit is that it lets users process that data in
> > their preferred format if they want to build their own tools. For example
> > in a shop that prefers JSON, this be the difference between them
> > considering it easily accessible (they just read the topic and parse
> using
> > their favorite JSON library) vs not being accessible (they won't pull in
> > whatever serializer we use internally, or don't want to write a custom
> > parser for our custom serialization format).
> >
> > -Ewen
> >
> >
> > >
> > > Gwen
> > >
> > > On Fri, Aug 14, 2015 at 1:29 AM, Ewen Cheslack-Postava <
> > e...@confluent.io>
> > > wrote:
> > >
> > > > I'm not sure the existing discussion is clear about how the format of
> > > > offset data is decided. One possibility is that we choose one fixed
> > > format
> > > > and that is what we use internally to store offsets no matter what
> > > > serializer you choose. This would be similar to how the __offsets
> topic
> > > is
> > > > currently handled (with a custom serialization format). In other
> words,
> > > we
> > > > use format X to store offsets. If you serialize your data with Y or
> Z,
> > we
> > > > don't care, we still use format X. The other option (which is used in
> > the
> > > > current PR-99 patch) would still make offset serialization pluggable,
> > but
> > > > there wouldn't be a separate option for it. Offset serialization
> would
> > > use
> > > > the same format as the data serialization. If you use X for data, we
> > use
> > > X
> > > > for offsets; you use Y for data, we use Y for offsets.
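The difference between the two options can be sketched as follows. The `JsonConverter` and `OffsetStore` names are illustrative stand-ins, not Copycat's actual API; the sketch shows the second option, where offsets reuse whatever converter the connector's data uses, so there is no separate offset-format setting:

```python
import json

class JsonConverter:
    """Hypothetical pluggable serializer, standing in for a Copycat converter."""
    def serialize(self, value):
        return json.dumps(value).encode("utf-8")

    def deserialize(self, data):
        return json.loads(data.decode("utf-8"))

class OffsetStore:
    """Offsets are serialized with the same converter as the record data
    (format Y for data means format Y for offsets)."""
    def __init__(self, data_converter):
        self.converter = data_converter  # same instance used for record data
        self._store = {}                 # stands in for the offsets topic

    def commit(self, source_partition, offset):
        self._store[source_partition] = self.converter.serialize(offset)

    def read(self, source_partition):
        return self.converter.deserialize(self._store[source_partition])

store = OffsetStore(JsonConverter())
store.commit("file:/var/log/app.log", {"position": 4096})
print(store.read("file:/var/log/app.log"))  # {'position': 4096}
```

Under this option, a JSON shop can read the offsets topic directly with any JSON library; the first option (a fixed internal format X) would instead require the internal serializer on the classpath of every tool that decodes offsets.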
> > > >
> > > > @neha wrt providing access through a REST API, I guess you are
> > suggesting
> > > > that we can serialize that data to JSON for that API. I think it's
> > > > important to point out that this is arbitrarily structured,
> > > > connector-specific data. In many ways, it's not that different from
> the
> > > > actual message data in that it is highly dependent on the connector
> and
> > > > downstream consumers need to understand the connector and its data
> > format
> > > > to do anything meaningful with the data. Because of this, I'm not
> > > convinced
> > > > that serializing it in a format other than the one used for the data
> > will
> > > > be particularly useful.
> > > >
> > > >
> > > > On Thu, Aug 13, 2015 at 11:22 PM, Neha Narkhede 
> > > wrote:
> > > >
> > > > > Copycat enables streaming data in and

Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-17 Thread Grant Henke
Thanks Gwen.

I updated a few small things on the wiki page.

Below is a list of jiras I think could also be marked as included. All of
these, though not super critical, seem like fairly small and low risk
changes that help avoid potentially confusing issues or errors for users.

KAFKA-2012
KAFKA-972
KAFKA-2337 & KAFKA-2393
KAFKA-1867
KAFKA-2407
KAFKA-2234
KAFKA-1866
KAFKA-2345 & KAFKA-2355

thoughts?

Thank you,
Grant

On Mon, Aug 17, 2015 at 4:56 PM, Gwen Shapira  wrote:

> Thanks for creating a list, Grant!
>
> I placed it on the wiki with a quick evaluation of the content and whether
> it should be in 0.8.2.2:
>
> https://cwiki.apache.org/confluence/display/KAFKA/Proposed+patches+for+0.8.2.2
>
> I'm attempting to only cherrypick fixes that are both important for large
> number of users (or very critical to some users) and very safe (mostly
> judged by the size of the change, but not only)
>
> If your favorite bugfix is missing from the list, or is there but marked
> "No", please let us know (in this thread) what we are missing and why it is
> both important and safe.
> Also, if I accidentally included something you consider unsafe, speak up!
>
> Gwen
>
> On Mon, Aug 17, 2015 at 8:17 AM, Grant Henke  wrote:
>
> > +dev
> >
> > Adding dev list back in. Somehow it got dropped.
> >
> >
> > On Mon, Aug 17, 2015 at 10:16 AM, Grant Henke 
> wrote:
> >
> > > Below is a list of candidate bug fix jiras marked fixed for 0.8.3. I
> > don't
> > > suspect all of these will (or should) make it into the release but this
> > > should be a relatively complete list to work from:
> > >
> > >- KAFKA-2114 :
> > Unable
> > >to change min.insync.replicas default
> > >- KAFKA-1702 :
> > >Messages silently Lost by producer
> > >- KAFKA-2012 :
> > >Broker should automatically handle corrupt index files
> > >- KAFKA-2406 :
> ISR
> > >propagation should be throttled to avoid overwhelming controller.
> > >- KAFKA-2336 :
> > >Changing offsets.topic.num.partitions after the offset topic is
> > created
> > >breaks consumer group partition assignment
> > >- KAFKA-2337 :
> > Verify
> > >that metric names will not collide when creating new topics
> > >- KAFKA-2393 :
> > >Correctly Handle InvalidTopicException in
> KafkaApis.getTopicMetadata()
> > >- KAFKA-2189 :
> > Snappy
> > >compression of message batches less efficient in 0.8.2.1
> > >- KAFKA-2308 :
> New
> > >producer + Snappy face un-compression errors after broker restart
> > >- KAFKA-2042 :
> New
> > >producer metadata update always get all topics.
> > >- KAFKA-1367 :
> > Broker
> > >topic metadata not kept in sync with ZooKeeper
> > >- KAFKA-972 :
> > MetadataRequest
> > >returns stale list of brokers
> > >- KAFKA-1867 :
> > liveBroker
> > >list not updated on a cluster with no topics
> > >- KAFKA-1650 :
> > Mirror
> > >Maker could lose data on unclean shutdown.
> > >- KAFKA-2009 :
> Fix
> > >UncheckedOffset.removeOffset synchronization and trace logging issue
> > in
> > >mirror maker
> > >- KAFKA-2407 :
> Only
> > >create a log directory when it will be used
> > >- KAFKA-2327 :
> > >broker doesn't start if config defines advertised.host but not
> > >advertised.port
> > >- KAFKA-1788: producer record can stay in RecordAccumulator forever
> if
> > >leader is no available
> > >- KAFKA-2234 :
> > >Partition reassignment of a nonexistent topic prevents future
> > reassignments
> > >- KAFKA-2096 :
> > >Enable keepalive socket option for broker to prevent socket leak
> > >- KAFKA-1057 :
> Trim
> > >whitespaces from user specified configs
> > >- KAFKA-1641 :
> Log
> > >cleaner exits if last cleaned offset is lower than earliest offset
> > >- KAFKA-1648 :
> > Round
> > >robin consumer balance throws an NPE when there 

Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-17 Thread Jun Rao
Gwen,

Thanks for putting the list together.

I'd recommend that we exclude the following:
KAFKA-1702: This is for the old producer and is only a problem if there are
some unexpected exceptions (e.g. UnknownClass).
KAFKA-2336: Most people don't change offsets.topic.num.partitions.
KAFKA-1724: The patch there was never committed since the fix is included in
another jira (a much larger patch).
KAFKA-2241: This doesn't seem to be a common problem. It only happens when the
fetch request blocks on the broker for an extended period of time, which
should be rare.

I'd also recommend that we include the following:
KAFKA-2147: This impacts the memory size of the purgatory and a number of
people have experienced that. The fix is small and has been tested in
production usage. It hasn't been committed though since the issue is
already fixed in trunk and we weren't planning for an 0.8.2.2 release then.

Thanks,

Jun

On Mon, Aug 17, 2015 at 2:56 PM, Gwen Shapira  wrote:

> Thanks for creating a list, Grant!
>
> I placed it on the wiki with a quick evaluation of the content and whether
> it should be in 0.8.2.2:
>
> https://cwiki.apache.org/confluence/display/KAFKA/Proposed+patches+for+0.8.2.2
>
> I'm attempting to only cherrypick fixes that are both important for large
> number of users (or very critical to some users) and very safe (mostly
> judged by the size of the change, but not only)
>
> If your favorite bugfix is missing from the list, or is there but marked
> "No", please let us know (in this thread) what we are missing and why it is
> both important and safe.
> Also, if I accidentally included something you consider unsafe, speak up!
>
> Gwen
>
> On Mon, Aug 17, 2015 at 8:17 AM, Grant Henke  wrote:
>
> > +dev
> >
> > Adding dev list back in. Somehow it got dropped.
> >
> >
> > On Mon, Aug 17, 2015 at 10:16 AM, Grant Henke 
> wrote:
> >
> > > Below is a list of candidate bug fix jiras marked fixed for 0.8.3. I
> > don't
> > > suspect all of these will (or should) make it into the release but this
> > > should be a relatively complete list to work from:
> > >
> > >- KAFKA-2114 :
> > Unable
> > >to change min.insync.replicas default
> > >- KAFKA-1702 :
> > >Messages silently Lost by producer
> > >- KAFKA-2012 :
> > >Broker should automatically handle corrupt index files
> > >- KAFKA-2406 :
> ISR
> > >propagation should be throttled to avoid overwhelming controller.
> > >- KAFKA-2336 :
> > >Changing offsets.topic.num.partitions after the offset topic is
> > created
> > >breaks consumer group partition assignment
> > >- KAFKA-2337 :
> > Verify
> > >that metric names will not collide when creating new topics
> > >- KAFKA-2393 :
> > >Correctly Handle InvalidTopicException in
> KafkaApis.getTopicMetadata()
> > >- KAFKA-2189 :
> > Snappy
> > >compression of message batches less efficient in 0.8.2.1
> > >- KAFKA-2308 :
> New
> > >producer + Snappy face un-compression errors after broker restart
> > >- KAFKA-2042 :
> New
> > >producer metadata update always get all topics.
> > >- KAFKA-1367 :
> > Broker
> > >topic metadata not kept in sync with ZooKeeper
> > >- KAFKA-972 :
> > MetadataRequest
> > >returns stale list of brokers
> > >- KAFKA-1867 :
> > liveBroker
> > >list not updated on a cluster with no topics
> > >- KAFKA-1650 :
> > Mirror
> > >Maker could lose data on unclean shutdown.
> > >- KAFKA-2009 :
> Fix
> > >UncheckedOffset.removeOffset synchronization and trace logging issue
> > in
> > >mirror maker
> > >- KAFKA-2407 :
> Only
> > >create a log directory when it will be used
> > >- KAFKA-2327 :
> > >broker doesn't start if config defines advertised.host but not
> > >advertised.port
> > >- KAFKA-1788: producer record can stay in RecordAccumulator forever
> if
> > >leader is no available
> > >- KAFKA-2234 :
> > >Partition reassignment of a nonexistent topic prevents future
> > reassignments
> > >- KAFKA-2096 :
> > >En

Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-17 Thread Jun Rao
Hi, Grant,

I took a look at that list. None of those is really critical as you said.
So, I'd suggest that we not include those to minimize the scope of the
release.

Thanks,

Jun

On Mon, Aug 17, 2015 at 5:16 PM, Grant Henke  wrote:

> Thanks Gwen.
>
> I updated a few small things on the wiki page.
>
> Below is a list of jiras I think could also be marked as included. All of
> these, though not super critical, seem like fairly small and low risk
> changes that help avoid potentially confusing issues or errors for users.
>
> KAFKA-2012
> KAFKA-972
> KAFKA-2337 & KAFKA-2393
> KAFKA-1867
> KAFKA-2407
> KAFKA-2234
> KAFKA-1866
> KAFKA-2345 & KAFKA-2355
>
> thoughts?
>
> Thank you,
> Grant
>
> On Mon, Aug 17, 2015 at 4:56 PM, Gwen Shapira  wrote:
>
> > Thanks for creating a list, Grant!
> >
> > I placed it on the wiki with a quick evaluation of the content and
> whether
> > it should be in 0.8.2.2:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Proposed+patches+for+0.8.2.2
> >
> > I'm attempting to only cherrypick fixes that are both important for large
> > number of users (or very critical to some users) and very safe (mostly
> > judged by the size of the change, but not only)
> >
> > If your favorite bugfix is missing from the list, or is there but marked
> > "No", please let us know (in this thread) what we are missing and why it
> is
> > both important and safe.
> > Also, if I accidentally included something you consider unsafe, speak up!
> >
> > Gwen
> >
> > On Mon, Aug 17, 2015 at 8:17 AM, Grant Henke 
> wrote:
> >
> > > +dev
> > >
> > > Adding dev list back in. Somehow it got dropped.
> > >
> > >
> > > On Mon, Aug 17, 2015 at 10:16 AM, Grant Henke 
> > wrote:
> > >
> > > > Below is a list of candidate bug fix jiras marked fixed for 0.8.3. I
> > > don't
> > > > suspect all of these will (or should) make it into the release but
> this
> > > > should be a relatively complete list to work from:
> > > >
> > > >- KAFKA-2114 :
> > > Unable
> > > >to change min.insync.replicas default
> > > >- KAFKA-1702 :
> > > >Messages silently Lost by producer
> > > >- KAFKA-2012 :
> > > >Broker should automatically handle corrupt index files
> > > >- KAFKA-2406 :
> > ISR
> > > >propagation should be throttled to avoid overwhelming controller.
> > > >- KAFKA-2336 :
> > > >Changing offsets.topic.num.partitions after the offset topic is
> > > created
> > > >breaks consumer group partition assignment
> > > >- KAFKA-2337 :
> > > Verify
> > > >that metric names will not collide when creating new topics
> > > >- KAFKA-2393 :
> > > >Correctly Handle InvalidTopicException in
> > KafkaApis.getTopicMetadata()
> > > >- KAFKA-2189 :
> > > Snappy
> > > >compression of message batches less efficient in 0.8.2.1
> > > >- KAFKA-2308 :
> > New
> > > >producer + Snappy face un-compression errors after broker restart
> > > >- KAFKA-2042 :
> > New
> > > >producer metadata update always get all topics.
> > > >- KAFKA-1367 :
> > > Broker
> > > >topic metadata not kept in sync with ZooKeeper
> > > >- KAFKA-972 :
> > > MetadataRequest
> > > >returns stale list of brokers
> > > >- KAFKA-1867 :
> > > liveBroker
> > > >list not updated on a cluster with no topics
> > > >- KAFKA-1650 :
> > > Mirror
> > > >Maker could lose data on unclean shutdown.
> > > >- KAFKA-2009 :
> > Fix
> > > >UncheckedOffset.removeOffset synchronization and trace logging
> issue
> > > in
> > > >mirror maker
> > > >- KAFKA-2407 :
> > Only
> > > >create a log directory when it will be used
> > > >- KAFKA-2327 :
> > > >broker doesn't start if config defines advertised.host but not
> > > >advertised.port
> > > >- KAFKA-1788: producer record can stay in RecordAccumulator
> forever
> > if
> > > >leader is no available
> > > >- KAFKA-2234 :
> > > >Partition reassignment of a nonexistent topic prevents future
> > > reassignments
> > > >- KAFKA-2096 :
> > > >Enable keepalive

[GitHub] kafka pull request: KAFKA-2203: Getting Java8 to relax about javad...

2015-08-17 Thread gwenshap
GitHub user gwenshap opened a pull request:

https://github.com/apache/kafka/pull/147

KAFKA-2203: Getting Java8 to relax about javadoc and let our build pass

This patch is different than the one attached to the JIRA - I'm applying 
the new javadoc rules to all subprojects while the one in the JIRA applies only 
to "clients". We need this since Copycat  has the same issues.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gwenshap/kafka KAFKA-2203

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/147.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #147


commit 4f925d96457314bd42157dae8c8c40c5c08eda39
Author: Gwen Shapira 
Date:   2015-08-18T01:05:13Z

KAFKA-2203: Getting Java8 to relax about javadoc and let our build pass




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2203) Get gradle build to work with Java 8

2015-08-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700554#comment-14700554
 ] 

ASF GitHub Bot commented on KAFKA-2203:
---

GitHub user gwenshap opened a pull request:

https://github.com/apache/kafka/pull/147

KAFKA-2203: Getting Java8 to relax about javadoc and let our build pass

This patch is different than the one attached to the JIRA - I'm applying 
the new javadoc rules to all subprojects while the one in the JIRA applies only 
to "clients". We need this since Copycat  has the same issues.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gwenshap/kafka KAFKA-2203

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/147.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #147


commit 4f925d96457314bd42157dae8c8c40c5c08eda39
Author: Gwen Shapira 
Date:   2015-08-18T01:05:13Z

KAFKA-2203: Getting Java8 to relax about javadoc and let our build pass




> Get gradle build to work with Java 8
> 
>
> Key: KAFKA-2203
> URL: https://issues.apache.org/jira/browse/KAFKA-2203
> Project: Kafka
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.8.1.1
>Reporter: Gaju Bhat
>Priority: Minor
> Fix For: 0.8.1.2
>
> Attachments: 0001-Special-case-java-8-and-javadoc-handling.patch
>
>
> The gradle build halts because javadoc in java 8 is a lot stricter about 
> valid html.
> It might be worthwhile to special case java 8 as described 
> [here|http://blog.joda.org/2014/02/turning-off-doclint-in-jdk-8-javadoc.html].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
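The special-casing described in the JIRA amounts to a small build change. The snippet below is an illustrative sketch of the approach from the linked blog post (disabling JDK 8's doclint), not the committed patch:

```groovy
// build.gradle -- applied to allprojects so subprojects such as clients
// and copycat all get the relaxed setting (illustrative sketch only)
allprojects {
  if (JavaVersion.current().isJava8Compatible()) {
    tasks.withType(Javadoc) {
      // turn off JDK 8's strict doclint checks, which fail the build
      // on malformed HTML in javadoc comments
      options.addStringOption('Xdoclint:none', '-quiet')
    }
  }
}
```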


[jira] [Updated] (KAFKA-2436) log.retention.hours should be honored by LogManager

2015-08-17 Thread Dong Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin updated KAFKA-2436:

Status: Patch Available  (was: In Progress)

> log.retention.hours should be honored by LogManager
> ---
>
> Key: KAFKA-2436
> URL: https://issues.apache.org/jira/browse/KAFKA-2436
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
>
> Currently log.retention.hours is used to calculate 
> KafkaConfig.logRetentionTimeMillis. But it is not used in LogManager to 
> decide when to delete a log. LogManager is only using the log.retention.ms in 
> the broker configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
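The intended behavior — the most granular retention property wins, with log.retention.hours honored as the fallback — can be sketched as follows. This is an illustrative sketch of the config precedence, not the actual LogManager code:

```python
def effective_retention_ms(config):
    """Resolve log retention from broker properties: log.retention.ms
    overrides log.retention.minutes, which overrides log.retention.hours
    (default 168 hours, i.e. 7 days)."""
    if "log.retention.ms" in config:
        return int(config["log.retention.ms"])
    if "log.retention.minutes" in config:
        return int(config["log.retention.minutes"]) * 60 * 1000
    return int(config.get("log.retention.hours", 168)) * 60 * 60 * 1000

# log.retention.hours is honored when no finer-grained override is set
print(effective_retention_ms({"log.retention.hours": "24"}))  # 86400000
```

The bug is that LogManager consulted only log.retention.ms, so a broker configured with just log.retention.hours never deleted logs on that basis.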


[jira] [Updated] (KAFKA-2436) log.retention.hours should be honored by LogManager

2015-08-17 Thread Dong Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin updated KAFKA-2436:

Priority: Critical  (was: Major)

> log.retention.hours should be honored by LogManager
> ---
>
> Key: KAFKA-2436
> URL: https://issues.apache.org/jira/browse/KAFKA-2436
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
>Priority: Critical
>
> Currently log.retention.hours is used to calculate 
> KafkaConfig.logRetentionTimeMillis. But it is not used in LogManager to 
> decide when to delete a log. LogManager is only using the log.retention.ms in 
> the broker configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (KAFKA-1788) producer record can stay in RecordAccumulator forever if leader is no available

2015-08-17 Thread Manikumar Reddy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar Reddy reopened KAFKA-1788:


> producer record can stay in RecordAccumulator forever if leader is no 
> available
> ---
>
> Key: KAFKA-1788
> URL: https://issues.apache.org/jira/browse/KAFKA-1788
> Project: Kafka
>  Issue Type: Bug
>  Components: core, producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: Parth Brahmbhatt
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: KAFKA-1788.patch, KAFKA-1788_2015-01-06_13:42:37.patch, 
> KAFKA-1788_2015-01-06_13:44:41.patch
>
>
> In the new producer, when a partition has no leader for a long time (e.g., 
> all replicas are down), the records for that partition will stay in the 
> RecordAccumulator until the leader is available. This may cause the 
> bufferpool to be full and the callback for the produced message to block for 
> a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-1788) producer record can stay in RecordAccumulator forever if leader is no available

2015-08-17 Thread Manikumar Reddy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar Reddy resolved KAFKA-1788.

Resolution: Duplicate

> producer record can stay in RecordAccumulator forever if leader is no 
> available
> ---
>
> Key: KAFKA-1788
> URL: https://issues.apache.org/jira/browse/KAFKA-1788
> Project: Kafka
>  Issue Type: Bug
>  Components: core, producer 
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: Parth Brahmbhatt
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: KAFKA-1788.patch, KAFKA-1788_2015-01-06_13:42:37.patch, 
> KAFKA-1788_2015-01-06_13:44:41.patch
>
>
> In the new producer, when a partition has no leader for a long time (e.g., 
> all replicas are down), the records for that partition will stay in the 
> RecordAccumulator until the leader is available. This may cause the 
> bufferpool to be full and the callback for the produced message to block for 
> a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2203) Get gradle build to work with Java 8

2015-08-17 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700728#comment-14700728
 ] 

Allen Wittenauer commented on KAFKA-2203:
-

FYI, the precommit code that we're putting in front of the board as a potential 
TLP (see HADOOP-12111) already has support to do tests under multiple JDKs.  
I'm currently in the process of adding gradle support, so hopefully in the 
future it will cover the exact concerns mentioned here.

> Get gradle build to work with Java 8
> 
>
> Key: KAFKA-2203
> URL: https://issues.apache.org/jira/browse/KAFKA-2203
> Project: Kafka
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.8.1.1
>Reporter: Gaju Bhat
>Priority: Minor
> Fix For: 0.8.1.2
>
> Attachments: 0001-Special-case-java-8-and-javadoc-handling.patch
>
>
> The gradle build halts because javadoc in java 8 is a lot stricter about 
> valid html.
> It might be worthwhile to special case java 8 as described 
> [here|http://blog.joda.org/2014/02/turning-off-doclint-in-jdk-8-javadoc.html].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] Client-side Assignment for New Consumer

2015-08-17 Thread Jiangjie Qin
Ewen,

Honestly I am not sure whether we can keep all the topics in the same
cluster or not. By default, we will mirror the topics. Because when
application teams run their tests, they would expect the environment to be
as similar to production as possible. Also, we are developing a
comprehensive continuous verification test framework as well. There will be
a bunch of tests continuously running, including some tests that
periodically create topics, produce some data, verify the data integrity and
performance statistics, then do the clean up. The test framework will also
cover the entire pipeline, not only one cluster.

Another use case for wildcards besides mirror maker is our auditing
service. Currently our auditing framework also reads from all the topics
and compares the message count with the producers' count to see if there is
any data loss.

So I think for some topics we can let mirror maker exclude them, but it
might be a little bit difficult to exclude everything from all wildcard
consumer groups.

Thanks,

Jiangjie (Becket) Qin



On Mon, Aug 17, 2015 at 5:06 PM, Ewen Cheslack-Postava 
wrote:

> Becket,
>
> Just to clarify, when topics are being created/resized for automated tests,
> are those excluded by mirrormaker? I assume you don't want to bother with
> copying them since they are just for tests. If they are excluded, then they
> don't need to affect the metadata used by the consumer and so wouldn't
> trigger rebalancing in the scheme proposed -- the hash being reported can
> be restricted to the set of topics that will be processed by the consumer
> group. Consumers should only ever need to rejoin the group if they detect a
> metadata change that would affect the group's behavior.
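The change-detection idea described above — restricting the reported hash to the topics the group actually consumes — can be sketched as follows. This is illustrative only, not the actual group protocol:

```python
import hashlib

def subscription_metadata_hash(cluster_metadata, subscribed_topics):
    """Hash only the metadata for subscribed topics (here, partition counts),
    so churn in unrelated topics (e.g. short-lived test topics) never
    changes the hash and never triggers a rebalance."""
    relevant = sorted(
        (topic, cluster_metadata[topic])
        for topic in subscribed_topics
        if topic in cluster_metadata
    )
    return hashlib.sha256(repr(relevant).encode("utf-8")).hexdigest()

before = subscription_metadata_hash({"orders": 8, "test-123": 1}, {"orders"})
# creating/deleting a topic outside the subscription leaves the hash unchanged
after = subscription_metadata_hash({"orders": 8, "test-456": 1}, {"orders"})
print(before == after)  # True
```

A consumer would rejoin the group only when the hash over its own subscription changes, e.g. when a subscribed topic gains partitions.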
>
> -Ewen
>
> On Mon, Aug 17, 2015 at 11:14 AM, Jason Gustafson 
> wrote:
>
> > Hey Becket,
> >
> > Thanks for raising these concerns. I think the rolling upgrade problem is
> > handled quite a bit better with the new consumer if Onur's LeaveGroup
> patch
> > (or a variant) is accepted. Without it, the coordinator has to wait for
> the
> > full session timeout to detect that a node has left the group, which can
> > significantly delay rebalancing.
> >
> > -Jason
> >
> > On Sun, Aug 16, 2015 at 11:42 AM, Neha Narkhede 
> wrote:
> >
> > > Hey Becket,
> > >
> > > These are all fair points. Regarding running Kafka as a service, it
> will
> > be
> > > good for everyone to know some numbers around topic creation and
> changes
> > > around # of partitions. I don't think the test usage is a good one
> since
> > no
> > > one should be creating and deleting topics in a loop on a production
> > > cluster.
> > >
> > > Regarding your point about mirror maker, I think we should definitely
> > > ensure that a mirror maker with a large topic subscription list should
> be
> > > well supported. The reason for the half an hour stalls in LinkedIn
> mirror
> > > makers is due to the fact that it still uses the old consumer. The 3
> > major
> > > reasons for long rebalance operations in the mirror maker using the old
> > > consumer are -
> > > 1. Every rebalance operation involves many writes to ZK. For large
> > consumer
> > > groups and large number of partitions, this adds up to several seconds
> > > 2. The view of the members in a group is inconsistent and hence there
> are
> > > typically several rebalance attempts. Note that this is fixed in the
> new
> > > design where the group membership is always consistently communicated
> to
> > > all members.
> > > 3. The rebalance backoff time is high (of the order of seconds).
> > >
> > > The good news is that the new design has none of these problems. As we
> > > said, we will run some tests to ensure that the mirror maker case is
> > > handled well. Thank you for the feedback!
> > >
> > > Thanks,
> > > Neha
> > >
> > > On Sat, Aug 15, 2015 at 6:17 PM, Jiangjie Qin
>  > >
> > > wrote:
> > >
> > > > Neha, Ewen and Jason,
> > > >
> > > > Maybe I am over concerning and I agree that it does depend on the
> > > metadata
> > > > change frequency. As Neha said, a few tests will be helpful. We can
> see
> > > how
> > > > it goes.
> > > >
> > > > What worries me is that at LinkedIn we are in the process of running
> > > Kafka
> > > > as a service. That means user will have the ability to create/delete
> > > topics
> > > > and update their topic configurations programmatically or through UI
> > > using
> > > > LinkedIn internal cloud service platform. And some automated test
> will
> > > also
> > > > be consistently running to create topic, produce/consume some data,
> > then
> > > > clean up the topics. This brings us some new challenges for mirror
> > maker
> > > > because we need to make sure it actually copies data instead of
> > spending
> > > > too much time on rebalance. As what we see now is that each rebalance
> > > will
> > > > probably take 30 seconds or more to finish even with some tricks and
> > > > tuning.
> > > >
> > > > Another related use case I want to bring up is that today whe