Re: [DISCUSS] KIP-27 - Conditional Publish

2015-08-11 Thread Ben Kirwin
This is a very nice summary of the consistency / correctness issues
possible with a commit log.

> (assuming it’s publishing asynchronously and in an open loop)

It's perhaps already clear to folks here, but -- if you *don't* do that,
and instead only send one batch of messages at a time and check the result,
you don't have the interleaving issue. (Of course, that means you give up
pipelining batches...)
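
To make that concrete, here is a minimal sketch of the "one batch in flight at a
time" pattern with the Java producer -- the class, method, topic and payload names
are placeholders for illustration, not anything from the KIP:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Future;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

// One batch outstanding at a time: every ack is checked before the next batch
// is sent, so a failure is detected before anything later goes out (at the
// cost of pipelining).
class OneBatchAtATime {
    static void publish(KafkaProducer<String, String> producer,
                        String topic, List<List<String>> batches) throws Exception {
        for (List<String> batch : batches) {
            List<Future<RecordMetadata>> acks = new ArrayList<>();
            for (String value : batch)
                acks.add(producer.send(new ProducerRecord<String, String>(topic, value)));
            for (Future<RecordMetadata> ack : acks)
                ack.get();  // fail (and stop) if any send in this batch failed
        }
    }
}
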
On Aug 10, 2015 2:46 PM, "Flavio Junqueira"  wrote:

> I've been trying to understand what is being proposed in this KIP and I've
> put down some notes with some feedback from Ben that I wanted to share for
> feedback. I'm not really following the flow of the thread, since I've read
> a few sources to get to this, and I apologize for that.
>
> Here is how I see it at a high level. There are really two problems being
> discussed in the context of this KIP:
> 1. Single writer with failover
> 2. Consistent logs
>
> Single writer with failover
> The idea is that at any time there must be at most one publisher active.
> To get high availability, we can’t rely on a single process to be such a
> publisher and consequently we need the failover part: if the current active
> publisher crashes, then another publisher takes over and becomes active.
> One important issue with scenarios like this is that during transitions
> from one active publisher to another, there could be races in which two
> publishers end up interleaving messages in a topic/partition/key.
>
> Why is this interleaving bad? This is really application specific, but one
> general way of seeing this is that only one process has the authoritative
> application state to generate messages to publish. Transitioning from one
> active publisher to another typically requires recovering state or
> performing some kind of coordination. If no such recovery is required, then
> we are essentially in the multi-writer space. The commit log use case is a
> general one mentioned in the KIP description.
>
> Consistent logs
> Consistent logs might not be the best term here, but I’m using it to
> describe the need for the messages in a topic/partition/key to
> consistently reflect the state of the application. For example, some
> applications might be OK with a published sequence:
>
> A B B C (e.g., value = 10)
>
> in the case the messages are idempotent operations, but others might
> really require:
>
> A B C (e.g., value += 10)
>
> if they aren’t idempotent operations. Order and gaps are also an issue, so
> some applications might be OK with:
>
> A C B (e.g., value += x)
>
> and skipping B altogether might be ok if B has no side-effects (e.g.,
> operation associated to B has failed).
>
> Putting things together
> The current KIP-27 proposal seems to do a good job with providing a
> consistent log in the absence of concurrency. It enables publishers to
> re-publish messages without duplication, which is one requirement for
> exactly-once semantics. Gaps need to be handled by the publisher. For
> example, if the publisher publishes A B C (assuming it’s publishing
> asynchronously and in an open loop), it could have A succeeding but not B
> and C. In this case, it needs to redo the publish of B and C. It could also
> have B failing and C succeeding, in which case the publisher repeats B and
> C.
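
To make the redo logic above concrete, here is a hypothetical sketch.
ConditionalLog and OffsetMismatchException are made-up stand-ins for the
conditional publish proposed in KIP-27; they are not an existing Kafka API.

// Hypothetical sketch only: tryAppend succeeds iff the partition's next offset
// equals 'expected' (mirroring the KIP-27 expected-offset check) and returns
// the next offset to use after the append.
interface ConditionalLog {
    long tryAppend(byte[] value, long expected) throws OffsetMismatchException;
}

class OffsetMismatchException extends Exception {
    final long actualNextOffset;
    OffsetMismatchException(long actualNextOffset) { this.actualNextOffset = actualNextOffset; }
}

class GaplessPublisher {
    // Publishes the messages in order; whenever an append is rejected, it
    // re-reads the partition's actual next offset and retries from the first
    // unacknowledged message, so a gap (A landed but B and C did not) gets
    // filled by redoing B and C.
    static void publishAll(ConditionalLog log, byte[][] messages, long startOffset) {
        long expected = startOffset;
        int i = 0;
        while (i < messages.length) {
            try {
                expected = log.tryAppend(messages[i], expected);
                i++;                              // acknowledged, move on
            } catch (OffsetMismatchException e) {
                expected = e.actualNextOffset;    // resync and retry the same message
            }
        }
    }
}
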
>
> A really nice feature of the current proposal is that it is a simple
> primitive that enables the implementation of publishers with different
> delivery guarantees. It doesn’t seem to be well suited to the first problem
> of implementing a single writer with failover, however. It allows runs in
> which two producers interleave messages because the mechanism focuses on a
> single message. The single writer might not even care about duplicates and
> gaps depending on the application, but it might care that there aren’t two
> publishers interleaving messages in the Kafka log.
>
> A typical way of dealing with these cases is to use a token associated with
> a lease to fence off the other publishers. For example, to demote an active
> publisher, another publisher could invoke a demote call and have the ISR
> leader replace the token. The lease of the token could be done directly
> with ZooKeeper or via the ISR leader. The condition to publish a message or
> a batch could be a combination of token verification and offset check.
>
> -Flavio
>
> > On 10 Aug 2015, at 00:15, Jun Rao  wrote:
> >
> > Couple of other things.
> >
> > A. In the discussion, we talked about the usage of getting the latest high
> > watermark from the broker. Currently, the high watermark in a partition can
> > go back a bit for a short period of time during leader change. So, the high
> > watermark returned in the getOffset api is not 100% accurate. There is a
> > jira (KAFKA-2334) to track this issue.
> >
> > B. The proposal in the wiki is to put the expected offset in every message,
> > even when the messages are compressed. With Jiangjie's proposal of relative
> > offset, the expected offset probably can

Re: [jira] [Commented] (KAFKA-1690) new java producer needs ssl support as a client

2015-08-11 Thread Rajini Sivaram
Harsha,

The test is very timing sensitive and doesn't always go through a
renegotiation. Here is the trace from a run that passed and a failed run (I
added the logging to the end of SSLTransportLayer.handshake()). The
successful run shows a single handshake at the start, the failed run that
hangs shows a second handshake from the renegotiation. Can you check that
your test runs do go through two handshakes and whether appReadBuffer has
any data at the end of the second handshake?
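
For reference, the traces below come from a line along these lines added at the
end of handshake(); the exact field names in SSLTransportLayer may differ:

// temporary debug output appended to SSLTransportLayer.handshake()
System.out.println("handshake() status=" + handshakeStatus + " appReadBuffer=" + appReadBuffer);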

Successful run:

handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=0 lim=16916 cap=16916]
handshake() status=NEED_WRAP appReadBuffer=java.nio.DirectByteBuffer[pos=0
lim=16916 cap=16916]
handshake() status=NEED_WRAP appReadBuffer=java.nio.DirectByteBuffer[pos=0
lim=16916 cap=16916]
handshake() status=NEED_WRAP appReadBuffer=java.nio.DirectByteBuffer[pos=0
lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=0 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=0 lim=16916 cap=16916]
handshake() status=FINISHED appReadBuffer=java.nio.DirectByteBuffer[pos=0
lim=16916 cap=16916]  <== End of first handshake

 Failed run that hangs:

handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=0 lim=16916 cap=16916]
handshake() status=NEED_WRAP appReadBuffer=java.nio.DirectByteBuffer[pos=0
lim=16916 cap=16916]
handshake() status=NEED_WRAP appReadBuffer=java.nio.DirectByteBuffer[pos=0
lim=16916 cap=16916]
handshake() status=NEED_WRAP appReadBuffer=java.nio.DirectByteBuffer[pos=0
lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=0 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=0 lim=16916 cap=16916]
handshake() status=FINISHED appReadBuffer=java.nio.DirectByteBuffer[pos=0
lim=16916 cap=16916]   <== End of first handshake
handshake() status=NEED_WRAP appReadBuffer=java.nio.DirectByteBuffer[pos=0
lim=16916 cap=16916]  <== Start of renegotiation handshake
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=29 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=38 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=40 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=45 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=46 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=48 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=54 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=55 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=57 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=63 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=64 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=66 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=72 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=72 lim=16916 cap=16916]
handshake() status=NEED_UNWRAP
appReadBuffer=java.nio.DirectByteBuffer[pos=72 lim=16916 cap=16916]
handshake() status=NEED_WRAP appReadBuffer=java.nio.DirectByteBuffer[pos=72
lim=16916 cap=16916]
handshake() status=NEED_WRAP appReadBuffer=java.nio.DirectByteBuffer[pos=72
lim=16916 cap=16916]
handshake() status=FINISHED appReadBuffer=java.nio.DirectByteBuffer[pos=72
lim=16916 cap=16916]
handshake() status=NOT_HANDSHAKING
appReadBuffer=java.nio.DirectByteBuffer[pos=72 lim=16916 cap=16916] <== End
of renegotiation handshake, appReadBuffer contains data



I have tried with IBM JDK 7.1 and IBM JDK 8.0 on Windows, as well as
OpenJDK 8.0 on Linux (see versions below). All of them hang intermittently.

Windows:

java version "1.7.0"
Java(TM) SE Runtime Environment (build pwa6470_27sr3fp10-20150708_01(SR3
FP10))
IBM J9 VM (build 2.7, JRE 1.7.0 Windows 7 amd64-64 Compressed References
20150630_255653 (JIT enabled, AOT enabled)
J9VM - R27_Java727_SR3_20150630_2236_B255653
JIT  - tr.r13.java_20150623_94888.01
GC   - R27_Java727_SR3_20150630_2236_B255653_CMPRSS
J9CL - 20150630_255653)
JCL - 20150628_01 based on Oracle jdk7u85-b15

java version "1.8.0"
Java(TM) SE Runtime Environment (build pwa6480sr1fp10-20150711_01(SR1 FP10))
IBM J9 VM (build 2.8, JRE 1.8.0 Windows 7 amd64-64 Compressed References
20150630_255633 (JIT enabled, AOT enabled)
J9VM - R28_jvm.28_20150630_1742_B255633
JIT  - tr.r14.java_20150625_95081.01
GC   - R28_jvm.28_20150630_1742_B255633_CMPRSS
J9CL - 20150630_255633)
JCL - 20150711_01 based on Oracle jdk8u

[jira] [Commented] (KAFKA-2078) Getting Selector [WARN] Error in I/O with host java.io.EOFException

2015-08-11 Thread PC (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681680#comment-14681680
 ] 

PC commented on KAFKA-2078:
---

I can reproduce this bug though it appears to be a challenge to do so.
Running on Mac OS X 10.9.5, 16GB RAM
Java version 1.8.0_40

It only appears to affect the Producer; 
org.apache.kafka.clients.producer.KafkaProducer 0.8.2.1

Setup:

3 producers pumping test data to one kafka-server, with 1 replica, all running 
locally on the same machine. Each producer uses the async 
.send(producerRecord, callBack) method (sketched below).
The configs are at the bottom of this post.
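
The async pattern each producer uses looks roughly like this (a sketch; the
producer instance, the payload and the logger are placeholders for the actual
test code):

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

producer.send(new ProducerRecord<String, String>("test", payload), new Callback() {
    @Override
    public void onCompletion(RecordMetadata metadata, Exception exception) {
        if (exception != null) {
            // failures are reported here instead of being thrown to the caller
            log.warn("send failed", exception);
        } else {
            // this is where the PumpSuccess lines below come from
            log.debug("PumpSuccess topic: {} partition {} offset: {}",
                      metadata.topic(), metadata.partition(), metadata.offset());
        }
    }
});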

Here is a log snippet:

16:21:51.527 [message-consumer-akka.actor.default-dispatcher-5] DEBUG producer 
- PumpSuccess topic: test partition 0 offset: 3330477
16:21:51.528 [message-consumer-akka.actor.default-dispatcher-5] DEBUG producer 
- PumpSuccess topic: test partition 0 offset: 3330478
16:21:51.528 [message-consumer-akka.actor.default-dispatcher-5] DEBUG producer 
- PumpSuccess topic: test partition 0 offset: 3330479
16:21:51.528 [message-consumer-akka.actor.default-dispatcher-5] DEBUG producer 
- PumpSuccess topic: test partition 0 offset: 3330480
16:21:51.528 [message-consumer-akka.actor.default-dispatcher-5] DEBUG producer 
- PumpSuccess topic: test partition 0 offset: 3330481
16:26:26.220 [kafka-producer-network-thread | producer-3] WARN  
o.a.kafka.common.network.Selector - Error in I/O with localhost/127.0.0.1
java.io.EOFException: null
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) 
~[kafka-clients-0.8.2.1.jar:na]
at org.apache.kafka.common.network.Selector.poll(Selector.java:248) 
~[kafka-clients-0.8.2.1.jar:na]
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) 
[kafka-clients-0.8.2.1.jar:na]
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) 
[kafka-clients-0.8.2.1.jar:na]
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) 
[kafka-clients-0.8.2.1.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
16:26:26.220 [kafka-producer-network-thread | producer-2] WARN  
o.a.kafka.common.network.Selector - Error in I/O with localhost/127.0.0.1
java.io.EOFException: null
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) 
~[kafka-clients-0.8.2.1.jar:na]
at org.apache.kafka.common.network.Selector.poll(Selector.java:248) 
~[kafka-clients-0.8.2.1.jar:na]
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) 
[kafka-clients-0.8.2.1.jar:na]
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) 
[kafka-clients-0.8.2.1.jar:na]
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) 
[kafka-clients-0.8.2.1.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
16:26:26.220 [kafka-producer-network-thread | producer-1] WARN  
o.a.kafka.common.network.Selector - Error in I/O with localhost/127.0.0.1
java.io.EOFException: null
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) 
~[kafka-clients-0.8.2.1.jar:na]
at org.apache.kafka.common.network.Selector.poll(Selector.java:248) 
~[kafka-clients-0.8.2.1.jar:na]
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) 
[kafka-clients-0.8.2.1.jar:na]
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) 
[kafka-clients-0.8.2.1.jar:na]
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) 
[kafka-clients-0.8.2.1.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]

Pay attention to the timestamps. Less than 5 minutes after the producers were 
FINISHED pumping the data, these 3 exceptions were logged by the kafka-producer 
internals.

Worse, this bug also occurred while pumping messages to the broker, 2 days ago. 
The callback code was not called for 3 messages (1 per producer) when this bug 
kicked in, nor was an exception thrown in my application. This can potentially 
lead to serious data loss and has severe implications.

I would, in a heartbeat, upgrade this bug to SEVERE/CRITICAL rather than Major.

A temporary (unacceptable) workaround is to block on each send with a timeout, 
to ensure we didn't lose data when this bug manifests itself:

import java.util.concurrent.TimeUnit

try {
  // block until the broker acks (or 5 seconds elapse) instead of relying on the callback
  kafkaProducer.send(record, callBack).get(5, TimeUnit.SECONDS)
} catch {
  case e: Exception => // log / retry the failed send here
}

This approach reduces the pumping throughput to roughly ~5k messages/sec, 
from ~60k messages/sec with the pure async send, for a single producer.

Config properties:

Kafka-Server:
broker.id=0
port=9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
#l

[jira] [Commented] (KAFKA-2078) Getting Selector [WARN] Error in I/O with host java.io.EOFException

2015-08-11 Thread PC (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681847#comment-14681847
 ] 

PC commented on KAFKA-2078:
---

Hi again,

It just happened again. This time, pumped only 4 messages. Again, look at the 
timestamps:

15:57:20,559 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could 
NOT find resource [logback.groovy]
15:57:20,559 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could 
NOT find resource [logback-test.xml]
15:57:20,560 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found 
resource [logback.xml] at 
[file:/Users/pascal/Projects/frida-core/target/scala-2.10/classes/logback.xml]
15:57:20,724 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction 
- debug attribute not set
15:57:20,727 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About 
to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
15:57:20,737 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming 
appender as [STDOUT]
15:57:20,790 |-WARN in ch.qos.logback.core.ConsoleAppender[STDOUT] - This 
appender no longer admits a layout as a sub-component, set an encoder instead.
15:57:20,790 |-WARN in ch.qos.logback.core.ConsoleAppender[STDOUT] - To ensure 
compatibility, wrapping your layout in LayoutWrappingEncoder.
15:57:20,790 |-WARN in ch.qos.logback.core.ConsoleAppender[STDOUT] - See also 
http://logback.qos.ch/codes.html#layoutInsteadOfEncoder for details
15:57:20,791 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - 
Setting level of logger [consumer] to DEBUG
15:57:20,791 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - 
Setting level of logger [producer] to DEBUG
15:57:20,791 |-INFO in ch.qos.logback.classic.joran.action.RootLoggerAction - 
Setting level of ROOT logger to WARN
15:57:20,791 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - 
Attaching appender named [STDOUT] to Logger[ROOT]
15:57:20,792 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction 
- End of configuration.
15:57:20,793 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@38ddb44b 
- Registering current configuration as safe fallback point

15:57:21.057 [kafka-producer-network-thread | producer-2] DEBUG producer - 
PumpSuccess topic[test] partition[0] offset[582177726]
15:57:21.057 [kafka-producer-network-thread | producer-1] DEBUG producer - 
PumpSuccess topic[test] partition[0] offset[582177728]
15:57:21.057 [kafka-producer-network-thread | producer-4] DEBUG producer - 
PumpSuccess topic[test] partition[0] offset[582177725]
15:57:21.057 [kafka-producer-network-thread | producer-3] DEBUG producer - 
PumpSuccess topic[test] partition[0] offset[582177727]
16:08:39.667 [kafka-producer-network-thread | producer-2] WARN  
o.a.kafka.common.network.Selector - Error in I/O with localhost/127.0.0.1
java.io.EOFException: null
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) 
~[kafka-clients-0.8.2.1.jar:na]
at org.apache.kafka.common.network.Selector.poll(Selector.java:248) 
~[kafka-clients-0.8.2.1.jar:na]
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) 
[kafka-clients-0.8.2.1.jar:na]
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) 
[kafka-clients-0.8.2.1.jar:na]
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) 
[kafka-clients-0.8.2.1.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
16:08:39.667 [kafka-producer-network-thread | producer-4] WARN  
o.a.kafka.common.network.Selector - Error in I/O with localhost/127.0.0.1
java.io.EOFException: null
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) 
~[kafka-clients-0.8.2.1.jar:na]
at org.apache.kafka.common.network.Selector.poll(Selector.java:248) 
~[kafka-clients-0.8.2.1.jar:na]
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) 
[kafka-clients-0.8.2.1.jar:na]
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) 
[kafka-clients-0.8.2.1.jar:na]
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) 
[kafka-clients-0.8.2.1.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
16:08:39.667 [kafka-producer-network-thread | producer-3] WARN  
o.a.kafka.common.network.Selector - Error in I/O with localhost/127.0.0.1
java.io.EOFException: null
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) 
~[kafka-clients-0.8.2.1.jar:na]
at org.apache.kafka.common.network.Selector.poll(Selector.java:248) 
~[kafka-clients-0.8.2.1.jar:na]
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) 
[kafka-clients-0.8.2.1.jar:na]
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:

[jira] [Created] (KAFKA-2421) Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7

2015-08-11 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-2421:
-

 Summary: Upgrade LZ4 to version 1.3 to avoid crashing with IBM 
Java 7
 Key: KAFKA-2421
 URL: https://issues.apache.org/jira/browse/KAFKA-2421
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.2.1
 Environment: IBM Java 7
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram


Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7.

LZ4 version 1.2 crashes with 64-bit IBM Java 7. This has been fixed in LZ4 
version 1.3 (https://github.com/jpountz/lz4-java/blob/master/CHANGES.md, 
https://github.com/jpountz/lz4-java/pull/46).


The unit test org.apache.kafka.common.record.MemoryRecordsTest crashes when run 
with 64-bit IBM Java7 with the error:

{quote}
023EB900: Native Method 0263CE10 
(net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput([BII[BII)I)
023EB900: Invalid JNI call of function void 
ReleasePrimitiveArrayCritical(JNIEnv *env, jarray array, void *carray, jint 
mode): For array FFF7EAB8 parameter carray passed FFF85998, 
expected to be FFF7EAC0
14:08:42.763 0x23eb900j9mm.632*   ** ASSERTION FAILED ** at 
StandardAccessBarrier.cpp:335: ((false))
JVMDUMP039I Processing dump event "traceassert", detail "" at 2015/08/11 
15:08:42 - please wait.
{quote}

Stack trace from javacore:

3XMTHREADINFO3   Java callstack:
4XESTACKTRACEat 
net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput(Native Method)
4XESTACKTRACEat 
net/jpountz/lz4/LZ4JNICompressor.compress(LZ4JNICompressor.java:31)
4XESTACKTRACEat 
net/jpountz/lz4/LZ4Factory.(LZ4Factory.java:163)
4XESTACKTRACEat 
net/jpountz/lz4/LZ4Factory.instance(LZ4Factory.java:46)
4XESTACKTRACEat 
net/jpountz/lz4/LZ4Factory.nativeInstance(LZ4Factory.java:76)
5XESTACKTRACE   (entered lock: 
net/jpountz/lz4/LZ4Factory@0xE02F0BE8, entry count: 1)
4XESTACKTRACEat 
net/jpountz/lz4/LZ4Factory.fastestInstance(LZ4Factory.java:129)
4XESTACKTRACEat 
org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:72)
4XESTACKTRACEat 
org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:93)
4XESTACKTRACEat 
org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:103)
4XESTACKTRACEat 
sun/reflect/NativeConstructorAccessorImpl.newInstance0(Native Method)
4XESTACKTRACEat 
sun/reflect/NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:86)
4XESTACKTRACEat 
sun/reflect/DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:58)
4XESTACKTRACEat 
java/lang/reflect/Constructor.newInstance(Constructor.java:542)
4XESTACKTRACEat 
org/apache/kafka/common/record/Compressor.wrapForOutput(Compressor.java:222)
4XESTACKTRACEat 
org/apache/kafka/common/record/Compressor.(Compressor.java:72)
4XESTACKTRACEat 
org/apache/kafka/common/record/Compressor.(Compressor.java:76)
4XESTACKTRACEat 
org/apache/kafka/common/record/MemoryRecords.(MemoryRecords.java:43)
4XESTACKTRACEat 
org/apache/kafka/common/record/MemoryRecords.emptyRecords(MemoryRecords.java:51)
4XESTACKTRACEat 
org/apache/kafka/common/record/MemoryRecords.emptyRecords(MemoryRecords.java:55)
4XESTACKTRACEat 
org/apache/kafka/common/record/MemoryRecordsTest.testIterator(MemoryRecordsTest.java:42)

java -version
java version "1.7.0"
Java(TM) SE Runtime Environment (build pxa6470_27sr3fp1-20150605_01(SR3 FP1))
IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
20150407_243189 (JIT enabled, AOT enabled)
J9VM - R27_Java727_SR3_20150407_1831_B243189
JIT  - tr.r13.java_20150406_89182
GC   - R27_Java727_SR3_20150407_1831_B243189_CMPRSS
J9CL - 20150407_243189)
JCL - 20150601_01 based on Oracle 7u79-b14



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 37357: Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7

2015-08-11 Thread Rajini Sivaram

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37357/
---

Review request for kafka.


Bugs: KAFKA-2421
https://issues.apache.org/jira/browse/KAFKA-2421


Repository: kafka


Description
---

Patch for KAFKA-2421: Upgrade to LZ4 version 1.3 and update reference to Utils 
method that was moved to UnsafeUtils


Diffs
-

  build.gradle 1b67e628c2fca897177c12b6afad9a8700fffd1f 
  
clients/src/main/java/org/apache/kafka/common/record/KafkaLZ4BlockInputStream.java
 f480da2ae0992855cc860e1ce5cbd11ecfca7bee 
  
clients/src/main/java/org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.java
 6a2231f4775771932c36df362c88aead3189b7b8 

Diff: https://reviews.apache.org/r/37357/diff/


Testing
---


Thanks,

Rajini Sivaram



[jira] [Commented] (KAFKA-2421) Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7

2015-08-11 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681931#comment-14681931
 ] 

Rajini Sivaram commented on KAFKA-2421:
---

Created reviewboard https://reviews.apache.org/r/37357/diff/
 against branch origin/trunk

> Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7
> 
>
> Key: KAFKA-2421
> URL: https://issues.apache.org/jira/browse/KAFKA-2421
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
> Environment: IBM Java 7
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Attachments: KAFKA-2421.patch
>
>
> Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7.
> LZ4 version 1.2 crashes with 64-bit IBM Java 7. This has been fixed in LZ4 
> version 1.3 (https://github.com/jpountz/lz4-java/blob/master/CHANGES.md, 
> https://github.com/jpountz/lz4-java/pull/46).
> The unit test org.apache.kafka.common.record.MemoryRecordsTest crashes when 
> run with 64-bit IBM Java7 with the error:
> {quote}
> 023EB900: Native Method 0263CE10 
> (net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput([BII[BII)I)
> 023EB900: Invalid JNI call of function void 
> ReleasePrimitiveArrayCritical(JNIEnv *env, jarray array, void *carray, jint 
> mode): For array FFF7EAB8 parameter carray passed FFF85998, 
> expected to be FFF7EAC0
> 14:08:42.763 0x23eb900j9mm.632*   ** ASSERTION FAILED ** at 
> StandardAccessBarrier.cpp:335: ((false))
> JVMDUMP039I Processing dump event "traceassert", detail "" at 2015/08/11 
> 15:08:42 - please wait.
> {quote}
> Stack trace from javacore:
> 3XMTHREADINFO3   Java callstack:
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput(Native Method)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4JNICompressor.compress(LZ4JNICompressor.java:31)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.(LZ4Factory.java:163)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.instance(LZ4Factory.java:46)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.nativeInstance(LZ4Factory.java:76)
> 5XESTACKTRACE   (entered lock: 
> net/jpountz/lz4/LZ4Factory@0xE02F0BE8, entry count: 1)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.fastestInstance(LZ4Factory.java:129)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:72)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:93)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:103)
> 4XESTACKTRACEat 
> sun/reflect/NativeConstructorAccessorImpl.newInstance0(Native Method)
> 4XESTACKTRACEat 
> sun/reflect/NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:86)
> 4XESTACKTRACEat 
> sun/reflect/DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:58)
> 4XESTACKTRACEat 
> java/lang/reflect/Constructor.newInstance(Constructor.java:542)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.wrapForOutput(Compressor.java:222)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.(Compressor.java:72)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.(Compressor.java:76)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.(MemoryRecords.java:43)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.emptyRecords(MemoryRecords.java:51)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.emptyRecords(MemoryRecords.java:55)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecordsTest.testIterator(MemoryRecordsTest.java:42)
> java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr3fp1-20150605_01(SR3 FP1))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20150407_243189 (JIT enabled, AOT enabled)
> J9VM - R27_Java727_SR3_20150407_1831_B243189
> JIT  - tr.r13.java_20150406_89182
> GC   - R27_Java727_SR3_20150407_1831_B243189_CMPRSS
> J9CL - 20150407_243189)
> JCL - 20150601_01 based on Oracle 7u79-b14



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2421) Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7

2015-08-11 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-2421:
--
Status: Patch Available  (was: Open)

> Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7
> 
>
> Key: KAFKA-2421
> URL: https://issues.apache.org/jira/browse/KAFKA-2421
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
> Environment: IBM Java 7
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Attachments: KAFKA-2421.patch
>
>
> Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7.
> LZ4 version 1.2 crashes with 64-bit IBM Java 7. This has been fixed in LZ4 
> version 1.3 (https://github.com/jpountz/lz4-java/blob/master/CHANGES.md, 
> https://github.com/jpountz/lz4-java/pull/46).
> The unit test org.apache.kafka.common.record.MemoryRecordsTest crashes when 
> run with 64-bit IBM Java7 with the error:
> {quote}
> 023EB900: Native Method 0263CE10 
> (net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput([BII[BII)I)
> 023EB900: Invalid JNI call of function void 
> ReleasePrimitiveArrayCritical(JNIEnv *env, jarray array, void *carray, jint 
> mode): For array FFF7EAB8 parameter carray passed FFF85998, 
> expected to be FFF7EAC0
> 14:08:42.763 0x23eb900j9mm.632*   ** ASSERTION FAILED ** at 
> StandardAccessBarrier.cpp:335: ((false))
> JVMDUMP039I Processing dump event "traceassert", detail "" at 2015/08/11 
> 15:08:42 - please wait.
> {quote}
> Stack trace from javacore:
> 3XMTHREADINFO3   Java callstack:
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput(Native Method)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4JNICompressor.compress(LZ4JNICompressor.java:31)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.(LZ4Factory.java:163)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.instance(LZ4Factory.java:46)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.nativeInstance(LZ4Factory.java:76)
> 5XESTACKTRACE   (entered lock: 
> net/jpountz/lz4/LZ4Factory@0xE02F0BE8, entry count: 1)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.fastestInstance(LZ4Factory.java:129)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:72)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:93)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:103)
> 4XESTACKTRACEat 
> sun/reflect/NativeConstructorAccessorImpl.newInstance0(Native Method)
> 4XESTACKTRACEat 
> sun/reflect/NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:86)
> 4XESTACKTRACEat 
> sun/reflect/DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:58)
> 4XESTACKTRACEat 
> java/lang/reflect/Constructor.newInstance(Constructor.java:542)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.wrapForOutput(Compressor.java:222)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.(Compressor.java:72)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.(Compressor.java:76)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.(MemoryRecords.java:43)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.emptyRecords(MemoryRecords.java:51)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.emptyRecords(MemoryRecords.java:55)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecordsTest.testIterator(MemoryRecordsTest.java:42)
> java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr3fp1-20150605_01(SR3 FP1))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20150407_243189 (JIT enabled, AOT enabled)
> J9VM - R27_Java727_SR3_20150407_1831_B243189
> JIT  - tr.r13.java_20150406_89182
> GC   - R27_Java727_SR3_20150407_1831_B243189_CMPRSS
> J9CL - 20150407_243189)
> JCL - 20150601_01 based on Oracle 7u79-b14



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2421) Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7

2015-08-11 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-2421:
--
Attachment: KAFKA-2421.patch

> Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7
> 
>
> Key: KAFKA-2421
> URL: https://issues.apache.org/jira/browse/KAFKA-2421
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
> Environment: IBM Java 7
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Attachments: KAFKA-2421.patch
>
>
> Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7.
> LZ4 version 1.2 crashes with 64-bit IBM Java 7. This has been fixed in LZ4 
> version 1.3 (https://github.com/jpountz/lz4-java/blob/master/CHANGES.md, 
> https://github.com/jpountz/lz4-java/pull/46).
> The unit test org.apache.kafka.common.record.MemoryRecordsTest crashes when 
> run with 64-bit IBM Java7 with the error:
> {quote}
> 023EB900: Native Method 0263CE10 
> (net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput([BII[BII)I)
> 023EB900: Invalid JNI call of function void 
> ReleasePrimitiveArrayCritical(JNIEnv *env, jarray array, void *carray, jint 
> mode): For array FFF7EAB8 parameter carray passed FFF85998, 
> expected to be FFF7EAC0
> 14:08:42.763 0x23eb900j9mm.632*   ** ASSERTION FAILED ** at 
> StandardAccessBarrier.cpp:335: ((false))
> JVMDUMP039I Processing dump event "traceassert", detail "" at 2015/08/11 
> 15:08:42 - please wait.
> {quote}
> Stack trace from javacore:
> 3XMTHREADINFO3   Java callstack:
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput(Native Method)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4JNICompressor.compress(LZ4JNICompressor.java:31)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.(LZ4Factory.java:163)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.instance(LZ4Factory.java:46)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.nativeInstance(LZ4Factory.java:76)
> 5XESTACKTRACE   (entered lock: 
> net/jpountz/lz4/LZ4Factory@0xE02F0BE8, entry count: 1)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.fastestInstance(LZ4Factory.java:129)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:72)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:93)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:103)
> 4XESTACKTRACEat 
> sun/reflect/NativeConstructorAccessorImpl.newInstance0(Native Method)
> 4XESTACKTRACEat 
> sun/reflect/NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:86)
> 4XESTACKTRACEat 
> sun/reflect/DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:58)
> 4XESTACKTRACEat 
> java/lang/reflect/Constructor.newInstance(Constructor.java:542)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.wrapForOutput(Compressor.java:222)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.(Compressor.java:72)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.(Compressor.java:76)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.(MemoryRecords.java:43)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.emptyRecords(MemoryRecords.java:51)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.emptyRecords(MemoryRecords.java:55)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecordsTest.testIterator(MemoryRecordsTest.java:42)
> java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr3fp1-20150605_01(SR3 FP1))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20150407_243189 (JIT enabled, AOT enabled)
> J9VM - R27_Java727_SR3_20150407_1831_B243189
> JIT  - tr.r13.java_20150406_89182
> GC   - R27_Java727_SR3_20150407_1831_B243189_CMPRSS
> J9CL - 20150407_243189)
> JCL - 20150601_01 based on Oracle 7u79-b14



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2421) Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7

2015-08-11 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681947#comment-14681947
 ] 

Rajini Sivaram commented on KAFKA-2421:
---

Attached patch upgrades LZ4 to version 1.3 and fixes the reference to a method 
that was moved to a different class. Have tested that all unit tests work with 
IBM Java 7 with the changes.

> Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7
> 
>
> Key: KAFKA-2421
> URL: https://issues.apache.org/jira/browse/KAFKA-2421
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
> Environment: IBM Java 7
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Attachments: KAFKA-2421.patch
>
>
> Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7.
> LZ4 version 1.2 crashes with 64-bit IBM Java 7. This has been fixed in LZ4 
> version 1.3 (https://github.com/jpountz/lz4-java/blob/master/CHANGES.md, 
> https://github.com/jpountz/lz4-java/pull/46).
> The unit test org.apache.kafka.common.record.MemoryRecordsTest crashes when 
> run with 64-bit IBM Java7 with the error:
> {quote}
> 023EB900: Native Method 0263CE10 
> (net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput([BII[BII)I)
> 023EB900: Invalid JNI call of function void 
> ReleasePrimitiveArrayCritical(JNIEnv *env, jarray array, void *carray, jint 
> mode): For array FFF7EAB8 parameter carray passed FFF85998, 
> expected to be FFF7EAC0
> 14:08:42.763 0x23eb900j9mm.632*   ** ASSERTION FAILED ** at 
> StandardAccessBarrier.cpp:335: ((false))
> JVMDUMP039I Processing dump event "traceassert", detail "" at 2015/08/11 
> 15:08:42 - please wait.
> {quote}
> Stack trace from javacore:
> 3XMTHREADINFO3   Java callstack:
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput(Native Method)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4JNICompressor.compress(LZ4JNICompressor.java:31)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.(LZ4Factory.java:163)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.instance(LZ4Factory.java:46)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.nativeInstance(LZ4Factory.java:76)
> 5XESTACKTRACE   (entered lock: 
> net/jpountz/lz4/LZ4Factory@0xE02F0BE8, entry count: 1)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.fastestInstance(LZ4Factory.java:129)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:72)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:93)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:103)
> 4XESTACKTRACEat 
> sun/reflect/NativeConstructorAccessorImpl.newInstance0(Native Method)
> 4XESTACKTRACEat 
> sun/reflect/NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:86)
> 4XESTACKTRACEat 
> sun/reflect/DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:58)
> 4XESTACKTRACEat 
> java/lang/reflect/Constructor.newInstance(Constructor.java:542)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.wrapForOutput(Compressor.java:222)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.(Compressor.java:72)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.(Compressor.java:76)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.(MemoryRecords.java:43)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.emptyRecords(MemoryRecords.java:51)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.emptyRecords(MemoryRecords.java:55)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecordsTest.testIterator(MemoryRecordsTest.java:42)
> java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr3fp1-20150605_01(SR3 FP1))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20150407_243189 (JIT enabled, AOT enabled)
> J9VM - R27_Java727_SR3_20150407_1831_B243189
> JIT  - tr.r13.java_20150406_89182
> GC   - R27_Java727_SR3_20150407_1831_B243189_CMPRSS
> J9CL - 20150407_243189)
> JCL - 20150601_01 based on Oracle 7u79-b14



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: MINOR: Fix hard coded strings in ProduceRespon...

2015-08-11 Thread granthenke
GitHub user granthenke opened a pull request:

https://github.com/apache/kafka/pull/131

MINOR: Fix hard coded strings in ProduceResponse



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/granthenke/kafka minor-string

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/131.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #131


commit 3c6250dbf3f5bf08f6f3b3a210227e1f4f342838
Author: Grant Henke 
Date:   2015-08-11T15:27:53Z

MINOR: Fix hard coded strings in ProduceResponse




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Review Request 37357: Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7

2015-08-11 Thread Ismael Juma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37357/#review94916
---



clients/src/main/java/org/apache/kafka/common/record/KafkaLZ4BlockInputStream.java
 (line 177)


Why not use `SafeUtils`? The implementation of `UnsafeUtils` just calls 
that, right?

`public static void checkRange(byte[] buf, int off, int len) {
  SafeUtils.checkRange(buf, off, len);
}
`

https://github.com/jpountz/lz4-java/blob/master/src/java-unsafe/net/jpountz/util/UnsafeUtils.java#L60
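
Concretely, the suggested change would presumably make the call sites look
something like this (a sketch only; the argument names are placeholders):

import net.jpountz.util.SafeUtils;

// validate the range via SafeUtils directly instead of going through
// UnsafeUtils, which merely delegates to it
SafeUtils.checkRange(buffer, offset, length);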



clients/src/main/java/org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.java
 (line 183)


Same as above.


- Ismael Juma


On Aug. 11, 2015, 3:17 p.m., Rajini Sivaram wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/37357/
> ---
> 
> (Updated Aug. 11, 2015, 3:17 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2421
> https://issues.apache.org/jira/browse/KAFKA-2421
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Patch for KAFKA-2421: Upgrade to LZ4 version 1.3 and update reference to 
> Utils method that was moved to UnsafeUtils
> 
> 
> Diffs
> -
> 
>   build.gradle 1b67e628c2fca897177c12b6afad9a8700fffd1f 
>   
> clients/src/main/java/org/apache/kafka/common/record/KafkaLZ4BlockInputStream.java
>  f480da2ae0992855cc860e1ce5cbd11ecfca7bee 
>   
> clients/src/main/java/org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.java
>  6a2231f4775771932c36df362c88aead3189b7b8 
> 
> Diff: https://reviews.apache.org/r/37357/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Rajini Sivaram
> 
>



[jira] [Updated] (KAFKA-2336) Changing offsets.topic.num.partitions after the offset topic is created breaks consumer group partition assignment

2015-08-11 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KAFKA-2336:
---
Attachment: KAFKA-2336_2015-08-11_10:37:41.patch

> Changing offsets.topic.num.partitions after the offset topic is created 
> breaks consumer group partition assignment 
> ---
>
> Key: KAFKA-2336
> URL: https://issues.apache.org/jira/browse/KAFKA-2336
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.2.1
>Reporter: Grant Henke
>Assignee: Grant Henke
> Attachments: KAFKA-2336.patch, KAFKA-2336.patch, 
> KAFKA-2336_2015-07-16_13:04:02.patch, KAFKA-2336_2015-08-11_10:37:41.patch
>
>
> Currently adjusting offsets.topic.num.partitions after the offset topic is 
> created is not supported. Meaning that the number of partitions will not 
> change once the topic has been created.
> However, changing the value in the configuration should not cause issues and 
> instead simply be ignored. Currently this is not the case. 
> When the value of offsets.topic.num.partitions is changed after the offset 
> topic is created the consumer group partition assignment completely changes 
> even though the number of partitions does not change. 
> This is because _kafka.server.OffsetManager.partitionFor(group: String)_ uses 
> the configured value and not the value of the actual topic. 
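
To illustrate the failure mode, a hedged sketch (the real logic is Scala, in
kafka.server.OffsetManager.partitionFor; the mapping below is only an
approximation of it):

class OffsetsPartitioning {
    // roughly how a consumer group is mapped to an offsets-topic partition
    static int partitionFor(String group, int offsetsTopicPartitionCount) {
        return Math.abs(group.hashCode()) % offsetsTopicPartitionCount;
    }
}

// If offsets.topic.num.partitions is later changed in the config (say 50 -> 60),
// partitionFor(group, 60) generally differs from partitionFor(group, 50), even
// though the already-created offsets topic still has its original partition
// count -- so the group's offsets are looked up in the wrong partition. The fix
// is to derive the count from the actual topic rather than from the config.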



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 36548: Patch for KAFKA-2336

2015-08-11 Thread Grant Henke

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36548/
---

(Updated Aug. 11, 2015, 3:37 p.m.)


Review request for kafka.


Bugs: KAFKA-2336
https://issues.apache.org/jira/browse/KAFKA-2336


Repository: kafka


Description (updated)
---

KAFKA-2336: Changing offsets.topic.num.partitions after the offset topic is 
created breaks consumer group partition assignment


Diffs (updated)
-

  core/src/main/scala/kafka/server/OffsetManager.scala 
47b6ce93da320a565435b4a7916a0c4371143b8a 

Diff: https://reviews.apache.org/r/36548/diff/


Testing
---


Thanks,

Grant Henke



[jira] [Commented] (KAFKA-2336) Changing offsets.topic.num.partitions after the offset topic is created breaks consumer group partition assignment

2015-08-11 Thread Grant Henke (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681958#comment-14681958
 ] 

Grant Henke commented on KAFKA-2336:


Updated reviewboard https://reviews.apache.org/r/36548/diff/
 against branch origin/trunk

> Changing offsets.topic.num.partitions after the offset topic is created 
> breaks consumer group partition assignment 
> ---
>
> Key: KAFKA-2336
> URL: https://issues.apache.org/jira/browse/KAFKA-2336
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.2.1
>Reporter: Grant Henke
>Assignee: Grant Henke
> Attachments: KAFKA-2336.patch, KAFKA-2336.patch, 
> KAFKA-2336_2015-07-16_13:04:02.patch, KAFKA-2336_2015-08-11_10:37:41.patch
>
>
> Currently adjusting offsets.topic.num.partitions after the offset topic is 
> created is not supported. Meaning that the number of partitions will not 
> change once the topic has been created.
> However, changing the value in the configuration should not cause issues and 
> instead simply be ignored. Currently this is not the case. 
> When the value of offsets.topic.num.partitions is changed after the offset 
> topic is created the consumer group partition assignment completely changes 
> even though the number of partitions does not change. 
> This is because _kafka.server.OffsetManager.partitionFor(group: String)_ uses 
> the configured value and not the value of the actual topic. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-28 - Add a transform client for data processing

2015-08-11 Thread Jiangjie Qin
Guozhang,

By interleaved groups of messages, I meant something like this: Say we have
messages 0, 1, 2, 3, where messages 0 and 2 together complete one piece of
business logic and messages 1 and 3 together complete another. In that case,
after the user has processed message 2, they cannot commit offsets, because if
they crash before processing message 3, message 1 will not be reconsumed. That
means it is possible that the user is not able to find a point where the
current state is safe to be committed.

This is one example in the use case space table. It is still not clear to me
which use cases in that table KIP-28 wants to cover. Are we only covering the
case of a static topic stream with semi-auto commit, i.e. the user cannot
change topic subscriptions on the fly and can only commit the current offset?

Thanks,

Jiangjie (Becket) Qin

On Mon, Aug 10, 2015 at 6:57 PM, Guozhang Wang  wrote:

> Hello folks,
>
> I have updated the KIP page with some detailed API / architecture /
> packaging proposals, along with the long promised first patch in PR:
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-28+-+Add+a+processor+client
>
> https://github.com/apache/kafka/pull/130
>
>
> Any feedbacks / comments are more than welcomed.
>
> Guozhang
>
>
> On Mon, Aug 10, 2015 at 6:55 PM, Guozhang Wang  wrote:
>
> > Hi Jun,
> >
> > 1. I have removed the streamTime in punctuate() since it is not only
> > triggered by clock time, detailed explanation can be found here:
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-28+-+Add+a+processor+client#KIP-28-Addaprocessorclient-StreamTime
> >
> > 2. Yes, if users do not schedule a task, then punctuate will never fire.
> >
> > 3. Yes, I agree. The reason it was implemented in this way is that the
> > state store registration call is triggered by the users. However I think
> it
> > is doable to change that API so that it will be more natural to have sth.
> > like:
> >
> > context.createStore(store-name, store-type).
> >
> > Guozhang
> >
> > On Tue, Aug 4, 2015 at 9:17 AM, Jun Rao  wrote:
> >
> >> A few questions/comments.
> >>
> >> 1. What's streamTime passed to punctuate()? Is that just the current
> time?
> >> 2. Is punctuate() only called if schedule() is called?
> >> 3. The way the KeyValueStore is created seems a bit weird. Since this is
> >> part of the internal state managed by KafkaProcessorContext, it seems
> >> there
> >> should be an api to create the KeyValueStore from KafkaProcessorContext,
> >> instead of passing context to the constructor of KeyValueStore?
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >> On Thu, Jul 23, 2015 at 5:59 PM, Guozhang Wang 
> >> wrote:
> >>
> >> > Hi all,
> >> >
> >> > I just posted KIP-28: Add a transform client for data processing
> >> > <
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-28+-+Add+a+transform+client+for+data+processing
> >> > >
> >> > .
> >> >
> >> > The wiki page does not yet have the full design / implementation
> >> details,
> >> > and this email is to kick-off the conversation on whether we should
> add
> >> > this new client with the described motivations, and if yes what
> >> features /
> >> > functionalities should be included.
> >> >
> >> > Looking forward to your feedback!
> >> >
> >> > -- Guozhang
> >> >
> >>
> >
> >
> >
> > --
> > -- Guozhang
> >
>
>
>
> --
> -- Guozhang
>


Re: [DISCUSS] KIP-28 - Add a transform client for data processing

2015-08-11 Thread Guozhang Wang
Jiangjie,

Thanks for the explanation, now I understand the scenario. It is one of the
CEP (complex event processing) cases in stream processing, in which I think
the local state should be used for some sort of pattern matching. More
concretely, let's say in this case we have a local state storing what has
been observed. Then the sequence would be:

T0: local state {}
T1: message 0,  local state {0}
T2: message 1,  local state {0, 1}
T3: message 2,  local state {1}, matching 0 and 2, output some result
and remove 0/2 from local state.
T4: message 3,  local state {}, matching 1 and 3, output some result
and remove 1/3 from local state.

Let's say the user calls commit at T2: it will commit the offset at message 2 as
well as the local state {0, 1}; then upon failure recovery, it can restore
that state along with the committed offsets and continue.
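
A framework-agnostic sketch of that bookkeeping (plain Java, not the KIP-28
API; the message ids and the pairing rule are just the toy example above):

import java.util.HashSet;
import java.util.Set;

// Messages pair up as in the example (0 with 2, 1 with 3): unmatched ids wait
// in local state, and a commit persists a snapshot of that state together
// with the offset so a restart resumes consistently.
class PairMatcher {
    private final Set<Integer> pending = new HashSet<>();

    void process(int messageId) {
        int partner = messageId - 2;        // in this toy example, id n matches id n-2
        if (pending.remove(partner)) {
            System.out.println("matched " + partner + " and " + messageId);
        } else {
            pending.add(messageId);         // remember it until its partner shows up
        }
    }

    // what gets committed along with the offset
    Set<Integer> stateToCommit() {
        return new HashSet<>(pending);
    }
}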

More generally, the current design of the processor lets users specify their
subscribed topics before starting the process; users will not change topic
subscriptions on the fly and will not be committing arbitrary offsets. The
rationale behind this is to abstract the producer / consumer details away from
processor developers as much as possible, i.e. if users do not want to, they
should not be exposed to message offsets / partition ids / topic names etc.
For most cases, the subscribed topics can be specified before starting the
processing job, so we let users specify them once and then focus on the
computational logic in implementing the process function.

Guozhang


On Tue, Aug 11, 2015 at 10:26 AM, Jiangjie Qin 
wrote:

> Guozhang,
>
> By interleaved groups of message, I meant something like this: Say we have
> message 0,1,2,3, message 0 and 2 together completes a business logic,
> message 1 and 3 together completes a business logic. In that case, after
> user processed message 2, they cannot commit offsets because if they crash
> before processing message 3, message 1 will not be reconsumed. That means
> it is possible that user are not able to find a point where the current
> state is safe to be committed.
>
> This is one example in the use case space table. It is still not clear to
> me which use cases in the use case space table KIP-28 wants to cover. Are
> we only covering the case for static topic stream with semi-auto commit?
> i.e. user cannot change topic subscription on the fly and they can only
> commit the current offset.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Mon, Aug 10, 2015 at 6:57 PM, Guozhang Wang  wrote:
>
> > Hello folks,
> >
> > I have updated the KIP page with some detailed API / architecture /
> > packaging proposals, along with the long promised first patch in PR:
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-28+-+Add+a+processor+client
> >
> > https://github.com/apache/kafka/pull/130
> >
> >
> > Any feedbacks / comments are more than welcomed.
> >
> > Guozhang
> >
> >
> > On Mon, Aug 10, 2015 at 6:55 PM, Guozhang Wang 
> wrote:
> >
> > > Hi Jun,
> > >
> > > 1. I have removed the streamTime in punctuate() since it is not only
> > > triggered by clock time, detailed explanation can be found here:
> > >
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-28+-+Add+a+processor+client#KIP-28-Addaprocessorclient-StreamTime
> > >
> > > 2. Yes, if users do not schedule a task, then punctuate will never
> fire.
> > >
> > > 3. Yes, I agree. The reason it was implemented in this way is that the
> > > state store registration call is triggered by the users. However I
> think
> > it
> > > is doable to change that API so that it will be more natural to have
> sth.
> > > like:
> > >
> > > context.createStore(store-name, store-type).
> > >
> > > Guozhang
> > >
> > > On Tue, Aug 4, 2015 at 9:17 AM, Jun Rao  wrote:
> > >
> > >> A few questions/comments.
> > >>
> > >> 1. What's streamTime passed to punctuate()? Is that just the current
> > time?
> > >> 2. Is punctuate() only called if schedule() is called?
> > >> 3. The way the KeyValueStore is created seems a bit weird. Since this
> is
> > >> part of the internal state managed by KafkaProcessorContext, it seems
> > >> there
> > >> should be an api to create the KeyValueStore from
> KafkaProcessorContext,
> > >> instead of passing context to the constructor of KeyValueStore?
> > >>
> > >> Thanks,
> > >>
> > >> Jun
> > >>
> > >> On Thu, Jul 23, 2015 at 5:59 PM, Guozhang Wang 
> > >> wrote:
> > >>
> > >> > Hi all,
> > >> >
> > >> > I just posted KIP-28: Add a transform client for data processing
> > >> > <
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-28+-+Add+a+transform+client+for+data+processing
> > >> > >
> > >> > .
> > >> >
> > >> > The wiki page does not yet have the full design / implementation
> > >> details,
> > >> > and this email is to kick-off the conversation on whether we should
> > add
> > >> > this new client with the described motivations, and if yes what
> > >> features /
> > >> > functional

Re: Typo on documentation

2015-08-11 Thread Guozhang Wang
You are right, I woke up from the future I guess :)

On Mon, Aug 10, 2015 at 11:54 PM, Gwen Shapira  wrote:

> We can't create PRs for doc bugs because the docs are (still) in SVN...
>
> On Mon, Aug 10, 2015 at 11:24 PM, Guozhang Wang 
> wrote:
>
> > Moving forward, I would suggest we just create the PR as MINOR: fix typo
> in
> > .. instead of creating jiras. This saves some overhead for such patches.
> >
> > Guozhang
> >
> > On Mon, Aug 10, 2015 at 2:53 PM, Edward Ribeiro <
> edward.ribe...@gmail.com>
> > wrote:
> >
> > > Okay.
> > >
> > > On Mon, Aug 10, 2015 at 5:21 PM, Gwen Shapira 
> wrote:
> > >
> > > > yeppers. JIRA and patch?
> > > >
> > > > On Mon, Aug 10, 2015 at 12:36 PM, Edward Ribeiro <
> > > edward.ribe...@gmail.com
> > > > >
> > > > wrote:
> > > >
> > > > > I have just seen the typo below at
> > > > > http://kafka.apache.org/documentation.html . It's supposed to be
> JMX
> > > > > instead of JMZ, right?
> > > > >
> > > > > []'s
> > > > > Eddie
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang


[GitHub] kafka pull request: KAFKA-1215: Rack-Aware replica assignment opti...

2015-08-11 Thread allenxwang
GitHub user allenxwang opened a pull request:

https://github.com/apache/kafka/pull/132

KAFKA-1215: Rack-Aware replica assignment option

The PR tries to achieve the following:

- Make rack-aware assignment and the rack data structure optional, as opposed to 
being part of the core data structure/protocol, to ease the migration. The 
implementation that returns the map of broker to rack is pluggable. Users 
need to pass the implementation class as a Kafka runtime configuration or 
command line argument.

- The rack-aware replica assignment is best effort when distributing the 
replicas to racks. When there are more replicas than racks, it ensures each 
rack has at least one replica. When there are more racks than replicas, it 
ensures each rack has at most one replica. It also tries to keep an even 
distribution of replicas among brokers and racks when possible (a rough 
sketch of this placement logic follows below).
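
A rough, hypothetical sketch of that best-effort placement (not the actual patch;
it simply interleaves brokers across racks so that picking replicas in order
spreads them over as many racks as possible):

    import java.util.*;

    public class RackAwareSketch {

        // Order brokers so consecutive entries come from different racks where possible.
        static List<Integer> interleaveByRack(Map<Integer, String> brokerToRack) {
            Map<String, Deque<Integer>> byRack = new TreeMap<>();
            for (Map.Entry<Integer, String> e : new TreeMap<>(brokerToRack).entrySet())
                byRack.computeIfAbsent(e.getValue(), r -> new ArrayDeque<>()).add(e.getKey());
            List<Integer> order = new ArrayList<>();
            while (order.size() < brokerToRack.size())
                for (Deque<Integer> rack : byRack.values())
                    if (!rack.isEmpty()) order.add(rack.poll());
            return order;
        }

        // Pick `replicas` brokers for one partition, shifting the start per partition.
        static List<Integer> assign(int partition, int replicas, Map<Integer, String> brokerToRack) {
            List<Integer> order = interleaveByRack(brokerToRack);
            List<Integer> result = new ArrayList<>();
            for (int i = 0; i < replicas; i++)
                result.add(order.get((partition + i) % order.size()));
            return result;
        }

        public static void main(String[] args) {
            Map<Integer, String> rack = new HashMap<>();
            rack.put(0, "r1"); rack.put(1, "r1"); rack.put(2, "r2"); rack.put(3, "r3");
            // 2 replicas over 3 racks -> at most one replica per rack for this partition.
            System.out.println(assign(0, 2, rack));   // e.g. [0, 2]
        }
    }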


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/allenxwang/kafka KAFKA-1215

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/132.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #132


commit 35db23ee7987a1811d630f14de66a99ce638
Author: Allen Wang 
Date:   2015-08-11T17:52:37Z

KAFKA-1215: Rack-Aware replica assignment option




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682185#comment-14682185
 ] 

ASF GitHub Bot commented on KAFKA-1215:
---

GitHub user allenxwang opened a pull request:

https://github.com/apache/kafka/pull/132

KAFKA-1215: Rack-Aware replica assignment option

The PR tries to achieve the following:

- Make rack-aware assignment and the rack data structure optional, as opposed to 
being part of the core data structure/protocol, to ease the migration. The 
implementation that returns the map of broker to rack is pluggable. Users 
need to pass the implementation class as a Kafka runtime configuration or 
command line argument.

- The rack-aware replica assignment is best effort when distributing the 
replicas to racks. When there are more replicas than racks, it ensures each 
rack has at least one replica. When there are more racks than replicas, it 
ensures each rack has at most one replica. It also tries to keep an even 
distribution of replicas among brokers and racks when possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/allenxwang/kafka KAFKA-1215

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/132.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #132


commit 35db23ee7987a1811d630f14de66a99ce638
Author: Allen Wang 
Date:   2015-08-11T17:52:37Z

KAFKA-1215: Rack-Aware replica assignment option




> Rack-Aware replica assignment option
> 
>
> Key: KAFKA-1215
> URL: https://issues.apache.org/jira/browse/KAFKA-1215
> Project: Kafka
>  Issue Type: Improvement
>  Components: replication
>Affects Versions: 0.8.0
>Reporter: Joris Van Remoortere
>Assignee: Jun Rao
> Fix For: 0.9.0
>
> Attachments: rack_aware_replica_assignment_v1.patch, 
> rack_aware_replica_assignment_v2.patch
>
>
> Adding a rack-id to kafka config. This rack-id can be used during replica 
> assignment by using the max-rack-replication argument in the admin scripts 
> (create topic, etc.). By default the original replication assignment 
> algorithm is used because max-rack-replication defaults to -1. 
> max-rack-replication > -1 is not honored if you are doing manual replica 
> assignment (preferred).
> If this looks good I can add some test cases specific to the rack-aware 
> assignment.
> I can also port this to trunk. We are currently running 0.8.0 in production 
> and need this, so I wrote the patch against that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Typo on documentation

2015-08-11 Thread Edward Ribeiro
haha, no problem. :) Btw, I uploaded the patch yesterday.

Cheers,
Edward

On Tue, Aug 11, 2015 at 2:53 PM, Guozhang Wang  wrote:

> You are right, I woke up from the future I guess :)
>
> On Mon, Aug 10, 2015 at 11:54 PM, Gwen Shapira  wrote:
>
> > We can't create PRs for doc bugs because the docs are (still) in SVN...
> >
> > On Mon, Aug 10, 2015 at 11:24 PM, Guozhang Wang 
> > wrote:
> >
> > > Moving forward, I would suggest we just create the PR as MINOR: fix
> typo
> > in
> > > .. instead of creating jiras. This saves some overhead for such
> patches.
> > >
> > > Guozhang
> > >
> > > On Mon, Aug 10, 2015 at 2:53 PM, Edward Ribeiro <
> > edward.ribe...@gmail.com>
> > > wrote:
> > >
> > > > Okay.
> > > >
> > > > On Mon, Aug 10, 2015 at 5:21 PM, Gwen Shapira 
> > wrote:
> > > >
> > > > > yeppers. JIRA and patch?
> > > > >
> > > > > On Mon, Aug 10, 2015 at 12:36 PM, Edward Ribeiro <
> > > > edward.ribe...@gmail.com
> > > > > >
> > > > > wrote:
> > > > >
> > > > > > I have just seen the typo below at
> > > > > > http://kafka.apache.org/documentation.html . It's supposed to be
> > JMX
> > > > > > instead of JMZ, right?
> > > > > >
> > > > > > []'s
> > > > > > Eddie
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>


[jira] [Updated] (KAFKA-2390) Seek() should take a callback.

2015-08-11 Thread Dong Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin updated KAFKA-2390:

Status: Patch Available  (was: Open)

> Seek() should take a callback.
> --
>
> Key: KAFKA-2390
> URL: https://issues.apache.org/jira/browse/KAFKA-2390
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jiangjie Qin
>Assignee: Dong Lin
>
> Currently seek is an async call. To have the same interface as other calls 
> like commit(), seek() should take a callback. This callback will be invoked 
> if the position to seek to triggers an OFFSET_OUT_OF_RANGE exception from the broker.
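
A minimal sketch of what this could look like (hypothetical only, mirroring the
style of commit(offsets, callback); not the actual consumer API):

    import org.apache.kafka.common.TopicPartition;

    // Hypothetical callback invoked if the position passed to seek() later
    // triggers an OFFSET_OUT_OF_RANGE error from the broker.
    public interface SeekCallback {
        void onOffsetOutOfRange(TopicPartition partition, long requestedOffset);
    }

    // Hypothetical signature: consumer.seek(partition, offset, callback);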



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Can someone review ticket 1778

2015-08-11 Thread Abhishek Nigam
Hi Guozhang,
Can you please re-review the KAFKA-1778 design?

Just to provide background for this ticket: it was a sub-ticket of the kafka
admin commands KIP (KIP-4).
The goal was to avoid cascading controller moves, for example during a
rolling broker bounce.

The approaches discussed were as follows:
a) Use a preferred controller admin command which can be used to
dynamically indicate a preferred controller.
b) Use configuration to set a whitelist or blacklist of brokers which are
eligible to become a controller.

Can we reach consensus on how we want to resolve this issue?

-Abhishek

On Sun, May 17, 2015 at 10:55 PM, Abhishek Nigam 
wrote:

> Hi,
> For pinning the controller to a broker I have proposed a design. Can
> someone review the design and let me know if it looks ok.
> I can then submit a patch for this ticket within the next couple of weeks.
>
> -Abhishek
>
>


[jira] [Updated] (KAFKA-2143) Replicas get ahead of leader and fail

2015-08-11 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira updated KAFKA-2143:

Reviewer: Gwen Shapira

> Replicas get ahead of leader and fail
> -
>
> Key: KAFKA-2143
> URL: https://issues.apache.org/jira/browse/KAFKA-2143
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 0.8.2.1
>Reporter: Evan Huus
>Assignee: Jiangjie Qin
> Fix For: 0.8.3
>
>
> On a cluster of 6 nodes, we recently saw a case where a single 
> under-replicated partition suddenly appeared, replication lag spiked, and 
> network IO spiked. The cluster appeared to recover eventually on its own.
> Looking at the logs, the thing which failed was partition 7 of the topic 
> {{background_queue}}. It had an ISR of 1,4,3 and its leader at the time was 
> 3. Here are the interesting log lines:
> On node 3 (the leader):
> {noformat}
> [2015-04-23 16:50:05,879] ERROR [Replica Manager on Broker 3]: Error when 
> processing fetch request for partition [background_queue,7] offset 3722949957 
> from follower with correlation id 148185816. Possible cause: Request for 
> offset 3722949957 but we only have log segments in the range 3648049863 to 
> 3722949955. (kafka.server.ReplicaManager)
> [2015-04-23 16:50:05,879] ERROR [Replica Manager on Broker 3]: Error when 
> processing fetch request for partition [background_queue,7] offset 3722949957 
> from follower with correlation id 156007054. Possible cause: Request for 
> offset 3722949957 but we only have log segments in the range 3648049863 to 
> 3722949955. (kafka.server.ReplicaManager)
> [2015-04-23 16:50:13,960] INFO Partition [background_queue,7] on broker 3: 
> Shrinking ISR for partition [background_queue,7] from 1,4,3 to 3 
> (kafka.cluster.Partition)
> {noformat}
> Note that both replicas suddenly asked for an offset *ahead* of the available 
> offsets.
> And on nodes 1 and 4 (the replicas) many occurrences of the following:
> {noformat}
> [2015-04-23 16:50:05,935] INFO Scheduling log segment 3648049863 for log 
> background_queue-7 for deletion. (kafka.log.Log) (edited)
> {noformat}
> Based on my reading, this looks like the replicas somehow got *ahead* of the 
> leader, asked for an invalid offset, got confused, and re-replicated the 
> entire topic from scratch to recover (this matches our network graphs, which 
> show 3 sending a bunch of data to 1 and 4).
> Taking a stab in the dark at the cause, there appears to be a race condition 
> where replicas can receive a new offset before the leader has committed it 
> and is ready to replicate?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-08-11 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682249#comment-14682249
 ] 

Gwen Shapira commented on KAFKA-1778:
-

Apparently I can't assign Reviewer if there is no patch, so [~guozhang], this 
is for you :)

> Create new re-elect controller admin function
> -
>
> Key: KAFKA-1778
> URL: https://issues.apache.org/jira/browse/KAFKA-1778
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Abhishek Nigam
> Fix For: 0.8.3
>
>
> kafka --controller --elect



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1695) Authenticate connection to Zookeeper

2015-08-11 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira updated KAFKA-1695:

Reviewer: Flavio Junqueira  (was: Gwen Shapira)

> Authenticate connection to Zookeeper
> 
>
> Key: KAFKA-1695
> URL: https://issues.apache.org/jira/browse/KAFKA-1695
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Jay Kreps
>Assignee: Parth Brahmbhatt
>
> We need to make it possible to secure the Zookeeper cluster Kafka is using. 
> This would make use of the normal authentication ZooKeeper provides. 
> ZooKeeper supports a variety of authentication mechanisms so we will need to 
> figure out what has to be passed in to the zookeeper client.
> The intention is that when the current round of client work is done it should 
> be possible to run without clients needing access to Zookeeper so all we need 
> here is to make it so that only the Kafka cluster is able to read and write 
> to the Kafka znodes  (we shouldn't need to set any kind of acl on a per-znode 
> basis).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1387) Kafka getting stuck creating ephemeral node it has already created when two zookeeper sessions are established in a very short period of time

2015-08-11 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682263#comment-14682263
 ] 

Mayuresh Gharat commented on KAFKA-1387:


Can the person who uploaded the patch submit a testcase on how to reproduce 
this? 
We are hitting this in production but are not able to reproduce this locally.



> Kafka getting stuck creating ephemeral node it has already created when two 
> zookeeper sessions are established in a very short period of time
> -
>
> Key: KAFKA-1387
> URL: https://issues.apache.org/jira/browse/KAFKA-1387
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.1
>Reporter: Fedor Korotkiy
>Priority: Blocker
>  Labels: newbie, patch, zkclient-problems
> Attachments: kafka-1387.patch
>
>
> Kafka broker re-registers itself in zookeeper every time handleNewSession() 
> callback is invoked.
> https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaHealthcheck.scala
>  
> Now imagine the following sequence of events.
> 1) The Zookeeper session reestablishes. The handleNewSession() callback is queued by 
> the zkClient, but not invoked yet.
> 2) The Zookeeper session reestablishes again, queueing the callback a second time.
> 3) The first callback is invoked, creating the /broker/[id] ephemeral path.
> 4) The second callback is invoked and tries to create the /broker/[id] path using 
> the createEphemeralPathExpectConflictHandleZKBug() function. But the path 
> already exists, so createEphemeralPathExpectConflictHandleZKBug() gets 
> stuck in an infinite loop.
> It seems like the controller election code has the same issue.
> I'm able to reproduce this issue on the 0.8.1 branch from github using the 
> following configs.
> # zookeeper
> tickTime=10
> dataDir=/tmp/zk/
> clientPort=2101
> maxClientCnxns=0
> # kafka
> broker.id=1
> log.dir=/tmp/kafka
> zookeeper.connect=localhost:2101
> zookeeper.connection.timeout.ms=100
> zookeeper.sessiontimeout.ms=100
> Just start kafka and zookeeper and then pause zookeeper several times using 
> Ctrl-Z.
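
A hedged sketch of the loop described above (simplified; this is an illustrative
stand-in, not the actual zkclient/Kafka helper). The handler retries on "node
exists" because it assumes the node was left by a previous, expiring session;
when the node was created by an earlier callback of the same session, it never
goes away and the loop never terminates:

    import org.I0Itec.zkclient.ZkClient;
    import org.I0Itec.zkclient.exception.ZkNodeExistsException;

    // Illustrative only -- a simplified stand-in for the helper named above.
    public class EphemeralNodeRetrySketch {
        static void createEphemeralExpectingConflict(ZkClient zk, String path, Object data)
                throws InterruptedException {
            while (true) {
                try {
                    zk.createEphemeral(path, data);
                    return;                 // success: node created
                } catch (ZkNodeExistsException e) {
                    Thread.sleep(1000);     // wait for the "stale" node to expire -- here it never does
                }
            }
        }
    }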



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1387) Kafka getting stuck creating ephemeral node it has already created when two zookeeper sessions are established in a very short period of time

2015-08-11 Thread Fedor Korotkiy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682269#comment-14682269
 ] 

Fedor Korotkiy commented on KAFKA-1387:
---

Have you tried the steps from the issue description?

> Kafka getting stuck creating ephemeral node it has already created when two 
> zookeeper sessions are established in a very short period of time
> -
>
> Key: KAFKA-1387
> URL: https://issues.apache.org/jira/browse/KAFKA-1387
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.1
>Reporter: Fedor Korotkiy
>Priority: Blocker
>  Labels: newbie, patch, zkclient-problems
> Attachments: kafka-1387.patch
>
>
> Kafka broker re-registers itself in zookeeper every time handleNewSession() 
> callback is invoked.
> https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaHealthcheck.scala
>  
> Now imagine the following sequence of events.
> 1) The Zookeeper session reestablishes. The handleNewSession() callback is queued by 
> the zkClient, but not invoked yet.
> 2) The Zookeeper session reestablishes again, queueing the callback a second time.
> 3) The first callback is invoked, creating the /broker/[id] ephemeral path.
> 4) The second callback is invoked and tries to create the /broker/[id] path using 
> the createEphemeralPathExpectConflictHandleZKBug() function. But the path 
> already exists, so createEphemeralPathExpectConflictHandleZKBug() gets 
> stuck in an infinite loop.
> It seems like the controller election code has the same issue.
> I'm able to reproduce this issue on the 0.8.1 branch from github using the 
> following configs.
> # zookeeper
> tickTime=10
> dataDir=/tmp/zk/
> clientPort=2101
> maxClientCnxns=0
> # kafka
> broker.id=1
> log.dir=/tmp/kafka
> zookeeper.connect=localhost:2101
> zookeeper.connection.timeout.ms=100
> zookeeper.sessiontimeout.ms=100
> Just start kafka and zookeeper and then pause zookeeper several times using 
> Ctrl-Z.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 37357: Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7

2015-08-11 Thread Rajini Sivaram

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37357/
---

(Updated Aug. 11, 2015, 6:56 p.m.)


Review request for kafka.


Bugs: KAFKA-2421
https://issues.apache.org/jira/browse/KAFKA-2421


Repository: kafka


Description (updated)
---

Patch for KAFKA-2421: Upgrade to LZ4 version 1.3 and update reference to Utils 
method that was moved to SafeUtils


Diffs (updated)
-

  build.gradle 1b67e628c2fca897177c12b6afad9a8700fffd1f 
  
clients/src/main/java/org/apache/kafka/common/record/KafkaLZ4BlockInputStream.java
 f480da2ae0992855cc860e1ce5cbd11ecfca7bee 
  
clients/src/main/java/org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.java
 6a2231f4775771932c36df362c88aead3189b7b8 

Diff: https://reviews.apache.org/r/37357/diff/


Testing
---


Thanks,

Rajini Sivaram



[jira] [Commented] (KAFKA-2421) Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7

2015-08-11 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682303#comment-14682303
 ] 

Rajini Sivaram commented on KAFKA-2421:
---

Updated reviewboard https://reviews.apache.org/r/37357/diff/
 against branch origin/trunk

> Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7
> 
>
> Key: KAFKA-2421
> URL: https://issues.apache.org/jira/browse/KAFKA-2421
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
> Environment: IBM Java 7
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Attachments: KAFKA-2421.patch, KAFKA-2421_2015-08-11_18:54:26.patch
>
>
> Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7.
> LZ4 version 1.2 crashes with 64-bit IBM Java 7. This has been fixed in LZ4 
> version 1.3 (https://github.com/jpountz/lz4-java/blob/master/CHANGES.md, 
> https://github.com/jpountz/lz4-java/pull/46).
> The unit test org.apache.kafka.common.record.MemoryRecordsTest crashes when 
> run with 64-bit IBM Java7 with the error:
> {quote}
> 023EB900: Native Method 0263CE10 
> (net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput([BII[BII)I)
> 023EB900: Invalid JNI call of function void 
> ReleasePrimitiveArrayCritical(JNIEnv *env, jarray array, void *carray, jint 
> mode): For array FFF7EAB8 parameter carray passed FFF85998, 
> expected to be FFF7EAC0
> 14:08:42.763 0x23eb900j9mm.632*   ** ASSERTION FAILED ** at 
> StandardAccessBarrier.cpp:335: ((false))
> JVMDUMP039I Processing dump event "traceassert", detail "" at 2015/08/11 
> 15:08:42 - please wait.
> {quote}
> Stack trace from javacore:
> 3XMTHREADINFO3   Java callstack:
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput(Native Method)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4JNICompressor.compress(LZ4JNICompressor.java:31)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.(LZ4Factory.java:163)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.instance(LZ4Factory.java:46)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.nativeInstance(LZ4Factory.java:76)
> 5XESTACKTRACE   (entered lock: 
> net/jpountz/lz4/LZ4Factory@0xE02F0BE8, entry count: 1)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.fastestInstance(LZ4Factory.java:129)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:72)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:93)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:103)
> 4XESTACKTRACEat 
> sun/reflect/NativeConstructorAccessorImpl.newInstance0(Native Method)
> 4XESTACKTRACEat 
> sun/reflect/NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:86)
> 4XESTACKTRACEat 
> sun/reflect/DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:58)
> 4XESTACKTRACEat 
> java/lang/reflect/Constructor.newInstance(Constructor.java:542)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.wrapForOutput(Compressor.java:222)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.(Compressor.java:72)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.(Compressor.java:76)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.(MemoryRecords.java:43)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.emptyRecords(MemoryRecords.java:51)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.emptyRecords(MemoryRecords.java:55)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecordsTest.testIterator(MemoryRecordsTest.java:42)
> java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr3fp1-20150605_01(SR3 FP1))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20150407_243189 (JIT enabled, AOT enabled)
> J9VM - R27_Java727_SR3_20150407_1831_B243189
> JIT  - tr.r13.java_20150406_89182
> GC   - R27_Java727_SR3_20150407_1831_B243189_CMPRSS
> J9CL - 20150407_243189)
> JCL - 20150601_01 based on Oracle 7u79-b14



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2421) Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7

2015-08-11 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-2421:
--
Attachment: KAFKA-2421_2015-08-11_18:54:26.patch

> Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7
> 
>
> Key: KAFKA-2421
> URL: https://issues.apache.org/jira/browse/KAFKA-2421
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
> Environment: IBM Java 7
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Attachments: KAFKA-2421.patch, KAFKA-2421_2015-08-11_18:54:26.patch
>
>
> Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7.
> LZ4 version 1.2 crashes with 64-bit IBM Java 7. This has been fixed in LZ4 
> version 1.3 (https://github.com/jpountz/lz4-java/blob/master/CHANGES.md, 
> https://github.com/jpountz/lz4-java/pull/46).
> The unit test org.apache.kafka.common.record.MemoryRecordsTest crashes when 
> run with 64-bit IBM Java7 with the error:
> {quote}
> 023EB900: Native Method 0263CE10 
> (net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput([BII[BII)I)
> 023EB900: Invalid JNI call of function void 
> ReleasePrimitiveArrayCritical(JNIEnv *env, jarray array, void *carray, jint 
> mode): For array FFF7EAB8 parameter carray passed FFF85998, 
> expected to be FFF7EAC0
> 14:08:42.763 0x23eb900j9mm.632*   ** ASSERTION FAILED ** at 
> StandardAccessBarrier.cpp:335: ((false))
> JVMDUMP039I Processing dump event "traceassert", detail "" at 2015/08/11 
> 15:08:42 - please wait.
> {quote}
> Stack trace from javacore:
> 3XMTHREADINFO3   Java callstack:
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput(Native Method)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4JNICompressor.compress(LZ4JNICompressor.java:31)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.(LZ4Factory.java:163)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.instance(LZ4Factory.java:46)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.nativeInstance(LZ4Factory.java:76)
> 5XESTACKTRACE   (entered lock: 
> net/jpountz/lz4/LZ4Factory@0xE02F0BE8, entry count: 1)
> 4XESTACKTRACEat 
> net/jpountz/lz4/LZ4Factory.fastestInstance(LZ4Factory.java:129)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:72)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:93)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.(KafkaLZ4BlockOutputStream.java:103)
> 4XESTACKTRACEat 
> sun/reflect/NativeConstructorAccessorImpl.newInstance0(Native Method)
> 4XESTACKTRACEat 
> sun/reflect/NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:86)
> 4XESTACKTRACEat 
> sun/reflect/DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:58)
> 4XESTACKTRACEat 
> java/lang/reflect/Constructor.newInstance(Constructor.java:542)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.wrapForOutput(Compressor.java:222)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.(Compressor.java:72)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/Compressor.(Compressor.java:76)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.(MemoryRecords.java:43)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.emptyRecords(MemoryRecords.java:51)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecords.emptyRecords(MemoryRecords.java:55)
> 4XESTACKTRACEat 
> org/apache/kafka/common/record/MemoryRecordsTest.testIterator(MemoryRecordsTest.java:42)
> java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr3fp1-20150605_01(SR3 FP1))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20150407_243189 (JIT enabled, AOT enabled)
> J9VM - R27_Java727_SR3_20150407_1831_B243189
> JIT  - tr.r13.java_20150406_89182
> GC   - R27_Java727_SR3_20150407_1831_B243189_CMPRSS
> J9CL - 20150407_243189)
> JCL - 20150601_01 based on Oracle 7u79-b14



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 37357: Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7

2015-08-11 Thread Rajini Sivaram


> On Aug. 11, 2015, 3:29 p.m., Ismael Juma wrote:
> > clients/src/main/java/org/apache/kafka/common/record/KafkaLZ4BlockInputStream.java,
> >  line 177
> > 
> >
> > Why not use `SafeUtils`? The implementation of `UnsafeUtils` just calls 
> > that, right?
> > 
> > `public static void checkRange(byte[] buf, int off, int len) {
> >   SafeUtils.checkRange(buf, off, len);
> > }
> > `
> > 
> > https://github.com/jpountz/lz4-java/blob/master/src/java-unsafe/net/jpountz/util/UnsafeUtils.java#L60

Ismael, thank you for the review. Yes, you are right. I have updated the patch to 
use SafeUtils.


> On Aug. 11, 2015, 3:29 p.m., Ismael Juma wrote:
> > clients/src/main/java/org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.java,
> >  line 183
> > 
> >
> > Same as above.

Updated this too.


- Rajini


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37357/#review94916
---


On Aug. 11, 2015, 6:56 p.m., Rajini Sivaram wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/37357/
> ---
> 
> (Updated Aug. 11, 2015, 6:56 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2421
> https://issues.apache.org/jira/browse/KAFKA-2421
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Patch for KAFKA-2421: Upgrade to LZ4 version 1.3 and update reference to 
> Utils method that was moved to SafeUtils
> 
> 
> Diffs
> -
> 
>   build.gradle 1b67e628c2fca897177c12b6afad9a8700fffd1f 
>   
> clients/src/main/java/org/apache/kafka/common/record/KafkaLZ4BlockInputStream.java
>  f480da2ae0992855cc860e1ce5cbd11ecfca7bee 
>   
> clients/src/main/java/org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.java
>  6a2231f4775771932c36df362c88aead3189b7b8 
> 
> Diff: https://reviews.apache.org/r/37357/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Rajini Sivaram
> 
>



KIP Meeting Notes 08/11/2015

2015-08-11 Thread Guozhang Wang
First of all, WebEx seems to be working! We will upload the recorded video
later.

Quick summary:

KIP-26: PR-99 (https://github.com/apache/kafka/pull/99) pending review.

KIP-28: PR-130 (https://github.com/apache/kafka/pull/130) looking for
feedback on:

1. API design (see o.a.k.stream.examples).
2. Architecture design (see KIP wiki page)
3. Packaging options.

KIP-29: we will do a quick fix for unblocking production issues with
hard-coded interval values, while at the same time keeping the KIP open for
further discussion about end-state configurations.

KIP-4: KAFKA-1695 / 2210 pending review.

Review Backlog Management:

1. Remind people to change the JIRA status to "patch available" when they
contribute a patch, and to change the status back to "in progress" after it
is reviewed, as indicated in:

https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes

2. Encourage contributors to set the "reviewer" field when changing the JIRA
status to "patch available", and encourage volunteers to assign themselves
as "reviewers" for pending tickets.

-- Guozhang


Re: Review Request 37357: Upgrade LZ4 to version 1.3 to avoid crashing with IBM Java 7

2015-08-11 Thread Ismael Juma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37357/#review94979
---


Changes look good. It seems like there are quite a few changes in the upstream 
library, so it would probably be good to do more testing than just the unit tests.

- Ismael Juma


On Aug. 11, 2015, 6:56 p.m., Rajini Sivaram wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/37357/
> ---
> 
> (Updated Aug. 11, 2015, 6:56 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2421
> https://issues.apache.org/jira/browse/KAFKA-2421
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Patch for KAFKA-2421: Upgrade to LZ4 version 1.3 and update reference to 
> Utils method that was moved to SafeUtils
> 
> 
> Diffs
> -
> 
>   build.gradle 1b67e628c2fca897177c12b6afad9a8700fffd1f 
>   
> clients/src/main/java/org/apache/kafka/common/record/KafkaLZ4BlockInputStream.java
>  f480da2ae0992855cc860e1ce5cbd11ecfca7bee 
>   
> clients/src/main/java/org/apache/kafka/common/record/KafkaLZ4BlockOutputStream.java
>  6a2231f4775771932c36df362c88aead3189b7b8 
> 
> Diff: https://reviews.apache.org/r/37357/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Rajini Sivaram
> 
>



Re: KAFKA-2364 migrate docs from SVN to git

2015-08-11 Thread Edward Ribeiro
+1. As soon as possible, please. :)

On Sat, Aug 8, 2015 at 4:05 PM, Neha Narkhede  wrote:

> +1 on the same repo for code and website. It helps to keep both in sync.
>
> On Thu, Aug 6, 2015 at 1:52 PM, Grant Henke  wrote:
>
> > +1 for the same repo. The closer docs can be to code the more accurate
> they
> > are likely to be. The same way we encourage unit tests for a new
> > feature/patch. Updating the docs can be the same.
> >
> > If we follow Sqoop's process for example, how would small
> > fixes/adjustments/additions to the live documentation occur without a new
> > release?
> >
> > On Thu, Aug 6, 2015 at 3:33 PM, Guozhang Wang 
> wrote:
> >
> > > I am +1 on same repo too. I think keeping one git history of code / doc
> > > change may actually be beneficial for this approach as well.
> > >
> > > Guozhang
> > >
> > > On Thu, Aug 6, 2015 at 9:16 AM, Gwen Shapira 
> wrote:
> > >
> > > > I prefer same repo for one-commit / lower-barrier benefits.
> > > >
> > > > Sqoop has the following process, which decouples documentation
> changes
> > > from
> > > > website changes:
> > > >
> > > > 1. Code github repo contains a doc directory, with the documentation
> > > > written and maintained in AsciiDoc. Only one version of the
> > > documentation,
> > > > since it is source controlled with the code. (unlike current SVN
> where
> > we
> > > > have directories per version)
> > > >
> > > > 2. Build process compiles the AsciiDoc to HTML and PDF
> > > >
> > > > 3. When releasing, we post the documentation of the new release to
> the
> > > > website
> > > >
> > > > Gwen
> > > >
> > > > On Thu, Aug 6, 2015 at 12:20 AM, Ismael Juma 
> > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > For reference, here is the previous discussion on moving the
> website
> > to
> > > > > Git:
> > > > >
> > > > > http://search-hadoop.com/m/uyzND11JliU1E8QU92
> > > > >
> > > > > People were positive to the idea as Jay said. I would like to see a
> > bit
> > > > of
> > > > > a discussion around whether the website should be part of the same
> > repo
> > > > as
> > > > > the code or not. I'll get the ball rolling.
> > > > >
> > > > > Pros for same repo:
> > > > > * One commit can update the code and website, which means:
> > > > > ** Lower barrier for updating docs along with relevant code changes
> > > > > ** Easier to require that both are updated at the same time
> > > > > * More eyeballs on the website changes
> > > > > * Automatically branched with the relevant code
> > > > >
> > > > > Pros for separate repo:
> > > > > * Potentially simpler for website-only changes (smaller repo, less
> > > > > verification needed)
> > > > > * Website changes don't "clutter" the code Git history
> > > > > * No risk of website change affecting the code
> > > > >
> > > > > Your thoughts, please.
> > > > >
> > > > > Best,
> > > > > Ismael
> > > > >
> > > > > On Fri, Jul 31, 2015 at 6:15 PM, Aseem Bansal <
> asmbans...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi
> > > > > >
> > > > > > When discussing on KAFKA-2364 migrating docs from svn to git came
> > up.
> > > > > That
> > > > > > would make contributing to docs much easier. I have contributed
> to
> > > > > > groovy/grails via github so I think having mirror on github could
> > be
> > > > > > useful.
> > > > > >
> > > > > > Also I think unless there is some good reason it should be a
> > separate
> > > > > repo.
> > > > > > No need to mix docs and code.
> > > > > >
> > > > > > I can try that out.
> > > > > >
> > > > > > Thoughts?
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
> >
> >
> > --
> > Grant Henke
> > Software Engineer | Cloudera
> > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
> >
>
>
>
> --
> Thanks,
> Neha
>


[jira] [Updated] (KAFKA-313) Add JSON/CSV output and looping options to ConsumerGroupCommand

2015-08-11 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-313:

Status: In Progress  (was: Patch Available)

> Add JSON/CSV output and looping options to ConsumerGroupCommand
> ---
>
> Key: KAFKA-313
> URL: https://issues.apache.org/jira/browse/KAFKA-313
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Dave DeMaagd
>Assignee: Ashish K Singh
>Priority: Minor
>  Labels: newbie, patch
> Fix For: 0.8.3
>
> Attachments: KAFKA-313-2012032200.diff, KAFKA-313.1.patch, 
> KAFKA-313.patch, KAFKA-313_2015-02-23_18:11:32.patch, 
> KAFKA-313_2015-06-24_11:14:24.patch, KAFKA-313_2015-08-05_15:37:32.patch, 
> KAFKA-313_2015-08-05_15:43:00.patch, KAFKA-313_2015-08-10_12:58:38.patch
>
>
> Adds:
> * '--loop N' - causes the program to loop forever, sleeping for up to N 
> seconds between loops (loop time minus collection time, unless that's less 
> than 0, at which point it will just run again immediately)
> * '--asjson' - display as a JSON string instead of the more human readable 
> output format.
> Neither of the above depends on the other (you can loop in the human-readable 
> output, or do a single-shot execution with JSON output). Existing 
> behavior/output is maintained if neither of the above is used. Diff attached.
> Impacted files:
> core/src/main/scala/kafka/admin/ConsumerGroupCommand.scala



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2120) Add a request timeout to NetworkClient

2015-08-11 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-2120:
-
Reviewer: Jason Gustafson

[~hachikuji] assigning to you for reviews. Please feel free to re-assign.

> Add a request timeout to NetworkClient
> --
>
> Key: KAFKA-2120
> URL: https://issues.apache.org/jira/browse/KAFKA-2120
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Jiangjie Qin
>Assignee: Mayuresh Gharat
> Fix For: 0.8.3
>
> Attachments: KAFKA-2120.patch, KAFKA-2120_2015-07-27_15:31:19.patch, 
> KAFKA-2120_2015-07-29_15:57:02.patch, KAFKA-2120_2015-08-10_19:55:18.patch
>
>
> Currently NetworkClient does not have a timeout setting for requests. So if 
> no response is received for a request, for reasons such as the broker being down, 
> the request will never be completed.
> The request timeout will also be used as an implicit timeout for some methods such 
> as KafkaProducer.flush() and KafkaProducer.close().
> KIP-19 is created for this public interface change.
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-19+-+Add+a+request+timeout+to+NetworkClient
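
For example (assuming the config name proposed in KIP-19, request.timeout.ms;
the exact name and default may differ in the final patch), a producer would opt
in via its properties:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;

    public class RequestTimeoutExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
            props.put("request.timeout.ms", "30000");   // assumed: fail a request with no response within 30s
            KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);
            producer.close();
        }
    }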



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1387) Kafka getting stuck creating ephemeral node it has already created when two zookeeper sessions are established in a very short period of time

2015-08-11 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682416#comment-14682416
 ] 

Guozhang Wang commented on KAFKA-1387:
--

[~fpj] Could you help take a look at this issue?

> Kafka getting stuck creating ephemeral node it has already created when two 
> zookeeper sessions are established in a very short period of time
> -
>
> Key: KAFKA-1387
> URL: https://issues.apache.org/jira/browse/KAFKA-1387
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.1
>Reporter: Fedor Korotkiy
>Priority: Blocker
>  Labels: newbie, patch, zkclient-problems
> Attachments: kafka-1387.patch
>
>
> Kafka broker re-registers itself in zookeeper every time handleNewSession() 
> callback is invoked.
> https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaHealthcheck.scala
>  
> Now imagine the following sequence of events.
> 1) The Zookeeper session reestablishes. The handleNewSession() callback is queued by 
> the zkClient, but not invoked yet.
> 2) The Zookeeper session reestablishes again, queueing the callback a second time.
> 3) The first callback is invoked, creating the /broker/[id] ephemeral path.
> 4) The second callback is invoked and tries to create the /broker/[id] path using 
> the createEphemeralPathExpectConflictHandleZKBug() function. But the path 
> already exists, so createEphemeralPathExpectConflictHandleZKBug() gets 
> stuck in an infinite loop.
> It seems like the controller election code has the same issue.
> I'm able to reproduce this issue on the 0.8.1 branch from github using the 
> following configs.
> # zookeeper
> tickTime=10
> dataDir=/tmp/zk/
> clientPort=2101
> maxClientCnxns=0
> # kafka
> broker.id=1
> log.dir=/tmp/kafka
> zookeeper.connect=localhost:2101
> zookeeper.connection.timeout.ms=100
> zookeeper.sessiontimeout.ms=100
> Just start kafka and zookeeper and then pause zookeeper several times using 
> Ctrl-Z.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[DISCUSS] Client-side Assignment for New Consumer

2015-08-11 Thread Jason Gustafson
Hi Kafka Devs,

One of the nagging issues in the current design of the new consumer has
been the need to support a variety of assignment strategies. We've
encountered this in particular in the design of copycat and the processing
framework (KIP-28). From what I understand, Samza also has a number of use
cases with custom assignment needs. The new consumer protocol supports new
assignment strategies by hooking them into the broker. For many
environments, this is a major pain and in some cases, a non-starter. It
also challenges the validation that the coordinator can provide. For
example, some assignment strategies call for partitions to be assigned
multiple times, which means that the coordinator can only check that
partitions have been assigned at least once.

To solve these issues, we'd like to propose moving assignment to the
client. I've written a wiki which outlines some protocol changes to achieve
this:
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Client-side+Assignment+Proposal.
To summarize briefly, instead of the coordinator assigning the partitions
itself, all subscriptions are forwarded to each member of the group which
then decides independently which partitions it should consume. The protocol
provides a mechanism for the coordinator to validate that all consumers use
the same assignment strategy, but it does not ensure that the resulting
assignment is "correct." This provides a powerful capability for users to
control the full data flow on the client side. They control how data is
written to partitions through the Partitioner interface and they control
how data is consumed through the assignment strategy, all without touching
the server.
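
For illustration, here is a minimal sketch of a deterministic client-side
assignor under this proposal (names and parameters are hypothetical, not the
proposed API): every member receives the full membership and subscription list
from the coordinator, runs the same computation, and keeps only its own share.

    import java.util.*;

    public class ClientSideAssignorSketch {

        // Partitions assigned round-robin over the sorted member ids, so every
        // member independently computes the same map.
        static Map<String, List<Integer>> assign(int numPartitions, List<String> memberIds) {
            List<String> members = new ArrayList<>(memberIds);
            Collections.sort(members);                     // same order on every member
            Map<String, List<Integer>> assignment = new HashMap<>();
            for (String m : members)
                assignment.put(m, new ArrayList<>());
            for (int p = 0; p < numPartitions; p++)
                assignment.get(members.get(p % members.size())).add(p);
            return assignment;
        }

        public static void main(String[] args) {
            // 6 partitions of some topic; each consumer computes the full map but
            // only consumes its own entry.
            Map<String, List<Integer>> a = assign(6, Arrays.asList("c2", "c1", "c3"));
            System.out.println(a.get("c1"));               // [0, 3]
        }
    }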

Of course nothing comes for free. In particular, this change removes the
ability of the coordinator to validate that commits are made by consumers
who were assigned the respective partition. This might not be too bad since
we retain the ability to validate the generation id, but it is a potential
concern. We have considered alternative protocols which add a second
round-trip to the protocol in order to give the coordinator the ability to
confirm the assignment. As mentioned above, the coordinator is somewhat
limited in what it can actually validate, but this would return its ability
to validate commits. The tradeoff is that it increases the protocol's
complexity which means more ways for the protocol to fail and consequently
more edge cases in the code.

It also misses an opportunity to generalize the group membership protocol
for additional use cases. In fact, after you've gone to the trouble of
moving assignment to the client, the main thing that is left in this
protocol is basically a general group management capability. This is
exactly what is needed for a few cases that are currently under discussion
(e.g. copycat or single-writer producer). We've taken this further step in
the proposal and attempted to envision what that general protocol might
look like and how it could be used both by the consumer and for some of
these other cases.

Anyway, since time is running out on the new consumer, we have perhaps one
last chance to consider a significant change in the protocol like this, so
have a look at the wiki and share your thoughts. I've no doubt that some
ideas seem clearer in my mind than they do on paper, so ask questions if
there is any confusion.

Thanks!
Jason


[jira] [Commented] (KAFKA-2406) ISR propagation should be throttled to avoid overwhelming controller.

2015-08-11 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682431#comment-14682431
 ] 

Jiangjie Qin commented on KAFKA-2406:
-

[~junrao] [~ashishujjain] We discussed KIP-29 on today's KIP hangout. In 
this ticket we will hard-code the ISR propagation interval to fix the trunk. I 
will create another ticket, link it to KIP-29, and submit a follow-up patch 
once we reach a conclusion on KIP-29.

I just submitted a new patch that has the ISR propagation interval hard-coded 
to 5 seconds. Could you help review? Thanks.

Jiangjie (Becket) Qin
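
A hedged sketch of the throttle being described (illustrative only, not the
actual KAFKA-2406 patch): ISR changes are batched and propagated at most once
per fixed interval.

    import java.util.HashSet;
    import java.util.Set;

    public class IsrPropagationThrottleSketch {
        private static final long INTERVAL_MS = 5000L;     // the hard-coded 5s mentioned above
        private final Set<String> pendingIsrChanges = new HashSet<>();  // "topic-partition" keys
        private long lastPropagationMs = 0L;

        public synchronized void recordIsrChange(String topicPartition) {
            pendingIsrChanges.add(topicPartition);
        }

        public synchronized void maybePropagate(long nowMs) {
            if (!pendingIsrChanges.isEmpty() && nowMs - lastPropagationMs >= INTERVAL_MS) {
                System.out.println("propagate " + pendingIsrChanges);   // stand-in for the ZK notification write
                pendingIsrChanges.clear();
                lastPropagationMs = nowMs;
            }
        }
    }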

> ISR propagation should be throttled to avoid overwhelming controller.
> -
>
> Key: KAFKA-2406
> URL: https://issues.apache.org/jira/browse/KAFKA-2406
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
>Priority: Blocker
>
> This is a follow up patch for KAFKA-1367.
> We need to throttle the ISR propagation rate to avoid flooding in controller 
> to broker traffic. This might significantly increase time of controlled 
> shutdown or cluster startup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 36858: Patch for KAFKA-2120

2015-08-11 Thread Jun Rao


> On Aug. 7, 2015, 12:36 a.m., Jun Rao wrote:
> > clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java,
> >  line 223
> > 
> >
> > Not sure if the test is needed. First, it seems that the batch should never 
> > be null. Second, let's say the producer can't connect to any broker. 
> > The producer can't refresh the metadata. So the leader will still be the old 
> > one and may not be null. In this case, it seems that we should still expire 
> > the records.
> 
> Mayuresh Gharat wrote:
> In this case: "Second, let's say the producer can't connect to any 
> broker. The producer can't refresh the metadata. So the leader will still be 
> the old one and may not be null. In this case, it seems that we should still 
> expire the records.", the request will eventually fail due to requestTimeout 
> and retry exhaustion when trying to send to the broker.
> 
> > I was thinking along the same lines as your suggestion: expiring the 
> batch if it has exceeded the threshold even if we have metadata available, 
> but the KIP said explicitly that "Request timeout will also be used when the 
> batches in the accumulator that are ready but not drained due to metadata 
> missing".

Got it. Thanks for the explanation.


- Jun
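
For reference, a hedged sketch of the expiry rule being discussed (illustrative
only, not the actual RecordAccumulator code): a batch that has waited in the
accumulator longer than the request timeout without a known leader to drain it
to is expired.

    public class BatchExpirySketch {
        static boolean shouldExpire(long nowMs, long batchCreatedMs,
                                    long requestTimeoutMs, boolean leaderKnown) {
            return !leaderKnown && (nowMs - batchCreatedMs > requestTimeoutMs);
        }
    }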


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36858/#review94447
---


On Aug. 11, 2015, 2:55 a.m., Mayuresh Gharat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36858/
> ---
> 
> (Updated Aug. 11, 2015, 2:55 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2120
> https://issues.apache.org/jira/browse/KAFKA-2120
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Solved compile error
> 
> 
> Addressed Jason's comments for Kip-19
> 
> 
> Addressed Jun's comments
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/clients/ClientRequest.java 
> dc8f0f115bcda893c95d17c0a57be8d14518d034 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> 2c421f42ed3fc5d61cf9c87a7eaa7bb23e26f63b 
>   clients/src/main/java/org/apache/kafka/clients/InFlightRequests.java 
> 15d00d4e484bb5d51a9ae6857ed6e024a2cc1820 
>   clients/src/main/java/org/apache/kafka/clients/KafkaClient.java 
> 7ab2503794ff3aab39df881bd9fbae6547827d3b 
>   clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
> 0e51d7bd461d253f4396a5b6ca7cd391658807fa 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> d35b421a515074d964c7fccb73d260b847ea5f00 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> ed99e9bdf7c4ea7a6d4555d4488cf8ed0b80641b 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java
>  9517d9d0cd480d5ba1d12f1fde7963e60528d2f8 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 03b8dd23df63a8d8a117f02eabcce4a2d48c44f7 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> aa264202f2724907924985a5ecbe74afc4c6c04b 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/BufferPool.java
>  4cb1e50d6c4ed55241aeaef1d3af09def5274103 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java
>  a152bd7697dca55609a9ec4cfe0a82c10595fbc3 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordBatch.java
>  06182db1c3a5da85648199b4c0c98b80ea7c6c0c 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
> 0baf16e55046a2f49f6431e01d52c323c95eddf0 
>   clients/src/main/java/org/apache/kafka/common/network/Selector.java 
> ce20111ac434eb8c74585e9c63757bb9d60a832f 
>   clients/src/test/java/org/apache/kafka/clients/MockClient.java 
> 9133d85342b11ba2c9888d4d2804d181831e7a8e 
>   clients/src/test/java/org/apache/kafka/clients/NetworkClientTest.java 
> 43238ceaad0322e39802b615bb805b895336a009 
>   
> clients/src/test/java/org/apache/kafka/clients/producer/internals/BufferPoolTest.java
>  2c693824fa53db1e38766b8c66a0ef42ef9d0f3a 
>   
> clients/src/test/java/org/apache/kafka/clients/producer/internals/RecordAccumulatorTest.java
>  5b2e4ffaeab7127648db608c179703b27b577414 
>   
> clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
>  8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
>   clients/src/test/java/org/apache/kafka/common/network/SelectorTest.java 
> 158f9829ff64a969008f699e40c51e918287859e 
>   core/src/main/scala/kafka/tools/ProducerPerformance.scala 
> 0335cc64013ffe2cdf1c4879e86e11ec8c526712 
>   core/src/test/scala/integration/

Re: Review Request 36858: Patch for KAFKA-2120

2015-08-11 Thread Jason Gustafson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36858/#review94999
---

Ship it!


LGTM (other than the minor issue below). As discussed on the jira board, a more 
general approach would be to allow a timeout on the client request itself. My 
guess is that we'll need that in the long run, but the approach here is a good 
starting point.


clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
(line 302)


Can we make this value greater than sessionTimeoutMs (which is 30s)? Even 
if we don't address the issue of sanity between the different timeouts in this 
patch, it would be nice to have compatible defaults to keep the consumer from 
breaking out of the box.


- Jason Gustafson


On Aug. 11, 2015, 2:55 a.m., Mayuresh Gharat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36858/
> ---
> 
> (Updated Aug. 11, 2015, 2:55 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2120
> https://issues.apache.org/jira/browse/KAFKA-2120
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Solved compile error
> 
> 
> Addressed Jason's comments for Kip-19
> 
> 
> Addressed Jun's comments
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/clients/ClientRequest.java 
> dc8f0f115bcda893c95d17c0a57be8d14518d034 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> 2c421f42ed3fc5d61cf9c87a7eaa7bb23e26f63b 
>   clients/src/main/java/org/apache/kafka/clients/InFlightRequests.java 
> 15d00d4e484bb5d51a9ae6857ed6e024a2cc1820 
>   clients/src/main/java/org/apache/kafka/clients/KafkaClient.java 
> 7ab2503794ff3aab39df881bd9fbae6547827d3b 
>   clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
> 0e51d7bd461d253f4396a5b6ca7cd391658807fa 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> d35b421a515074d964c7fccb73d260b847ea5f00 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> ed99e9bdf7c4ea7a6d4555d4488cf8ed0b80641b 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java
>  9517d9d0cd480d5ba1d12f1fde7963e60528d2f8 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 03b8dd23df63a8d8a117f02eabcce4a2d48c44f7 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> aa264202f2724907924985a5ecbe74afc4c6c04b 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/BufferPool.java
>  4cb1e50d6c4ed55241aeaef1d3af09def5274103 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java
>  a152bd7697dca55609a9ec4cfe0a82c10595fbc3 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordBatch.java
>  06182db1c3a5da85648199b4c0c98b80ea7c6c0c 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
> 0baf16e55046a2f49f6431e01d52c323c95eddf0 
>   clients/src/main/java/org/apache/kafka/common/network/Selector.java 
> ce20111ac434eb8c74585e9c63757bb9d60a832f 
>   clients/src/test/java/org/apache/kafka/clients/MockClient.java 
> 9133d85342b11ba2c9888d4d2804d181831e7a8e 
>   clients/src/test/java/org/apache/kafka/clients/NetworkClientTest.java 
> 43238ceaad0322e39802b615bb805b895336a009 
>   
> clients/src/test/java/org/apache/kafka/clients/producer/internals/BufferPoolTest.java
>  2c693824fa53db1e38766b8c66a0ef42ef9d0f3a 
>   
> clients/src/test/java/org/apache/kafka/clients/producer/internals/RecordAccumulatorTest.java
>  5b2e4ffaeab7127648db608c179703b27b577414 
>   
> clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
>  8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
>   clients/src/test/java/org/apache/kafka/common/network/SelectorTest.java 
> 158f9829ff64a969008f699e40c51e918287859e 
>   core/src/main/scala/kafka/tools/ProducerPerformance.scala 
> 0335cc64013ffe2cdf1c4879e86e11ec8c526712 
>   core/src/test/scala/integration/kafka/api/ProducerFailureHandlingTest.scala 
> ee94011894b46864614b97bbd2a98375a7d3f20b 
>   core/src/test/scala/unit/kafka/utils/TestUtils.scala 
> eb169d8b33c27d598cc24e5a2e5f78b789fa38d3 
> 
> Diff: https://reviews.apache.org/r/36858/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Mayuresh Gharat
> 
>



Re: KIP Meeting Notes 08/11/2015

2015-08-11 Thread Grant Henke
>
> 2. Encourage contributors to set the "reviewer" field when changing the JIRA
> status to "patch available", and encourage volunteers to assign themselves
> as "reviewers" for pending tickets.


Is there somewhere that describes who to pick as a reviewer based on the
patch? Would it be worth listing volunteer reviewers in a similar location?

On Tue, Aug 11, 2015 at 2:14 PM, Guozhang Wang  wrote:

> First of all, WebEx seems to be working! We will upload the recorded video
> later.
>
> Quick summary:
>
> KIP-26: PR-99 (https://github.com/apache/kafka/pull/99) pending
> review.
>
> KIP-28: PR-130 (https://github.com/apache/kafka/pull/130) looking for
> feedback on:
>
> 1. API design (see o.a.k.stream.examples).
> 2. Architecture design (see KIP wiki page)
> 3. Packaging options.
>
> KIP-29: we will do a quick fix for unblocking production issues with
> hard-coded interval values, while at the same time keeping the KIP open for
> further discussion about end-state configurations.
>
> KIP-4: KAFKA-1695 / 2210 pending review.
>
> Review Backlog Management:
>
> 1. Remind people to change the JIRA status to "patch available" when they
> contribute a patch, and to change the status back to "in progress" after it
> is reviewed, as indicated in:
>
> https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes
>
> 2. Encourage contributors to set the "reviewer" field when changing the JIRA
> status to "patch available", and encourage volunteers to assign themselves
> as "reviewers" for pending tickets.
>
> -- Guozhang
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


[jira] [Commented] (KAFKA-2398) Transient test failure for SocketServerTest - Socket closed.

2015-08-11 Thread Ben Stopford (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14687399#comment-14687399
 ] 

Ben Stopford commented on KAFKA-2398:
-

[~ijuma] [~becket_qin] so can we close this Jira then?

> Transient test failure for SocketServerTest - Socket closed.
> 
>
> Key: KAFKA-2398
> URL: https://issues.apache.org/jira/browse/KAFKA-2398
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jiangjie Qin
>
> See the following transient test failure for SocketServerTest.
> kafka.network.SocketServerTest > simpleRequest FAILED
> java.net.SocketException: Socket closed
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:579)
> at java.net.Socket.connect(Socket.java:528)
> at java.net.Socket.(Socket.java:425)
> at java.net.Socket.(Socket.java:208)
> at kafka.network.SocketServerTest.connect(SocketServerTest.scala:84)
> at 
> kafka.network.SocketServerTest.simpleRequest(SocketServerTest.scala:94)
> kafka.network.SocketServerTest > tooBigRequestIsRejected FAILED
> java.net.SocketException: Socket closed
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:579)
> at java.net.Socket.connect(Socket.java:528)
> at java.net.Socket.(Socket.java:425)
> at java.net.Socket.(Socket.java:208)
> at kafka.network.SocketServerTest.connect(SocketServerTest.scala:84)
> at 
> kafka.network.SocketServerTest.tooBigRequestIsRejected(SocketServerTest.scala:124)
> kafka.network.SocketServerTest > testSocketsCloseOnShutdown FAILED
> java.net.SocketException: Socket closed
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:579)
> at java.net.Socket.connect(Socket.java:528)
> at java.net.Socket.(Socket.java:425)
> at java.net.Socket.(Socket.java:208)
> at kafka.network.SocketServerTest.connect(SocketServerTest.scala:84)
> at 
> kafka.network.SocketServerTest.testSocketsCloseOnShutdown(SocketServerTest.scala:136)
> kafka.network.SocketServerTest > testMaxConnectionsPerIp FAILED
> java.net.SocketException: Socket closed
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:579)
> at java.net.Socket.connect(Socket.java:528)
> at java.net.Socket.(Socket.java:425)
> at java.net.Socket.(Socket.java:208)
> at kafka.network.SocketServerTest.connect(SocketServerTest.scala:84)
> at 
> kafka.network.SocketServerTest$$anonfun$1.apply(SocketServerTest.scala:170)
> at 
> kafka.network.SocketServerTest$$anonfun$1.apply(SocketServerTest.scala:170)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at scala.collection.immutable.Range.foreach(Range.scala:141)
> at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
> at scala.collection.AbstractTraversable.map(Traversable.scala:105)
> at 
> kafka.network.SocketServerTest.testMaxConnectionsPerIp(SocketServerTest.scala:170)
> kafka.network.SocketServerTest > t

Re: Review Request 36858: Patch for KAFKA-2120

2015-08-11 Thread Mayuresh Gharat


> On Aug. 11, 2015, 8:49 p.m., Jason Gustafson wrote:
> > clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java,
> >  line 302
> > 
> >
> > Can we make this value greater than sessionTimeoutMs (which is 30s). 
> > Even if we don't address the issue of sanity between the different timeouts 
> > in this patch, it would be nice to have compatible defaults to keep the 
> > consumer from breaking out of the box.

Hi Jason,

I will upload a new patch with the sanity test.


- Mayuresh
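
As an aside on the timeout discussion above: the constraint being requested is simply that
the client's request timeout default exceed the consumer's session timeout default. A minimal
sketch of that sanity check, with hypothetical property names and values (the actual defaults
are whatever the patch under review settles on):

{code}
import java.util.Properties;

// Illustrative sketch only: the request timeout should exceed the consumer session
// timeout so that an in-flight request does not give up before the group session
// itself would expire. Property names and values here are assumptions.
public final class TimeoutSanitySketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("session.timeout.ms", "30000");  // consumer session timeout
        props.put("request.timeout.ms", "40000");  // hypothetical default > session timeout

        long session = Long.parseLong(props.getProperty("session.timeout.ms"));
        long request = Long.parseLong(props.getProperty("request.timeout.ms"));
        if (request <= session)
            throw new IllegalArgumentException(
                    "request.timeout.ms should be larger than session.timeout.ms");
        System.out.println("timeouts are compatible");
    }
}
{code}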


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36858/#review94999
---


On Aug. 11, 2015, 2:55 a.m., Mayuresh Gharat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36858/
> ---
> 
> (Updated Aug. 11, 2015, 2:55 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2120
> https://issues.apache.org/jira/browse/KAFKA-2120
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Solved compile error
> 
> 
> Addressed Jason's comments for Kip-19
> 
> 
> Addressed Jun's comments
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/clients/ClientRequest.java 
> dc8f0f115bcda893c95d17c0a57be8d14518d034 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> 2c421f42ed3fc5d61cf9c87a7eaa7bb23e26f63b 
>   clients/src/main/java/org/apache/kafka/clients/InFlightRequests.java 
> 15d00d4e484bb5d51a9ae6857ed6e024a2cc1820 
>   clients/src/main/java/org/apache/kafka/clients/KafkaClient.java 
> 7ab2503794ff3aab39df881bd9fbae6547827d3b 
>   clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
> 0e51d7bd461d253f4396a5b6ca7cd391658807fa 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> d35b421a515074d964c7fccb73d260b847ea5f00 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> ed99e9bdf7c4ea7a6d4555d4488cf8ed0b80641b 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java
>  9517d9d0cd480d5ba1d12f1fde7963e60528d2f8 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 03b8dd23df63a8d8a117f02eabcce4a2d48c44f7 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> aa264202f2724907924985a5ecbe74afc4c6c04b 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/BufferPool.java
>  4cb1e50d6c4ed55241aeaef1d3af09def5274103 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java
>  a152bd7697dca55609a9ec4cfe0a82c10595fbc3 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordBatch.java
>  06182db1c3a5da85648199b4c0c98b80ea7c6c0c 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
> 0baf16e55046a2f49f6431e01d52c323c95eddf0 
>   clients/src/main/java/org/apache/kafka/common/network/Selector.java 
> ce20111ac434eb8c74585e9c63757bb9d60a832f 
>   clients/src/test/java/org/apache/kafka/clients/MockClient.java 
> 9133d85342b11ba2c9888d4d2804d181831e7a8e 
>   clients/src/test/java/org/apache/kafka/clients/NetworkClientTest.java 
> 43238ceaad0322e39802b615bb805b895336a009 
>   
> clients/src/test/java/org/apache/kafka/clients/producer/internals/BufferPoolTest.java
>  2c693824fa53db1e38766b8c66a0ef42ef9d0f3a 
>   
> clients/src/test/java/org/apache/kafka/clients/producer/internals/RecordAccumulatorTest.java
>  5b2e4ffaeab7127648db608c179703b27b577414 
>   
> clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
>  8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
>   clients/src/test/java/org/apache/kafka/common/network/SelectorTest.java 
> 158f9829ff64a969008f699e40c51e918287859e 
>   core/src/main/scala/kafka/tools/ProducerPerformance.scala 
> 0335cc64013ffe2cdf1c4879e86e11ec8c526712 
>   core/src/test/scala/integration/kafka/api/ProducerFailureHandlingTest.scala 
> ee94011894b46864614b97bbd2a98375a7d3f20b 
>   core/src/test/scala/unit/kafka/utils/TestUtils.scala 
> eb169d8b33c27d598cc24e5a2e5f78b789fa38d3 
> 
> Diff: https://reviews.apache.org/r/36858/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Mayuresh Gharat
> 
>



[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-08-11 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692289#comment-14692289
 ] 

Guozhang Wang commented on KAFKA-1778:
--

Chiming in late here. I think we are actually discussing two different, though 
somewhat overlapping, issues:

1. When a controller is in a bad state but not resigning, or when we just want to 
move the controller programmatically (i.e. not by deleting the znode or bouncing the 
broker), we want to trigger a re-election, and potentially force a specific broker 
to become the new controller during the re-election, so that the whole cluster can 
keep moving without losing a broker.

2. For load-isolation scenarios, we want to start a broker while indicating whether 
or not it is a controller candidate. Controller elections would then only be 
triggered among the candidates.

As the JIRA title suggests, I think we are targeting the first issue, for which the 
motivation is mainly operational convenience; hence the solution for the second 
issue may not really be preferred, since it still does not allow SREs to trigger a 
new election ([~charmalloc], correct me if I am wrong). 

> Create new re-elect controller admin function
> -
>
> Key: KAFKA-1778
> URL: https://issues.apache.org/jira/browse/KAFKA-1778
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Abhishek Nigam
> Fix For: 0.8.3
>
>
> kafka --controller --elect



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1683) Implement a "session" concept in the socket server

2015-08-11 Thread Parth Brahmbhatt (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692317#comment-14692317
 ] 

Parth Brahmbhatt commented on KAFKA-1683:
-

[~eugenstud] I believe this patch will set the foundation for authorization. It 
introduces the concept of a session, where the session captures the identity of the 
client so the authorization layer can use that identity to authorize against some 
ACL store. The Authorizer itself is being reviewed as part of KAFKA-2210.

I am not sure what you mean by "as different users". Can you elaborate? 

> Implement a "session" concept in the socket server
> --
>
> Key: KAFKA-1683
> URL: https://issues.apache.org/jira/browse/KAFKA-1683
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 0.9.0
>Reporter: Jay Kreps
>Assignee: Gwen Shapira
> Fix For: 0.8.3
>
> Attachments: KAFKA-1683.patch, KAFKA-1683.patch
>
>
> To implement authentication we need a way to keep track of some things 
> between requests. The initial use for this would be remembering the 
> authenticated user/principle info, but likely more uses would come up (for 
> example we will also need to remember whether and which encryption or 
> integrity measures are in place on the socket so we can wrap and unwrap 
> writes and reads).
> I was thinking we could just add a Session object that might have a user 
> field. The session object would need to get added to RequestChannel.Request 
> so it is passed down to the API layer with each request.
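
To make the proposal above concrete, here is a minimal sketch of what such a session object
could look like; the field names are illustrative assumptions, not the actual patch:

{code}
// Illustrative sketch only: a per-connection session carrying the authenticated
// principal (and, later, any negotiated security properties) so it can be attached
// to each request handed to the API layer and consulted by an authorizer.
public final class Session {
    private final String principal;     // authenticated user identity
    private final String clientAddress; // remote address of the connection

    public Session(String principal, String clientAddress) {
        this.principal = principal;
        this.clientAddress = clientAddress;
    }

    public String principal() { return principal; }
    public String clientAddress() { return clientAddress; }
}
{code}

Conceptually, the socket server would create one of these when a connection is authenticated
and attach it to every RequestChannel.Request it produces.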



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-2363) ProducerSendTest.testCloseWithZeroTimeoutFromCallerThread Transient Failure

2015-08-11 Thread Ben Stopford (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Stopford reassigned KAFKA-2363:
---

Assignee: Ben Stopford

> ProducerSendTest.testCloseWithZeroTimeoutFromCallerThread Transient Failure
> ---
>
> Key: KAFKA-2363
> URL: https://issues.apache.org/jira/browse/KAFKA-2363
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Fangmin Lv
>Assignee: Ben Stopford
>  Labels: newbie
> Fix For: 0.9.0
>
>
> {code}
> kafka.api.ProducerSendTest > testCloseWithZeroTimeoutFromCallerThread 
> STANDARD_OUT
> [2015-07-24 23:13:05,148] WARN fsync-ing the write ahead log in SyncThread:0 
> took 1084ms which will adversely effect operation latency. See the ZooKeeper 
> troubleshooting guide (org.apache.zookeeper.s
> erver.persistence.FileTxnLog:334)
> kafka.api.ProducerSendTest > testCloseWithZeroTimeoutFromCallerThread FAILED
> java.lang.AssertionError: No request is complete.
> at org.junit.Assert.fail(Assert.java:92)
> at org.junit.Assert.assertTrue(Assert.java:44)
> at 
> kafka.api.ProducerSendTest$$anonfun$testCloseWithZeroTimeoutFromCallerThread$1.apply$mcVI$sp(ProducerSendTest.scala:343)
> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
> at 
> kafka.api.ProducerSendTest.testCloseWithZeroTimeoutFromCallerThread(ProducerSendTest.scala:340)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 36548: Patch for KAFKA-2336

2015-08-11 Thread Gwen Shapira

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36548/#review95012
---

Ship it!


Ship It!

- Gwen Shapira


On Aug. 11, 2015, 3:37 p.m., Grant Henke wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36548/
> ---
> 
> (Updated Aug. 11, 2015, 3:37 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2336
> https://issues.apache.org/jira/browse/KAFKA-2336
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-2336: Changing offsets.topic.num.partitions after the offset topic is 
> created breaks consumer group partition assignment
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 47b6ce93da320a565435b4a7916a0c4371143b8a 
> 
> Diff: https://reviews.apache.org/r/36548/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Grant Henke
> 
>



Re: Review Request 36548: Patch for KAFKA-2336

2015-08-11 Thread Gwen Shapira


> On Aug. 11, 2015, 10:08 p.m., Gwen Shapira wrote:
> > Ship It!

Jiangjie, I committed despite your concerns since this patch fixes a huge 
potential issue.

If you have an idea for an improved fix, we can tackle this in a follow-up.


- Gwen


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36548/#review95012
---


On Aug. 11, 2015, 3:37 p.m., Grant Henke wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36548/
> ---
> 
> (Updated Aug. 11, 2015, 3:37 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2336
> https://issues.apache.org/jira/browse/KAFKA-2336
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-2336: Changing offsets.topic.num.partitions after the offset topic is 
> created breaks consumer group partition assignment
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 47b6ce93da320a565435b4a7916a0c4371143b8a 
> 
> Diff: https://reviews.apache.org/r/36548/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Grant Henke
> 
>



[jira] [Updated] (KAFKA-2336) Changing offsets.topic.num.partitions after the offset topic is created breaks consumer group partition assignment

2015-08-11 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira updated KAFKA-2336:

   Resolution: Fixed
Fix Version/s: 0.8.3
   Status: Resolved  (was: Patch Available)

+1 and pushed to trunk.

Thanks for your contribution [~granthenke] and for the review [~becket_qin]!

> Changing offsets.topic.num.partitions after the offset topic is created 
> breaks consumer group partition assignment 
> ---
>
> Key: KAFKA-2336
> URL: https://issues.apache.org/jira/browse/KAFKA-2336
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.2.1
>Reporter: Grant Henke
>Assignee: Grant Henke
> Fix For: 0.8.3
>
> Attachments: KAFKA-2336.patch, KAFKA-2336.patch, 
> KAFKA-2336_2015-07-16_13:04:02.patch, KAFKA-2336_2015-08-11_10:37:41.patch
>
>
> Currently adjusting offsets.topic.num.partitions after the offset topic is 
> created is not supported. Meaning that the number of partitions will not 
> change once the topic has been created.
> However, changing the value in the configuration should not cause issues and 
> instead simply be ignored. Currently this is not the case. 
> When the value of offsets.topic.num.partitions is changed after the offset 
> topic is created the consumer group partition assignment completely changes 
> even though the number of partitions does not change. 
> This is because _kafka.server.OffsetManager.partitionFor(group: String)_ uses 
> the configured value and not the value of the actual topic. 
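
To see why this matters, a minimal sketch of the kind of group-to-partition mapping involved
(an illustrative stand-in, not the actual OffsetManager code):

{code}
// Illustrative sketch only: if the offsets-topic partition for a group is chosen by
// hashing the group name modulo the *configured* partition count, then changing
// offsets.topic.num.partitions remaps every group even though the real topic still
// has its original number of partitions, and consumers look up committed offsets in
// the wrong partition.
public final class OffsetsPartitionSketch {
    static int partitionFor(String group, int configuredPartitionCount) {
        return (group.hashCode() & 0x7fffffff) % configuredPartitionCount;
    }

    public static void main(String[] args) {
        System.out.println(partitionFor("my-group", 50));   // one partition...
        System.out.println(partitionFor("my-group", 100));  // ...likely a different one
    }
}
{code}

Per the description, the fix keys this mapping off the topic as actually created rather than
the configured value, so a configuration change is simply ignored.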



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2410) Implement "Auto Topic Creation" client side and remove support from Broker side

2015-08-11 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692368#comment-14692368
 ] 

Sriharsha Chintalapani commented on KAFKA-2410:
---

[~granthenke] This issue is already addressed in KAFKA-1507, and a patch has been 
available for more than a year now.

> Implement "Auto Topic Creation" client side and remove support from Broker 
> side
> ---
>
> Key: KAFKA-2410
> URL: https://issues.apache.org/jira/browse/KAFKA-2410
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, core
>Affects Versions: 0.8.2.1
>Reporter: Grant Henke
>Assignee: Grant Henke
>
> Auto topic creation on the broker has caused pain in the past; And today it 
> still causes unusual error handling requirements on the client side, added 
> complexity in the broker, mixed responsibility of the TopicMetadataRequest, 
> and limits configuration of the option to be cluster wide. In the future 
> having it broker side will also make features such as authorization very 
> difficult. 
> There have been discussions in the past of implementing this feature client 
> side. 
> [example|https://issues.apache.org/jira/browse/KAFKA-689?focusedCommentId=13548746&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13548746]
> This Jira is to track that discussion and implementation once the necessary 
> protocol support exists: KAFKA-2229



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-08-11 Thread Abhishek Nigam (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692370#comment-14692370
 ] 

Abhishek Nigam commented on KAFKA-1778:
---

Hi Guozhang,
I agree 100% with you. Can you tell me the best way to move forward
on this on the open-source side?

-Abhishek

On Tue, Aug 11, 2015 at 2:30 PM, Guozhang Wang (JIRA) 



> Create new re-elect controller admin function
> -
>
> Key: KAFKA-1778
> URL: https://issues.apache.org/jira/browse/KAFKA-1778
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Abhishek Nigam
> Fix For: 0.8.3
>
>
> kafka --controller --elect



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1507) Using GetOffsetShell against non-existent topic creates the topic unintentionally

2015-08-11 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692373#comment-14692373
 ] 

Sriharsha Chintalapani commented on KAFKA-1507:
---

[~jkreps] Since there is interest in the community in moving topic creation to the 
client side (specifically the producer side), can this patch be reviewed? There are 
also other JIRAs filed, e.g. https://issues.apache.org/jira/browse/KAFKA-2410, asking 
for the same feature addressed in the patch here. There is obviously the bigger JIRA 
to add create-topic requests, https://issues.apache.org/jira/browse/KAFKA-2229, but I 
am not sure whether this needs to be blocked by that. If there is interest, then I 
can upmerge my patch.

> Using GetOffsetShell against non-existent topic creates the topic 
> unintentionally
> -
>
> Key: KAFKA-1507
> URL: https://issues.apache.org/jira/browse/KAFKA-1507
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.1
> Environment: centos
>Reporter: Luke Forehand
>Assignee: Sriharsha Chintalapani
>Priority: Minor
>  Labels: newbie
> Attachments: KAFKA-1507.patch, KAFKA-1507.patch, 
> KAFKA-1507_2014-07-22_10:27:45.patch, KAFKA-1507_2014-07-23_17:07:20.patch, 
> KAFKA-1507_2014-08-12_18:09:06.patch, KAFKA-1507_2014-08-22_11:06:38.patch, 
> KAFKA-1507_2014-08-22_11:08:51.patch
>
>
> A typo in using GetOffsetShell command can cause a
> topic to be created which cannot be deleted (because deletion is still in
> progress)
> ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list
> kafka10:9092,kafka11:9092,kafka12:9092,kafka13:9092 --topic typo --time 1
> ./kafka-topics.sh --zookeeper stormqa1/kafka-prod --describe --topic typo
> Topic:typo  PartitionCount:8ReplicationFactor:1 Configs:
>  Topic: typo Partition: 0Leader: 10  Replicas: 10
>   Isr: 10
> ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2367) Add Copycat runtime data API

2015-08-11 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692375#comment-14692375
 ] 

Gwen Shapira commented on KAFKA-2367:
-

I would prefer to use Avro as the internal Data API, rather than invent a new 
one.
Avro is 90% of the way to what we need, it will seamlessly integrate with the 
fairly common use case of Avro-in-Kafka, and it can serialize to JSON if people are 
interested; and because it is an internal format, we are not forcing users into 
using Avro.

Avro has very good backward compatibility guarantees, so adding it as a CopyCat 
dependency is fairly safe, and IMO better than the alternatives.
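
For reference, the kind of schema-plus-record construction being referred to, using Avro's
generic API (requires the org.apache.avro dependency; shown only as an illustration of the
data model, not as a Copycat decision):

{code}
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

// Illustration only: Avro already provides a runtime schema + record model of the
// kind the Copycat data API needs (named fields, primitive and nested types).
public final class AvroRecordSketch {
    public static void main(String[] args) {
        String schemaJson = "{\"type\":\"record\",\"name\":\"PageView\",\"fields\":["
                + "{\"name\":\"url\",\"type\":\"string\"},"
                + "{\"name\":\"count\",\"type\":\"long\"}]}";
        Schema schema = new Schema.Parser().parse(schemaJson);

        GenericRecord record = new GenericData.Record(schema);
        record.put("url", "http://example.com");
        record.put("count", 42L);
        System.out.println(record); // prints a JSON-like rendering of the record
    }
}
{code}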

> Add Copycat runtime data API
> 
>
> Key: KAFKA-2367
> URL: https://issues.apache.org/jira/browse/KAFKA-2367
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.8.3
>
>
> Design the API used for runtime data in Copycat. This API is used to 
> construct schemas and records that Copycat processes. This needs to be a 
> fairly general data model (think Avro, JSON, Protobufs, Thrift) in order to 
> support complex, varied data types that may be input from/output to many data 
> systems.
> This should issue should also address the serialization interfaces used 
> within Copycat, which translate the runtime data into serialized byte[] form. 
> It is important that these be considered together because the data format can 
> be used in multiple ways (records, partition IDs, partition offsets), so it 
> and the corresponding serializers must be sufficient for all these use cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-2408 ConsoleConsumerService direct log o...

2015-08-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/123


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-2408) (new) system tests: ConsoleConsumerService occasionally fails to register consumed message

2015-08-11 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-2408.
--
Resolution: Fixed

Issue resolved by pull request 123
[https://github.com/apache/kafka/pull/123]

> (new) system tests: ConsoleConsumerService occasionally fails to register 
> consumed message
> --
>
> Key: KAFKA-2408
> URL: https://issues.apache.org/jira/browse/KAFKA-2408
> Project: Kafka
>  Issue Type: Bug
>Reporter: Geoffrey Anderson
>Assignee: Geoffrey Anderson
> Fix For: 0.8.3
>
>
> There have been a few spurious failures in ReplicationTest.test_hard_bounce, 
> where it was reported that a few of the acked messages were not consumed.
> Checking the logs, however, it is clear that they were consumed, but 
> ConsoleConsumerService failed to parse.
> Lines causing parsing failure looks something like:
> 779725[2015-08-03 07:25:47,757] ERROR 
> [ConsumerFetcherThread-console-consumer-78957_ip-172-31-5-20-1438586715191-249db71c-0-1],
>  Error for partition [test_topic,0] to broker 1:class 
> kafka.common.NotLeaderForPartitionException 
> (kafka.consumer.ConsumerFetcherThread)
> (i.e. the consumed message, and a log message appear on the same line)
> ConsoleConsumerService simply tries to strip each line of whitespace and 
> parse as an integer, which will clearly fail in this case.
> Solution should either redirect stderr elsewhere or update parsing to handle 
> this.
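
A minimal sketch of the "update parsing" option (the real service is Python-based; this just
illustrates the filtering idea, with hypothetical names):

{code}
import java.util.OptionalLong;

// Illustrative sketch only: accept a line as a consumed message only if the entire
// trimmed line is an integer, so interleaved log output (such as the fetcher ERROR
// line quoted above) is skipped rather than breaking the parser.
public final class ConsumedLineParserSketch {
    static OptionalLong parseConsumedValue(String line) {
        String trimmed = line.trim();
        return trimmed.matches("\\d+")
                ? OptionalLong.of(Long.parseLong(trimmed))
                : OptionalLong.empty();
    }

    public static void main(String[] args) {
        System.out.println(parseConsumedValue("  42  "));                              // OptionalLong[42]
        System.out.println(parseConsumedValue("779725[2015-08-03 07:25:47,757] ERROR ...")); // empty
    }
}
{code}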



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2408) (new) system tests: ConsoleConsumerService occasionally fails to register consumed message

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692379#comment-14692379
 ] 

ASF GitHub Bot commented on KAFKA-2408:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/123


> (new) system tests: ConsoleConsumerService occasionally fails to register 
> consumed message
> --
>
> Key: KAFKA-2408
> URL: https://issues.apache.org/jira/browse/KAFKA-2408
> Project: Kafka
>  Issue Type: Bug
>Reporter: Geoffrey Anderson
>Assignee: Geoffrey Anderson
> Fix For: 0.8.3
>
>
> There have been a few spurious failures in ReplicationTest.test_hard_bounce, 
> where it was reported that a few of the acked messages were not consumed.
> Checking the logs, however, it is clear that they were consumed, but 
> ConsoleConsumerService failed to parse.
> Lines causing parsing failure looks something like:
> 779725[2015-08-03 07:25:47,757] ERROR 
> [ConsumerFetcherThread-console-consumer-78957_ip-172-31-5-20-1438586715191-249db71c-0-1],
>  Error for partition [test_topic,0] to broker 1:class 
> kafka.common.NotLeaderForPartitionException 
> (kafka.consumer.ConsumerFetcherThread)
> (i.e. the consumed message, and a log message appear on the same line)
> ConsoleConsumerService simply tries to strip each line of whitespace and 
> parse as an integer, which will clearly fail in this case.
> Solution should either redirect stderr elsewhere or update parsing to handle 
> this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-08-11 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692387#comment-14692387
 ] 

Guozhang Wang commented on KAFKA-1778:
--

Could you summarize your proposal from your 27/May/15 comment, so that people can 
then discuss safety in corner cases and efficiency? [~junrao] [~jjkoshy] 
[~charmalloc]

> Create new re-elect controller admin function
> -
>
> Key: KAFKA-1778
> URL: https://issues.apache.org/jira/browse/KAFKA-1778
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Abhishek Nigam
> Fix For: 0.8.3
>
>
> kafka --controller --elect



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2410) Implement "Auto Topic Creation" client side and remove support from Broker side

2015-08-11 Thread Grant Henke (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692400#comment-14692400
 ] 

Grant Henke commented on KAFKA-2410:


Great! Though I am concerned it overlaps with some of the work in 
[KIP-4|https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations]
 (KAFKA-2229). Does the patch still apply today? Can it be updated and 
reviewed? Perhaps a side discussion on the dev mailing list is appropriate.

> Implement "Auto Topic Creation" client side and remove support from Broker 
> side
> ---
>
> Key: KAFKA-2410
> URL: https://issues.apache.org/jira/browse/KAFKA-2410
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, core
>Affects Versions: 0.8.2.1
>Reporter: Grant Henke
>Assignee: Grant Henke
>
> Auto topic creation on the broker has caused pain in the past; And today it 
> still causes unusual error handling requirements on the client side, added 
> complexity in the broker, mixed responsibility of the TopicMetadataRequest, 
> and limits configuration of the option to be cluster wide. In the future 
> having it broker side will also make features such as authorization very 
> difficult. 
> There have been discussions in the past of implementing this feature client 
> side. 
> [example|https://issues.apache.org/jira/browse/KAFKA-689?focusedCommentId=13548746&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13548746]
> This Jira is to track that discussion and implementation once the necessary 
> protocol support exists: KAFKA-2229



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: KIP Meeting Notes 08/11/2015

2015-08-11 Thread Guozhang Wang
Good question.

I can personally think of pros and cons of having a volunteer list. Most of
them are pros, but one con is that the list will never be comprehensive, and
in that sense it may discourage people from assigning themselves as
reviewers.

Without such a list, contributors would most likely assign as reviewers people
they have seen review before or people they know of (i.e. a committer, most of
the time). But we could try to encourage people to re-assign review roles to
whoever they think would be comfortable doing so (maybe they have contributed
multiple patches to that module, have participated in discussions on that
topic, or are known to have the background, etc.), while at the same time
encouraging people to (re-)assign the reviewer role to themselves, and hope
that over time more people come to be seen as the "reviewers to go to". This
may also help the community grow committers.

Thoughts?

Guozhang

On Tue, Aug 11, 2015 at 1:50 PM, Grant Henke  wrote:

> >
> > 2. Encourage contributors to set the "reviewer" field when change JIRA
> > status to "patch available", and encourage volunteers assigning
> themselves
> > to "reviewers" for pending tickets.
>
>
> Is there somewhere that describes whom to pick as a reviewer based on the
> patch? Would it be worth listing volunteer reviewers in a similar location?
>
> On Tue, Aug 11, 2015 at 2:14 PM, Guozhang Wang  wrote:
>
> > First of all, WebEx seems working! And we will upload the recorded video
> > later.
> >
> > Quick summary:
> >
> > KIP-26: RP-99 (https://github.com/apache/kafka/pull/99) pending for
> > reviews.
> >
> > KIP-28: RP-130 (https://github.com/apache/kafka/pull/130) looking for
> > feedbacks on:
> >
> > 1. API design (see o.k.a.stream.examples).
> > 2. Architecture design (see KIP wiki page)
> > 3. Packaging options.
> >
> > KIP-29: we will do a quick fix for unblocking production issues with
> > hard-coded interval values, while at the same time keep the KIP open for
> > further discussions about end state configurations.
> >
> > KIP-4: KAFKA-1695 / 2210 pending for reviews.
> >
> > Review Backlog Management:
> >
> > 1. Remind people to change JIRA status as "patch available" when they
> > contribute the patch, and change the status back to "in progress" after
> it
> > is reviewed, as indicated in:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes
> >
> > 2. Encourage contributors to set the "reviewer" field when change JIRA
> > status to "patch available", and encourage volunteers assigning
> themselves
> > to "reviewers" for pending tickets.
> >
> > -- Guozhang
> >
>
>
>
> --
> Grant Henke
> Software Engineer | Cloudera
> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>



-- 
-- Guozhang


[jira] [Commented] (KAFKA-1387) Kafka getting stuck creating ephemeral node it has already created when two zookeeper sessions are established in a very short period of time

2015-08-11 Thread James Lent (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692595#comment-14692595
 ] 

James Lent commented on KAFKA-1387:
---

It has been a while since I investigated this issue. I will take another look 
at it tomorrow and get back to you. 

Sent from my iPhone



> Kafka getting stuck creating ephemeral node it has already created when two 
> zookeeper sessions are established in a very short period of time
> -
>
> Key: KAFKA-1387
> URL: https://issues.apache.org/jira/browse/KAFKA-1387
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.1
>Reporter: Fedor Korotkiy
>Priority: Blocker
>  Labels: newbie, patch, zkclient-problems
> Attachments: kafka-1387.patch
>
>
> Kafka broker re-registers itself in zookeeper every time handleNewSession() 
> callback is invoked.
> https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaHealthcheck.scala
>  
> Now imagine the following sequence of events.
> 1) Zookeeper session reestablishes. handleNewSession() callback is queued by 
> the zkClient, but not invoked yet.
> 2) Zookeeper session reestablishes again, queueing callback second time.
> 3) First callback is invoked, creating /broker/[id] ephemeral path.
> 4) Second callback is invoked and it tries to create /broker/[id] path using 
> createEphemeralPathExpectConflictHandleZKBug() function. But the path is 
> already exists, so createEphemeralPathExpectConflictHandleZKBug() is getting 
> stuck in the infinite loop.
> Seems like controller election code have the same issue.
> I'am able to reproduce this issue on the 0.8.1 branch from github using the 
> following configs.
> # zookeeper
> tickTime=10
> dataDir=/tmp/zk/
> clientPort=2101
> maxClientCnxns=0
> # kafka
> broker.id=1
> log.dir=/tmp/kafka
> zookeeper.connect=localhost:2101
> zookeeper.connection.timeout.ms=100
> zookeeper.sessiontimeout.ms=100
> Just start kafka and zookeeper and then pause zookeeper several times using 
> Ctrl-Z.
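
A self-contained sketch of the stuck loop described above (a simulation, not the actual
KafkaHealthcheck/ZkUtils code): the second queued callback finds the ephemeral node already
created by its own live session, assumes it belongs to an expired session, and retries forever
waiting for it to disappear.

{code}
import java.util.HashMap;
import java.util.Map;

// Simulation only: two handleNewSession() callbacks from the same live session both
// try to (re)create the broker's ephemeral node. The second sees "node exists" and
// keeps retrying, but the node never expires because its owner session is still alive.
public final class EphemeralRaceSketch {
    static final Map<String, Long> znodes = new HashMap<>(); // path -> owner session id

    static void register(String path, long sessionId) throws InterruptedException {
        for (int attempt = 1; ; attempt++) {
            if (!znodes.containsKey(path)) {
                znodes.put(path, sessionId);
                System.out.println("registered " + path + " under session " + sessionId);
                return;
            }
            // In the real code this retry loop has no exit when the existing node was
            // written by our own session; cap it here so the demo terminates.
            if (attempt > 3) {
                System.out.println("stuck: " + path + " owned by live session "
                        + znodes.get(path) + ", which will never expire");
                return;
            }
            Thread.sleep(10); // back off, hoping the "old" ephemeral node goes away
        }
    }

    public static void main(String[] args) throws InterruptedException {
        long session = 42L;
        register("/brokers/ids/1", session); // first queued callback succeeds
        register("/brokers/ids/1", session); // second queued callback gets stuck
    }
}
{code}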



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-08-11 Thread Abhishek Nigam (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692628#comment-14692628
 ] 

Abhishek Nigam commented on KAFKA-1778:
---

Thanks Guozhang,
I will write it up in a nice proposal.

-Abhishek

On Tue, Aug 11, 2015 at 3:28 PM, Guozhang Wang (JIRA) 



> Create new re-elect controller admin function
> -
>
> Key: KAFKA-1778
> URL: https://issues.apache.org/jira/browse/KAFKA-1778
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Abhishek Nigam
> Fix For: 0.8.3
>
>
> kafka --controller --elect



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Kafka Indentation

2015-08-11 Thread Aditya Auradkar
Bump. Anyone else have an opinion?

Neha/Jay - You've made your thoughts clear. Any thoughts on how/if we make
any changes?

Thanks,
Aditya


On Fri, Jul 24, 2015 at 10:32 AM, Aditya Auradkar 
wrote:

> I'm with Neha on this one. I don't have a strong preference on 2 vs 4 but
> I do think that consistency is more important. It makes writing code a bit
> easier especially since patches are increasingly likely to touch both Java
> and Scala code and it's nice to not think about formatting certain files
> differently from others.
>
> Aditya
>
> On Fri, Jul 24, 2015 at 9:45 AM, Jay Kreps  wrote:
>
>> Ismael,
>>
>> Makes sense. I think there is a good chance that it is just our ignorance
>> of scala tools. I really do like having compile time enforced formatting
>> and dependency checking as we have for java. But we really put no effort
>> into trying to improve the scala developer experience so it may be an
>> unfair comparison.
>>
>> -Jay
>>
>> On Fri, Jul 24, 2015 at 8:07 AM, Ismael Juma  wrote:
>>
>> > On Fri, Jul 24, 2015 at 2:00 AM, Jay Kreps  wrote:
>> >
>> > > I do agree that working with a mixture of scala and java is a pain in
>> the
>> > > butt. What about considering the more extreme idea of just moving the
>> > > remaining server-side scala into java? I like Scala, but the tooling
>> and
>> > > compatibility story for java is better, and Java 8 addressed some of
>> the
>> > > gaps. For a system like Kafka I do kind of think that what Scala
>> offers
>> > is
>> > > less useful, and the kind of boring Java tooling like IDE support,
>> > > findbugs, checkstyle, simple exception stack traces, and a good
>> > > compatability story is more important.
>> >
>> >
>> > I can certainly see the case for avoiding the complexity of two
>> different
>> > languages (assuming that the benefits are not worth it). However, I am
>> not
>> > sure about the "findbugs, checkstyle" point. Static checking is an area
>> > that Scala does quite well (better than Java in many ways): scalastyle,
>> > abide, scalariform, wartremover, scapegoat, etc. And Scala 2.11 also
>> has a
>> > number of Xlint warnings.
>> >
>> > Best,
>> > Ismael
>> >
>>
>
>


Re: Kafka Indentation

2015-08-11 Thread Jason Gustafson
Can the java code be indented without affecting the results of git blame?
If not, then I'd vote to leave it as it is.

(Also +1 on rewriting Kafka in Java)

-Jason

On Tue, Aug 11, 2015 at 5:15 PM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> Bump. Anyone else have an opinion?
>
> Neha/Jay - You've made your thoughts clear. Any thoughts on how/if we make
> any changes?
>
> Thanks,
> Aditya
>
>
> On Fri, Jul 24, 2015 at 10:32 AM, Aditya Auradkar 
> wrote:
>
> > I'm with Neha on this one. I don't have a strong preference on 2 vs 4 but
> > I do think that consistency is more important. It makes writing code a
> bit
> > easier especially since patches are increasingly likely to touch both
> Java
> > and Scala code and it's nice to not think about formatting certain files
> > differently from others.
> >
> > Aditya
> >
> > On Fri, Jul 24, 2015 at 9:45 AM, Jay Kreps  wrote:
> >
> >> Ismael,
> >>
> >> Makes sense. I think there is a good chance that it is just our
> ignorance
> >> of scala tools. I really do like having compile time enforced formatting
> >> and dependency checking as we have for java. But we really put no effort
> >> into trying to improve the scala developer experience so it may be an
> >> unfair comparison.
> >>
> >> -Jay
> >>
> >> On Fri, Jul 24, 2015 at 8:07 AM, Ismael Juma  wrote:
> >>
> >> > On Fri, Jul 24, 2015 at 2:00 AM, Jay Kreps  wrote:
> >> >
> >> > > I do agree that working with a mixture of scala and java is a pain
> in
> >> the
> >> > > butt. What about considering the more extreme idea of just moving
> the
> >> > > remaining server-side scala into java? I like Scala, but the tooling
> >> and
> >> > > compatibility story for java is better, and Java 8 addressed some of
> >> the
> >> > > gaps. For a system like Kafka I do kind of think that what Scala
> >> offers
> >> > is
> >> > > less useful, and the kind of boring Java tooling like IDE support,
> >> > > findbugs, checkstyle, simple exception stack traces, and a good
> >> > > compatability story is more important.
> >> >
> >> >
> >> > I can certainly see the case for avoiding the complexity of two
> >> different
> >> > languages (assuming that the benefits are not worth it). However, I am
> >> not
> >> > sure about the "findbugs, checkstyle" point. Static checking is an
> area
> >> > that Scala does quite well (better than Java in many ways):
> scalastyle,
> >> > abide, scalariform, wartremover, scapegoat, etc. And Scala 2.11 also
> >> has a
> >> > number of Xlint warnings.
> >> >
> >> > Best,
> >> > Ismael
> >> >
> >>
> >
> >
>


Re: Kafka Indentation

2015-08-11 Thread Mayuresh Gharat
+1 on consistency.

Thanks,

Mayuresh

On Tue, Aug 11, 2015 at 5:15 PM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> Bump. Anyone else have an opinion?
>
> Neha/Jay - You've made your thoughts clear. Any thoughts on how/if we make
> any changes?
>
> Thanks,
> Aditya
>
>
> On Fri, Jul 24, 2015 at 10:32 AM, Aditya Auradkar 
> wrote:
>
> > I'm with Neha on this one. I don't have a strong preference on 2 vs 4 but
> > I do think that consistency is more important. It makes writing code a
> bit
> > easier especially since patches are increasingly likely to touch both
> Java
> > and Scala code and it's nice to not think about formatting certain files
> > differently from others.
> >
> > Aditya
> >
> > On Fri, Jul 24, 2015 at 9:45 AM, Jay Kreps  wrote:
> >
> >> Ismael,
> >>
> >> Makes sense. I think there is a good chance that it is just our
> ignorance
> >> of scala tools. I really do like having compile time enforced formatting
> >> and dependency checking as we have for java. But we really put no effort
> >> into trying to improve the scala developer experience so it may be an
> >> unfair comparison.
> >>
> >> -Jay
> >>
> >> On Fri, Jul 24, 2015 at 8:07 AM, Ismael Juma  wrote:
> >>
> >> > On Fri, Jul 24, 2015 at 2:00 AM, Jay Kreps  wrote:
> >> >
> >> > > I do agree that working with a mixture of scala and java is a pain
> in
> >> the
> >> > > butt. What about considering the more extreme idea of just moving
> the
> >> > > remaining server-side scala into java? I like Scala, but the tooling
> >> and
> >> > > compatibility story for java is better, and Java 8 addressed some of
> >> the
> >> > > gaps. For a system like Kafka I do kind of think that what Scala
> >> offers
> >> > is
> >> > > less useful, and the kind of boring Java tooling like IDE support,
> >> > > findbugs, checkstyle, simple exception stack traces, and a good
> >> > > compatability story is more important.
> >> >
> >> >
> >> > I can certainly see the case for avoiding the complexity of two
> >> different
> >> > languages (assuming that the benefits are not worth it). However, I am
> >> not
> >> > sure about the "findbugs, checkstyle" point. Static checking is an
> area
> >> > that Scala does quite well (better than Java in many ways):
> scalastyle,
> >> > abide, scalariform, wartremover, scapegoat, etc. And Scala 2.11 also
> >> has a
> >> > number of Xlint warnings.
> >> >
> >> > Best,
> >> > Ismael
> >> >
> >>
> >
> >
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: Kafka Indentation

2015-08-11 Thread Mayuresh Gharat
Missed it. +1 on rewriting Kafka in Java.

Thanks,

Mayuresh

On Tue, Aug 11, 2015 at 5:23 PM, Jason Gustafson  wrote:

> Can the java code be indented without affecting the results of git blame?
> If not, then I'd vote to leave it as it is.
>
> (Also +1 on rewriting Kafka in Java)
>
> -Jason
>
> On Tue, Aug 11, 2015 at 5:15 PM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > Bump. Anyone else have an opinion?
> >
> > Neha/Jay - You've made your thoughts clear. Any thoughts on how/if we
> make
> > any changes?
> >
> > Thanks,
> > Aditya
> >
> >
> > On Fri, Jul 24, 2015 at 10:32 AM, Aditya Auradkar <
> aaurad...@linkedin.com>
> > wrote:
> >
> > > I'm with Neha on this one. I don't have a strong preference on 2 vs 4
> but
> > > I do think that consistency is more important. It makes writing code a
> > bit
> > > easier especially since patches are increasingly likely to touch both
> > Java
> > > and Scala code and it's nice to not think about formatting certain
> files
> > > differently from others.
> > >
> > > Aditya
> > >
> > > On Fri, Jul 24, 2015 at 9:45 AM, Jay Kreps  wrote:
> > >
> > >> Ismael,
> > >>
> > >> Makes sense. I think there is a good chance that it is just our
> > ignorance
> > >> of scala tools. I really do like having compile time enforced
> formatting
> > >> and dependency checking as we have for java. But we really put no
> effort
> > >> into trying to improve the scala developer experience so it may be an
> > >> unfair comparison.
> > >>
> > >> -Jay
> > >>
> > >> On Fri, Jul 24, 2015 at 8:07 AM, Ismael Juma 
> wrote:
> > >>
> > >> > On Fri, Jul 24, 2015 at 2:00 AM, Jay Kreps 
> wrote:
> > >> >
> > >> > > I do agree that working with a mixture of scala and java is a pain
> > in
> > >> the
> > >> > > butt. What about considering the more extreme idea of just moving
> > the
> > >> > > remaining server-side scala into java? I like Scala, but the
> tooling
> > >> and
> > >> > > compatibility story for java is better, and Java 8 addressed some
> of
> > >> the
> > >> > > gaps. For a system like Kafka I do kind of think that what Scala
> > >> offers
> > >> > is
> > >> > > less useful, and the kind of boring Java tooling like IDE support,
> > >> > > findbugs, checkstyle, simple exception stack traces, and a good
> > >> > > compatability story is more important.
> > >> >
> > >> >
> > >> > I can certainly see the case for avoiding the complexity of two
> > >> different
> > >> > languages (assuming that the benefits are not worth it). However, I
> am
> > >> not
> > >> > sure about the "findbugs, checkstyle" point. Static checking is an
> > area
> > >> > that Scala does quite well (better than Java in many ways):
> > scalastyle,
> > >> > abide, scalariform, wartremover, scapegoat, etc. And Scala 2.11 also
> > >> has a
> > >> > number of Xlint warnings.
> > >> >
> > >> > Best,
> > >> > Ismael
> > >> >
> > >>
> > >
> > >
> >
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: Kafka Indentation

2015-08-11 Thread Gwen Shapira
+1 on not breaking git blame

-1 on rewriting Kafka in Java
+1 on upping our Scala game (as Ismael pointed out)

On Tue, Aug 11, 2015 at 5:23 PM, Jason Gustafson  wrote:

> Can the java code be indented without affecting the results of git blame?
> If not, then I'd vote to leave it as it is.
>
> (Also +1 on rewriting Kafka in Java)
>
> -Jason
>
> On Tue, Aug 11, 2015 at 5:15 PM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > Bump. Anyone else have an opinion?
> >
> > Neha/Jay - You've made your thoughts clear. Any thoughts on how/if we
> make
> > any changes?
> >
> > Thanks,
> > Aditya
> >
> >
> > On Fri, Jul 24, 2015 at 10:32 AM, Aditya Auradkar <
> aaurad...@linkedin.com>
> > wrote:
> >
> > > I'm with Neha on this one. I don't have a strong preference on 2 vs 4
> but
> > > I do think that consistency is more important. It makes writing code a
> > bit
> > > easier especially since patches are increasingly likely to touch both
> > Java
> > > and Scala code and it's nice to not think about formatting certain
> files
> > > differently from others.
> > >
> > > Aditya
> > >
> > > On Fri, Jul 24, 2015 at 9:45 AM, Jay Kreps  wrote:
> > >
> > >> Ismael,
> > >>
> > >> Makes sense. I think there is a good chance that it is just our
> > ignorance
> > >> of scala tools. I really do like having compile time enforced
> formatting
> > >> and dependency checking as we have for java. But we really put no
> effort
> > >> into trying to improve the scala developer experience so it may be an
> > >> unfair comparison.
> > >>
> > >> -Jay
> > >>
> > >> On Fri, Jul 24, 2015 at 8:07 AM, Ismael Juma 
> wrote:
> > >>
> > >> > On Fri, Jul 24, 2015 at 2:00 AM, Jay Kreps 
> wrote:
> > >> >
> > >> > > I do agree that working with a mixture of scala and java is a pain
> > in
> > >> the
> > >> > > butt. What about considering the more extreme idea of just moving
> > the
> > >> > > remaining server-side scala into java? I like Scala, but the
> tooling
> > >> and
> > >> > > compatibility story for java is better, and Java 8 addressed some
> of
> > >> the
> > >> > > gaps. For a system like Kafka I do kind of think that what Scala
> > >> offers
> > >> > is
> > >> > > less useful, and the kind of boring Java tooling like IDE support,
> > >> > > findbugs, checkstyle, simple exception stack traces, and a good
> > >> > > compatability story is more important.
> > >> >
> > >> >
> > >> > I can certainly see the case for avoiding the complexity of two
> > >> different
> > >> > languages (assuming that the benefits are not worth it). However, I
> am
> > >> not
> > >> > sure about the "findbugs, checkstyle" point. Static checking is an
> > area
> > >> > that Scala does quite well (better than Java in many ways):
> > scalastyle,
> > >> > abide, scalariform, wartremover, scapegoat, etc. And Scala 2.11 also
> > >> has a
> > >> > number of Xlint warnings.
> > >> >
> > >> > Best,
> > >> > Ismael
> > >> >
> > >>
> > >
> > >
> >
>


Re: Review Request 33049: Patch for KAFKA-2084

2015-08-11 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33049/#review95033
---


Just a couple of comments below. Otherwise, LGTM.


clients/src/main/java/org/apache/kafka/common/metrics/Sensor.java (lines 131 - 
150)


I think the comment can be simpler. Basically, if O is the observed rate 
and T is the target rate over a window of W, to bring O down to T, we need to 
add a delay of X to W such that O * W / (W + X) = T. Solving for X, we get X = 
W*(O - T)/T.
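
A minimal sketch of that calculation (variable names are illustrative, not the actual Sensor code):

{code}
// Illustrative sketch only: the delay X that brings an observed rate O down to a
// target rate T over a window W satisfies O * W / (W + X) = T, i.e. X = W * (O - T) / T.
public final class ThrottleTimeSketch {
    static long throttleTimeMs(double observedRate, double targetRate, long windowMs) {
        if (observedRate <= targetRate)
            return 0; // no quota violation, no delay
        return Math.round(windowMs * (observedRate - targetRate) / targetRate);
    }

    public static void main(String[] args) {
        // e.g. observed 15 MB/s against a 10 MB/s quota over a 30s metrics window
        System.out.println(throttleTimeMs(15.0, 10.0, 30_000)); // 15000
    }
}
{code}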



clients/src/main/java/org/apache/kafka/common/metrics/Sensor.java (line 153)


Instead of using config.samples() * config.timeWindowMs(), shouldn't we use 
the formula elapsedCurrentWindowMs + elapsedPriorWindowsMs that we used in 
Rate.measure()? We can pass "now" in all the way from record().


- Jun Rao


On Aug. 11, 2015, 4:58 a.m., Aditya Auradkar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33049/
> ---
> 
> (Updated Aug. 11, 2015, 4:58 a.m.)
> 
> 
> Review request for kafka, Joel Koshy and Jun Rao.
> 
> 
> Bugs: KAFKA-2084
> https://issues.apache.org/jira/browse/KAFKA-2084
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Updated patch for quotas. This patch does the following: 
> 1. Add per-client metrics for both producer and consumers 
> 2. Add configuration for quotas 
> 3. Compute delay times in the metrics package and return the delay times in 
> QuotaViolationException 
> 4. Add a DelayQueue in KafkaApi's that can be used to throttle any type of 
> request. Implemented request throttling for produce and fetch requests. 
> 5. Added unit and integration test cases for both producer and consumer
> 6. This doesn't include a system test. There is a separate ticket for that
> 7. Fixed KAFKA-2191 - (Included fix from : 
> https://reviews.apache.org/r/34418/ )
> 
> Addressed comments from Joel and Jun
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/common/metrics/Quota.java 
> d82bb0c055e631425bc1ebbc7d387baac76aeeaa 
>   
> clients/src/main/java/org/apache/kafka/common/metrics/QuotaViolationException.java
>  a451e5385c9eca76b38b425e8ac856b2715fcffe 
>   clients/src/main/java/org/apache/kafka/common/metrics/Sensor.java 
> ca823fd4639523018311b814fde69b6177e73b97 
>   clients/src/main/java/org/apache/kafka/common/metrics/stats/Rate.java 
> 98429da34418f7f1deba1b5e44e2e6025212edb3 
>   clients/src/test/java/org/apache/kafka/common/metrics/MetricsTest.java 
> 544e120594de78c43581a980b1e4087b4fb98ccb 
>   clients/src/test/java/org/apache/kafka/common/utils/MockTime.java  
>   core/src/main/scala/kafka/server/ClientQuotaManager.scala PRE-CREATION 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 7ea509c2c41acc00430c74e025e069a833aac4e7 
>   core/src/main/scala/kafka/server/KafkaConfig.scala 
> dbe170f87331f43e2dc30165080d2cb7dfe5fdbf 
>   core/src/main/scala/kafka/server/KafkaServer.scala 
> 84d4730ac634f9a5bf12a656e422fea03ad72da8 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 795220e7f63d163be90738b4c1a39687b44c1395 
>   core/src/main/scala/kafka/server/ThrottledResponse.scala PRE-CREATION 
>   core/src/main/scala/kafka/utils/ShutdownableThread.scala 
> fc226c863095b7761290292cd8755cd7ad0f155c 
>   core/src/test/scala/integration/kafka/api/QuotasTest.scala PRE-CREATION 
>   core/src/test/scala/unit/kafka/server/ClientQuotaManagerTest.scala 
> PRE-CREATION 
>   core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
> f32d206d3f52f3f9f4d649c213edd7058f4b6150 
>   core/src/test/scala/unit/kafka/server/ThrottledResponseExpirationTest.scala 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/33049/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Aditya Auradkar
> 
>
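
Item 4 of the quoted patch description mentions a DelayQueue used to throttle requests. A
minimal sketch of that mechanism (hypothetical names, not the actual ThrottledResponse/KafkaApis
code): a response whose client violated its quota is parked in the queue and only handed back
once its computed throttle time has elapsed.

{code}
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: park a throttled response until its delay expires.
public final class ThrottledResponseSketch implements Delayed {
    private final long sendAtMs;
    private final String description;

    ThrottledResponseSketch(long delayMs, String description) {
        this.sendAtMs = System.currentTimeMillis() + delayMs;
        this.description = description;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(sendAtMs - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
    }

    public static void main(String[] args) throws InterruptedException {
        DelayQueue<ThrottledResponseSketch> queue = new DelayQueue<>();
        queue.put(new ThrottledResponseSketch(100, "fetch response for a throttled client"));
        ThrottledResponseSketch ready = queue.take(); // blocks until the 100 ms delay expires
        System.out.println("sending: " + ready.description);
    }
}
{code}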



Re: Kafka Indentation

2015-08-11 Thread Ashish Singh
I am also a +1 on not breaking git blame. IDEs support language-specific
settings in the same project.

On Tue, Aug 11, 2015 at 5:29 PM, Gwen Shapira  wrote:

> +1 on not breaking git blame
>
> -1 on rewriting Kafka in Java
> +1 on upping our Scala game (as Ismael pointed out)
>
> On Tue, Aug 11, 2015 at 5:23 PM, Jason Gustafson 
> wrote:
>
> > Can the java code be indented without affecting the results of git blame?
> > If not, then I'd vote to leave it as it is.
> >
> > (Also +1 on rewriting Kafka in Java)
> >
> > -Jason
> >
> > On Tue, Aug 11, 2015 at 5:15 PM, Aditya Auradkar <
> > aaurad...@linkedin.com.invalid> wrote:
> >
> > > Bump. Anyone else have an opinion?
> > >
> > > Neha/Jay - You've made your thoughts clear. Any thoughts on how/if we
> > make
> > > any changes?
> > >
> > > Thanks,
> > > Aditya
> > >
> > >
> > > On Fri, Jul 24, 2015 at 10:32 AM, Aditya Auradkar <
> > aaurad...@linkedin.com>
> > > wrote:
> > >
> > > > I'm with Neha on this one. I don't have a strong preference on 2 vs 4
> > but
> > > > I do think that consistency is more important. It makes writing code
> a
> > > bit
> > > > easier especially since patches are increasingly likely to touch both
> > > Java
> > > > and Scala code and it's nice to not think about formatting certain
> > files
> > > > differently from others.
> > > >
> > > > Aditya
> > > >
> > > > On Fri, Jul 24, 2015 at 9:45 AM, Jay Kreps  wrote:
> > > >
> > > >> Ismael,
> > > >>
> > > >> Makes sense. I think there is a good chance that it is just our
> > > ignorance
> > > >> of scala tools. I really do like having compile time enforced
> > formatting
> > > >> and dependency checking as we have for java. But we really put no
> > effort
> > > >> into trying to improve the scala developer experience so it may be
> an
> > > >> unfair comparison.
> > > >>
> > > >> -Jay
> > > >>
> > > >> On Fri, Jul 24, 2015 at 8:07 AM, Ismael Juma 
> > wrote:
> > > >>
> > > >> > On Fri, Jul 24, 2015 at 2:00 AM, Jay Kreps 
> > wrote:
> > > >> >
> > > >> > > I do agree that working with a mixture of scala and java is a
> pain
> > > in
> > > >> the
> > > >> > > butt. What about considering the more extreme idea of just
> moving
> > > the
> > > >> > > remaining server-side scala into java? I like Scala, but the
> > tooling
> > > >> and
> > > >> > > compatibility story for java is better, and Java 8 addressed
> some
> > of
> > > >> the
> > > >> > > gaps. For a system like Kafka I do kind of think that what Scala
> > > >> offers
> > > >> > is
> > > >> > > less useful, and the kind of boring Java tooling like IDE
> support,
> > > >> > > findbugs, checkstyle, simple exception stack traces, and a good
> > > >> > > compatability story is more important.
> > > >> >
> > > >> >
> > > >> > I can certainly see the case for avoiding the complexity of two
> > > >> different
> > > >> > languages (assuming that the benefits are not worth it). However,
> I
> > am
> > > >> not
> > > >> > sure about the "findbugs, checkstyle" point. Static checking is an
> > > area
> > > >> > that Scala does quite well (better than Java in many ways):
> > > scalastyle,
> > > >> > abide, scalariform, wartremover, scapegoat, etc. And Scala 2.11
> also
> > > >> has a
> > > >> > number of Xlint warnings.
> > > >> >
> > > >> > Best,
> > > >> > Ismael
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>



-- 

Regards,
Ashish


Re: Review Request 33049: Patch for KAFKA-2084

2015-08-11 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33049/#review95035
---



core/src/main/scala/kafka/server/KafkaConfig.scala (line 418)


I am still not sure that I see the value of the delay factor. If one wants 
to be a bit conservative, one can always configure a lower quota value.
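
For intuition only (illustrative numbers, not the exact formula in the patch):
if a client is measured at 12 MB/s against a 10 MB/s quota over a 10 second
window, delaying its response by 10 * (12 - 10) / 10 = 2 seconds brings the
effective rate back down to the quota. A delay factor would scale that 2
seconds up or down, whereas simply lowering the quota to 8 MB/s already yields
a longer delay of 10 * (12 - 8) / 8 = 5 seconds, i.e. a more conservative
setting without an extra knob.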


- Jun Rao


On Aug. 11, 2015, 4:58 a.m., Aditya Auradkar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33049/
> ---
> 
> (Updated Aug. 11, 2015, 4:58 a.m.)
> 
> 
> Review request for kafka, Joel Koshy and Jun Rao.
> 
> 
> Bugs: KAFKA-2084
> https://issues.apache.org/jira/browse/KAFKA-2084
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Updated patch for quotas. This patch does the following: 
> 1. Add per-client metrics for both producer and consumers 
> 2. Add configuration for quotas 
> 3. Compute delay times in the metrics package and return the delay times in 
> QuotaViolationException 
> 4. Add a DelayQueue in KafkaApi's that can be used to throttle any type of 
> request. Implemented request throttling for produce and fetch requests. 
> 5. Added unit and integration test cases for both producer and consumer
> 6. This doesn't include a system test. There is a separate ticket for that
> 7. Fixed KAFKA-2191 - (Included fix from : 
> https://reviews.apache.org/r/34418/ )
> 
> Addressed comments from Joel and Jun
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/common/metrics/Quota.java 
> d82bb0c055e631425bc1ebbc7d387baac76aeeaa 
>   
> clients/src/main/java/org/apache/kafka/common/metrics/QuotaViolationException.java
>  a451e5385c9eca76b38b425e8ac856b2715fcffe 
>   clients/src/main/java/org/apache/kafka/common/metrics/Sensor.java 
> ca823fd4639523018311b814fde69b6177e73b97 
>   clients/src/main/java/org/apache/kafka/common/metrics/stats/Rate.java 
> 98429da34418f7f1deba1b5e44e2e6025212edb3 
>   clients/src/test/java/org/apache/kafka/common/metrics/MetricsTest.java 
> 544e120594de78c43581a980b1e4087b4fb98ccb 
>   clients/src/test/java/org/apache/kafka/common/utils/MockTime.java  
>   core/src/main/scala/kafka/server/ClientQuotaManager.scala PRE-CREATION 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 7ea509c2c41acc00430c74e025e069a833aac4e7 
>   core/src/main/scala/kafka/server/KafkaConfig.scala 
> dbe170f87331f43e2dc30165080d2cb7dfe5fdbf 
>   core/src/main/scala/kafka/server/KafkaServer.scala 
> 84d4730ac634f9a5bf12a656e422fea03ad72da8 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 795220e7f63d163be90738b4c1a39687b44c1395 
>   core/src/main/scala/kafka/server/ThrottledResponse.scala PRE-CREATION 
>   core/src/main/scala/kafka/utils/ShutdownableThread.scala 
> fc226c863095b7761290292cd8755cd7ad0f155c 
>   core/src/test/scala/integration/kafka/api/QuotasTest.scala PRE-CREATION 
>   core/src/test/scala/unit/kafka/server/ClientQuotaManagerTest.scala 
> PRE-CREATION 
>   core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
> f32d206d3f52f3f9f4d649c213edd7058f4b6150 
>   core/src/test/scala/unit/kafka/server/ThrottledResponseExpirationTest.scala 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/33049/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Aditya Auradkar
> 
>



Re: KAFKA-2364 migrate docs from SVN to git

2015-08-11 Thread Ashish Singh
+1 on same repo.

On Tue, Aug 11, 2015 at 12:21 PM, Edward Ribeiro 
wrote:

> +1. As soon as possible, please. :)
>
> On Sat, Aug 8, 2015 at 4:05 PM, Neha Narkhede  wrote:
>
> > +1 on the same repo for code and website. It helps to keep both in sync.
> >
> > On Thu, Aug 6, 2015 at 1:52 PM, Grant Henke  wrote:
> >
> > > +1 for the same repo. The closer docs can be to code the more accurate
> > they
> > > are likely to be. The same way we encourage unit tests for a new
> > > feature/patch. Updating the docs can be the same.
> > >
> > > If we follow Sqoop's process for example, how would small
> > > fixes/adjustments/additions to the live documentation occur without a
> new
> > > release?
> > >
> > > On Thu, Aug 6, 2015 at 3:33 PM, Guozhang Wang 
> > wrote:
> > >
> > > > I am +1 on same repo too. I think keeping one git history of code /
> doc
> > > > change may actually be beneficial for this approach as well.
> > > >
> > > > Guozhang
> > > >
> > > > On Thu, Aug 6, 2015 at 9:16 AM, Gwen Shapira 
> > wrote:
> > > >
> > > > > I prefer same repo for one-commit / lower-barrier benefits.
> > > > >
> > > > > Sqoop has the following process, which decouples documentation
> > changes
> > > > from
> > > > > website changes:
> > > > >
> > > > > 1. Code github repo contains a doc directory, with the
> documentation
> > > > > written and maintained in AsciiDoc. Only one version of the
> > > > documentation,
> > > > > since it is source controlled with the code. (unlike current SVN
> > where
> > > we
> > > > > have directories per version)
> > > > >
> > > > > 2. Build process compiles the AsciiDoc to HTML and PDF
> > > > >
> > > > > 3. When releasing, we post the documentation of the new release to
> > the
> > > > > website
> > > > >
> > > > > Gwen
> > > > >
> > > > > On Thu, Aug 6, 2015 at 12:20 AM, Ismael Juma 
> > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > For reference, here is the previous discussion on moving the
> > website
> > > to
> > > > > > Git:
> > > > > >
> > > > > > http://search-hadoop.com/m/uyzND11JliU1E8QU92
> > > > > >
> > > > > > People were positive to the idea as Jay said. I would like to
> see a
> > > bit
> > > > > of
> > > > > > a discussion around whether the website should be part of the
> same
> > > repo
> > > > > as
> > > > > > the code or not. I'll get the ball rolling.
> > > > > >
> > > > > > Pros for same repo:
> > > > > > * One commit can update the code and website, which means:
> > > > > > ** Lower barrier for updating docs along with relevant code
> changes
> > > > > > ** Easier to require that both are updated at the same time
> > > > > > * More eyeballs on the website changes
> > > > > > * Automatically branched with the relevant code
> > > > > >
> > > > > > Pros for separate repo:
> > > > > > * Potentially simpler for website-only changes (smaller repo,
> less
> > > > > > verification needed)
> > > > > > * Website changes don't "clutter" the code Git history
> > > > > > * No risk of website change affecting the code
> > > > > >
> > > > > > Your thoughts, please.
> > > > > >
> > > > > > Best,
> > > > > > Ismael
> > > > > >
> > > > > > On Fri, Jul 31, 2015 at 6:15 PM, Aseem Bansal <
> > asmbans...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi
> > > > > > >
> > > > > > > When discussing on KAFKA-2364 migrating docs from svn to git
> came
> > > up.
> > > > > > That
> > > > > > > would make contributing to docs much easier. I have contributed
> > to
> > > > > > > groovy/grails via github so I think having mirror on github
> could
> > > be
> > > > > > > useful.
> > > > > > >
> > > > > > > Also I think unless there is some good reason it should be a
> > > separate
> > > > > > repo.
> > > > > > > No need to mix docs and code.
> > > > > > >
> > > > > > > I can try that out.
> > > > > > >
> > > > > > > Thoughts?
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> > >
> > >
> > > --
> > > Grant Henke
> > > Software Engineer | Cloudera
> > > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
> > >
> >
> >
> >
> > --
> > Thanks,
> > Neha
> >
>



-- 

Regards,
Ashish


Re: KAFKA-2364 migrate docs from SVN to git

2015-08-11 Thread Gwen Shapira
The vote opened 5 days ago. I believe we can conclude with 3 binding +1, 3
non-binding +1 and no -1.

Ismael, are you opening a JIRA and migrating? Or are we looking for a
volunteer?

On Tue, Aug 11, 2015 at 5:46 PM, Ashish Singh  wrote:

> +1 on same repo.
>
> On Tue, Aug 11, 2015 at 12:21 PM, Edward Ribeiro  >
> wrote:
>
> > +1. As soon as possible, please. :)
> >
> > On Sat, Aug 8, 2015 at 4:05 PM, Neha Narkhede  wrote:
> >
> > > +1 on the same repo for code and website. It helps to keep both in
> sync.
> > >
> > > On Thu, Aug 6, 2015 at 1:52 PM, Grant Henke 
> wrote:
> > >
> > > > +1 for the same repo. The closer docs can be to code the more
> accurate
> > > they
> > > > are likely to be. The same way we encourage unit tests for a new
> > > > feature/patch. Updating the docs can be the same.
> > > >
> > > > If we follow Sqoop's process for example, how would small
> > > > fixes/adjustments/additions to the live documentation occur without a
> > new
> > > > release?
> > > >
> > > > On Thu, Aug 6, 2015 at 3:33 PM, Guozhang Wang 
> > > wrote:
> > > >
> > > > > I am +1 on same repo too. I think keeping one git history of code /
> > doc
> > > > > change may actually be beneficial for this approach as well.
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Thu, Aug 6, 2015 at 9:16 AM, Gwen Shapira 
> > > wrote:
> > > > >
> > > > > > I prefer same repo for one-commit / lower-barrier benefits.
> > > > > >
> > > > > > Sqoop has the following process, which decouples documentation
> > > changes
> > > > > from
> > > > > > website changes:
> > > > > >
> > > > > > 1. Code github repo contains a doc directory, with the
> > documentation
> > > > > > written and maintained in AsciiDoc. Only one version of the
> > > > > documentation,
> > > > > > since it is source controlled with the code. (unlike current SVN
> > > where
> > > > we
> > > > > > have directories per version)
> > > > > >
> > > > > > 2. Build process compiles the AsciiDoc to HTML and PDF
> > > > > >
> > > > > > 3. When releasing, we post the documentation of the new release
> to
> > > the
> > > > > > website
> > > > > >
> > > > > > Gwen
> > > > > >
> > > > > > On Thu, Aug 6, 2015 at 12:20 AM, Ismael Juma 
> > > > wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > For reference, here is the previous discussion on moving the
> > > website
> > > > to
> > > > > > > Git:
> > > > > > >
> > > > > > > http://search-hadoop.com/m/uyzND11JliU1E8QU92
> > > > > > >
> > > > > > > People were positive to the idea as Jay said. I would like to
> > see a
> > > > bit
> > > > > > of
> > > > > > > a discussion around whether the website should be part of the
> > same
> > > > repo
> > > > > > as
> > > > > > > the code or not. I'll get the ball rolling.
> > > > > > >
> > > > > > > Pros for same repo:
> > > > > > > * One commit can update the code and website, which means:
> > > > > > > ** Lower barrier for updating docs along with relevant code
> > changes
> > > > > > > ** Easier to require that both are updated at the same time
> > > > > > > * More eyeballs on the website changes
> > > > > > > * Automatically branched with the relevant code
> > > > > > >
> > > > > > > Pros for separate repo:
> > > > > > > * Potentially simpler for website-only changes (smaller repo,
> > less
> > > > > > > verification needed)
> > > > > > > * Website changes don't "clutter" the code Git history
> > > > > > > * No risk of website change affecting the code
> > > > > > >
> > > > > > > Your thoughts, please.
> > > > > > >
> > > > > > > Best,
> > > > > > > Ismael
> > > > > > >
> > > > > > > On Fri, Jul 31, 2015 at 6:15 PM, Aseem Bansal <
> > > asmbans...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi
> > > > > > > >
> > > > > > > > When discussing on KAFKA-2364 migrating docs from svn to git
> > came
> > > > up.
> > > > > > > That
> > > > > > > > would make contributing to docs much easier. I have
> contributed
> > > to
> > > > > > > > groovy/grails via github so I think having mirror on github
> > could
> > > > be
> > > > > > > > useful.
> > > > > > > >
> > > > > > > > Also I think unless there is some good reason it should be a
> > > > separate
> > > > > > > repo.
> > > > > > > > No need to mix docs and code.
> > > > > > > >
> > > > > > > > I can try that out.
> > > > > > > >
> > > > > > > > Thoughts?
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Grant Henke
> > > > Software Engineer | Cloudera
> > > > gr...@cloudera.com | twitter.com/gchenke |
> linkedin.com/in/granthenke
> > > >
> > >
> > >
> > >
> > > --
> > > Thanks,
> > > Neha
> > >
> >
>
>
>
> --
>
> Regards,
> Ashish
>


Re: KAFKA-2364 migrate docs from SVN to git

2015-08-11 Thread Gwen Shapira
Ah, there is already a JIRA in the title. Never mind :)

On Tue, Aug 11, 2015 at 5:51 PM, Gwen Shapira  wrote:

> The vote opened 5 days ago. I believe we can conclude with 3 binding +1, 3
> non-binding +1 and no -1.
>
> Ismael, are you opening a JIRA and migrating? Or are we looking for a
> volunteer?
>
> On Tue, Aug 11, 2015 at 5:46 PM, Ashish Singh  wrote:
>
>> +1 on same repo.
>>
>> On Tue, Aug 11, 2015 at 12:21 PM, Edward Ribeiro <
>> edward.ribe...@gmail.com>
>> wrote:
>>
>> > +1. As soon as possible, please. :)
>> >
>> > On Sat, Aug 8, 2015 at 4:05 PM, Neha Narkhede 
>> wrote:
>> >
>> > > +1 on the same repo for code and website. It helps to keep both in
>> sync.
>> > >
>> > > On Thu, Aug 6, 2015 at 1:52 PM, Grant Henke 
>> wrote:
>> > >
>> > > > +1 for the same repo. The closer docs can be to code the more
>> accurate
>> > > they
>> > > > are likely to be. The same way we encourage unit tests for a new
>> > > > feature/patch. Updating the docs can be the same.
>> > > >
>> > > > If we follow Sqoop's process for example, how would small
>> > > > fixes/adjustments/additions to the live documentation occur without
>> a
>> > new
>> > > > release?
>> > > >
>> > > > On Thu, Aug 6, 2015 at 3:33 PM, Guozhang Wang 
>> > > wrote:
>> > > >
>> > > > > I am +1 on same repo too. I think keeping one git history of code
>> /
>> > doc
>> > > > > change may actually be beneficial for this approach as well.
>> > > > >
>> > > > > Guozhang
>> > > > >
>> > > > > On Thu, Aug 6, 2015 at 9:16 AM, Gwen Shapira 
>> > > wrote:
>> > > > >
>> > > > > > I prefer same repo for one-commit / lower-barrier benefits.
>> > > > > >
>> > > > > > Sqoop has the following process, which decouples documentation
>> > > changes
>> > > > > from
>> > > > > > website changes:
>> > > > > >
>> > > > > > 1. Code github repo contains a doc directory, with the
>> > documentation
>> > > > > > written and maintained in AsciiDoc. Only one version of the
>> > > > > documentation,
>> > > > > > since it is source controlled with the code. (unlike current SVN
>> > > where
>> > > > we
>> > > > > > have directories per version)
>> > > > > >
>> > > > > > 2. Build process compiles the AsciiDoc to HTML and PDF
>> > > > > >
>> > > > > > 3. When releasing, we post the documentation of the new release
>> to
>> > > the
>> > > > > > website
>> > > > > >
>> > > > > > Gwen
>> > > > > >
>> > > > > > On Thu, Aug 6, 2015 at 12:20 AM, Ismael Juma > >
>> > > > wrote:
>> > > > > >
>> > > > > > > Hi,
>> > > > > > >
>> > > > > > > For reference, here is the previous discussion on moving the
>> > > website
>> > > > to
>> > > > > > > Git:
>> > > > > > >
>> > > > > > > http://search-hadoop.com/m/uyzND11JliU1E8QU92
>> > > > > > >
>> > > > > > > People were positive to the idea as Jay said. I would like to
>> > see a
>> > > > bit
>> > > > > > of
>> > > > > > > a discussion around whether the website should be part of the
>> > same
>> > > > repo
>> > > > > > as
>> > > > > > > the code or not. I'll get the ball rolling.
>> > > > > > >
>> > > > > > > Pros for same repo:
>> > > > > > > * One commit can update the code and website, which means:
>> > > > > > > ** Lower barrier for updating docs along with relevant code
>> > changes
>> > > > > > > ** Easier to require that both are updated at the same time
>> > > > > > > * More eyeballs on the website changes
>> > > > > > > * Automatically branched with the relevant code
>> > > > > > >
>> > > > > > > Pros for separate repo:
>> > > > > > > * Potentially simpler for website-only changes (smaller repo,
>> > less
>> > > > > > > verification needed)
>> > > > > > > * Website changes don't "clutter" the code Git history
>> > > > > > > * No risk of website change affecting the code
>> > > > > > >
>> > > > > > > Your thoughts, please.
>> > > > > > >
>> > > > > > > Best,
>> > > > > > > Ismael
>> > > > > > >
>> > > > > > > On Fri, Jul 31, 2015 at 6:15 PM, Aseem Bansal <
>> > > asmbans...@gmail.com>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Hi
>> > > > > > > >
>> > > > > > > > When discussing on KAFKA-2364 migrating docs from svn to git
>> > came
>> > > > up.
>> > > > > > > That
>> > > > > > > > would make contributing to docs much easier. I have
>> contributed
>> > > to
>> > > > > > > > groovy/grails via github so I think having mirror on github
>> > could
>> > > > be
>> > > > > > > > useful.
>> > > > > > > >
>> > > > > > > > Also I think unless there is some good reason it should be a
>> > > > separate
>> > > > > > > repo.
>> > > > > > > > No need to mix docs and code.
>> > > > > > > >
>> > > > > > > > I can try that out.
>> > > > > > > >
>> > > > > > > > Thoughts?
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > -- Guozhang
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > Grant Henke
>> > > > Software Engineer | Cloudera
>> > > > gr...@cloudera.com | twitter.com/gchenke |
>> linkedin.com/in/granthenke
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Thanks,
>> > > Neha
>> >

[jira] [Created] (KAFKA-2422) Allow copycat connector plugins to be aliased to simpler names

2015-08-11 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-2422:


 Summary: Allow copycat connector plugins to be aliased to simpler 
names
 Key: KAFKA-2422
 URL: https://issues.apache.org/jira/browse/KAFKA-2422
 Project: Kafka
  Issue Type: Sub-task
  Components: copycat
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
Priority: Minor


Configurations of connectors can get quite verbose when you have to specify the 
full class name, e.g. 

connector.class=org.apache.kafka.copycat.file.FileStreamSinkConnector

It would be nice to allow connector classes to provide shorter aliases, e.g. 
something like "file-sink", to make this config less verbose. Flume does this, 
so we can use it as an example.
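
A hypothetical before/after, just to illustrate the idea (the alias name and
registration mechanism below are made up, not part of this ticket):

    # today
    connector.class=org.apache.kafka.copycat.file.FileStreamSinkConnector
    # with an alias provided by the connector plugin
    connector.class=file-sink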



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2398) Transient test failure for SocketServerTest - Socket closed.

2015-08-11 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692693#comment-14692693
 ] 

Jiangjie Qin commented on KAFKA-2398:
-

[~benstopford] do we have a duplicate ticket for this? If not, maybe we should 
keep it open for tracking.

> Transient test failure for SocketServerTest - Socket closed.
> 
>
> Key: KAFKA-2398
> URL: https://issues.apache.org/jira/browse/KAFKA-2398
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jiangjie Qin
>
> See the following transient test failure for SocketServerTest.
> kafka.network.SocketServerTest > simpleRequest FAILED
> java.net.SocketException: Socket closed
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:579)
> at java.net.Socket.connect(Socket.java:528)
> at java.net.Socket.<init>(Socket.java:425)
> at java.net.Socket.<init>(Socket.java:208)
> at kafka.network.SocketServerTest.connect(SocketServerTest.scala:84)
> at 
> kafka.network.SocketServerTest.simpleRequest(SocketServerTest.scala:94)
> kafka.network.SocketServerTest > tooBigRequestIsRejected FAILED
> java.net.SocketException: Socket closed
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:579)
> at java.net.Socket.connect(Socket.java:528)
> at java.net.Socket.<init>(Socket.java:425)
> at java.net.Socket.<init>(Socket.java:208)
> at kafka.network.SocketServerTest.connect(SocketServerTest.scala:84)
> at 
> kafka.network.SocketServerTest.tooBigRequestIsRejected(SocketServerTest.scala:124)
> kafka.network.SocketServerTest > testSocketsCloseOnShutdown FAILED
> java.net.SocketException: Socket closed
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:579)
> at java.net.Socket.connect(Socket.java:528)
> at java.net.Socket.<init>(Socket.java:425)
> at java.net.Socket.<init>(Socket.java:208)
> at kafka.network.SocketServerTest.connect(SocketServerTest.scala:84)
> at 
> kafka.network.SocketServerTest.testSocketsCloseOnShutdown(SocketServerTest.scala:136)
> kafka.network.SocketServerTest > testMaxConnectionsPerIp FAILED
> java.net.SocketException: Socket closed
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:579)
> at java.net.Socket.connect(Socket.java:528)
> at java.net.Socket.<init>(Socket.java:425)
> at java.net.Socket.<init>(Socket.java:208)
> at kafka.network.SocketServerTest.connect(SocketServerTest.scala:84)
> at 
> kafka.network.SocketServerTest$$anonfun$1.apply(SocketServerTest.scala:170)
> at 
> kafka.network.SocketServerTest$$anonfun$1.apply(SocketServerTest.scala:170)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at scala.collection.immutable.Range.foreach(Range.scala:141)
> at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
> at scala.collection.AbstractTraversable.map(Traversable.scala:105)
> at 
> kafka.network.SocketServerTest.testMaxConnectionsPerIp(SocketServerTest.scala

Re: Kafka Indentation

2015-08-11 Thread Grant Henke
+1 on not breaking blame
-1  on 4 spaces for scala
-1 on rewriting Kafka in Java
+1 on upping our Scala game

so I guess an accumulative of 0 for me ;)


On Tue, Aug 11, 2015 at 7:37 PM, Ashish Singh  wrote:

> I am also a +1 on not breaking git blame. IDEs support language specific
> settings in same project.
>
> On Tue, Aug 11, 2015 at 5:29 PM, Gwen Shapira  wrote:
>
> > +1 on not breaking git blame
> >
> > -1 on rewriting Kafka in Java
> > +1 on upping our Scala game (as Ismael pointed out)
> >
> > On Tue, Aug 11, 2015 at 5:23 PM, Jason Gustafson 
> > wrote:
> >
> > > Can the java code be indented without affecting the results of git
> blame?
> > > If not, then I'd vote to leave it as it is.
> > >
> > > (Also +1 on rewriting Kafka in Java)
> > >
> > > -Jason
> > >
> > > On Tue, Aug 11, 2015 at 5:15 PM, Aditya Auradkar <
> > > aaurad...@linkedin.com.invalid> wrote:
> > >
> > > > Bump. Anyone else have an opinion?
> > > >
> > > > Neha/Jay - You've made your thoughts clear. Any thoughts on how/if we
> > > make
> > > > any changes?
> > > >
> > > > Thanks,
> > > > Aditya
> > > >
> > > >
> > > > On Fri, Jul 24, 2015 at 10:32 AM, Aditya Auradkar <
> > > aaurad...@linkedin.com>
> > > > wrote:
> > > >
> > > > > I'm with Neha on this one. I don't have a strong preference on 2
> vs 4
> > > but
> > > > > I do think that consistency is more important. It makes writing
> code
> > a
> > > > bit
> > > > > easier especially since patches are increasingly likely to touch
> both
> > > > Java
> > > > > and Scala code and it's nice to not think about formatting certain
> > > files
> > > > > differently from others.
> > > > >
> > > > > Aditya
> > > > >
> > > > > On Fri, Jul 24, 2015 at 9:45 AM, Jay Kreps 
> wrote:
> > > > >
> > > > >> Ismael,
> > > > >>
> > > > >> Makes sense. I think there is a good chance that it is just our
> > > > ignorance
> > > > >> of scala tools. I really do like having compile time enforced
> > > formatting
> > > > >> and dependency checking as we have for java. But we really put no
> > > effort
> > > > >> into trying to improve the scala developer experience so it may be
> > an
> > > > >> unfair comparison.
> > > > >>
> > > > >> -Jay
> > > > >>
> > > > >> On Fri, Jul 24, 2015 at 8:07 AM, Ismael Juma 
> > > wrote:
> > > > >>
> > > > >> > On Fri, Jul 24, 2015 at 2:00 AM, Jay Kreps 
> > > wrote:
> > > > >> >
> > > > >> > > I do agree that working with a mixture of scala and java is a
> > pain
> > > > in
> > > > >> the
> > > > >> > > butt. What about considering the more extreme idea of just
> > moving
> > > > the
> > > > >> > > remaining server-side scala into java? I like Scala, but the
> > > tooling
> > > > >> and
> > > > >> > > compatibility story for java is better, and Java 8 addressed
> > some
> > > of
> > > > >> the
> > > > >> > > gaps. For a system like Kafka I do kind of think that what
> Scala
> > > > >> offers
> > > > >> > is
> > > > >> > > less useful, and the kind of boring Java tooling like IDE
> > support,
> > > > >> > > findbugs, checkstyle, simple exception stack traces, and a
> good
> > > > >> > > compatibility story is more important.
> > > > >> >
> > > > >> >
> > > > >> > I can certainly see the case for avoiding the complexity of two
> > > > >> different
> > > > >> > languages (assuming that the benefits are not worth it).
> However,
> > I
> > > am
> > > > >> not
> > > > >> > sure about the "findbugs, checkstyle" point. Static checking is
> an
> > > > area
> > > > >> > that Scala does quite well (better than Java in many ways):
> > > > scalastyle,
> > > > >> > abide, scalariform, wartremover, scapegoat, etc. And Scala 2.11
> > also
> > > > >> has a
> > > > >> > number of Xlint warnings.
> > > > >> >
> > > > >> > Best,
> > > > >> > Ismael
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>
>
>
> --
>
> Regards,
> Ashish
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: Copycat data API & serializers

2015-08-11 Thread Ewen Cheslack-Postava
Bumping this thread so hopefully more people see it. There is also some
discussion in the corresponding JIRA:
https://issues.apache.org/jira/browse/KAFKA-2367

Any feedback is useful, even if just to say you personally don't have any
strong opinions on this topic.
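
For concreteness, here is a rough sketch of the kind of schema/record
construction a connector might do with such a data API. All class and method
names below are invented purely for illustration; the actual proposal is on
the wiki page linked above:

    // Hypothetical sketch -- names are placeholders, not a proposed API.
    Schema valueSchema = SchemaBuilder.struct()
        .field("id", Schema.INT64)
        .field("email", SchemaBuilder.string().optional().defaultValue(""))
        .build();

    Record record = new Record(valueSchema);
    record.put("id", 42L);
    record.put("email", "user@example.com");

The interesting questions are less about builder syntax and more about which
primitive and complex types exist, how optional fields and defaults behave
across schema versions, and whether serializers can map such schemas
faithfully to Avro, JSON, Protobufs, Thrift, etc.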

Thanks,
-Ewen

On Fri, Jul 31, 2015 at 6:21 PM, Ewen Cheslack-Postava 
wrote:

> Hi all,
>
> With the initial patch of of the Copycat APIs under review, I want to get
> started on the data API. I wrote up some notes with a rough proposal for
> what it should cover and a few requirements here:
> https://cwiki.apache.org/confluence/display/KAFKA/Copycat+Data+API
>
> Since they are very related, this also talks about serializers and
> includes some sketches of how I think different serialization formats could
> be implemented since we want Copycat to work with a variety of pluggable
> serialization formats.
>
> I think it's probably easy to bikeshed on the set of supported primitive
> types or the semantics of optional fields & default values. But I think the
> most important output from this discussion would be agreement that the
> schemas described will be sufficiently rich for connectors, can handle
> important cases like schema updates, but can still sufficiently abstract
> away the underlying serialization libraries such that we aren't effectively
> tied to one format despite providing a pluggable interface.
>
> --
> Thanks,
> Ewen
>



-- 
Thanks,
Ewen


[jira] [Commented] (KAFKA-2367) Add Copycat runtime data API

2015-08-11 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692853#comment-14692853
 ] 

Ewen Cheslack-Postava commented on KAFKA-2367:
--

The runtime API should not affect serialization at all. So the JSON comment 
isn't relevant I think -- if we wanted to use Avro for the runtime API, we 
would really just be lifting the Schema and GenericRecord classes but none of 
the serialization code. I personally don't have any issue with doing that, but 
the concern was that someone a) might not like adding Avro as a dependency and 
b) that we do want to support different serialization formats (which, at a 
minimum, is necessary since you may have data in other formats delivered by 
other tools to Kafka, and we still want Copycat to be able to push that data to 
other systems such as HDFS) and don't want to treat Avro as a first class 
citizen and other formats as second class.
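
To make "lifting the Schema and GenericRecord classes" concrete, here is a
minimal sketch of building runtime data with Avro's generic classes only --
no Avro serialization involved (standard Avro API, shown just to illustrate
what reuse would look like, not implying a decision):

    import org.apache.avro.Schema;
    import org.apache.avro.SchemaBuilder;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericRecord;

    public class AvroRuntimeDataSketch {
        public static void main(String[] args) {
            // In-memory schema and record construction only; nothing is serialized.
            Schema schema = SchemaBuilder.record("User").fields()
                .requiredLong("id")
                .optionalString("email")
                .endRecord();

            GenericRecord user = new GenericData.Record(schema);
            user.put("id", 42L);
            user.put("email", "user@example.com");
            System.out.println(user);
        }
    }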

If nobody objects, I think using Avro directly isn't a bad choice. I dislike 
some of its choices (in particular that nullable fields need to be defined as 
union types with the null type), but I agree it would be better to offload 
maintaining that code to another project that is already going to be doing it 
anyway and it does have well thought through schema migration support.
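
(For readers less familiar with Avro, that means a nullable field has to be
declared as a union, e.g. {"name": "email", "type": ["null", "string"],
"default": null}, rather than with a simple "optional" attribute on the field
itself.)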

> Add Copycat runtime data API
> 
>
> Key: KAFKA-2367
> URL: https://issues.apache.org/jira/browse/KAFKA-2367
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.8.3
>
>
> Design the API used for runtime data in Copycat. This API is used to 
> construct schemas and records that Copycat processes. This needs to be a 
> fairly general data model (think Avro, JSON, Protobufs, Thrift) in order to 
> support complex, varied data types that may be input from/output to many data 
> systems.
> This issue should also address the serialization interfaces used 
> within Copycat, which translate the runtime data into serialized byte[] form. 
> It is important that these be considered together because the data format can 
> be used in multiple ways (records, partition IDs, partition offsets), so it 
> and the corresponding serializers must be sufficient for all these use cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2367) Add Copycat runtime data API

2015-08-11 Thread James Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692887#comment-14692887
 ] 

James Cheng commented on KAFKA-2367:


[~ewencp], you mentioned schema migration support as a "pro" of using Avro. How 
is schema migration useful for an internal data API?

> Add Copycat runtime data API
> 
>
> Key: KAFKA-2367
> URL: https://issues.apache.org/jira/browse/KAFKA-2367
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.8.3
>
>
> Design the API used for runtime data in Copycat. This API is used to 
> construct schemas and records that Copycat processes. This needs to be a 
> fairly general data model (think Avro, JSON, Protobufs, Thrift) in order to 
> support complex, varied data types that may be input from/output to many data 
> systems.
> This issue should also address the serialization interfaces used 
> within Copycat, which translate the runtime data into serialized byte[] form. 
> It is important that these be considered together because the data format can 
> be used in multiple ways (records, partition IDs, partition offsets), so it 
> and the corresponding serializers must be sufficient for all these use cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Kafka Indentation

2015-08-11 Thread Jay Kreps
Ha ha, love that this thread is simultaneously an argument over code
whitespace AND language choice. Getting agreement here will be like the
open source discussion equivalent of trying to simultaneously conquer both
France and Russia.

Anyone have preferences on text editors? I've always thought emacs was
better...

-Jay

On Tue, Aug 11, 2015 at 6:25 PM, Grant Henke  wrote:

> +1 on not breaking blame
> -1  on 4 spaces for scala
> -1 on rewriting Kafka in Java
> +1 on upping our Scala game
>
> so I guess an accumulative of 0 for me ;)
>
>
> On Tue, Aug 11, 2015 at 7:37 PM, Ashish Singh  wrote:
>
> > I am also a +1 on not breaking git blame. IDEs support language specific
> > settings in same project.
> >
> > On Tue, Aug 11, 2015 at 5:29 PM, Gwen Shapira  wrote:
> >
> > > +1 on not breaking git blame
> > >
> > > -1 on rewriting Kafka in Java
> > > +1 on upping our Scala game (as Ismael pointed out)
> > >
> > > On Tue, Aug 11, 2015 at 5:23 PM, Jason Gustafson 
> > > wrote:
> > >
> > > > Can the java code be indented without affecting the results of git
> > blame?
> > > > If not, then I'd vote to leave it as it is.
> > > >
> > > > (Also +1 on rewriting Kafka in Java)
> > > >
> > > > -Jason
> > > >
> > > > On Tue, Aug 11, 2015 at 5:15 PM, Aditya Auradkar <
> > > > aaurad...@linkedin.com.invalid> wrote:
> > > >
> > > > > Bump. Anyone else have an opinion?
> > > > >
> > > > > Neha/Jay - You've made your thoughts clear. Any thoughts on how/if
> we
> > > > make
> > > > > any changes?
> > > > >
> > > > > Thanks,
> > > > > Aditya
> > > > >
> > > > >
> > > > > On Fri, Jul 24, 2015 at 10:32 AM, Aditya Auradkar <
> > > > aaurad...@linkedin.com>
> > > > > wrote:
> > > > >
> > > > > > I'm with Neha on this one. I don't have a strong preference on 2
> > vs 4
> > > > but
> > > > > > I do think that consistency is more important. It makes writing
> > code
> > > a
> > > > > bit
> > > > > > easier especially since patches are increasingly likely to touch
> > both
> > > > > Java
> > > > > > and Scala code and it's nice to not think about formatting
> certain
> > > > files
> > > > > > differently from others.
> > > > > >
> > > > > > Aditya
> > > > > >
> > > > > > On Fri, Jul 24, 2015 at 9:45 AM, Jay Kreps 
> > wrote:
> > > > > >
> > > > > >> Ismael,
> > > > > >>
> > > > > >> Makes sense. I think there is a good chance that it is just our
> > > > > ignorance
> > > > > >> of scala tools. I really do like having compile time enforced
> > > > formatting
> > > > > >> and dependency checking as we have for java. But we really put
> no
> > > > effort
> > > > > >> into trying to improve the scala developer experience so it may
> be
> > > an
> > > > > >> unfair comparison.
> > > > > >>
> > > > > >> -Jay
> > > > > >>
> > > > > >> On Fri, Jul 24, 2015 at 8:07 AM, Ismael Juma  >
> > > > wrote:
> > > > > >>
> > > > > >> > On Fri, Jul 24, 2015 at 2:00 AM, Jay Kreps 
> > > > wrote:
> > > > > >> >
> > > > > >> > > I do agree that working with a mixture of scala and java is
> a
> > > pain
> > > > > in
> > > > > >> the
> > > > > >> > > butt. What about considering the more extreme idea of just
> > > moving
> > > > > the
> > > > > >> > > remaining server-side scala into java? I like Scala, but the
> > > > tooling
> > > > > >> and
> > > > > >> > > compatibility story for java is better, and Java 8 addressed
> > > some
> > > > of
> > > > > >> the
> > > > > >> > > gaps. For a system like Kafka I do kind of think that what
> > Scala
> > > > > >> offers
> > > > > >> > is
> > > > > >> > > less useful, and the kind of boring Java tooling like IDE
> > > support,
> > > > > >> > > findbugs, checkstyle, simple exception stack traces, and a
> > good
> > > > > >> > > compatibility story is more important.
> > > > > >> >
> > > > > >> >
> > > > > >> > I can certainly see the case for avoiding the complexity of
> two
> > > > > >> different
> > > > > >> > languages (assuming that the benefits are not worth it).
> > However,
> > > I
> > > > am
> > > > > >> not
> > > > > >> > sure about the "findbugs, checkstyle" point. Static checking
> is
> > an
> > > > > area
> > > > > >> > that Scala does quite well (better than Java in many ways):
> > > > > scalastyle,
> > > > > >> > abide, scalariform, wartremover, scapegoat, etc. And Scala
> 2.11
> > > also
> > > > > >> has a
> > > > > >> > number of Xlint warnings.
> > > > > >> >
> > > > > >> > Best,
> > > > > >> > Ismael
> > > > > >> >
> > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> >
> > Regards,
> > Ashish
> >
>
>
>
> --
> Grant Henke
> Software Engineer | Cloudera
> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>


[jira] [Commented] (KAFKA-2367) Add Copycat runtime data API

2015-08-11 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692893#comment-14692893
 ] 

Ewen Cheslack-Postava commented on KAFKA-2367:
--

[~wushujames] see the "Schema Versions and Projection" section on the wiki page 
I wrote up: https://cwiki.apache.org/confluence/display/KAFKA/Copycat+Data+API 
It isn't strictly necessary to support this in the data API (which isn't really 
internal, it is public API that connectors use), but it might be nice to 
provide for schema projection in that API so it doesn't need to be implemented 
by connectors or for each serializer implementation. This would be relevant, 
for example, in a sink connector that needs to normalize data (e.g., all data 
going into an Avro file in HDFS needs to have the same schema). If you ever 
have parts of the stream with mixed versions, you probably want to project to 
the later of the two schemas and write all the data using that updated schema.
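
A made-up example of what that projection would do: say schema v1 of a record
is {id: long} and v2 adds an optional field, {id: long, email: string with
default ""}. A sink writing a single Avro file would project every v1 record
up to v2 by filling in email = "" before writing, so the file ends up with one
uniform schema even though the stream contained records of both versions.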

> Add Copycat runtime data API
> 
>
> Key: KAFKA-2367
> URL: https://issues.apache.org/jira/browse/KAFKA-2367
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.8.3
>
>
> Design the API used for runtime data in Copycat. This API is used to 
> construct schemas and records that Copycat processes. This needs to be a 
> fairly general data model (think Avro, JSON, Protobufs, Thrift) in order to 
> support complex, varied data types that may be input from/output to many data 
> systems.
> This issue should also address the serialization interfaces used 
> within Copycat, which translate the runtime data into serialized byte[] form. 
> It is important that these be considered together because the data format can 
> be used in multiple ways (records, partition IDs, partition offsets), so it 
> and the corresponding serializers must be sufficient for all these use cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2367) Add Copycat runtime data API

2015-08-11 Thread James Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692900#comment-14692900
 ] 

James Cheng commented on KAFKA-2367:


Ah, I think I understand. This would mostly be useful on the sink side, right? 
Would schema projection ever be useful on the source side?

> Add Copycat runtime data API
> 
>
> Key: KAFKA-2367
> URL: https://issues.apache.org/jira/browse/KAFKA-2367
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.8.3
>
>
> Design the API used for runtime data in Copycat. This API is used to 
> construct schemas and records that Copycat processes. This needs to be a 
> fairly general data model (think Avro, JSON, Protobufs, Thrift) in order to 
> support complex, varied data types that may be input from/output to many data 
> systems.
> This issue should also address the serialization interfaces used 
> within Copycat, which translate the runtime data into serialized byte[] form. 
> It is important that these be considered together because the data format can 
> be used in multiple ways (records, partition IDs, partition offsets), so it 
> and the corresponding serializers must be sufficient for all these use cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2367) Add Copycat runtime data API

2015-08-11 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692903#comment-14692903
 ] 

Ewen Cheslack-Postava commented on KAFKA-2367:
--

I don't think it's useful on the source side.

> Add Copycat runtime data API
> 
>
> Key: KAFKA-2367
> URL: https://issues.apache.org/jira/browse/KAFKA-2367
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.8.3
>
>
> Design the API used for runtime data in Copycat. This API is used to 
> construct schemas and records that Copycat processes. This needs to be a 
> fairly general data model (think Avro, JSON, Protobufs, Thrift) in order to 
> support complex, varied data types that may be input from/output to many data 
> systems.
> This issue should also address the serialization interfaces used 
> within Copycat, which translate the runtime data into serialized byte[] form. 
> It is important that these be considered together because the data format can 
> be used in multiple ways (records, partition IDs, partition offsets), so it 
> and the corresponding serializers must be sufficient for all these use cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Kafka Indentation

2015-08-11 Thread Neha Narkhede
Jay is in the mood for the mother of all bikeshedding exercises. Let me add
the (in)famous build framework question to the mix. I think we should move
to Maven :-P

On Tue, Aug 11, 2015 at 9:31 PM, Jay Kreps  wrote:

> Ha ha, love that this thread is simultaneously an argument over code
> whitespace AND language choice. Getting agreement here will be like the
> open source discussion equivalent of trying to simultaneously conquer both
> France and Russia.
>
> Anyone have preferences on text editors? I've always thought emacs was
> better...
>
> -Jay
>
> On Tue, Aug 11, 2015 at 6:25 PM, Grant Henke  wrote:
>
> > +1 on not breaking blame
> > -1  on 4 spaces for scala
> > -1 on rewriting Kafka in Java
> > +1 on upping our Scala game
> >
> > so I guess an accumulative of 0 for me ;)
> >
> >
> > On Tue, Aug 11, 2015 at 7:37 PM, Ashish Singh 
> wrote:
> >
> > > I am also a +1 on not breaking git blame. IDEs support language
> specific
> > > settings in same project.
> > >
> > > On Tue, Aug 11, 2015 at 5:29 PM, Gwen Shapira 
> wrote:
> > >
> > > > +1 on not breaking git blame
> > > >
> > > > -1 on rewriting Kafka in Java
> > > > +1 on upping our Scala game (as Ismael pointed out)
> > > >
> > > > On Tue, Aug 11, 2015 at 5:23 PM, Jason Gustafson  >
> > > > wrote:
> > > >
> > > > > Can the java code be indented without affecting the results of git
> > > blame?
> > > > > If not, then I'd vote to leave it as it is.
> > > > >
> > > > > (Also +1 on rewriting Kafka in Java)
> > > > >
> > > > > -Jason
> > > > >
> > > > > On Tue, Aug 11, 2015 at 5:15 PM, Aditya Auradkar <
> > > > > aaurad...@linkedin.com.invalid> wrote:
> > > > >
> > > > > > Bump. Anyone else have an opinion?
> > > > > >
> > > > > > Neha/Jay - You've made your thoughts clear. Any thoughts on
> how/if
> > we
> > > > > make
> > > > > > any changes?
> > > > > >
> > > > > > Thanks,
> > > > > > Aditya
> > > > > >
> > > > > >
> > > > > > On Fri, Jul 24, 2015 at 10:32 AM, Aditya Auradkar <
> > > > > aaurad...@linkedin.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I'm with Neha on this one. I don't have a strong preference on
> 2
> > > vs 4
> > > > > but
> > > > > > > I do think that consistency is more important. It makes writing
> > > code
> > > > a
> > > > > > bit
> > > > > > > easier especially since patches are increasingly likely to
> touch
> > > both
> > > > > > Java
> > > > > > > and Scala code and it's nice to not think about formatting
> > certain
> > > > > files
> > > > > > > differently from others.
> > > > > > >
> > > > > > > Aditya
> > > > > > >
> > > > > > > On Fri, Jul 24, 2015 at 9:45 AM, Jay Kreps 
> > > wrote:
> > > > > > >
> > > > > > >> Ismael,
> > > > > > >>
> > > > > > >> Makes sense. I think there is a good chance that it is just
> our
> > > > > > ignorance
> > > > > > >> of scala tools. I really do like having compile time enforced
> > > > > formatting
> > > > > > >> and dependency checking as we have for java. But we really put
> > no
> > > > > effort
> > > > > > >> into trying to improve the scala developer experience so it
> may
> > be
> > > > an
> > > > > > >> unfair comparison.
> > > > > > >>
> > > > > > >> -Jay
> > > > > > >>
> > > > > > >> On Fri, Jul 24, 2015 at 8:07 AM, Ismael Juma <
> ism...@juma.me.uk
> > >
> > > > > wrote:
> > > > > > >>
> > > > > > >> > On Fri, Jul 24, 2015 at 2:00 AM, Jay Kreps <
> j...@confluent.io>
> > > > > wrote:
> > > > > > >> >
> > > > > > >> > > I do agree that working with a mixture of scala and java
> is
> > a
> > > > pain
> > > > > > in
> > > > > > >> the
> > > > > > >> > > butt. What about considering the more extreme idea of just
> > > > moving
> > > > > > the
> > > > > > >> > > remaining server-side scala into java? I like Scala, but
> the
> > > > > tooling
> > > > > > >> and
> > > > > > >> > > compatibility story for java is better, and Java 8
> addressed
> > > > some
> > > > > of
> > > > > > >> the
> > > > > > >> > > gaps. For a system like Kafka I do kind of think that what
> > > Scala
> > > > > > >> offers
> > > > > > >> > is
> > > > > > >> > > less useful, and the kind of boring Java tooling like IDE
> > > > support,
> > > > > > >> > > findbugs, checkstyle, simple exception stack traces, and a
> > > good
> > > > > > >> > > compatibility story is more important.
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > I can certainly see the case for avoiding the complexity of
> > two
> > > > > > >> different
> > > > > > >> > languages (assuming that the benefits are not worth it).
> > > However,
> > > > I
> > > > > am
> > > > > > >> not
> > > > > > >> > sure about the "findbugs, checkstyle" point. Static checking
> > is
> > > an
> > > > > > area
> > > > > > >> > that Scala does quite well (better than Java in many ways):
> > > > > > scalastyle,
> > > > > > >> > abide, scalariform, wartremover, scapegoat, etc. And Scala
> > 2.11
> > > > also
> > > > > > >> has a
> > > > > > >> > number of Xlint warnings.
> > > > > > >> >
> > > > > > >> > Best,
> > > > > > >> > Ismael
> > > > > > >> >
> > > > > 

Re: [DISCUSS] Client-side Assignment for New Consumer

2015-08-11 Thread Onur Karaman
Just to make the conversation a bit easier (I don't think we have really
established names for these modes yet), basically with the new
KafkaConsumer today there's:
- "external management", where the application figures out the group
management and partition assignment externally
- "kafka management", where kafka coordinators figure out the group
management and partition assignment.

With today's design, any sort of custom assignment strategy means you'll
have to use external management. This proposal adjusts kafka management to
a place where kafka still provides the group management, but the
application figures out the partition assignment.
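
To make that concrete, here is a toy illustration (mine, not from the wiki) of
the kind of purely client-side assignment this enables. The key property is
that every member runs the same deterministic function over the same inputs,
so all members agree without the coordinator computing anything:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    // Toy sketch only -- not code from the proposal. Each member calls
    // assignToSelf() locally with the member list from the JoinGroupResponse.
    // Note that all members must also agree on the partition list, which is
    // exactly the concern about the JoinGroupResponse below.
    class ClientSideAssignmentSketch {
        static List<Integer> assignToSelf(String myMemberId,
                                          List<String> memberIds,
                                          List<Integer> partitions) {
            List<String> sorted = new ArrayList<>(memberIds);
            Collections.sort(sorted);                  // same order on every member
            int myIndex = sorted.indexOf(myMemberId);
            List<Integer> mine = new ArrayList<>();
            for (int p = 0; p < partitions.size(); p++) {
                if (p % sorted.size() == myIndex)      // simple round-robin split
                    mine.add(partitions.get(p));
            }
            return mine;
        }
    }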

One concern I have regarding the JoinGroupResponse:
With kafka management today, there's only one thing looking up the
partitions and figuring out the assignment - the coordinator. All of the
consumers in the group get a consistent view of the assignment. The
proposal in the wiki said JoinGroupResponse only contains the member list
and member metadata. But the consumers still need to find out all the
partitions for all the topics their group is interested in so that they can
run the assignment algorithm. You'd probably want to also include all of
these partitions in the JoinGroupResponse. Otherwise you might run into
split-brain problems and would require additional coordination steps. I
don't see how the coordinator can provide these partitions if you put the
topic subscriptions into the opaque protocol metadata which the coordinator
never looks at.

Another concern I had was about consumer group rebalances:
Today, a consumer group can rebalance due to consumer
joins/failures/leaves (KAFKA-2397), topic partition expansions, or topic
deletion. I don't see how any of the topic related rebalances can happen if
you put the topic subscriptions into the opaque protocol metadata which the
coordinator never looks at.

I'm also uncertain about the value of adding a list of SupportedProtocols
to the JoinGroupRequest as opposed to just one. Adding heuristics to the
coordinator regarding which protocol to choose seems to add complexity to
the coordinator and add uncertainty to the consumers over what strategy
would actually run.

I have more questions, but I just wanted to get these initial concerns out
there.

- Onur

On Tue, Aug 11, 2015 at 1:19 PM, Jason Gustafson  wrote:

> Hi Kafka Devs,
>
> One of the nagging issues in the current design of the new consumer has
> been the need to support a variety of assignment strategies. We've
> encountered this in particular in the design of copycat and the processing
> framework (KIP-28). From what I understand, Samza also has a number of use
> cases with custom assignment needs. The new consumer protocol supports new
> assignment strategies by hooking them into the broker. For many
> environments, this is a major pain and in some cases, a non-starter. It
> also challenges the validation that the coordinator can provide. For
> example, some assignment strategies call for partitions to be assigned
> multiple times, which means that the coordinator can only check that
> partitions have been assigned at least once.
>
> To solve these issues, we'd like to propose moving assignment to the
> client. I've written a wiki which outlines some protocol changes to achieve
> this:
>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Client-side+Assignment+Proposal
> .
> To summarize briefly, instead of the coordinator assigning the partitions
> itself, all subscriptions are forwarded to each member of the group which
> then decides independently which partitions it should consume. The protocol
> provides a mechanism for the coordinator to validate that all consumers use
> the same assignment strategy, but it does not ensure that the resulting
> assignment is "correct." This provides a powerful capability for users to
> control the full data flow on the client side. They control how data is
> written to partitions through the Partitioner interface and they control
> how data is consumed through the assignment strategy, all without touching
> the server.
>
> Of course nothing comes for free. In particular, this change removes the
> ability of the coordinator to validate that commits are made by consumers
> who were assigned the respective partition. This might not be too bad since
> we retain the ability to validate the generation id, but it is a potential
> concern. We have considered alternative protocols which add a second
> round-trip to the protocol in order to give the coordinator the ability to
> confirm the assignment. As mentioned above, the coordinator is somewhat
> limited in what it can actually validate, but this would return its ability
> to validate commits. The tradeoff is that it increases the protocol's
> complexity which means more ways for the protocol to fail and consequently
> more edge cases in the code.
>
> It also misses an opportunity to generalize the group membership protocol
> for additional use cases. In fact, afte

Re: [DISCUSS] Client-side Assignment for New Consumer

2015-08-11 Thread Jiangjie Qin
Hi Jason,

Thanks for writing this up. It would be useful to generalize the group
concept. I have a few questions below.

1. In the old consumer, the partition assignment is actually done by the
consumers themselves. We used ZooKeeper to guarantee that a partition will
only be consumed by the one consumer thread that successfully claimed its
ownership. Does the new protocol plan to provide the same guarantee?

2. It looks like both JoinGroupRequest and JoinGroupResponse have the
ProtocolMetadata.AssignmentStrategyMetadata. What metadata would be sent and
returned by the coordinator? How will the coordinator handle that metadata?

3. Do you mean that the number of partitions in the JoinGroupResponse will be
the maximum partition count of a topic among all the counts reported by
consumers? Is there any reason not to just let the coordinator return the
number of partitions of a topic from its metadata cache?

Thanks,

Jiangjie (Becket) Qin




On Tue, Aug 11, 2015 at 1:19 PM, Jason Gustafson  wrote:

> Hi Kafka Devs,
>
> One of the nagging issues in the current design of the new consumer has
> been the need to support a variety of assignment strategies. We've
> encountered this in particular in the design of copycat and the processing
> framework (KIP-28). From what I understand, Samza also has a number of use
> cases with custom assignment needs. The new consumer protocol supports new
> assignment strategies by hooking them into the broker. For many
> environments, this is a major pain and in some cases, a non-starter. It
> also challenges the validation that the coordinator can provide. For
> example, some assignment strategies call for partitions to be assigned
> multiple times, which means that the coordinator can only check that
> partitions have been assigned at least once.
>
> To solve these issues, we'd like to propose moving assignment to the
> client. I've written a wiki which outlines some protocol changes to achieve
> this:
>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Client-side+Assignment+Proposal
> .
> To summarize briefly, instead of the coordinator assigning the partitions
> itself, all subscriptions are forwarded to each member of the group which
> then decides independently which partitions it should consume. The protocol
> provides a mechanism for the coordinator to validate that all consumers use
> the same assignment strategy, but it does not ensure that the resulting
> assignment is "correct." This provides a powerful capability for users to
> control the full data flow on the client side. They control how data is
> written to partitions through the Partitioner interface and they control
> how data is consumed through the assignment strategy, all without touching
> the server.
>
> Of course nothing comes for free. In particular, this change removes the
> ability of the coordinator to validate that commits are made by consumers
> who were assigned the respective partition. This might not be too bad since
> we retain the ability to validate the generation id, but it is a potential
> concern. We have considered alternative protocols which add a second
> round-trip to the protocol in order to give the coordinator the ability to
> confirm the assignment. As mentioned above, the coordinator is somewhat
> limited in what it can actually validate, but this would return its ability
> to validate commits. The tradeoff is that it increases the protocol's
> complexity which means more ways for the protocol to fail and consequently
> more edge cases in the code.
>
> It also misses an opportunity to generalize the group membership protocol
> for additional use cases. In fact, after you've gone to the trouble of
> moving assignment to the client, the main thing that is left in this
> protocol is basically a general group management capability. This is
> exactly what is needed for a few cases that are currently under discussion
> (e.g. copycat or single-writer producer). We've taken this further step in
> the proposal and attempted to envision what that general protocol might
> look like and how it could be used both by the consumer and for some of
> these other cases.
>
> Anyway, since time is running out on the new consumer, we have perhaps one
> last chance to consider a significant change in the protocol like this, so
> have a look at the wiki and share your thoughts. I've no doubt that some
> ideas seem clearer in my mind than they do on paper, so ask questions if
> there is any confusion.
>
> Thanks!
> Jason
>


Re: Review Request 36548: Patch for KAFKA-2336

2015-08-11 Thread Jiangjie Qin


> On Aug. 11, 2015, 10:08 p.m., Gwen Shapira wrote:
> > Ship It!
> 
> Gwen Shapira wrote:
> Jiangjie, I committed despite your concerns since this patch fixes a huge
> potential issue.
> 
> If you have an idea for an improved fix, we can tackle this in a follow 
> up.

Thanks Gwen. I am fine with the current patch considering people are unlikely 
to have config discrepancies.


- Jiangjie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36548/#review95012
---


On Aug. 11, 2015, 3:37 p.m., Grant Henke wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36548/
> ---
> 
> (Updated Aug. 11, 2015, 3:37 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2336
> https://issues.apache.org/jira/browse/KAFKA-2336
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-2336: Changing offsets.topic.num.partitions after the offset topic is 
> created breaks consumer group partition assignment
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 47b6ce93da320a565435b4a7916a0c4371143b8a 
> 
> Diff: https://reviews.apache.org/r/36548/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Grant Henke
> 
>



Re: KIP Meeting Notes 08/11/2015

2015-08-11 Thread Jiangjie Qin
Hey Guozhang,

Will it be a little bit hard to keep the volunteer list up to date?
Personally I would prefer to have a summary e-mail automatically sent to the
kafka-dev list every day for tickets with patches submitted in the last 7
days. The email can also include the reviewer for each ticket, and people
can just take a look at a patch if it is not assigned to anyone. Similarly,
we can also list the tickets that have been open for some time but haven't
been updated or closed.

If getting an email every day is too much we can also do it weekly, although I
think people won't complain about one more email given there are already tons
of emails every day :)

Thanks,

Jiangjie (Becket) Qin

On Tue, Aug 11, 2015 at 3:47 PM, Guozhang Wang  wrote:

> Good question.
>
> I can personally think of pros and cons of having a volunteer list; most of
> them are pros, but one con is that the list will never be comprehensive and
> in that sense may sort of discourage people from assigning themselves as the
> reviewer.
>
> Without such a list, contributors would most likely assign reviewers to whoever
> they have seen act as a reviewer before or whoever they know of (i.e. a
> committer most of the time). But we could try to encourage people to re-assign
> review roles to whoever they think would be comfortable doing so (maybe they
> have contributed multiple patches to that module, have participated in
> discussions on that topic, or are known to have the background, etc.),
> while at the same time encourage people to (re-)assign the reviewer role to
> themselves, and hope that over time more people come to be seen as the
> "reviewers to go to". This may also help the community grow committers.
>
> Thoughts?
>
> Guozhang
>
> On Tue, Aug 11, 2015 at 1:50 PM, Grant Henke  wrote:
>
> > >
> > > 2. Encourage contributors to set the "reviewer" field when changing JIRA
> > > status to "patch available", and encourage volunteers to assign themselves
> > > as "reviewers" for pending tickets.
> >
> >
> > Is there somewhere that describes who to pick as a reviewer based on the
> > patch? Would it be worth listing volunteer reviewers in a similar
> > location?
> >
> > On Tue, Aug 11, 2015 at 2:14 PM, Guozhang Wang 
> wrote:
> >
> > > First of all, WebEx seems working! And we will upload the recorded
> video
> > > later.
> > >
> > > Quick summary:
> > >
> > > KIP-26: PR-99 (https://github.com/apache/kafka/pull/99) pending
> > > review.
> > >
> > > KIP-28: PR-130 (https://github.com/apache/kafka/pull/130) looking for
> > > feedback on:
> > >
> > > 1. API design (see o.k.a.stream.examples).
> > > 2. Architecture design (see KIP wiki page)
> > > 3. Packaging options.
> > >
> > > KIP-29: we will do a quick fix to unblock production issues with
> > > hard-coded interval values, while at the same time keeping the KIP open
> > > for further discussion about the end-state configurations.
> > >
> > > KIP-4: KAFKA-1695 / 2210 pending review.
> > >
> > > Review Backlog Management:
> > >
> > > 1. Remind people to change JIRA status to "patch available" when they
> > > contribute a patch, and change the status back to "in progress" after it
> > > is reviewed, as indicated in:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes
> > >
> > > 2. Encourage contributors to set the "reviewer" field when changing JIRA
> > > status to "patch available", and encourage volunteers to assign themselves
> > > as "reviewers" for pending tickets.
> > >
> > > -- Guozhang
> > >
> >
> >
> >
> > --
> > Grant Henke
> > Software Engineer | Cloudera
> > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
> >
>
>
>
> --
> -- Guozhang
>


Re: [DISCUSS] Client-side Assignment for New Consumer

2015-08-11 Thread Ewen Cheslack-Postava
On Tue, Aug 11, 2015 at 10:03 PM, Onur Karaman  wrote:

> Just to make the conversation a bit easier (I don't think we have really
> established names for these modes yet), basically with the new
> KafkaConsumer today there's:
> - "external management", where the application figures out the group
> management and partition assignment externally
> - "kafka management", where kafka coordinators figure out the group
> management and partition assignment.
>
> With today's design, any sort of custom assignment strategy means you'll
> have to use external management. This proposal adjusts kafka management to
> a place where kafka still provides the group management, but the
> application figures out the partition assignment.
>
> One concern I have regarding the JoinGroupResponse:
> With kafka management today, there's only one thing looking up the
> partitions and figuring out the assignment - the coordinator. All of the
> consumers in the group get a consistent view of the assignment. The
> proposal in the wiki said JoinGroupResponse only contains the member list
> and member metadata. But the consumers still need to find out all the
> partitions for all the topics their group is interested in so that they can
> run the assignment algorithm. You'd probably want to also include all of
> these partitions in the JoinGroupResponse. Otherwise you might run into
> split-brain problems and would require additional coordination steps. I
> don't see how the coordinator can provide these partitions if you put the
> topic subscriptions into the opaque protocol metadata which the coordinator
> never looks at.
>

If you look at the example embedded consumer protocol, you can see that
each client includes the # of partitions it currently thinks exist in the
topic. This does require every client to look those up via metadata
requests (but that's not that bad and they need that info for consuming
data anyway). However, it also means that you can have disagreements if one
consumer's metadata is out of date. There are a couple of options for
resolving that. One is for each consumer to detect this and immediately
refetch metadata and start a new JoinGroup round. This is a bit annoying,
but should resolve the issue very quickly; also this type of change should
be relatively rare, so it's not necessarily worth optimizing. A different
option is for all consumers to just assume whoever reported the max # of
partitions is right and proceed with assignment that way.
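To make that concrete, here is a minimal sketch of how a client-side assignor
could reconcile the partition counts reported by group members -- either
trusting the max, or detecting that its own metadata is stale and needs a
refresh before a new JoinGroup round. (The names and structure here are
illustrative only, not the actual client code.)

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical helper: each member reports how many partitions it thinks
    // each subscribed topic has; disagreements are resolved deterministically.
    public class PartitionCountReconciler {

        // Option A: trust the largest reported count and proceed with assignment.
        public static Map<String, Integer> takeMax(Map<String, Map<String, Integer>> reportedByMember) {
            Map<String, Integer> resolved = new HashMap<>();
            for (Map<String, Integer> perTopic : reportedByMember.values())
                for (Map.Entry<String, Integer> e : perTopic.entrySet())
                    resolved.merge(e.getKey(), e.getValue(), Math::max);
            return resolved;
        }

        // Option B: detect that this member's view is out of date, so it can
        // refetch metadata and trigger another JoinGroup round before assigning.
        public static boolean isStale(Map<String, Integer> myView, Map<String, Integer> resolved) {
            for (Map.Entry<String, Integer> e : resolved.entrySet())
                if (myView.getOrDefault(e.getKey(), 0) < e.getValue())
                    return true;
            return false;
        }
    }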


>
> Another concern I had was about consumer group rebalances:
> Today, a consumer group can rebalance due to consumer
> joins/failures/leaves(KAFKA-2397), topic partition expansions, or topic
> deletion. I don't see how any of the topic related rebalances can happen if
> you put the topic subscriptions into the opaque protocol metadata which the
> coordinator never looks at.
>
>
Topic partition expansions and deletion can both be picked up by the
consumers as they periodically refresh metadata. At first I thought this
would be slower to be picked up than with the broker watching for those
events. However, in practice I don't think it really is. First of all, even
with the broker watching for those events, you still have to wait for at
least 1 heartbeat period for everyone to get notified (since we can't
proactively send notifications, they are tied to the heartbeat requests).
Second, if you have even a few consumers, they may have reasonably well
distributed metadata updates such that you're not necessarily waiting a
full metadata update period, but rather something closer to metadata update
period / # of consumers.
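(As a rough back-of-the-envelope illustration -- assuming, say, a 5 minute
metadata refresh interval and refresh times spread roughly uniformly across
consumers -- with 10 consumers the expected wait until the first one notices
the change is on the order of 300s / 10 = 30s, versus up to the full 300s for
a single consumer.)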

This does make the client implementation have to do a bit more, and that
may be a significant consideration since it makes 3rd party consumers a bit
harder to write. However, since you already need to be updating metadata it
doesn't seem like a huge additional burden.


> I'm also uncertain about the value of adding a list of SupportedProtocols
> to the JoinGroupRequest as opposed to just one. Adding heuristics to the
> coordinator regarding which protocol to choose seems to add complexity to
> the coordinator and add uncertainty to the consumers over what strategy
> would actually run.
>

Definitely adds a bit of complexity. However, there are a couple of
important use cases centered around zero downtime upgrades. Consider two
scenarios:

1. I start with the default configuration for my consumers, which gives me
range assignment. Now, I realize that was a poor choice -- it's actually
important to use a sticky assignment strategy. If I want to do a rolling
update so my service continues running while I switch to the new config, I
> need to be able to keep the group running in the old mode (range) until
everyone is updated and then they can all switch over. If the metadata
included is different at all, then at least for some time I'll need to be
able to provide both as options -- only once everyone is updated can the
> new sticky partitioning approach be used.
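
To make the rolling-bounce idea concrete, a hypothetical exchange (illustrative
only, not the final protocol) could look like this, with the coordinator
choosing the most preferred strategy that every member supports:

    Old consumer     -> JoinGroupRequest with GroupProtocols = [ "range" ]
    Updated consumer -> JoinGroupRequest with GroupProtocols = [ "sticky", "range" ]

    While any member still lists only "range", the coordinator keeps selecting
    GroupProtocol = "range"; once the last old consumer is bounced and every
    member lists "sticky", the whole group switches to "sticky" together.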

Re: [DISCUSS] Client-side Assignment for New Consumer

2015-08-11 Thread Ewen Cheslack-Postava
On Tue, Aug 11, 2015 at 10:15 PM, Jiangjie Qin 
wrote:

> Hi Jason,
>
> Thanks for writing this up. It would be useful to generalize the group
> concept. I have a few questions below.
>
> 1. In the old consumer the partition assignment is actually done by the
> consumers themselves. We used zookeeper to guarantee that a partition will
> only be consumed by the one consumer thread that successfully claimed its
> ownership. Does the new protocol plan to provide the same guarantee?
>

Once you have all the metadata from all the consumers, assignment should
just be a simple function mapping that Map<memberId, metadata> to a
Map<memberId, List<TopicPartition>>. If everyone is consistent in
computing that, you don't need ZK involved at all.

In practice, this shouldn't be that hard to ensure for most assignment
strategies just by having decent unit testing on them. You just have to do
things like ensure your assignment strategy sorts lists into a consistent
order.

You do give up the ability to use some techniques (e.g. any randomized
algorithm if you can't distribute the seed w/ the metadata) and it's true
that nothing validates the assignment, but if that assignment algorithm
step is kept simple, small, and well tested, the risk is very minimal.
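As a minimal sketch of that kind of deterministic, client-side assignment
function (illustrative only -- simple round-robin over sorted member ids, not
the actual implementation):

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Illustrative only: every member runs this with the same inputs and,
    // because both members and partitions are processed in a sorted order,
    // every member computes exactly the same assignment.
    public class RoundRobinAssignorSketch {
        public static Map<String, List<Integer>> assign(List<String> memberIds,
                                                        int numPartitions) {
            List<String> members = new ArrayList<>(memberIds);
            Collections.sort(members);               // consistent ordering is the key point
            Map<String, List<Integer>> assignment = new HashMap<>();
            for (String m : members)
                assignment.put(m, new ArrayList<>());
            for (int p = 0; p < numPartitions; p++)  // partitions iterated in numeric order
                assignment.get(members.get(p % members.size())).add(p);
            return assignment;
        }
    }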


>
> 2. It looks that both JoinGroupRequest and JoinGroupResponse has the
> ProtocolMetadata.AssignmentStrategyMetadata, what would be the metadata be
> sent and returned by coordinator? How will the coordinator handle the
> metadata?
>

The coordinator is basically just blindly broadcasting all of it to group
members so they have a consistent view.

So from the coordinator's perspective, it sees something like:

Consumer 1 -> JoinGroupRequest with GroupProtocols = [ "consumer" ]
Consumer 2 -> JoinGroupRequest with GroupProtocols = [ "consumer" ]

Then the responses would look like:

Consumer 1 <- JoinGroupResponse with GroupProtocol = "consumer" and
GroupMembers = [ Consumer 1 , Consumer 2 ]
Consumer 2 <- JoinGroupResponse with GroupProtocol = "consumer" and
GroupMembers = [ Consumer 1 , Consumer 2 ]

So all the responses include all the metadata for every member in the
group, and everyone can use that to consistently decide on assignment. The
broker doesn't care and cannot even understand the metadata since the data
format for it is dependent on the assignment strategy being used.

As another example that is *not* a consumer, let's say you just want to
have a single writer in the group which everyone will forward requests to.
To accomplish this, you could use a very dumb assignment strategy: there is
no metadata (empty byte[]) and all we care about is who is the first member
in the group (e.g. when IDs are sorted lexicographically). That member is
selected as the writer. In that case, we actually just care about the
membership list, there's no additional info about each member that is
required to determine who is the writer.
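A sketch of that degenerate strategy, with the same caveat that the names are
illustrative:

    import java.util.Collections;
    import java.util.List;

    // Illustrative only: with empty per-member metadata, "assignment" reduces
    // to every member deterministically agreeing on who the single writer is.
    public class SingleWriterSelector {
        public static String selectWriter(List<String> memberIds) {
            return Collections.min(memberIds);  // lexicographically smallest id
        }

        public static boolean isWriter(String myId, List<String> memberIds) {
            return myId.equals(selectWriter(memberIds));
        }
    }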


> 3. Do you mean that the number of partitions in JoinGroupResponse will be
> the max partition number of a topic among all the reported partition number
> by consumers? Is there any reason not just let Coordinator to return the
> number of partitions of a topic in its metadata cache?
>

Nothing from the embedded protocol is touched by the broker. The broker
just collects opaque bytes of metadata, does the selection of the strategy
if multiple are supported by some consumers, and then returns that opaque
metadata for all the members back to every member. In that way they all
have a consistent view of the group. For regular consumers, that view of
the group includes information about how many partitions each consumer
currently thinks each of its subscribed topics has. These could be
inconsistent due to out of date metadata and it would be up to the
assignment strategy on the *client* to resolve that. As you point out, in
that case they could just take the max value that any consumer reported
seeing and use that. The consumers that notice that their metadata had a
smaller # of partitions should also trigger a metadata update when they see
someone else observing a larger # of partitions.


>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
>
>
> On Tue, Aug 11, 2015 at 1:19 PM, Jason Gustafson 
> wrote:
>
> > Hi Kafka Devs,
> >
> > One of the nagging issues in the current design of the new consumer has
> > been the need to support a variety of assignment strategies. We've
> > encountered this in particular in the design of copycat and the
> processing
> > framework (KIP-28). From what I understand, Samza also has a number of
> use
> > cases with custom assignment needs. The new consumer protocol supports
> new
> > assignment strategies by hooking them into the broker. For many
> > environments, this is a major pain and in some cases, a non-starter. It
> > also challenges the validation that the coordinator can provide. For
> > example, some assignment strategies call for partitions to be assigned
> > multiple times, which means that the coordinator can only check that
> > partitions have been assigned at least once

Re: [DISCUSS] Client-side Assignment for New Consumer

2015-08-11 Thread Jiangjie Qin
Ewen,

Thanks for the explanation.

For (1), I am more concerned about the failure case than the normal case.
What if a consumer was somehow kicked out of a group but is still consuming
and committing offsets? Does that mean the new owner and old owner might
potentially consume from and commit offsets for the same partition?
In the old consumer, this won't happen because the new owner will not be
able to start consumption until the previous owner has released its
ownership. Basically, without the ownership guarantee, I don't see how
communication among the consumers themselves alone can solve the problem here.

For (2) and (3), now I understand how the metadata is used. But I still don't
see why we should let the consumers pass the topic information around
instead of letting the coordinator provide the information. The single-producer
use case does not solve the ownership problem in the abnormal case either,
which seems to be a little bit vulnerable.

Thanks,

Jiangjie (Becket) Qin


On Tue, Aug 11, 2015 at 11:06 PM, Ewen Cheslack-Postava 
wrote:

> On Tue, Aug 11, 2015 at 10:15 PM, Jiangjie Qin 
> wrote:
>
> > Hi Jason,
> >
> > Thanks for writing this up. It would be useful to generalize the group
> > concept. I have a few questions below.
> >
> > 1. In old consumer actually the partition assignment are done by
> consumers
> > themselves. We used zookeeper to guarantee that a partition will only be
> > consumed by one consumer thread who successfully claimed its ownership.
> > Does the new protocol plan to provide the same guarantee?
> >
>
> Once you have all the metadata from all the consumers, assignment should
> just be a simple function mapping that Map<memberId, metadata> to a
> Map<memberId, List<TopicPartition>>. If everyone is consistent in
> computing that, you don't need ZK involved at all.
>
> In practice, this shouldn't be that hard to ensure for most assignment
> strategies just by having decent unit testing on them. You just have to do
> things like ensure your assignment strategy sorts lists into a consistent
> order.
>
> You do give up the ability to use some techniques (e.g. any randomized
> algorithm if you can't distribute the seed w/ the metadata) and it's true
> that nothing validates the assignment, but if that assignment algorithm
> step is kept simple, small, and well tested, the risk is very minimal.
>
>
> >
> > 2. It looks that both JoinGroupRequest and JoinGroupResponse has the
> > ProtocolMetadata.AssignmentStrategyMetadata, what would be the metadata
> be
> > sent and returned by coordinator? How will the coordinator handle the
> > metadata?
> >
>
> The coordinator is basically just blindly broadcasting all of it to group
> members so they have a consistent view.
>
> So from the coordinators perspective, it sees something like:
>
> Consumer 1 -> JoinGroupRequest with GroupProtocols = [ "consumer"
> ]
> Consumer 2 -> JoinGroupRequest with GroupProtocols = [ "consumer"
> ]
>
> Then, in the responses would look like:
>
> Consumer 1 <- JoinGroupResponse with GroupProtocol = "consumer" and
> GroupMembers = [ Consumer 1 , Consumer 2
> ]
> Consumer 2 <- JoinGroupResponse with GroupProtocol = "consumer" and
> GroupMembers = [ Consumer 1 , Consumer 2
> ]
>
> So all the responses include all the metadata for every member in the
> group, and everyone can use that to consistently decide on assignment. The
> broker doesn't care and cannot even understand the metadata since the data
> format for it is dependent on the assignment strategy being used.
>
> As another example that is *not* a consumer, let's say you just want to
> have a single writer in the group which everyone will forward requests to.
> To accomplish this, you could use a very dumb assignment strategy: there is
> no metadata (empty byte[]) and all we care about is who is the first member
> in the group (e.g. when IDs are sorted lexicographically). That member is
> selected as the writer. In that case, we actually just care about the
> membership list, there's no additional info about each member that is
> required to determine who is the writer.
>
>
> > 3. Do you mean that the number of partitions in JoinGroupResponse will be
> > the max partition number of a topic among all the reported partition
> number
> > by consumers? Is there any reason not just let Coordinator to return the
> > number of partitions of a topic in its metadata cache?
> >
>
> Nothing from the embedded protocol is touched by the broker. The broker
> just collects opaque bytes of metadata, does the selection of the strategy
> if multiple are supported by some consumers, and then returns that opaque
> metadata for all the members back to every member. In that way they all
> have a consistent view of the group. For regular consumers, that view of
> the group includes information about how many partitions each consumer
> currently thinks the topics it is subscribed to has. These could be
> inconsistent due to out of date metadata and it would be up to the
> assignment strategy on the *client* to resolve that. As

Re: [DISCUSS] Client-side Assignment for New Consumer

2015-08-11 Thread Ewen Cheslack-Postava
On Tue, Aug 11, 2015 at 11:29 PM, Jiangjie Qin 
wrote:

> Ewen,
>
> Thanks for the explanation.
>
> For (1), I am more concerned about the failure case than the normal case.
> What if a consumer was somehow kicked out of a group but is still consuming
> and committing offsets? Does that mean the new owner and old owner might
> potentially consume from and commit offsets for the same partition?
> In the old consumer, this won't happen because the new owner will not be
> able to start consumption until the previous owner has released its
> ownership. Basically, without the ownership guarantee, I don't see how
> communication among the consumers themselves alone can solve the problem here.
>

The generation ID check still applies to offset commits. If one of the
consumers is kicked out and misbehaving, it can obviously still fetch and
process messages, but offset commits will not work since it will not have
the current generation ID.
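A minimal sketch of that fencing check, as hypothetical broker-side pseudocode
(not the actual coordinator/OffsetManager logic, and the error-code value is
assumed for illustration):

    // Hypothetical sketch of generation fencing on offset commits: a consumer
    // that was kicked out of the group still holds the old generation id, so
    // its commits are rejected even though it can keep fetching and processing.
    public class OffsetCommitFencing {
        private static final short NONE = 0;
        private static final short ILLEGAL_GENERATION = 22;  // value assumed for illustration

        private int currentGenerationId;  // advanced on every successful rebalance

        public short validateCommit(int requestGenerationId) {
            if (requestGenerationId != currentGenerationId)
                return ILLEGAL_GENERATION;
            return NONE;
        }
    }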


>
> For (2) and (3), now I understand how the metadata is used. But I still don't
> see why we should let the consumers pass the topic information around
> instead of letting the coordinator provide the information. The single-producer
> use case does not solve the ownership problem in the abnormal case either,
> which seems to be a little bit vulnerable.
>

One of the goals here was to generalize group membership so we can, for
example, use it for balancing Copycat tasks across workers. There's no
topic subscription info in that case. The metadata for copycat workers
would instead need to somehow indicate the current set of tasks that need
to be assigned to workers. By making the metadata completely opaque to the
protocol, it becomes more generally useful since it focuses squarely on the
group membership problem, allowing for that additional bit of metadata so
you don't just get a list of members, but also get a little bit of info
about each of them.

A different option that we explored is to use a sort of mixed model --
still bake all the topic subscriptions directly into the protocol but also
include metadata. That would allow us to maintain the existing
coordinator-driven approach to handling the metadata and change events like
the ones Onur pointed out. Then something like the Copycat workers would
just not fill in any topic subscriptions and it would be handled as a
degenerate case. Based on the way I explained that we can handle those
types of events, I personally feel its cleaner and a nicer generalization
to not include the subscriptions in the join group protocol, making it part
of the metadata instead.

For the single producer case, are you saying it doesn't solve ownership in
the abnormal case because a producer that doesn't know it has been kicked
out of the group yet can still produce data even though it shouldn't be
able to anymore? I definitely agree that that is a risk -- this provides a
way to get closer to a true single-writer, but there are definitely still
failure modes that this does not address.

-Ewen


>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
> On Tue, Aug 11, 2015 at 11:06 PM, Ewen Cheslack-Postava  >
> wrote:
>
> > On Tue, Aug 11, 2015 at 10:15 PM, Jiangjie Qin  >
> > wrote:
> >
> > > Hi Jason,
> > >
> > > Thanks for writing this up. It would be useful to generalize the group
> > > concept. I have a few questions below.
> > >
> > > 1. In old consumer actually the partition assignment are done by
> > consumers
> > > themselves. We used zookeeper to guarantee that a partition will only
> be
> > > consumed by one consumer thread who successfully claimed its ownership.
> > > Does the new protocol plan to provide the same guarantee?
> > >
> >
> > Once you have all the metadata from all the consumers, assignment should
> > just be a simple function mapping that Map<memberId, metadata> to a
> > Map<memberId, List<TopicPartition>>. If everyone is consistent in
> > computing that, you don't need ZK involved at all.
> >
> > In practice, this shouldn't be that hard to ensure for most assignment
> > strategies just by having decent unit testing on them. You just have to
> do
> > things like ensure your assignment strategy sorts lists into a consistent
> > order.
> >
> > You do give up the ability to use some techniques (e.g. any randomized
> > algorithm if you can't distribute the seed w/ the metadata) and it's true
> > that nothing validates the assignment, but if that assignment algorithm
> > step is kept simple, small, and well tested, the risk is very minimal.
> >
> >
> > >
> > > 2. It looks that both JoinGroupRequest and JoinGroupResponse has the
> > > ProtocolMetadata.AssignmentStrategyMetadata, what would be the metadata
> > be
> > > sent and returned by coordinator? How will the coordinator handle the
> > > metadata?
> > >
> >
> > The coordinator is basically just blindly broadcasting all of it to group
> > members so they have a consistent view.
> >
> > So from the coordinators perspective, it sees something like:
> >
> > Consumer 1 -> JoinGroupRequest with GroupProtocols = [ "consumer"
> > ]
> > Consumer