[jira] [Commented] (KAFKA-595) Decouple producer side compression from server-side compression.

2015-01-15 Thread Manikumar Reddy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278435#comment-14278435
 ] 

Manikumar Reddy commented on KAFKA-595:
---

[~nehanarkhede] Broker-side compression support was added in KAFKA-1499, and 
this issue is essentially the same request. I think we can close this issue.

> Decouple producer side compression from server-side compression.
> 
>
> Key: KAFKA-595
> URL: https://issues.apache.org/jira/browse/KAFKA-595
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8.0
>Reporter: Neha Narkhede
>  Labels: feature
>
> In 0.7 Kafka always appended messages to the log using whatever compression 
> codec the client used. In 0.8, after the KAFKA-506 patch, the master always 
> recompresses the message before appending to the log to assign ids. Currently 
> the server uses a funky heuristic to choose a compression codec based on the 
> codecs the producer used. This doesn't actually make that much sense. It 
> would be better for the server to have its own compression (a global default 
> and per-topic override) that specified the compression codec, and have the 
> server always recompress with this codec regardless of the original codec.
> Compression currently happens in kafka.log.Log.assignOffsets (perhaps should 
> be renamed if it takes on compression as an official responsibility instead 
> of a side-effect).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] 0.8.2.0 Candidate 1

2015-01-15 Thread Jaikiran Pai
I just downloaded the Kafka binary and am trying this on my 32-bit JVM 
(Java 7). Trying to start Zookeeper or the Kafka server keeps failing with 
"Unrecognized VM option 'UseCompressedOops'":


./zookeeper-server-start.sh ../config/zookeeper.properties
Unrecognized VM option 'UseCompressedOops'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

Same with the Kafka server startup scripts. My Java version is:

java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
Java HotSpot(TM) Server VM (build 24.71-b01, mixed mode)

Should there be a check in the script before adding this option?
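A minimal sketch of the kind of guard being asked about here, for the launcher scripts (kafka-run-class.sh and friends): probe whether the running JVM accepts the flag before appending it. The probe technique and the trimmed option list are illustrative assumptions, not necessarily the change that went into the release scripts.

```shell
# Start from options that every JVM accepts.
KAFKA_JVM_PERFORMANCE_OPTS="-server"

# A JVM that does not recognize -XX:+UseCompressedOops (e.g. a 32-bit
# HotSpot) exits non-zero from this probe, so the flag is only appended
# when the JVM actually supports it.
if java -XX:+UseCompressedOops -version >/dev/null 2>&1; then
  KAFKA_JVM_PERFORMANCE_OPTS="$KAFKA_JVM_PERFORMANCE_OPTS -XX:+UseCompressedOops"
fi

echo "$KAFKA_JVM_PERFORMANCE_OPTS"
```

On a 32-bit JVM the probe fails and the option list stays safe, instead of the whole launch aborting with "Could not create the Java Virtual Machine".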

-Jaikiran

On Wednesday 14 January 2015 10:08 PM, Jun Rao wrote:

+ users mailing list. It would be great if people can test this out and
report any blocker issues.

Thanks,

Jun

On Tue, Jan 13, 2015 at 7:16 PM, Jun Rao  wrote:


This is the first candidate for release of Apache Kafka 0.8.2.0. There
have been some changes since the 0.8.2 beta release, especially in the new
Java producer API and JMX MBean names. It would be great if people can test
this out thoroughly. We are giving people 10 days for testing and voting.

Release Notes for the 0.8.2.0 release
https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/RELEASE_NOTES.html

*** Please download, test and vote by Friday, Jan 23rd, 7pm PT

Kafka's KEYS file containing PGP keys we use to sign the release:
https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/KEYS in
addition to the md5, sha1 and sha2 (SHA256) checksums.

* Release artifacts to be voted upon (source and binary):
https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/

* Maven artifacts to be voted upon prior to release:
https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/maven_staging/

* scala-doc
https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/scaladoc/#package

* java-doc
https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/javadoc/

* The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2.0 tag
https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=b0c7d579f8aeb5750573008040a42b7377a651d5


Thanks,

Jun





Re: [VOTE] 0.8.2.0 Candidate 1

2015-01-15 Thread Manikumar Reddy
Yes, we can add a check. This option works only with a 64-bit JVM.
On Jan 15, 2015 6:53 PM, "Jaikiran Pai"  wrote:

> I just downloaded the Kafka binary and am trying this on my 32 bit JVM
> (Java 7)? Trying to start Zookeeper or Kafka server keeps failing with
> "Unrecognized VM option 'UseCompressedOops'":
>
> ./zookeeper-server-start.sh ../config/zookeeper.properties
> Unrecognized VM option 'UseCompressedOops'
> Error: Could not create the Java Virtual Machine.
> Error: A fatal exception has occurred. Program will exit.
>
> Same with the Kafka server startup scripts. My Java version is:
>
> java version "1.7.0_71"
> Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
> Java HotSpot(TM) Server VM (build 24.71-b01, mixed mode)
>
> Should there be a check in the script, before adding this option?
>
> -Jaikiran
>
> On Wednesday 14 January 2015 10:08 PM, Jun Rao wrote:
>
>> + users mailing list. It would be great if people can test this out and
>> report any blocker issues.
>>
>> Thanks,
>>
>> Jun
>>
>> On Tue, Jan 13, 2015 at 7:16 PM, Jun Rao  wrote:
>>
>>  This is the first candidate for release of Apache Kafka 0.8.2.0. There
>>> has been some changes since the 0.8.2 beta release, especially in the new
>>> java producer api and jmx mbean names. It would be great if people can
>>> test
>>> this out thoroughly. We are giving people 10 days for testing and voting.
>>>
>>> Release Notes for the 0.8.2.0 release
>>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/RELEASE_NOTES.html
>>>
>>> *** Please download, test and vote by Friday, Jan 23rd, 7pm PT
>>>
>>> Kafka's KEYS file containing PGP keys we use to sign the release:
>>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/KEYS in
>>> addition to the md5, sha1 and sha2 (SHA256) checksums.
>>>
>>> * Release artifacts to be voted upon (source and binary):
>>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/
>>>
>>> * Maven artifacts to be voted upon prior to release:
>>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/maven_staging/
>>>
>>> * scala-doc
>>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/scaladoc/#package
>>>
>>> * java-doc
>>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/javadoc/
>>>
>>> * The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2.0 tag
>>> https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=b0c7d579f8aeb5750573008040a42b7377a651d5
>>>
>>>
>>> Thanks,
>>>
>>> Jun
>>>
>>>
>


[jira] [Commented] (KAFKA-1333) Add consumer co-ordinator module to the server

2015-01-15 Thread Andrii Biletskyi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278739#comment-14278739
 ] 

Andrii Biletskyi commented on KAFKA-1333:
-

Hey [~guozhang],
We are considering using the 0.9 Consumer work as the basis for a full-fledged 
high-level consumer that supports all the basic functionality (rebalance, etc.) 
plus some additional features specific to our needs. Since you are the assignee 
of this ticket, can you please tell us where you are on this piece, and when 
there will be a first alpha-version patch to review / contribute to?
Thank you in advance!

> Add consumer co-ordinator module to the server
> --
>
> Key: KAFKA-1333
> URL: https://issues.apache.org/jira/browse/KAFKA-1333
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Affects Versions: 0.9.0
>Reporter: Neha Narkhede
>Assignee: Guozhang Wang
>
> Scope of this JIRA is to just add a consumer co-ordinator module that do the 
> following:
> 1) coordinator start-up, metadata initialization
> 2) simple join group handling (just updating metadata, no failure detection / 
> rebalancing): this should be sufficient for consumers doing self offset / 
> partition management.
> Offset manager will still run side-by-side with the coordinator in this JIRA, 
> and we will merge it in KAFKA-1740.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1649) Protocol documentation does not indicate that ReplicaNotAvailable can be ignored

2015-01-15 Thread Hernan Rivas Inaka (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278790#comment-14278790
 ] 

Hernan Rivas Inaka commented on KAFKA-1649:
---

I managed to get that error on 0.8.1 (I don't remember the specific revision) by 
killing leaders, but it didn't always happen. My tests always had 3 brokers, so 
that might have something to do with it.

I will try to allocate some time to replicate the issue both on 0.8.1 and 0.8.2 
and see what I can find.

> Protocol documentation does not indicate that ReplicaNotAvailable can be 
> ignored
> 
>
> Key: KAFKA-1649
> URL: https://issues.apache.org/jira/browse/KAFKA-1649
> Project: Kafka
>  Issue Type: Improvement
>  Components: website
>Affects Versions: 0.8.1.1
>Reporter: Hernan Rivas Inaka
>Priority: Minor
>  Labels: protocol-documentation
>   Original Estimate: 10m
>  Remaining Estimate: 10m
>
> The protocol documentation here 
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-ErrorCodes
>  should indicate that error 9 (ReplicaNotAvailable) can be safely ignored on 
> producers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1333) Add consumer co-ordinator module to the server

2015-01-15 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278855#comment-14278855
 ] 

Jay Kreps commented on KAFKA-1333:
--

Hey [~abiletskyi], a relatively full-fledged Java consumer has been posted, with 
stubs for the new APIs, on KAFKA-1760, but none of the server-side work has been 
started as far as I know...

> Add consumer co-ordinator module to the server
> --
>
> Key: KAFKA-1333
> URL: https://issues.apache.org/jira/browse/KAFKA-1333
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Affects Versions: 0.9.0
>Reporter: Neha Narkhede
>Assignee: Guozhang Wang
>
> Scope of this JIRA is to just add a consumer co-ordinator module that do the 
> following:
> 1) coordinator start-up, metadata initialization
> 2) simple join group handling (just updating metadata, no failure detection / 
> rebalancing): this should be sufficient for consumers doing self offset / 
> partition management.
> Offset manager will still run side-by-side with the coordinator in this JIRA, 
> and we will merge it in KAFKA-1740.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Need some pointers to writing (real) tests

2015-01-15 Thread Jay Kreps
The integration tests for the producer are in with the server, since the
server depends on the clients (rather than vice versa). Only the mock tests
are with the clients. You should be able to add to one of the tests in
  core/src/test/scala/integration/kafka/api/Producer*


On Wed, Jan 14, 2015 at 11:07 PM, Jaikiran Pai 
wrote:

> I have been looking at some unassigned JIRAs to work on during some spare
> time and found this one https://issues.apache.org/jira/browse/KAFKA-1837.
> As I note in that JIRA, I can see why this happens and have a potential fix
> for it. But to first reproduce the issue and then verify the fix, I have
> been attempting a testcase (in the clients). Some of the tests that are
> already present (like SenderTest) use MockProducer which won't be relevant
> in testing this issue, from what I see.
>
> So I need some inputs or pointers to create a test which will use the real
> KafkaProducer/Sender/NetworkClient. My initial attempt at this uses the
> TestUtils to create a (dummy) cluster, and that one fails for obvious
> reasons (the client not receiving a metadata update over the wire from the
> server):
>
> @Test
> public void testFailedSend() throws Exception {
> final TopicPartition tp = new TopicPartition("test", 0);
> final String producedValue = "foobar";
> final ProducerRecord<String, String> record =
> new ProducerRecord<>(tp.topic(), producedValue);
> final Cluster cluster = TestUtils.singletonCluster("test", 1);
> final Node node = cluster.nodes().get(0);
> final Properties kafkaProducerConfigs = new Properties();
> kafkaProducerConfigs.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
> node.host() + ":" + node.port());
> kafkaProducerConfigs.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
> StringSerializer.class.getName());
> kafkaProducerConfigs.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
> StringSerializer.class.getName());
> final Producer<String, String> producer = new KafkaProducer<>(kafkaProducerConfigs);
>
> // This times out waiting for a metadata update from the server
> // for the cluster (because there isn't really any real server around)
> final Future<RecordMetadata> futureAck = producer.send(record);
> 
>
>
> Any pointers to existing tests?
>
> -Jaikiran
>


[jira] [Commented] (KAFKA-1577) Exception in ConnectionQuotas while shutting down

2015-01-15 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278873#comment-14278873
 ] 

Joel Koshy commented on KAFKA-1577:
---

[~german.borbolla] Can you make sure you don't have a stray kafka jar in your 
classpath? I have noticed this when switching across git hashes that include 
changes to our build mechanism. Do a "gradlew clean", ensure that "find . 
-name kafka*.jar" returns nothing, and then rebuild before trying to reproduce 
this.

> Exception in ConnectionQuotas while shutting down
> -
>
> Key: KAFKA-1577
> URL: https://issues.apache.org/jira/browse/KAFKA-1577
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.2
>Reporter: Joel Koshy
>Assignee: Sriharsha Chintalapani
>Priority: Blocker
>  Labels: newbie
> Fix For: 0.8.2
>
> Attachments: KAFKA-1577.patch, KAFKA-1577.patch, 
> KAFKA-1577_2014-08-20_19:57:44.patch, KAFKA-1577_2014-08-26_07:33:13.patch, 
> KAFKA-1577_2014-09-26_19:13:05.patch, 
> KAFKA-1577_check_counter_before_decrementing.patch, kafka-logs.tar.gz
>
>
> {code}
> [2014-08-07 19:38:08,228] ERROR Uncaught exception in thread 
> 'kafka-network-thread-9092-0': (kafka.utils.Utils$)
> java.util.NoSuchElementException: None.get
> at scala.None$.get(Option.scala:185)
> at scala.None$.get(Option.scala:183)
> at kafka.network.ConnectionQuotas.dec(SocketServer.scala:471)
> at kafka.network.AbstractServerThread.close(SocketServer.scala:158)
> at kafka.network.AbstractServerThread.close(SocketServer.scala:150)
> at kafka.network.AbstractServerThread.closeAll(SocketServer.scala:171)
> at kafka.network.Processor.run(SocketServer.scala:338)
> at java.lang.Thread.run(Thread.java:662)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1577) Exception in ConnectionQuotas while shutting down

2015-01-15 Thread German Borbolla (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278880#comment-14278880
 ] 

German Borbolla commented on KAFKA-1577:


I will try to reproduce with the release candidate for 0.8.2

> Exception in ConnectionQuotas while shutting down
> -
>
> Key: KAFKA-1577
> URL: https://issues.apache.org/jira/browse/KAFKA-1577
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.2
>Reporter: Joel Koshy
>Assignee: Sriharsha Chintalapani
>Priority: Blocker
>  Labels: newbie
> Fix For: 0.8.2
>
> Attachments: KAFKA-1577.patch, KAFKA-1577.patch, 
> KAFKA-1577_2014-08-20_19:57:44.patch, KAFKA-1577_2014-08-26_07:33:13.patch, 
> KAFKA-1577_2014-09-26_19:13:05.patch, 
> KAFKA-1577_check_counter_before_decrementing.patch, kafka-logs.tar.gz
>
>
> {code}
> [2014-08-07 19:38:08,228] ERROR Uncaught exception in thread 
> 'kafka-network-thread-9092-0': (kafka.utils.Utils$)
> java.util.NoSuchElementException: None.get
> at scala.None$.get(Option.scala:185)
> at scala.None$.get(Option.scala:183)
> at kafka.network.ConnectionQuotas.dec(SocketServer.scala:471)
> at kafka.network.AbstractServerThread.close(SocketServer.scala:158)
> at kafka.network.AbstractServerThread.close(SocketServer.scala:150)
> at kafka.network.AbstractServerThread.closeAll(SocketServer.scala:171)
> at kafka.network.Processor.run(SocketServer.scala:338)
> at java.lang.Thread.run(Thread.java:662)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Jenkins build is back to normal : Kafka-trunk #370

2015-01-15 Thread Apache Jenkins Server
See 



[jira] [Comment Edited] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior

2015-01-15 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278895#comment-14278895
 ] 

Sriharsha Chintalapani edited comment on KAFKA-1461 at 1/15/15 4:24 PM:


[~guozhang] I had the following code in mind for backoff retries in case of 
any error. This code will be under ReplicaFetcherThread.handlePartitions.
I am thinking of maintaining two maps in ReplicaFetcherThread:
  private val partitionsWithErrorStandbyMap = new 
mutable.HashMap[TopicAndPartition, Long] // a (topic, partition) -> offset
  private val partitionsWithErrorMap = new mutable.HashMap[TopicAndPartition, 
Long] // a (topic, partition) -> timestamp
one for the offset and one for the timestamp.
We remove the partitions from AbstractFetcherThread.partitionsMap and add them 
back once currentTime > partitionsWithErrorMap.timestamp + 
replicaFetcherRetryBackoffMs.
I am not quite sure about maintaining these two maps. If it looks ok to you, 
I'll send a patch; if you have any other approach, please let me know.

{code}
  def handlePartitionsWithErrors(partitions: Iterable[TopicAndPartition]) {
    // Record the time of the error for each newly failing partition and
    // stash its current offset so the partition can be re-added later.
    for (partition <- partitions) {
      if (!partitionsWithErrorMap.contains(partition)) {
        partitionsWithErrorMap.put(partition, System.currentTimeMillis())
        currentOffset(partition) match {
          case Some(offset: Long) => partitionsWithErrorStandbyMap.put(partition, offset)
          case None => // no offset known yet, nothing to stash
        }
      }
    }
    removePartitions(partitions.toSet)
    // Re-add partitions whose backoff window has elapsed.
    val now = System.currentTimeMillis()
    val partitionsToBeAdded = new mutable.HashMap[TopicAndPartition, Long]
    val expired = partitionsWithErrorMap.collect {
      case (topicAndPartition, timeMs) if now > timeMs + brokerConfig.replicaFetcherRetryBackoffMs =>
        topicAndPartition
    }
    for (topicAndPartition <- expired) {
      partitionsWithErrorStandbyMap.get(topicAndPartition) match {
        case Some(offset: Long) => partitionsToBeAdded.put(topicAndPartition, offset)
        case None =>
      }
      partitionsWithErrorStandbyMap.remove(topicAndPartition)
      // Clear the timestamp too, so a later error starts a fresh backoff window.
      partitionsWithErrorMap.remove(topicAndPartition)
    }
    addPartitions(partitionsToBeAdded)
  }
{code}


was (Author: sriharsha):
[~guozhang] I had the following code in my mind about backoff retries incase of 
any error. This code will be under ReplicaFetcherThread.handlePartitions.
I am thinking off maintaining two maps in ReplicaFetcherThread
  private val partitionsWithErrorStandbyMap = new 
mutable.HashMap[TopicAndPartition, Long] // a (topic, partition) -> offset
  private val partitionsWithErrorMap = new mutable.HashMap[TopicAndPartition, 
Long] // a (topic, partition) -> timestamp
one for offset and one for timestamp.
remove the partitions from the AbstractFetcherThread.partitionsMap and add back 
to the map once the currentTime > partitionsWithErrorMap.timestamp + 
replicaFetcherRetryBackoffMs .
I am not quite sure about maintaining these two maps . If its look ok to you , 
I'll send a patch or if you have any other approach please let me know. 

```code
  def handlePartitionsWithErrors(partitions: Iterable[TopicAndPartition]) {

//add to the partitionsWithErrorMap with currentTime.
for (partition <- partitions) {
  if(!partitionsWithErrorMap.contains(partition)) {
partitionsWithErrorMap.put(partition, System.currentTimeMillis())
currentOffset(partition) match {
  case Some(offset: Long) =>  
partitionsWithErrorStandbyMap.put(partition, offset)
}
  }
}
removePartitions(partitions.toSet)
val partitionsToBeAdded = new mutable.HashMap[TopicAndPartition, Long]
// process partitionsWithErrorMap and add partitions back if the backoff 
time elapsed.
partitionsWithErrorMap.foreach {
  case((topicAndPartition, timeMs)) =>
if(System.currentTimeMillis() > timeMs + 
brokerConfig.replicaFetcherRetryBackoffMs) {
  partitionsWithErrorStandbyMap.get(topicAndPartition) match {
case Some(offset: Long) => 
partitionsToBeAdded.put(topicAndPartition, offset)
  }
  partitionsWithErrorStandbyMap.remove(topicAndPartition)
}
}
addPartitions(partitionsToBeAdded)
  }
```

> Replica fetcher thread does not implement any back-off behavior
> ---
>
> Key: KAFKA-1461
> URL: https://issues.apache.org/jira/browse/KAFKA-1461
> Project: Kafka
>  Issue Type: Improvement
>  Components: replication
>Affects Versions: 0.8.1.1
>Reporter: Sam Meder
>Assignee: Sriharsha Chintalapani
>  Labels: newbie++
> Fix For: 0.8.3
>
>
> The current replica fetcher thread will retry in a tight loop if any error 
> occurs during the fetch call. For example, we've seen cases where the fetch 
> continuously throws a connection refused exception leading to several replica 
> fetcher threads that spin in a pretty tight loop.
[jira] [Resolved] (KAFKA-595) Decouple producer side compression from server-side compression.

2015-01-15 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy resolved KAFKA-595.
--
Resolution: Implemented
  Assignee: Manikumar Reddy

Yes I think we can close this.

> Decouple producer side compression from server-side compression.
> 
>
> Key: KAFKA-595
> URL: https://issues.apache.org/jira/browse/KAFKA-595
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8.0
>Reporter: Neha Narkhede
>Assignee: Manikumar Reddy
>  Labels: feature
>
> In 0.7 Kafka always appended messages to the log using whatever compression 
> codec the client used. In 0.8, after the KAFKA-506 patch, the master always 
> recompresses the message before appending to the log to assign ids. Currently 
> the server uses a funky heuristic to choose a compression codec based on the 
> codecs the producer used. This doesn't actually make that much sense. It 
> would be better for the server to have its own compression (a global default 
> and per-topic override) that specified the compression codec, and have the 
> server always recompress with this codec regardless of the original codec.
> Compression currently happens in kafka.log.Log.assignOffsets (perhaps should 
> be renamed if it takes on compression as an official responsibility instead 
> of a side-effect).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior

2015-01-15 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278895#comment-14278895
 ] 

Sriharsha Chintalapani commented on KAFKA-1461:
---

[~guozhang] I had the following code in mind for backoff retries in case of 
any error. This code will be under ReplicaFetcherThread.handlePartitions.
I am thinking of maintaining two maps in ReplicaFetcherThread:
  private val partitionsWithErrorStandbyMap = new 
mutable.HashMap[TopicAndPartition, Long] // a (topic, partition) -> offset
  private val partitionsWithErrorMap = new mutable.HashMap[TopicAndPartition, 
Long] // a (topic, partition) -> timestamp
one for the offset and one for the timestamp.
We remove the partitions from AbstractFetcherThread.partitionsMap and add them 
back once currentTime > partitionsWithErrorMap.timestamp + 
replicaFetcherRetryBackoffMs.
I am not quite sure about maintaining these two maps. If it looks ok to you, 
I'll send a patch; if you have any other approach, please let me know.

```code
  def handlePartitionsWithErrors(partitions: Iterable[TopicAndPartition]) {

//add to the partitionsWithErrorMap with currentTime.
for (partition <- partitions) {
  if(!partitionsWithErrorMap.contains(partition)) {
partitionsWithErrorMap.put(partition, System.currentTimeMillis())
currentOffset(partition) match {
  case Some(offset: Long) =>  
partitionsWithErrorStandbyMap.put(partition, offset)
}
  }
}
removePartitions(partitions.toSet)
val partitionsToBeAdded = new mutable.HashMap[TopicAndPartition, Long]
// process partitionsWithErrorMap and add partitions back if the backoff 
time elapsed.
partitionsWithErrorMap.foreach {
  case((topicAndPartition, timeMs)) =>
if(System.currentTimeMillis() > timeMs + 
brokerConfig.replicaFetcherRetryBackoffMs) {
  partitionsWithErrorStandbyMap.get(topicAndPartition) match {
case Some(offset: Long) => 
partitionsToBeAdded.put(topicAndPartition, offset)
  }
  partitionsWithErrorStandbyMap.remove(topicAndPartition)
}
}
addPartitions(partitionsToBeAdded)
  }
```

> Replica fetcher thread does not implement any back-off behavior
> ---
>
> Key: KAFKA-1461
> URL: https://issues.apache.org/jira/browse/KAFKA-1461
> Project: Kafka
>  Issue Type: Improvement
>  Components: replication
>Affects Versions: 0.8.1.1
>Reporter: Sam Meder
>Assignee: Sriharsha Chintalapani
>  Labels: newbie++
> Fix For: 0.8.3
>
>
> The current replica fetcher thread will retry in a tight loop if any error 
> occurs during the fetch call. For example, we've seen cases where the fetch 
> continuously throws a connection refused exception leading to several replica 
> fetcher threads that spin in a pretty tight loop.
> To a much lesser degree this is also an issue in the consumer fetcher thread, 
> although the fact that erroring partitions are removed so a leader can be 
> re-discovered helps some.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1333) Add consumer co-ordinator module to the server

2015-01-15 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278905#comment-14278905
 ] 

Onur Karaman commented on KAFKA-1333:
-

Hey everyone. I spoke with Guozhang about this yesterday. He has started this 
work.

> Add consumer co-ordinator module to the server
> --
>
> Key: KAFKA-1333
> URL: https://issues.apache.org/jira/browse/KAFKA-1333
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Affects Versions: 0.9.0
>Reporter: Neha Narkhede
>Assignee: Guozhang Wang
>
> Scope of this JIRA is to just add a consumer co-ordinator module that do the 
> following:
> 1) coordinator start-up, metadata initialization
> 2) simple join group handling (just updating metadata, no failure detection / 
> rebalancing): this should be sufficient for consumers doing self offset / 
> partition management.
> Offset manager will still run side-by-side with the coordinator in this JIRA, 
> and we will merge it in KAFKA-1740.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1507) Using GetOffsetShell against non-existent topic creates the topic unintentionally

2015-01-15 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278907#comment-14278907
 ] 

Sriharsha Chintalapani commented on KAFKA-1507:
---

[~junrao] [~nehanarkhede] Is this a JIRA that can be included in 0.8.3 or 0.9.0? 
If so, I'll do an upmerge and resend the patch. Thanks.

> Using GetOffsetShell against non-existent topic creates the topic 
> unintentionally
> -
>
> Key: KAFKA-1507
> URL: https://issues.apache.org/jira/browse/KAFKA-1507
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.1
> Environment: centos
>Reporter: Luke Forehand
>Assignee: Sriharsha Chintalapani
>Priority: Minor
>  Labels: newbie
> Attachments: KAFKA-1507.patch, KAFKA-1507.patch, 
> KAFKA-1507_2014-07-22_10:27:45.patch, KAFKA-1507_2014-07-23_17:07:20.patch, 
> KAFKA-1507_2014-08-12_18:09:06.patch, KAFKA-1507_2014-08-22_11:06:38.patch, 
> KAFKA-1507_2014-08-22_11:08:51.patch
>
>
> A typo when using the GetOffsetShell command can cause a
> topic to be created which cannot be deleted (because deletion is still in
> progress):
> ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list
> kafka10:9092,kafka11:9092,kafka12:9092,kafka13:9092 --topic typo --time 1
> ./kafka-topics.sh --zookeeper stormqa1/kafka-prod --describe --topic typo
> Topic:typo  PartitionCount:8ReplicationFactor:1 Configs:
>  Topic: typo Partition: 0Leader: 10  Replicas: 10
>   Isr: 10
> ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] 0.8.2.0 Candidate 1

2015-01-15 Thread Jun Rao
Thanks for reporting this. I will remove that option in RC2.

Jun

On Thu, Jan 15, 2015 at 5:21 AM, Jaikiran Pai 
wrote:

> I just downloaded the Kafka binary and am trying this on my 32 bit JVM
> (Java 7)? Trying to start Zookeeper or Kafka server keeps failing with
> "Unrecognized VM option 'UseCompressedOops'":
>
> ./zookeeper-server-start.sh ../config/zookeeper.properties
> Unrecognized VM option 'UseCompressedOops'
> Error: Could not create the Java Virtual Machine.
> Error: A fatal exception has occurred. Program will exit.
>
> Same with the Kafka server startup scripts. My Java version is:
>
> java version "1.7.0_71"
> Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
> Java HotSpot(TM) Server VM (build 24.71-b01, mixed mode)
>
> Should there be a check in the script, before adding this option?
>
> -Jaikiran
>
> On Wednesday 14 January 2015 10:08 PM, Jun Rao wrote:
>
>> + users mailing list. It would be great if people can test this out and
>> report any blocker issues.
>>
>> Thanks,
>>
>> Jun
>>
>> On Tue, Jan 13, 2015 at 7:16 PM, Jun Rao  wrote:
>>
>>  This is the first candidate for release of Apache Kafka 0.8.2.0. There
>>> has been some changes since the 0.8.2 beta release, especially in the new
>>> java producer api and jmx mbean names. It would be great if people can
>>> test
>>> this out thoroughly. We are giving people 10 days for testing and voting.
>>>
>>> Release Notes for the 0.8.2.0 release
>>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/RELEASE_NOTES.html
>>>
>>> *** Please download, test and vote by Friday, Jan 23rd, 7pm PT
>>>
>>> Kafka's KEYS file containing PGP keys we use to sign the release:
>>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/KEYS in
>>> addition to the md5, sha1 and sha2 (SHA256) checksums.
>>>
>>> * Release artifacts to be voted upon (source and binary):
>>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/
>>>
>>> * Maven artifacts to be voted upon prior to release:
>>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/maven_staging/
>>>
>>> * scala-doc
>>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/scaladoc/#package
>>>
>>> * java-doc
>>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/javadoc/
>>>
>>> * The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2.0 tag
>>> https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=b0c7d579f8aeb5750573008040a42b7377a651d5
>>>
>>>
>>> Thanks,
>>>
>>> Jun
>>>
>>>
>


[jira] [Commented] (KAFKA-1333) Add consumer co-ordinator module to the server

2015-01-15 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278920#comment-14278920
 ] 

Jay Kreps commented on KAFKA-1333:
--

Woooh!!! :-)

> Add consumer co-ordinator module to the server
> --
>
> Key: KAFKA-1333
> URL: https://issues.apache.org/jira/browse/KAFKA-1333
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Affects Versions: 0.9.0
>Reporter: Neha Narkhede
>Assignee: Guozhang Wang
>
> Scope of this JIRA is to just add a consumer co-ordinator module that does the 
> following:
> 1) coordinator start-up, metadata initialization
> 2) simple join group handling (just updating metadata, no failure detection / 
> rebalancing): this should be sufficient for consumers doing self offset / 
> partition management.
> Offset manager will still run side-by-side with the coordinator in this JIRA, 
> and we will merge it in KAFKA-1740.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1507) Using GetOffsetShell against non-existent topic creates the topic unintentionally

2015-01-15 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278929#comment-14278929
 ] 

Jay Kreps commented on KAFKA-1507:
--

I think the right way to do this is have a proper create/delete/alter topic api 
(which i think is in-flight now). We should make having the get metadata 
request auto-creating topics optional and disable it by default (e.g. add an 
option like metadata.requests.auto.create=false). We can retain the auto-create 
functionality in the producer by having it issue this request in response to 
errors about a non-existent topic.

I don't think we should change the java api of the producer to expose this 
(i.e. add a producer.createTopic(name, replication, partitions, etc). Instead I 
think we should consider a Java admin client that exposes this functionality. 
This would be where we would expose other operational apis as well. The 
rationale for this is that creating, deleting, and modifying topics is actually 
not part of normal application usage so having it directly exposed in the 
producer is a bit dangerous.

We should definitely do a KIP proposal around this and get the design and API 
worked out first. I think we could do this in 0.8.3 if you are up to work on 
it. It would likely depend on some of the changes in KAFKA-1760 so we would 
want to merge that first.

> Using GetOffsetShell against non-existent topic creates the topic 
> unintentionally
> -
>
> Key: KAFKA-1507
> URL: https://issues.apache.org/jira/browse/KAFKA-1507
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.1
> Environment: centos
>Reporter: Luke Forehand
>Assignee: Sriharsha Chintalapani
>Priority: Minor
>  Labels: newbie
> Attachments: KAFKA-1507.patch, KAFKA-1507.patch, 
> KAFKA-1507_2014-07-22_10:27:45.patch, KAFKA-1507_2014-07-23_17:07:20.patch, 
> KAFKA-1507_2014-08-12_18:09:06.patch, KAFKA-1507_2014-08-22_11:06:38.patch, 
> KAFKA-1507_2014-08-22_11:08:51.patch
>
>
> A typo in using GetOffsetShell command can cause a
> topic to be created which cannot be deleted (because deletion is still in
> progress)
> ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list
> kafka10:9092,kafka11:9092,kafka12:9092,kafka13:9092 --topic typo --time 1
> ./kafka-topics.sh --zookeeper stormqa1/kafka-prod --describe --topic typo
> Topic:typo  PartitionCount:8ReplicationFactor:1 Configs:
>  Topic: typo Partition: 0Leader: 10  Replicas: 10
>   Isr: 10
> ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Heads up: KAFKA-1697 - remove code related to ack>1 on the broker

2015-01-15 Thread Ewen Cheslack-Postava
Right, so this looks like it could create an issue similar to what's
currently being discussed in
https://issues.apache.org/jira/browse/KAFKA-1649 where users now get errors
under conditions when they previously wouldn't. Old clients won't even know
about the error code, so besides failing they won't even be able to log any
meaningful error messages.

I think there are two options for compatibility:

1. An alternative change is to remove the ack > 1 code, but silently
"upgrade" requests with acks > 1 to acks = -1. This isn't the same as other
changes to behavior since the interaction between the client and server
remains the same, no error codes change, etc. The client might just see
some increased latency since the message might need to be replicated to
more brokers than they requested.
2. Split this into two patches, one that bumps the protocol version on that
message to include the new error code but maintains both old (now
deprecated) and new behavior, then a second that would be applied in a
later release that removes the old protocol + code for handling acks > 1.

2 is probably the right thing to do. If we specify the release when we'll
remove the deprecated protocol at the time of deprecation it makes things a
lot easier for people writing non-java clients and could give users better
predictability (e.g. if clients are at most 1 major release behind brokers,
they'll remain compatible but possibly use deprecated features).


On Wed, Jan 14, 2015 at 3:51 PM, Gwen Shapira  wrote:

> Hi Kafka Devs,
>
> We are working on KAFKA-1697 - remove code related to ack>1 on the
> broker. Per Neha's suggestion, I'd like to give everyone a heads up on
> what these changes mean.
>
> Once this patch is included, any produce requests that include
> request.required.acks > 1 will result in an exception. This will be
> InvalidRequiredAcks in new versions (0.8.3 and up, I assume) and
> UnknownException in existing versions (sorry, but I can't add error
> codes retroactively).
>
> This behavior is already enforced by 0.8.2 producers (sync and new),
> but we expect impact on users with older producers that relied on acks
> > 1 and external clients (e.g. Python, Go, etc.).
>
> Users who relied on acks > 1 are expected to switch to using acks = -1
> and a min.isr parameter that matches their use case.
>
> This change was discussed in the past in the context of KAFKA-1555
> (min.isr), but let us know if you have any questions or concerns
> regarding this change.
>
> Gwen
>



-- 
Thanks,
Ewen


[jira] [Commented] (KAFKA-1507) Using GetOffsetShell against non-existent topic creates the topic unintentionally

2015-01-15 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279012#comment-14279012
 ] 

Sriharsha Chintalapani commented on KAFKA-1507:
---

[~jkreps] thanks for the comments.  
"We should make having the get metadata request auto-creating topics optional 
and disable it by default (e.g. add an option like 
metadata.requests.auto.create=false)"
Currently the metadata request checks the broker config 
"auto.create.topics.enable". This is disabled in my patch.

"We can retain the auto-create functionality in the producer by having it issue 
this request in response to errors about a non-existent topic."
This is what my current patch does: it sets "auto.create.topics.enable" to 
false in the broker config, and when the producer makes a TopicMetadataRequest 
it returns Unknown_Topic_or_partition. ProducerConfig has new properties like 
"auto.create.topics.enable". 
If this property is set to true (the default), the producer issues a new 
CreateTopicRequest upon receiving the unknown_topic_or_partition error, which 
then creates the topic on the broker side with the configured numPartitions 
and replicationFactor.
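The flow described above can be sketched roughly as follows. All names here (the `client` stand-in and its methods) are hypothetical illustrations, not the actual patch:

```python
UNKNOWN_TOPIC_OR_PARTITION = 3  # Kafka protocol error code

def send_with_auto_create(client, topic, record, auto_create=True):
    """On an unknown-topic metadata error, issue one CreateTopicRequest and
    retry the metadata lookup. `client` is a hypothetical stand-in for the
    producer's network layer, not real producer internals."""
    err = client.topic_metadata(topic)
    if err == UNKNOWN_TOPIC_OR_PARTITION and auto_create:
        client.create_topic(topic)  # the new CreateTopicRequest
        err = client.topic_metadata(topic)
    if err:
        raise RuntimeError("topic %s unavailable (error %d)" % (topic, err))
    return client.send(topic, record)
```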

"I don't think we should change the java api of the producer to expose this 
(i.e. add a producer.createTopic(name, replication, partitions, etc). ."

I agree, and my patch doesn't change any producer API; it doesn't add 
producer.createTopic, but it does introduce createTopicRequest and 
createTopicResponse.

"Instead I think we should consider a Java admin client that exposes this 
functionality. This would be where we would expose other operational apis as 
well. The rationale for this is that creating, deleting, and modifying topics 
is actually not part of normal application usage so having it directly exposed 
in the producer is a bit dangerous."

I am not sure I understand the Java admin client correctly. We already 
have AdminUtils; is this about introducing network APIs for 
create/delete/alter topics? Even in that case I think this patch can be 
useful, as the producer just makes a createTopicRequest, which I think is what 
you want unless I missed something :). 




> Using GetOffsetShell against non-existent topic creates the topic 
> unintentionally
> -
>
> Key: KAFKA-1507
> URL: https://issues.apache.org/jira/browse/KAFKA-1507
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.1
> Environment: centos
>Reporter: Luke Forehand
>Assignee: Sriharsha Chintalapani
>Priority: Minor
>  Labels: newbie
> Attachments: KAFKA-1507.patch, KAFKA-1507.patch, 
> KAFKA-1507_2014-07-22_10:27:45.patch, KAFKA-1507_2014-07-23_17:07:20.patch, 
> KAFKA-1507_2014-08-12_18:09:06.patch, KAFKA-1507_2014-08-22_11:06:38.patch, 
> KAFKA-1507_2014-08-22_11:08:51.patch
>
>
> A typo in using GetOffsetShell command can cause a
> topic to be created which cannot be deleted (because deletion is still in
> progress)
> ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list
> kafka10:9092,kafka11:9092,kafka12:9092,kafka13:9092 --topic typo --time 1
> ./kafka-topics.sh --zookeeper stormqa1/kafka-prod --describe --topic typo
> Topic:typo  PartitionCount:8ReplicationFactor:1 Configs:
>  Topic: typo Partition: 0Leader: 10  Replicas: 10
>   Isr: 10
> ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Heads up: KAFKA-1697 - remove code related to ack>1 on the broker

2015-01-15 Thread Joe Stein
Looping in the mailing list that the client developers live on, because they
are not all on dev (though they should be if they want to be helping to
build the best client libraries they can).

I wholeheartedly believe that we need to not break existing functionality of
the client protocol, ever.

There are many reasons for this and we have other threads on the mailing
list where we are discussing that topic (no pun intended) that I don't want
to re-hash here.

If we change wire protocol functionality *OR* the binary format (either) we
must bump version *AND* treat version as a feature flag with backward
compatibility support until it is deprecated for some time for folks to
deal with it.

version match {
  case 0 => keepDoingWhatWeWereDoing()
  case 1 => doNewStuff()
  case 2 => doEvenMoreNewStuff()
}

has to be a practice we adopt imho ... I know feature flags can be
construed as "messy code" but I am eager to hear another (better?
different?) solution to this.
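The match/case above can be made concrete as a runnable dispatch table. Handler names and behavior here are illustrative assumptions, not actual broker code:

```python
def handle_v0(request):
    # Legacy behavior: acks > 1 is still accepted and handled as before.
    return ("v0", request)

def handle_v1(request):
    # New behavior: acks > 1 is rejected with InvalidRequiredAcks.
    if request.get("acks", 1) > 1:
        raise ValueError("InvalidRequiredAcks")
    return ("v1", request)

HANDLERS = {0: handle_v0, 1: handle_v1}

def handle_produce(api_version, request):
    """Dispatch on the request's API version; unknown versions are an error."""
    try:
        handler = HANDLERS[api_version]
    except KeyError:
        raise ValueError("unsupported produce request version %d" % api_version)
    return handler(request)

# A v0 client sending acks=3 keeps working during the deprecation window:
assert handle_produce(0, {"acks": 3})[0] == "v0"
```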

If we don't do a feature flag like this specifically with this change, then
what happens is that someone upgrades their brokers with a rolling restart
in 0.8.3 and every single one of their producer requests starts to fail and
they have a major production outage.

I do 100% agree that > 1 makes no sense and we *REALLY* need people to
start using 0, 1, -1, but we need to do that in a way that is going to work
for everyone.

Old producers and consumers must keep working with new brokers, and if we
are not going to support that then I am unclear what the use of "version"
is, based on our original intentions of having it because of the 0.7 => 0.8
transition. We said no more breaking changes when we did that.

- Joe Stein

On Thu, Jan 15, 2015 at 12:38 PM, Ewen Cheslack-Postava 
wrote:

> Right, so this looks like it could create an issue similar to what's
> currently being discussed in
> https://issues.apache.org/jira/browse/KAFKA-1649 where users now get
> errors
> under conditions when they previously wouldn't. Old clients won't even know
> about the error code, so besides failing they won't even be able to log any
> meaningful error messages.
>
> I think there are two options for compatibility:
>
> 1. An alternative change is to remove the ack > 1 code, but silently
> "upgrade" requests with acks > 1 to acks = -1. This isn't the same as other
> changes to behavior since the interaction between the client and server
> remains the same, no error codes change, etc. The client might just see
> some increased latency since the message might need to be replicated to
> more brokers than they requested.
> 2. Split this into two patches, one that bumps the protocol version on that
> message to include the new error code but maintains both old (now
> deprecated) and new behavior, then a second that would be applied in a
> later release that removes the old protocol + code for handling acks > 1.
>
> 2 is probably the right thing to do. If we specify the release when we'll
> remove the deprecated protocol at the time of deprecation it makes things a
> lot easier for people writing non-java clients and could give users better
> predictability (e.g. if clients are at most 1 major release behind brokers,
> they'll remain compatible but possibly use deprecated features).
>
>
> On Wed, Jan 14, 2015 at 3:51 PM, Gwen Shapira 
> wrote:
>
> > Hi Kafka Devs,
> >
> > We are working on KAFKA-1697 - remove code related to ack>1 on the
> > broker. Per Neha's suggestion, I'd like to give everyone a heads up on
> > what these changes mean.
> >
> > Once this patch is included, any produce requests that include
> > request.required.acks > 1 will result in an exception. This will be
> > InvalidRequiredAcks in new versions (0.8.3 and up, I assume) and
> > UnknownException in existing versions (sorry, but I can't add error
> > codes retroactively).
> >
> > This behavior is already enforced by 0.8.2 producers (sync and new),
> > but we expect impact on users with older producers that relied on acks
> > > 1 and external clients (e.g. Python, Go, etc.).
> >
> > Users who relied on acks > 1 are expected to switch to using acks = -1
> > and a min.isr parameter that matches their use case.
> >
> > This change was discussed in the past in the context of KAFKA-1555
> > (min.isr), but let us know if you have any questions or concerns
> > regarding this change.
> >
> > Gwen
> >
>
>
>
> --
> Thanks,
> Ewen
>


[jira] [Commented] (KAFKA-1697) remove code related to ack>1 on the broker

2015-01-15 Thread Joe Stein (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279037#comment-14279037
 ] 

Joe Stein commented on KAFKA-1697:
--

With this patch I think we should change the existing functionality with 1 
update to start to LOG as a WARN in the Broker (so it gets people attention to 
stop using ack >1) but keep everything else the same... the new version of the 
request (with a match/case) should do the new functionality.

> remove code related to ack>1 on the broker
> --
>
> Key: KAFKA-1697
> URL: https://issues.apache.org/jira/browse/KAFKA-1697
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jun Rao
>Assignee: Gwen Shapira
> Fix For: 0.8.3
>
> Attachments: KAFKA-1697.patch, KAFKA-1697_2015-01-14_15:41:37.patch
>
>
> We removed the ack>1 support from the producer client in kafka-1555. We can 
> completely remove the code in the broker that supports ack>1.
> Also, we probably want to make NotEnoughReplicasAfterAppend a non-retriable 
> exception and let the client decide what to do.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1697) remove code related to ack>1 on the broker

2015-01-15 Thread Joe Stein (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279037#comment-14279037
 ] 

Joe Stein edited comment on KAFKA-1697 at 1/15/15 6:21 PM:
---

With this patch I think we should change the existing ack > 1 functionality to 
start to LOG a WARN in the broker (so it gets people's attention to stop using 
ack > 1) but keep everything else the same... the new version of the request 
(with a match/case) should do the new functionality, and we support both.


was (Author: joestein):
With this patch I think we should change the existing functionality with 1 
update to start to LOG as a WARN in the Broker (so it gets people attention to 
stop using ack >1) but keep everything else the same... the new version of the 
request (with a match/case) should do the new functionality.

> remove code related to ack>1 on the broker
> --
>
> Key: KAFKA-1697
> URL: https://issues.apache.org/jira/browse/KAFKA-1697
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jun Rao
>Assignee: Gwen Shapira
> Fix For: 0.8.3
>
> Attachments: KAFKA-1697.patch, KAFKA-1697_2015-01-14_15:41:37.patch
>
>
> We removed the ack>1 support from the producer client in kafka-1555. We can 
> completely remove the code in the broker that supports ack>1.
> Also, we probably want to make NotEnoughReplicasAfterAppend a non-retriable 
> exception and let the client decide what to do.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [kafka-clients] Re: Heads up: KAFKA-1697 - remove code related to ack>1 on the broker

2015-01-15 Thread Mark Roberts
This would sting a whole lot less if there were a programmatic way to discover 
which server version is in use. Also, how will this work in mixed-version 
clusters (during an upgrade, for example)?


> On Jan 15, 2015, at 10:10, Joe Stein  wrote:
> 
> Looping in the mailing list that the client developers live on because they 
> are all not on dev (though they should be if they want to be helping to build 
> the best client libraries they can).
> 
> I wholeheartedly believe that we need to not break existing functionality of 
> the client protocol, ever.
> 
> There are many reasons for this and we have other threads on the mailing list 
> where we are discussing that topic (no pun intended) that I don't want to 
> re-hash here.
> 
> If we change wire protocol functionality OR the binary format (either) we 
> must bump version AND treat version as a feature flag with backward 
> compatibility support until it is deprecated for some time for folks to deal 
> with it.
> 
> version match {
>   case 0 => keepDoingWhatWeWereDoing()
>   case 1 => doNewStuff()
>   case 2 => doEvenMoreNewStuff()
> }
> 
> has to be a practice we adopt imho ... I know feature flags can be construed 
> as "messy code" but I am eager to hear another (better? different?) solution 
> to this.
> 
> If we don't do a feature flag like this specifically with this change, then 
> what happens is that someone upgrades their brokers with a rolling restart in 
> 0.8.3 and every single one of their producer requests starts to fail and they 
> have a major production outage.
> 
> I do 100% agree that > 1 makes no sense and we *REALLY* need people to start 
> using 0, 1, -1, but we need to do that in a way that is going to work for 
> everyone.
> 
> Old producers and consumers must keep working with new brokers, and if we are 
> not going to support that then I am unclear what the use of "version" is, 
> based on our original intentions of having it because of the 0.7 => 0.8 
> transition. We said no more breaking changes when we did that.
> 
> - Joe Stein
> 
>> On Thu, Jan 15, 2015 at 12:38 PM, Ewen Cheslack-Postava  
>> wrote:
>> Right, so this looks like it could create an issue similar to what's
>> currently being discussed in
>> https://issues.apache.org/jira/browse/KAFKA-1649 where users now get errors
>> under conditions when they previously wouldn't. Old clients won't even know
>> about the error code, so besides failing they won't even be able to log any
>> meaningful error messages.
>> 
>> I think there are two options for compatibility:
>> 
>> 1. An alternative change is to remove the ack > 1 code, but silently
>> "upgrade" requests with acks > 1 to acks = -1. This isn't the same as other
>> changes to behavior since the interaction between the client and server
>> remains the same, no error codes change, etc. The client might just see
>> some increased latency since the message might need to be replicated to
>> more brokers than they requested.
>> 2. Split this into two patches, one that bumps the protocol version on that
>> message to include the new error code but maintains both old (now
>> deprecated) and new behavior, then a second that would be applied in a
>> later release that removes the old protocol + code for handling acks > 1.
>> 
>> 2 is probably the right thing to do. If we specify the release when we'll
>> remove the deprecated protocol at the time of deprecation it makes things a
>> lot easier for people writing non-java clients and could give users better
>> predictability (e.g. if clients are at most 1 major release behind brokers,
>> they'll remain compatible but possibly use deprecated features).
>> 
>> 
>> On Wed, Jan 14, 2015 at 3:51 PM, Gwen Shapira  wrote:
>> 
>> > Hi Kafka Devs,
>> >
>> > We are working on KAFKA-1697 - remove code related to ack>1 on the
>> > broker. Per Neha's suggestion, I'd like to give everyone a heads up on
>> > what these changes mean.
>> >
>> > Once this patch is included, any produce requests that include
>> > request.required.acks > 1 will result in an exception. This will be
>> > InvalidRequiredAcks in new versions (0.8.3 and up, I assume) and
>> > UnknownException in existing versions (sorry, but I can't add error
>> > codes retroactively).
>> >
>> > This behavior is already enforced by 0.8.2 producers (sync and new),
>> > but we expect impact on users with older producers that relied on acks
>> > > 1 and external clients (e.g. Python, Go, etc.).
>> >
>> > Users who relied on acks > 1 are expected to switch to using acks = -1
>> > and a min.isr parameter that matches their use case.
>> >
>> > This change was discussed in the past in the context of KAFKA-1555
>> > (min.isr), but let us know if you have any questions or concerns
>> > regarding this change.
>> >
>> > Gwen
>> >
>> 
>> 
>> 
>> --
>> Thanks,
>> Ewen
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "kafka-clients" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to kafka-c

Re: Heads up: KAFKA-1697 - remove code related to ack>1 on the broker

2015-01-15 Thread Magnus Edenhill
I very much agree on what Joe is saying, let's use the version field as
intended
and be very strict with not removing nor altering existing behaviour
without bumping the version.
Old API versions could be deprecated (documentation only?) immediately and
removed completely in the next minor version bump (0.8->0.9).

An API to query supported API versions would be a good addition in the long
run, but it doesn't help current clients much: such a request to an older
broker will kill the connection without any error reporting to the client,
making it rather useless in the short term.
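The limitation Magnus describes can be sketched as client-side fallback logic. The version-query request shown here is hypothetical (no such API existed at the time):

```python
def probe_api_versions(conn, known_apis):
    """Try a hypothetical 'which API versions do you support?' request.
    An older broker does not recognize it and drops the connection instead
    of replying with an error, so a disconnect must be treated as 'assume
    the oldest version for everything' -- which is why this helps little
    in the short term."""
    try:
        return conn.request_api_versions()
    except ConnectionError:
        return {api: 0 for api in known_apis}  # conservative fallback
```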

Regards,
Magnus


[DISCUSS] KIPs

2015-01-15 Thread Jay Kreps
The idea of KIPs came up in a previous discussion but there was no real
crisp definition of what they were. Here is an attempt at defining a
process:
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals

The trick here is to have something light-weight enough that it isn't a
hassle for small changes, but enough so that changes get the eyeballs of
the committers and heavy users.

Thoughts?

-Jay


[GitHub] kafka pull request: Trunk

2015-01-15 Thread SylviaVargasCTL
GitHub user SylviaVargasCTL opened a pull request:

https://github.com/apache/kafka/pull/42

Trunk



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/42.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #42


commit de6066e8e78e62739e3eb6f771d0739bf9b73dfd
Author: Stevo Slavic 
Date:   2014-04-09T15:13:53Z

kafka-1370; Gradle startup script for Windows; patched by Stevo Slavic; 
reviewed by Jun Rao

commit 75d5f5bff8519b36d5eb0a904ebbd0d3c0b7c8cc
Author: Stevo Slavic 
Date:   2014-04-09T15:24:13Z

kafka-1375; Formatting for in README.md is broken; patched by Stevo Slavic; 
reviewed by Jun Rao

commit 8d15de85114da6012530f0dd837f131bd1e367cd
Author: Joel Koshy 
Date:   2014-04-08T21:21:46Z

KAFKA-1373; Set first dirty (uncompacted) offset to first offset of the
log if no checkpoint exists. Reviewed by Neha Narkhede and Timothy Chen.

commit 911ff524515148421ff58b41f271c21f792ed9de
Author: Guozhang Wang 
Date:   2014-04-09T21:49:17Z

kafka-1364; ReplicaManagerTest hard-codes log dir; patched by Guozhang 
Wang; reviewed by Jun Rao

commit 47019a849e69209c16defa81001055aa9f57674d
Author: Jun Rao 
Date:   2014-04-09T21:53:03Z

kafka-1376; transient test failure in UncleanLeaderElectionTest; patched by 
Jun Rao; reviewed by Joel Koshy

commit 44ee6b7c9d9da207bebe6d927b38ed7df1388df3
Author: Guozhang Wang 
Date:   2014-04-10T01:52:23Z

kafka-1353;report capacity used by request thread pool and network thread 
pool; patched by Guozhang Wang; reviewed by Jun Rao

commit 2d429e19da22416aeb7de68b9e00f33a337e31a0
Author: Guozhang Wang 
Date:   2014-04-11T21:39:35Z

kafka-1337; follow-up patch to add broker list for new producer in system 
test overriden function; patched by Guozhang Wang; reviewed by Neha Narkhede, 
Jun Rao

commit a3a2cba842a9945d4ce7b032e311d17956c33249
Author: Timothy Chen 
Date:   2014-04-12T03:31:13Z

kafka-1363; testTopicConfigChangesDuringDeleteTopic hangs; patched by 
Timothy Chen; reviewed by Guozhang Wang, Neha Narkhede and Jun Rao

commit 6bb616e5ae6a2f45cfa190e245c9217a8bf9771a
Author: Jun Rao 
Date:   2014-04-12T04:29:09Z

trivial change to add kafka doap project file

commit 05612ac44de775cf3bc8ffdc15c033920d4a1440
Author: Jun Rao 
Date:   2014-04-12T20:47:13Z

kafka-1377; transient unit test failure in LogOffsetTest; patched by Jun 
Rao; reviewed by Neha Narkhede

commit d37ca7f627551c9960e2edb8498784bf2487d53e
Author: Jun Rao 
Date:   2014-04-12T20:48:57Z

kafka-1378; transient unit test failure in LogRecoveryTest; patched by Jun 
Rao; reviewed by Neha Narkhede

commit 2bfd49b955831f3455ff486ce4f613d73239a317
Author: Jun Rao 
Date:   2014-04-12T20:51:29Z

kafka-1381; transient unit test failure in AddPartitionsTest; patched by 
Jun Rao; reviewed by Neha Narkhede

commit 4bd33e5ba792667913991638a15f0a2c9f20d7b5
Author: Stevo Slavic 
Date:   2014-04-13T15:43:53Z

kafka-1210; Windows Bat files are not working properly; patched by Stevo 
Slavic; reviewed by Jun Rao

commit 9a6f7113ed630d8e6bb50f4a58846d976a2d5f97
Author: Jun Rao 
Date:   2014-04-15T20:46:54Z

kafka-1390; TestUtils.waitUntilLeaderIsElectedOrChanged may wait longer 
than it needs; patched by Jun Rao; reviewed by Guozhang Wang

commit ec075c5a853e4168ff30cf133493588671aa2fac
Author: Jay Kreps 
Date:   2014-04-16T17:19:18Z

KAFKA-1359: Ensure all topic/server metrics registered at once.

commit 97f13ec73255f3978c1cbb80ea26f446cf756515
Author: Joel Koshy 
Date:   2014-04-15T21:08:21Z

KAFKA-1323; Fix regression due to KAFKA-1315 (support for relative
directories in log.dirs property broke). Patched by Timothy Chen and
Guozhang Wang; reviewed by Joel Koshy, Neha Narkhede and Jun Rao.

commit b9351e04f0a26f2f9e4abc7ec526ed802495f991
Author: Jay Kreps 
Date:   2014-04-18T00:07:27Z

KAFKA-1398 dynamic config changes are broken.

commit 8b052150f55fb9a6f8c4aedc6b3fa0528719671e
Author: Jay Kreps 
Date:   2014-04-17T00:32:43Z

KAFKA-1337: Fix incorrect producer configs after config renaming.

commit 037c054be260bcc3b470b9724572cb814b704bff
Author: Joel Koshy 
Date:   2014-04-18T20:10:34Z

KAFKA-1362; Publish sources and javadoc jars; (also removed Scala 
2.8.2-specific actions). Reviewed by Jun Rao and Joe Stein

commit 89f040c8c9807fc4f9e219f0b57279b692b6e45d
Author: Jay Kreps 
Date:   2014-04-18T18:03:37Z

KAFKA-1398 Dynamic config follow-on-comments.

commit 3af3efe3773348fd7adb8ca43f2abc5490416e55
Author: Joel Koshy 
Date:   2014-04-22T00:06:39Z

KAFKA-1327; Log cleaner metrics follow-up patch to reset dirtiest log 
cleanable ratio; reviewed by Jun Rao
(cherry picked from commit 874620d)

commit ed68ba402e088070da0513a3675ed420ec6a70b0

Re: [kafka-clients] Re: Heads up: KAFKA-1697 - remove code related to ack>1 on the broker

2015-01-15 Thread Gwen Shapira
Is the protocol bump caused by the behavior change or the new error code?

1) IMO, error codes are data, and clients can expect to receive errors
that they don't understand (i.e. unknown errors). AFAIK, clients don't
break on unknown errors; they are simply more challenging to debug. If
we document the new behavior, then it's definitely debuggable and
fixable.

2) The behavior change is basically a deprecation - i.e. acks > 1 was
never documented, and is not supported by Kafka clients starting with
version 0.8.2. I'm not sure this requires a protocol bump either,
although it's a better case than new error codes.
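The "error codes are data" point can be illustrated by how a client typically decodes codes it has never seen. The table below is an abbreviated, illustrative subset, not a complete Kafka error-code list:

```python
# Abbreviated, illustrative subset of a client's error-code table.
KNOWN_ERRORS = {
    0: "NoError",
    1: "OffsetOutOfRange",
    3: "UnknownTopicOrPartition",
}

def describe_error(code):
    """Unknown codes don't break the client: they decode to a generic
    'unknown' label, which is harder to debug but still handled."""
    return KNOWN_ERRORS.get(code, "UnknownError(code=%d)" % code)

# A code the client predates (e.g. a new InvalidRequiredAcks) degrades
# gracefully instead of crashing the decode path:
assert describe_error(3) == "UnknownTopicOrPartition"
```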

Thanks,
Gwen

On Thu, Jan 15, 2015 at 10:10 AM, Joe Stein  wrote:
> Looping in the mailing list that the client developers live on because they
> are all not on dev (though they should be if they want to be helping to
> build the best client libraries they can).
>
> I wholeheartedly believe that we need to not break existing functionality of
> the client protocol, ever.
>
> There are many reasons for this and we have other threads on the mailing
> list where we are discussing that topic (no pun intended) that I don't want
> to re-hash here.
>
> If we change wire protocol functionality OR the binary format (either) we
> must bump version AND treat version as a feature flag with backward
> compatibility support until it is deprecated for some time for folks to deal
> with it.
>
> version match {
>   case 0 => keepDoingWhatWeWereDoing()
>   case 1 => doNewStuff()
>   case 2 => doEvenMoreNewStuff()
> }
>
> has to be a practice we adopt imho ... I know feature flags can be construed
> as "messy code" but I am eager to hear another (better? different?) solution
> to this.
>
> If we don't do a feature flag like this specifically with this change, then
> what happens is that someone upgrades their brokers with a rolling restart
> in 0.8.3 and every single one of their producer requests starts to fail and
> they have a major production outage.
>
> I do 100% agree that > 1 makes no sense and we *REALLY* need people to start
> using 0, 1, -1, but we need to do that in a way that is going to work for
> everyone.
>
> Old producers and consumers must keep working with new brokers, and if we are
> not going to support that then I am unclear what the use of "version" is,
> based on our original intentions of having it because of the 0.7 => 0.8
> transition. We said no more breaking changes when we did that.
>
> - Joe Stein
>
> On Thu, Jan 15, 2015 at 12:38 PM, Ewen Cheslack-Postava 
> wrote:
>>
>> Right, so this looks like it could create an issue similar to what's
>> currently being discussed in
>> https://issues.apache.org/jira/browse/KAFKA-1649 where users now get
>> errors
>> under conditions when they previously wouldn't. Old clients won't even
>> know
>> about the error code, so besides failing they won't even be able to log
>> any
>> meaningful error messages.
>>
>> I think there are two options for compatibility:
>>
>> 1. An alternative change is to remove the ack > 1 code, but silently
>> "upgrade" requests with acks > 1 to acks = -1. This isn't the same as
>> other
>> changes to behavior since the interaction between the client and server
>> remains the same, no error codes change, etc. The client might just see
>> some increased latency since the message might need to be replicated to
>> more brokers than they requested.
>> 2. Split this into two patches, one that bumps the protocol version on
>> that
>> message to include the new error code but maintains both old (now
>> deprecated) and new behavior, then a second that would be applied in a
>> later release that removes the old protocol + code for handling acks > 1.
>>
>> 2 is probably the right thing to do. If we specify the release when we'll
>> remove the deprecated protocol at the time of deprecation it makes things
>> a
>> lot easier for people writing non-java clients and could give users better
>> predictability (e.g. if clients are at most 1 major release behind
>> brokers,
>> they'll remain compatible but possibly use deprecated features).
>>
>>
>> On Wed, Jan 14, 2015 at 3:51 PM, Gwen Shapira 
>> wrote:
>>
>> > Hi Kafka Devs,
>> >
>> > We are working on KAFKA-1697 - remove code related to ack>1 on the
>> > broker. Per Neha's suggestion, I'd like to give everyone a heads up on
>> > what these changes mean.
>> >
>> > Once this patch is included, any produce requests that include
>> > request.required.acks > 1 will result in an exception. This will be
>> > InvalidRequiredAcks in new versions (0.8.3 and up, I assume) and
>> > UnknownException in existing versions (sorry, but I can't add error
>> > codes retroactively).
>> >
>> > This behavior is already enforced by 0.8.2 producers (sync and new),
>> > but we expect impact on users with older producers that relied on acks
> > 1 and external clients (e.g. Python, Go, etc).
>> >
>> > Users who relied on acks > 1 are expected to switch to using acks = -1
> and a min.isr parameter that matches their use case.
>> >

RE: [kafka-clients] Re: Heads up: KAFKA-1697 - remove code related to ack>1 on the broker

2015-01-15 Thread Felix GV
Next time the protocol is evolved and new error codes can be introduced, would 
it make sense to add a new one called "Deprecated" (or "Deprecation" or 
"DeprecatedOperation" or whatever sounds best)? This would act as a more 
precise form of "Unknown" error. It could help identify what the problem is 
more easily when debugging clients.

Of course, this is the kind of lever one would prefer never pulling, but when 
you need it, you're better off having it than not, and if you end up having it 
and never using it, it does not do much harm either.


--

Felix GV
Data Infrastructure Engineer
Distributed Data Systems
LinkedIn

f...@linkedin.com
linkedin.com/in/felixgv

From: kafka-clie...@googlegroups.com [kafka-clie...@googlegroups.com] on behalf 
of Magnus Edenhill [mag...@edenhill.se]
Sent: Thursday, January 15, 2015 10:40 AM
To: dev@kafka.apache.org
Cc: kafka-clie...@googlegroups.com
Subject: [kafka-clients] Re: Heads up: KAFKA-1697 - remove code related to 
ack>1 on the broker


I very much agree on what Joe is saying, let's use the version field as intended
and be very strict with not removing nor altering existing behaviour without 
bumping the version.
Old API versions could be deprecated (documentation only?) immediately and 
removed completely in the next minor version bump (0.8->0.9).

An API to query supported API versions would be a good addition in the long run 
but doesn't help current clients
much as such a request to an older broker version will kill the connection 
without any error reporting to the client, thus making it rather useless in the 
short term.

Regards,
Magnus


--
You received this message because you are subscribed to the Google Groups 
"kafka-clients" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to 
kafka-clients+unsubscr...@googlegroups.com.
To post to this group, send email to 
kafka-clie...@googlegroups.com.
Visit this group at http://groups.google.com/group/kafka-clients.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/kafka-clients/CAHCQUcBtJ1nXi5_dEaHyR2QcRycQHh03rUCY%2BRo2Ussg9kM6UQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: [DISCUSS] KIPs

2015-01-15 Thread Joe Stein
Thanks Jay for kicking this off! I think the confluence page you wrote up
is a great start.


The KIP makes sense to me (at a minimum) if there is going to be any
"breaking change". This way Kafka can continue to grow and blossom and we
have a process in place if we are going to release a thorn ... and when we
do it is *CLEAR* about what and why that is. We can easily document which
KIPs were involved with this release (which I think should get committed
afterwards somewhere so there is no chance of edits after release). This approach I
had been thinking about also allows changes to occur as they do now as long
as they are backwards compatible.  Hopefully we never need a KIP but when
we do the PMC can vote on it and folks can read the release notes with
*CLEAR* understanding what is going to break their existing setup... at
least that is how I have been thinking about it.


Let me know what you think about this base minimum approach... I hadn't
really thought of the KIP for *ANY* "major change" and I have to think more
about that. I have some other comments for minor items in the confluence
page I will make once I think more about how I feel having a KIP for more
than what I was thinking about already.


I do think we should have "major changes" go through confluence, mailing
list discuss and JIRA but kind of feel we have been doing that already ...
if there are cases where that isn't the case we should highlight and learn
from them and formalize that more if need be.


/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop 
/

On Thu, Jan 15, 2015 at 1:42 PM, Jay Kreps  wrote:

> The idea of KIPs came up in a previous discussion but there was no real
> crisp definition of what they were. Here is an attempt at defining a
> process:
>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
>
> The trick here is to have something light-weight enough that it isn't a
> hassle for small changes, but enough so that changes get the eyeballs of
> the committers and heavy users.
>
> Thoughts?
>
> -Jay
>


Re: [kafka-clients] Re: Heads up: KAFKA-1697 - remove code related to ack>1 on the broker

2015-01-15 Thread Dana Powers
> clients don't break on unknown errors

Maybe true for the official Java clients, but I don't think the assumption
holds true for community-maintained clients and users of those clients.
 kafka-python generally follows the fail-fast philosophy and raises an
exception on any unrecognized error code in any server response.  in this
case, kafka-python allows users to set their own required-acks policy when
creating a producer instance.  It is possible that users of kafka-python
have deployed producer code that uses ack>1 -- perhaps in production
environments -- and for those users the new error code will crash their
producer code.  I would not be surprised if the same were true of other
community clients.

*one reason for the fail-fast approach is that there isn't great
documentation on what errors to expect for each request / response -- so we
use failures to alert that some error case is not handled properly.  and
because of that, introducing new error cases without bumping the api
version is likely to cause those errors to get raised/thrown all the way
back up to the user.  of course we (client maintainers) can fix the issues
in the client libraries and suggest users upgrade, but it's not the ideal
situation.


long-winded way of saying: I agree w/ Joe.

-Dana


On Thu, Jan 15, 2015 at 12:07 PM, Gwen Shapira 
wrote:

> Is the protocol bump caused by the behavior change or the new error code?
>
> 1) IMO, error_codes are data, and clients can expect to receive errors
> that they don't understand (i.e. unknown errors). AFAIK, clients don't
> break on unknown errors, they are simply more challenging to debug. If
> we document the new behavior, then its definitely debuggable and
> fixable.
>
> 2) The behavior change is basically a deprecation - i.e. acks > 1 were
> never documented, and are not supported by Kafka clients starting with
> version 0.8.2. I'm not sure this requires a protocol bump either,
> although its a better case than new error codes.
>
> Thanks,
> Gwen
>
> On Thu, Jan 15, 2015 at 10:10 AM, Joe Stein  wrote:
> > Looping in the mailing list that the client developers live on because
> they
> > are all not on dev (though they should be if they want to be helping to
> > build the best client libraries they can).
> >
> > I wholeheartedly believe that we need to not break existing functionality
> of
> > the client protocol, ever.
> >
> > There are many reasons for this and we have other threads on the mailing
> > list where we are discussing that topic (no pun intended) that I don't
> want
> > to re-hash here.
> >
> > If we change wire protocol functionality OR the binary format (either) we
> > must bump version AND treat version as a feature flag with backward
> > compatibility support until it is deprecated for some time for folks to
> deal
> > with it.
> >
> > match version = {
> > case 0: keepDoingWhatWeWereDoing()
> > case 1: doNewStuff()
> > case 2: doEvenMoreNewStuff()
> > }
> >
> > has to be a practice we adopt imho ... I know feature flags can be
> construed
> > as "messy code" but I am eager to hear another (better? different?)
> solution
> > to this.
> >
> > If we don't do a feature flag like this specifically with this change
> then
> > what happens is that someone upgrades their brokers with a rolling
> restart
> > in 0.8.3 and every single one of their producer requests start to fail
> and
> > they have a major production outage.
> >
> > I do 100% agree that > 1 makes no sense and we *REALLY* need people to
> start
> > using 0,1,-1 but we need to do that in a way that is going to work for
> > everyone.
> >
> > Old producers and consumers must keep working with new brokers and if we
> are
> > not going to support that then I am unclear what the use of "version" is
> > based on our original intentions of having it because of the 0.7=>0.8 transition.
> We
> > said no more breaking changes when we did that.
> >
> > - Joe Stein
> >
> > On Thu, Jan 15, 2015 at 12:38 PM, Ewen Cheslack-Postava <
> e...@confluent.io>
> > wrote:
> >>
> >> Right, so this looks like it could create an issue similar to what's
> >> currently being discussed in
> >> https://issues.apache.org/jira/browse/KAFKA-1649 where users now get
> >> errors
> >> under conditions when they previously wouldn't. Old clients won't even
> >> know
> >> about the error code, so besides failing they won't even be able to log
> >> any
> >> meaningful error messages.
> >>
> >> I think there are two options for compatibility:
> >>
> >> 1. An alternative change is to remove the ack > 1 code, but silently
> >> "upgrade" requests with acks > 1 to acks = -1. This isn't the same as
> >> other
> >> changes to behavior since the interaction between the client and server
> >> remains the same, no error codes change, etc. The client might just see
> >> some increased latency since the message might need to be replicated to
> >> more brokers than they requested.
> >> 2. Split this into two patches, one that bumps the protocol version on
> >> that
> >
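
Joe's version-dispatch pseudocode quoted above (`match version = { case 0: ... }`) is not valid Scala. A minimal runnable sketch of the per-version "feature flag" idea might look like the following; the handler behaviours are hypothetical placeholders, not real Kafka request handlers:

```scala
object VersionDispatch {
  // Dispatch on the request's API version so that old behaviour is
  // preserved for old clients while new versions get new semantics.
  // The strings returned here are illustrative stand-ins only.
  def handleProduce(version: Int): String = version match {
    case 0 => "v0: keep doing what we were doing"
    case 1 => "v1: do new stuff"
    case 2 => "v2: do even more new stuff"
    case v => s"v$v: unknown version, return an error to the client"
  }
}
```

An unknown version falls through to the catch-all case rather than silently misbehaving, which is the backward-compatibility property the thread is arguing for.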

[jira] [Commented] (KAFKA-1804) Kafka network thread lacks top exception handler

2015-01-15 Thread Alexey Ozeritskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279396#comment-14279396
 ] 

Alexey Ozeritskiy commented on KAFKA-1804:
--

We've written the simple patch for kafka-network-thread:
{code:java}
  override def run(): Unit = {
    try {
      original_run()
    } catch {
      case e: Throwable =>
        error("ERROR IN NETWORK THREAD: %s".format(e), e)
        Runtime.getRuntime.halt(1)
    }
  }
{code}
and got the following trace:
{code}
[2015-01-15 23:04:08,537] ERROR ERROR IN NETWORK THREAD: 
java.util.NoSuchElementException: None.get (kafka.network.Processor)
java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:313)
at scala.None$.get(Option.scala:311)
at kafka.network.ConnectionQuotas.dec(SocketServer.scala:544)
at kafka.network.AbstractServerThread.close(SocketServer.scala:165)
at kafka.network.AbstractServerThread.close(SocketServer.scala:157)
at kafka.network.Processor.close(SocketServer.scala:394)
at kafka.network.Processor.processNewResponses(SocketServer.scala:426)
at kafka.network.Processor.iteration(SocketServer.scala:328)
at kafka.network.Processor.run(SocketServer.scala:381)
at java.lang.Thread.run(Thread.java:745)
{code}

> Kafka network thread lacks top exception handler
> 
>
> Key: KAFKA-1804
> URL: https://issues.apache.org/jira/browse/KAFKA-1804
> Project: Kafka
>  Issue Type: Bug
>Reporter: Oleg Golovin
>
> We have faced the problem that some kafka network threads may fail, so that 
> jstack attached to Kafka process showed fewer threads than we had defined in 
> our Kafka configuration. This leads to API requests processed by this thread 
> getting stuck without a response.
> There were no error messages in the log regarding thread failure.
> We have examined Kafka code to find out there is no top try-catch block in 
> the network thread code, which could at least log possible errors.
> Could you add top-level try-catch block for the network thread, which should 
> recover network thread in case of exception?
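
A top-level guard of the kind requested could be sketched as below. This is an illustration only; `println` stands in for Kafka's logging, and the JVM halt seen in the patch in the comment above is omitted:

```scala
object GuardedThread {
  // Wrap a thread body with a top-level catch so an uncaught
  // Throwable is at least logged instead of silently killing the
  // thread and leaving in-flight requests unanswered.
  def guarded(name: String)(body: => Unit): Runnable = new Runnable {
    override def run(): Unit =
      try body
      catch {
        case t: Throwable =>
          println(s"Uncaught exception in thread '$name': $t")
      }
  }
}
```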



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1476) Get a list of consumer groups

2015-01-15 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279407#comment-14279407
 ] 

Onur Karaman commented on KAFKA-1476:
-

Updated reviewboard https://reviews.apache.org/r/29831/diff/
 against branch origin/trunk

> Get a list of consumer groups
> -
>
> Key: KAFKA-1476
> URL: https://issues.apache.org/jira/browse/KAFKA-1476
> Project: Kafka
>  Issue Type: Wish
>  Components: tools
>Affects Versions: 0.8.1.1
>Reporter: Ryan Williams
>Assignee: Balaji Seshadri
>  Labels: newbie
> Fix For: 0.9.0
>
> Attachments: ConsumerCommand.scala, KAFKA-1476-LIST-GROUPS.patch, 
> KAFKA-1476-RENAME.patch, KAFKA-1476-REVIEW-COMMENTS.patch, KAFKA-1476.patch, 
> KAFKA-1476.patch, KAFKA-1476.patch, KAFKA-1476.patch, 
> KAFKA-1476_2014-11-10_11:58:26.patch, KAFKA-1476_2014-11-10_12:04:01.patch, 
> KAFKA-1476_2014-11-10_12:06:35.patch, KAFKA-1476_2014-12-05_12:00:12.patch, 
> KAFKA-1476_2015-01-12_16:22:26.patch, KAFKA-1476_2015-01-12_16:31:20.patch, 
> KAFKA-1476_2015-01-13_10:36:18.patch, KAFKA-1476_2015-01-15_14:30:04.patch
>
>
> It would be useful to have a way to get a list of consumer groups currently 
> active via some tool/script that ships with kafka. This would be helpful so 
> that the system tools can be explored more easily.
> For example, when running the ConsumerOffsetChecker, it requires a group 
> option
> bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic test --group 
> ?
> But, when just getting started with kafka, using the console producer and 
> consumer, it is not clear what value to use for the group option.  If a list 
> of consumer groups could be listed, then it would be clear what value to use.
> Background:
> http://mail-archives.apache.org/mod_mbox/kafka-users/201405.mbox/%3cCAOq_b1w=slze5jrnakxvak0gu9ctdkpazak1g4dygvqzbsg...@mail.gmail.com%3e





[jira] [Updated] (KAFKA-1476) Get a list of consumer groups

2015-01-15 Thread Onur Karaman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Onur Karaman updated KAFKA-1476:

Attachment: KAFKA-1476_2015-01-15_14:30:04.patch

> Get a list of consumer groups
> -
>
> Key: KAFKA-1476
> URL: https://issues.apache.org/jira/browse/KAFKA-1476
> Project: Kafka
>  Issue Type: Wish
>  Components: tools
>Affects Versions: 0.8.1.1
>Reporter: Ryan Williams
>Assignee: Balaji Seshadri
>  Labels: newbie
> Fix For: 0.9.0
>
> Attachments: ConsumerCommand.scala, KAFKA-1476-LIST-GROUPS.patch, 
> KAFKA-1476-RENAME.patch, KAFKA-1476-REVIEW-COMMENTS.patch, KAFKA-1476.patch, 
> KAFKA-1476.patch, KAFKA-1476.patch, KAFKA-1476.patch, 
> KAFKA-1476_2014-11-10_11:58:26.patch, KAFKA-1476_2014-11-10_12:04:01.patch, 
> KAFKA-1476_2014-11-10_12:06:35.patch, KAFKA-1476_2014-12-05_12:00:12.patch, 
> KAFKA-1476_2015-01-12_16:22:26.patch, KAFKA-1476_2015-01-12_16:31:20.patch, 
> KAFKA-1476_2015-01-13_10:36:18.patch, KAFKA-1476_2015-01-15_14:30:04.patch
>
>
> It would be useful to have a way to get a list of consumer groups currently 
> active via some tool/script that ships with kafka. This would be helpful so 
> that the system tools can be explored more easily.
> For example, when running the ConsumerOffsetChecker, it requires a group 
> option
> bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic test --group 
> ?
> But, when just getting started with kafka, using the console producer and 
> consumer, it is not clear what value to use for the group option.  If a list 
> of consumer groups could be listed, then it would be clear what value to use.
> Background:
> http://mail-archives.apache.org/mod_mbox/kafka-users/201405.mbox/%3cCAOq_b1w=slze5jrnakxvak0gu9ctdkpazak1g4dygvqzbsg...@mail.gmail.com%3e





Re: Review Request 29831: Patch for KAFKA-1476

2015-01-15 Thread Onur Karaman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29831/
---

(Updated Jan. 15, 2015, 10:30 p.m.)


Review request for kafka.


Bugs: KAFKA-1476
https://issues.apache.org/jira/browse/KAFKA-1476


Repository: kafka


Description
---

Merged in work for KAFKA-1476 and sub-task KAFKA-1826


Diffs (updated)
-

  bin/kafka-consumer-groups.sh PRE-CREATION 
  core/src/main/scala/kafka/admin/AdminUtils.scala 
28b12c7b89a56c113b665fbde1b95f873f8624a3 
  core/src/main/scala/kafka/admin/ConsumerGroupCommand.scala PRE-CREATION 
  core/src/main/scala/kafka/utils/ZkUtils.scala 
c14bd455b6642f5e6eb254670bef9f57ae41d6cb 
  core/src/test/scala/unit/kafka/admin/DeleteConsumerGroupTest.scala 
PRE-CREATION 
  core/src/test/scala/unit/kafka/utils/TestUtils.scala 
ac15d34425795d5be20c51b01fa1108bdcd66583 

Diff: https://reviews.apache.org/r/29831/diff/


Testing
---


Thanks,

Onur Karaman



[jira] [Created] (KAFKA-1866) LogStartOffset gauge throws exceptions after log.delete()

2015-01-15 Thread Gian Merlino (JIRA)
Gian Merlino created KAFKA-1866:
---

 Summary: LogStartOffset gauge throws exceptions after log.delete()
 Key: KAFKA-1866
 URL: https://issues.apache.org/jira/browse/KAFKA-1866
 Project: Kafka
  Issue Type: Bug
Reporter: Gian Merlino


The LogStartOffset gauge does "logSegments.head.baseOffset", which throws 
NoSuchElementException on an empty list, which can occur after a delete() of 
the log. This makes life harder for custom MetricsReporters, since they have to 
deal with .value() possibly throwing an exception.

Locally we're dealing with this by having Log.delete() also call removeMetric 
on all the gauges. That also has the benefit of not having a bunch of metrics 
floating around for logs that the broker is not actually handling.
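
The `headOption`-style defensive fix implied by the report could be sketched as follows. The `-1L` sentinel is an assumption for illustration, not a value taken from Kafka:

```scala
object SafeGauge {
  // Defensive variant of the LogStartOffset gauge: instead of
  // logSegments.head.baseOffset (which throws NoSuchElementException
  // on an empty segment list, e.g. after Log.delete()), use headOption
  // with a sentinel so MetricsReporters never see an exception.
  def logStartOffset(segmentBaseOffsets: Seq[Long]): Long =
    segmentBaseOffsets.headOption.getOrElse(-1L)
}
```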





[jira] [Commented] (KAFKA-1866) LogStartOffset gauge throws exceptions after log.delete()

2015-01-15 Thread Gian Merlino (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279412#comment-14279412
 ] 

Gian Merlino commented on KAFKA-1866:
-

At least, I think it has that benefit. I haven't actually tested it yet :)

> LogStartOffset gauge throws exceptions after log.delete()
> -
>
> Key: KAFKA-1866
> URL: https://issues.apache.org/jira/browse/KAFKA-1866
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gian Merlino
>
> The LogStartOffset gauge does "logSegments.head.baseOffset", which throws 
> NoSuchElementException on an empty list, which can occur after a delete() of 
> the log. This makes life harder for custom MetricsReporters, since they have 
> to deal with .value() possibly throwing an exception.
> Locally we're dealing with this by having Log.delete() also call removeMetric 
> on all the gauges. That also has the benefit of not having a bunch of metrics 
> floating around for logs that the broker is not actually handling.





Re: [kafka-clients] Re: Heads up: KAFKA-1697 - remove code related to ack>1 on the broker

2015-01-15 Thread Ewen Cheslack-Postava
Gwen,

I think the only option that wouldn't require a protocol version change is
the one where acks > 1 is converted to acks = -1 since it's the only one
that doesn't potentially break older clients. The protocol guide says that
the expected upgrade path is servers first, then clients, so old clients,
including non-java clients, that may be using acks > 1 should be able to
work with a new broker version.

It's more work, but I think dealing with the protocol change is the right
thing to do since it eventually gets us to the behavior I think is better
-- the broker should reject requests with invalid values. I think Joe and I
were basically in agreement. In my mind the major piece missing from his
description is how long we're going to maintain his "case 0" behavior. It's
impractical to maintain old versions forever, but it sounds like there
hasn't been a decision on how long to maintain them. Maybe that's another
item to add to KIPs -- protocol versions and behavior need to be listed as
deprecated and the earliest version in which they'll be removed should be
specified so users can understand which versions are guaranteed to be
compatible, even if they're using (well-written) non-java clients.

-Ewen


On Thu, Jan 15, 2015 at 12:52 PM, Dana Powers  wrote:

> > clients don't break on unknown errors
>
> Maybe true for the official Java clients, but I don't think the assumption
> holds true for community-maintained clients and users of those clients.
>  kafka-python generally follows the fail-fast philosophy and raises an
> exception on any unrecognized error code in any server response.  in this
> case, kafka-python allows users to set their own required-acks policy when
> creating a producer instance.  It is possible that users of kafka-python
> have deployed producer code that uses ack>1 -- perhaps in production
> environments -- and for those users the new error code will crash their
> producer code.  I would not be surprised if the same were true of other
> community clients.
>
> *one reason for the fail-fast approach is that there isn't great
> documentation on what errors to expect for each request / response -- so we
> use failures to alert that some error case is not handled properly.  and
> because of that, introducing new error cases without bumping the api
> version is likely to cause those errors to get raised/thrown all the way
> back up to the user.  of course we (client maintainers) can fix the issues
> in the client libraries and suggest users upgrade, but it's not the ideal
> situation.
>
>
> long-winded way of saying: I agree w/ Joe.
>
> -Dana
>
>
> On Thu, Jan 15, 2015 at 12:07 PM, Gwen Shapira 
> wrote:
>
> > Is the protocol bump caused by the behavior change or the new error code?
> >
> > 1) IMO, error_codes are data, and clients can expect to receive errors
> > that they don't understand (i.e. unknown errors). AFAIK, clients don't
> > break on unknown errors, they are simply more challenging to debug. If
> > we document the new behavior, then its definitely debuggable and
> > fixable.
> >
> > 2) The behavior change is basically a deprecation - i.e. acks > 1 were
> > never documented, and are not supported by Kafka clients starting with
> > version 0.8.2. I'm not sure this requires a protocol bump either,
> > although its a better case than new error codes.
> >
> > Thanks,
> > Gwen
> >
> > On Thu, Jan 15, 2015 at 10:10 AM, Joe Stein 
> wrote:
> > > Looping in the mailing list that the client developers live on because
> > they
> > > are all not on dev (though they should be if they want to be helping to
> > > build the best client libraries they can).
> > >
> > > I wholeheartedly believe that we need to not break existing
> functionality
> > of
> > > the client protocol, ever.
> > >
> > > There are many reasons for this and we have other threads on the
> mailing
> > > list where we are discussing that topic (no pun intended) that I don't
> > want
> > > to re-hash here.
> > >
> > > If we change wire protocol functionality OR the binary format (either)
> we
> > > must bump version AND treat version as a feature flag with backward
> > > compatibility support until it is deprecated for some time for folks to
> > deal
> > > with it.
> > >
> > > match version = {
> > > case 0: keepDoingWhatWeWereDoing()
> > > case 1: doNewStuff()
> > > case 2: doEvenMoreNewStuff()
> > > }
> > >
> > > has to be a practice we adopt imho ... I know feature flags can be
> > construed
> > > as "messy code" but I am eager to hear another (better? different?)
> > solution
> > > to this.
> > >
> > > If we don't do a feature flag like this specifically with this change
> > then
> > > what happens is that someone upgrades their brokers with a rolling
> > restart
> > > in 0.8.3 and every single one of their producer requests start to fail
> > and
> > > they have a major production outage.
> > >
> > > I do 100% agree that > 1 makes no sense and we *REALLY* need people to
> > start
> > > using 0,1,-1 but we need t
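
Ewen's "option 1" from earlier in this message, silently upgrading acks > 1 to acks = -1 so old clients keep working against new brokers, could be sketched as a one-line normalization. This is an illustration of the proposal, not committed Kafka code:

```scala
object AcksCompat {
  // Normalize the client's required-acks value: anything above 1 is
  // treated as -1 (wait for all in-sync replicas), which can only
  // strengthen the durability the client asked for. The values
  // -1, 0 and 1 pass through unchanged.
  def normalize(requiredAcks: Int): Int =
    if (requiredAcks > 1) -1 else requiredAcks
}
```

The trade-off discussed in the thread is that clients may see higher latency than they requested, but no request fails and no protocol version bump is needed.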

[jira] [Commented] (KAFKA-1866) LogStartOffset gauge throws exceptions after log.delete()

2015-01-15 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279473#comment-14279473
 ] 

Sriharsha Chintalapani commented on KAFKA-1866:
---

[~gian] any steps to reproduce this. Thanks.

> LogStartOffset gauge throws exceptions after log.delete()
> -
>
> Key: KAFKA-1866
> URL: https://issues.apache.org/jira/browse/KAFKA-1866
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gian Merlino
>
> The LogStartOffset gauge does "logSegments.head.baseOffset", which throws 
> NoSuchElementException on an empty list, which can occur after a delete() of 
> the log. This makes life harder for custom MetricsReporters, since they have 
> to deal with .value() possibly throwing an exception.
> Locally we're dealing with this by having Log.delete() also call removeMetric 
> on all the gauges. That also has the benefit of not having a bunch of metrics 
> floating around for logs that the broker is not actually handling.





[jira] [Assigned] (KAFKA-1866) LogStartOffset gauge throws exceptions after log.delete()

2015-01-15 Thread Sriharsha Chintalapani (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriharsha Chintalapani reassigned KAFKA-1866:
-

Assignee: Sriharsha Chintalapani

> LogStartOffset gauge throws exceptions after log.delete()
> -
>
> Key: KAFKA-1866
> URL: https://issues.apache.org/jira/browse/KAFKA-1866
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gian Merlino
>Assignee: Sriharsha Chintalapani
>
> The LogStartOffset gauge does "logSegments.head.baseOffset", which throws 
> NoSuchElementException on an empty list, which can occur after a delete() of 
> the log. This makes life harder for custom MetricsReporters, since they have 
> to deal with .value() possibly throwing an exception.
> Locally we're dealing with this by having Log.delete() also call removeMetric 
> on all the gauges. That also has the benefit of not having a bunch of metrics 
> floating around for logs that the broker is not actually handling.





[jira] [Updated] (KAFKA-1476) Get a list of consumer groups

2015-01-15 Thread Onur Karaman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Onur Karaman updated KAFKA-1476:

Attachment: sample-kafka-consumer-groups-sh-output.txt

I've attached a sample run of kafka-consumer-groups.sh covering all the 
commands and options. I've excluded the following noise from SLF4J coming from 
every command:
{code}
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/vagrant/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/vagrant/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
{code}

> Get a list of consumer groups
> -
>
> Key: KAFKA-1476
> URL: https://issues.apache.org/jira/browse/KAFKA-1476
> Project: Kafka
>  Issue Type: Wish
>  Components: tools
>Affects Versions: 0.8.1.1
>Reporter: Ryan Williams
>Assignee: Balaji Seshadri
>  Labels: newbie
> Fix For: 0.9.0
>
> Attachments: ConsumerCommand.scala, KAFKA-1476-LIST-GROUPS.patch, 
> KAFKA-1476-RENAME.patch, KAFKA-1476-REVIEW-COMMENTS.patch, KAFKA-1476.patch, 
> KAFKA-1476.patch, KAFKA-1476.patch, KAFKA-1476.patch, 
> KAFKA-1476_2014-11-10_11:58:26.patch, KAFKA-1476_2014-11-10_12:04:01.patch, 
> KAFKA-1476_2014-11-10_12:06:35.patch, KAFKA-1476_2014-12-05_12:00:12.patch, 
> KAFKA-1476_2015-01-12_16:22:26.patch, KAFKA-1476_2015-01-12_16:31:20.patch, 
> KAFKA-1476_2015-01-13_10:36:18.patch, KAFKA-1476_2015-01-15_14:30:04.patch, 
> sample-kafka-consumer-groups-sh-output.txt
>
>
> It would be useful to have a way to get a list of consumer groups currently 
> active via some tool/script that ships with kafka. This would be helpful so 
> that the system tools can be explored more easily.
> For example, when running the ConsumerOffsetChecker, it requires a group 
> option
> bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic test --group 
> ?
> But, when just getting started with kafka, using the console producer and 
> consumer, it is not clear what value to use for the group option.  If a list 
> of consumer groups could be listed, then it would be clear what value to use.
> Background:
> http://mail-archives.apache.org/mod_mbox/kafka-users/201405.mbox/%3cCAOq_b1w=slze5jrnakxvak0gu9ctdkpazak1g4dygvqzbsg...@mail.gmail.com%3e





Re: Review Request 29831: Patch for KAFKA-1476

2015-01-15 Thread Onur Karaman


> On Jan. 13, 2015, 5:36 a.m., Neha Narkhede wrote:
> > To make review easier, could you add the output of the command for all 
> > options for 2 consumer groups that consume 2 or more topics to the JIRA? It 
> > will make it easier to review. One thing to watch out for is the ease of 
> > scripting the output from this tool. I'd also suggest asking Clark/Tood or 
> > one of the SREs to review the output from the tool.

I just attached the output from a run through all the commands and options on 
the JIRA. I excluded the noise coming from SLF4J that I was getting on every 
command.


- Onur


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29831/#review67789
---


On Jan. 15, 2015, 10:30 p.m., Onur Karaman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29831/
> ---
> 
> (Updated Jan. 15, 2015, 10:30 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1476
> https://issues.apache.org/jira/browse/KAFKA-1476
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Merged in work for KAFKA-1476 and sub-task KAFKA-1826
> 
> 
> Diffs
> -
> 
>   bin/kafka-consumer-groups.sh PRE-CREATION 
>   core/src/main/scala/kafka/admin/AdminUtils.scala 
> 28b12c7b89a56c113b665fbde1b95f873f8624a3 
>   core/src/main/scala/kafka/admin/ConsumerGroupCommand.scala PRE-CREATION 
>   core/src/main/scala/kafka/utils/ZkUtils.scala 
> c14bd455b6642f5e6eb254670bef9f57ae41d6cb 
>   core/src/test/scala/unit/kafka/admin/DeleteConsumerGroupTest.scala 
> PRE-CREATION 
>   core/src/test/scala/unit/kafka/utils/TestUtils.scala 
> ac15d34425795d5be20c51b01fa1108bdcd66583 
> 
> Diff: https://reviews.apache.org/r/29831/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Onur Karaman
> 
>



[jira] [Commented] (KAFKA-1333) Add consumer co-ordinator module to the server

2015-01-15 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279518#comment-14279518
 ] 

Guozhang Wang commented on KAFKA-1333:
--

[~abiletskyi] Sorry for getting to this late. We are currently making the 
following progress on KAFKA-1326:

1. I am working on KAFKA-1333, which will define the coordinator module along 
with its functionality declaration.
2. [~onurkaraman] will work on KAFKA-1334 once it gets unblocked by 
KAFKA-1333.
3. I am also reviewing KAFKA-1670 for the new client API.

In the meantime, you could also take a look at KAFKA-1670 and see if the API 
definitions fit your use case well.

> Add consumer co-ordinator module to the server
> --
>
> Key: KAFKA-1333
> URL: https://issues.apache.org/jira/browse/KAFKA-1333
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Affects Versions: 0.9.0
>Reporter: Neha Narkhede
>Assignee: Guozhang Wang
>
> Scope of this JIRA is just to add a consumer co-ordinator module that does the 
> following:
> 1) coordinator start-up, metadata initialization
> 2) simple join group handling (just updating metadata, no failure detection / 
> rebalancing): this should be sufficient for consumers doing self offset / 
> partition management.
> Offset manager will still run side-by-side with the coordinator in this JIRA, 
> and we will merge it in KAFKA-1740.





[jira] [Comment Edited] (KAFKA-1333) Add consumer co-ordinator module to the server

2015-01-15 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279518#comment-14279518
 ] 

Guozhang Wang edited comment on KAFKA-1333 at 1/15/15 11:34 PM:


[~abiletskyi] Sorry for getting to this late. We are currently making the 
following progress on KAFKA-1326:

1. I am working on KAFKA-1333, which will define the coordinator module along 
with its functionality declaration.
2. [~onurkaraman] will work on KAFKA-1334 once it gets unblocked by 
KAFKA-1333.
3. I am also reviewing KAFKA-1760 for the new client API.

In the meantime, you could also take a look at KAFKA-1760 and see if the API 
definitions fit your use case well.


was (Author: guozhang):
[~abiletskyi] Sorry for getting to this late. We are currently making the 
following progress on KAFKA-1326:

1. I am working on KAFKA-1333, which will define the coordinator module along 
with its functionality declaration.
2. [~onurkaraman] will work on KAFKA-1334 once it gets unblocked by 
KAFKA-1333.
3. I am also reviewing KAFKA-1670 for the new client API.

In the meantime, you could also take a look at KAFKA-1670 and see if the API 
definitions fit your use case well.

> Add consumer co-ordinator module to the server
> --
>
> Key: KAFKA-1333
> URL: https://issues.apache.org/jira/browse/KAFKA-1333
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Affects Versions: 0.9.0
>Reporter: Neha Narkhede
>Assignee: Guozhang Wang
>
> Scope of this JIRA is just to add a consumer co-ordinator module that does the 
> following:
> 1) coordinator start-up, metadata initialization
> 2) simple join group handling (just updating metadata, no failure detection / 
> rebalancing): this should be sufficient for consumers doing self offset / 
> partition management.
> Offset manager will still run side-by-side with the coordinator in this JIRA, 
> and we will merge it in KAFKA-1740.





[jira] [Commented] (KAFKA-1866) LogStartOffset gauge throws exceptions after log.delete()

2015-01-15 Thread Gian Merlino (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279529#comment-14279529
 ] 

Gian Merlino commented on KAFKA-1866:
-

Anything that calls log.delete() should do it. It looks like the 
replica-stopping code does this.

In our case we started seeing these exceptions after reassigning some 
partitions between brokers; perhaps a non-leader partition got moved and was 
delete()d.

> LogStartOffset gauge throws exceptions after log.delete()
> -
>
> Key: KAFKA-1866
> URL: https://issues.apache.org/jira/browse/KAFKA-1866
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gian Merlino
>Assignee: Sriharsha Chintalapani
>
> The LogStartOffset gauge does "logSegments.head.baseOffset", which throws 
> NoSuchElementException on an empty list, which can occur after a delete() of 
> the log. This makes life harder for custom MetricsReporters, since they have 
> to deal with .value() possibly throwing an exception.
> Locally we're dealing with this by having Log.delete() also call removeMetric 
> on all the gauges. That also has the benefit of not having a bunch of metrics 
> floating around for logs that the broker is not actually handling.
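A defensive version of the gauge Gian describes might look like the following. This is a hypothetical Java sketch (the real code is Scala and reads `logSegments.head.baseOffset`); the fallback value of -1 is an illustrative choice, not what any patch actually uses:

```java
import java.util.Collections;
import java.util.List;

public class LogStartOffsetGauge {
    // Returns the base offset of the first segment, or -1L when the log has
    // already been delete()d and the segment list is empty, instead of letting
    // head-access throw NoSuchElementException at metrics reporters.
    static long value(List<Long> segmentBaseOffsets) {
        return segmentBaseOffsets.stream().findFirst().orElse(-1L);
    }

    public static void main(String[] args) {
        System.out.println(value(List.of(42L, 100L)));      // first segment's base offset: 42
        System.out.println(value(Collections.emptyList())); // safe fallback: -1
    }
}
```

Removing the metric on delete(), as described locally, avoids even the fallback; the sketch only shows how the exception path could be closed.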





Re: Latency Tracking Across All Kafka Component

2015-01-15 Thread Guozhang Wang
Hi,

At LinkedIn we used an audit module to track the latency / message counts
at each "tier" of the pipeline (for your example it will have the producer
/ local / central / HDFS tiers). Some details can be found on our recent
talk slides (slide 41/42):

http://www.slideshare.net/GuozhangWang/apache-kafka-at-linkedin-43307044

This audit is specific to the usage of Avro as our serialization tool
though, and we are considering ways to get it generalized hence
open-sourced.
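A minimal version of this tier-audit idea, assuming each message carries the timestamp embedded at produce time, might be sketched as follows. All names here are illustrative and not part of LinkedIn's actual audit module:

```java
import java.util.HashMap;
import java.util.Map;

public class TierAudit {
    // Per-tier message count and cumulative latency, keyed by tier name
    // (e.g. "producer", "local", "central", "hdfs").
    private final Map<String, long[]> stats = new HashMap<>();

    // Record one message observed at a tier; its latency is the difference
    // between the tier's wall clock and the produce-time timestamp.
    public void record(String tier, long produceTsMs, long observedTsMs) {
        long[] s = stats.computeIfAbsent(tier, t -> new long[2]);
        s[0] += 1;                          // message count
        s[1] += observedTsMs - produceTsMs; // cumulative latency
    }

    public long count(String tier) { return stats.get(tier)[0]; }

    public double avgLatencyMs(String tier) {
        long[] s = stats.get(tier);
        return (double) s[1] / s[0];
    }

    public static void main(String[] args) {
        TierAudit audit = new TierAudit();
        audit.record("local", 1000L, 1050L);
        audit.record("local", 2000L, 2150L);
        System.out.println(audit.count("local"));        // 2
        System.out.println(audit.avgLatencyMs("local")); // 100.0
    }
}
```

Comparing counts and average latency across tiers is then enough to spot where messages are delayed or lost.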

Guozhang


On Mon, Jan 5, 2015 at 3:33 PM, Otis Gospodnetic  wrote:

> Hi,
>
> That sounds a bit like needing a full, cross-app, cross-network
> transaction/call tracing, and not something specific or limited to Kafka,
> doesn't it?
>
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Mon, Jan 5, 2015 at 2:43 PM, Bhavesh Mistry  >
> wrote:
>
> > Hi Kafka Team/Users,
> >
> > We are using the LinkedIn Kafka data pipeline end-to-end.
> >
> > Producer(s) ->Local DC Brokers -> MM -> Central brokers -> Camus Job ->
> > HDFS
> >
> > This is working out very well for us, but we need to have visibility of
> > latency at each layer (Local DC Brokers -> MM -> Central brokers -> Camus
> > Job -> HDFS).  Our events are time-based (they carry the time the event was
> > produced).  Is there any feature or audit trail for this mentioned at
> > https://github.com/linkedin/camus/ ?  I would like to know the in-between
> > latency and the time an event spends at each hop; right now we do not know
> > where the problem is or what to optimize.
> >
> > Is any of this covered in 0.9.0 or any other upcoming Kafka release?  How
> > might we achieve this latency tracking across all components?
> >
> >
> > Thanks,
> >
> > Bhavesh
> >
>



-- 
-- Guozhang


[jira] [Created] (KAFKA-1867) liveBroker list not updated on a cluster with no topics

2015-01-15 Thread Jun Rao (JIRA)
Jun Rao created KAFKA-1867:
--

 Summary: liveBroker list not updated on a cluster with no topics
 Key: KAFKA-1867
 URL: https://issues.apache.org/jira/browse/KAFKA-1867
 Project: Kafka
  Issue Type: Bug
Reporter: Jun Rao
 Fix For: 0.8.3


Currently, when there is no topic in a cluster, the controller doesn't send any 
UpdateMetadataRequest to the broker when it starts up. As a result, the 
liveBroker list in metadataCache is empty. This means that we will return an 
incorrect broker list in TopicMetadataResponse.





[jira] [Updated] (KAFKA-1810) Add IP Filtering / Whitelists-Blacklists

2015-01-15 Thread Jeff Holoman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Holoman updated KAFKA-1810:

Attachment: KAFKA-1810_2015-01-15_19:47:14.patch

> Add IP Filtering / Whitelists-Blacklists 
> -
>
> Key: KAFKA-1810
> URL: https://issues.apache.org/jira/browse/KAFKA-1810
> Project: Kafka
>  Issue Type: New Feature
>  Components: core, network
>Reporter: Jeff Holoman
>Assignee: Jeff Holoman
>Priority: Minor
> Fix For: 0.8.3
>
> Attachments: KAFKA-1810.patch, KAFKA-1810_2015-01-15_19:47:14.patch
>
>
> While longer-term goals of security in Kafka are on the roadmap there exists 
> some value for the ability to restrict connection to Kafka brokers based on 
> IP address. This is not intended as a replacement for security but more of a 
> precaution against misconfiguration and to provide some level of control to 
> Kafka administrators about who is reading/writing to their cluster.
> 1) In some organizations, software administration, OS/systems 
> administration, and network administration are disjointed and not well 
> choreographed. Providing software administrators the ability to configure 
> their platform relatively independently (after initial configuration) from 
> Systems administrators is desirable.
> 2) Configuration and deployment is sometimes error prone and there are 
> situations when test environments could erroneously read/write to production 
> environments
> 3) An additional precaution against reading sensitive data is typically 
> welcomed in most large enterprise deployments.





Re: Review Request 29714: Patch for KAFKA-1810

2015-01-15 Thread Jeff Holoman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29714/
---

(Updated Jan. 16, 2015, 12:47 a.m.)


Review request for kafka.


Bugs: KAFKA-1810
https://issues.apache.org/jira/browse/KAFKA-1810


Repository: kafka


Description (updated)
---

KAFKA-1810 Refactor


KAFKA-1810 Refactor 2


Diffs (updated)
-

  core/src/main/scala/kafka/network/IPFilter.scala PRE-CREATION 
  core/src/main/scala/kafka/network/SocketServer.scala 
39b1651b680b2995cedfde95d74c086d9c6219ef 
  core/src/main/scala/kafka/server/KafkaConfig.scala 
6e26c5436feb4629d17f199011f3ebb674aa767f 
  core/src/main/scala/kafka/server/KafkaServer.scala 
1691ad7fc80ca0b112f68e3ea0cbab265c75b26b 
  core/src/main/scala/kafka/utils/VerifiableProperties.scala 
2ffc7f452dc7a1b6a06ca7a509ed49e1ab3d1e68 
  core/src/test/scala/unit/kafka/network/IPFilterTest.scala PRE-CREATION 
  core/src/test/scala/unit/kafka/network/SocketServerTest.scala 
78b431f9c88cca1bc5e430ffd41083d0e92b7e75 
  core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
2377abe4933e065d037828a214c3a87e1773a8ef 

Diff: https://reviews.apache.org/r/29714/diff/


Testing
---

This code centers around a new class, CIDRRange, in IPFilter.scala. The 
IPFilter class is created and holds two fields: the ruleType (allow|deny|none) 
and a list of CIDRRange objects. It is used in the SocketServer acceptor 
thread. When the rule type is allow or deny, the check does an exists over the 
list. On object creation, we pre-calculate the lower and upper range values 
and store them as BigInts. The overhead of the check should be fairly minimal, 
as it involves converting the incoming IP address to a BigInt and then just 
comparing it to the low/high values. In writing this review up I realized that 
I can optimize this further by doing the BigInt conversion once, outside the 
range check, which I can address.
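The pre-computed range check described above can be sketched as follows. This is a hypothetical Java analogue of the Scala CIDRRange; the class and method names are illustrative, not the patch's actual API:

```java
import java.math.BigInteger;
import java.net.InetAddress;
import java.net.UnknownHostException;

public class CidrRange {
    private final BigInteger low;
    private final BigInteger high;

    // Pre-compute the low/high bounds of the CIDR block once at construction
    // time, so each connection check is just two BigInteger comparisons.
    public CidrRange(String baseIp, int prefixLen) throws UnknownHostException {
        byte[] addr = InetAddress.getByName(baseIp).getAddress();
        BigInteger base = new BigInteger(1, addr); // works for both IPv4 and IPv6
        int hostBits = addr.length * 8 - prefixLen;
        BigInteger hostMask = BigInteger.ONE.shiftLeft(hostBits).subtract(BigInteger.ONE);
        this.low = base.andNot(hostMask); // network address
        this.high = base.or(hostMask);    // broadcast address
    }

    // Convert the incoming address once, then compare against the bounds.
    public boolean contains(InetAddress ip) {
        BigInteger value = new BigInteger(1, ip.getAddress());
        return low.compareTo(value) <= 0 && value.compareTo(high) <= 0;
    }

    public static void main(String[] args) throws UnknownHostException {
        CidrRange range = new CidrRange("192.168.0.0", 24);
        System.out.println(range.contains(InetAddress.getByName("192.168.0.42"))); // true
        System.out.println(range.contains(InetAddress.getByName("10.0.0.1")));     // false
    }
}
```

An allow rule would accept a connection when any configured range contains the address; a deny rule would reject under the same condition.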

Testing covers the CIDRRange and IPFilter classes and validation of IPV6, IPV4, 
and configurations. Additionally the functionality is tested in 
SocketServerTest. Other changes are just to assist in configuration.

I modified the SocketServerTest to use a method for grabbing the Socket server 
to make the code a bit more concise.

One key point is that, if there is an error in configuration, we halt the 
startup of the broker. The thinking there is that if you screw up 
security-related configs, you want to know about it right away rather than 
silently accept connections. (thanks Joe Stein for the input).

There are two new exceptions related to this functionality: one to handle 
configuration errors, and one to handle blocking the request. Currently the 
log level is set to INFO. Does it make sense to move this to WARN?


Thanks,

Jeff Holoman



[jira] [Commented] (KAFKA-1810) Add IP Filtering / Whitelists-Blacklists

2015-01-15 Thread Jeff Holoman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279609#comment-14279609
 ] 

Jeff Holoman commented on KAFKA-1810:
-

Updated reviewboard https://reviews.apache.org/r/29714/diff/
 against branch origin/trunk

> Add IP Filtering / Whitelists-Blacklists 
> -
>
> Key: KAFKA-1810
> URL: https://issues.apache.org/jira/browse/KAFKA-1810
> Project: Kafka
>  Issue Type: New Feature
>  Components: core, network
>Reporter: Jeff Holoman
>Assignee: Jeff Holoman
>Priority: Minor
> Fix For: 0.8.3
>
> Attachments: KAFKA-1810.patch, KAFKA-1810_2015-01-15_19:47:14.patch
>
>
> While longer-term goals of security in Kafka are on the roadmap there exists 
> some value for the ability to restrict connection to Kafka brokers based on 
> IP address. This is not intended as a replacement for security but more of a 
> precaution against misconfiguration and to provide some level of control to 
> Kafka administrators about who is reading/writing to their cluster.
> 1) In some organizations, software administration, OS/systems 
> administration, and network administration are disjointed and not well 
> choreographed. Providing software administrators the ability to configure 
> their platform relatively independently (after initial configuration) from 
> Systems administrators is desirable.
> 2) Configuration and deployment is sometimes error prone and there are 
> situations when test environments could erroneously read/write to production 
> environments
> 3) An additional precaution against reading sensitive data is typically 
> welcomed in most large enterprise deployments.





Re: Review Request 29714: Patch for KAFKA-1810

2015-01-15 Thread Jeff Holoman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29714/
---

(Updated Jan. 16, 2015, 12:48 a.m.)


Review request for kafka.


Bugs: KAFKA-1810
https://issues.apache.org/jira/browse/KAFKA-1810


Repository: kafka


Description (updated)
---

Put back in the 1512 changes and moved the initial BigInt out of the check 
CIDRRange contains method


Diffs
-

  core/src/main/scala/kafka/network/IPFilter.scala PRE-CREATION 
  core/src/main/scala/kafka/network/SocketServer.scala 
39b1651b680b2995cedfde95d74c086d9c6219ef 
  core/src/main/scala/kafka/server/KafkaConfig.scala 
6e26c5436feb4629d17f199011f3ebb674aa767f 
  core/src/main/scala/kafka/server/KafkaServer.scala 
1691ad7fc80ca0b112f68e3ea0cbab265c75b26b 
  core/src/main/scala/kafka/utils/VerifiableProperties.scala 
2ffc7f452dc7a1b6a06ca7a509ed49e1ab3d1e68 
  core/src/test/scala/unit/kafka/network/IPFilterTest.scala PRE-CREATION 
  core/src/test/scala/unit/kafka/network/SocketServerTest.scala 
78b431f9c88cca1bc5e430ffd41083d0e92b7e75 
  core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
2377abe4933e065d037828a214c3a87e1773a8ef 

Diff: https://reviews.apache.org/r/29714/diff/


Testing
---

This code centers around a new class, CIDRRange, in IPFilter.scala. The 
IPFilter class is created and holds two fields: the ruleType (allow|deny|none) 
and a list of CIDRRange objects. It is used in the SocketServer acceptor 
thread. When the rule type is allow or deny, the check does an exists over the 
list. On object creation, we pre-calculate the lower and upper range values 
and store them as BigInts. The overhead of the check should be fairly minimal, 
as it involves converting the incoming IP address to a BigInt and then just 
comparing it to the low/high values. In writing this review up I realized that 
I can optimize this further by doing the BigInt conversion once, outside the 
range check, which I can address.

Testing covers the CIDRRange and IPFilter classes and validation of IPV6, IPV4, 
and configurations. Additionally the functionality is tested in 
SocketServerTest. Other changes are just to assist in configuration.

I modified the SocketServerTest to use a method for grabbing the Socket server 
to make the code a bit more concise.

One key point is that, if there is an error in configuration, we halt the 
startup of the broker. The thinking there is that if you screw up 
security-related configs, you want to know about it right away rather than 
silently accept connections. (thanks Joe Stein for the input).

There are two new exceptions related to this functionality: one to handle 
configuration errors, and one to handle blocking the request. Currently the 
log level is set to INFO. Does it make sense to move this to WARN?


Thanks,

Jeff Holoman



[jira] [Assigned] (KAFKA-1864) Revisit defaults for the internal offsets topic

2015-01-15 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao reassigned KAFKA-1864:
--

Assignee: Jun Rao

> Revisit defaults for the internal offsets topic
> ---
>
> Key: KAFKA-1864
> URL: https://issues.apache.org/jira/browse/KAFKA-1864
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Neha Narkhede
>Assignee: Jun Rao
>Priority: Blocker
>
> Realize this is late, but as I was reviewing the 0.8.2 RC, I found that our 
> defaults for the offsets topic are not ideal. The # of partitions currently 
> default to 1 and the replication factor is 1 as well. Granted that the 
> replication factor is changeable in the future (through the admin tool), 
> changing the # of partitions is a very disruptive change. The concern is that 
> this feature is on by default on the server and will be activated the moment 
> the first client turns on kafka based offset storage. 
> My proposal is to change the # of partitions to something large (50 or so) 
> and change the replication factor to min(# of alive brokers, configured 
> replication factor)
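The proposed fallback for the replication factor amounts to a min() over the live-broker count. An illustrative Java one-liner (the real change lands in the Scala offset-topic creation path, and the names here are hypothetical):

```java
public class OffsetsTopicDefaults {
    // Cap the configured replication factor at the number of live brokers,
    // so offsets-topic creation does not fail on a small or freshly started
    // cluster while still honoring the configured value on larger clusters.
    static int effectiveReplicationFactor(int aliveBrokers, int configured) {
        return Math.min(aliveBrokers, configured);
    }

    public static void main(String[] args) {
        System.out.println(effectiveReplicationFactor(1, 3)); // single-broker cluster -> 1
        System.out.println(effectiveReplicationFactor(5, 3)); // enough brokers -> configured 3
    }
}
```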





Review Request 29952: Patch for kafka-1864

2015-01-15 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29952/
---

Review request for kafka.


Bugs: kafka-1864
https://issues.apache.org/jira/browse/kafka-1864


Repository: kafka


Description
---

create offset topic with a larger replication factor by default


Diffs
-

  core/src/main/scala/kafka/server/KafkaApis.scala 
d626b1710813648524eefa5a3df098393c3e7743 
  core/src/main/scala/kafka/server/KafkaConfig.scala 
6e26c5436feb4629d17f199011f3ebb674aa767f 
  core/src/main/scala/kafka/server/OffsetManager.scala 
43eb2a35bb54d32c66cdb94772df657b3a104d1a 
  core/src/test/scala/integration/kafka/api/ProducerFailureHandlingTest.scala 
07a7beee9dec733eae943b425ae58c54f08458d8 
  core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 
4a3a5b264a021e55c39f4d7424ce04ee591503ef 

Diff: https://reviews.apache.org/r/29952/diff/


Testing
---


Thanks,

Jun Rao



[jira] [Updated] (KAFKA-1864) Revisit defaults for the internal offsets topic

2015-01-15 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-1864:
---
Status: Patch Available  (was: Open)

> Revisit defaults for the internal offsets topic
> ---
>
> Key: KAFKA-1864
> URL: https://issues.apache.org/jira/browse/KAFKA-1864
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Neha Narkhede
>Assignee: Jun Rao
>Priority: Blocker
> Attachments: kafka-1864.patch
>
>
> Realize this is late, but as I was reviewing the 0.8.2 RC, I found that our 
> defaults for the offsets topic are not ideal. The # of partitions currently 
> default to 1 and the replication factor is 1 as well. Granted that the 
> replication factor is changeable in the future (through the admin tool), 
> changing the # of partitions is a very disruptive change. The concern is that 
> this feature is on by default on the server and will be activated the moment 
> the first client turns on kafka based offset storage. 
> My proposal is to change the # of partitions to something large (50 or so) 
> and change the replication factor to min(# of alive brokers, configured 
> replication factor)





[jira] [Updated] (KAFKA-1864) Revisit defaults for the internal offsets topic

2015-01-15 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-1864:
---
Attachment: kafka-1864.patch

> Revisit defaults for the internal offsets topic
> ---
>
> Key: KAFKA-1864
> URL: https://issues.apache.org/jira/browse/KAFKA-1864
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Neha Narkhede
>Assignee: Jun Rao
>Priority: Blocker
> Attachments: kafka-1864.patch
>
>
> Realize this is late, but as I was reviewing the 0.8.2 RC, I found that our 
> defaults for the offsets topic are not ideal. The # of partitions currently 
> default to 1 and the replication factor is 1 as well. Granted that the 
> replication factor is changeable in the future (through the admin tool), 
> changing the # of partitions is a very disruptive change. The concern is that 
> this feature is on by default on the server and will be activated the moment 
> the first client turns on kafka based offset storage. 
> My proposal is to change the # of partitions to something large (50 or so) 
> and change the replication factor to min(# of alive brokers, configured 
> replication factor)





[jira] [Commented] (KAFKA-1864) Revisit defaults for the internal offsets topic

2015-01-15 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279616#comment-14279616
 ] 

Jun Rao commented on KAFKA-1864:


Attached is a patch. A couple of the unit tests fail because of KAFKA-1867. 
Fixing KAFKA-1867 is a bit tricky and we probably don't want to do that in 
0.8.2. So, for now, I am patching the unit tests by overriding the default 
value of offsets.topic.replication.factor.


> Revisit defaults for the internal offsets topic
> ---
>
> Key: KAFKA-1864
> URL: https://issues.apache.org/jira/browse/KAFKA-1864
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Neha Narkhede
>Assignee: Jun Rao
>Priority: Blocker
> Attachments: kafka-1864.patch
>
>
> Realize this is late, but as I was reviewing the 0.8.2 RC, I found that our 
> defaults for the offsets topic are not ideal. The # of partitions currently 
> default to 1 and the replication factor is 1 as well. Granted that the 
> replication factor is changeable in the future (through the admin tool), 
> changing the # of partitions is a very disruptive change. The concern is that 
> this feature is on by default on the server and will be activated the moment 
> the first client turns on kafka based offset storage. 
> My proposal is to change the # of partitions to something large (50 or so) 
> and change the replication factor to min(# of alive brokers, configured 
> replication factor)





[jira] [Commented] (KAFKA-1864) Revisit defaults for the internal offsets topic

2015-01-15 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279615#comment-14279615
 ] 

Jun Rao commented on KAFKA-1864:


Created reviewboard https://reviews.apache.org/r/29952/diff/
 against branch origin/0.8.2

> Revisit defaults for the internal offsets topic
> ---
>
> Key: KAFKA-1864
> URL: https://issues.apache.org/jira/browse/KAFKA-1864
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Neha Narkhede
>Assignee: Jun Rao
>Priority: Blocker
> Attachments: kafka-1864.patch
>
>
> Realize this is late, but as I was reviewing the 0.8.2 RC, I found that our 
> defaults for the offsets topic are not ideal. The # of partitions currently 
> default to 1 and the replication factor is 1 as well. Granted that the 
> replication factor is changeable in the future (through the admin tool), 
> changing the # of partitions is a very disruptive change. The concern is that 
> this feature is on by default on the server and will be activated the moment 
> the first client turns on kafka based offset storage. 
> My proposal is to change the # of partitions to something large (50 or so) 
> and change the replication factor to min(# of alive brokers, configured 
> replication factor)





[jira] [Created] (KAFKA-1868) ConsoleConsumer shouldn't override dual.commit.enabled to false if not explicitly set

2015-01-15 Thread Jun Rao (JIRA)
Jun Rao created KAFKA-1868:
--

 Summary: ConsoleConsumer shouldn't override dual.commit.enabled to 
false if not explicitly set
 Key: KAFKA-1868
 URL: https://issues.apache.org/jira/browse/KAFKA-1868
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.2
Reporter: Jun Rao
Assignee: Jun Rao
Priority: Blocker
 Fix For: 0.8.2


In ConsoleConsumer, we override dual.commit.enabled to false if not explicitly 
set. However, if offset.storage is set to kafka, by default, 
dual.commit.enabled is set to true and we shouldn't override that. 
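The fix amounts to deriving the default from offset.storage instead of unconditionally forcing false. An illustrative sketch over plain Properties (the property names follow the consumer config; the helper itself is hypothetical, not ConsoleConsumer's actual code):

```java
import java.util.Properties;

public class ConsumerDefaults {
    // Only supply a default for dual.commit.enabled when the user has not set
    // it explicitly; with kafka-based offset storage the proper default is
    // true (dual commit eases migration), otherwise false.
    static Properties applyDefaults(Properties props) {
        if (!props.containsKey("dual.commit.enabled")) {
            boolean kafkaStorage =
                "kafka".equals(props.getProperty("offset.storage", "zookeeper"));
            props.setProperty("dual.commit.enabled", String.valueOf(kafkaStorage));
        }
        return props;
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("offset.storage", "kafka");
        // With kafka offset storage and no explicit setting, default stays true.
        System.out.println(applyDefaults(p).getProperty("dual.commit.enabled")); // true
    }
}
```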





Re: [kafka-clients] Re: Heads up: KAFKA-1697 - remove code related to ack>1 on the broker

2015-01-15 Thread Gwen Shapira
Hi,

Joe and Dana have convinced me that error codes are indeed part of the
protocol and can't be added arbitrarily.

So, it looks like we need to bump the version of ProduceRequest in the
following way:
Version 0 -> accept acks >1. I think we should keep the existing
behavior too (i.e. not replace it with -1) to avoid surprising
clients, but I'm willing to hear other opinions.
Version 1 -> do not accept acks >1 and return an error.
Are we ok with the error I added in KAFKA-1697? We can use something
less specific like InvalidRequestParameter. This error can be reused
in the future and reduce the need to add errors, but will also be less
clear to the client and its users. Maybe even add the error message
string to the protocol in addition to the error code? (since we are
bumping versions)
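The two-version behavior above could be sketched as a version-gated validation on the broker. This is hypothetical: the error name and code are placeholders, not actual Kafka protocol error codes:

```java
public class ProduceRequestValidator {
    static final short NONE = 0;
    static final short INVALID_REQUIRED_ACKS = 21; // placeholder error code

    // Version 0 keeps the legacy behavior (acks > 1 accepted as-is, to avoid
    // surprising old clients); version 1 rejects acks > 1 with an error.
    static short validateAcks(int apiVersion, int requiredAcks) {
        if (apiVersion >= 1 && requiredAcks > 1) {
            return INVALID_REQUIRED_ACKS;
        }
        return NONE;
    }

    public static void main(String[] args) {
        System.out.println(validateAcks(0, 3)); // 0: legacy version unaffected
        System.out.println(validateAcks(1, 3)); // 21: new version rejects acks > 1
        System.out.println(validateAcks(1, 1)); // 0: acks in {-1, 0, 1} still fine
    }
}
```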

I think maintaining the old version throughout 0.8.X makes sense. IMO
dropping it for 0.9 is feasible, but I'll let client owners help make
that call.

Am I missing anything? Should I start a KIP? It seems like a KIP-type
discussion :)

Gwen


On Thu, Jan 15, 2015 at 2:31 PM, Ewen Cheslack-Postava
 wrote:
> Gwen,
>
> I think the only option that wouldn't require a protocol version change is
> the one where acks > 1 is converted to acks = -1 since it's the only one
> that doesn't potentially break older clients. The protocol guide says that
> the expected upgrade path is servers first, then clients, so old clients,
> including non-java clients, that may be using acks > 1 should be able to
> work with a new broker version.
>
> It's more work, but I think dealing with the protocol change is the right
> thing to do since it eventually gets us to the behavior I think is better --
> the broker should reject requests with invalid values. I think Joe and I
> were basically in agreement. In my mind the major piece missing from his
> description is how long we're going to maintain his "case 0" behavior. It's
> impractical to maintain old versions forever, but it sounds like there
> hasn't been a decision on how long to maintain them. Maybe that's another
> item to add to KIPs -- protocol versions and behavior need to be listed as
> deprecated and the earliest version in which they'll be removed should be
> specified so users can understand which versions are guaranteed to be
> compatible, even if they're using (well-written) non-java clients.
>
> -Ewen
>
>
> On Thu, Jan 15, 2015 at 12:52 PM, Dana Powers  wrote:
>>
>> > clients don't break on unknown errors
>>
>> maybe true for the official java clients, but I dont think the assumption
>> holds true for community-maintained clients and users of those clients.
>>  kafka-python generally follows the fail-fast philosophy and raises an
>> exception on any unrecognized error code in any server response.  in this
>> case, kafka-python allows users to set their own required-acks policy when
>> creating a producer instance.  It is possible that users of kafka-python
>> have deployed producer code that uses ack>1 -- perhaps in production
>> environments -- and for those users the new error code will crash their
>> producer code.  I would not be surprised if the same were true of other
>> community clients.
>>
>> *one reason for the fail-fast approach is that there isn't great
>> documentation on what errors to expect for each request / response -- so
>> we
>> use failures to alert that some error case is not handled properly.  and
>> because of that, introducing new error cases without bumping the api
>> version is likely to cause those errors to get raised/thrown all the way
>> back up to the user.  of course we (client maintainers) can fix the issues
>> in the client libraries and suggest users upgrade, but it's not the ideal
>> situation.
>>
>>
>> long-winded way of saying: I agree w/ Joe.
>>
>> -Dana
>>
>>
>> On Thu, Jan 15, 2015 at 12:07 PM, Gwen Shapira 
>> wrote:
>>
>> > Is the protocol bump caused by the behavior change or the new error
>> > code?
>> >
>> > 1) IMO, error_codes are data, and clients can expect to receive errors
>> > that they don't understand (i.e. unknown errors). AFAIK, clients don't
>> > break on unknown errors; they are simply more challenging to debug. If
>> > we document the new behavior, then its definitely debuggable and
>> > fixable.
>> >
>> > 2) The behavior change is basically a deprecation - i.e. acks > 1 were
>> > never documented, and are not supported by Kafka clients starting with
>> > version 0.8.2. I'm not sure this requires a protocol bump either,
>> > although its a better case than new error codes.
>> >
>> > Thanks,
>> > Gwen
>> >
>> > On Thu, Jan 15, 2015 at 10:10 AM, Joe Stein 
>> > wrote:
>> > > Looping in the mailing list that the client developers live on because
>> > they
>> > > are all not on dev (though they should be if they want to be helping
>> > > to
>> > > build the best client libraries they can).
>> > >
>> > > I wholeheartedly believe that we need to never break existing
>> > > functionality of the client protocol.
>> > >
>

Review Request 29953: Patch for kafka-1868

2015-01-15 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29953/
---

Review request for kafka.


Bugs: kafka-1868
https://issues.apache.org/jira/browse/kafka-1868


Repository: kafka


Description
---

remove the overrides


Diffs
-

  core/src/main/scala/kafka/tools/ConsoleConsumer.scala 
323fc8566d974acc4e5c7d7c2a065794f3b5df4a 

Diff: https://reviews.apache.org/r/29953/diff/


Testing
---


Thanks,

Jun Rao



[jira] [Updated] (KAFKA-1868) ConsoleConsumer shouldn't override dual.commit.enabled to false if not explicitly set

2015-01-15 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-1868:
---
Status: Patch Available  (was: Open)

> ConsoleConsumer shouldn't override dual.commit.enabled to false if not 
> explicitly set
> -
>
> Key: KAFKA-1868
> URL: https://issues.apache.org/jira/browse/KAFKA-1868
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Jun Rao
>Assignee: Jun Rao
>Priority: Blocker
> Fix For: 0.8.2
>
> Attachments: kafka-1868.patch
>
>
> In ConsoleConsumer, we override dual.commit.enabled to false if not 
> explicitly set. However, if offset.storage is set to kafka, by default, 
> dual.commit.enabled is set to true and we shouldn't override that. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1868) ConsoleConsumer shouldn't override dual.commit.enabled to false if not explicitly set

2015-01-15 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-1868:
---
Attachment: kafka-1868.patch

> ConsoleConsumer shouldn't override dual.commit.enabled to false if not 
> explicitly set
> -
>
> Key: KAFKA-1868
> URL: https://issues.apache.org/jira/browse/KAFKA-1868
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Jun Rao
>Assignee: Jun Rao
>Priority: Blocker
> Fix For: 0.8.2
>
> Attachments: kafka-1868.patch
>
>
> In ConsoleConsumer, we override dual.commit.enabled to false if not 
> explicitly set. However, if offset.storage is set to kafka, by default, 
> dual.commit.enabled is set to true and we shouldn't override that. 





[jira] [Commented] (KAFKA-1868) ConsoleConsumer shouldn't override dual.commit.enabled to false if not explicitly set

2015-01-15 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279626#comment-14279626
 ] 

Jun Rao commented on KAFKA-1868:


Created reviewboard https://reviews.apache.org/r/29953/diff/
 against branch origin/0.8.2

> ConsoleConsumer shouldn't override dual.commit.enabled to false if not 
> explicitly set
> -
>
> Key: KAFKA-1868
> URL: https://issues.apache.org/jira/browse/KAFKA-1868
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Jun Rao
>Assignee: Jun Rao
>Priority: Blocker
> Fix For: 0.8.2
>
> Attachments: kafka-1868.patch
>
>
> In ConsoleConsumer, we override dual.commit.enabled to false if not 
> explicitly set. However, if offset.storage is set to kafka, by default, 
> dual.commit.enabled is set to true and we shouldn't override that. 



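The fix described in KAFKA-1868 amounts to applying a default for `dual.commit.enabled` only when the user has not set the property explicitly, instead of overriding it unconditionally. A minimal sketch of that "default only if absent" pattern (illustrative only; this is not the actual ConsoleConsumer code):

```java
import java.util.Properties;

public class ConsumerDefaults {
    // Apply a default only when the user has not explicitly set the key.
    // This mirrors the fix: don't clobber dual.commit.enabled, which should
    // default to true when offset.storage=kafka (for offset migration).
    static void setDefaultIfAbsent(Properties props, String key, String value) {
        if (!props.containsKey(key)) {
            props.setProperty(key, value);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("offset.storage", "kafka");
        // Buggy behavior: unconditionally setting dual.commit.enabled=false.
        // Fixed behavior: leave it alone unless the user supplied it.
        setDefaultIfAbsent(props, "dual.commit.enabled", "true");
        System.out.println(props.getProperty("dual.commit.enabled")); // prints "true"
    }
}
```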


Re: [kafka-clients] Re: Heads up: KAFKA-1697 - remove code related to ack>1 on the broker

2015-01-15 Thread Steve Morin
Agree errors should be part of the protocol

> On Jan 15, 2015, at 17:59, Gwen Shapira  wrote:
> 
> Hi,
> 
> I got convinced by Joe and Dana that errors are indeed part of the
> protocol and can't be randomly added.
> 
> So, it looks like we need to bump version of ProduceRequest in the
> following way:
> Version 0 -> accept acks >1. I think we should keep the existing
> behavior too (i.e. not replace it with -1) to avoid surprising
> clients, but I'm willing to hear other opinions.
> Version 1 -> do not accept acks >1 and return an error.
> Are we ok with the error I added in KAFKA-1697? We can use something
> less specific like InvalidRequestParameter. This error can be reused
> in the future and reduce the need to add errors, but will also be less
> clear to the client and its users. Maybe even add the error message
> string to the protocol in addition to the error code? (since we are
> bumping versions)
> 
> I think maintaining the old version throughout 0.8.X makes sense. IMO
> dropping it for 0.9 is feasible, but I'll let client owners help make
> that call.
> 
> Am I missing anything? Should I start a KIP? It seems like a KIP-type
> discussion :)
> 
> Gwen
> 
> 
> On Thu, Jan 15, 2015 at 2:31 PM, Ewen Cheslack-Postava
>  wrote:
>> Gwen,
>> 
>> I think the only option that wouldn't require a protocol version change is
>> the one where acks > 1 is converted to acks = -1 since it's the only one
>> that doesn't potentially break older clients. The protocol guide says that
>> the expected upgrade path is servers first, then clients, so old clients,
>> including non-java clients, that may be using acks > 1 should be able to
>> work with a new broker version.
>> 
>> It's more work, but I think dealing with the protocol change is the right
>> thing to do since it eventually gets us to the behavior I think is better --
>> the broker should reject requests with invalid values. I think Joe and I
>> were basically in agreement. In my mind the major piece missing from his
>> description is how long we're going to maintain his "case 0" behavior. It's
>> impractical to maintain old versions forever, but it sounds like there
>> hasn't been a decision on how long to maintain them. Maybe that's another
>> item to add to KIPs -- protocol versions and behavior need to be listed as
>> deprecated and the earliest version in which they'll be removed should be
>> specified so users can understand which versions are guaranteed to be
>> compatible, even if they're using (well-written) non-java clients.
>> 
>> -Ewen
>> 
>> 
>>> On Thu, Jan 15, 2015 at 12:52 PM, Dana Powers  wrote:
>>> 
 clients don't break on unknown errors
>>> 
>>> maybe true for the official java clients, but I dont think the assumption
>>> holds true for community-maintained clients and users of those clients.
>>> kafka-python generally follows the fail-fast philosophy and raises an
>>> exception on any unrecognized error code in any server response.  in this
>>> case, kafka-python allows users to set their own required-acks policy when
>>> creating a producer instance.  It is possible that users of kafka-python
>>> have deployed producer code that uses ack>1 -- perhaps in production
>>> environments -- and for those users the new error code will crash their
>>> producer code.  I would not be surprised if the same were true of other
>>> community clients.
>>> 
>>> *one reason for the fail-fast approach is that there isn't great
>>> documentation on what errors to expect for each request / response -- so
>>> we
>>> use failures to alert that some error case is not handled properly.  and
>>> because of that, introducing new error cases without bumping the api
>>> version is likely to cause those errors to get raised/thrown all the way
>>> back up to the user.  of course we (client maintainers) can fix the issues
>>> in the client libraries and suggest users upgrade, but it's not the ideal
>>> situation.
>>> 
>>> 
>>> long-winded way of saying: I agree w/ Joe.
>>> 
>>> -Dana
>>> 
>>> 
>>> On Thu, Jan 15, 2015 at 12:07 PM, Gwen Shapira 
>>> wrote:
>>> 
 Is the protocol bump caused by the behavior change or the new error
 code?
 
 1) IMO, error_codes are data, and clients can expect to receive errors
 that they don't understand (i.e. unknown errors). AFAIK, clients don't
 break on unknown errors, they are simply more challenging to debug. If
 we document the new behavior, then it's definitely debuggable and
 fixable.
 
 2) The behavior change is basically a deprecation - i.e. acks > 1 were
 never documented, and are not supported by Kafka clients starting with
 version 0.8.2. I'm not sure this requires a protocol bump either,
 although its a better case than new error codes.
 
 Thanks,
 Gwen
 
 On Thu, Jan 15, 2015 at 10:10 AM, Joe Stein 
 wrote:
> Looping in the mailing list that the client developers live on because
 they
> are all not on dev (though they should be 
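Gwen's version-gated proposal quoted above (version 0 keeps the old behavior; version 1 rejects acks > 1 with an error) can be sketched roughly as follows. The error-code name and numeric value here are assumptions for illustration, not the actual constants from the KAFKA-1697 patch:

```java
// Sketch of version-gated validation of required_acks in ProduceRequest.
public class AcksValidator {
    // Hypothetical error code for illustration; the real value would be
    // whatever the KAFKA-1697 patch assigns.
    static final short INVALID_REQUIRED_ACKS = 21;
    static final short NO_ERROR = 0;

    // Returns an error code, or NO_ERROR if the request is acceptable.
    static short validateAcks(int apiVersion, short requiredAcks) {
        if (apiVersion == 0) {
            // v0: preserve the existing behavior and accept any acks value,
            // so that already-deployed clients are not surprised.
            return NO_ERROR;
        }
        // v1+: only -1 (all in-sync replicas), 0, and 1 are valid.
        boolean valid = requiredAcks >= -1 && requiredAcks <= 1;
        return valid ? NO_ERROR : INVALID_REQUIRED_ACKS;
    }
}
```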

Re: [kafka-clients] Re: Heads up: KAFKA-1697 - remove code related to ack>1 on the broker

2015-01-15 Thread Gwen Shapira
The errors are part of the KIP process now, so I think the clients are safe :)

https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals

On Thu, Jan 15, 2015 at 5:12 PM, Steve Morin  wrote:
> Agree errors should be part of the protocol
>
>> On Jan 15, 2015, at 17:59, Gwen Shapira  wrote:
>>
>> Hi,
>>
>> I got convinced by Joe and Dana that errors are indeed part of the
>> protocol and can't be randomly added.
>>
>> So, it looks like we need to bump version of ProduceRequest in the
>> following way:
>> Version 0 -> accept acks >1. I think we should keep the existing
>> behavior too (i.e. not replace it with -1) to avoid surprising
>> clients, but I'm willing to hear other opinions.
>> Version 1 -> do not accept acks >1 and return an error.
>> Are we ok with the error I added in KAFKA-1697? We can use something
>> less specific like InvalidRequestParameter. This error can be reused
>> in the future and reduce the need to add errors, but will also be less
>> clear to the client and its users. Maybe even add the error message
>> string to the protocol in addition to the error code? (since we are
>> bumping versions)
>>
>> I think maintaining the old version throughout 0.8.X makes sense. IMO
>> dropping it for 0.9 is feasible, but I'll let client owners help make
>> that call.
>>
>> Am I missing anything? Should I start a KIP? It seems like a KIP-type
>> discussion :)
>>
>> Gwen
>>
>>
>> On Thu, Jan 15, 2015 at 2:31 PM, Ewen Cheslack-Postava
>>  wrote:
>>> Gwen,
>>>
>>> I think the only option that wouldn't require a protocol version change is
>>> the one where acks > 1 is converted to acks = -1 since it's the only one
>>> that doesn't potentially break older clients. The protocol guide says that
>>> the expected upgrade path is servers first, then clients, so old clients,
>>> including non-java clients, that may be using acks > 1 should be able to
>>> work with a new broker version.
>>>
>>> It's more work, but I think dealing with the protocol change is the right
>>> thing to do since it eventually gets us to the behavior I think is better --
>>> the broker should reject requests with invalid values. I think Joe and I
>>> were basically in agreement. In my mind the major piece missing from his
>>> description is how long we're going to maintain his "case 0" behavior. It's
>>> impractical to maintain old versions forever, but it sounds like there
>>> hasn't been a decision on how long to maintain them. Maybe that's another
>>> item to add to KIPs -- protocol versions and behavior need to be listed as
>>> deprecated and the earliest version in which they'll be removed should be
>>> specified so users can understand which versions are guaranteed to be
>>> compatible, even if they're using (well-written) non-java clients.
>>>
>>> -Ewen
>>>
>>>
 On Thu, Jan 15, 2015 at 12:52 PM, Dana Powers  
 wrote:

> clients don't break on unknown errors

 maybe true for the official java clients, but I dont think the assumption
 holds true for community-maintained clients and users of those clients.
 kafka-python generally follows the fail-fast philosophy and raises an
 exception on any unrecognized error code in any server response.  in this
 case, kafka-python allows users to set their own required-acks policy when
 creating a producer instance.  It is possible that users of kafka-python
 have deployed producer code that uses ack>1 -- perhaps in production
 environments -- and for those users the new error code will crash their
 producer code.  I would not be surprised if the same were true of other
 community clients.

 *one reason for the fail-fast approach is that there isn't great
 documentation on what errors to expect for each request / response -- so
 we
 use failures to alert that some error case is not handled properly.  and
 because of that, introducing new error cases without bumping the api
 version is likely to cause those errors to get raised/thrown all the way
 back up to the user.  of course we (client maintainers) can fix the issues
 in the client libraries and suggest users upgrade, but it's not the ideal
 situation.


 long-winded way of saying: I agree w/ Joe.

 -Dana


 On Thu, Jan 15, 2015 at 12:07 PM, Gwen Shapira 
 wrote:

> Is the protocol bump caused by the behavior change or the new error
> code?
>
> 1) IMO, error_codes are data, and clients can expect to receive errors
> that they don't understand (i.e. unknown errors). AFAIK, clients don't
> break on unknown errors, they are simply more challenging to debug. If
> we document the new behavior, then it's definitely debuggable and
> fixable.
>
> 2) The behavior change is basically a deprecation - i.e. acks > 1 were
> never documented, and are not supported by Kafka clients starting with
> version 0.8.2. I'm not sure this requires a protocol 

[jira] [Updated] (KAFKA-1810) Add IP Filtering / Whitelists-Blacklists

2015-01-15 Thread Joe Stein (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Stein updated KAFKA-1810:
-
Reviewer: Gwen Shapira

> Add IP Filtering / Whitelists-Blacklists 
> -
>
> Key: KAFKA-1810
> URL: https://issues.apache.org/jira/browse/KAFKA-1810
> Project: Kafka
>  Issue Type: New Feature
>  Components: core, network
>Reporter: Jeff Holoman
>Assignee: Jeff Holoman
>Priority: Minor
> Fix For: 0.8.3
>
> Attachments: KAFKA-1810.patch, KAFKA-1810_2015-01-15_19:47:14.patch
>
>
> While longer-term goals of security in Kafka are on the roadmap there exists 
> some value for the ability to restrict connection to Kafka brokers based on 
> IP address. This is not intended as a replacement for security but more of a 
> precaution against misconfiguration and to provide some level of control to 
> Kafka administrators about who is reading/writing to their cluster.
> 1) In some organizations software administration vs o/s systems 
> administration and network administration is disjointed and not well 
> choreographed. Providing software administrators the ability to configure 
> their platform relatively independently (after initial configuration) from 
> Systems administrators is desirable.
> 2) Configuration and deployment is sometimes error prone and there are 
> situations when test environments could erroneously read/write to production 
> environments
> 3) An additional precaution against reading sensitive data is typically 
> welcomed in most large enterprise deployments.



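An IP allow-list of the kind proposed in KAFKA-1810 boils down to a CIDR membership test at connection-accept time. A minimal IPv4 sketch (illustrative only; the class and method names are assumptions, not the attached patch):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.List;

public class IpFilter {
    // True if addr falls inside the IPv4 CIDR block, e.g. "192.168.50.0/24".
    static boolean inCidr(String cidr, InetAddress addr) throws UnknownHostException {
        String[] parts = cidr.split("/");
        int prefix = Integer.parseInt(parts[1]);
        int net = toInt(InetAddress.getByName(parts[0]));
        int ip = toInt(addr);
        int mask = prefix == 0 ? 0 : -1 << (32 - prefix);
        return (net & mask) == (ip & mask);
    }

    // Pack the 4 address bytes into a single int for masking.
    private static int toInt(InetAddress a) {
        byte[] b = a.getAddress();
        return ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16)
                | ((b[2] & 0xFF) << 8) | (b[3] & 0xFF);
    }

    // Accept the connection only if the client matches some allowed block.
    static boolean allowed(List<String> allowList, InetAddress client)
            throws UnknownHostException {
        for (String cidr : allowList) {
            if (inCidr(cidr, client)) return true;
        }
        return false;
    }
}
```

This is intentionally a precaution against misconfiguration (e.g. a test cluster pointed at production), not a substitute for real authentication, matching the intent stated in the ticket.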


Re: Review Request 29952: Patch for kafka-1864

2015-01-15 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29952/#review68381
---



core/src/main/scala/kafka/server/OffsetManager.scala


The only issue here is the problem raised in KAFKA-1867 - even though that 
should not happen in practice since you would generally only commit offsets 
after topics do exist in the cluster.

Anyway, wouldn't it just be simpler to keep the replication factor default 
as 1 given that it is possible to change it?


- Joel Koshy


On Jan. 16, 2015, 12:52 a.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29952/
> ---
> 
> (Updated Jan. 16, 2015, 12:52 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: kafka-1864
> https://issues.apache.org/jira/browse/kafka-1864
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> create offset topic with a larger replication factor by default
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> d626b1710813648524eefa5a3df098393c3e7743 
>   core/src/main/scala/kafka/server/KafkaConfig.scala 
> 6e26c5436feb4629d17f199011f3ebb674aa767f 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 43eb2a35bb54d32c66cdb94772df657b3a104d1a 
>   core/src/test/scala/integration/kafka/api/ProducerFailureHandlingTest.scala 
> 07a7beee9dec733eae943b425ae58c54f08458d8 
>   core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 
> 4a3a5b264a021e55c39f4d7424ce04ee591503ef 
> 
> Diff: https://reviews.apache.org/r/29952/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>



Re: Review Request 29953: Patch for kafka-1868

2015-01-15 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29953/#review68383
---

Ship it!


Ship It!

- Joel Koshy


On Jan. 16, 2015, 1:02 a.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29953/
> ---
> 
> (Updated Jan. 16, 2015, 1:02 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: kafka-1868
> https://issues.apache.org/jira/browse/kafka-1868
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> remove the overrides
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/tools/ConsoleConsumer.scala 
> 323fc8566d974acc4e5c7d7c2a065794f3b5df4a 
> 
> Diff: https://reviews.apache.org/r/29953/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>



[jira] [Commented] (KAFKA-1868) ConsoleConsumer shouldn't override dual.commit.enabled to false if not explicitly set

2015-01-15 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279660#comment-14279660
 ] 

Joel Koshy commented on KAFKA-1868:
---

Appears to be a side-effect of KAFKA-924

> ConsoleConsumer shouldn't override dual.commit.enabled to false if not 
> explicitly set
> -
>
> Key: KAFKA-1868
> URL: https://issues.apache.org/jira/browse/KAFKA-1868
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Jun Rao
>Assignee: Jun Rao
>Priority: Blocker
> Fix For: 0.8.2
>
> Attachments: kafka-1868.patch
>
>
> In ConsoleConsumer, we override dual.commit.enabled to false if not 
> explicitly set. However, if offset.storage is set to kafka, by default, 
> dual.commit.enabled is set to true and we shouldn't override that. 





[jira] [Commented] (KAFKA-1864) Revisit defaults for the internal offsets topic

2015-01-15 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279677#comment-14279677
 ] 

Jun Rao commented on KAFKA-1864:


Also, another impact of this patch is that the offset topic may not be 
created with the configured offset replication factor, since we 
take the min between the configured value and the # of live brokers. An alternative 
is to use negative values, as suggested in KAFKA-1846, as the default. Then we 
could treat positive values as a hard requirement. Not sure if this would 
cause more confusion.

> Revisit defaults for the internal offsets topic
> ---
>
> Key: KAFKA-1864
> URL: https://issues.apache.org/jira/browse/KAFKA-1864
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Neha Narkhede
>Assignee: Jun Rao
>Priority: Blocker
> Attachments: kafka-1864.patch
>
>
> Realize this is late, but as I was reviewing the 0.8.2 RC, I found that our 
> defaults for the offsets topic are not ideal. The # of partitions currently 
> default to 1 and the replication factor is 1 as well. Granted that the 
> replication factor is changeable in the future (through the admin tool), 
> changing the # of partitions is a very disruptive change. The concern is that 
> this feature is on by default on the server and will be activated the moment 
> the first client turns on kafka based offset storage. 
> My proposal is to change the # of partitions to something large (50 or so) 
> and change the replication factor to min(# of alive brokers, configured 
> replication factor)



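The two defaulting schemes discussed in KAFKA-1864 — capping the replication factor at the number of live brokers, versus Jun's KAFKA-1846 idea of using a negative value for "best effort" and treating positive values as a hard requirement — can be sketched as follows (names are illustrative, not the actual patch):

```java
public class OffsetsTopicDefaults {
    // Negative config => best effort: use min(|configured|, live brokers),
    // so the topic can still be created on a small or partially-up cluster.
    // Positive config => hard requirement: fail fast if not enough brokers
    // are alive, rather than silently creating an under-replicated topic.
    static int effectiveReplicationFactor(int configured, int liveBrokers) {
        if (configured < 0) {
            return Math.min(-configured, liveBrokers);
        }
        if (configured > liveBrokers) {
            throw new IllegalStateException(
                "Not enough live brokers for replication factor " + configured);
        }
        return configured;
    }
}
```

The trade-off Jun raises is visible here: the best-effort branch never fails, but it also never guarantees the configured redundancy, which a user may not notice until a broker goes down.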


Re: Review Request 29952: Patch for kafka-1864

2015-01-15 Thread Jun Rao


> On Jan. 16, 2015, 1:25 a.m., Joel Koshy wrote:
> > core/src/main/scala/kafka/server/OffsetManager.scala, line 79
> > 
> >
> > The only issue here is the problem raised in KAFKA-1867 - even though 
> > that should not happen in practice since you would generally only commit 
> > offsets after topics do exist in the cluster.
> > 
> > Anyway, wouldn't it just be simpler to keep the replication factor 
> > default as 1 given that it is possible to change it?

The main purpose of the patch is to make the default behavior good. For that, 
we want to have enough partitions and enough redundancy. The issue with 
defaulting replication factor to 1 is that people may not realize the issue 
until one of the brokers goes down. At that time, it's too late to change the 
replication factor.


- Jun


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29952/#review68381
---


On Jan. 16, 2015, 12:52 a.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29952/
> ---
> 
> (Updated Jan. 16, 2015, 12:52 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: kafka-1864
> https://issues.apache.org/jira/browse/kafka-1864
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> create offset topic with a larger replication factor by default
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> d626b1710813648524eefa5a3df098393c3e7743 
>   core/src/main/scala/kafka/server/KafkaConfig.scala 
> 6e26c5436feb4629d17f199011f3ebb674aa767f 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 43eb2a35bb54d32c66cdb94772df657b3a104d1a 
>   core/src/test/scala/integration/kafka/api/ProducerFailureHandlingTest.scala 
> 07a7beee9dec733eae943b425ae58c54f08458d8 
>   core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 
> 4a3a5b264a021e55c39f4d7424ce04ee591503ef 
> 
> Diff: https://reviews.apache.org/r/29952/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>



[jira] [Updated] (KAFKA-1868) ConsoleConsumer shouldn't override dual.commit.enabled to false if not explicitly set

2015-01-15 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-1868:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the review. Committed to both 0.8.2 and trunk.

> ConsoleConsumer shouldn't override dual.commit.enabled to false if not 
> explicitly set
> -
>
> Key: KAFKA-1868
> URL: https://issues.apache.org/jira/browse/KAFKA-1868
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Jun Rao
>Assignee: Jun Rao
>Priority: Blocker
> Fix For: 0.8.2
>
> Attachments: kafka-1868.patch
>
>
> In ConsoleConsumer, we override dual.commit.enabled to false if not 
> explicitly set. However, if offset.storage is set to kafka, by default, 
> dual.commit.enabled is set to true and we shouldn't override that. 





Re: [kafka-clients] Re: Heads up: KAFKA-1697 - remove code related to ack>1 on the broker

2015-01-15 Thread Guozhang Wang
+1 on Joe's suggestions, glad to see it happening!

On Thu, Jan 15, 2015 at 5:19 PM, Gwen Shapira  wrote:

> The errors are part of the KIP process now, so I think the clients are
> safe :)
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
>
> On Thu, Jan 15, 2015 at 5:12 PM, Steve Morin 
> wrote:
> > Agree errors should be part of the protocol
> >
> >> On Jan 15, 2015, at 17:59, Gwen Shapira  wrote:
> >>
> >> Hi,
> >>
> >> I got convinced by Joe and Dana that errors are indeed part of the
> >> protocol and can't be randomly added.
> >>
> >> So, it looks like we need to bump version of ProduceRequest in the
> >> following way:
> >> Version 0 -> accept acks >1. I think we should keep the existing
> >> behavior too (i.e. not replace it with -1) to avoid surprising
> >> clients, but I'm willing to hear other opinions.
> >> Version 1 -> do not accept acks >1 and return an error.
> >> Are we ok with the error I added in KAFKA-1697? We can use something
> >> less specific like InvalidRequestParameter. This error can be reused
> >> in the future and reduce the need to add errors, but will also be less
> >> clear to the client and its users. Maybe even add the error message
> >> string to the protocol in addition to the error code? (since we are
> >> bumping versions)
> >>
> >> I think maintaining the old version throughout 0.8.X makes sense. IMO
> >> dropping it for 0.9 is feasible, but I'll let client owners help make
> >> that call.
> >>
> >> Am I missing anything? Should I start a KIP? It seems like a KIP-type
> >> discussion :)
> >>
> >> Gwen
> >>
> >>
> >> On Thu, Jan 15, 2015 at 2:31 PM, Ewen Cheslack-Postava
> >>  wrote:
> >>> Gwen,
> >>>
> >>> I think the only option that wouldn't require a protocol version
> change is
> >>> the one where acks > 1 is converted to acks = -1 since it's the only
> one
> >>> that doesn't potentially break older clients. The protocol guide says
> that
> >>> the expected upgrade path is servers first, then clients, so old
> clients,
> >>> including non-java clients, that may be using acks > 1 should be able
> to
> >>> work with a new broker version.
> >>>
> >>> It's more work, but I think dealing with the protocol change is the
> right
> >>> thing to do since it eventually gets us to the behavior I think is
> better --
> >>> the broker should reject requests with invalid values. I think Joe and
> I
> >>> were basically in agreement. In my mind the major piece missing from
> his
> >>> description is how long we're going to maintain his "case 0" behavior.
> It's
> >>> impractical to maintain old versions forever, but it sounds like there
> >>> hasn't been a decision on how long to maintain them. Maybe that's
> another
> >>> item to add to KIPs -- protocol versions and behavior need to be
> listed as
> >>> deprecated and the earliest version in which they'll be removed should
> be
> >>> specified so users can understand which versions are guaranteed to be
> >>> compatible, even if they're using (well-written) non-java clients.
> >>>
> >>> -Ewen
> >>>
> >>>
>  On Thu, Jan 15, 2015 at 12:52 PM, Dana Powers 
> wrote:
> 
> > clients don't break on unknown errors
> 
>  maybe true for the official java clients, but I dont think the
> assumption
>  holds true for community-maintained clients and users of those
> clients.
>  kafka-python generally follows the fail-fast philosophy and raises an
>  exception on any unrecognized error code in any server response.  in
> this
>  case, kafka-python allows users to set their own required-acks policy
> when
>  creating a producer instance.  It is possible that users of
> kafka-python
>  have deployed producer code that uses ack>1 -- perhaps in production
>  environments -- and for those users the new error code will crash
> their
>  producer code.  I would not be surprised if the same were true of
> other
>  community clients.
> 
>  *one reason for the fail-fast approach is that there isn't great
>  documentation on what errors to expect for each request / response --
> so
>  we
>  use failures to alert that some error case is not handled properly.
> and
>  because of that, introducing new error cases without bumping the api
>  version is likely to cause those errors to get raised/thrown all the
> way
>  back up to the user.  of course we (client maintainers) can fix the
> issues
>  in the client libraries and suggest users upgrade, but it's not the
> ideal
>  situation.
> 
> 
>  long-winded way of saying: I agree w/ Joe.
> 
>  -Dana
> 
> 
>  On Thu, Jan 15, 2015 at 12:07 PM, Gwen Shapira  >
>  wrote:
> 
> > Is the protocol bump caused by the behavior change or the new error
> > code?
> >
> > 1) IMO, error_codes are data, and clients can expect to receive
> errors
> > that they don't understand (i.e. unknown errors). AFAIK, clients
> don't
> > break on

Re: [kafka-clients] Re: Heads up: KAFKA-1697 - remove code related to ack>1 on the broker

2015-01-15 Thread Gwen Shapira
I created a KIP for this suggestion:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-1+-+Remove+support+of+request.required.acks

Basically documenting what was already discussed here.  Comments will
be awesome!

Gwen

On Thu, Jan 15, 2015 at 5:19 PM, Gwen Shapira  wrote:
> The errors are part of the KIP process now, so I think the clients are safe :)
>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
>
> On Thu, Jan 15, 2015 at 5:12 PM, Steve Morin  wrote:
>> Agree errors should be part of the protocol
>>
>>> On Jan 15, 2015, at 17:59, Gwen Shapira  wrote:
>>>
>>> Hi,
>>>
>>> I got convinced by Joe and Dana that errors are indeed part of the
>>> protocol and can't be randomly added.
>>>
>>> So, it looks like we need to bump version of ProduceRequest in the
>>> following way:
>>> Version 0 -> accept acks >1. I think we should keep the existing
>>> behavior too (i.e. not replace it with -1) to avoid surprising
>>> clients, but I'm willing to hear other opinions.
>>> Version 1 -> do not accept acks >1 and return an error.
>>> Are we ok with the error I added in KAFKA-1697? We can use something
>>> less specific like InvalidRequestParameter. This error can be reused
>>> in the future and reduce the need to add errors, but will also be less
>>> clear to the client and its users. Maybe even add the error message
>>> string to the protocol in addition to the error code? (since we are
>>> bumping versions)
>>>
>>> I think maintaining the old version throughout 0.8.X makes sense. IMO
>>> dropping it for 0.9 is feasible, but I'll let client owners help make
>>> that call.
>>>
>>> Am I missing anything? Should I start a KIP? It seems like a KIP-type
>>> discussion :)
>>>
>>> Gwen
>>>
>>>
>>> On Thu, Jan 15, 2015 at 2:31 PM, Ewen Cheslack-Postava
>>>  wrote:
 Gwen,

 I think the only option that wouldn't require a protocol version change is
 the one where acks > 1 is converted to acks = -1 since it's the only one
 that doesn't potentially break older clients. The protocol guide says that
 the expected upgrade path is servers first, then clients, so old clients,
 including non-java clients, that may be using acks > 1 should be able to
 work with a new broker version.

 It's more work, but I think dealing with the protocol change is the right
 thing to do since it eventually gets us to the behavior I think is better 
 --
 the broker should reject requests with invalid values. I think Joe and I
 were basically in agreement. In my mind the major piece missing from his
 description is how long we're going to maintain his "case 0" behavior. It's
 impractical to maintain old versions forever, but it sounds like there
 hasn't been a decision on how long to maintain them. Maybe that's another
 item to add to KIPs -- protocol versions and behavior need to be listed as
 deprecated and the earliest version in which they'll be removed should be
 specified so users can understand which versions are guaranteed to be
 compatible, even if they're using (well-written) non-java clients.

 -Ewen


> On Thu, Jan 15, 2015 at 12:52 PM, Dana Powers  
> wrote:
>
>> clients don't break on unknown errors
>
> maybe true for the official java clients, but I dont think the assumption
> holds true for community-maintained clients and users of those clients.
> kafka-python generally follows the fail-fast philosophy and raises an
> exception on any unrecognized error code in any server response.  in this
> case, kafka-python allows users to set their own required-acks policy when
> creating a producer instance.  It is possible that users of kafka-python
> have deployed producer code that uses ack>1 -- perhaps in production
> environments -- and for those users the new error code will crash their
> producer code.  I would not be surprised if the same were true of other
> community clients.
>
> *one reason for the fail-fast approach is that there isn't great
> documentation on what errors to expect for each request / response -- so
> we
> use failures to alert that some error case is not handled properly.  and
> because of that, introducing new error cases without bumping the api
> version is likely to cause those errors to get raised/thrown all the way
> back up to the user.  of course we (client maintainers) can fix the issues
> in the client libraries and suggest users upgrade, but it's not the ideal
> situation.
>
>
> long-winded way of saying: I agree w/ Joe.
>
> -Dana
>
>
> On Thu, Jan 15, 2015 at 12:07 PM, Gwen Shapira 
> wrote:
>
>> Is the protocol bump caused by the behavior change or the new error
>> code?
>>
>> 1) IMO, error_codes are data, and clients can expect to receive errors
>> that they don't understand (i.e. unknown errors). AFAIK, clie

[jira] [Commented] (KAFKA-1476) Get a list of consumer groups

2015-01-15 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279715#comment-14279715
 ] 

Neha Narkhede commented on KAFKA-1476:
--

Thanks for attaching the output, [~onurkaraman]. Here are some review comments-

1. The error stack trace for describing groups that don't exist is pretty ugly, 
so let's remove that and output a message that states the group doesn't exist
{code}
vagrant@worker1:/opt/kafka$ bin/kafka-consumer-groups.sh --zookeeper 
192.168.50.11:2181 --describe --group g3
Error while executing consumer group command 
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for /consumers/g3/owners
org.I0Itec.zkclient.exception.ZkNoNodeException: 
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for /consumers/g3/owners
at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:413)
at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:409)
at kafka.utils.ZkUtils$.getTopicsByConsumerGroup(ZkUtils.scala:758)
at 
kafka.admin.ConsumerGroupCommand$.describe(ConsumerGroupCommand.scala:81)
at kafka.admin.ConsumerGroupCommand$.main(ConsumerGroupCommand.scala:56)
at kafka.admin.ConsumerGroupCommand.main(ConsumerGroupCommand.scala)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
KeeperErrorCode = NoNode for /consumers/g3/owners
at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1500)
at org.I0Itec.zkclient.ZkConnection.getChildren(ZkConnection.java:99)
at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:416)
at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:413)
at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
... 6 more
{code}
2. I'm not sure why the header fields are not comma separated but the data is. 
It will be best to stick to one separator only. So if you pick , as the 
separator, then the output should change to
{code}
GROUP, TOPIC, PID, CURRENT, OFFSET, LOG SIZE, LAG, OWNER
g2, t2, 0, 1, 1, 0, none
{code}
3. For deleting only a specific topic's information, it is sufficient to rely 
on the user specifying the --topic along with --delete. The --delete-with-topic 
option seems unnecessary. 
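Neha's point 2 above (use one separator for both header and data rows) amounts to something like the following sketch (field names taken from the sample output; not the actual ConsumerGroupCommand code):

```python
def format_rows(header, rows, sep=', '):
    """Render the header and every data row with the same separator,
    so the output is consistently parseable."""
    lines = [sep.join(header)]
    for row in rows:
        lines.append(sep.join(str(field) for field in row))
    return '\n'.join(lines)
```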

> Get a list of consumer groups
> -
>
> Key: KAFKA-1476
> URL: https://issues.apache.org/jira/browse/KAFKA-1476
> Project: Kafka
>  Issue Type: Wish
>  Components: tools
>Affects Versions: 0.8.1.1
>Reporter: Ryan Williams
>Assignee: Balaji Seshadri
>  Labels: newbie
> Fix For: 0.9.0
>
> Attachments: ConsumerCommand.scala, KAFKA-1476-LIST-GROUPS.patch, 
> KAFKA-1476-RENAME.patch, KAFKA-1476-REVIEW-COMMENTS.patch, KAFKA-1476.patch, 
> KAFKA-1476.patch, KAFKA-1476.patch, KAFKA-1476.patch, 
> KAFKA-1476_2014-11-10_11:58:26.patch, KAFKA-1476_2014-11-10_12:04:01.patch, 
> KAFKA-1476_2014-11-10_12:06:35.patch, KAFKA-1476_2014-12-05_12:00:12.patch, 
> KAFKA-1476_2015-01-12_16:22:26.patch, KAFKA-1476_2015-01-12_16:31:20.patch, 
> KAFKA-1476_2015-01-13_10:36:18.patch, KAFKA-1476_2015-01-15_14:30:04.patch, 
> sample-kafka-consumer-groups-sh-output.txt
>
>
> It would be useful to have a way to get a list of consumer groups currently 
> active via some tool/script that ships with kafka. This would be helpful so 
> that the system tools can be explored more easily.
> For example, when running the ConsumerOffsetChecker, it requires a group 
> option
> bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic test --group 
> ?
> But, when just getting started with kafka, using the console producer and 
> consumer, it is not clear what value to use for the group option.  If a list 
> of consumer groups could be listed, then it would be clear what value to use.
> Background:
> http://mail-archives.apache.org/mod_mbox/kafka-users/201405.mbox/%3cCAOq_b1w=slze5jrnakxvak0gu9ctdkpazak1g4dygvqzbsg...@mail.gmail.com%3e



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1476) Get a list of consumer groups

2015-01-15 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279715#comment-14279715
 ] 

Neha Narkhede edited comment on KAFKA-1476 at 1/16/15 2:21 AM:
---

Thanks for attaching the output, [~onurkaraman]. Here are some review comments-

1. The error stack trace for describing groups that don't exist is pretty ugly, 
so let's remove that and output a message that states the group doesn't exist
{code}
vagrant@worker1:/opt/kafka$ bin/kafka-consumer-groups.sh --zookeeper 
192.168.50.11:2181 --describe --group g3
Error while executing consumer group command 
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for /consumers/g3/owners
org.I0Itec.zkclient.exception.ZkNoNodeException: 
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for /consumers/g3/owners
at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:413)
at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:409)
at kafka.utils.ZkUtils$.getTopicsByConsumerGroup(ZkUtils.scala:758)
at 
kafka.admin.ConsumerGroupCommand$.describe(ConsumerGroupCommand.scala:81)
at kafka.admin.ConsumerGroupCommand$.main(ConsumerGroupCommand.scala:56)
at kafka.admin.ConsumerGroupCommand.main(ConsumerGroupCommand.scala)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
KeeperErrorCode = NoNode for /consumers/g3/owners
at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1500)
at org.I0Itec.zkclient.ZkConnection.getChildren(ZkConnection.java:99)
at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:416)
at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:413)
at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
... 6 more
{code}
2. I'm not sure why the header fields are not comma separated but the data is. 
It will be best to stick to one separator only. So if you pick comma as the 
separator, then the output should change to
{code}
GROUP, TOPIC, PID, CURRENT, OFFSET, LOG SIZE, LAG, OWNER
g2, t2, 0, 1, 1, 0, none
{code}
3. For deleting only a specific topic's information, it is sufficient to rely 
on the user specifying the --topic along with --delete. The --delete-with-topic 
option seems unnecessary. 


was (Author: nehanarkhede):
Thanks for attaching the output, [~onurkaraman]. Here are some review comments-

1. The error stack trace for describing groups that don't exist is pretty ugly, 
so let's remove that and output a message that states the group doesn't exist
{code}
vagrant@worker1:/opt/kafka$ bin/kafka-consumer-groups.sh --zookeeper 
192.168.50.11:2181 --describe --group g3
Error while executing consumer group command 
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for /consumers/g3/owners
org.I0Itec.zkclient.exception.ZkNoNodeException: 
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for /consumers/g3/owners
at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:413)
at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:409)
at kafka.utils.ZkUtils$.getTopicsByConsumerGroup(ZkUtils.scala:758)
at 
kafka.admin.ConsumerGroupCommand$.describe(ConsumerGroupCommand.scala:81)
at kafka.admin.ConsumerGroupCommand$.main(ConsumerGroupCommand.scala:56)
at kafka.admin.ConsumerGroupCommand.main(ConsumerGroupCommand.scala)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
KeeperErrorCode = NoNode for /consumers/g3/owners
at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1500)
at org.I0Itec.zkclient.ZkConnection.getChildren(ZkConnection.java:99)
at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:416)
at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:413)
at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
... 6 more
{code}
2. I'm not sure why the header fields are not comma separated but the data is. 
It will be best to stick to one separator only

Re: Compatibility + Unknown APIs

2015-01-15 Thread Guozhang Wang
The "hacky" method that Dana suggests does not sound too hacky to me
actually.

Since such a scenario will only happen when 1) new clients talk to an older
server or 2) older clients talk to a new server with some APIs deprecated,
and "correlation_id" has always been set to meaningful numbers before, old
clients will not check its validity. So we only need to upgrade the
clients to handle a "-1 correlation_id" once, at any time, while
before that happens the old client will just throw "SerializationException"
instead of "ERROR: Closing socket for" in both cases, which gives them
similar semantics. In such a situation we do not need to require a version
bump.
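The reserved-correlation-id scheme under discussion could look like this on the client side (a sketch with a hypothetical wire layout; the real protocol defines no UnknownAPIResponse):

```python
import struct

# Hypothetical sentinel proposed in this thread, not part of the protocol.
RESERVED_CORRELATION_ID = -1

def decode_response(payload):
    """Decode a response header and route the reserved correlation id.

    Assumed layout: int32 correlation_id; for the reserved id, an
    int32 original_correlation_id and an int16 error_code follow.
    """
    (correlation_id,) = struct.unpack_from('>i', payload, 0)
    if correlation_id == RESERVED_CORRELATION_ID:
        original_id, error_code = struct.unpack_from('>ih', payload, 4)
        return {'unknown_api': True,
                'correlation_id': original_id,
                'error_code': error_code}
    return {'unknown_api': False, 'correlation_id': correlation_id}
```

An old client that was never upgraded simply never matches the sentinel and fails on the unexpected bytes, which is the SerializationException-like behavior described above.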

On Mon, Jan 12, 2015 at 6:36 PM, Jay Kreps  wrote:

> I totally agree but I still think we shouldn't do it. :-)
>
> That change would cause the reimplementation of ALL existing Kafka clients.
> (You can't chose not to implement a new protocol version or else we are
> committing to keeping the old version supported both ways on the server
> forever).
>
> The problem it fixes is fairly minor: clients that want to adaptively
> detect apis. In general I agree this isn't easy to do, but I also don't
> think it is really recommended. I think it is probably better for clients
> to just implement against reasonably conservative versions and trust us not
> to break them going forward. That is simpler and less likely to break.
>
> We also haven't actually addressed the issue originally brought up that
> led to not doing it--how to interpret and set the top-level error in the
> presence of nested errors (which exception does the client throw and when).
> This is kind of icky too, though probably preferable if we were starting
> over. I see either of these alternatives as imperfect but changing now has
> a high cost and doesn't really address a top 50 pain point.
>
> But I do agree that KIPs would really help draw attention to these kinds of
> decisions as we make them and help us get serious about sticking with them
> without having that kind of "it sucks but..." feeling.
>
> -Jay
>
>
> On Mon, Jan 12, 2015 at 5:57 PM, Joe Stein  wrote:
>
> > There are benefits of moving the error code to the response header.
> >
> > 1) I think it is the right thing to-do from an implementation
> perspective.
> > It makes the most sense. You send a request and you get back a response.
> > The response tells you something is wrong in the header.
> >
> > 2) With such a large change we can make sure we have our solution to
> solve
> > these issues (see other thread on Compatibility and KIP) setup and in
> place
> > moving forward. If we can make such a large change then smaller ones
> should
> > work well too. We could even use this one change as a way to best flush
> out
> > the way we want to implement it preserving functionality AND adding the
> new
> > response format. When we release 0.8.3 (assuming this was in there)
> > developers can read KIP-1 (or whatever) and decide if they want to
> support
> > the version bump required, if not then fine keep working with 0.8.2 and
> you
> > are good to go.
> >
> > /***
> >  Joe Stein
> >  Founder, Principal Consultant
> >  Big Data Open Source Security LLC
> >  http://www.stealth.ly
> >  Twitter: @allthingshadoop 
> > /
> >
> > On Mon, Jan 12, 2015 at 8:37 PM, Jay Kreps  wrote:
> >
> > > Yeah, adding it to the metadata request probably makes sense.
> > >
> > > What you describe of making it a per-broker field is technically
> correct,
> > > since each broker could be on a different software version. But I
> wonder
> > if
> > > it might not be more usable to just give back a single list of api
> > > versions. This will be more compact and also easier to interpret as a
> > > client. An easy implementation of this would be for the broker that
> > answers
> > > the metadata request by just giving whatever versions it supports. A
> > > slightly better implementation would be for each broker to register
> what
> > it
> > > supports in ZK and have the responding broker give back the
> intersection
> > > (i.e. apis supported by all brokers).
> > >
> > > Since the broker actually supports multiple versions at the same time
> > this
> > > will need to be in the form [ApiId [ApiVersion]].
> > >
> > > -Jay
> > >
> > > On Mon, Jan 12, 2015 at 5:19 PM, Dana Powers 
> wrote:
> > >
> > > > Perhaps a bit hacky, but you could also reserve a specific
> > correlationId
> > > > (maybe -1)
> > > > to represent errors and send back to the client an UnknownAPIResponse
> > > like:
> > > >
> > > > Response => -1 UnknownAPIResponse
> > > >
> > > > UnknownAPIResponse => originalCorrelationId errorCode
> > > >
> > > > The benefit here would be that it does not break the current API and
> > > > current
> > > > clients should be able to continue operating as usual as long as they
> > > > ignore
> > > > unknown correlationIds and don't use the reserv
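The intersection Jay describes earlier in this thread (each broker registers the API versions it supports, and the responding broker advertises only what every broker can serve) could be computed like this (a sketch; the per-broker registration data is assumed, not Kafka's actual ZK layout):

```python
def common_api_versions(broker_supported):
    """Given {broker_id: {api_id: set_of_versions}}, return the
    [ApiId [ApiVersion]] mapping supported by *all* brokers."""
    if not broker_supported:
        return {}
    brokers = list(broker_supported.values())
    # Only APIs every broker knows about...
    common_apis = set.intersection(*(set(b) for b in brokers))
    # ...and for each API, only the versions every broker implements.
    return {api: sorted(set.intersection(*(b[api] for b in brokers)))
            for api in common_apis}
```

A client would then implement against the advertised versions instead of probing each API and hoping for a recognizable failure.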

Re: Review Request 29952: Patch for kafka-1864

2015-01-15 Thread Gwen Shapira

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29952/#review68395
---



core/src/main/scala/kafka/server/OffsetManager.scala


I'm wondering why you chose to change defaults here and not in KafkaConfig?
Unless I'm missing something, it looks like we are defining defaults in two 
different places, and they don't match any more.


- Gwen Shapira


On Jan. 16, 2015, 12:52 a.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29952/
> ---
> 
> (Updated Jan. 16, 2015, 12:52 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: kafka-1864
> https://issues.apache.org/jira/browse/kafka-1864
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> create offset topic with a larger replication factor by default
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> d626b1710813648524eefa5a3df098393c3e7743 
>   core/src/main/scala/kafka/server/KafkaConfig.scala 
> 6e26c5436feb4629d17f199011f3ebb674aa767f 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 43eb2a35bb54d32c66cdb94772df657b3a104d1a 
>   core/src/test/scala/integration/kafka/api/ProducerFailureHandlingTest.scala 
> 07a7beee9dec733eae943b425ae58c54f08458d8 
>   core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 
> 4a3a5b264a021e55c39f4d7424ce04ee591503ef 
> 
> Diff: https://reviews.apache.org/r/29952/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>



Re: Review Request 29952: Patch for kafka-1864

2015-01-15 Thread Jun Rao


> On Jan. 16, 2015, 2:54 a.m., Gwen Shapira wrote:
> > core/src/main/scala/kafka/server/OffsetManager.scala, line 77
> > 
> >
> > I'm wondering why you chose to change defaults here and not in 
> > KafkaConfig?
> > Unless I'm missing something, it looks like we are defining defaults in 
> > two different places, and they don't match any more.

KafkaConfig references this value in OffsetManager, right?


- Jun


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29952/#review68395
---


On Jan. 16, 2015, 12:52 a.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29952/
> ---
> 
> (Updated Jan. 16, 2015, 12:52 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: kafka-1864
> https://issues.apache.org/jira/browse/kafka-1864
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> create offset topic with a larger replication factor by default
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> d626b1710813648524eefa5a3df098393c3e7743 
>   core/src/main/scala/kafka/server/KafkaConfig.scala 
> 6e26c5436feb4629d17f199011f3ebb674aa767f 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 43eb2a35bb54d32c66cdb94772df657b3a104d1a 
>   core/src/test/scala/integration/kafka/api/ProducerFailureHandlingTest.scala 
> 07a7beee9dec733eae943b425ae58c54f08458d8 
>   core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 
> 4a3a5b264a021e55c39f4d7424ce04ee591503ef 
> 
> Diff: https://reviews.apache.org/r/29952/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>



Re: Review Request 29952: Patch for kafka-1864

2015-01-15 Thread Gwen Shapira


> On Jan. 16, 2015, 2:54 a.m., Gwen Shapira wrote:
> > core/src/main/scala/kafka/server/OffsetManager.scala, line 77
> > 
> >
> > I'm wondering why you chose to change defaults here and not in 
> > KafkaConfig?
> > Unless I'm missing something, it looks like we are defining defaults in 
> > two different places, and they don't match any more.
> 
> Jun Rao wrote:
> KafkaConfig references this value in OffsetManager, right?

Right, sorry - got confused. Typically it's the other way around, I think. But 
this will work too.


- Gwen


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29952/#review68395
---


On Jan. 16, 2015, 12:52 a.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29952/
> ---
> 
> (Updated Jan. 16, 2015, 12:52 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: kafka-1864
> https://issues.apache.org/jira/browse/kafka-1864
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> create offset topic with a larger replication factor by default
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> d626b1710813648524eefa5a3df098393c3e7743 
>   core/src/main/scala/kafka/server/KafkaConfig.scala 
> 6e26c5436feb4629d17f199011f3ebb674aa767f 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 43eb2a35bb54d32c66cdb94772df657b3a104d1a 
>   core/src/test/scala/integration/kafka/api/ProducerFailureHandlingTest.scala 
> 07a7beee9dec733eae943b425ae58c54f08458d8 
>   core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 
> 4a3a5b264a021e55c39f4d7424ce04ee591503ef 
> 
> Diff: https://reviews.apache.org/r/29952/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>



Re: kafka consumer

2015-01-15 Thread Guozhang Wang
Hi,

1. Not sure if I understand your question.. could you elaborate?
2. Yes, and then the data for that topic will be distributed at the
granularity of partitions to your consumers.
3. The default value is set to 60 seconds I believe. You can read the
config docs for its semantics here:

http://kafka.apache.org/documentation.html#brokerconfigs

Guozhang


On Sun, Dec 28, 2014 at 6:00 PM, panqing...@163.com 
wrote:

>
> HI,
>
> I recently learning Kafka, there are several problems
>
> 1. In ActiveMQ the broker pushes messages and a consumer receives them
> through a MessageListener, but in Kafka consumers pull messages from the
> broker. Do consumers poll the broker on a timer, or is there a way to
> build a listener on the broker?
> 2. I now have more than one consumer consuming the same topic; should I
> put them in the same group?
>
> 3. What should zookeeper.session.timeout.ms be set to? The Java example
> uses 400ms, but then "Unable to connect to zookeeper server within
> timeout: 400" appears.
>
>
> panqing...@163.com
>



-- 
-- Guozhang
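Guozhang's point 2 (the partitions of a topic are divided among the consumers of one group) can be illustrated with a round-robin assignment sketch (a simplification; the real rebalance algorithm is range-based and coordinated via ZooKeeper):

```python
def assign_partitions(partitions, consumers):
    """Distribute a topic's partitions across a group's consumers;
    each partition is owned by exactly one consumer in the group."""
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment
```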


Re: Review Request 29952: Patch for kafka-1864

2015-01-15 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29952/#review68406
---

Ship it!



core/src/main/scala/kafka/server/OffsetManager.scala


Although I think the documentation makes this clear, the default should 
probably be conservative, so this sounds reasonable.


- Joel Koshy


On Jan. 16, 2015, 12:52 a.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29952/
> ---
> 
> (Updated Jan. 16, 2015, 12:52 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: kafka-1864
> https://issues.apache.org/jira/browse/kafka-1864
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> create offset topic with a larger replication factor by default
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> d626b1710813648524eefa5a3df098393c3e7743 
>   core/src/main/scala/kafka/server/KafkaConfig.scala 
> 6e26c5436feb4629d17f199011f3ebb674aa767f 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 43eb2a35bb54d32c66cdb94772df657b3a104d1a 
>   core/src/test/scala/integration/kafka/api/ProducerFailureHandlingTest.scala 
> 07a7beee9dec733eae943b425ae58c54f08458d8 
>   core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 
> 4a3a5b264a021e55c39f4d7424ce04ee591503ef 
> 
> Diff: https://reviews.apache.org/r/29952/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>



Re: [DISCUSS] KIPs

2015-01-15 Thread Jay Kreps
Hey Joe,

Yeah I guess the question is what is the definition of major? I agree we
definitely don't want to generate a bunch of paperwork. We have enough
problems just getting all the contributions reviewed and checked in in a
timely fashion...

So obviously bug fixes would not apply here.

I think it is also pretty clear that big features should get reviewed and
discussed. To pick on myself, for example, the log compaction work was done
without enough public discussion about how it worked and why (imho). I
hope/claim that enough rigour in thinking about use-cases and operations
and so on was done that it turned out well, but the discussion was just
between a few people with no real public output. This kind of feature is
clearly a big change and something we should discuss.

If we limit ourselves to just the public contracts the KIP introduces the
discussion would just be on the new configs and monitoring without really a
discussion of the design and how it works which is obviously closely
related.

I don't think this should be more work because in practice we are making
wiki pages for any big thing anyway. So this would just be a consistent way
of numbering and structuring these pages. This would also give a clear call
to action: "hey kafka people, come read my wiki and think this through".

I actually think the voting aspect is less important. I think it is
generally clear when there is agreement on something and not. So from my
point of view we could actually just eliminate that part if that is too
formal, it just seemed like a good way to formally adopt something.

To address some of your comments from the wiki:

1. This doesn't inhibit someone coming along and putting up a patch. It is
just that when they do if it is a big thing introducing new functionality
we would ask for a little discussion on the basic feature/contracts prior
to code review.

2. We definitely definitely don't want people generating a lot of these
things every time they have an idea that they aren't going to implement. So
this is only applicable to things you absolutely will check in code for. We
also don't want to be making proposals before things are thought through,
which often requires writing the code. So I think the right time for a KIP
is when you are far enough along that you know the issues and tradeoffs but
not so far along that you are going to be totally opposed to any change.
Sometimes that is prior to writing any code and sometimes not until you are
practically done.

The key problem I see this fixing is that there is enough new development
happening that it is pretty hard for everyone to review every line of every
iteration of every patch. But all of us should be fully aware of new
features, the ramifications, the new public interfaces, etc. If we aren't
aware of that we can't really build a holistic system that is beautiful and
consistent across. So the idea is that if you fully review the KIPs you can
be sure that even if you don't know every new line of code, you know the
major changes coming in.

-Jay



On Thu, Jan 15, 2015 at 12:18 PM, Joe Stein  wrote:

> Thanks Jay for kicking this off! I think the confluence page you wrote up
> is a great start.
>
>
> The KIP makes sense to me (at a minimum) if there is going to be any
> "breaking change". This way Kafka can continue to grow and blossom and we
> have a process in place if we are going to release a thorn ... and when we
> do it is *CLEAR* about what and why that is. We can easily document which
> KIPs where involved with this release (which I think should get committed
> afterwards somewhere so no chance of edit after release). This approach I
> had been thinking about also allows changes to occur as they do now as long
> as they are backwards compatible.  Hopefully we never need a KIP but when
> we do the PMC can vote on it and folks can read the release notes with
> *CLEAR* understanding what is going to break their existing setup... at
> least that is how I have been thinking about it.
>
>
> Let me know what you think about this base minimum approach... I hadn't
> really thought of the KIP for *ANY* "major change" and I have to think more
> about that. I have some other comments for minor items in the confluence
> page I will make once I think more about how I feel having a KIP for more
> than what I was thinking about already.
>
>
> I do think we should have "major changes" go through confluence, mailing
> list discuss and JIRA but kind of feel we have been doing that already ...
> if there are cases where that isn't the case we should highlight and learn
> from them and formalize that more if need be.
>
>
> /***
>  Joe Stein
>  Founder, Principal Consultant
>  Big Data Open Source Security LLC
>  http://www.stealth.ly
>  Twitter: @allthingshadoop 
> /
>
> On Thu, Jan 15, 2015 at 1:42 PM, Jay Kreps  wrote:
>
> > The idea of KIPs 

Re: [kafka-clients] Re: Heads up: KAFKA-1697 - remove code related to ack>1 on the broker

2015-01-15 Thread Jay Kreps
This is a good case to discuss.

Let's figure the general case of how we want to handle errors and get that
documented in the protocol. The problem right now is that we give no
guidance on this. I actually thought Gwen's suggestion made sense on the
guidance we should have given which is that we will enumerate a set of
errors and their meaning for each API but it is possible that other errors
will occur and they should be handled (maybe poorly) in the same way
UNKNOWN_ERROR is handled which is our normal escape hatch for things like
OOMException.

I really do think we shouldn't be dogmatic here: In considering a change to
errors we should consider the potential ill-effect vs the complexity of yet
another protocol version.

In this case I actually am not sure we need to bump the protocol because
the whole point of the change was to make a setting we think doesn't make
sense break, right? Well this will break it. It seems like the only
downside is that older clients will get a bad error message instead of a
good one. But it isn't like we will have rendered a client unusable, it is
just that they will need to change their config.

-Jay
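The guidance described above (enumerate the documented errors per API, and map anything unrecognized onto the UNKNOWN_ERROR escape hatch instead of failing to parse) might look like this on the client side (error names and codes here are illustrative, not Kafka's actual error table):

```python
class KafkaError(Exception):
    """Base for errors the client understands."""

class UnknownError(KafkaError):
    """Escape hatch for codes this client version does not recognize."""

# Illustrative subset of an error table; a real client enumerates the
# documented codes for each API.
KNOWN_ERRORS = {
    1: 'OffsetOutOfRange',
    3: 'UnknownTopicOrPartition',
}

def check_error(code):
    """Raise a typed error for a response error code (0 means no error)."""
    if code == 0:
        return
    name = KNOWN_ERRORS.get(code)
    if name is not None:
        raise KafkaError(name)
    # A newer broker may return codes added after this client shipped;
    # degrade to the generic path rather than crashing on parse.
    raise UnknownError('unrecognized error code %d' % code)
```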

On Thu, Jan 15, 2015 at 6:14 PM, Gwen Shapira  wrote:

> I created a KIP for this suggestion:
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1+-+Remove+support+of+request.required.acks
>
> Basically documenting what was already discussed here.  Comments will
> be awesome!
>
> Gwen
>
> On Thu, Jan 15, 2015 at 5:19 PM, Gwen Shapira 
> wrote:
> > The errors are part of the KIP process now, so I think the clients are
> safe :)
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
> >
> > On Thu, Jan 15, 2015 at 5:12 PM, Steve Morin 
> wrote:
> >> Agree errors should be part of the protocol
> >>
> >>> On Jan 15, 2015, at 17:59, Gwen Shapira  wrote:
> >>>
> >>> Hi,
> >>>
> >>> I got convinced by Joe and Dana that errors are indeed part of the
> >>> protocol and can't be randomly added.
> >>>
> >>> So, it looks like we need to bump version of ProduceRequest in the
> >>> following way:
> >>> Version 0 -> accept acks >1. I think we should keep the existing
> >>> behavior too (i.e. not replace it with -1) to avoid surprising
> >>> clients, but I'm willing to hear other opinions.
> >>> Version 1 -> do not accept acks >1 and return an error.
> >>> Are we ok with the error I added in KAFKA-1697? We can use something
> >>> less specific like InvalidRequestParameter. This error can be reused
> >>> in the future and reduce the need to add errors, but will also be less
> >>> clear to the client and its users. Maybe even add the error message
> >>> string to the protocol in addition to the error code? (since we are
> >>> bumping versions)
> >>>
> >>> I think maintaining the old version throughout 0.8.X makes sense. IMO
> >>> dropping it for 0.9 is feasible, but I'll let client owners help make
> >>> that call.
> >>>
> >>> Am I missing anything? Should I start a KIP? It seems like a KIP-type
> >>> discussion :)
> >>>
> >>> Gwen
> >>>
> >>>
> >>> On Thu, Jan 15, 2015 at 2:31 PM, Ewen Cheslack-Postava
> >>>  wrote:
> Gwen,
>
> I think the only option that wouldn't require a protocol version change is
> the one where acks > 1 is converted to acks = -1, since it's the only one
> that doesn't potentially break older clients. The protocol guide says that
> the expected upgrade path is servers first, then clients, so old clients,
> including non-java clients, that may be using acks > 1 should be able to
> work with a new broker version.
>
> It's more work, but I think dealing with the protocol change is the right
> thing to do, since it eventually gets us to the behavior I think is better --
> the broker should reject requests with invalid values. I think Joe and I
> were basically in agreement. In my mind the major piece missing from his
> description is how long we're going to maintain his "case 0" behavior. It's
> impractical to maintain old versions forever, but it sounds like there
> hasn't been a decision on how long to maintain them. Maybe that's another
> item to add to KIPs -- protocol versions and behavior need to be listed as
> deprecated, and the earliest version in which they'll be removed should be
> specified, so users can understand which versions are guaranteed to be
> compatible, even if they're using (well-written) non-java clients.
>
> -Ewen
>
> > On Thu, Jan 15, 2015 at 12:52 PM, Dana Powers  wrote:
> >
> >> clients don't break on unknown errors
> >
> > maybe true for the official java clients, but I don't think the assumption
> > holds true for community-maintained clients and users of those clients.
> > kafka-python generally follows the fail-fast philosophy and raises an
> > exception on any unrecognized error code in any server response.  in this
>
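The two options discussed in this thread (silently coercing a legacy acks > 1 to acks = -1 for old request versions, versus rejecting invalid values outright on new versions) can be sketched as follows. This is an illustrative sketch only; the class, method names, and error value are hypothetical and not the actual Kafka broker code:

```java
// Hypothetical sketch of broker-side handling of the produce request acks
// field. Neither the class nor the error handling is real Kafka code.
final class AcksPolicy {

    /** Option A: coerce legacy acks > 1 to -1 ("wait for all replicas").
     *  This keeps already-deployed clients working without a version bump. */
    static short coerce(short acks) {
        return acks > 1 ? (short) -1 : acks;
    }

    /** Option B: reject anything outside {-1, 0, 1}. This requires a request
     *  version bump so only new-version requests get the new error. */
    static short validate(short acks) {
        if (acks < -1 || acks > 1) {
            throw new IllegalArgumentException(
                "required.acks must be -1, 0, or 1; got " + acks);
        }
        return acks;
    }
}
```

Option A preserves compatibility for old non-java clients; option B gets the broker to the stricter behavior Ewen argues for, at the cost of the protocol version change.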

[jira] [Updated] (KAFKA-1674) auto.create.topics.enable docs are misleading

2015-01-15 Thread Manikumar Reddy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar Reddy updated KAFKA-1674:
---
Fix Version/s: 0.8.2

This is a simple documentation correction. We can push this to 0.8.2.

> auto.create.topics.enable docs are misleading
> -
>
> Key: KAFKA-1674
> URL: https://issues.apache.org/jira/browse/KAFKA-1674
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.1
>Reporter: Stevo Slavic
>Priority: Minor
>  Labels: newbie
> Fix For: 0.8.2
>
>
> {{auto.create.topics.enable}} is currently 
> [documented|http://kafka.apache.org/08/configuration.html] with
> {quote}
> Enable auto creation of topic on the server. If this is set to true then 
> attempts to produce, consume, or fetch metadata for a non-existent topic will 
> automatically create it with the default replication factor and number of 
> partitions.
> {quote}
> In reality, in Kafka 0.8.1.1 topics are only created when trying to publish a 
> message to a non-existent topic.
> After 
> [discussion|http://mail-archives.apache.org/mod_mbox/kafka-users/201410.mbox/%3CCAFbh0Q1WXLUDO-im1fQ1yEvrMduxmXbj5HXVc3Cq8B%3DfeMso9g%40mail.gmail.com%3E]
>  with [~junrao] the conclusion was that this is a documentation issue which 
> needs to be fixed.
> Before fixing the docs, please check once more whether this really is just 
> non-working functionality.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1333) Add consumer co-ordinator module to the server

2015-01-15 Thread Andrii Biletskyi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279930#comment-14279930
 ] 

Andrii Biletskyi commented on KAFKA-1333:
-

[~guozhang] many thanks for such informative update!

> Add consumer co-ordinator module to the server
> --
>
> Key: KAFKA-1333
> URL: https://issues.apache.org/jira/browse/KAFKA-1333
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Affects Versions: 0.9.0
>Reporter: Neha Narkhede
>Assignee: Guozhang Wang
>
> Scope of this JIRA is to just add a consumer co-ordinator module that does the 
> following:
> 1) coordinator start-up, metadata initialization
> 2) simple join-group handling (just updating metadata, no failure detection / 
> rebalancing): this should be sufficient for consumers doing their own offset / 
> partition management.
> Offset manager will still run side-by-side with the coordinator in this JIRA, 
> and we will merge it in KAFKA-1740.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1804) Kafka network thread lacks top exception handler

2015-01-15 Thread Alexey Ozeritskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279396#comment-14279396
 ] 

Alexey Ozeritskiy edited comment on KAFKA-1804 at 1/16/15 7:44 AM:
---

We've written a simple patch for kafka-network-thread:
{code:scala}
  override def run(): Unit = {
    try {
      iteration() // = the original run()
    } catch {
      case e: Throwable =>
        error("ERROR IN NETWORK THREAD: %s".format(e), e)
        Runtime.getRuntime.halt(1)
    }
  }
{code}
and got the following trace:
{code}
[2015-01-15 23:04:08,537] ERROR ERROR IN NETWORK THREAD: 
java.util.NoSuchElementException: None.get (kafka.network.Processor)
java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:313)
at scala.None$.get(Option.scala:311)
at kafka.network.ConnectionQuotas.dec(SocketServer.scala:544)
at kafka.network.AbstractServerThread.close(SocketServer.scala:165)
at kafka.network.AbstractServerThread.close(SocketServer.scala:157)
at kafka.network.Processor.close(SocketServer.scala:394)
at kafka.network.Processor.processNewResponses(SocketServer.scala:426)
at kafka.network.Processor.iteration(SocketServer.scala:328)
at kafka.network.Processor.run(SocketServer.scala:381)
at java.lang.Thread.run(Thread.java:745)
{code}


was (Author: aozeritsky):
We've written a simple patch for kafka-network-thread:
{code:scala}
  override def run(): Unit = {
    try {
      original_run()
    } catch {
      case e: Throwable =>
        error("ERROR IN NETWORK THREAD: %s".format(e), e)
        Runtime.getRuntime.halt(1)
    }
  }
{code}
and got the following trace:
{code}
[2015-01-15 23:04:08,537] ERROR ERROR IN NETWORK THREAD: 
java.util.NoSuchElementException: None.get (kafka.network.Processor)
java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:313)
at scala.None$.get(Option.scala:311)
at kafka.network.ConnectionQuotas.dec(SocketServer.scala:544)
at kafka.network.AbstractServerThread.close(SocketServer.scala:165)
at kafka.network.AbstractServerThread.close(SocketServer.scala:157)
at kafka.network.Processor.close(SocketServer.scala:394)
at kafka.network.Processor.processNewResponses(SocketServer.scala:426)
at kafka.network.Processor.iteration(SocketServer.scala:328)
at kafka.network.Processor.run(SocketServer.scala:381)
at java.lang.Thread.run(Thread.java:745)
{code}

> Kafka network thread lacks top exception handler
> 
>
> Key: KAFKA-1804
> URL: https://issues.apache.org/jira/browse/KAFKA-1804
> Project: Kafka
>  Issue Type: Bug
>Reporter: Oleg Golovin
>
> We have faced a problem where some Kafka network threads may fail, so that 
> jstack attached to the Kafka process showed fewer threads than we had defined 
> in our Kafka configuration. This leads to API requests processed by such a 
> thread getting stuck without a response.
> There were no error messages in the log regarding the thread failure.
> We have examined the Kafka code and found that there is no top-level try-catch 
> block in the network thread code, which could at least log possible errors.
> Could you add a top-level try-catch block for the network thread, which should 
> recover the network thread in case of an exception?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
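The guard requested here, and implemented by the Scala patch in the comment above, amounts to wrapping the thread's run loop in a catch-all handler. A minimal Java sketch; the class name and the pluggable handler are illustrative (a real broker thread would log the error and call Runtime.getRuntime().halt(1), as the patch does):

```java
import java.util.function.Consumer;

// Minimal sketch of a top-level exception guard for a long-running worker
// thread. The handler hook is illustrative, not actual Kafka code.
class GuardedRunnable implements Runnable {
    private final Runnable work;               // the original run() body
    private final Consumer<Throwable> onFatal; // e.g. log + halt in a broker

    GuardedRunnable(Runnable work, Consumer<Throwable> onFatal) {
        this.work = work;
        this.onFatal = onFatal;
    }

    @Override
    public void run() {
        try {
            work.run();
        } catch (Throwable t) {
            // Without this catch-all, an unchecked exception silently kills
            // the thread, which is exactly the symptom reported in this issue.
            onFatal.accept(t);
        }
    }
}
```

Wrapping every network thread's runnable this way guarantees that a fatal error is at least logged (or the process halted) instead of the thread disappearing quietly.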