[jira] [Commented] (KAFKA-689) Can't append to a topic/partition that does not already exist

2013-01-10 Thread ben fleis (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549500#comment-13549500
 ] 

ben fleis commented on KAFKA-689:
-

Although it's not precisely the same, perhaps thinking about topic|partition as 
a remote file open is a useful metaphor.  An open() call is where you would set 
the normal open params (flush interval, O_CREAT, etc.), and stat() is where you 
get broker and other real-time updates.  Of course, if create is explicit, 
where does delete come into play?

@Jun - I don't see anything about topic creation in server.properties. And 
further, does an extra RPC matter if it only happens during setup or 
periodically?

> Can't append to a topic/partition that does not already exist
> -
>
> Key: KAFKA-689
> URL: https://issues.apache.org/jira/browse/KAFKA-689
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8
>Reporter: David Arthur
> Attachments: kafka.log, produce-payload.bin
>
>
> With a totally fresh Kafka (empty logs dir and empty ZK), if I send a 
> ProduceRequest for a new topic, Kafka responds with 
> "kafka.common.UnknownTopicOrPartitionException: Topic test partition 0 
> doesn't exist on 0". This is when sending a ProduceRequest over the network 
> (from Python, in this case).
> If I use the console producer it works fine (topic and partition get 
> created). If I then send the same payload from before over the network, it 
> works.



[jira] [Commented] (KAFKA-133) Publish kafka jar to a public maven repository

2013-01-10 Thread ben fleis (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549515#comment-13549515
 ] 

ben fleis commented on KAFKA-133:
-

I'm new to maven, so this may be a dumb question: would it be reasonable/easy 
to publish nightly jars (via Apache?) under the 0.8.0-SNAPSHOT tag?  I have an 
automated build system that currently clones the git repo and builds the whole 
thing, which I would love to replace with simple jar files.
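
For illustration, a minimal sbt sketch of what consuming such a nightly might 
look like, assuming the jars were published to the Apache snapshots repository 
(the resolver URL and artifact coordinates below are assumptions, not a 
published artifact):

  // Hypothetical sbt settings: the resolver and coordinates are assumptions.
  resolvers += "Apache Snapshots" at "https://repository.apache.org/content/repositories/snapshots"
  libraryDependencies += "org.apache.kafka" % "kafka_2.8.0" % "0.8.0-SNAPSHOT"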

> Publish kafka jar to a public maven repository
> --
>
> Key: KAFKA-133
> URL: https://issues.apache.org/jira/browse/KAFKA-133
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.6, 0.8
>Reporter: Neha Narkhede
>  Labels: patch
> Fix For: 0.8
>
> Attachments: KAFKA-133.patch, pom.xml
>
>
> The released kafka jar must be downloaded manually and then deployed to a 
> private repository before it can be used by a developer using maven2.
> Similar to other Apache projects, it would be nice to have a way to publish 
> Kafka releases to a public maven repo.
> In the past, we gave it a try using sbt publish to the Sonatype Nexus maven 
> repo, but ran into some authentication problems. It would be good to revisit 
> this and get it resolved.



[jira] [Updated] (KAFKA-692) ConsoleConsumer outputs diagnostic message to stdout instead of stderr

2013-01-10 Thread ben fleis (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ben fleis updated KAFKA-692:


Status: Patch Available  (was: Open)

from stdout -> stderr

> ConsoleConsumer outputs diagnostic message to stdout instead of stderr
> --
>
> Key: KAFKA-692
> URL: https://issues.apache.org/jira/browse/KAFKA-692
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8
>Reporter: ben fleis
>Priority: Minor
> Fix For: 0.8
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> At the end of its handling loop, ConsoleConsumer prints "Consumed %d 
> messages" to standard out.  Clients that use custom formatters and read this 
> output shouldn't need to special-case this line or accept a parse error.
> It should instead go (as all diagnostics should) to stderr.
> Patch attached.



[jira] [Created] (KAFKA-692) ConsoleConsumer outputs diagnostic message to stdout instead of stderr

2013-01-10 Thread ben fleis (JIRA)
ben fleis created KAFKA-692:
---

 Summary: ConsoleConsumer outputs diagnostic message to stdout 
instead of stderr
 Key: KAFKA-692
 URL: https://issues.apache.org/jira/browse/KAFKA-692
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 0.8
Reporter: ben fleis
Priority: Minor
 Fix For: 0.8


At the end of its handling loop, ConsoleConsumer prints "Consumed %d messages" 
to standard out.  Clients that use custom formatters and read this output 
shouldn't need to special-case this line or accept a parse error.

It should instead go (as all diagnostics should) to stderr.

Patch attached.
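
The change being proposed is essentially a one-line redirect; a minimal sketch 
of the idea (the method below is illustrative, not the actual ConsoleConsumer 
code):

  // Illustrative sketch: send the diagnostic line to stderr so that stdout
  // stays parseable by downstream formatters.
  def reportConsumed(numMessages: Long): Unit =
    System.err.println("Consumed %d messages".format(numMessages))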



[jira] [Updated] (KAFKA-692) ConsoleConsumer outputs diagnostic message to stdout instead of stderr

2013-01-10 Thread ben fleis (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ben fleis updated KAFKA-692:


Attachment: kafka_692_v1.diff

stdout -> stderr

> ConsoleConsumer outputs diagnostic message to stdout instead of stderr
> --
>
> Key: KAFKA-692
> URL: https://issues.apache.org/jira/browse/KAFKA-692
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8
>Reporter: ben fleis
>Priority: Minor
> Fix For: 0.8
>
> Attachments: kafka_692_v1.diff
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> At the end of its handling loop, ConsoleConsumer prints "Consumed %d 
> messages" to standard out.  Clients that use custom formatters and read this 
> output shouldn't need to special-case this line or accept a parse error.
> It should instead go (as all diagnostics should) to stderr.
> Patch attached.



[jira] [Issue Comment Deleted] (KAFKA-692) ConsoleConsumer outputs diagnostic message to stdout instead of stderr

2013-01-10 Thread ben fleis (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ben fleis updated KAFKA-692:


Comment: was deleted

(was: from stdout -> stderr)

> ConsoleConsumer outputs diagnostic message to stdout instead of stderr
> --
>
> Key: KAFKA-692
> URL: https://issues.apache.org/jira/browse/KAFKA-692
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8
>Reporter: ben fleis
>Priority: Minor
> Fix For: 0.8
>
> Attachments: kafka_692_v1.diff
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> At the end of its handling loop, ConsoleConsumer prints "Consumed %d 
> messages" to standard out.  Clients that use custom formatters and read this 
> output shouldn't need to special-case this line or accept a parse error.
> It should instead go (as all diagnostics should) to stderr.
> Patch attached.



[jira] [Updated] (KAFKA-691) Fault tolerance broken with replication factor 1

2013-01-10 Thread Maxime Brugidou (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxime Brugidou updated KAFKA-691:
--

Attachment: KAFKA-691-v1.patch

Here is a first draft (v1) patch.

1. Added the producer property "producer.metadata.refresh.interval.ms", which 
defaults to 600000 (10 min).

2. The metadata is refreshed every 10 min (only if a message is sent), and the 
set of topics to refresh is tracked in the topicMetadataToRefresh Set (cleared 
after every refresh). I think the added value of refreshing regardless of 
partition availability is to detect new partitions.

3. The good news is that I didn't touch the Partitioner API; I only changed the 
code to use available partitions if the key is null (as suggested by Jun), as 
sketched below. It will also throw an UnknownTopicOrPartitionException("No 
leader for any partition") if no partition is available at all.

Let me know what you think about this patch. I ran a producer with that code 
successfully and tested with a broker down.

I now have some concerns about the consumer: the refresh.leader.backoff.ms 
config could help me (if I increase it to, say, 10 min), but the rebalance 
fails in any case since there is no leader for some partitions.

I don't have a good workaround for that yet; any help/suggestions appreciated.

> Fault tolerance broken with replication factor 1
> 
>
> Key: KAFKA-691
> URL: https://issues.apache.org/jira/browse/KAFKA-691
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8
>Reporter: Jay Kreps
> Attachments: KAFKA-691-v1.patch
>
>
> In 0.7, if a partition was down we would just send the message elsewhere. This 
> meant that the partitioning was really more of a "stickiness" than a hard 
> guarantee. This made it impossible to depend on it for partitioned, stateful 
> processing.
> In 0.8, when running with replication, this should not generally be a problem, 
> as partitions are now highly available and fail over to other replicas. 
> However, in the case of replication factor = 1 this no longer really works for 
> most cases, since a dead broker will give errors for that broker's partitions.
> I am not sure of the best fix. Intuitively I think this is something that 
> should be handled by the Partitioner interface. However, currently the 
> partitioner has no knowledge of which nodes are available. So you could use a 
> random partitioner, but that would keep going back to the down node.



[jira] [Commented] (KAFKA-683) Fix correlation ids in all requests sent to kafka

2013-01-10 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549792#comment-13549792
 ] 

Jun Rao commented on KAFKA-683:
---

Patch v2 doesn't seem to apply on 0.8. Could you rebase?

For 6, I didn't mean to remove the trace logging in RequestChannel. What I 
meant is that we already print out requestObj which includes every field in a 
request. So, there is no need to explicitly print out clientid, correlationid 
and versionid.

> Fix correlation ids in all requests sent to kafka
> -
>
> Key: KAFKA-683
> URL: https://issues.apache.org/jira/browse/KAFKA-683
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
>Priority: Critical
>  Labels: improvement, replication
> Attachments: kafka-683-v1.patch, kafka-683-v2.patch
>
>
> We should fix the correlation ids in every request sent to Kafka and fix the 
> request log on the broker to specify not only the type of request and who 
> sent it, but also the correlation id. This will be very helpful while 
> troubleshooting problems in production.



[jira] [Commented] (KAFKA-691) Fault tolerance broken with replication factor 1

2013-01-10 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549842#comment-13549842
 ] 

Jun Rao commented on KAFKA-691:
---

Thanks for the patch. Overall, the patch is pretty good and is well thought 
out. Some comments:

1. DefaultEventHandler:
1.1 In handle(), I don't think we need to add the if test in the following 
statement. The reason is that a message could fail to be sent because the 
leader changes immediately after the previous metadata refresh. Normally, 
leaders are elected very quickly. So, it makes sense to refresh the metadata 
again.
  if (topicMetadataToRefresh.nonEmpty)
    Utils.swallowError(brokerPartitionInfo.updateInfo(outstandingProduceRequests.map(_.topic).toSet))

1.2 In handle(), it seems that it's better to call the following code before 
dispatchSerializedData().

  if (topicMetadataRefreshInterval >= 0 &&
      SystemTime.milliseconds - lastTopicMetadataRefresh > topicMetadataRefreshInterval) {
    Utils.swallowError(brokerPartitionInfo.updateInfo(topicMetadataToRefresh.toSet))
    topicMetadataToRefresh.clear
    lastTopicMetadataRefresh = SystemTime.milliseconds
  }
1.3 getPartition(): If none of the partitions is available, we should throw 
LeaderNotAvailableException, instead of UnknownTopicOrPartitionException.

2. DefaultPartitioner: Since key is not expected to be null, we should remove 
the code that deals with null key. 

3. The consumer side logic is fine. The consumer rebalance is only triggered 
when there are changes in partitions, not when there are changes in the 
availability of the partition. The rebalance logic doesn't depend on a 
partition being available. If a partition is not available, 
ConsumerFetcherManager will keep refreshing metadata. If you have a replication 
factor of 1, you will need to set a larger refresh.leader.backoff.ms, if a 
broker is expected to go down for a long time. 

> Fault tolerance broken with replication factor 1
> 
>
> Key: KAFKA-691
> URL: https://issues.apache.org/jira/browse/KAFKA-691
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8
>Reporter: Jay Kreps
> Attachments: KAFKA-691-v1.patch
>
>
> In 0.7, if a partition was down we would just send the message elsewhere. This 
> meant that the partitioning was really more of a "stickiness" than a hard 
> guarantee. This made it impossible to depend on it for partitioned, stateful 
> processing.
> In 0.8, when running with replication, this should not generally be a problem, 
> as partitions are now highly available and fail over to other replicas. 
> However, in the case of replication factor = 1 this no longer really works for 
> most cases, since a dead broker will give errors for that broker's partitions.
> I am not sure of the best fix. Intuitively I think this is something that 
> should be handled by the Partitioner interface. However, currently the 
> partitioner has no knowledge of which nodes are available. So you could use a 
> random partitioner, but that would keep going back to the down node.



[jira] [Updated] (KAFKA-691) Fault tolerance broken with replication factor 1

2013-01-10 Thread Maxime Brugidou (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxime Brugidou updated KAFKA-691:
--

Attachment: KAFKA-691-v2.patch

Thanks for your feedback. I updated it (v2) according to your notes (1. and 2.).

For 3, I believe you are right, except that:
3.1 It seems (correct me if I'm wrong) that a rebalance happens at consumer 
initialization, so a consumer can't start if a broker is down.
3.2 Can a rebalance be triggered when a partition is added or moved? Having a 
broker down shouldn't prevent me from reassigning or adding partitions.


> Fault tolerance broken with replication factor 1
> 
>
> Key: KAFKA-691
> URL: https://issues.apache.org/jira/browse/KAFKA-691
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8
>Reporter: Jay Kreps
> Attachments: KAFKA-691-v1.patch, KAFKA-691-v2.patch
>
>
> In 0.7, if a partition was down we would just send the message elsewhere. This 
> meant that the partitioning was really more of a "stickiness" than a hard 
> guarantee. This made it impossible to depend on it for partitioned, stateful 
> processing.
> In 0.8, when running with replication, this should not generally be a problem, 
> as partitions are now highly available and fail over to other replicas. 
> However, in the case of replication factor = 1 this no longer really works for 
> most cases, since a dead broker will give errors for that broker's partitions.
> I am not sure of the best fix. Intuitively I think this is something that 
> should be handled by the Partitioner interface. However, currently the 
> partitioner has no knowledge of which nodes are available. So you could use a 
> random partitioner, but that would keep going back to the down node.



[jira] [Commented] (KAFKA-689) Can't append to a topic/partition that does not already exist

2013-01-10 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549888#comment-13549888
 ] 

Jun Rao commented on KAFKA-689:
---

In KafkaConfig, we have a property auto.create.topics. We probably need to keep 
this feature so that an admin can choose to only allow topics created through 
admin tools.

The extra RPC is not a big deal. 

> Can't append to a topic/partition that does not already exist
> -
>
> Key: KAFKA-689
> URL: https://issues.apache.org/jira/browse/KAFKA-689
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8
>Reporter: David Arthur
> Attachments: kafka.log, produce-payload.bin
>
>
> With a totally fresh Kafka (empty logs dir and empty ZK), if I send a 
> ProduceRequest for a new topic, Kafka responds with 
> "kafka.common.UnknownTopicOrPartitionException: Topic test partition 0 
> doesn't exist on 0". This is when sending a ProduceRequest over the network 
> (from Python, in this case).
> If I use the console producer it works fine (topic and partition get 
> created). If I then send the same payload from before over the network, it 
> works.



[jira] [Updated] (KAFKA-683) Fix correlation ids in all requests sent to kafka

2013-01-10 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-683:


Attachment: kafka-683-v2-rebased.patch

I see, I misunderstood your suggestion then. The reason I added the version and 
correlation id information explicitly is to make it easier to trace requests 
through the socket server to the request handler. In fact, I think logging the 
entire request might not be very useful if we do a good job of logging the 
correlation id and request type properly throughout our codebase. My plan was 
to remove it after we are sufficiently satisfied with troubleshooting problems 
just based on the correlation id wiring that we have currently. I will do a 
follow-up to remove the logging of the request after that. Does that work for 
you?

> Fix correlation ids in all requests sent to kafka
> -
>
> Key: KAFKA-683
> URL: https://issues.apache.org/jira/browse/KAFKA-683
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
>Priority: Critical
>  Labels: improvement, replication
> Attachments: kafka-683-v1.patch, kafka-683-v2.patch, 
> kafka-683-v2-rebased.patch
>
>
> We should fix the correlation ids in every request sent to Kafka and fix the 
> request log on the broker to specify not only the type of request and who 
> sent it, but also the correlation id. This will be very helpful while 
> troubleshooting problems in production.



[jira] Subscription: outstanding kafka patches

2013-01-10 Thread jira
Issue Subscription
Filter: outstanding kafka patches (57 issues)
The list of outstanding kafka patches
Subscriber: kafka-mailing-list

Key Summary
KAFKA-692   ConsoleConsumer outputs diagnostic message to stdout instead of 
stderr
https://issues.apache.org/jira/browse/KAFKA-692
KAFKA-691   Fault tolerance broken with replication factor 1
https://issues.apache.org/jira/browse/KAFKA-691
KAFKA-688   System Test - Update all testcase__properties.json for 
properties keys uniform naming convention
https://issues.apache.org/jira/browse/KAFKA-688
KAFKA-682   java.lang.OutOfMemoryError: Java heap space
https://issues.apache.org/jira/browse/KAFKA-682
KAFKA-677   Retention process gives exception if an empty segment is chosen for 
collection
https://issues.apache.org/jira/browse/KAFKA-677
KAFKA-674   Clean Shutdown Testing - Log segments checksums mismatch
https://issues.apache.org/jira/browse/KAFKA-674
KAFKA-651   Create testcases on auto create topics
https://issues.apache.org/jira/browse/KAFKA-651
KAFKA-648   Use uniform convention for naming properties keys 
https://issues.apache.org/jira/browse/KAFKA-648
KAFKA-645   Create a shell script to run System Test with DEBUG details and 
"tee" console output to a file
https://issues.apache.org/jira/browse/KAFKA-645
KAFKA-637   Separate log4j environment variable from KAFKA_OPTS in 
kafka-run-class.sh
https://issues.apache.org/jira/browse/KAFKA-637
KAFKA-621   System Test 9051 : ConsoleConsumer doesn't receives any data for 20 
topics but works for 10
https://issues.apache.org/jira/browse/KAFKA-621
KAFKA-607   System Test Transient Failure (case 4011 Log Retention) - 
ConsoleConsumer receives less data
https://issues.apache.org/jira/browse/KAFKA-607
KAFKA-606   System Test Transient Failure (case 0302 GC Pause) - Log segments 
mismatched across replicas
https://issues.apache.org/jira/browse/KAFKA-606
KAFKA-604   Add missing metrics in 0.8
https://issues.apache.org/jira/browse/KAFKA-604
KAFKA-598   decouple fetch size from max message size
https://issues.apache.org/jira/browse/KAFKA-598
KAFKA-583   SimpleConsumerShell may receive less data inconsistently
https://issues.apache.org/jira/browse/KAFKA-583
KAFKA-552   No error messages logged for those failing-to-send messages from 
Producer
https://issues.apache.org/jira/browse/KAFKA-552
KAFKA-547   The ConsumerStats MBean name should include the groupid
https://issues.apache.org/jira/browse/KAFKA-547
KAFKA-530   kafka.server.KafkaApis: kafka.common.OffsetOutOfRangeException
https://issues.apache.org/jira/browse/KAFKA-530
KAFKA-493   High CPU usage on inactive server
https://issues.apache.org/jira/browse/KAFKA-493
KAFKA-479   ZK EPoll taking 100% CPU usage with Kafka Client
https://issues.apache.org/jira/browse/KAFKA-479
KAFKA-465   Performance test scripts - refactoring leftovers from tools to perf 
package
https://issues.apache.org/jira/browse/KAFKA-465
KAFKA-438   Code cleanup in MessageTest
https://issues.apache.org/jira/browse/KAFKA-438
KAFKA-419   Updated PHP client library to support kafka 0.7+
https://issues.apache.org/jira/browse/KAFKA-419
KAFKA-414   Evaluate mmap-based writes for Log implementation
https://issues.apache.org/jira/browse/KAFKA-414
KAFKA-411   Message Error in high cocurrent environment
https://issues.apache.org/jira/browse/KAFKA-411
KAFKA-404   When using chroot path, create chroot on startup if it doesn't exist
https://issues.apache.org/jira/browse/KAFKA-404
KAFKA-399   0.7.1 seems to show less performance than 0.7.0
https://issues.apache.org/jira/browse/KAFKA-399
KAFKA-398   Enhance SocketServer to Enable Sending Requests
https://issues.apache.org/jira/browse/KAFKA-398
KAFKA-397   kafka.common.InvalidMessageSizeException: null
https://issues.apache.org/jira/browse/KAFKA-397
KAFKA-388   Add a highly available consumer co-ordinator to a Kafka cluster
https://issues.apache.org/jira/browse/KAFKA-388
KAFKA-346   Don't call commitOffsets() during rebalance
https://issues.apache.org/jira/browse/KAFKA-346
KAFKA-345   Add a listener to ZookeeperConsumerConnector to get notified on 
rebalance events
https://issues.apache.org/jira/browse/KAFKA-345
KAFKA-319   compression support added to php client does not pass unit tests
https://issues.apache.org/jira/browse/KAFKA-319
KAFKA-318   update zookeeper dependency to 3.3.5
https://issues.apache.org/jira/browse/KAFKA-318
KAFKA-314   Go Client Multi-produce
https://issues.apache.org/jira/browse/KAFKA-314
KAFKA-313   Add JSON output and looping options to ConsumerOffsetChecker
https://issues.apache.org/jira/bro

[jira] [Resolved] (KAFKA-691) Fault tolerance broken with replication factor 1

2013-01-10 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-691.
---

   Resolution: Fixed
Fix Version/s: 0.8
 Assignee: Maxime Brugidou

Thanks for patch v2. Committed to 0.8 by renaming lastTopicMetadataRefresh to 
lastTopicMetadataRefreshTime and removing an unused comment.

3.1 Rebalance happens during consumer initialization. It only needs the 
partition data to be in ZK and doesn't require all brokers to be up. Of course, 
if a broker is not up, the consumer may not be able to consume data from it. 
ConsumerFetcherManager is responsible for checking if a partition becomes 
available again.

3.2 If the partition path changes in ZK, a rebalance will be triggered.

> Fault tolerance broken with replication factor 1
> 
>
> Key: KAFKA-691
> URL: https://issues.apache.org/jira/browse/KAFKA-691
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8
>Reporter: Jay Kreps
>Assignee: Maxime Brugidou
> Fix For: 0.8
>
> Attachments: KAFKA-691-v1.patch, KAFKA-691-v2.patch
>
>
> In 0.7, if a partition was down we would just send the message elsewhere. This 
> meant that the partitioning was really more of a "stickiness" than a hard 
> guarantee. This made it impossible to depend on it for partitioned, stateful 
> processing.
> In 0.8, when running with replication, this should not generally be a problem, 
> as partitions are now highly available and fail over to other replicas. 
> However, in the case of replication factor = 1 this no longer really works for 
> most cases, since a dead broker will give errors for that broker's partitions.
> I am not sure of the best fix. Intuitively I think this is something that 
> should be handled by the Partitioner interface. However, currently the 
> partitioner has no knowledge of which nodes are available. So you could use a 
> random partitioner, but that would keep going back to the down node.



[jira] [Commented] (KAFKA-683) Fix correlation ids in all requests sent to kafka

2013-01-10 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549953#comment-13549953
 ] 

Jun Rao commented on KAFKA-683:
---

I agree that seeing the whole request is not useful, especially Message, since 
it's binary. Maybe what we should do is to fix the toString() method in each 
request so that it only prints out meaningful info. This can be fixed in a 
separate jira. One benefit of this is that we can remove the extra 
deserialization of those 3 special fields in RequestChannel, which depends on 
all requests having those 3 fields in the same order and seems brittle.

> Fix correlation ids in all requests sent to kafka
> -
>
> Key: KAFKA-683
> URL: https://issues.apache.org/jira/browse/KAFKA-683
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
>Priority: Critical
>  Labels: improvement, replication
> Attachments: kafka-683-v1.patch, kafka-683-v2.patch, 
> kafka-683-v2-rebased.patch
>
>
> We should fix the correlation ids in every request sent to Kafka and fix the 
> request log on the broker to specify not only the type of request and who 
> sent it, but also the correlation id. This will be very helpful while 
> troubleshooting problems in production.



[jira] [Commented] (KAFKA-683) Fix correlation ids in all requests sent to kafka

2013-01-10 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549967#comment-13549967
 ] 

Neha Narkhede commented on KAFKA-683:
-

That is a good suggestion. However, that still doesn't get rid of the 3 special 
fields from RequestChannel since we don't have access to the individual request 
objects there unless we cast/convert, which I thought was unnecessary.

> Fix correlation ids in all requests sent to kafka
> -
>
> Key: KAFKA-683
> URL: https://issues.apache.org/jira/browse/KAFKA-683
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
>Priority: Critical
>  Labels: improvement, replication
> Attachments: kafka-683-v1.patch, kafka-683-v2.patch, 
> kafka-683-v2-rebased.patch
>
>
> We should fix the correlation ids in every request sent to Kafka and fix the 
> request log on the broker to specify not only the type of request and who 
> sent it, but also the correlation id. This will be very helpful while 
> troubleshooting problems in production.



[jira] [Commented] (KAFKA-691) Fault tolerance broken with replication factor 1

2013-01-10 Thread Maxime Brugidou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550338#comment-13550338
 ] 

Maxime Brugidou commented on KAFKA-691:
---

Thanks for committing the patch.

3.1 Are you sure that the rebalance doesn't require all partitions to have a 
leader? My experience earlier today was that the rebalance would fail and throw 
ConsumerRebalanceFailedException after having stopped all fetchers and cleared 
all queues. If you are sure, then I'll try to reproduce the behavior I 
encountered, and maybe open a separate JIRA?

> Fault tolerance broken with replication factor 1
> 
>
> Key: KAFKA-691
> URL: https://issues.apache.org/jira/browse/KAFKA-691
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8
>Reporter: Jay Kreps
>Assignee: Maxime Brugidou
> Fix For: 0.8
>
> Attachments: KAFKA-691-v1.patch, KAFKA-691-v2.patch
>
>
> In 0.7, if a partition was down we would just send the message elsewhere. This 
> meant that the partitioning was really more of a "stickiness" than a hard 
> guarantee. This made it impossible to depend on it for partitioned, stateful 
> processing.
> In 0.8, when running with replication, this should not generally be a problem, 
> as partitions are now highly available and fail over to other replicas. 
> However, in the case of replication factor = 1 this no longer really works for 
> most cases, since a dead broker will give errors for that broker's partitions.
> I am not sure of the best fix. Intuitively I think this is something that 
> should be handled by the Partitioner interface. However, currently the 
> partitioner has no knowledge of which nodes are available. So you could use a 
> random partitioner, but that would keep going back to the down node.



[jira] [Commented] (KAFKA-691) Fault tolerance broken with replication factor 1

2013-01-10 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550450#comment-13550450
 ] 

Jun Rao commented on KAFKA-691:
---

3.1 It shouldn't. However, if you can reproduce this problem, please file a new 
jira.

> Fault tolerance broken with replication factor 1
> 
>
> Key: KAFKA-691
> URL: https://issues.apache.org/jira/browse/KAFKA-691
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8
>Reporter: Jay Kreps
>Assignee: Maxime Brugidou
> Fix For: 0.8
>
> Attachments: KAFKA-691-v1.patch, KAFKA-691-v2.patch
>
>
> In 0.7, if a partition was down we would just send the message elsewhere. This 
> meant that the partitioning was really more of a "stickiness" than a hard 
> guarantee. This made it impossible to depend on it for partitioned, stateful 
> processing.
> In 0.8, when running with replication, this should not generally be a problem, 
> as partitions are now highly available and fail over to other replicas. 
> However, in the case of replication factor = 1 this no longer really works for 
> most cases, since a dead broker will give errors for that broker's partitions.
> I am not sure of the best fix. Intuitively I think this is something that 
> should be handled by the Partitioner interface. However, currently the 
> partitioner has no knowledge of which nodes are available. So you could use a 
> random partitioner, but that would keep going back to the down node.



[jira] [Commented] (KAFKA-683) Fix correlation ids in all requests sent to kafka

2013-01-10 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550453#comment-13550453
 ] 

Jun Rao commented on KAFKA-683:
---

We can include in the request string something like 
clientId:aaa,correlationId:bbb. Will this be good enough for debugging?
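
A quick sketch of what that could look like, using a hypothetical request class 
(not one of the actual Kafka request types):

  // Hypothetical request class, used only to illustrate the suggested log format.
  case class ExampleRequest(clientId: String, correlationId: Int, topic: String) {
    override def toString: String =
      "ExampleRequest(clientId:%s,correlationId:%d,topic:%s)".format(clientId, correlationId, topic)
  }

  // A request log line would then read, e.g.:
  // ExampleRequest(clientId:aaa,correlationId:42,topic:test)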

> Fix correlation ids in all requests sent to kafka
> -
>
> Key: KAFKA-683
> URL: https://issues.apache.org/jira/browse/KAFKA-683
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
>Priority: Critical
>  Labels: improvement, replication
> Attachments: kafka-683-v1.patch, kafka-683-v2.patch, 
> kafka-683-v2-rebased.patch
>
>
> We should fix the correlation ids in every request sent to Kafka and fix the 
> request log on the broker to specify not only the type of request and who 
> sent it, but also the correlation id. This will be very helpful while 
> troubleshooting problems in production.



[jira] [Commented] (KAFKA-692) ConsoleConsumer outputs diagnostic message to stdout instead of stderr

2013-01-10 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550561#comment-13550561
 ] 

Neha Narkhede commented on KAFKA-692:
-

+1. Thanks for the patch

> ConsoleConsumer outputs diagnostic message to stdout instead of stderr
> --
>
> Key: KAFKA-692
> URL: https://issues.apache.org/jira/browse/KAFKA-692
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8
>Reporter: ben fleis
>Priority: Minor
> Fix For: 0.8
>
> Attachments: kafka_692_v1.diff
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> At the end of its handling loop, ConsoleConsumer prints "Consumed %d 
> messages" to standard out.  Clients that use custom formatters and read this 
> output shouldn't need to special-case this line or accept a parse error.
> It should instead go (as all diagnostics should) to stderr.
> Patch attached.



[jira] [Updated] (KAFKA-648) Use uniform convention for naming properties keys

2013-01-10 Thread Sriram Subramanian (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriram Subramanian updated KAFKA-648:
-

Attachment: configchanges-v5.patch

40 / 42 / 43
Accepted the suggestions but handled them differently. Specifying max and min 
at the beginning will cause configs related to the same feature to not look 
similar. For example, 

max.log.index.size and log.roll.hours are both configs related to logs but end 
up looking different. 

Instead, the configs use the following format - 

  ConfigName => ComponentName AnyString [Max/Min] [Unit]

  ComponentName => Name of the component/feature this config is used for. 
Example - log, replica, etc.

  AnyString => A string that represents what this config is used for

  Max/Min => Optional. Used if the config represents a max or min value. For 
example, replicaLagTimeMaxMs

  Unit => Optional. The unit of the value the config represents. For example, 
replicaLagMaxBytes for a value specified in bytes.

41 Removed the producer prefix in producer configs. 

John, you may have to fix the json files once more to work with the new changes.
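
To make the convention concrete, here is a small sketch of how such keys would 
be read in code (the specific property names are illustrative examples of the 
pattern, not the final key list):

  import java.util.Properties

  // Keys follow ComponentName AnyString [Max/Min] [Unit]; vals are camel-cased.
  val props = new Properties()
  props.setProperty("replica.lag.time.max.ms", "10000")
  props.setProperty("replica.lag.max.bytes", "4096")

  val replicaLagTimeMaxMs = props.getProperty("replica.lag.time.max.ms").toLong
  val replicaLagMaxBytes  = props.getProperty("replica.lag.max.bytes").toLong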

> Use uniform convention for naming properties keys 
> --
>
> Key: KAFKA-648
> URL: https://issues.apache.org/jira/browse/KAFKA-648
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8
>Reporter: Swapnil Ghike
>Assignee: Sriram Subramanian
>Priority: Blocker
> Fix For: 0.8, 0.8.1
>
> Attachments: configchanges-1.patch, configchanges-v2.patch, 
> configchanges-v3.patch, configchanges-v4.patch, configchanges-v5.patch
>
>
> Currently, the convention that we seem to use to get a property value in 
> *Config is as follows:
> val configVal = property.getType("config.val", ...) // a dot is used to 
> separate two words in the key, and the first letter of the second word is 
> capitalized in configVal.
> We should use a similar convention for groupId, consumerId, clientId, 
> correlationId.
> This change will probably be backward-incompatible.



[jira] [Commented] (KAFKA-683) Fix correlation ids in all requests sent to kafka

2013-01-10 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550879#comment-13550879
 ] 

Jun Rao commented on KAFKA-683:
---

For the rebased v2 patch, the problem in #3 still exists. Also, it seems that 
you need to rebase again.

> Fix correlation ids in all requests sent to kafka
> -
>
> Key: KAFKA-683
> URL: https://issues.apache.org/jira/browse/KAFKA-683
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
>Priority: Critical
>  Labels: improvement, replication
> Attachments: kafka-683-v1.patch, kafka-683-v2.patch, 
> kafka-683-v2-rebased.patch
>
>
> We should fix the correlation ids in every request sent to Kafka and fix the 
> request log on the broker to specify not only the type of request and who 
> sent it, but also the correlation id. This will be very helpful while 
> troubleshooting problems in production.
