[jira] [Commented] (KAFKA-689) Can't append to a topic/partition that does not already exist
[ https://issues.apache.org/jira/browse/KAFKA-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549500#comment-13549500 ] ben fleis commented on KAFKA-689: - Although it's not precisely the same, perhaps thinking about topic|partition as a remote file open is a useful metaphor. An open() call is where you would set the normal open params (flush interval, O_CREAT, etc.), and stat() is where you get broker and other real-time updates. Of course, if create is explicit, where does delete come into play? @Jun - I don't see anything in server.properties about topic creation. And further, does an extra RPC matter if it only happens during setup or a periodic refresh? > Can't append to a topic/partition that does not already exist > - > > Key: KAFKA-689 > URL: https://issues.apache.org/jira/browse/KAFKA-689 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 0.8 >Reporter: David Arthur > Attachments: kafka.log, produce-payload.bin > > > With a totally fresh Kafka (empty logs dir and empty ZK), if I send a > ProduceRequest for a new topic, Kafka responds with > "kafka.common.UnknownTopicOrPartitionException: Topic test partition 0 > doesn't exist on 0". This is when sending a ProduceRequest over the network > (from Python, in this case). > If I use the console producer it works fine (topic and partition get > created). If I then send the same payload from before over the network, it > works. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
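A minimal Scala sketch of the "remote file open" metaphor described in the comment above. Every name here (OpenOptions, TopicPartitionHandle, TopicClient) is purely illustrative and does not exist in Kafka: open() carries the creation flags and per-handle parameters, stat() returns current broker metadata, and delete() marks the lifecycle question the comment leaves open.

    // Hypothetical sketch only -- none of these types are real Kafka APIs.
    case class OpenOptions(createIfMissing: Boolean = false, flushIntervalMs: Long = 1000L)

    case class PartitionStat(leaderBroker: Option[Int], replicaBrokers: Seq[Int], logEndOffset: Long)

    trait TopicPartitionHandle {
      def append(payload: Array[Byte]): Unit   // analogous to write()
      def stat(): PartitionStat                // real-time broker/partition info
      def close(): Unit
    }

    trait TopicClient {
      // open() is where O_CREAT-style flags and per-handle params would live
      def open(topic: String, partition: Int, opts: OpenOptions): TopicPartitionHandle
      def delete(topic: String): Unit          // the open question: who owns delete?
    }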
[jira] [Commented] (KAFKA-133) Publish kafka jar to a public maven repository
[ https://issues.apache.org/jira/browse/KAFKA-133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549515#comment-13549515 ] ben fleis commented on KAFKA-133: - I'm new to maven, so this may be a dumb question: would it be reasonable/easy to publish nightly jars (via Apache?), under the 0.8.0-SNAPSHOT tag? I have an automated build system that currently clones the git repo and builds the whole thing, which I would love to replace with simple jar files. > Publish kafka jar to a public maven repository > -- > > Key: KAFKA-133 > URL: https://issues.apache.org/jira/browse/KAFKA-133 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.6, 0.8 >Reporter: Neha Narkhede > Labels: patch > Fix For: 0.8 > > Attachments: KAFKA-133.patch, pom.xml > > > The released kafka jar must be downloaded manually and then deployed to a private > repository before it can be used by a developer using maven2. > Similar to other Apache projects, it will be nice to have a way to publish > Kafka releases to a public maven repo. > In the past, we gave it a try using sbt publish to Sonatype Nexus maven repo, > but ran into some authentication problems. It will be good to revisit this > and get it resolved. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
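If nightly jars were published under 0.8.0-SNAPSHOT, a downstream build could pull them with an sbt fragment roughly like the one below. The repository URL and the artifact coordinates ("org.apache.kafka" % "kafka_2.8.0") are assumptions for illustration; this thread does not confirm either.

    // Hypothetical build.sbt fragment, assuming snapshot publication to the Apache snapshots repo.
    resolvers += "Apache snapshots" at "https://repository.apache.org/content/repositories/snapshots/"

    libraryDependencies += "org.apache.kafka" % "kafka_2.8.0" % "0.8.0-SNAPSHOT"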
[jira] [Updated] (KAFKA-692) ConsoleConsumer outputs diagnostic message to stdout instead of stderr
[ https://issues.apache.org/jira/browse/KAFKA-692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ben fleis updated KAFKA-692: Status: Patch Available (was: Open) from stdout -> stderr > ConsoleConsumer outputs diagnostic message to stdout instead of stderr > -- > > Key: KAFKA-692 > URL: https://issues.apache.org/jira/browse/KAFKA-692 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 0.8 >Reporter: ben fleis >Priority: Minor > Fix For: 0.8 > > Original Estimate: 1m > Remaining Estimate: 1m > > At the end of its handling loop, ConsoleConsumer prints "Consumed %d > messages" to standard out. Clients who use custom formatters, and read > this output, shouldn't need to special case this line, or accept a parse > error. > It should instead go (as all diagnostics should) to stderr. > patch attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (KAFKA-692) ConsoleConsumer outputs diagnostic message to stdout instead of stderr
ben fleis created KAFKA-692: --- Summary: ConsoleConsumer outputs diagnostic message to stdout instead of stderr Key: KAFKA-692 URL: https://issues.apache.org/jira/browse/KAFKA-692 Project: Kafka Issue Type: Bug Components: clients Affects Versions: 0.8 Reporter: ben fleis Priority: Minor Fix For: 0.8 At the end of its handling loop, ConsoleConsumer prints "Consumed %d messages" to standard out. Clients who use custom formatters, and read this output, shouldn't need to special case this line, or accept a parse error. It should instead go (as all diagnostics should) to stderr. patch attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
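A minimal sketch of the proposed behaviour (illustrative only, not the attached patch): formatted messages stay on stdout, while the summary diagnostic goes to stderr so downstream parsers never see it.

    object ConsoleConsumerSketch {
      def main(args: Array[String]): Unit = {
        var numMessages = 0
        // stand-in for the real consume loop: each formatted message goes to stdout
        Iterator("msg-1", "msg-2", "msg-3").foreach { m =>
          println(m)
          numMessages += 1
        }
        // diagnostics go to stderr so stdout remains a clean stream for custom formatters
        System.err.println("Consumed %d messages".format(numMessages))
      }
    }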
[jira] [Updated] (KAFKA-692) ConsoleConsumer outputs diagnostic message to stdout instead of stderr
[ https://issues.apache.org/jira/browse/KAFKA-692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ben fleis updated KAFKA-692: Attachment: kafka_692_v1.diff stdout -> stderr > ConsoleConsumer outputs diagnostic message to stdout instead of stderr > -- > > Key: KAFKA-692 > URL: https://issues.apache.org/jira/browse/KAFKA-692 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 0.8 >Reporter: ben fleis >Priority: Minor > Fix For: 0.8 > > Attachments: kafka_692_v1.diff > > Original Estimate: 1m > Remaining Estimate: 1m > > At the end of its handling loop, ConsoleConsumer prints "Consumed %d > messages" to standard out. Clients who use custom formatters, and read > this output, shouldn't need to special case this line, or accept a parse > error. > It should instead go (as all diagnostics should) to stderr. > patch attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Deleted] (KAFKA-692) ConsoleConsumer outputs diagnostic message to stdout instead of stderr
[ https://issues.apache.org/jira/browse/KAFKA-692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ben fleis updated KAFKA-692: Comment: was deleted (was: from stdout -> stderr) > ConsoleConsumer outputs diagnostic message to stdout instead of stderr > -- > > Key: KAFKA-692 > URL: https://issues.apache.org/jira/browse/KAFKA-692 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 0.8 >Reporter: ben fleis >Priority: Minor > Fix For: 0.8 > > Attachments: kafka_692_v1.diff > > Original Estimate: 1m > Remaining Estimate: 1m > > At the end of its handling loop, ConsoleConsumer prints "Consumed %d > messages" to standard out. Clients who use custom formatters, and read > this output, shouldn't need to special case this line, or accept a parse > error. > It should instead go (as all diagnostics should) to stderr. > patch attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (KAFKA-691) Fault tolerance broken with replication factor 1
[ https://issues.apache.org/jira/browse/KAFKA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxime Brugidou updated KAFKA-691: -- Attachment: KAFKA-691-v1.patch Here is a first draft (v1) patch.
1. Added the producer property "producer.metadata.refresh.interval.ms", which defaults to 600000 (10 min).
2. The metadata is refreshed every 10 min (only if a message is sent), and the set of topics to refresh is tracked in the topicMetadataToRefresh Set (cleared after every refresh). I think the added value of refreshing regardless of partition availability is to detect new partitions.
3. The good news is that I didn't touch the Partitioner API. I only changed the code to use available partitions if the key is null (as suggested by Jun); it will also throw an UnknownTopicOrPartitionException("No leader for any partition") if no partition is available at all (a sketch of this selection logic follows after this message).
Let me know what you think about this patch. I ran a producer with that code successfully and tested with a broker down. I now have some concerns about the consumer: the refresh.leader.backoff.ms config could help me (if I increase it to, say, 10 min), BUT the rebalance fails in any case since there is no leader for some partitions. I don't have a good workaround for that yet; any help/suggestion appreciated. > Fault tolerance broken with replication factor 1 > > > Key: KAFKA-691 > URL: https://issues.apache.org/jira/browse/KAFKA-691 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.8 >Reporter: Jay Kreps > Attachments: KAFKA-691-v1.patch > > > In 0.7 if a partition was down we would just send the message elsewhere. This > meant that the partitioning was really more of a "stickiness" than a hard > guarantee. This made it impossible to depend on it for partitioned, stateful > processing. > In 0.8 when running with replication this should not be a problem generally > as the partitions are now highly available and fail over to other replicas. > However in the case of replication factor = 1 this no longer really works for most > cases as now a dead broker will give errors for that broker. > I am not sure of the best fix. Intuitively I think this is something that > should be handled by the Partitioner interface. However currently the > partitioner has no knowledge of which nodes are available. So you could use a > random partitioner, but that would keep going back to the down node. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
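A self-contained sketch of the null-key selection described in point 3. The names (PartitionChooser, PartitionMeta, NoAvailablePartitionException) are illustrative stand-ins rather than the patch's actual code: messages without a key are routed only to partitions that currently have a leader, and an error is raised if none does.

    import scala.util.Random

    object PartitionChooser {
      case class PartitionMeta(partitionId: Int, leader: Option[Int])

      class NoAvailablePartitionException(msg: String) extends RuntimeException(msg)

      def choosePartition(key: AnyRef, partitions: Seq[PartitionMeta]): Int = {
        if (key == null) {
          // null key: only consider partitions that currently have a leader
          val available = partitions.filter(_.leader.isDefined)
          if (available.isEmpty)
            throw new NoAvailablePartitionException("No leader for any partition")
          available(Random.nextInt(available.size)).partitionId
        } else {
          // keyed messages keep deterministic placement over all partitions
          math.abs(key.hashCode) % partitions.size
        }
      }
    }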
[jira] [Commented] (KAFKA-683) Fix correlation ids in all requests sent to kafka
[ https://issues.apache.org/jira/browse/KAFKA-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549792#comment-13549792 ] Jun Rao commented on KAFKA-683: --- Patch v2 doesn't seem to apply on 0.8. Could you rebase? For 6, I didn't mean to remove the trace logging in RequestChannel. What I meant is that we already print out requestObj which includes every field in a request. So, there is no need to explicitly print out clientid, correlationid and versionid. > Fix correlation ids in all requests sent to kafka > - > > Key: KAFKA-683 > URL: https://issues.apache.org/jira/browse/KAFKA-683 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.8 >Reporter: Neha Narkhede >Assignee: Neha Narkhede >Priority: Critical > Labels: improvement, replication > Attachments: kafka-683-v1.patch, kafka-683-v2.patch > > > We should fix the correlation ids in every request sent to Kafka and fix the > request log on the broker to specify not only the type of request and who > sent it, but also the correlation id. This will be very helpful while > troubleshooting problems in production. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-691) Fault tolerance broken with replication factor 1
[ https://issues.apache.org/jira/browse/KAFKA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549842#comment-13549842 ] Jun Rao commented on KAFKA-691: --- Thanks for the patch. Overall, the patch is pretty good and is well thought out. Some comments:
1. DefaultEventHandler:
1.1 In handle(), I don't think we need to add the if test in the following statement. The reason is that a message could fail to be sent because the leader changes immediately after the previous metadata refresh. Normally, leaders are elected very quickly. So, it makes sense to refresh the metadata again.

    if (topicMetadataToRefresh.nonEmpty)
      Utils.swallowError(brokerPartitionInfo.updateInfo(outstandingProduceRequests.map(_.topic).toSet))

1.2 In handle(), it seems that it's better to call the following code before dispatchSerializedData() (a sketch of the resulting ordering follows after this message).

    if (topicMetadataRefreshInterval >= 0 &&
        SystemTime.milliseconds - lastTopicMetadataRefresh > topicMetadataRefreshInterval) {
      Utils.swallowError(brokerPartitionInfo.updateInfo(topicMetadataToRefresh.toSet))
      topicMetadataToRefresh.clear
      lastTopicMetadataRefresh = SystemTime.milliseconds
    }

1.3 getPartition(): If none of the partitions is available, we should throw LeaderNotAvailableException instead of UnknownTopicOrPartitionException.
2. DefaultPartitioner: Since the key is not expected to be null, we should remove the code that deals with a null key.
3. The consumer-side logic is fine. The consumer rebalance is only triggered when there are changes in partitions, not when there are changes in the availability of a partition. The rebalance logic doesn't depend on a partition being available. If a partition is not available, ConsumerFetcherManager will keep refreshing metadata. If you have a replication factor of 1, you will need to set a larger refresh.leader.backoff.ms if a broker is expected to go down for a long time. > Fault tolerance broken with replication factor 1 > > > Key: KAFKA-691 > URL: https://issues.apache.org/jira/browse/KAFKA-691 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.8 >Reporter: Jay Kreps > Attachments: KAFKA-691-v1.patch > > > In 0.7 if a partition was down we would just send the message elsewhere. This > meant that the partitioning was really more of a "stickiness" than a hard > guarantee. This made it impossible to depend on it for partitioned, stateful > processing. > In 0.8 when running with replication this should not be a problem generally > as the partitions are now highly available and fail over to other replicas. > However in the case of replication factor = 1 this no longer really works for most > cases as now a dead broker will give errors for that broker. > I am not sure of the best fix. Intuitively I think this is something that > should be handled by the Partitioner interface. However currently the > partitioner has no knowledge of which nodes are available. So you could use a > random partitioner, but that would keep going back to the down node. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
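A self-contained sketch of the ordering suggested in 1.2 together with the unconditional failure-path refresh from 1.1. The member names are illustrative stand-ins, not the real DefaultEventHandler fields: the periodic refresh runs before dispatch so that dispatch already sees fresh leader information, and any failed topics trigger another refresh regardless, since the leader may have moved right after the last refresh.

    object RefreshBeforeDispatchSketch {
      val topicMetadataRefreshIntervalMs = 600000L
      var lastTopicMetadataRefreshTimeMs = 0L
      val topicMetadataToRefresh = scala.collection.mutable.Set[String]()

      def refreshMetadata(topics: Set[String]): Unit =
        println("refreshing metadata for " + topics.mkString(", "))       // stand-in for brokerPartitionInfo.updateInfo

      def dispatchSerializedData(batch: Seq[String]): Seq[String] = Seq.empty  // stand-in; returns topics that failed

      def handle(topicsInBatch: Seq[String]): Unit = {
        val now = System.currentTimeMillis
        // 1.2: do the periodic refresh first, before dispatchSerializedData()
        if (now - lastTopicMetadataRefreshTimeMs > topicMetadataRefreshIntervalMs) {
          refreshMetadata(topicMetadataToRefresh.toSet)
          topicMetadataToRefresh.clear()
          lastTopicMetadataRefreshTimeMs = now
        }
        topicMetadataToRefresh ++= topicsInBatch
        val failedTopics = dispatchSerializedData(topicsInBatch)
        // 1.1: on failure, refresh unconditionally -- the leader may have just moved
        if (failedTopics.nonEmpty) refreshMetadata(failedTopics.toSet)
      }
    }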
[jira] [Updated] (KAFKA-691) Fault tolerance broken with replication factor 1
[ https://issues.apache.org/jira/browse/KAFKA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxime Brugidou updated KAFKA-691: -- Attachment: KAFKA-691-v2.patch Thanks for your feedback. I updated it (v2) according to your notes (1. and 2.). For 3, I believe you are right, except that: 3.1 It seems (correct me if I'm wrong) that a rebalance happens at consumer initialization, which means a consumer can't start if a broker is down. 3.2 Can a rebalance be triggered when a partition is added or moved? Having a broker down shouldn't prevent me from reassigning partitions or adding partitions. > Fault tolerance broken with replication factor 1 > > > Key: KAFKA-691 > URL: https://issues.apache.org/jira/browse/KAFKA-691 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.8 >Reporter: Jay Kreps > Attachments: KAFKA-691-v1.patch, KAFKA-691-v2.patch > > > In 0.7 if a partition was down we would just send the message elsewhere. This > meant that the partitioning was really more of a "stickiness" than a hard > guarantee. This made it impossible to depend on it for partitioned, stateful > processing. > In 0.8 when running with replication this should not be a problem generally > as the partitions are now highly available and fail over to other replicas. > However in the case of replication factor = 1 this no longer really works for most > cases as now a dead broker will give errors for that broker. > I am not sure of the best fix. Intuitively I think this is something that > should be handled by the Partitioner interface. However currently the > partitioner has no knowledge of which nodes are available. So you could use a > random partitioner, but that would keep going back to the down node. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-689) Can't append to a topic/partition that does not already exist
[ https://issues.apache.org/jira/browse/KAFKA-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549888#comment-13549888 ] Jun Rao commented on KAFKA-689: --- In KafkaConfig, we have a property auto.create.topics. We probably need to keep this feature so that an admin can choose to only allow topics created through admin tools. The extra RPC is not a big deal. > Can't append to a topic/partition that does not already exist > - > > Key: KAFKA-689 > URL: https://issues.apache.org/jira/browse/KAFKA-689 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 0.8 >Reporter: David Arthur > Attachments: kafka.log, produce-payload.bin > > > With a totally fresh Kafka (empty logs dir and empty ZK), if I send a > ProduceRequest for a new topic, Kafka responds with > "kafka.common.UnknownTopicOrPartitionException: Topic test partition 0 > doesn't exist on 0". This is when sending a ProduceRequest over the network > (from Python, in this case). > If I use the console producer it works fine (topic and partition get > created). If I then send the same payload from before over the network, it > works. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
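A hedged sketch of the behaviour being discussed, not the broker's actual code path: topic auto-creation is gated on a broker config (Jun refers to an auto.create.topics property in KafkaConfig), so an admin can force topics to be created only through the admin tools while other producers get an error.

    object AutoCreateSketch {
      class UnknownTopicOrPartitionException(msg: String) extends RuntimeException(msg)

      val autoCreateTopics = true                           // would come from KafkaConfig
      val knownTopics = scala.collection.mutable.Set("events")

      // called when a produce/metadata request references a topic the broker doesn't know
      def ensureTopic(topic: String): Unit = {
        if (!knownTopics.contains(topic)) {
          if (autoCreateTopics) {
            knownTopics += topic                            // stand-in for creating the topic in ZK
          } else {
            throw new UnknownTopicOrPartitionException("Topic " + topic + " does not exist")
          }
        }
      }
    }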
[jira] [Updated] (KAFKA-683) Fix correlation ids in all requests sent to kafka
[ https://issues.apache.org/jira/browse/KAFKA-683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neha Narkhede updated KAFKA-683: Attachment: kafka-683-v2-rebased.patch I see, I misunderstood your suggestion then. The reason I added the version and correlation id information explicitly is to make it easier to trace requests through the socket server to the request handler. In fact, I think logging the entire request might not be very useful if we do a good job of logging the correlation id and request type properly throughout our codebase. My plan was to remove it after we are sufficiently satisfied with troubleshooting problems just based on the correlation id wiring that we have currently. I will do a follow-up to remove the logging of the request after that. Does that work for you? > Fix correlation ids in all requests sent to kafka > - > > Key: KAFKA-683 > URL: https://issues.apache.org/jira/browse/KAFKA-683 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.8 >Reporter: Neha Narkhede >Assignee: Neha Narkhede >Priority: Critical > Labels: improvement, replication > Attachments: kafka-683-v1.patch, kafka-683-v2.patch, > kafka-683-v2-rebased.patch > > > We should fix the correlation ids in every request sent to Kafka and fix the > request log on the broker to specify not only the type of request and who > sent it, but also the correlation id. This will be very helpful while > troubleshooting problems in production. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Subscription: outstanding kafka patches
Issue Subscription Filter: outstanding kafka patches (57 issues) The list of outstanding kafka patches Subscriber: kafka-mailing-list Key Summary KAFKA-692 ConsoleConsumer outputs diagnostic message to stdout instead of stderr https://issues.apache.org/jira/browse/KAFKA-692 KAFKA-691 Fault tolerance broken with replication factor 1 https://issues.apache.org/jira/browse/KAFKA-691 KAFKA-688 System Test - Update all testcase__properties.json for properties keys uniform naming convention https://issues.apache.org/jira/browse/KAFKA-688 KAFKA-682 java.lang.OutOfMemoryError: Java heap space https://issues.apache.org/jira/browse/KAFKA-682 KAFKA-677 Retention process gives exception if an empty segment is chosen for collection https://issues.apache.org/jira/browse/KAFKA-677 KAFKA-674 Clean Shutdown Testing - Log segments checksums mismatch https://issues.apache.org/jira/browse/KAFKA-674 KAFKA-651 Create testcases on auto create topics https://issues.apache.org/jira/browse/KAFKA-651 KAFKA-648 Use uniform convention for naming properties keys https://issues.apache.org/jira/browse/KAFKA-648 KAFKA-645 Create a shell script to run System Test with DEBUG details and "tee" console output to a file https://issues.apache.org/jira/browse/KAFKA-645 KAFKA-637 Separate log4j environment variable from KAFKA_OPTS in kafka-run-class.sh https://issues.apache.org/jira/browse/KAFKA-637 KAFKA-621 System Test 9051 : ConsoleConsumer doesn't receives any data for 20 topics but works for 10 https://issues.apache.org/jira/browse/KAFKA-621 KAFKA-607 System Test Transient Failure (case 4011 Log Retention) - ConsoleConsumer receives less data https://issues.apache.org/jira/browse/KAFKA-607 KAFKA-606 System Test Transient Failure (case 0302 GC Pause) - Log segments mismatched across replicas https://issues.apache.org/jira/browse/KAFKA-606 KAFKA-604 Add missing metrics in 0.8 https://issues.apache.org/jira/browse/KAFKA-604 KAFKA-598 decouple fetch size from max message size https://issues.apache.org/jira/browse/KAFKA-598 KAFKA-583 SimpleConsumerShell may receive less data inconsistently https://issues.apache.org/jira/browse/KAFKA-583 KAFKA-552 No error messages logged for those failing-to-send messages from Producer https://issues.apache.org/jira/browse/KAFKA-552 KAFKA-547 The ConsumerStats MBean name should include the groupid https://issues.apache.org/jira/browse/KAFKA-547 KAFKA-530 kafka.server.KafkaApis: kafka.common.OffsetOutOfRangeException https://issues.apache.org/jira/browse/KAFKA-530 KAFKA-493 High CPU usage on inactive server https://issues.apache.org/jira/browse/KAFKA-493 KAFKA-479 ZK EPoll taking 100% CPU usage with Kafka Client https://issues.apache.org/jira/browse/KAFKA-479 KAFKA-465 Performance test scripts - refactoring leftovers from tools to perf package https://issues.apache.org/jira/browse/KAFKA-465 KAFKA-438 Code cleanup in MessageTest https://issues.apache.org/jira/browse/KAFKA-438 KAFKA-419 Updated PHP client library to support kafka 0.7+ https://issues.apache.org/jira/browse/KAFKA-419 KAFKA-414 Evaluate mmap-based writes for Log implementation https://issues.apache.org/jira/browse/KAFKA-414 KAFKA-411 Message Error in high cocurrent environment https://issues.apache.org/jira/browse/KAFKA-411 KAFKA-404 When using chroot path, create chroot on startup if it doesn't exist https://issues.apache.org/jira/browse/KAFKA-404 KAFKA-399 0.7.1 seems to show less performance than 0.7.0 https://issues.apache.org/jira/browse/KAFKA-399 KAFKA-398 Enhance SocketServer to Enable Sending Requests 
https://issues.apache.org/jira/browse/KAFKA-398 KAFKA-397 kafka.common.InvalidMessageSizeException: null https://issues.apache.org/jira/browse/KAFKA-397 KAFKA-388 Add a highly available consumer co-ordinator to a Kafka cluster https://issues.apache.org/jira/browse/KAFKA-388 KAFKA-346 Don't call commitOffsets() during rebalance https://issues.apache.org/jira/browse/KAFKA-346 KAFKA-345 Add a listener to ZookeeperConsumerConnector to get notified on rebalance events https://issues.apache.org/jira/browse/KAFKA-345 KAFKA-319 compression support added to php client does not pass unit tests https://issues.apache.org/jira/browse/KAFKA-319 KAFKA-318 update zookeeper dependency to 3.3.5 https://issues.apache.org/jira/browse/KAFKA-318 KAFKA-314 Go Client Multi-produce https://issues.apache.org/jira/browse/KAFKA-314 KAFKA-313 Add JSON output and looping options to ConsumerOffsetChecker https://issues.apache.org/jira/bro
[jira] [Resolved] (KAFKA-691) Fault tolerance broken with replication factor 1
[ https://issues.apache.org/jira/browse/KAFKA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Rao resolved KAFKA-691. --- Resolution: Fixed Fix Version/s: 0.8 Assignee: Maxime Brugidou Thanks for patch v2. Committed to 0.8 by renaming lastTopicMetadataRefresh to lastTopicMetadataRefreshTime and removing an unused comment. 3.1 Rebalance happens during consumer initialization. It only needs the partition data to be in ZK and doesn't require all brokers to be up. Of course, if a broker is not up, the consumer may not be able to consume data from it. ConsumerFetcherManager is responsible for checking if a partition becomes available again. 3.2 If the partition path changes in ZK, a rebalance will be triggered. > Fault tolerance broken with replication factor 1 > > > Key: KAFKA-691 > URL: https://issues.apache.org/jira/browse/KAFKA-691 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.8 >Reporter: Jay Kreps >Assignee: Maxime Brugidou > Fix For: 0.8 > > Attachments: KAFKA-691-v1.patch, KAFKA-691-v2.patch > > > In 0.7 if a partition was down we would just send the message elsewhere. This > meant that the partitioning was really more of a "stickiness" than a hard > guarantee. This made it impossible to depend on it for partitioned, stateful > processing. > In 0.8 when running with replication this should not be a problem generally > as the partitions are now highly available and fail over to other replicas. > However in the case of replication factor = 1 this no longer really works for most > cases as now a dead broker will give errors for that broker. > I am not sure of the best fix. Intuitively I think this is something that > should be handled by the Partitioner interface. However currently the > partitioner has no knowledge of which nodes are available. So you could use a > random partitioner, but that would keep going back to the down node. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
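An illustrative, self-contained sketch (not ConsumerFetcherManager itself) of the behaviour Jun describes: partitions that currently have no leader are simply retried on a back-off, which is why a larger refresh.leader.backoff.ms is the knob to turn when a replication-factor-1 broker is expected to stay down for a long time.

    object LeaderRefreshSketch {
      case class PartitionMeta(topic: String, partition: Int, leader: Option[Int])

      // stand-in for a real metadata request; here we pretend the broker came back
      def fetchMetadata(noLeader: Set[(String, Int)]): Seq[PartitionMeta] =
        noLeader.toSeq.map { case (t, p) => PartitionMeta(t, p, leader = Some(0)) }

      def findLeaders(initial: Set[(String, Int)], refreshLeaderBackoffMs: Long): Unit = {
        var noLeader = initial
        while (noLeader.nonEmpty) {
          Thread.sleep(refreshLeaderBackoffMs)   // back off before asking the cluster again
          val meta = fetchMetadata(noLeader)
          // partitions that regained a leader would be handed back to the fetchers here
          noLeader = meta.filter(_.leader.isEmpty).map(m => (m.topic, m.partition)).toSet
        }
      }
    }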
[jira] [Commented] (KAFKA-683) Fix correlation ids in all requests sent to kafka
[ https://issues.apache.org/jira/browse/KAFKA-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549953#comment-13549953 ] Jun Rao commented on KAFKA-683: --- I agree that seeing the whole request is not useful, especially Message, since it's binary. Maybe what we should do is to fix the toString() method in each request so that it only prints out meaningful info. This can be fixed in a separate jira. One benefit of this is that we can remove the extra deserialization of those 3 special fields in RequestChannel, which depends on all requests having those 3 fields in the same order and seems brittle. > Fix correlation ids in all requests sent to kafka > - > > Key: KAFKA-683 > URL: https://issues.apache.org/jira/browse/KAFKA-683 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.8 >Reporter: Neha Narkhede >Assignee: Neha Narkhede >Priority: Critical > Labels: improvement, replication > Attachments: kafka-683-v1.patch, kafka-683-v2.patch, > kafka-683-v2-rebased.patch > > > We should fix the correlation ids in every request sent to Kafka and fix the > request log on the broker to specify not only the type of request and who > sent it, but also the correlation id. This will be very helpful while > troubleshooting problems in production. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
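A sketch of the direction suggested above, with an invented case class standing in for the real request types: each request's toString prints the fields that matter for tracing (version, correlation id, client id, partitions) and omits the binary payload, so the request log stays readable without the extra field deserialization in RequestChannel.

    case class ProducerRequestSketch(versionId: Short,
                                     correlationId: Int,
                                     clientId: String,
                                     partitions: Seq[(String, Int)],
                                     payload: Array[Byte]) {
      // deliberately omit the message payload: it is binary and not useful in logs
      override def toString: String =
        "ProducerRequest(versionId:%d, correlationId:%d, clientId:%s, partitions:%s)"
          .format(versionId, correlationId, clientId, partitions.mkString(","))
    }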
[jira] [Commented] (KAFKA-683) Fix correlation ids in all requests sent to kafka
[ https://issues.apache.org/jira/browse/KAFKA-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549967#comment-13549967 ] Neha Narkhede commented on KAFKA-683: - That is a good suggestion. However, that still doesn't get rid of the 3 special fields from RequestChannel since we don't have access to the individual request objects there unless we cast/convert, which I thought was unnecessary. > Fix correlation ids in all requests sent to kafka > - > > Key: KAFKA-683 > URL: https://issues.apache.org/jira/browse/KAFKA-683 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.8 >Reporter: Neha Narkhede >Assignee: Neha Narkhede >Priority: Critical > Labels: improvement, replication > Attachments: kafka-683-v1.patch, kafka-683-v2.patch, > kafka-683-v2-rebased.patch > > > We should fix the correlation ids in every request sent to Kafka and fix the > request log on the broker to specify not only the type of request and who > sent it, but also the correlation id. This will be very helpful while > troubleshooting problems in production. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-691) Fault tolerance broken with replication factor 1
[ https://issues.apache.org/jira/browse/KAFKA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550338#comment-13550338 ] Maxime Brugidou commented on KAFKA-691: --- Thanks for committing the patch. 3.1 Are you sure that the rebalance doesn't require all partitions to have a leader? My experience earlier today was that the rebalance would fail and throw ConsumerRebalanceFailedException after having stopped all fetchers and cleared all queues. If you are sure, then I'll try to reproduce the behavior I encountered and maybe open a separate JIRA. > Fault tolerance broken with replication factor 1 > > > Key: KAFKA-691 > URL: https://issues.apache.org/jira/browse/KAFKA-691 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.8 >Reporter: Jay Kreps >Assignee: Maxime Brugidou > Fix For: 0.8 > > Attachments: KAFKA-691-v1.patch, KAFKA-691-v2.patch > > > In 0.7 if a partition was down we would just send the message elsewhere. This > meant that the partitioning was really more of a "stickiness" than a hard > guarantee. This made it impossible to depend on it for partitioned, stateful > processing. > In 0.8 when running with replication this should not be a problem generally > as the partitions are now highly available and fail over to other replicas. > However in the case of replication factor = 1 this no longer really works for most > cases as now a dead broker will give errors for that broker. > I am not sure of the best fix. Intuitively I think this is something that > should be handled by the Partitioner interface. However currently the > partitioner has no knowledge of which nodes are available. So you could use a > random partitioner, but that would keep going back to the down node. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-691) Fault tolerance broken with replication factor 1
[ https://issues.apache.org/jira/browse/KAFKA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550450#comment-13550450 ] Jun Rao commented on KAFKA-691: --- 3.1 It shouldn't. However, if you can reproduce this problem, please file a new JIRA. > Fault tolerance broken with replication factor 1 > > > Key: KAFKA-691 > URL: https://issues.apache.org/jira/browse/KAFKA-691 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.8 >Reporter: Jay Kreps >Assignee: Maxime Brugidou > Fix For: 0.8 > > Attachments: KAFKA-691-v1.patch, KAFKA-691-v2.patch > > > In 0.7 if a partition was down we would just send the message elsewhere. This > meant that the partitioning was really more of a "stickiness" than a hard > guarantee. This made it impossible to depend on it for partitioned, stateful > processing. > In 0.8 when running with replication this should not be a problem generally > as the partitions are now highly available and fail over to other replicas. > However in the case of replication factor = 1 this no longer really works for most > cases as now a dead broker will give errors for that broker. > I am not sure of the best fix. Intuitively I think this is something that > should be handled by the Partitioner interface. However currently the > partitioner has no knowledge of which nodes are available. So you could use a > random partitioner, but that would keep going back to the down node. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-683) Fix correlation ids in all requests sent to kafka
[ https://issues.apache.org/jira/browse/KAFKA-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550453#comment-13550453 ] Jun Rao commented on KAFKA-683: --- We can include in the request string something like clientId:aaa,correlationId:bbb. Will this be good enough for debugging? > Fix correlation ids in all requests sent to kafka > - > > Key: KAFKA-683 > URL: https://issues.apache.org/jira/browse/KAFKA-683 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.8 >Reporter: Neha Narkhede >Assignee: Neha Narkhede >Priority: Critical > Labels: improvement, replication > Attachments: kafka-683-v1.patch, kafka-683-v2.patch, > kafka-683-v2-rebased.patch > > > We should fix the correlation ids in every request sent to Kafka and fix the > request log on the broker to specify not only the type of request and who > sent it, but also the correlation id. This will be very helpful while > troubleshooting problems in production. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-692) ConsoleConsumer outputs diagnostic message to stdout instead of stderr
[ https://issues.apache.org/jira/browse/KAFKA-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550561#comment-13550561 ] Neha Narkhede commented on KAFKA-692: - +1. Thanks for the patch. > ConsoleConsumer outputs diagnostic message to stdout instead of stderr > -- > > Key: KAFKA-692 > URL: https://issues.apache.org/jira/browse/KAFKA-692 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 0.8 >Reporter: ben fleis >Priority: Minor > Fix For: 0.8 > > Attachments: kafka_692_v1.diff > > Original Estimate: 1m > Remaining Estimate: 1m > > At the end of its handling loop, ConsoleConsumer prints "Consumed %d > messages" to standard out. Clients who use custom formatters, and read > this output, shouldn't need to special case this line, or accept a parse > error. > It should instead go (as all diagnostics should) to stderr. > patch attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (KAFKA-648) Use uniform convention for naming properties keys
[ https://issues.apache.org/jira/browse/KAFKA-648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian updated KAFKA-648: - Attachment: configchanges-v5.patch 40/42/43: Accepted the suggestions but handled them differently. Specifying max and min at the beginning would cause configs related to the same feature to not look similar. For example, max.log.index.size and log.roll.hours are both configs related to logs but end up looking different. Instead, the configs use the following format:
ConfigName => FeatureName AnyString [Max/Min] [Unit]
FeatureName => Name of the component/feature this config is used for. Example: log, replica, etc.
AnyString => A string that represents what this config is used for.
Max/Min => Optional. Used if the config represents a max or min value. For example, replicaLagTimeMaxMs.
Unit => Optional. The unit of the value the config represents. For example, replicaLagMaxBytes for a value specified in bytes.
41: Removed the producer prefix in producer configs. John, you may have to fix the json files once more to work with the new changes. > Use uniform convention for naming properties keys > -- > > Key: KAFKA-648 > URL: https://issues.apache.org/jira/browse/KAFKA-648 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.8 >Reporter: Swapnil Ghike >Assignee: Sriram Subramanian >Priority: Blocker > Fix For: 0.8, 0.8.1 > > Attachments: configchanges-1.patch, configchanges-v2.patch, > configchanges-v3.patch, configchanges-v4.patch, configchanges-v5.patch > > > Currently, the convention that we seem to use to get a property value in > *Config is as follows: > val configVal = property.getType("config.val", ...) // dot is used to > separate two words in the key and the first letter of the second word is > capitalized in configVal. > We should use a similar convention for groupId, consumerId, clientId, > correlationId. > This change will probably be backward-incompatible. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-683) Fix correlation ids in all requests sent to kafka
[ https://issues.apache.org/jira/browse/KAFKA-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550879#comment-13550879 ] Jun Rao commented on KAFKA-683: --- For the rebased v2 patch, the problem in #3 still exists. Also, it seems that you need to rebase again. > Fix correlation ids in all requests sent to kafka > - > > Key: KAFKA-683 > URL: https://issues.apache.org/jira/browse/KAFKA-683 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.8 >Reporter: Neha Narkhede >Assignee: Neha Narkhede >Priority: Critical > Labels: improvement, replication > Attachments: kafka-683-v1.patch, kafka-683-v2.patch, > kafka-683-v2-rebased.patch > > > We should fix the correlation ids in every request sent to Kafka and fix the > request log on the broker to specify not only the type of request and who > sent it, but also the correlation id. This will be very helpful while > troubleshooting problems in production. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira