Re: [DISCUSS] KIP-18 - JBOD Support

2015-04-11 Thread Jay Kreps
I think this KIP is not really about JBOD support; it is just about
remaining available in the presence of individual disk failures.

I agree this is nice to have, but in the grand scheme of things is this
really a big deal? Kafka deployments are usually a lot smaller than Hadoop
deployments, so even if you have 10% of your machines offline the total cost
is not very high.

In practice I don't think RAID actually solves this problem either,
because the I/O hit from rebuilding the array after a failure is so high that
you have to bring the server down anyway.

I kind of agree with Todd's point that balancing data over individual disks
might be a higher priority for making JBOD more practical.

-Jay

On Thu, Apr 9, 2015 at 5:36 AM, Andrii Biletskyi <
andrii.bilets...@stealth.ly> wrote:

> Hi,
>
> Let me start discussion thread for KIP-18 - JBOD Support.
>
> Link to wiki:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-18+-+JBOD+Support
>
>
> Thanks,
> Andrii Biletskyi
>


Re: [DISCUSS] KIP-18 - JBOD Support

2015-04-11 Thread Jay Kreps
Hey Todd,

The problem you pointed out is real. Unfortunately, placing by available
size at creation time actually makes things worse. The original plan was to
place new partitions on the disk with the most space, but consider a common
case:
 disk 1: 500M used
 disk 2: 0M used
Now say you are creating 10 partitions for what will be a massively large
topic. You will place them all on disk 2 as it has the most space, but then
immediately you will discover that that was a bad idea as those partitions
get huge. I think the current balancing by number of partitions is better
than this because at least you get basically random assignment.
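
As a purely illustrative sketch (made-up names, not actual broker code), the
difference between the two heuristics is just which statistic you minimize
when picking a log directory:

    import java.util.Comparator;
    import java.util.List;

    // Made-up illustration of the two placement heuristics discussed above.
    class DiskStats {
        final String path;
        final long usedBytes;      // bytes already on this disk
        final int partitionCount;  // partitions already hosted on this disk

        DiskStats(String path, long usedBytes, int partitionCount) {
            this.path = path;
            this.usedBytes = usedBytes;
            this.partitionCount = partitionCount;
        }

        // "Most free space" placement: every partition of a big new topic lands on
        // the same (currently empty) disk and later blows up together.
        static DiskStats byLeastUsed(List<DiskStats> disks) {
            return disks.stream().min(Comparator.comparingLong(d -> d.usedBytes)).get();
        }

        // "Fewest partitions" placement: roughly random with respect to future size,
        // which is what the current behavior approximates.
        static DiskStats byFewestPartitions(List<DiskStats> disks) {
            return disks.stream().min(Comparator.comparingInt(d -> d.partitionCount)).get();
        }
    }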

I think to solve the problem you describe we need to do active
rebalancing--predicting the size of partitions ahead of time is basically
impossible.

I think having the controller be unaware of disks is probably a good thing.
So the controller would balance partitions over servers and the server
would be responsible for balancing over disks.

I think this kind of balancing is possible though not totally trivial.
Basically a background thread in LogManager would decide that there was too
much skew in data assignment (actually debatable whether it is data size or
I/O throughput that you should optimize) and would then try to rebalance.
To do the rebalance it would do a background copy of the log from the
current disk to the new disk, then it would take the partition offline and
delete the old log, then bring the partition back using the new log and
catch it back up off the leader.
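
Purely as an illustration of that flow (made-up names, not actual Kafka code,
which is Scala on the broker side), the sequence would be roughly:

    import java.io.IOException;
    import java.nio.file.*;
    import java.util.stream.Stream;

    // Made-up sketch of the background disk-rebalance flow described above.
    class DiskRebalanceSketch {

        interface PartitionControl {
            void takeOffline(String topicPartition) throws IOException;      // stop serving the partition locally
            void bringOnlineAndCatchUp(String topicPartition, Path newDir);  // reopen the log, then fetch from the leader
        }

        void movePartition(String topicPartition, Path oldDir, Path newDir,
                           PartitionControl control) throws IOException {
            // 1. Background copy of the existing segments while the partition stays online.
            Files.createDirectories(newDir);
            try (Stream<Path> segments = Files.list(oldDir)) {
                for (Path segment : (Iterable<Path>) segments::iterator) {
                    Files.copy(segment, newDir.resolve(segment.getFileName()),
                               StandardCopyOption.REPLACE_EXISTING);
                }
            }

            // 2. Take the partition offline briefly and delete the old log.
            control.takeOffline(topicPartition);
            try (Stream<Path> segments = Files.list(oldDir)) {
                for (Path segment : (Iterable<Path>) segments::iterator) {
                    Files.deleteIfExists(segment);
                }
            }
            Files.deleteIfExists(oldDir);

            // 3. Bring the partition back using the new log and catch it up off the leader.
            control.bringOnlineAndCatchUp(topicPartition, newDir);
        }
    }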

-Jay

On Thu, Apr 9, 2015 at 8:19 AM, Todd Palino  wrote:

> I think this is a good start. We've been discussing JBOD internally, so
> it's good to see a discussion going externally about it as well.
>
> The other big blocker to using JBOD is the lack of intelligent partition
> assignment logic, and the lack of tools to adjust it. The controller is not
> smart enough to take into account disk usage when deciding to place a
> partition, which may not be a big concern (at the controller level, you
> worry about broker usage, not individual mounts). However, the broker is
> not smart enough to do it either, when looking at the local directories. It
> just round robins.
>
> In addition, there is no tool available to move a partition from one mount
> point to another. So in the case that you do have a hot disk, you cannot do
> anything about it without shutting down the broker and doing a manual move
> of the log segments.
>
> -Todd
>
>
> On Thu, Apr 9, 2015 at 5:36 AM, Andrii Biletskyi <
> andrii.bilets...@stealth.ly> wrote:
>
> > Hi,
> >
> > Let me start discussion thread for KIP-18 - JBOD Support.
> >
> > Link to wiki:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-18+-+JBOD+Support
> >
> >
> > Thanks,
> > Andrii Biletskyi
> >
>


Re: Review Request 29091: Improve 1646 fix by reduce check if Os.IsWindows

2015-04-11 Thread Jay Kreps

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29091/#review79812
---


I think this looks more reasonable. I am pretty confused by the code in roll(), 
though; have you tried that? Maybe I am misreading it?

Let's do this:
1. Push the preallocation as far down in the log as possible. I.e. ideally 
FileMessageSet should do the preallocation and provide a trim() method which would 
be invoked on Log.roll() and during FileMessageSet.close().
2. Get rid of the Os.isWindows option and make this a proper configuration as 
you suggested. I think there is nothing Windows-specific, right?
3. A couple of other comments I left inline.


core/src/main/scala/kafka/log/Log.scala


This should not be exposed as it doesn't make sense as a public operation. 
The fact that we preallocate and shrink the data file is internal. I think we 
can just have this happen as part of close(), right?



core/src/main/scala/kafka/log/Log.scala


I'm confused by this. It looks like you are running log recovery--i.e. 
iterating over all messages in the segment and checksumming them--in the middle 
of rolling the log? That seems wrong, doesn't it? I mean, this means that if you 
had a 1GB log you would spend several minutes blocking while rolling the new 
segment, right? I think you do need to truncate off any preallocated bytes in 
the active segment, but that should go through the same code path as close(), 
right?



core/src/main/scala/kafka/utils/Utils.scala


This utility is a bit unusual now since it is something like 
openChannelAndPreallocate.

Let's do the following:
1. Remove the debug logging. We try to make the logging intelligible to 
non-programmer operators reading the log without reference to the code, and 
that message won't make any sense to them.
2. Combine the two openChannel methods into one with an optional 
preallocateToSize option.
3. Move that method into FileMessageSet since that is the only place it is 
used and now it is a bit more idiosyncratic.
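
To make the direction concrete, here is a rough sketch of the
preallocate-then-trim idea (the real classes are Scala, and the names and
signatures below are made up, not the actual patch):

    import java.io.File;
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.channels.FileChannel;

    // Made-up sketch: open a segment channel with optional preallocation and
    // trim it back to the bytes actually written on roll/close.
    class PreallocatedSegmentSketch {

        static FileChannel openChannel(File file, boolean mutable, long preallocateToSize)
                throws IOException {
            RandomAccessFile raf = new RandomAccessFile(file, mutable ? "rw" : "r");
            if (mutable && preallocateToSize > 0) {
                raf.setLength(preallocateToSize);   // lay the bytes out contiguously up front
            }
            return raf.getChannel();
        }

        // Called on roll() and close(): shrink the file back to its real size.
        static void trim(FileChannel channel, long writtenBytes) throws IOException {
            channel.truncate(writtenBytes);
        }
    }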


- Jay Kreps


On March 13, 2015, 3:12 a.m., Qianlin Xia wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29091/
> ---
> 
> (Updated March 13, 2015, 3:12 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1646
> https://issues.apache.org/jira/browse/KAFKA-1646
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Improve 1646 fix by reduce check if Os.IsWindows
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/log/FileMessageSet.scala 
> e1f8b979c3e6f62ea235bd47bc1587a1291443f9 
>   core/src/main/scala/kafka/log/Log.scala 
> 46df8d99d977a3b010a9b9f4698187fa9bfb2498 
>   core/src/main/scala/kafka/log/LogManager.scala 
> 7cee5435b23fcd0d76f531004911a2ca499df4f8 
>   core/src/main/scala/kafka/log/LogSegment.scala 
> 0d6926ea105a99c9ff2cfc9ea6440f2f2d37bde8 
>   core/src/main/scala/kafka/utils/Utils.scala 
> a89b0463685e6224d263bc9177075e1bb6b93d04 
> 
> Diff: https://reviews.apache.org/r/29091/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Qianlin Xia
> 
>



[jira] [Commented] (KAFKA-1646) Improve consumer read performance for Windows

2015-04-11 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491172#comment-14491172
 ] 

Jay Kreps commented on KAFKA-1646:
--

This is pretty critical code and there have been a few issues with these 
patches so far, which is what is making me a little skittish. [~waldenchen] can 
you walk me through what you guys are doing in the way of validation and 
performance testing?

Also, I agree with your suggestion of making this a configuration rather than 
an OS-specific check. Theoretically this could help with Linux filesystems, and 
conversely Windows may later get smarter about preallocation. How about 
log.preallocate={true/false}, defaulting to false?
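
If that goes in as proposed, the broker setting would look something like the 
following in server.properties (the name and default are just the proposal 
above, not an existing config at the time of writing):

    # Proposed: preallocate segment files when rolling a new segment.
    # Defaults to false; mainly useful on NTFS/Windows to avoid fragmentation.
    log.preallocate=true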

> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch, KAFKA-1646_20150306_005526.patch, 
> KAFKA-1646_20150312_200352.patch
>
>
> This patch is for the Windows platform only. On Windows, if more than one 
> replica is writing to disk, the segment log files will not be written 
> contiguously on disk and consumer read performance drops greatly. This fix 
> allocates more disk space when rolling a new segment, which improves consumer 
> read performance on the NTFS file system.
> This patch doesn't affect file allocation on other filesystems, since it only 
> adds statements like 'if(Os.isWindows)' or adds methods used only on Windows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 31850: Patch for KAFKA-1660

2015-04-11 Thread Jay Kreps

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31850/#review79813
---



clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java


Minor improvement: Use the javadoc reference for callback {@link Callback} 
to avoid confusion here (there are many kinds of callbacks).



clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java


Minor nit: I think "send request" references our internal class name. I 
think substituting "request" would be more clear.



clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java


I think this section is very confusing. I don't think most people will 
differentiate between immediately exiting vs. waiting for 0 ms and then exiting; 
after all, isn't waiting 0 ms the same as immediately exiting?



clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java


It wouldn't block forever; that isn't correct. It would just block for the 
period of time they specified.



clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java


This is not a deadlock, just useless blocking.



clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java


This scheme is clever but non-obvious, is there a simpler way?



clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java


I think to be correct the check for whether the producer is closed should 
happen before we consider an append in progress since you loop on that check 
later.



clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java


Can you explain this check? I don't think this actually fixes things as the 
close could happen after the check.


- Jay Kreps


On April 10, 2015, 10:09 p.m., Jiangjie Qin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31850/
> ---
> 
> (Updated April 10, 2015, 10:09 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1660
> https://issues.apache.org/jira/browse/KAFKA-1660
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> A minor fix.
> 
> 
> Incorporated Guozhang's comments.
> 
> 
> Modify according to the latest conclusion.
> 
> 
> Patch for the finally passed KIP-15git status
> 
> 
> Addressed Joel and Guozhang's comments.
> 
> 
> rebased on trunk
> 
> 
> Rebase on trunk
> 
> 
> Addressed Joel's comments.
> 
> 
> Addressed Joel's comments
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> b91e2c52ed0acb1faa85915097d97bafa28c413a 
>   clients/src/main/java/org/apache/kafka/clients/producer/MockProducer.java 
> 6913090af03a455452b0b5c3df78f266126b3854 
>   clients/src/main/java/org/apache/kafka/clients/producer/Producer.java 
> 5b3e75ed83a940bc922f9eca10d4008d67aa37c9 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java
>  0e7ab29a07301309aa8ca5b75b32b8e05ddc3a94 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
> 70954cadb5c7e9a4c326afcf9d9a07db230e7db2 
>   
> clients/src/main/java/org/apache/kafka/common/errors/InterruptException.java 
> fee322fa0dd9704374db4a6964246a7d2918d3e4 
>   clients/src/main/java/org/apache/kafka/common/serialization/Serializer.java 
> c2fdc23239bd2196cd912c3d121b591f21393eab 
>   
> clients/src/test/java/org/apache/kafka/clients/producer/internals/RecordAccumulatorTest.java
>  05e2929c0a9fc8cf2a8ebe0776ca62d5f3962d5c 
>   core/src/test/scala/integration/kafka/api/ProducerSendTest.scala 
> 9811a2b2b1e9bf1beb301138f7626e12d275a8db 
> 
> Diff: https://reviews.apache.org/r/31850/diff/
> 
> 
> Testing
> ---
> 
> Unit tests passed.
> 
> 
> Thanks,
> 
> Jiangjie Qin
> 
>



[jira] [Created] (KAFKA-2114) Unable to change min.insync.replicas default

2015-04-11 Thread Bryan Baugher (JIRA)
Bryan Baugher created KAFKA-2114:


 Summary: Unable to change min.insync.replicas default
 Key: KAFKA-2114
 URL: https://issues.apache.org/jira/browse/KAFKA-2114
 Project: Kafka
  Issue Type: Bug
Reporter: Bryan Baugher
 Fix For: 0.8.2.1


Following the comment here[1] I was unable to change the min.insync.replicas 
default value. I tested this by setting up a 3-node cluster, writing to a topic 
with a replication factor of 3 using request.required.acks=-1, and setting 
min.insync.replicas=2 in the broker's server.properties. I then shut down 2 
brokers but was still able to write successfully. Only after running the alter 
topic command to set min.insync.replicas=2 on the topic did I see write 
failures.
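
Concretely, the two places the setting was applied look roughly like this (the 
alter command below is from memory for the 0.8.2 tooling and may differ 
slightly; the topic name is just an example):

    # Broker-wide default in server.properties (did not appear to take effect):
    min.insync.replicas=2

    # Topic-level override (this is what finally caused acks=-1 writes to fail
    # once 2 of the 3 brokers were down):
    bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic test \
      --config min.insync.replicas=2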



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-2114) Unable to change min.insync.replicas default

2015-04-11 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira reassigned KAFKA-2114:
---

Assignee: Gwen Shapira

> Unable to change min.insync.replicas default
> 
>
> Key: KAFKA-2114
> URL: https://issues.apache.org/jira/browse/KAFKA-2114
> Project: Kafka
>  Issue Type: Bug
>Reporter: Bryan Baugher
>Assignee: Gwen Shapira
> Fix For: 0.8.2.1
>
>
> Following the comment here[1] I was unable to change the min.insync.replicas 
> default value. I tested this by setting up a 3 node cluster, wrote to a topic 
> with a replication factor of 3, using request.required.acks=-1 and setting 
> min.insync.replicas=2 on the broker's server.properties. I then shutdown 2 
> brokers but I was still able to write successfully. Only after running the 
> alter topic command setting min.insync.replicas=2 on the topic did I see 
> write failures.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 31850: Patch for KAFKA-1660

2015-04-11 Thread Jiangjie Qin


> On April 11, 2015, 8:02 p.m., Jay Kreps wrote:
> >

Thanks for the review, Jay. Please see the reply below.


> On April 11, 2015, 8:02 p.m., Jay Kreps wrote:
> > clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java, 
> > line 530
> > 
> >
> > I think this section is very confusing. I don't think most people will 
> > differentiate between immediately exiting vs waiting for 0 ms and then 
> > exiting, since after all isn't waiting 0 ms the same as immediately exiting.

The information we want to convey here is that when timeout = 0, the behavior 
differs depending on the context: if the method is invoked from a user thread, 
it will try to join the sender thread; if it is invoked from the sender thread, 
it won't try to join itself. That is what we meant by immediately.
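
A tiny sketch of what that context check amounts to (illustrative only, not the 
actual patch; the names here are made up):

    // Illustrative only: avoid joining the sender thread from within itself.
    class CloseSketch {
        private final Thread ioThread;                  // background sender thread
        private volatile boolean closeRequested = false;

        CloseSketch(Thread ioThread) { this.ioThread = ioThread; }

        boolean isCloseRequested() { return closeRequested; }

        void close(long timeoutMs) {
            closeRequested = true;                      // tell the sender loop to drain and exit
            if (Thread.currentThread() == ioThread) {
                // Invoked from the sender thread (e.g. from a callback): joining
                // ourselves would block forever, so return immediately.
                return;
            }
            try {
                ioThread.join(timeoutMs);               // user thread: wait up to timeoutMs
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }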


> On April 11, 2015, 8:02 p.m., Jay Kreps wrote:
> > clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java, 
> > line 539
> > 
> >
> > It wouldn't block forever, that isn't correct, it would just block for 
> > the period of time they specified.

We are saying we will call close(0) instead of having the sender thread call 
close(timeout), and we do this to *avoid* blocking forever.


> On April 11, 2015, 8:02 p.m., Jay Kreps wrote:
> > clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java,
> >  line 157
> > 
> >
> > I think to be correct the check for whether the producer is closed 
> > should happen before we consider an append in progress since you loop on 
> > that check later.

Yeah, the current solution is based on the assumption that if a thread receives 
an IllegalStateException because the producer is closed, it won't call send() 
again.
The problem with putting the close check before incrementing appendsInProgress 
is: what if close() is invoked from another thread after the close flag check 
but before appendsInProgress is incremented? In that case we might miss the last 
message or batch.


> On April 11, 2015, 8:02 p.m., Jay Kreps wrote:
> > clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java,
> >  line 155
> > 
> >
> > This scheme is clever but non-obvious, is there a simpler way?

I'm not sure there is a simpler way. Maybe we can review the current approach 
again and see if we can simplify it.

The goals we want to achieve here are:
1. When abortIncompleteBatch finishes, no more messages should be appended.
2. Make sure that when hasUnsent() returns false, it does not miss any batch.

The current solutions for both depend on setting the close flag first.
To achieve (1), the implementation sets the close flag first and waits until all 
ongoing appends (if any) finish.
To achieve (2), the implementation synchronizes on the deque. When an append 
grabs the deque lock, it first checks whether the close flag is set. If it is 
set, hasUnsent() might have already checked this deque, so it is not safe to 
append a new batch anymore. Otherwise it is safe to append a new batch.
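
For what it's worth, here is a stripped-down sketch of that ordering 
(hypothetical names, not the actual RecordAccumulator code):

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.concurrent.atomic.AtomicInteger;

    // Hypothetical sketch of the close-flag / appendsInProgress / deque-lock ordering.
    class AccumulatorSketch {
        private volatile boolean closed = false;
        private final AtomicInteger appendsInProgress = new AtomicInteger(0);
        private final Deque<byte[]> deque = new ArrayDeque<>();

        void append(byte[] record) {
            appendsInProgress.incrementAndGet();     // count ourselves as in flight first
            try {
                if (closed)
                    throw new IllegalStateException("Cannot send after the producer is closed.");
                synchronized (deque) {
                    // Re-check under the deque lock: if close set the flag after the
                    // check above, hasUnsent()/abort may already have scanned this
                    // deque, so it is no longer safe to add a new batch.
                    if (closed)
                        throw new IllegalStateException("Cannot send after the producer is closed.");
                    deque.addLast(record);
                }
            } finally {
                appendsInProgress.decrementAndGet();
            }
        }

        void abortIncompleteBatch() {
            closed = true;                            // (1) stop new appends from succeeding
            while (appendsInProgress.get() > 0)       // (2) wait out appends already in flight
                Thread.yield();
            synchronized (deque) {                    // now it is safe to drain/abort
                deque.clear();
            }
        }

        boolean hasUnsent() {
            synchronized (deque) {
                return !deque.isEmpty();
            }
        }
    }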


> On April 11, 2015, 8:02 p.m., Jay Kreps wrote:
> > clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java,
> >  line 176
> > 
> >
> > Can you explain this check? I don't think this actually fixes things as 
> > the close could happen after the check.

Please see previous reply.


- Jiangjie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31850/#review79813
---


On April 10, 2015, 10:09 p.m., Jiangjie Qin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31850/
> ---
> 
> (Updated April 10, 2015, 10:09 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1660
> https://issues.apache.org/jira/browse/KAFKA-1660
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> A minor fix.
> 
> 
> Incorporated Guozhang's comments.
> 
> 
> Modify according to the latest conclusion.
> 
> 
> Patch for the finally passed KIP-15git status
> 
> 
> Addressed Joel and Guozhang's comments.
> 
> 
> rebased on trunk
> 
> 
> Rebase on trunk
> 
> 
> Addressed Joel's comments.
> 
> 
> Addressed Joel's comments
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> b91e2c52ed0acb1faa85915097d97bafa28c413a 
>   clients/src/main/java/org/apache/kafka/clients/producer/Mo

Re: [DISCUSS] KIP-18 - JBOD Support

2015-04-11 Thread Todd Palino
I'm with you on that, Jay, although I don't consider your case common. More
common is that there is data on all the disks at its normal retention, and you
add something new. That said, it doesn't really matter, because what you're
illustrating is a valid concern. Automatic balancing would probably alleviate
any issues coming from a bad initial placement.

Jumping back an email, yes, it is a really big deal that the entire broker
fails when one mount point fails. It is much better to run with degraded
performance than it is to run with degraded replication, and disks fail
constantly. If I have 10% of my machines offline, Kafka's not going to last
very long at LinkedIn ;)

-Todd


On Sat, Apr 11, 2015 at 11:58 AM, Jay Kreps  wrote:

> Hey Todd,
>
> The problem you pointed out is real. Unfortunately, placing by available
> size at creation time actually makes things worse. The original plan was to
> place new partitions on the disk with the most space, but consider a common
> case:
>  disk 1: 500M
>  disk 2: 0M
> Now say you are creating 10 partitions for what will be a massively large
> topic. You will place them all on disk 2 as it has the most space, but then
> immediately you will discover that that was a bad idea as those partitions
> get huge. I think the current balancing by number of partitions is better
> than this because at least you get basically random assignment.
>
> I think to solve the problem you describe we need to do active
> rebalancing--predicting the size of partitions ahead of time is basically
> impossible.
>
> I think having the controller be unaware of disks is probably a good thing.
> So the controller would balance partitions over servers and the server
> would be responsible for balancing over disks.
>
> I think this kind of balancing is possible though not totally trivial.
> Basically a background thread in LogManager would decide that there was too
> much skew in data assignment (actually debatable whether it is data size or
> I/O throughput that you should optimize) and would then try to rebalance.
> To do the rebalance it would do a background copy of the log from the
> current disk to the new disk, then it would take the partition offline and
> delete the old log, then bring the partition back using the new log and
> catch it back up off the leader.
>
> -Jay
>
> On Thu, Apr 9, 2015 at 8:19 AM, Todd Palino  wrote:
>
> > I think this is a good start. We've been discussing JBOD internally, so
> > it's good to see a discussion going externally about it as well.
> >
> > The other big blocker to using JBOD is the lack of intelligent partition
> > assignment logic, and the lack of tools to adjust it. The controller is
> not
> > smart enough to take into account disk usage when deciding to place a
> > partition, which may not be a big concern (at the controller level, you
> > worry about broker usage, not individual mounts). However, the broker is
> > not smart enough to do it either, when looking at the local directories.
> It
> > just round robins.
> >
> > In addition, there is no tool available to move a partition from one
> mount
> > point to another. So in the case that you do have a hot disk, you cannot
> do
> > anything about it without shutting down the broker and doing a manual
> move
> > of the log segments.
> >
> > -Todd
> >
> >
> > On Thu, Apr 9, 2015 at 5:36 AM, Andrii Biletskyi <
> > andrii.bilets...@stealth.ly> wrote:
> >
> > > Hi,
> > >
> > > Let me start discussion thread for KIP-18 - JBOD Support.
> > >
> > > Link to wiki:
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-18+-+JBOD+Support
> > >
> > >
> > > Thanks,
> > > Andrii Biletskyi
> > >
> >
>


Re: Review Request 33088: add heartbeat to coordinator

2015-04-11 Thread Onur Karaman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33088/
---

(Updated April 12, 2015, 5:47 a.m.)


Review request for kafka.


Bugs: KAFKA-1334
https://issues.apache.org/jira/browse/KAFKA-1334


Repository: kafka


Description (updated)
---

add heartbeat to coordinator

todo:
- see how it performs under real load
- see if IO can be moved out of the locks
- figure out how to make ConsumerCoordinator easy to test
- figure out how to close connections on consumer failure
- decide if consumer ids should be predictable or random
- figure out a good timeout value for JoinGroupRequest


Diffs (updated)
-

  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
 e55ab11df4db0b0084f841a74cbcf819caf780d5 
  clients/src/main/java/org/apache/kafka/common/protocol/Errors.java 
36aa412404ff1458c7bef0feecaaa8bc45bed9c7 
  core/src/main/scala/kafka/api/RequestKeys.scala 
ef7a86ec3324028496d6bb7c7c6fec7d7d19d64e 
  core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala 
456b602245e111880e1b8b361319cabff38ee0e9 
  core/src/main/scala/kafka/coordinator/ConsumerRegistry.scala 
2f5797064d4131ecfc9d2750d9345a9fa3972a9a 
  core/src/main/scala/kafka/coordinator/DelayedHeartbeat.scala 
6a6bc7bc4ceb648b67332e789c2c33de88e4cd86 
  core/src/main/scala/kafka/coordinator/DelayedJoinGroup.scala 
df60cbc35d09937b4e9c737c67229889c69d8698 
  core/src/main/scala/kafka/coordinator/DelayedRebalance.scala 
8defa2e41c92f1ebe255177679d275c70dae5b3e 
  core/src/main/scala/kafka/coordinator/Group.scala PRE-CREATION 
  core/src/main/scala/kafka/coordinator/GroupRegistry.scala 
94ef5829b3a616c90018af1db7627bfe42e259e5 
  core/src/main/scala/kafka/coordinator/HeartbeatBucket.scala 
821e26e97eaa97b5f4520474fff0fedbf406c82a 
  core/src/main/scala/kafka/coordinator/PartitionAssignor.scala PRE-CREATION 
  core/src/main/scala/kafka/network/RequestChannel.scala 
1d9c57b0b5a0ad31e4f3d7562f0266af83cc9024 
  core/src/main/scala/kafka/server/DelayedOperationKey.scala 
b673e43b0ba401b2e22f27aef550e3ab0ef4323c 
  core/src/main/scala/kafka/server/KafkaApis.scala 
b4004aa3a1456d337199aa1245fb0ae61f6add46 
  core/src/main/scala/kafka/server/KafkaServer.scala 
c63f4ba9d622817ea8636d4e6135fba917ce085a 
  core/src/main/scala/kafka/server/OffsetManager.scala 
420e2c3535e722c503f13d093849469983f6f08d 

Diff: https://reviews.apache.org/r/33088/diff/


Testing
---


Thanks,

Onur Karaman



[jira] [Commented] (KAFKA-1334) Add failure detection capability to the coordinator / consumer

2015-04-11 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491338#comment-14491338
 ] 

Onur Karaman commented on KAFKA-1334:
-

Updated reviewboard https://reviews.apache.org/r/33088/diff/
 against branch origin/trunk

> Add failure detection capability to the coordinator / consumer
> --
>
> Key: KAFKA-1334
> URL: https://issues.apache.org/jira/browse/KAFKA-1334
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Affects Versions: 0.9.0
>Reporter: Neha Narkhede
>Assignee: Onur Karaman
> Attachments: KAFKA-1334.patch, KAFKA-1334_2015-04-11_22:47:27.patch
>
>
> 1) Add coordinator discovery and failure detection to the consumer.
> 2) Add failure detection capability to the coordinator when group management 
> is used.
> This will not include any rebalancing logic, just the logic to detect 
> consumer failures using session.timeout.ms. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1334) Add failure detection capability to the coordinator / consumer

2015-04-11 Thread Onur Karaman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Onur Karaman updated KAFKA-1334:

Attachment: KAFKA-1334_2015-04-11_22:47:27.patch

> Add failure detection capability to the coordinator / consumer
> --
>
> Key: KAFKA-1334
> URL: https://issues.apache.org/jira/browse/KAFKA-1334
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Affects Versions: 0.9.0
>Reporter: Neha Narkhede
>Assignee: Onur Karaman
> Attachments: KAFKA-1334.patch, KAFKA-1334_2015-04-11_22:47:27.patch
>
>
> 1) Add coordinator discovery and failure detection to the consumer.
> 2) Add failure detection capability to the coordinator when group management 
> is used.
> This will not include any rebalancing logic, just the logic to detect 
> consumer failures using session.timeout.ms. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)