I'm getting ready to try out this configuration (use multiple disks, no
RAID, per broker).  One concern is the procedure for recovering if there is
a disk failure.

If a disk fails, will the broker go offline, or will it continue serving
partitions on its remaining good disks?  And if so, is there a procedure
for moving the partitions that were on the failed disk, but not necessarily
all the others on that broker?

Jason
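
(For anyone reading along: below is a rough sketch of the setup discussed in the quoted thread. It assumes placement is uniformly random across the log.dirs entries, as described further down; the real broker logic may differ. The mount points, LOG_DIRS, disk_of and simulate are made-up names for illustration, not Kafka APIs. It shows how listing two directories on the same disk would roughly double that disk's share of partitions.)

```python
import random
from collections import Counter

# Hypothetical log.dirs value: disk2 appears twice (two directories on the
# same disk) so that it receives roughly double the share of partitions.
LOG_DIRS = "/mnt/disk1/kafka-a,/mnt/disk2/kafka-a,/mnt/disk2/kafka-b"


def disk_of(log_dir):
    """Return the mount point, assuming the layout /mnt/<disk>/<dir>."""
    return "/".join(log_dir.split("/")[:3])


def simulate(log_dirs, num_partitions, seed=0):
    """Assign each partition to a uniformly random directory; count per disk.

    This models the "random assignment" described in the thread, not the
    actual broker implementation.
    """
    rng = random.Random(seed)
    dirs = [d.strip() for d in log_dirs.split(",") if d.strip()]
    counts = Counter()
    for _ in range(num_partitions):
        counts[disk_of(rng.choice(dirs))] += 1
    return counts


if __name__ == "__main__":
    for disk, n in sorted(simulate(LOG_DIRS, 3000).items()):
        print(disk, n)
```

With only a handful of partitions the split can be far from 2:1, which matches the caveat below about needing enough partitions for the load to even out.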


On Thu, Jun 20, 2013 at 3:15 PM, Jason Rosenberg <j...@squareup.com> wrote:

> yeah, that would work!
>
>
> On Thu, Jun 20, 2013 at 1:20 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>
>> Yeah we didn't go as far as adding weighting or anything like that--I
>> think we'd be open to a patch that did that as long as it was
>> optional. In the short term you can obviously add multiple directories
>> on the same disk to increase its share.
>>
>> -Jay
>>
>> On Thu, Jun 20, 2013 at 12:59 PM, Jason Rosenberg <j...@squareup.com>
>> wrote:
>> > This sounds like a great idea, to just use disks as "just a bunch of disks"
>> > or JBOD. HDFS works well this way.
>> >
>> > Do all the disks need to be the same size for them to be used evenly,
>> > since it will allocate partitions randomly?
>> >
>> > It would be nice if, given 2 disks with one twice as large as the other,
>> > the larger were twice as likely to receive partitions as the smaller
>> > one, etc.
>> >
>> > I suppose this goes into my earlier question to the list, vis-a-vis
>> > heterogeneous brokers (e.g. utilize brokers with different sized
>> > storage, using some sort of weighting scheme, etc.).
>> >
>> > Jason
>> >
>> >
>> > On Thu, Jun 20, 2013 at 11:07 AM, Jay Kreps <jay.kr...@gmail.com>
>> wrote:
>> >
>> >> The intention is to allow the use of multiple disks without RAID or
>> >> logical volume management. We have found that there are a lot of
>> >> downsides to RAID--in particular a huge throughput hit. Since we
>> >> already have a parallelism model due to partitioning and a fault
>> >> tolerance model with replication, RAID doesn't actually buy much. With
>> >> this feature you can directly mount multiple disks as their own
>> >> directory and the server will randomly assign partitions to them.
>> >>
>> >> Obviously this will only work well if there are enough high-throughput
>> >> partitions for the load to balance evenly (e.g. if you have only one big
>> >> partition per server then this isn't going to work).
>> >>
>> >> -Jay
>> >>
>> >> On Wed, Jun 19, 2013 at 11:01 PM, Jason Rosenberg <j...@squareup.com>
>> >> wrote:
>> >> > Is it possible for a partition to have multiple replicas in different
>> >> > directories on the same broker?  (hopefully no!)
>> >> >
>> >> >
>> >> > On Wed, Jun 19, 2013 at 10:47 PM, Jun Rao <jun...@gmail.com> wrote:
>> >> >
>> >> >> It takes a comma separated list, and partition replicas are randomly
>> >> >> distributed across the directories in the list.
>> >> >>
>> >> >> Thanks,
>> >> >>
>> >> >> Jun
>> >> >>
>> >> >>
>> >> >> On Wed, Jun 19, 2013 at 10:25 PM, Jason Rosenberg <j...@squareup.com>
>> >> >> wrote:
>> >> >>
>> >> >> > In the 0.8 config, log.dir is now log.dirs.  It looks like the singular
>> >> >> > log.dir is still supported, but under the covers the property is
>> >> >> > log.dirs.
>> >> >> >
>> >> >> > I'm curious, does this take a comma separated list of directories?  The
>> >> >> > new config page just says:
>> >> >> > "The directories in which the log data is kept"
>> >> >> >
>> >> >> > Also, how does kafka handle multiple directories?  Does it treat each
>> >> >> > directory as a separate replica partition, or what?
>> >> >> >
>> >> >> > Jason
>> >> >> >
>> >> >>
>> >>
>>
>
>
