Good morning Jay. When you say delete the directory from the list, did you mean from the file system? Can I see through JMX which partitions are online and which ones are not?

Thanks
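A minimal sketch of reading the partition-health gauges the broker exposes over JMX, relevant to the question above. The JMX port and the exact MBean names are assumptions here: the names below match the documented 0.8.x metrics, but the format varies a bit between releases, so check with jconsole first.

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    // Sketch: read Kafka's partition-health gauges over JMX.
    // Assumes the broker was started with JMX enabled (e.g. JMX_PORT=9999);
    // host, port, and MBean names are illustrative assumptions.
    public class PartitionHealthCheck {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbsc = connector.getMBeanServerConnection();

                // Partitions without a live leader; reported by the controller broker.
                ObjectName offline = new ObjectName(
                    "kafka.controller:type=KafkaController,name=OfflinePartitionsCount");
                // Partitions on this broker whose ISR is smaller than the replica set.
                ObjectName underReplicated = new ObjectName(
                    "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions");

                System.out.println("OfflinePartitionsCount = "
                    + mbsc.getAttribute(offline, "Value"));
                System.out.println("UnderReplicatedPartitions = "
                    + mbsc.getAttribute(underReplicated, "Value"));
            } finally {
                connector.close();
            }
        }
    }

Depending on the release, bin/kafka-topics.sh --describe with --unavailable-partitions or --under-replicated-partitions should give a similar per-partition view from the command line.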
On Thu, Aug 15, 2013 at 10:04 AM, Jason Rosenberg <j...@squareup.com> wrote:
> Thanks Jay, I'll do some testing with this and report back.
>
> Jason
>
> On Thu, Aug 15, 2013 at 7:10 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
>> I believe either should work. The broker has a record of what it should
>> have in zk and will recreate any missing logs. Try it to make sure though.
>>
>> Sent from my iPhone
>>
>> On Aug 15, 2013, at 12:52 AM, Jason Rosenberg <j...@squareup.com> wrote:
>>> Ok, that makes sense that the broker will shut itself down.
>>>
>>> If we bring it back up, can this be with an altered set of log.dirs?
>>> Will the destroyed partitions get rebuilt on a new log.dir? Or do we
>>> have to bring it back up with a new or repaired disk, matching the old
>>> log.dir, in order for those replicas to be rebuilt?
>>>
>>> Jason
>>>
>>> On Wed, Aug 14, 2013 at 4:16 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>>> If you get a disk error that results in an IOException the broker will
>>>> shut itself down. You would then have the option of replacing the disk
>>>> or deleting that data directory from the list. When the broker is
>>>> brought back up the intact partitions will quickly catch up and be
>>>> online; the destroyed partitions will have to fully rebuild off the
>>>> other replicas and will take a little longer, but will automatically
>>>> come back online once they have restored off the replicas.
>>>>
>>>> -jay
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On Aug 14, 2013, at 1:49 PM, Jason Rosenberg <j...@squareup.com> wrote:
>>>>> I'm getting ready to try out this configuration (use multiple disks,
>>>>> no RAID, per broker). One concern is the procedure for recovering if
>>>>> there is a disk failure.
>>>>>
>>>>> If a disk fails, will the broker go offline, or will it continue
>>>>> serving partitions on its remaining good disks? And if so, is there a
>>>>> procedure for moving the partitions that were on the failed disk, but
>>>>> not necessarily all the others on that broker?
>>>>>
>>>>> Jason
>>>>>
>>>>> On Thu, Jun 20, 2013 at 3:15 PM, Jason Rosenberg <j...@squareup.com> wrote:
>>>>>> yeah, that would work!
>>>>>>
>>>>>> On Thu, Jun 20, 2013 at 1:20 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>>>>>> Yeah, we didn't go as far as adding weighting or anything like
>>>>>>> that--I think we'd be open to a patch that did that as long as it
>>>>>>> was optional. In the short term you can obviously add multiple
>>>>>>> directories on the same disk to increase its share.
>>>>>>>
>>>>>>> -Jay
>>>>>>>
>>>>>>> On Thu, Jun 20, 2013 at 12:59 PM, Jason Rosenberg <j...@squareup.com> wrote:
>>>>>>>> This sounds like a great idea, to just treat disks as "just a bunch
>>>>>>>> of disks" or JBOD... HDFS works well this way.
>>>>>>>>
>>>>>>>> Do all the disks need to be the same size, to use them evenly?
>>>>>>>> Since it will allocate partitions randomly?
>>>>>>>>
>>>>>>>> It would be nice if you had 2 disks, with one twice as large as the
>>>>>>>> other, if the larger would be twice as likely to receive partitions
>>>>>>>> as the smaller one, etc.
>>>>>>>>
>>>>>>>> I suppose this goes into my earlier question to the list, vis-a-vis
>>>>>>>> heterogeneous brokers (e.g. utilize brokers with different sized
>>>>>>>> storage, using some sort of weighting scheme, etc.).
>>>>>>>>
>>>>>>>> Jason
>>>>>>>>
>>>>>>>> On Thu, Jun 20, 2013 at 11:07 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>>>>>>>> The intention is to allow the use of multiple disks without RAID
>>>>>>>>> or logical volume management. We have found that there are a lot
>>>>>>>>> of downsides to RAID--in particular a huge throughput hit. Since
>>>>>>>>> we already have a parallelism model due to partitioning and a
>>>>>>>>> fault tolerance model with replication, RAID doesn't actually buy
>>>>>>>>> much. With this feature you can directly mount multiple disks as
>>>>>>>>> their own directory and the server will randomly assign partitions
>>>>>>>>> to them.
>>>>>>>>>
>>>>>>>>> Obviously this will only work well if there are enough
>>>>>>>>> high-throughput partitions to make the load balance evenly (e.g.
>>>>>>>>> if you have only one big partition per server then this isn't
>>>>>>>>> going to work).
>>>>>>>>>
>>>>>>>>> -Jay
>>>>>>>>>
>>>>>>>>> On Wed, Jun 19, 2013 at 11:01 PM, Jason Rosenberg <j...@squareup.com> wrote:
>>>>>>>>>> is it possible for a partition to have multiple replicas on
>>>>>>>>>> different directories on the same broker? (hopefully not!)
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 19, 2013 at 10:47 PM, Jun Rao <jun...@gmail.com> wrote:
>>>>>>>>>>> It takes a comma-separated list and partition replicas are
>>>>>>>>>>> randomly distributed to the list.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Jun
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jun 19, 2013 at 10:25 PM, Jason Rosenberg <j...@squareup.com> wrote:
>>>>>>>>>>>> In the 0.8 config, log.dir is now log.dirs. It looks like the
>>>>>>>>>>>> singular log.dir is still supported, but under the covers the
>>>>>>>>>>>> property is log.dirs.
>>>>>>>>>>>>
>>>>>>>>>>>> I'm curious, does this take a comma-separated list of
>>>>>>>>>>>> directories? The new config page just says: "The directories in
>>>>>>>>>>>> which the log data is kept"
>>>>>>>>>>>>
>>>>>>>>>>>> Also, how does kafka handle multiple directories? Does it treat
>>>>>>>>>>>> each directory as a separate replica partition, or what?
>>>>>>>>>>>>
>>>>>>>>>>>> Jason
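A minimal sketch of the log.dirs layout discussed in this thread, assuming one data directory per physically mounted disk (the mount-point paths are illustrative, not from the thread):

    # server.properties -- one log directory per physical disk, comma separated;
    # partition replicas are assigned randomly across these directories.
    log.dirs=/data/disk1/kafka-logs,/data/disk2/kafka-logs,/data/disk3/kafka-logs

    # Per Jay's note above, a bigger disk can be given a larger share in the
    # short term by listing two directories that live on it:
    # log.dirs=/data/bigdisk/kafka-logs-1,/data/bigdisk/kafka-logs-2,/data/disk3/kafka-logs

    # Recovery option discussed above: if /data/disk2 fails and is not replaced,
    # restart the broker with that entry removed; the partitions that lived there
    # are rebuilt from the other replicas.
    # log.dirs=/data/disk1/kafka-logs,/data/disk3/kafka-logs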