Your understanding of RAID 10 is slightly off. Because it combines striping and mirroring, it is not accurate to say there are 4000 open files per pair of disks. As far as the system is concerned, the disk is the entire RAID array. Files are striped across all the mirrored pairs, so any open file will cross all 7 mirror sets.
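To make the striping point concrete, here is a minimal sketch of how logical file offsets land on the mirrored pairs. The chunk size and round-robin layout are illustrative assumptions, not the actual array parameters:

```python
# Sketch: mapping logical offsets onto a 14-disk RAID 10 array
# seen as 7 mirrored pairs. Chunk size is an assumed value.
CHUNK_SIZE = 256 * 1024   # bytes per stripe chunk (assumption)
MIRROR_SETS = 7           # 14 disks arranged as 7 mirrored pairs

def mirror_set_for_offset(offset: int) -> int:
    """Return the index of the mirrored pair holding the chunk at `offset`."""
    return (offset // CHUNK_SIZE) % MIRROR_SETS

# Any file longer than 7 chunks spans every mirror set:
sets = {mirror_set_for_offset(o) for o in range(0, 8 * CHUNK_SIZE, CHUNK_SIZE)}
```

With any realistic chunk size, a Kafka log segment of hundreds of megabytes is spread over every pair, which is why counting "open files per pair of disks" doesn't describe the actual I/O pattern.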
Even if you were to operate on a single disk, you're never going to be able to ensure sequential disk access with Kafka. Even if you have a single partition on a disk, there will be multiple log files for that partition and you will have to seek to read older data. What you have to do is use multiple spindles, with sufficiently fast disk speeds, to increase your overall IO capacity. You can also tune to get a little more. For example, we use a 120 second commit on that mount point to reduce the frequency of flushing to disk.

-Todd

On Wed, Oct 22, 2014 at 10:09 PM, Xiaobin She <xiaobin...@gmail.com> wrote:

> Todd,
>
> Thank you for the information.
>
> With 28,000+ files and 14 disks, that means there are on average about 4000
> open files per two disks (which are treated as one single disk), am I right?
>
> How do you manage to make all the write operations to these 4000 open
> files sequential on the disk?
>
> As far as I know, write operations to different files on the same disk will
> cause random writes, which are not good for performance.
>
> xiaobinshe
>
>
> 2014-10-23 1:00 GMT+08:00 Todd Palino <tpal...@gmail.com>:
>
> > In fact there are many more than 4000 open files. Many of our brokers run
> > with 28,000+ open files (regular file handles, not network connections).
> > In our case, we're beefing up the disk performance as much as we can by
> > running in a RAID-10 configuration with 14 disks.
> >
> > -Todd
> >
> > On Tue, Oct 21, 2014 at 7:58 PM, Xiaobin She <xiaobin...@gmail.com> wrote:
> >
> > > Todd,
> > >
> > > Actually I'm wondering how Kafka handles so many partitions. With one
> > > partition there is at least one file on disk, and with 4000 partitions,
> > > there will be at least 4000 files.
> > >
> > > When all these partitions have write requests, how does Kafka make the
> > > write operations on the disk sequential (which is emphasized in the
> > > design document of Kafka) and make sure the disk access is effective?
> > >
> > > Thank you for your reply.
> > >
> > > xiaobinshe
> > >
> > >
> > > 2014-10-22 5:10 GMT+08:00 Todd Palino <tpal...@gmail.com>:
> > >
> > > > As far as the number of partitions a single broker can handle, we've
> > > > set our cap at 4000 partitions (including replicas). Above that we've
> > > > seen some performance and stability issues.
> > > >
> > > > -Todd
> > > >
> > > > On Tue, Oct 21, 2014 at 12:15 AM, Xiaobin She <xiaobin...@gmail.com> wrote:
> > > >
> > > > > Hello, everyone.
> > > > >
> > > > > I'm new to Kafka, and I'm wondering what's the maximum number of
> > > > > partitions one single machine can handle in Kafka?
> > > > >
> > > > > Is there a suggested number?
> > > > >
> > > > > Thanks.
> > > > >
> > > > > xiaobinshe
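For reference, the "120 second commit" Todd mentions at the top of the thread is a journaling filesystem mount option. A hedged sketch of what such an entry might look like — the device, mount point, and other options here are assumptions, not LinkedIn's actual configuration:

```
# /etc/fstab — illustrative entry only; device and mount point are assumptions.
# commit=120 lets ext4 defer journal commits for up to 120 seconds instead of
# the default 5, trading durability on crash for fewer flushes to disk.
/dev/md0  /export/kafka  ext4  defaults,noatime,commit=120  0 2
```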