Richard Elling wrote:
> On May 26, 2010, at 8:38 AM, Neil Perrin wrote:
>
>> On 05/26/10 07:10, sensille wrote:
>>> My idea goes as follows: don't write linearly. Track the rotation
>>> and write to the position the head will hit next. This might be done
>>> by a re-mapping layer or integrated into ZFS. This works only because
>>> ZIL devices are basically write-only. Reads from such a device will be
>>> horribly slow.
>>>
>> Yes, I agree this seems very appealing. I have investigated and
>> observed similar results. Just allocating larger intent log blocks but
>> only writing to, say, the first half of them shows the same effect.
>> Despite the impressive results, we have not pursued this further,
>> mainly because of its maintainability. There is quite a variance
>> between drives, so, as mentioned, feedback profiling of the device is
>> needed in the running system. The layering of the Solaris IO subsystem
>> doesn't provide the necessary feedback, and the ZIL code is layered on
>> the SPA/DMU. Still, it should be possible. Good luck!
>
> I agree. If you search the literature, you will find many cases where
> people have tried to optimize file systems based on device geometry,
> and all have ended up as roadkill. File systems last much longer than
> the hardware, and writing hardware-specific optimizations into the
> file system just doesn't make good sense.
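For what it's worth, the core of the remapping idea above is just modular arithmetic on the platter angle. A minimal sketch, in Python, with made-up geometry (in practice `rpm`, `sectors_per_track`, and the setup margin would all have to come from profiling the actual drive, and tracks are zoned, so sectors-per-track varies across the disk):

```python
def next_write_sector(t_now, t_ref, rpm, sectors_per_track, settle_sectors):
    """Return the sector the head will reach next on the current track.

    t_ref is the time at which sector 0 last passed under the head;
    settle_sectors is the margin the controller needs to set up the
    write. All numbers here are assumptions, not measured values.
    """
    rev_time = 60.0 / rpm                               # seconds per revolution
    angle = ((t_now - t_ref) % rev_time) / rev_time     # fraction of a turn
    current = int(angle * sectors_per_track)            # sector under the head now
    return (current + settle_sectors) % sectors_per_track

# Example: 15k RPM drive (4 ms/rev), 500 sectors/track. Halfway through a
# revolution the head is over sector 250; with a 10-sector setup margin
# the next reachable write position is sector 260.
print(next_write_sector(0.002, 0.0, 15000, 500, 10))
```

The hard part isn't this arithmetic, of course, but keeping `t_ref` synchronized with the drive, which is exactly the feedback the Solaris IO layering doesn't give you.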
I see the point that the filesystem itself is not the right place for
this kind of optimization.

> Meanwhile, though there are doubters, Intel's datasheet for the X25-V
> clearly states support for the ATA FLUSH CACHE feature. These can
> be bought for around $120 and can do 2,500 random write IOPS.
> http://download.intel.com/design/flash/nand/value/datashts/322736.pdf
> Similarly, for the X25-E:
> http://download.intel.com/design/flash/nand/extreme/319984.pdf

The datasheet states that they understand the command, yes. I haven't
tested it myself, but there are many indications on the net that they do
not honor it properly, at least for the X25-E. As for the 2,500 writes/s,
the datasheet says "up to", using a queue depth of 32 and utilizing the
write cache. Similarly, I just tested a Hitachi 15k disk to see how many
linear 4k writes I could issue, and it handled approx. 20,000 writes/s.
That is a completely useless number, because as soon as I insert cache
flushes it drops to 250/s (or 15k/minute, of course). Don't get me
wrong, I would be glad if SSDs kept their promises, it would save us a
lot of trouble, but I don't see that they are there yet.

> I think the effort is better spent making sure the SSD vendors do the
> right thing.

That might be true if I had any influence with Intel. I think this is
the responsibility of big companies like Oracle and NetApp. All I can do
is not buy broken hardware.

--
Arne

> -- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
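P.S. For anyone wanting to reproduce the 20,000-vs-250 writes/s gap: the only variable that matters is whether each write is forced to stable media. A rough sketch of the measurement in Python (the path is a placeholder; for a real test you would of course point it at the raw device, and `fsync` here stands in for write-plus-cache-flush):

```python
import os
import time

def write_iops(path, n_writes, block_size=4096, flush_each=False):
    """Time n_writes sequential block writes; if flush_each is set,
    force each write to stable storage with fsync, which is what
    triggers the cache flush that collapses the throughput."""
    buf = b"\0" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.monotonic()
        for _ in range(n_writes):
            os.write(fd, buf)
            if flush_each:
                os.fsync(fd)
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
    return n_writes / elapsed

# On the device under test, compare:
#   write_iops(dev_path, 10000)                   # cached: the useless number
#   write_iops(dev_path, 1000, flush_each=True)   # flushed: the honest number
```

On a drive that honors flushes, the second number should be bounded by rotational latency (250/s for a 15k disk); a suspiciously high flushed number is itself evidence the device is ignoring the flush.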