Richard Elling wrote:
> On May 26, 2010, at 8:38 AM, Neil Perrin wrote:
> 
>> On 05/26/10 07:10, sensille wrote:
>>> My idea goes as follows: don't write linearly. Track the rotation
>>> and write to the position the head will hit next. This might be done
>>> by a re-mapping layer or integrated into ZFS. This works only because
>>> ZIL devices are basically write-only. Reads from this device will be
>>> horribly slow.
>>>  
>> Yes, I agree this seems very appealing. I have investigated and
>> observed similar results. Just allocating larger intent log blocks but
>> only writing to say the first half of them has seen the same effect.
>> Despite the impressive results, we have not pursued this further mainly
>> because of its maintainability. There is quite a variance between
>> drives so, as mentioned, feedback profiling of the device is needed
>> in the working system. The layering of the Solaris IO subsystem doesn't
>> provide the feedback necessary and the ZIL code is layered on the SPA/DMU.
>> Still it should be possible. Good luck!
> 
> I agree.  If you search the literature, you will find many cases where
> people have tried to optimize file systems based on device geometry
> and all have ended up as roadkill.  File systems last much longer than
> the hardware and writing hardware-specific optimizations into the file
> system just doesn't make good sense.

I see the point that the filesystem itself is not the right place for this
kind of optimization.

> 
> Meanwhile, though there are doubters, Intel's datasheet for the X-25V
> clearly states support for the ATA FLUSH CACHE feature.  These can
> be bought for around $120 and can do 2,500 random write IOPS.
> http://download.intel.com/design/flash/nand/value/datashts/322736.pdf
> Similarly, for the X-25E
> http://download.intel.com/design/flash/nand/extreme/319984.pdf

The datasheet states that they understand the command, yes. I haven't
tested it myself, but there are many indications on the net that they do
not honor it properly, at least for the X-25E. As to the 2500 writes/s, the
datasheet says "up to", using a queue depth of 32 and utilizing the write
cache. Similarly, I just tested a Hitachi 15k disk to see how many linear
4k writes I can issue, and it can handle approx. 20000 writes/s. This is
a completely useless number, because as soon as I insert cache flushes it
drops down to 250/s (or 15k/minute, of course: one write per rotation).
Don't get me wrong, I would be glad if SSDs kept their
promises, it would save us a lot of trouble, but I don't think they are
there yet.
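
For anyone who wants to repeat this kind of comparison, here is a rough
sketch of the measurement, not the exact test I ran. The names
measure_write_rate and rotation_bound are made up for illustration. It
assumes that os.fsync() on a regular file is a reasonable stand-in for a
cache flush; whether the flush actually reaches the media depends on the
OS, the filesystem and the drive, which is exactly the point in question.

```python
import os
import tempfile
import time

BLOCK_SIZE = 4096  # 4k writes, as in the test described above


def rotation_bound(rpm):
    """Upper bound on flushed writes/s if each flush costs one full rotation."""
    return rpm / 60.0


def measure_write_rate(path, count, flush_each_write):
    """Issue `count` linear 4k writes to `path` and return writes per second.

    With flush_each_write=True, os.fsync() is called after every write, so
    the OS is asked to push the data to stable storage before continuing.
    """
    block = b"\0" * BLOCK_SIZE
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)
            if flush_each_write:
                os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return count / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "ziltest.bin")  # hypothetical file name
        cached = measure_write_rate(target, 2000, flush_each_write=False)
        flushed = measure_write_rate(target, 200, flush_each_write=True)
    print(f"cached:  {cached:.0f} writes/s")
    print(f"flushed: {flushed:.0f} writes/s")
    print(f"15k rpm rotation bound: {rotation_bound(15000):.0f} writes/s")
```

If the flushed rate comes out far above the rotation bound on a spinning
disk, the flush is probably being absorbed by a cache somewhere along
the path.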

> 
> I think the effort is better spent making sure the SSD vendors do the
> right thing.

That might be true if I had any influence with Intel. I think this is
the responsibility of big companies like Oracle and NetApp. All I can
do is not buy broken hardware.

--
Arne

>  -- richard
> 

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
