> Hi Jake,
> 
> Good to see it’s not just me.
> 
> I’m guessing that the fact you are doing 1MB writes means that the latency
> difference has a less noticeable impact on the overall write bandwidth.
> What I have been discovering with Ceph + iSCSI is that, due to all the extra
> hops (client -> iSCSI proxy -> primary OSD -> secondary OSD), you get a lot
> of latency serialisation, which dramatically impacts single-threaded IOPS at
> small IO sizes.
> 
> That makes sense.  I don't really understand how latency is going down if tgt
> is not really doing caching.
> 
> 
> A few days back I tested adding a tiny SSD write cache on the iSCSI proxy,
> and this had a dramatic effect in “hiding” the latency behind it from the
> client.
> 
> Nick
> 
> 
> After seeing your results, I've been considering experimenting with
> that.  Currently, my iSCSI proxy nodes are VMs.
> 
> I would like to build a few dedicated servers with fast SSDs or Fusion-io
> devices.  It depends on my budget; it's hard to justify getting a card that
> costs 10x the rest of the server...  I would run all my tgt instances in
> containers pointing to the rbd disk + cache device.  A Fusion-io device
> could support many tgt containers.
> I don't really want to go back to krbd.  I have a few RBDs that are format 2
> with striping, and there aren't any stable kernels that support that (or any
> kernels at all yet for "fancy striping").  I wish there were a way to
> incorporate a local cache device into tgt with librbd backends.
> 
> Jake

Hi Jake,

I spent a bit more time looking at this. The fastest solution was probably a
Supermicro 2U twin with a shared SAS backplane and a couple of dual-port SAS
SSDs (about £600 each), so that the cache could fail over between servers.

I also looked at doing DRBD with SSDs, but it looks like DRBD has latency
overheads of its own, and I'm not sure it would do enough better than Ceph to
make it worthwhile.
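
To put some rough numbers on why that per-hop latency serialisation hurts
single-threaded IOPS so much, here is a back-of-envelope sketch in Python.
All the per-hop latency figures are illustrative assumptions, not
measurements from my setup:

# Rough single-threaded IOPS estimate when per-hop latencies add up serially.
# All latency figures below are illustrative assumptions, not measurements.

hops_ms = {
    "client -> iSCSI proxy":              0.10,
    "proxy -> primary OSD":               0.50,
    "primary -> replica OSD (write ack)": 0.50,
    "OSD journal/commit":                 0.80,
}

total_ms = sum(hops_ms.values())   # latency of one small synchronous write
iops = 1000.0 / total_ms           # at queue depth 1, IOPS = 1 / latency

print("per-IO latency: %.2f ms" % total_ms)
print("single-threaded IOPS: ~%d" % iops)

# At ~1.9 ms per IO you only get ~525 IOPS at QD=1, regardless of how much
# aggregate bandwidth the cluster has, which is why hiding that latency
# behind a local SSD write cache makes such a difference.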

The other thing I noticed is that flashcache does everything with 4kB blocks,
so even if you write 1MB it issues 256 x 4kB IOs to the SSD; this requires
very low latency for the cache not to become a bottleneck itself. You can
limit the maximum IO size that gets cached, but I still feel it's a limiting
factor.
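
As a quick sanity check on why that 4kB splitting matters, here is a small
sketch; the per-IO SSD latency is an assumed figure, not a benchmark:

# How splitting everything into 4kB cache blocks turns one large write into
# many small ones.  The per-IO SSD latency is an assumption for illustration.

write_size_kb = 1024        # a single 1MB write from the client
cache_block_kb = 4          # the cache works in fixed 4kB blocks
ssd_io_latency_ms = 0.05    # assumed per-IO latency to the caching SSD

ios = write_size_kb // cache_block_kb
print("IOs issued to the SSD: %d" % ios)   # 256

# Even if the SSD overlaps some of these, any serialised portion adds up:
serialised_ms = ios * ssd_io_latency_ms
print("worst case (fully serialised): %.1f ms per 1MB write" % serialised_ms)
# 256 x 0.05 ms = 12.8 ms, enough to cap a single stream well below what the
# SSD's raw bandwidth would suggest.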

EnhanceIO, on the other hand, looks like it writes the actual IO size down to
the SSD, so it may not suffer from this problem.

I'm hoping to be able to invest in 10GbE cards for the ESXi hosts, and I'm
waiting for Hammer to see if there are any improvements.

Nick



