On Thursday, August 29, 2013, Corin Langosch wrote:
> Hi there,
>
> I read about how striping of rbd works at
> http://ceph.com/docs/next/man/8/rbd/ and it seems rather complex
> to me. As the individual objects are placed randomly over all osds taking
> crush into account anyway, what's the benefit over simply calculating
> object_id = (position / chunk_size).to_i or even faster with object_id =
> position >> order?

You get fewer disk hits for large streaming writes this way, since a
sufficiently large write will wrap around. But for small sequential IO
you're still spreading the load. However, by default it is just chunked as
you suggest. The more powerful striping is simply available if you
determine it benefits your application.

> I also wonder what object size is recommended for vm images? I assume the
> default of 4 MB is not optimal, something bigger like 64 MB would be much
> better as it'd require much fewer objects (less overhead on osds'
> filestores) and much fewer client-osds roundtrips (reads/writes from/to
> different rados objects) for most vm workloads? The distribution should
> still be ok, as most vm images are several GB and so still have several
> hundreds or thousands of objects with 64MB objects? Are there any
> benchmarks available for this? :)

I'm not aware of any. 4MB is just a nice, safe default and given the speeds
of everything involved there seemed little point in changing it.
-Greg

> Cheers,
> Corin
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Software Engineer #42 @ http://inktank.com | http://ceph.com
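
For anyone comparing the two layouts discussed in this thread, here is a rough sketch of the offset-to-object arithmetic. It is not taken from the Ceph source. The function and parameter names (simple_chunk, striped, object_count, stripe_unit, stripe_count, object_size) are illustrative assumptions, and the striped mapping follows the scheme as I understand it from the striping documentation: stripe units are written round-robin across stripe_count objects until those objects are full, then the next object set begins.

```python
def simple_chunk(offset, order=22):
    """Default layout: object number is offset >> order.

    order=22 corresponds to 4 MiB objects (2**22 bytes).
    Returns (object_no, offset_within_object).
    """
    return offset >> order, offset & ((1 << order) - 1)


def striped(offset, stripe_unit, stripe_count, object_size):
    """Striped layout (sketch, assuming object_size is a multiple of stripe_unit).

    Returns (object_no, offset_within_object).
    """
    if object_size % stripe_unit != 0:
        raise ValueError("object_size must be a multiple of stripe_unit")
    block_no = offset // stripe_unit          # which stripe unit overall
    stripe_no = block_no // stripe_count      # which row of stripe units
    stripe_pos = block_no % stripe_count      # which object within the set
    stripes_per_object = object_size // stripe_unit
    object_set = stripe_no // stripes_per_object
    object_no = object_set * stripe_count + stripe_pos
    object_off = (stripe_no % stripes_per_object) * stripe_unit + offset % stripe_unit
    return object_no, object_off


def object_count(image_size, object_size):
    """Number of objects needed to back an image (ceiling division)."""
    return -(-image_size // object_size)


KiB, MiB, GiB = 1 << 10, 1 << 20, 1 << 30

# With stripe_count=1 and stripe_unit == object_size, striping reduces to
# the plain chunking from the question.
same = striped(5 * MiB + 123, 4 * MiB, 1, 4 * MiB) == simple_chunk(5 * MiB + 123)

# With 64 KiB stripe units over 4 objects, consecutive 64 KiB blocks land on
# objects 0, 1, 2, 3 and then wrap around to object 0 again.
wrap = [striped(i * 64 * KiB, 64 * KiB, 4, 4 * MiB)[0] for i in range(6)]
```

Under these assumptions, a 10 GiB image needs object_count(10 * GiB, 4 * MiB) objects with the default size versus object_count(10 * GiB, 64 * MiB) with 64 MiB objects, which is the trade-off the second question is about.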