1a) The Ceph documentation on OpenStack integration makes a big (and
valuable) point that cloning images should be nearly instantaneous thanks
to the copy-on-write functionality. See "Boot from volume" at the bottom of
http://ceph.com/docs/master/rbd/rbd-openstack/. Here's the excerpt:

When Glance and Cinder are both using Ceph block devices, the image is a
copy-on-write clone, so volume creation is very fast.

However, is this true *only* if we are using btrfs as the underlying file
system for the OSDs? If so, then I don't think we can get this nice "quick"
cloning, since the Ceph documentation states all over the place that btrfs
is not yet production ready.

1b) Ceph also describes snapshotting/layering as being super quick due to
"copy on write". See http://ceph.com/docs/master/rbd/rbd-snapshot/.

Does this feature also depend on btrfs being used as the underlying
filesystem for the OSDs?
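
For reference, here is roughly the workflow I'm referring to, as I
understand it from those docs (the pool, image, and snapshot names below
are invented for illustration; note that cloning requires format 2 images):

    # create a format 2 base image, snapshot it, protect the snapshot, clone it
    rbd create --size 10240 --image-format 2 rbd/base-image
    rbd snap create rbd/base-image@base-snap
    rbd snap protect rbd/base-image@base-snap
    rbd clone rbd/base-image@base-snap rbd/cow-clone

The clone step appears to return almost immediately because no data is
copied up front -- which is exactly why I'm asking whether that behavior
depends on the OSDs' underlying filesystem.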

2) If we have about 10 TB of data to transfer to Ceph (the initial
migration), would all 10 TB pass through the journals? If so, would it make
sense to initially put the journals on a separate partition of each disk
(instead of an SSD), and then, once the 10 TB have been copied, change the
Ceph configuration to use SSDs for journaling instead? That way, we don't
"kill" (or significantly reduce) the SSDs' life expectancy on day 1. (It's
OK if the initial migration takes longer without the SSDs -- and I'm not
sure it would take more than twice as long anyway....)
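
To make the question concrete, the two journal placements I have in mind
would look something like this in ceph.conf (paths are illustrative only,
not a recommendation):

    [osd]
    # phase 1: journal on a partition/file of each OSD's own data disk
    # (this is the default location, I believe)
    osd journal = /var/lib/ceph/osd/$cluster-$id/journal
    osd journal size = 10240    # in MB

    # phase 2, after the 10 TB migration: point each OSD at an SSD
    # partition instead, e.g. (device path hypothetical):
    # osd journal = /dev/disk/by-partlabel/journal-$id

If I've read the man pages correctly, the switch itself would be: stop the
OSD, run "ceph-osd -i <n> --flush-journal", update the path, run
"ceph-osd -i <n> --mkjournal", and restart -- but corrections welcome.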

3) The Ceph documentation recommends multiple networks (front-side and
back-side). I was wondering, though, which is "better": one large bonded
interface of 6 x 1 Gb/s = 6 Gb/s, or two or three bonded interfaces, each
of which would only be 2 or 3 Gb/s? My initial instinct is to just go for
the nice fat 6 Gb/s one, since I'm not worried about denial-of-service
(DoS) attacks on my internal network, and I figure this way I'll get
excellent performance *most* of the time, with some (minor?) risk that
occasionally a client request may experience latency due to network
traffic from back-end activities like replication. (My replication level
will most likely be 2.)
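
For context, the two-network setup I'd be giving up is the standard one
from the docs, i.e. something like this in ceph.conf (subnets invented for
illustration):

    [global]
    public network  = 192.168.1.0/24   # front-side: client <-> OSD/monitor traffic
    cluster network = 192.168.2.0/24   # back-side: replication and recovery traffic

versus a single fat bonded interface carrying both kinds of traffic on one
subnet.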

4a) Regarding bonding: if I understood the Ceph architecture correctly, any
client request will automatically be routed to the individual OSDs that
contain a piece (a stripe) of the overall object being sought. So a single
client request for an object could generate "n" requests to "n" OSDs.
Since the OSDs (in a perfect world) will reside equally on all servers, the
normal hashing algorithm that Linux and LACP switches use should balance
these "n" requests across "m" physical ethernet ports. So if I have 6
ethernet ports per server and, say, 6 servers, then in a perfect world my
"n" requests would use all 6 ethernet ports. (In the real world, I imagine
the hashing is not perfect, so maybe only 4 ethernet ports get used and the
other two do nothing....) Is this understanding correct? If so, normal LACP
hashing -- along the lines of the sketch below -- should suffice for my
needs.
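
By "normal LACP hashing" I mean a standard Linux 802.3ad bond such as the
following (a Debian-style ifenslave/ifupdown sketch; interface names and
addresses are made up):

    # /etc/network/interfaces fragment: six 1 Gb/s NICs in one LACP bond
    auto bond0
    iface bond0 inet static
        address 192.168.1.10
        netmask 255.255.255.0
        bond-slaves eth0 eth1 eth2 eth3 eth4 eth5
        bond-mode 802.3ad
        bond-miimon 100
        # layer3+4 hashes on IP and port, so the "n" OSD connections
        # should spread across the physical ports better than layer2
        bond-xmit-hash-policy layer3+4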

4b) A variation of the above question: if the 6 servers I have are NOT of
equal size, such that the storage distributions are 24 TB, 16 TB, 12 TB,
6 TB, 4 TB, and 4 TB (for a total of 66 TB of hard disk across all
servers) -- would it be reasonable to assume that Ceph would balance any
object's data roughly proportionally to the size of each server? (You can
assume that the CRUSH setup is just the default that comes with
ceph-deploy, and that each server typically has 6 to 8 disks.) So a 1 TB
VM, for example, would be split roughly 24/66 on server 1, 16/66 on server
2, 12/66 on server 3, 6/66 on server 4, and 4/66 each on servers 5 and 6?
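
My assumption here is that ceph-deploy sets each OSD's CRUSH weight roughly
proportional to its capacity in TB, so each host's share follows its total
weight. If that's right, the actual split should be visible with the
following (the osd id in the reweight example is hypothetical):

    # show the CRUSH hierarchy and the per-host / per-OSD weights
    ceph osd tree
    # weights can also be adjusted by hand if the defaults are off, e.g.:
    ceph osd crush reweight osd.0 2.0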



-- 
Gautam Saxena
President & CEO
Integrated Analysis Inc.

Making Sense of Data.™
Biomarker Discovery Software | Bioinformatics Services | Data Warehouse
Consulting | Data Migration Consulting
www.i-a-inc.com
gsax...@i-a-inc.com
(301) 760-3077  office
(240) 479-4272  direct
(301) 560-3463  fax