----- Message from Brian Candler <b.cand...@pobox.com> ---------
   Date: Thu, 03 Apr 2014 14:44:13 +0100
   From: Brian Candler <b.cand...@pobox.com>
Subject: [ceph-users] PGID query
     To: ceph-us...@ceph.com


I'm having trouble understanding the description of Placement Group IDs at http://ceph.com/docs/master/architecture/

There it says:

...
2. CRUSH takes the object ID and hashes it.
3. CRUSH calculates the hash modulo the number of OSDs. (e.g., 0x58) to get a PG ID.
...

That seems to imply that the number of placement groups is the same as the number of OSDs. Should it instead say "the hash modulo the number of Placement Groups in the pool"?

Yes indeed. I was also confused by that some time ago:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-November/005878.html

Later on, under REBALANCING, the diagram shows two OSDs with 10 PGs spread across them.

I found http://ceph.com/docs/firefly/dev/placement-group/ which says
...
obj_hash = hash(locator)
pg = obj_hash % num_pg
...

so I read that as confirming the hash is modulo number of placement groups, not number of OSDs.
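
A minimal Python sketch of that reading, for illustration only. The function name object_to_pg is made up here, and zlib.crc32 is just a stand-in hash; Ceph uses its own hash function and its exact mapping may differ in detail. The point is only that the number of OSDs never appears in the calculation:

```python
import zlib


def object_to_pg(locator: str, num_pg: int) -> int:
    """Map an object locator to a PG number (hypothetical helper).

    zlib.crc32 is an assumption standing in for Ceph's real hash.
    """
    obj_hash = zlib.crc32(locator.encode("utf-8"))
    # Modulo the number of placement groups in the pool, not the OSD count.
    return obj_hash % num_pg


if __name__ == "__main__":
    num_pg = 128  # illustrative pool size
    for name in ("foo", "bar", "baz"):
        print(name, "-> pg", object_to_pg(name, num_pg))
```

With this shape, a pool can have 128 PGs on a cluster of 2 OSDs or of 200; the PG ID of an object depends only on its name and num_pg.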

Looking at the remaining pseudocode:

...
osds_for_pg = crush(pg)  # returns a list of OSDs
primary = osds_for_pg[0]
replicas = osds_for_pg[1:]
...

This implies that each PG has a fixed set of OSDs (until the crush map changes).
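
To make that concrete, here is a toy Python stand-in for crush(pg). It is NOT the CRUSH algorithm: it ignores buckets, weights and rules, and simply ranks OSDs by a per-(pg, osd) hash (rendezvous hashing). The names toy_hash and toy_crush are invented for this sketch. It only shows the property in question: for a given set of OSDs, a PG always maps to the same ordered list.

```python
import hashlib
from typing import List, Sequence


def toy_hash(*parts: object) -> int:
    """Deterministic 64-bit hash of the given parts (hypothetical helper)."""
    data = ":".join(str(p) for p in parts).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def toy_crush(pg: int, osd_ids: Sequence[int], replicas: int) -> List[int]:
    """Stand-in for crush(pg): NOT real CRUSH, just rendezvous hashing.

    Ranks every OSD by a score derived from (pg, osd) and returns the
    top `replicas` OSD ids.
    """
    ranked = sorted(osd_ids, key=lambda osd: toy_hash(pg, osd), reverse=True)
    return ranked[:replicas]


if __name__ == "__main__":
    osds = [0, 1, 2, 3, 4, 5]  # illustrative OSD ids
    osds_for_pg = toy_crush(pg=88, osd_ids=osds, replicas=3)
    primary = osds_for_pg[0]
    replicas = osds_for_pg[1:]
    print("pg 88 -> primary", primary, "replicas", replicas)
```

Calling toy_crush again with the same arguments gives the same list; the result can only change when the list of OSDs (the toy equivalent of the crush map) changes.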

The paper at http://ceph.com/papers/weil-crush-sc06.pdf doesn't mention 'placement group' at all, as far as I can see. It talks about "buckets" and "devices". So the input to the CRUSH algorithm is the PG ID, not the object ID, is that right?
As far as I understand it(!):
CRUSH first calculates the PG ID from the object ID, and the PG ID is then mapped to a set of OSDs. I think 'osds_for_pg = crush(pg)' is the core of CRUSH, but the whole procedure is also called CRUSH. Or am I wrong?

Brian.


----- End message from Brian Candler <b.cand...@pobox.com> -----

--

Met vriendelijke groeten,
Kenneth Waegeman

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
