Hi folks,

I've come across an issue that I found a "fix" for, but I'm not sure whether
it's correct or whether there is some other misconfiguration on my end and this
is merely a symptom. I'd appreciate any insights anyone can offer based on the
information below, and I'm happy to provide more details as necessary.

Summary: a fresh install of Ceph 0.80.5 (Firefly) comes up with all pgs marked
active+degraded. This reproduces on Ubuntu 12.04 as well as CentOS 7 with a
varying number of OSD hosts (1, 2, or 3), where each OSD host has four storage
drives. The configuration file sets the default replica size to 2 and the crush
chooseleaf type to 0 (osd), i.e. replicas may be placed on different OSDs of
the same host. Specific snippet:

[global]
...
osd pool default size = 2
osd crush chooseleaf type = 0
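
In case it helps to reproduce, the effective values can be double-checked on
the running cluster with something like the following (the pool name "rbd" and
the mon id "a" are just from my test setup):

  # confirm the default pools actually got size 2
  ceph osd dump | grep 'replicated size'
  ceph osd pool get rbd size

  # on the mon host: confirm the running mon picked up the chooseleaf setting
  ceph daemon mon.a config show | grep chooseleaf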

I verified that the crush rules were as expected:
"rules": [
{ "rule_id": 0,
"rule_name": "replicated_ruleset",
"ruleset": 0,
"type": 1,
"min_size": 1,
"max_size": 10,
"steps": [
{ "op": "take",
"item": -1,
"item_name": "default"},
{ "op": "choose_firstn",
"num": 0,
"type": "osd"},
{ "op": "emit"}]}],

Inspecting the pg dump, I observed that every pg had only a single OSD in its
up/acting set. That seemed to explain why the pgs were degraded, but it was
unclear to me why a second OSD wasn't being chosen.

After trying a variety of things, I noticed that the default tunables differ
between Emperor (which works fine in these configurations) and Firefly:
Firefly comes up with the bobtail profile, in which choose_local_fallback_tries
is 0, whereas it used to default to 5 on Emperor. Sure enough, if I modify my
crush map and set that parameter to a non-zero value, the cluster remaps and
goes healthy with all pgs active+clean. The workaround is sketched below.
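
For reference, the workaround looks roughly like this, picking up the
decompiled map from above (5 is just the old Emperor default; any non-zero
value was enough in my case):

  # in /tmp/crush.txt, change the tunable near the top of the map:
  #   tunable choose_local_fallback_tries 0
  # to something non-zero, e.g. the old Emperor default:
  #   tunable choose_local_fallback_tries 5

  # recompile and inject the modified map, then watch the pgs peer
  crushtool -c /tmp/crush.txt -o /tmp/crush.new
  ceph osd setcrushmap -i /tmp/crush.new
  ceph -s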

The documentation states that the optimal value of choose_local_fallback_tries
for Firefly is 0, so I'd like to get a better understanding of this parameter
and why changing it from the default moves the pgs to a clean state in my
scenarios.

Thanks,
Ripal