Hi all,
first, thank you all for your answers. I will try to respond to everyone
and to everything.
First, here is the output of ceph osd dump | grep pool:
pool 0 'data' replicated size 2 min_size 2 crush_ruleset 0 object_hash
rjenkins pg_num 100 pgp_num 64 last_change 80 owner 0 flags hashpspool
crash_replay_interval 45 stripe_width 0
pool 1 'metadata' replicated size 2 min_size 2 crush_ruleset 0
object_hash rjenkins pg_num 64 pgp_num 64 last_change 32 owner 0 flags
hashpspool stripe_width 0
pool 2 'rbd' replicated size 2 min_size 2 crush_ruleset 0 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 34 owner 0 flags hashpspool
stripe_width 0
min_size is set to 2, so this may be the problem. I will change it
and try again. It seems the defaults are not right for my setup. I'm using
the 'data' pool.
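The change I plan to make is just the standard pool setting (the same
command works for the 'metadata' and 'rbd' pools if needed):
ceph osd pool set data min_size 1
As far as I understand, with min_size 1 a PG stays active with a single
surviving copy, at the cost of accepting writes on only one replica while
the other node is down.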
This is the crush map, modified to add a rack bucket. Nothing changes.
# buckets
host blue-compute {
        id -2           # do not change unnecessarily
        # weight 0.450
        alg straw
        hash 0          # rjenkins1
        item osd.0 weight 0.450
}
host red-compute {
        id -3           # do not change unnecessarily
        # weight 0.450
        alg straw
        hash 0          # rjenkins1
        item osd.1 weight 0.450
}
rack rack-1 {
        id -4           # do not change unnecessarily
        # weight 3.000
        alg straw
        hash 0          # rjenkins1
        item blue-compute weight 0.450
        item red-compute weight 0.450
}
root default {
        id -1           # do not change unnecessarily
        # weight 0.900
        alg straw
        hash 0          # rjenkins1
        item rack-1 weight 1.000
}
# rules
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
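For completeness, the cycle I use to edit and re-inject the map is the
usual one (the file names are just my local working copies):
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# edit crushmap.txt, then recompile and load it
crushtool -c crushmap.txt -o crushmap-new.bin
ceph osd setcrushmap -i crushmap-new.bin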
-----------------------
JC ->
> Recommendation: for testing efficiently and most options available,
> functionally speaking, deploy a cluster with 3 nodes, 3 OSDs each;
> that is my best practice.
Yes, this would be nice, but for now I just have two servers. I will add
several disks to each one: I planned on about 5 each, with a total of
about 3 TB per server. I want to evaluate the overhead of Ceph to see if
I can run some KVM guests on top of it, but that is another story.
Since I only have 2 servers, the cluster was not able to reach
active+clean. This is because the default pools were set to 3 replicas,
so I changed the setting to 2 replicas (commands below). I want to switch
from RAID to Ceph so that I have fault tolerance.
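For anyone following along, the replica count can be changed per pool
with the standard command (pool names are the defaults from the dump
above):
ceph osd pool set data size 2
ceph osd pool set metadata size 2
ceph osd pool set rbd size 2
For pools created later, setting "osd pool default size = 2" in the
[global] section of ceph.conf has the same effect.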
I will scale to 3 nodes later, so that I also have 3 mons and can keep
quorum; I tried with 2 mons and that was not possible (with only 2
monitors, losing either one breaks the majority).
> Do you have chooseleaf type host or type node in your crush map?
Yep, I was looking for this also.
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
The idea is to have the same number and size of OSDs on both hosts so
that everything is correctly replicated to both nodes.
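To double-check that a PG really ends up with one copy on each host, the
mapping can be queried directly (0.1 is just an example PG id):
ceph osd tree
ceph pg map 0.1
The second command prints the up/acting OSD set for that PG, which should
contain both osd.0 and osd.1 while both hosts are up.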
Any help?
On Sat, Apr 19, 2014 at 6:53, Michael J. Kidd
<michael.k...@inktank.com> wrote:
You may also want to check your 'min_size'... if it's 2, then you'll
be incomplete even with 1 complete copy.
ceph osd dump | grep pool
You can reduce the min size with the following syntax:
ceph osd pool set <poolname> min_size 1
Thanks,
Michael J. Kidd
Sent from my mobile device. Please excuse brevity and typographical
errors.
On Apr 19, 2014 12:50 PM, "Jean-Charles Lopez" <jc.lo...@inktank.com>
wrote:
Hi again
Looked at your ceph -s.
You have only 2 OSDs, one on each node. The default replica count is
2, and the default crush map puts each replica on a different host, or
maybe you set it to 2 different OSDs. Anyway, when one of your OSDs
goes down, Ceph can no longer find another OSD to host the second
replica it must create.
Looking at your crush map we would know better.
Recommendation: for testing efficiently and most options available,
functionally speaking, deploy a cluster with 3 nodes, 3 OSDs each;
that is my best practice.
Or make 1 node with 3 OSDs, modifying your crush map to "choose type
osd" in your rulesets.
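For example, such a rule, assuming the default root and a free ruleset
id, could look roughly like this:
rule replicated_osd {           # name and id are just examples
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type osd
        step emit
}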
JC
On Saturday, April 19, 2014, Gonzalo Aguilar Delgado
<gagui...@aguilardelgado.com> wrote:
Hi,
I'm building a cluster where two nodes replicate objects between them. I
found that shutting down just one of the nodes (the second one) makes
everything "incomplete".
I cannot find out why, since the crush map looks good to me.
After shutting down one node:
cluster 9028f4da-0d77-462b-be9b-dbdf7fa57771
health HEALTH_WARN 192 pgs incomplete; 96 pgs stuck inactive;
96 pgs stuck unclean; 1/2 in osds are down
monmap e9: 1 mons at {blue-compute=172.16.0.119:6789/0},
election epoch 1, quorum 0 blue-compute
osdmap e73: 2 osds: 1 up, 2 in
pgmap v172: 192 pgs, 3 pools, 275 bytes data, 1 objects
7552 kB used, 919 GB / 921 GB avail
192 incomplete
Both nodes have a WD Caviar Black 500 GB disk with a btrfs filesystem on
it; the full disk is used.
I cannot understand why it does not replicate to both nodes.
Can someone help?
Best regards,
--
Sent while moving
Pardon my French and any spelling &| grammar glitches
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com