On 14.12.2017 18:34, James Okken wrote:
Hi all,
Please let me know if I am missing steps or using the wrong steps.
I'm hoping to expand my small CEPH cluster by adding 4TB hard drives to each of
the 3 servers in the cluster.
I also need to change my replication factor from 1 to 3.
This is part of an Openstack environment deployed by Fuel and I had foolishly
set my replication factor to 1 in the Fuel settings before deploy. I know this
would have been done better at the beginning. I do want to keep the current
cluster and not start over. I know this is going to thrash my cluster for a while
replicating, but there isn't too much data on it yet.
To start, I need to safely turn off each CEPH server and add the 4TB drive.
To do that I am going to run:
ceph osd set noout
systemctl stop ceph-osd@1 (or 2 or 3 on the other servers)
ceph osd tree (to verify it is down)
poweroff, install the 4TB drive, bootup again
ceph osd unset noout
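The sequence above could be wrapped in a small script, sketched here under some assumptions. The `run` helper and the `DRY_RUN` variable are illustrative and not part of Ceph; with `DRY_RUN=1` (the default) the commands are only printed. The OSD id is passed in by the caller; note that the `osd df` output in this mail lists ids 0, 1 and 2, so the id should be checked against `ceph osd tree` on each server.

```shell
#!/bin/sh
# Sketch only: run() and DRY_RUN are hypothetical helpers, not Ceph features.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"          # print the command instead of executing it
    else
        "$@"
    fi
}

# Before powering off: stop the cluster from marking the stopped OSD "out",
# which would otherwise trigger rebalancing during the maintenance window.
pre_maintenance() {
    osd_id="$1"
    run ceph osd set noout
    run systemctl stop "ceph-osd@${osd_id}"
    run ceph osd tree        # check that osd.<id> is reported down
}

# After the server is back up and the OSD has rejoined the cluster.
post_maintenance() {
    run ceph osd tree        # check that every OSD is reported up again
    run ceph osd unset noout
}
```

For example, `pre_maintenance 1` before the poweroff and `post_maintenance` after the reboot; set `DRY_RUN=0` to actually execute the commands.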
Next step would be to get CEPH to use the 4TB drives. Each CEPH server already
has a 836GB OSD.
ceph> osd df
ID WEIGHT  REWEIGHT SIZE  USE  AVAIL %USE  VAR  PGS
 0 0.81689  1.00000  836G 101G  734G 12.16 0.90 167
 1 0.81689  1.00000  836G 115G  721G 13.76 1.02 166
 2 0.81689  1.00000  836G 121G  715G 14.49 1.08 179
              TOTAL 2509G 338G 2171G 13.47
MIN/MAX VAR: 0.90/1.08  STDDEV: 0.97
ceph> df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    2509G     2171G         338G         13.47
POOLS:
    NAME        ID     USED     %USED     MAX AVAIL     OBJECTS
    rbd         0         0         0         2145G           0
    images      1      216G      9.15         2145G       27745
    backups     2         0         0         2145G           0
    volumes     3      114G      5.07         2145G       29717
    compute     4         0         0         2145G           0
Once I get the 4TB drive into each CEPH server, should I look at growing the
current OSD (i.e. to 4836GB)?
Or should I create a second 4000GB OSD on each CEPH server?
If I am going to create a second OSD on each CEPH server I hope to use this doc:
http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/
As far as changing the replication factor from 1 to 3:
Here are my pools now:
ceph osd pool ls detail
pool 0 'rbd' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
pool 1 'images' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 116 flags hashpspool stripe_width 0
        removed_snaps [1~3,b~6,12~8,20~2,24~6,2b~8,34~2,37~20]
pool 2 'backups' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 7 flags hashpspool stripe_width 0
pool 3 'volumes' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 73 flags hashpspool stripe_width 0
        removed_snaps [1~3]
pool 4 'compute' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 34 flags hashpspool stripe_width 0
I plan on using these steps I saw online:
ceph osd pool set rbd size 3
ceph -s (Verify that replication completes successfully)
ceph osd pool set images size 3
ceph -s
ceph osd pool set backups size 3
ceph -s
ceph osd pool set volumes size 3
ceph -s
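The per-pool steps above could be expressed as a loop, sketched below. The `run`, `wait_for_health_ok` and `set_pool_size` helpers and the `DRY_RUN` variable are hypothetical names introduced for illustration. The list above does not mention the `compute` pool that appears in the pool listing; the usage example includes it on the assumption that all five pools should be replicated.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are illustrative, not part of Ceph.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"          # print the command instead of executing it
    else
        "$@"
    fi
}

# Poll `ceph health` until the cluster reports HEALTH_OK.
# Flags such as noout keep the cluster in HEALTH_WARN, so this is meant to
# be used with those flags unset.
wait_for_health_ok() {
    until ceph health | grep -q '^HEALTH_OK'; do
        sleep 30
    done
}

# usage: set_pool_size SIZE POOL [POOL...]
# Changes one pool at a time and waits for replication to settle in between.
set_pool_size() {
    size="$1"; shift
    for pool in "$@"; do
        run ceph osd pool set "$pool" size "$size"
        [ "$DRY_RUN" = "1" ] || wait_for_health_ok
    done
}
```

For example, `set_pool_size 3 rbd images backups volumes compute` prints the five commands in dry-run mode; with `DRY_RUN=0` it applies them one pool at a time.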
please let me know any advice or better methods...
You normally want each drive to be its own OSD. It is the number of
OSDs that gives Ceph its scalability, so more OSDs = more aggregate
performance. The only exception is if you are limited by something like CPU
or RAM and must limit the OSD count because of that.
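As a rough sketch of the one-OSD-per-drive approach: which tool creates the OSD depends on the Ceph release. The `crush_ruleset` field in the pool listing suggests a pre-Luminous release, where `ceph-disk` was the usual tool; Luminous and later provide `ceph-volume`. The device path `/dev/sdb`, the `add_osd_on_drive` function, the `run` helper and `DRY_RUN` are all assumptions for illustration, and a Fuel-managed node may have its own procedure that should take precedence.

```shell
#!/bin/sh
# Sketch only: helper names, DRY_RUN and the device path are hypothetical.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"          # print the command instead of executing it
    else
        "$@"
    fi
}

# usage: add_osd_on_drive DEVICE [ceph-disk|ceph-volume]
# Turns a whole, empty drive into a new OSD. This destroys data on DEVICE.
add_osd_on_drive() {
    dev="$1"
    tool="${2:-ceph-disk}"
    case "$tool" in
        ceph-volume)
            run ceph-volume lvm create --data "$dev"
            ;;
        ceph-disk)
            run ceph-disk prepare "$dev"
            run ceph-disk activate "${dev}1"   # data partition created by prepare
            ;;
        *)
            echo "unknown tool: $tool" >&2
            return 1
            ;;
    esac
    run ceph osd tree        # the new OSD should appear under this host
}
```

For example, `add_osd_on_drive /dev/sdb` on each server once the 4TB drive is visible to the OS, after confirming the device name with `lsblk`.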
Also remember to raise your min_size from 1 to the default of 2. With min_size 1
your cluster will accept writes with only a single operational OSD, and if
that one fails you will have data loss, corruption, and inconsistencies.
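A minimal sketch of the min_size change, assuming the pool's size has already been raised to 3 (min_size should not exceed size, so the order matters). The `set_pool_min_size` and `run` helpers and `DRY_RUN` are hypothetical names; the `ceph osd pool set` and `ceph osd pool get` subcommands are the standard ones.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are illustrative, not part of Ceph.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"          # print the command instead of executing it
    else
        "$@"
    fi
}

# usage: set_pool_min_size MIN_SIZE POOL [POOL...]
# Sets min_size on each pool, then reads back size and min_size to confirm.
set_pool_min_size() {
    min="$1"; shift
    for pool in "$@"; do
        run ceph osd pool set "$pool" min_size "$min"
        run ceph osd pool get "$pool" size
        run ceph osd pool get "$pool" min_size
    done
}
```

For example, `set_pool_min_size 2 rbd images backups volumes compute`.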
You might also consider raising your size and min_size before taking down
an OSD, since the PGs on that OSD will obviously be unavailable while it is down,
and you may want the extra redundancy before shaking the tree.
With a maximum usage of 15% on the most-used OSD you should have the space for it.
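The space estimate can be checked with a little arithmetic, sketched below using the figures from the `ceph df` output in this thread (338G raw used of 2509G raw, at size 1). The `projected_use_pct` function is a hypothetical helper. It assumes raw usage scales linearly with the replication factor and ignores the capacity the new 4TB drives will add.

```shell
#!/bin/sh
# Sketch only: projected_use_pct is an illustrative helper, not a Ceph tool.
# usage: projected_use_pct RAW_USED_GB RAW_TOTAL_GB OLD_SIZE NEW_SIZE
# Prints the projected raw usage in percent after changing the pool size.
projected_use_pct() {
    LC_ALL=C awk -v used="$1" -v total="$2" -v old="$3" -v new="$4" \
        'BEGIN { printf "%.1f\n", used / old * new / total * 100 }'
}

# Figures from the `ceph df` output above: 338G used, 2509G total, size 1 -> 3.
projected_use_pct 338 2509 1 3    # prints 40.4
```

About 40% raw usage on the existing three OSDs is well under the default nearfull warning ratio of 85%, which is consistent with the advice that there is room to raise the size before adding the new drives.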
good luck
Ronny Aasen
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com