Hi Marcus,
as a quick workaround you could perhaps increase mon_osd_full_ratio.
What values do you currently have?
Please post the output of the following (run on host ceph1, since that is where the osd.0 admin socket lives):
ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep full_ratio
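If those are still at the defaults (0.85 nearfull / backfill-full, 0.95 full), a temporary bump could look roughly like this. Only a sketch, the values are examples, and you should lower them again once the backfill has finished (Hammer still accepts these commands as far as I know):

ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.90'   # let backfill continue on OSDs above 85% used
ceph pg set_nearfull_ratio 0.90                               # raise the cluster-wide nearfull warning threshold
ceph pg set_full_ratio 0.97                                   # raise the cluster-wide full threshold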
After that, it would also be helpful to have two OSDs on all hosts...
Udo
On 01.11.2016 20:14, Marcus Müller wrote:
> Hi all,
>
> I have a big problem and I really hope someone can help me!
>
> We have been running a Ceph cluster for a year now. The version is 0.94.7
> (Hammer).
> Here is some info:
>
> Our osd tree is:
>
> ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 26.67998 root default
> -2 3.64000 host ceph1
> 0 3.64000 osd.0 up 1.00000 1.00000
> -3 3.50000 host ceph2
> 1 3.50000 osd.1 up 1.00000 1.00000
> -4 3.64000 host ceph3
> 2 3.64000 osd.2 up 1.00000 1.00000
> -5 15.89998 host ceph4
> 3 4.00000 osd.3 up 1.00000 1.00000
> 4 3.59999 osd.4 up 1.00000 1.00000
> 5 3.29999 osd.5 up 1.00000 1.00000
> 6 5.00000 osd.6 up 1.00000 1.00000
>
> ceph df:
>
> GLOBAL:
> SIZE AVAIL RAW USED %RAW USED
> 40972G 26821G 14151G 34.54
> POOLS:
> NAME ID USED %USED MAX AVAIL OBJECTS
> blocks 7 4490G 10.96 1237G 7037004
> commits 8 473M 0 1237G 802353
> fs 9 9666M 0.02 1237G 7863422
>
> ceph osd df:
>
> ID WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR
> 0 3.64000 1.00000 3724G 3128G 595G 84.01 2.43
> 1 3.50000 1.00000 3724G 3237G 487G 86.92 2.52
> 2 3.64000 1.00000 3724G 3180G 543G 85.41 2.47
> 3 4.00000 1.00000 7450G 1616G 5833G 21.70 0.63
> 4 3.59999 1.00000 7450G 1246G 6203G 16.74 0.48
> 5 3.29999 1.00000 7450G 1181G 6268G 15.86 0.46
> 6 5.00000 1.00000 7450G 560G 6889G 7.52 0.22
> TOTAL 40972G 14151G 26820G 34.54
> MIN/MAX VAR: 0.22/2.52 STDDEV: 36.53
>
>
> Our current cluster state is:
>
> health HEALTH_WARN
> 63 pgs backfill
> 8 pgs backfill_toofull
> 9 pgs backfilling
> 11 pgs degraded
> 1 pgs recovering
> 10 pgs recovery_wait
> 11 pgs stuck degraded
> 89 pgs stuck unclean
> recovery 8237/52179437 objects degraded (0.016%)
> recovery 9620295/52179437 objects misplaced (18.437%)
> 2 near full osd(s)
> noout,noscrub,nodeep-scrub flag(s) set
> monmap e8: 4 mons at
> {ceph1=192.168.10.3:6789/0,ceph2=192.168.10.4:6789/0,ceph3=192.168.10.5:6789/0,ceph4=192.168.60.6:6789/0}
> election epoch 400, quorum 0,1,2,3 ceph1,ceph2,ceph3,ceph4
> osdmap e1774: 7 osds: 7 up, 7 in; 84 remapped pgs
> flags noout,noscrub,nodeep-scrub
> pgmap v7316159: 320 pgs, 3 pools, 4501 GB data, 15336 kobjects
> 14152 GB used, 26820 GB / 40972 GB avail
> 8237/52179437 objects degraded (0.016%)
> 9620295/52179437 objects misplaced (18.437%)
> 231 active+clean
> 61 active+remapped+wait_backfill
> 9 active+remapped+backfilling
> 6 active+recovery_wait+degraded+remapped
> 6 active+remapped+backfill_toofull
> 4 active+recovery_wait+degraded
> 2 active+remapped+wait_backfill+backfill_toofull
> 1 active+recovering+degraded
> recovery io 11754 kB/s, 35 objects/s
> client io 1748 kB/s rd, 249 kB/s wr, 44 op/s
>
>
> My main problems are:
>
> - As you can see from the osd tree, we have three separate hosts with
> only one OSD each, and another host with four OSDs. Ceph just will not
> move the data off these three nodes with only one HDD, which are all
> near full. I tried to set the weight of the OSDs in the bigger node
> higher, but that just does not work. So I added a new OSD yesterday,
> which, as you can see, did not make things better. What do I have to do
> to get these three nodes empty again and put more data on the other
> node with the four HDDs?
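>
> A sketch of the kind of weight changes I mean (the OSD ids and values
> here are only examples):
>
> ceph osd crush reweight osd.3 5.0   # raise the CRUSH weight of an OSD on the big node
> ceph osd reweight 0 0.8             # or lower the override weight of one of the full OSDs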
>
> - I added the "ceph4" node later, which resulted in a strange IP change,
> as you can see in the mon list. The public network and the cluster
> network were swapped or not assigned correctly. See ceph.conf:
>
> [global]
> fsid = xxx
> mon_initial_members = ceph1
> mon_host = 192.168.10.3, 192.168.10.4, 192.168.10.5, 192.168.10.11
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> filestore_xattr_use_omap = true
> public_network = 192.168.60.0/24
> cluster_network = 192.168.10.0/24
> osd pool default size = 3
> osd pool default min size = 1
> osd pool default pg num = 128
> osd pool default pgp num = 128
> osd recovery max active = 50
> osd recovery threads = 3
> mon_pg_warn_max_per_osd = 0
>
> What can I do in this case (it's not a big problem, since the network
> is 2x 10 GbE and everything works)?
>
> - One other thing: even if I just prepare the OSD, it is automatically
> added to the cluster, and I cannot activate it myself. Has anyone else
> seen this behavior?
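>
> (To be clear about which steps I mean, roughly, illustrated here with
> ceph-deploy; the device names are just placeholders:
>
> ceph-deploy osd prepare ceph4:/dev/sdX    # prepare only, I expect it not to join the cluster yet
> ceph-deploy osd activate ceph4:/dev/sdX1  # the separate activate step
> )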
>
> I am now deleting some data from the cluster, which has already helped
> a bit:
>
> health HEALTH_WARN
> 63 pgs backfill
> 8 pgs backfill_toofull
> 10 pgs backfilling
> 7 pgs degraded
> 3 pgs recovery_wait
> 7 pgs stuck degraded
> 82 pgs stuck unclean
> recovery 6498/52085528 objects degraded (0.012%)
> recovery 9507140/52085528 objects misplaced (18.253%)
> 2 near full osd(s)
> noout,noscrub,nodeep-scrub flag(s) set
> monmap e8: 4 mons at
> {ceph1=192.168.10.3:6789/0,ceph2=192.168.10.4:6789/0,ceph3=192.168.10.5:6789/0,ceph4=192.168.60.6:6789/0}
> election epoch 400, quorum 0,1,2,3 ceph1,ceph2,ceph3,ceph4
> osdmap e1780: 7 osds: 7 up, 7 in; 81 remapped pgs
> flags noout,noscrub,nodeep-scrub
> pgmap v7317114: 320 pgs, 3 pools, 4499 GB data, 15333 kobjects
> 14100 GB used, 26872 GB / 40972 GB avail
> 6498/52085528 objects degraded (0.012%)
> 9507140/52085528 objects misplaced (18.253%)
> 238 active+clean
> 60 active+remapped+wait_backfill
> 7 active+remapped+backfilling
> 6 active+remapped+backfill_toofull
> 3 active+degraded+remapped+backfilling
> 2 active+remapped+wait_backfill+backfill_toofull
> 2 active+recovery_wait+degraded+remapped
> 1 active+degraded+remapped+wait_backfill
> 1 active+recovery_wait+degraded
> recovery io 7844 kB/s, 27 objects/s
> client io 343 kB/s rd, 1 op/s
>
>
> If you need more information, just say so. I really need help!
>
> Thank you for reading this far!
>
>
>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com