[ceph-users] Reboot 1 OSD server, now ceph says 60% misplaced?
One of my 9 ceph OSD nodes just spontaneously rebooted. This particular OSD server holds only 4% of total storage. Why, after it has come back up and rejoined the cluster, does ceph health say that 60% of my objects are misplaced? I'm wondering if I have something set up wrong in my cluster. The cluster has been operating well for the most part for about a year, but I have noticed this sort of behavior before. This is going to take many hours to recover. Ceph 10.2.3. Thanks for any insights you may be able to provide! -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
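For reference, a few commands that help characterize this kind of sudden remapping before letting it run (a sketch; note that noout only prevents further out-marking and does not undo a rebalance that has already started):

$ ceph -s                  # how much is misplaced vs. degraded
$ ceph osd df tree         # weights and utilization per host/OSD; sudden weight changes show up here
$ ceph osd set noout       # optional: stop additional OSDs from being marked out while investigating
$ ceph osd unset noout     # remember to clear the flag afterwards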
Re: [ceph-users] Reboot 1 OSD server, now ceph says 60% misplaced?
6:6850/5091 10.0.5.16:6851/5091 exists,up 03e2b2bb-b7a6-4d28-8ff4-b73f208812e2 osd.75 up in weight 0.992599 up_from 112255 up_thru 119791 down_at 112241 last_clean_interval [94225,112240) 10.0.5.16:6832/4480 10.0.5.16:6833/4480 10.0.5.16:6834/4480 10.0.5.16:6835/4480 exists,up 0a4e41b8-2cd5-496f-910c-a2fd7732a42e osd.76 up in weight 1 up_from 112254 up_thru 119770 down_at 112241 last_clean_interval [94247,112240) 10.0.5.16:6844/4907 10.0.5.16:6845/4907 10.0.5.16:6846/4907 10.0.5.16:6847/4907 exists,up ea9fc955-6f28-4ca6-b06b-0bd4f178f22e osd.77 up in weight 0.940018 up_from 112249 up_thru 119569 down_at 112241 last_clean_interval [94260,112240) 10.0.5.16:6820/3813 10.0.5.16:6821/3813 10.0.5.16:6822/3813 10.0.5.16:6823/3813 exists,up 361ae33c-39b3-4572-b5a9-69bdfbbb4c3f osd.78 up in weight 0.962784 up_from 112249 up_thru 119411 down_at 112241 last_clean_interval [94277,112240) 10.0.5.16:6816/3628 10.0.5.16:6817/3628 10.0.5.16:6818/3628 10.0.5.16:6819/3628 exists,up 1902cb84-465c-4fab-bac0-9a5d1fa9a5ae osd.79 up in weight 1 up_from 112246 up_thru 119541 down_at 112241 last_clean_interval [94304,112240) 10.0.5.16:6812/3497 10.0.5.16:6813/3497 10.0.5.16:6814/3497 10.0.5.16:6815/3497 exists,up 5d4041db-398d-4fbb-a672-029444f3d974 osd.80 up in weight 0.993805 up_from 113233 up_thru 119411 down_at 113231 last_clean_interval [112253,113230) 10.0.5.16:6852/169626 10.0.5.16:6854/169626 10.0.5.16:6856/169626 10.0.5.16:6857/169626 exists,up 9e6d643b-8a2e-43d6-b112-126dd162ad0b osd.81 up in weight 1 up_from 112245 up_thru 119500 down_at 112241 last_clean_interval [94345,112240) 10.0.5.16:6808/3355 10.0.5.16:6809/3355 10.0.5.16:6810/3355 10.0.5.16:6811/3355 exists,up 144a1a6f-9a11-4950-a54f-7f649989fac7 osd.82 up in weight 1 up_from 112253 up_thru 119429 down_at 112241 last_clean_interval [94354,112240) 10.0.5.16:6840/4733 10.0.5.16:6841/4733 10.0.5.16:6842/4733 10.0.5.16:6843/4733 exists,up 991499cf-8334-4b46-97fb-535c8a703e45 osd.83 up in weight 1 up_from 112249 up_thru 119788 down_at 112241 last_clean_interval [94371,112240) 10.0.5.16:6824/3944 10.0.5.16:6825/3944 10.0.5.16:6826/3944 10.0.5.16:6827/3944 exists,up 22819ad7-ab30-42a2-97fb-4d8fdd1937aa osd.84 up in weight 1 up_from 112245 up_thru 119411 down_at 112241 last_clean_interval [94384,112240) 10.0.5.16:6800/3071 10.0.5.16:6801/3071 10.0.5.16:6802/3071 10.0.5.16:6803/3071 exists,up 91da19ca-4948-4d2a-baa0-7a075f59d2ea osd.85 up in weight 1 up_from 112243 up_thru 119571 down_at 112241 last_clean_interval [94396,112240) 10.0.5.16:6804/3196 10.0.5.16:6805/3196 10.0.5.16:6806/3196 10.0.5.16:6807/3196 exists,up b79d7033-fdf6-4f4d-97bf-26a24f903b98 pg_temp 0.0 [52,16,73] pg_temp 0.1 [61,26,77] pg_temp 0.5 [84,48,29] pg_temp 0.6 [77,70,46] pg_temp 0.7 [29,73,46] pg_temp 0.8 [61,16,73] pg_temp 0.9 [67,83,47] pg_temp 0.b [83,0,49] pg_temp 0.c [0,64,77] pg_temp 0.e [67,0,69] < a couple thousand more lines like these pg_temp lines > -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Reboot 1 OSD server, now ceph says 60% misplaced?
1.0
 64  1.81799   osd.64   up    0.55638   1.0
 63  1.81799   osd.63   up    0.60434   1.0
 -8  5.45398   host ceph07
 65  1.81799   osd.65   up    0.52611   1.0
 67  1.81799   osd.67   up    0.61052   1.0
 70  1.81799   osd.70   up    0.56075   1.0
 -9  2.69798   host ceph08
  4  0.90900   osd.4    up    0.45261   1.0
  5  0.87999   osd.5    up    0.46480   1.0
 16  0.90900   osd.16   up    0.48987   1.0
-10  29.09595  host ceph10
 66  1.81850   osd.66   down  0         1.0
 71  1.81850   osd.71   down  0         1.0
 72  1.81850   osd.72   down  0         1.0
 73  1.81850   osd.73   up    0.89394   1.0
 74  1.81850   osd.74   up    1.0       1.0
 75  1.81850   osd.75   up    0.99260   1.0
 76  1.81850   osd.76   up    1.0       1.0
 77  1.81850   osd.77   up    0.94002   1.0
 78  1.81850   osd.78   up    0.96278   1.0
 79  1.81850   osd.79   up    1.0       1.0
 80  1.81850   osd.80   up    0.99380   1.0
 81  1.81850   osd.81   up    1.0       1.0
 82  1.81850   osd.82   up    1.0       1.0
 83  1.81850   osd.83   up    1.0       1.0
 84  1.81850   osd.84   up    1.0       1.0
 85  1.81850   osd.85   up    1.0       1.0
-- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Reboot 1 OSD server, now ceph says 60% misplaced?
On Sun, Nov 19, 2017 at 02:41:56AM PST, Gregory Farnum spake thusly: > Okay, so the hosts look okay (although very uneven numbers of OSDs). > > But the sizes are pretty wonky. Are the disks really that mismatched > in size? I note that many of them in host10 are set to 1.0, but most > of the others are some fraction less than that. Yes, they are that mismatched. This is a very mix-and-match cluster we built out of what we had lying around. I know that isn't ideal. Possibly due to the large mismatch in disk sizes (although I had always expected CRUSH to manage it better, given the default weighting proportional to size), we used to run into situations where the small disks would fill up even when the large disks were barely at 50%. So back in June we ran bc-ceph-reweight-by-utilization.py fairly frequently for a few days until things were happy and stable, and it stayed that way until tonight's incident. I'm pretty sure you are right: the weights got reset to defaults, causing lots of movement. I had forgotten that ceph osd reweight is not a persistent setting. So it looks like once things settle I need to adjust the crush weights appropriately and set the reweights back to 1 to make this permanent. That explains it. Thanks! -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
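A minimal sketch of folding a temporary reweight into a persistent CRUSH weight (osd.4 and the numbers are placeholders; ceph osd crush reweight is stored in the CRUSH map and survives restarts, while ceph osd reweight is a transient override):

$ ceph osd crush reweight osd.4 0.41   # e.g. roughly old crush weight × current reweight; placeholder value
$ ceph osd reweight osd.4 1.0          # clear the temporary override once the crush weight carries the adjustment
$ ceph osd df                          # check per-OSD utilization after the change settles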
[ceph-users] Can't activate OSD
Hello all, Over the past few weeks I've been trying to go through the Quick Ceph Deploy tutorial at: http://docs.ceph.com/docs/jewel/start/quick-ceph-deploy/ just trying to get a basic 2 OSD ceph cluster up and running. Everything seems to go well until I get to the: ceph-deploy osd activate ceph02:/dev/sdc ceph03:/dev/sdc part. It never actually seems to activate the OSD and eventually times out: [ceph02][DEBUG ] connection detected need for sudo [ceph02][DEBUG ] connected to host: ceph02 [ceph02][DEBUG ] detect platform information from remote host [ceph02][DEBUG ] detect machine type [ceph02][DEBUG ] find the location of an executable [ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.2.1511 Core [ceph_deploy.osd][DEBUG ] activating host ceph02 disk /dev/sdc [ceph_deploy.osd][DEBUG ] will use init type: systemd [ceph02][DEBUG ] find the location of an executable [ceph02][INFO ] Running command: sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/sdc [ceph02][WARNIN] main_activate: path = /dev/sdc [ceph02][WARNIN] No data was received after 300 seconds, disconnecting... [ceph02][INFO ] checking OSD status... [ceph02][DEBUG ] find the location of an executable [ceph02][INFO ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json [ceph02][INFO ] Running command: sudo systemctl enable ceph.target [ceph03][DEBUG ] connection detected need for sudo [ceph03][DEBUG ] connected to host: ceph03 [ceph03][DEBUG ] detect platform information from remote host [ceph03][DEBUG ] detect machine type [ceph03][DEBUG ] find the location of an executable [ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.2.1511 Core [ceph_deploy.osd][DEBUG ] activating host ceph03 disk /dev/sdc [ceph_deploy.osd][DEBUG ] will use init type: systemd [ceph03][DEBUG ] find the location of an executable [ceph03][INFO ] Running command: sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/sdc [ceph03][WARNIN] main_activate: path = /dev/sdc [ceph03][WARNIN] No data was received after 300 seconds, disconnecting... [ceph03][INFO ] checking OSD status... [ceph03][DEBUG ] find the location of an executable [ceph03][INFO ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json [ceph03][INFO ] Running command: sudo systemctl enable ceph.target Machines involved are ceph-deploy (deploy server), ceph01 (monitor), ceph02 and ceph03 (OSD servers). ceph log is here: http://pastebin.com/A2kP28c4 This is CentOS 5. iptables and selinux are both off. When I first started doing this the volume would be left mounted in the tmp location on the OSDs. But I have since upgraded my version of ceph and now nothing is left mounted on the OSD but it still times out. Please let me know if there is any other info I can provide which might help. Any help you can offer is greatly appreciated! I've been stuck on this for weeks. Thanks! -- Tracy Reed pgpmPpa4E7s3Y.pgp Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
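One way to narrow down where an activation like this hangs is to run the same ceph-disk command by hand on the OSD host and to confirm that the OSD host can reach the monitor at all (a sketch using the device and hostnames from the message above; the osd stat check assumes a usable keyring on that node):

[ceph02] $ sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/sdc
[ceph02] $ sudo ceph-disk list                  # how ceph-disk sees the disks and partitions
[ceph02] $ sudo ceph --cluster=ceph osd stat    # hangs or times out if the mon is unreachable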
Re: [ceph-users] Can't activate OSD
Oops, I said CentOS 5 (old habit, ran it for years!). I meant CentOS 7. And I'm running the following Ceph package versions from the ceph repo: root@ceph02 ~]# rpm -qa |grep -i ceph libcephfs1-10.2.3-0.el7.x86_64 ceph-common-10.2.3-0.el7.x86_64 ceph-mon-10.2.3-0.el7.x86_64 ceph-release-1-1.el7.noarch python-cephfs-10.2.3-0.el7.x86_64 ceph-selinux-10.2.3-0.el7.x86_64 ceph-osd-10.2.3-0.el7.x86_64 ceph-mds-10.2.3-0.el7.x86_64 ceph-radosgw-10.2.3-0.el7.x86_64 ceph-base-10.2.3-0.el7.x86_64 ceph-10.2.3-0.el7.x86_64 On Mon, Oct 03, 2016 at 03:34:50PM PDT, Tracy Reed spake thusly: > Hello all, > > Over the past few weeks I've been trying to go through the Quick Ceph Deploy > tutorial at: > > http://docs.ceph.com/docs/jewel/start/quick-ceph-deploy/ > > just trying to get a basic 2 OSD ceph cluster up and running. Everything seems > to go well until I get to the: > > ceph-deploy osd activate ceph02:/dev/sdc ceph03:/dev/sdc > > part. It never actually seems to activate the OSD and eventually times out: > > [ceph02][DEBUG ] connection detected need for sudo > [ceph02][DEBUG ] connected to host: ceph02 > [ceph02][DEBUG ] detect platform information from remote host > [ceph02][DEBUG ] detect machine type > [ceph02][DEBUG ] find the location of an executable > [ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.2.1511 Core > [ceph_deploy.osd][DEBUG ] activating host ceph02 disk /dev/sdc > [ceph_deploy.osd][DEBUG ] will use init type: systemd > [ceph02][DEBUG ] find the location of an executable > [ceph02][INFO ] Running command: sudo /usr/sbin/ceph-disk -v activate > --mark-init systemd --mount /dev/sdc > [ceph02][WARNIN] main_activate: path = /dev/sdc > [ceph02][WARNIN] No data was received after 300 seconds, disconnecting... > [ceph02][INFO ] checking OSD status... > [ceph02][DEBUG ] find the location of an executable > [ceph02][INFO ] Running command: sudo /bin/ceph --cluster=ceph osd stat > --format=json > [ceph02][INFO ] Running command: sudo systemctl enable ceph.target > [ceph03][DEBUG ] connection detected need for sudo > [ceph03][DEBUG ] connected to host: ceph03 > [ceph03][DEBUG ] detect platform information from remote host > [ceph03][DEBUG ] detect machine type > [ceph03][DEBUG ] find the location of an executable > [ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.2.1511 Core > [ceph_deploy.osd][DEBUG ] activating host ceph03 disk /dev/sdc > [ceph_deploy.osd][DEBUG ] will use init type: systemd > [ceph03][DEBUG ] find the location of an executable > [ceph03][INFO ] Running command: sudo /usr/sbin/ceph-disk -v activate > --mark-init systemd --mount /dev/sdc > [ceph03][WARNIN] main_activate: path = /dev/sdc > [ceph03][WARNIN] No data was received after 300 seconds, disconnecting... > [ceph03][INFO ] checking OSD status... > [ceph03][DEBUG ] find the location of an executable > [ceph03][INFO ] Running command: sudo /bin/ceph --cluster=ceph osd stat > --format=json > [ceph03][INFO ] Running command: sudo systemctl enable ceph.target > > Machines involved are ceph-deploy (deploy server), ceph01 (monitor), ceph02 > and > ceph03 (OSD servers). > > ceph log is here: > > http://pastebin.com/A2kP28c4 > > This is CentOS 5. iptables and selinux are both off. When I first started > doing > this the volume would be left mounted in the tmp location on the OSDs. But I > have since upgraded my version of ceph and now nothing is left mounted on the > OSD but it still times out. > > Please let me know if there is any other info I can provide which might help. 
> Any help you can offer is greatly appreciated! I've been stuck on this for > weeks. Thanks! > > -- > Tracy Reed > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Tracy Reed pgpIRPsYCGgTx.pgp Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Ceph consultants?
Hello all, Any independent Ceph consultants out there? We have been trying to get Ceph going and it's been very slow going. We don't have anything working yet after a month! We really can't waste much more time on this by ourselves. At this point we're looking to pay someone for a few hours to get us over the initial roadblock and advise us occasionally as we move forward. Probably just a few hours of work but if there's an experienced ceph person out there looking to make a little extra money please drop me a line. Thanks! -- Tracy Reed pgpoAdnW9Acn4.pgp Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Ceph consultants?
On Wed, Oct 05, 2016 at 01:17:52PM PDT, Peter Maloney spake thusly: > What do you need help with specifically? Setting up ceph isn't very > complicated... just fixing it when things go wrong should be. What type > of scale are you working with, and do you already have hardware? Or is > the problem more to do with integrating it with clients? Hi Peter, I agree, setting up Ceph isn't very complicated. I posted to the list on 10/03/16 with the initial problem I have run into under the subject "Can't activate OSD". Please refer to that thread as it has logs, details of my setup, etc. I started working on this about a month ago then spent several days on it and a few hours with a couple different people on IRC. Nobody has been able to figure out how to get my OSD activated. I took a couple weeks off and now I'm back at it as I really need to get this going soon. Basically, I'm following the quickstart guide at http://docs.ceph.com/docs/jewel/start/quick-ceph-deploy/ and when I run the command to activate the OSDs like so: ceph-deploy osd activate ceph02:/dev/sdc ceph03:/dev/sdc I get this in the ceph-deploy log: [2016-10-03 15:16:10,193][ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.2.1511 Core [2016-10-03 15:16:10,193][ceph_deploy.osd][DEBUG ] activating host ceph03 disk /dev/sdc [2016-10-03 15:16:10,193][ceph_deploy.osd][DEBUG ] will use init type: systemd [2016-10-03 15:16:10,194][ceph03][DEBUG ] find the location of an executable [2016-10-03 15:16:10,200][ceph03][INFO ] Running command: sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/sdc [2016-10-03 15:16:10,377][ceph03][WARNING] main_activate: path = /dev/sdc [2016-10-03 15:21:10,380][ceph03][WARNING] No data was received after 300 seconds, disconnecting... [2016-10-03 15:21:15,387][ceph03][INFO ] checking OSD status... [2016-10-03 15:21:15,401][ceph03][DEBUG ] find the location of an executable [2016-10-03 15:21:15,472][ceph03][INFO ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json [2016-10-03 15:21:15,698][ceph03][INFO ] Running command: sudo systemctl enable ceph.target More details in other thread. Where am I going wrong here? Thanks! -- Tracy Reed pgpf71_DOjtT2.pgp Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] SOLVED Re: Can't activate OSD
SOLVED! Thanks to a very kind person from this list who helped me debug, we found that when I created the VLAN on the switch I didn't set it allow jumbo packets. This was preventing the OSDs from activating because some traffic was being blocked. Once I fixed that everything started working. Sometimes it really helps to have a second pair of eyes. So this wasn't a Ceph problem at all, really. Thanks! On Mon, Oct 03, 2016 at 03:39:45PM PDT, Tracy Reed spake thusly: > Oops, I said CentOS 5 (old habit, ran it for years!). I meant CentOS 7. And > I'm > running the following Ceph package versions from the ceph repo: > > root@ceph02 ~]# rpm -qa |grep -i ceph > libcephfs1-10.2.3-0.el7.x86_64 > ceph-common-10.2.3-0.el7.x86_64 > ceph-mon-10.2.3-0.el7.x86_64 > ceph-release-1-1.el7.noarch > python-cephfs-10.2.3-0.el7.x86_64 > ceph-selinux-10.2.3-0.el7.x86_64 > ceph-osd-10.2.3-0.el7.x86_64 > ceph-mds-10.2.3-0.el7.x86_64 > ceph-radosgw-10.2.3-0.el7.x86_64 > ceph-base-10.2.3-0.el7.x86_64 > ceph-10.2.3-0.el7.x86_64 > > On Mon, Oct 03, 2016 at 03:34:50PM PDT, Tracy Reed spake thusly: > > Hello all, > > > > Over the past few weeks I've been trying to go through the Quick Ceph > > Deploy tutorial at: > > > > http://docs.ceph.com/docs/jewel/start/quick-ceph-deploy/ > > > > just trying to get a basic 2 OSD ceph cluster up and running. Everything > > seems > > to go well until I get to the: > > > > ceph-deploy osd activate ceph02:/dev/sdc ceph03:/dev/sdc > > > > part. It never actually seems to activate the OSD and eventually times out: > > > > [ceph02][DEBUG ] connection detected need for sudo > > [ceph02][DEBUG ] connected to host: ceph02 > > [ceph02][DEBUG ] detect platform information from remote host > > [ceph02][DEBUG ] detect machine type > > [ceph02][DEBUG ] find the location of an executable > > [ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.2.1511 Core > > [ceph_deploy.osd][DEBUG ] activating host ceph02 disk /dev/sdc > > [ceph_deploy.osd][DEBUG ] will use init type: systemd > > [ceph02][DEBUG ] find the location of an executable > > [ceph02][INFO ] Running command: sudo /usr/sbin/ceph-disk -v activate > > --mark-init systemd --mount /dev/sdc > > [ceph02][WARNIN] main_activate: path = /dev/sdc > > [ceph02][WARNIN] No data was received after 300 seconds, disconnecting... > > [ceph02][INFO ] checking OSD status... > > [ceph02][DEBUG ] find the location of an executable > > [ceph02][INFO ] Running command: sudo /bin/ceph --cluster=ceph osd stat > > --format=json > > [ceph02][INFO ] Running command: sudo systemctl enable ceph.target > > [ceph03][DEBUG ] connection detected need for sudo > > [ceph03][DEBUG ] connected to host: ceph03 > > [ceph03][DEBUG ] detect platform information from remote host > > [ceph03][DEBUG ] detect machine type > > [ceph03][DEBUG ] find the location of an executable > > [ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.2.1511 Core > > [ceph_deploy.osd][DEBUG ] activating host ceph03 disk /dev/sdc > > [ceph_deploy.osd][DEBUG ] will use init type: systemd > > [ceph03][DEBUG ] find the location of an executable > > [ceph03][INFO ] Running command: sudo /usr/sbin/ceph-disk -v activate > > --mark-init systemd --mount /dev/sdc > > [ceph03][WARNIN] main_activate: path = /dev/sdc > > [ceph03][WARNIN] No data was received after 300 seconds, disconnecting... > > [ceph03][INFO ] checking OSD status... 
> > [ceph03][DEBUG ] find the location of an executable > > [ceph03][INFO ] Running command: sudo /bin/ceph --cluster=ceph osd stat > > --format=json > > [ceph03][INFO ] Running command: sudo systemctl enable ceph.target > > > > Machines involved are ceph-deploy (deploy server), ceph01 (monitor), ceph02 > > and > > ceph03 (OSD servers). > > > > ceph log is here: > > > > http://pastebin.com/A2kP28c4 > > > > This is CentOS 5. iptables and selinux are both off. When I first started > > doing > > this the volume would be left mounted in the tmp location on the OSDs. But I > > have since upgraded my version of ceph and now nothing is left mounted on > > the > > OSD but it still times out. > > > > Please let me know if there is any other info I can provide which might > > help. > > Any help you can offer is greatly appreciated! I've been stuck on this for > > weeks. Thanks! > > > > -- > > Tracy Reed > > > > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > -- > Tracy Reed > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Tracy Reed pgpVN3wp3MUC4.pgp Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
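For anyone hitting the same thing: a quick way to confirm that jumbo frames actually pass end-to-end between Ceph hosts (a sketch assuming a 9000-byte MTU and the 10.0.5.x network from this thread; 8972 = 9000 minus 28 bytes of IP/ICMP headers):

$ ip link show eth0 | grep mtu        # confirm the interface MTU on each host
$ ping -M do -s 8972 -c 3 10.0.5.3    # -M do forbids fragmentation; this fails if any hop drops jumbo frames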
[ceph-users] Monitor troubles
atus
[2016-10-05 14:48:54,272][ceph01][DEBUG ]
[2016-10-05 14:48:54,273][ceph01][DEBUG ] status for monitor: mon.ceph01
[2016-10-05 14:48:54,274][ceph01][DEBUG ] {
[2016-10-05 14:48:54,275][ceph01][DEBUG ]   "election_epoch": 5,
[2016-10-05 14:48:54,275][ceph01][DEBUG ]   "extra_probe_peers": [],
[2016-10-05 14:48:54,275][ceph01][DEBUG ]   "monmap": {
[2016-10-05 14:48:54,276][ceph01][DEBUG ]     "created": "2016-09-05 01:22:09.228315",
[2016-10-05 14:48:54,276][ceph01][DEBUG ]     "epoch": 1,
[2016-10-05 14:48:54,276][ceph01][DEBUG ]     "fsid": "3e84db5d-3dc8-4104-89e7-da23c103ef50",
[2016-10-05 14:48:54,276][ceph01][DEBUG ]     "modified": "2016-09-05 01:22:09.228315",
[2016-10-05 14:48:54,277][ceph01][DEBUG ]     "mons": [
[2016-10-05 14:48:54,277][ceph01][DEBUG ]       {
[2016-10-05 14:48:54,277][ceph01][DEBUG ]         "addr": "10.0.5.2:6789/0",
[2016-10-05 14:48:54,277][ceph01][DEBUG ]         "name": "ceph01",
[2016-10-05 14:48:54,278][ceph01][DEBUG ]         "rank": 0
[2016-10-05 14:48:54,278][ceph01][DEBUG ]       }
[2016-10-05 14:48:54,279][ceph01][DEBUG ]     ]
[2016-10-05 14:48:54,279][ceph01][DEBUG ]   },
[2016-10-05 14:48:54,280][ceph01][DEBUG ]   "name": "ceph01",
[2016-10-05 14:48:54,280][ceph01][DEBUG ]   "outside_quorum": [],
[2016-10-05 14:48:54,281][ceph01][DEBUG ]   "quorum": [
[2016-10-05 14:48:54,282][ceph01][DEBUG ]     0
[2016-10-05 14:48:54,282][ceph01][DEBUG ]   ],
[2016-10-05 14:48:54,282][ceph01][DEBUG ]   "rank": 0,
[2016-10-05 14:48:54,282][ceph01][DEBUG ]   "state": "leader",
[2016-10-05 14:48:54,282][ceph01][DEBUG ]   "sync_provider": []
[2016-10-05 14:48:54,283][ceph01][DEBUG ] }
[2016-10-05 14:48:54,283][ceph01][DEBUG ]
[2016-10-05 14:48:54,283][ceph01][INFO ] monitor: mon.ceph01 is running

But the cluster worked just fine until I tried adding two more monitors. In the troubleshooting section "Recovering a Monitor’s Broken monmap" I thought maybe I would try extracting a monmap with the idea that maybe I would learn something or possibly change the fsid on ceph01 or something.

[root@ceph01 ~]# ceph-mon -i mon.ceph01 --extract-monmap /tmp/monmap
monitor data directory at '/var/lib/ceph/mon/ceph-mon.ceph01' does not exist: have you run 'mkfs'?

So that didn't get me anything either.

mon log on ceph01 contains repetitions of:

2016-11-01 21:34:33.588396 7ff029c70700 0 mon.ceph01@0(probing) e2 handle_probe ignoring fsid e2e43abc-e634-4a04-ae24-0c486a035b6e != 3e84db5d-3dc8-4104-89e7-da23c103ef50
2016-11-01 21:34:35.739479 7ff029c70700 0 mon.ceph01@0(probing) e2 handle_probe ignoring fsid e2e43abc-e634-4a04-ae24-0c486a035b6e != 3e84db5d-3dc8-4104-89e7-da23c103ef50
2016-11-01 21:34:35.936020 7ff024f3f700 0 -- 10.0.5.2:6789/0 >> 10.0.5.5:0/3093707402 pipe(0x7ff03d57e800 sd=20 :6789 s=0 pgs=0 cs=0 l=0 c=0x7ff03d81e580).accept peer addr is really 10.0.5.5:0/3093707402 (socket is 10.0.5.5:44360/0)
2016-11-01 21:34:37.890073 7ff029c70700 0 mon.ceph01@0(probing) e2 handle_probe ignoring fsid e2e43abc-e634-4a04-ae24-0c486a035b6e != 3e84db5d-3dc8-4104-89e7-da23c103ef50
2016-11-01 21:34:40.043113 7ff029c70700 0 mon.ceph01@0(probing) e2 handle_probe ignoring fsid e2e43abc-e634-4a04-ae24-0c486a035b6e != 3e84db5d-3dc8-4104-89e7-da23c103ef50
2016-11-01 21:34:40.554165 7ff02a471700 0 mon.ceph01@0(probing).data_health(0) update_stats avail 96% total 51175 MB, used 1850 MB, avail 49324 MB

while mon log on ceph02 contains repetitions of:

2016-11-01 21:34:11.327458 7f33f4284700 0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2016-11-01 21:34:11.327623 7f33f4284700 0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
2016-11-01 21:34:12.451514 7f33f4284700 0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2016-11-01 21:34:12.451683 7f33f4284700 0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
2016-11-01 21:34:12.780988 7f33f1715700 0 mon.ceph02@0(probing) e0 handle_probe ignoring fsid 3e84db5d-3dc8-4104-89e7-da23c103ef50 != e2e43abc-e634-4a04-ae24-0c486a035b6e

Any ideas how to recover from this situation are greatly appreciated! -- Tracy Reed signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
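One small thing in the extract attempt above: ceph-mon's -i argument is the bare mon id, not "mon.<id>", so the data directory it looks for is /var/lib/ceph/mon/ceph-<id>. A sketch of extracting and inspecting the monmap (the mon should be stopped first; paths and unit names assume the default systemd/Jewel layout):

$ systemctl stop ceph-mon@ceph01
$ ceph-mon -i ceph01 --extract-monmap /tmp/monmap   # note "ceph01", not "mon.ceph01"
$ monmaptool --print /tmp/monmap                    # shows the fsid and the mon addresses it contains
$ grep fsid /etc/ceph/ceph.conf                     # compare against the extracted fsid
$ systemctl start ceph-mon@ceph01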
Re: [ceph-users] Monitor troubles
On Tue, Nov 01, 2016 at 09:36:16PM PDT, Tracy Reed spake thusly: > I initially setup my ceph cluster on CentOS 7 with just one monitor. The > monitor runs on an osd server (not ideal, will change soon). I've Sorry, forgot to add that I'm running the following ceph version from the ceph repo: # rpm -qa|grep ceph libcephfs1-10.2.3-0.el7.x86_64 ceph-release-1-1.el7.noarch ceph-mds-10.2.3-0.el7.x86_64 ceph-radosgw-10.2.3-0.el7.x86_64 python-cephfs-10.2.3-0.el7.x86_64 ceph-common-10.2.3-0.el7.x86_64 ceph-selinux-10.2.3-0.el7.x86_64 ceph-mon-10.2.3-0.el7.x86_64 ceph-10.2.3-0.el7.x86_64 ceph-base-10.2.3-0.el7.x86_64 ceph-osd-10.2.3-0.el7.x86_64 -- Tracy Reed signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Monitor troubles
After a lot of messing about I have manually created a monmap and gotten the two new monitors working, for a total of three. But to do that I had to delete the first monitor, which for some reason was coming up with a bogus fsid after I manipulated the monmap (which I checked, and it had the correct fsid). So I recreated it with the right fsid, and after that I had three monitors with quorum. I had to manually put the keys in place on the recreated monitor too. But now all of my OSDs have disappeared! Apparently the mon I deleted was storing some special knowledge of the OSDs? The mon log says:

2016-11-03 18:31:26.612012 7f7139529700 0 mon.ceph01@0(leader).data_health(14) update_stats avail 96% total 51175 MB, used 1744 MB, avail 49430 MB
2016-11-03 18:31:26.679911 7f7138d28700 0 cephx server osd.3: couldn't find entity name: osd.3
2016-11-03 18:31:26.876589 7f7138d28700 0 cephx server osd.6: couldn't find entity name: osd.6
2016-11-03 18:31:26.996219 7f7138d28700 0 cephx server osd.14: couldn't find entity name: osd.14
2016-11-03 18:31:27.016283 7f7138d28700 0 cephx server osd.41: couldn't find entity name: osd.41
2016-11-03 18:31:27.016406 7f7138d28700 0 cephx server osd.37: couldn't find entity name: osd.37
2016-11-03 18:31:27.016606 7f7138d28700 0 cephx server osd.40: couldn't find entity name: osd.40
2016-11-03 18:31:27.017276 7f7138d28700 0 cephx server osd.48: couldn't find entity name: osd.48
2016-11-03 18:31:27.291934 7f7138d28700 0 cephx server osd.4: couldn't find entity name: osd.4
2016-11-03 18:31:27.292598 7f7138d28700 0 cephx server osd.5: couldn't find entity name: osd.5
2016-11-03 18:31:27.339803 7f7138d28700 0 cephx server osd.7: couldn't find entity name: osd.7

So how do I tell the mon about the OSDs? Any pointers are greatly appreciated. -- Tracy Reed signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
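Those "couldn't find entity name" messages suggest the rebuilt monitors' auth database no longer contains the OSD keys. A sketch of re-registering one OSD's existing key with the mons (repeat per OSD; paths assume the default filestore layout, and the caps shown are the usual OSD caps, not anything specific to this cluster):

$ ceph auth add osd.3 mon 'allow profile osd' osd 'allow *' \
      -i /var/lib/ceph/osd/ceph-3/keyring
$ ceph auth list | grep -A3 '^osd.3'   # confirm the key and caps were registered
$ systemctl restart ceph-osd@3         # let the OSD re-authenticate and report in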
[ceph-users] What is in the mon leveldb?
Hello all, It seems I have underprovisioned storage space for my mons and my /var/lib/ceph/mon filesystem is getting full. When I first started using ceph this only took up tens of megabytes and I assumed it would stay that way and 5G for this filesystem seemed luxurious. Little did I know that mon was going to be storing multiple gigs of data! That's still a trivial amount of course but larger than what I expected and now I have to do some work to rebuild my monitors on bigger storage. I'm curious: Exactly what is being stored and is there any way to trim it down a bit? It has slowly grown over time. I've already run a compact on it which gained me only a few percent. Thanks! -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
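For reference, a couple of commands for keeping an eye on (and compacting) the mon store; a sketch assuming the default data path, and, as the message notes, compaction may only recover a few percent while old maps are being retained:

$ du -sh /var/lib/ceph/mon/ceph-*/store.db   # current size of the mon database
$ ceph tell mon.ceph01 compact               # online compaction of one mon's store
$ ceph daemon mon.ceph01 mon_status          # run on the mon host itself, via the admin socket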
Re: [ceph-users] What is in the mon leveldb?
On Mon, Mar 26, 2018 at 11:15:34PM PDT, Wido den Hollander spake thusly: > The MONs keep a history of OSDMaps and other maps. Normally these maps > are trimmed from the database, but if one or more PGs are not > active+clean the MONs will keep a large history to get old OSDs up to > speed which might be needed to bring that PGs to a clean state again. > > What is the status of your Ceph cluster (ceph -s) and what version are > you running? Ah...well. That leads to my next question which may resolve this issue: Current state of my cluster is: health: HEALTH_WARN recovery 1230/13361271 objects misplaced (0.009%) and no recovery is happening. I'm not sure why. This hasn't happened before. But the mon db had been growing since long before this circumstance. Any idea why it might be stuck like this? I suppose I need to clear this up before I can know if this is the cause of the disk usage. > And yes, make sure your MONs do have a tens of GBs available should they > need it for a very long recovery. Yeah...I've temporarily moved the store.db to another disk and symlinked it back but I'm working towards rebuilding my mons. > For example, I'm working on a 2200 OSD cluster which has been doing a > recovery operation for a week now and the MON DBs are about 50GB now. Wow. My cluster is only around 70 OSDs. Thanks! -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] What is in the mon leveldb?
> health: HEALTH_WARN > recovery 1230/13361271 objects misplaced (0.009%) > > and no recovery is happening. I'm not sure why. This hasn't happened > before. But the mon db had been growing since long before this > circumstance. Hmmok, the recent trouble started a few days ago when we removed a node containing 4 OSDs from the cluster. The OSDs on that node were shut down but were not removed from the crush map. So apparently this has caused some issues. I just removed the OSDs properly and now there is recovery happening. Unfortunately it now says 30% of my objects are misplaced so I'm looking at 24 hours of recovery. Maybe the store.db will be smaller when it finally finishes. -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] mgr dashboard differs from ceph status
My ceph status says:

  cluster:
    id:     b2b00aae-f00d-41b4-a29b-58859aa41375
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum ceph01,ceph03,ceph07
    mgr: ceph01(active), standbys: ceph-ceph07, ceph03
    osd: 78 osds: 78 up, 78 in
  data:
    pools:   4 pools, 3240 pgs
    objects: 4384k objects, 17533 GB
    usage:   53141 GB used, 27311 GB / 80452 GB avail
    pgs:     3240 active+clean
  io:
    client: 4108 kB/s rd, 10071 kB/s wr, 27 op/s rd, 331 op/s wr

but my mgr dashboard web interface says:

  Health
  Overall status: HEALTH_WARN
  PG_AVAILABILITY: Reduced data availability: 2563 pgs inactive

Anyone know why the discrepancy? Hopefully the dashboard is very mistaken! Everything seems to be operating normally. If I had 2/3 of my pgs inactive, I'm sure all of the rbds backing my VMs would be blocked, etc. I'm running ceph-12.2.4-0.el7.x86_64 on CentOS 7. Almost all filestore, except for one OSD which recently had to be replaced and which I made bluestore. I plan to slowly migrate everything over to bluestore over the course of the next month. Thanks! -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
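A couple of things worth checking when the dashboard and ceph -s disagree (a sketch; nothing here is specific to this cluster beyond the mgr name): compare against ceph health detail, and fail over to a standby mgr so the dashboard module rebuilds its view of the cluster.

$ ceph health detail     # does the CLI itself report any inactive PGs at all?
$ ceph mgr fail ceph01   # hand the active role to a standby mgr
$ ceph -s                # confirm a standby took over, then recheck the dashboard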
[ceph-users] ceph mgr module not working
Hello all, I can seemingly enable the balancer ok:

  $ ceph mgr module enable balancer

but if I try to check its status:

  $ ceph balancer status
  Error EINVAL: unrecognized command

or turn it on:

  $ ceph balancer on
  Error EINVAL: unrecognized command

  $ which ceph
  /bin/ceph
  $ rpm -qf /bin/ceph
  ceph-common-12.2.4-0.el7.x86_64

So it's not like I'm running an old version of the ceph command which wouldn't know about the balancer. I'm running ceph-12.2.4-0.el7.x86_64 on CentOS 7. Almost all filestore except for one OSD which recently had to be replaced, which I made bluestore. I plan to slowly migrate everything over to bluestore over the course of the next month. Thanks! -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
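For comparison, a sketch of the usual Luminous balancer sequence, plus a check that the module actually loaded into the active mgr. The balancer commands are served by the mgr itself, so if the module failed to load (or the active mgr hasn't picked it up), they can come back as unrecognized; failing over the active mgr with ceph mgr fail sometimes helps.

$ ceph mgr module ls                 # "balancer" should appear under enabled_modules
$ ceph mgr module enable balancer
$ ceph balancer mode crush-compat    # or "upmap" if all clients are Luminous or newer
$ ceph balancer status
$ ceph balancer on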
[ceph-users] Place on separate hosts?
I've been using ceph for nearly a year and one of the things I ran into quite a while back was that it seems like ceph is placing copies of objects on different OSDs, but sometimes those OSDs can be on the same host by default. Is that correct? I discovered this by taking down one host and having some pgs become inactive. So I guess you could say I want my failure domain to be the host, not the OSD. How would I accomplish this? I understand it involves changing the crush map. I've been reading over http://docs.ceph.com/docs/master/rados/operations/crush-map/ and it still isn't clear to me what needs to change. I expect I need to change the default replicated_ruleset which I'm still running:

  $ ceph osd crush rule dump
  [
      {
          "rule_id": 0,
          "rule_name": "replicated_ruleset",
          "ruleset": 0,
          "type": 1,
          "min_size": 1,
          "max_size": 10,
          "steps": [
              { "op": "take", "item": -1, "item_name": "default" },
              { "op": "chooseleaf_firstn", "num": 0, "type": "host" },
              { "op": "emit" }
          ]
      }
  ]

And that I need something like:

  ceph osd crush rule create-replicated <name> <root> <failure-domain> <class>

then:

  ceph osd pool set <pool> crush_rule <rule-name>

but I'm not sure what the values of those would be in my situation. Maybe:

  ceph osd crush rule create-replicated different-host default

but I don't know what the failure-domain or class should be just by inspecting my current crush map. Suggestions are greatly appreciated! -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
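A sketch of that syntax with the placeholders filled in for a host-level failure domain (the rule names are hypothetical, "default" is the CRUSH root shown in the rule dump above, "rbd" stands in for the real pool name, and the device-class variant only applies if the OSDs have classes such as hdd/ssd assigned):

$ ceph osd crush rule create-replicated rep-by-host default host
$ ceph osd crush rule create-replicated rep-by-host-hdd default host hdd   # variant restricted to one device class
$ ceph osd pool set rbd crush_rule rep-by-host
$ ceph osd dump | grep crush_rule        # confirm which rule each pool is now using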
Re: [ceph-users] Place on separate hosts?
On Fri, May 04, 2018 at 12:08:35AM PDT, Tracy Reed spake thusly: > I've been using ceph for nearly a year and one of the things I ran into > quite a while back was that it seems like ceph is placing copies of > objects on different OSDs but sometimes those OSDs can be on the same > host by default. Is that correct? I discovered this by taking down one > host and having some pgs become inactive. Actually, this (admittedly ancient) document: https://jcftang.github.io/2012/09/06/going-from-replicating-across-osds-to-replicating-across-hosts-in-a-ceph-cluster/ says "As the default CRUSH map replicates across OSD’s I wanted to try replicating data across hosts just to see what would happen." This would seem to align with my experience as far as the default goes. However, this: http://docs.ceph.com/docs/master/rados/operations/crush-map/ says: "When you deploy OSDs they are automatically placed within the CRUSH map under a host node named with the hostname for the host they are running on. This, combined with the default CRUSH failure domain, ensures that replicas or erasure code shards are separated across hosts and a single host failure will not affect availability." How can I tell which way mine is configured? I could post the whole crushmap if necessary but it's a bit large to copy and paste. -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Place on separate hosts?
On Fri, May 04, 2018 at 12:18:15AM PDT, Tracy Reed spake thusly: > https://jcftang.github.io/2012/09/06/going-from-replicating-across-osds-to-replicating-across-hosts-in-a-ceph-cluster/ > How can I tell which way mine is configured? I could post the whole > crushmap if necessary but it's a bit large to copy and paste. To further answer my own question (sorry for the spam) the above linked doc says this should do what I want: step chooseleaf firstn 0 type host which is what I already have in my crush map. So it looks like the default is as I want it. In which case I wonder why I had the problem previously... I guess the only way to know for sure is to stop one osd node and see what happens. -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
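A way to check this without actually stopping an OSD node is to feed the compiled CRUSH map to crushtool and look at the mappings it produces (a sketch; rule 0 and 3 replicas are assumed from the rule dump earlier in the thread):

$ ceph osd getcrushmap -o /tmp/crushmap.bin
$ crushtool -d /tmp/crushmap.bin -o /tmp/crushmap.txt    # decompile to read the rules as text
$ crushtool -i /tmp/crushmap.bin --test --rule 0 --num-rep 3 --show-mappings | head
# Each output line lists the OSDs chosen for one input. With a host failure domain,
# no line should contain two OSDs that live on the same host (compare against "ceph osd tree").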
Re: [ceph-users] mgr dashboard differs from ceph status
On Mon, May 07, 2018 at 12:13:00AM PDT, Janne Johansson spake thusly: > > mgr: ceph01(active), standbys: ceph-ceph07, ceph03 > > Don't know if it matters, but the naming seems different even though I guess > you are running mgr's on the same nodes as the mons, but ceph07 is called > "ceph-ceph07" in the mgr list. Yes, I did make a mistake when I started the manager on that one and provided it with an inconsistent name. That has been corrected. I have also since restarted all of the managers but the problem persists. But I'm not in a position to do any debugging on it right now but will try to look into it more in the morning. Thanks for the feedback! -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] rbd map hangs
Hello all! I'm running luminous with old style non-bluestore OSDs. ceph 10.2.9 clients though, haven't been able to upgrade those yet. Occasionally I have access to rbds hang on the client such as right now. I tried to dd a VM image into a mapped rbd and it just hung. Then I tried to map a new rbd and that hangs also. How would I troubleshoot this? /var/log/ceph is empty, nothing in /var/log/messages or dmesg etc. I just discovered: find /sys/kernel/debug/ceph -type f -print -exec cat {} \; which produces (among other seemingly innocuous things, let me know if anyone wants to see the rest): osd2(unknown sockaddr family 0) 0%(doesn't exist) 100% which seems suspicious. rbd ls works reliably. As does create. Cluster is healthy. But the processes which hung trying to access that mapped rbd appear to be completely unkillable. What else should I check? Thanks! -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] rbd map hangs
et-alloc-hint,write 16271496osd12.5d28d036 rbd_data.1c55496b8b4567.08cb set-alloc-hint,write 16271497osd12.5d28d036 rbd_data.1c55496b8b4567.08cb set-alloc-hint,write 16271498osd12.5d28d036 rbd_data.1c55496b8b4567.08cb set-alloc-hint,write 16271499osd12.5d28d036 rbd_data.1c55496b8b4567.08cb set-alloc-hint,write 16271500osd12.5d28d036 rbd_data.1c55496b8b4567.08cb set-alloc-hint,write 32154589osd12.5d28d036 rbd_data.1c55496b8b4567.08cb set-alloc-hint,write 32154590osd12.5d28d036 rbd_data.1c55496b8b4567.08cb set-alloc-hint,write 32155075osd12.5d28d036 rbd_data.1c55496b8b4567.08cb set-alloc-hint,write 32155250osd12.5d28d036 rbd_data.1c55496b8b4567.08cb set-alloc-hint,write 32156442osd12.5d28d036 rbd_data.1c55496b8b4567.08cb set-alloc-hint,write 32156983osd12.5d28d036 rbd_data.1c55496b8b4567.08cb set-alloc-hint,write 33982347osd12.678ef636 rbd_data.93285b6b8b4567.139d set-alloc-hint,write 34517953osd79 0.e321b924 rbd_data.51f32238e1f29.0d30 set-alloc-hint,write 34517955osd79 0.e321b924 rbd_data.51f32238e1f29.0d30 set-alloc-hint,write 34517956osd79 0.e321b924 rbd_data.51f32238e1f29.0d30 set-alloc-hint,write 34517957osd79 0.e321b924 rbd_data.51f32238e1f29.0d30 set-alloc-hint,write 34517958osd79 0.e321b924 rbd_data.51f32238e1f29.0d30 set-alloc-hint,write 34517959osd79 0.e321b924 rbd_data.51f32238e1f29.0d30 set-alloc-hint,write 34517960osd79 0.e321b924 rbd_data.51f32238e1f29.0d30 set-alloc-hint,write 34517961osd79 0.e321b924 rbd_data.51f32238e1f29.0d30 set-alloc-hint,write 34517963osd79 0.e321b924 rbd_data.51f32238e1f29.0d30 set-alloc-hint,write 34533231osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533233osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533234osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533235osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533236osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533237osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533238osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533239osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533241osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write /sys/kernel/debug/ceph/b2b00aae-f00d-41b4-a29b-58859aa41375.client31276017/monc have osdmap 232455 want next osdmap Thanks! -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] rbd map hangs
On Thu, Jun 07, 2018 at 08:40:50AM PDT, Ilya Dryomov spake thusly: > > Kernel is Linux cpu04.mydomain.com 3.10.0-229.20.1.el7.x86_64 #1 SMP Tue > > Nov 3 19:10:07 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux > > This is a *very* old kernel. It's what's shipping with CentOS/RHEL 7 and probably what the vast majority of people are using aside from perhaps the Ubuntu LTS people. Does anyone really still compile their own latest kernels? Back in the mid-90's I'd compile a new kernel at the drop of a hat. But now it has gotten so complicated with so many options and drivers etc. that it's actually pretty hard to get it right. > These lines indicate in-flight requests. Looks like there may have > been a problem with osd1 in the past, as some of these are much older > than others. Try bouncing osd1 with "ceph osd down 1" (it should > come back up automatically) and see if that clears up this batch. Thanks! -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
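For completeness, a sketch of acting on that advice and confirming the stuck requests drain (the client debug directory name varies per cluster and client mount, hence the wildcard):

$ ceph osd down 1                                    # mark osd.1 down; it should rejoin on its own
$ watch -n 5 'cat /sys/kernel/debug/ceph/*/osdc'     # in-flight requests on the client; the old entries should clear
$ dmesg | tail                                       # kernel libceph/rbd messages often show the reconnect here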
Re: [ceph-users] rbd map hangs
xists, up)100% osd61 10.0.5.3:680352%(exists, up)100% osd62 10.0.5.12:6800 42%(exists, up)100% osd63 10.0.5.12:6819 46%(exists, up)100% osd64 10.0.5.12:6809 44%(exists, up)100% osd65 10.0.5.13:6800 44%(exists, up)100% osd66 (unknown sockaddr family 0) 0%(doesn't exist) 100% osd67 10.0.5.13:6808 50%(exists, up)100% osd68 10.0.5.4:680441%(exists, up)100% osd69 10.0.5.4:680039%(exists, up)100% osd70 10.0.5.13:6804 42%(exists, up)100% osd71 (unknown sockaddr family 0) 0%(doesn't exist) 100% osd72 (unknown sockaddr family 0) 0%(doesn't exist) 100% osd73 10.0.5.16:6826 92%(exists, up)100% osd74 10.0.5.16:6846 100%(exists, up)100% osd75 10.0.5.16:6811 98%(exists, up)100% osd76 10.0.5.16:6815 100%(exists, up)100% osd77 10.0.5.16:6835 93%(exists, up)100% osd78 10.0.5.16:6802 97%(exists, up)100% osd79 10.0.5.16:6858 100%(exists, up)100% osd80 10.0.5.16:6839 91%(exists, up)100% osd81 10.0.5.16:6801 100%(exists, up)100% osd82 10.0.5.16:6820 99%(exists, up)100% osd83 10.0.5.16:6852 98%(exists, up)100% osd84 10.0.5.16:6862 93%(exists, up)100% osd85 10.0.5.16:6800 96%(exists, up)100% /sys/kernel/debug/ceph/b2b00aae-f00d-41b4-a29b-58859aa41375.client31276017/monmap epoch 12 mon010.0.5.2:6789 mon110.0.5.4:6789 mon210.0.5.13:6789 /sys/kernel/debug/ceph/b2b00aae-f00d-41b4-a29b-58859aa41375.client31276017/osdc 34533231osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533233osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533234osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533235osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533236osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533237osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533238osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533239osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34533241osd73 0.f0ae1f02 rbd_data.51f32238e1f29.13de set-alloc-hint,write 34919983osd67 0.f4cdfa38 rbd_header.51f32238e1f29 5613'998386622791680watch 34919984osd62.5aca5ef2 rbd_header.93285b6b8b4567 4422885'943544185389056 watch 34919985osd67 2.4dbc6037 rbd_header.5f75476b8b4567 28922'998386622791680 watch 34919986osd12.ba8d973e rbd_header.dd3b556b8b4567 5305738'894263730634752 watch /sys/kernel/debug/ceph/b2b00aae-f00d-41b4-a29b-58859aa41375.client31276017/monc have osdmap 232501 want next osdmap -- Tracy Reed http://tracyreed.org Digital signature attached for your safety. signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] virt-install into rbd hangs during Anaconda package installation
This is what I'm doing on my CentOS 7/KVM/virtlib server: rbd create --size 20G pool/vm.mydomain.com rbd map pool/vm.mydomain.com --name client.admin virt-install --name vm.mydomain.com --ram 2048 --disk path=/dev/rbd/pool/vm.mydomain.com --vcpus 1 --os-type linux --os-variant rhel6 --network bridge=dmz --graphics none --console pty,target_type=serial --location http://repo.mydomain.com/centos/7/os/x86_64 --extra-args "ip=en0:dhcp ks=http://repo.mydomain.com/ks/ks.cfg.vm console=ttyS0 ksdevice=eth0 inst.repo=http://10.0.10.5/http://repo.mydomain.com/centos/7/os/x86_64"; And then it creates partitions, filesystems (xfs), and starts installing packages. 9 times out of 10 it hangs while installing packages. And I have no idea why. I can't kill the VM. Trying to destroy it shows: virsh # destroy vm.mydomain.com error: Failed to destroy domain vm.mydomain.com error: Failed to terminate process 19629 with SIGKILL: Device or resource busy and then virsh ls shows: virsh ls shows: 127 vm.mydomain.comin shutdown The log for this vm in /var/log/libvirt/qemu/vm.mydomain.com contains only: 2017-02-06 08:14:12.256+: starting up libvirt version: 2.0.0, package: 10.el7_3.2 (CentOS BuildSystem <http://bugs.centos.org>, 2016-12-06-19:53:38, c1bm.rdu2.centos.org), qemu version: 1.5.3 (qemu-kvm-1.5.3-105.el7_2.7), hostname: cpu01.mydomain.com LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name secclass2.mydomain.com -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -cpu SandyBridge,+vme,+f16c,+rdrand,+fsgsbase,+smep,+erms -m 2048 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 5dadf01e-b996-411f-b95f-26ce6b790bae -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-127-secclass2.mydomain./monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-reboot -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -kernel /var/lib/libvirt/boot/virtinst-vmlinuz.9Ax4zt -initrd /var/lib/libvirt/boot/virtinst-initrd.img.ALJE43 -append 'ip=en0:dhcp ks=http://util1.mydomain.com/ks/ks.cfg.vm. 
console=ttyS0 ksdevice=eth0 inst.repo=http://10.0.10.5/http://util1.mydomain.com/centos/7/os/x86_64 method=http://util1.mydomain.com/centos/7/os/x86_64' -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive file=/dev/rbd/security-class/secclass2.mydomain.com,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=55,id=hostnet0,vhost=on,vhostfd=57 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:87:d2:12,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-127-secclass2.mydomain./org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on char device redirected to /dev/pts/24 (label charserial0) qemu: terminating on signal 15 from pid 23385 Any ideas? If this is a libvirt/kvm problem I'll take it to the appropriate forum but we can install into iscsi LUNs with no problem at all. Someone on IRC mentioned mkfs discard starting a zero on the rbd image which can take a long time but that should be doable in background and not hang the whole VM forever, right? Thanks for any insight you can provide! -- Tracy Reed signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] virt-install into rbd hangs during Anaconda package installation
Weird. Now the VMs that were hung in interruptable wait state have now disappeared. No idea why. Additional information: ceph-mds-10.2.3-0.el7.x86_64 python-cephfs-10.2.3-0.el7.x86_64 ceph-osd-10.2.3-0.el7.x86_64 ceph-radosgw-10.2.3-0.el7.x86_64 libcephfs1-10.2.3-0.el7.x86_64 ceph-common-10.2.3-0.el7.x86_64 ceph-base-10.2.3-0.el7.x86_64 ceph-10.2.3-0.el7.x86_64 ceph-selinux-10.2.3-0.el7.x86_64 ceph-mon-10.2.3-0.el7.x86_64 cluster b2b00aae-f00d-41b4-a29b-58859aa41375 health HEALTH_OK monmap e11: 3 mons at {ceph01=10.0.5.2:6789/0,ceph03=10.0.5.4:6789/0,ceph07=10.0.5.13:6789/0} election epoch 76, quorum 0,1,2 ceph01,ceph03,ceph07 osdmap e14396: 70 osds: 66 up, 66 in flags sortbitwise,require_jewel_osds pgmap v7116569: 1664 pgs, 3 pools, 7876 GB data, 1969 kobjects 23648 GB used, 24310 GB / 47958 GB avail 1661 active+clean 2 active+clean+scrubbing+deep 1 active+clean+scrubbing client io 839 kB/s wr, 0 op/s rd, 159 op/s wr On Mon, Feb 06, 2017 at 06:57:23PM PST, Tracy Reed spake thusly: > This is what I'm doing on my CentOS 7/KVM/virtlib server: > > rbd create --size 20G pool/vm.mydomain.com > > rbd map pool/vm.mydomain.com --name client.admin > > virt-install --name vm.mydomain.com --ram 2048 --disk > path=/dev/rbd/pool/vm.mydomain.com --vcpus 1 --os-type linux --os-variant > rhel6 --network bridge=dmz --graphics none --console pty,target_type=serial > --location http://repo.mydomain.com/centos/7/os/x86_64 --extra-args > "ip=en0:dhcp ks=http://repo.mydomain.com/ks/ks.cfg.vm console=ttyS0 > ksdevice=eth0 > inst.repo=http://10.0.10.5/http://repo.mydomain.com/centos/7/os/x86_64"; > > And then it creates partitions, filesystems (xfs), and > starts installing packages. 9 times out of 10 it hangs while > installing packages. And I have no idea why. I can't kill > the VM. > > Trying to destroy it shows: > > virsh # destroy vm.mydomain.com > error: Failed to destroy domain vm.mydomain.com > error: Failed to terminate process 19629 with SIGKILL: > Device or resource busy > > and then virsh ls shows: > > virsh ls shows: > > 127 vm.mydomain.comin shutdown > > The log for this vm in > /var/log/libvirt/qemu/vm.mydomain.com contains only: > > 2017-02-06 08:14:12.256+: starting up libvirt version: > 2.0.0, package: 10.el7_3.2 (CentOS BuildSystem > <http://bugs.centos.org>, 2016-12-06-19:53:38, > c1bm.rdu2.centos.org), qemu version: 1.5.3 > (qemu-kvm-1.5.3-105.el7_2.7), hostname: cpu01.mydomain.com > LC_ALL=C > PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin > QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name > secclass2.mydomain.com -S -machine > pc-i440fx-rhel7.0.0,accel=kvm,usb=off -cpu > SandyBridge,+vme,+f16c,+rdrand,+fsgsbase,+smep,+erms -m > 2048 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 > -uuid 5dadf01e-b996-411f-b95f-26ce6b790bae -nographic > -no-user-config -nodefaults -chardev > socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-127-secclass2.mydomain./monitor.sock,server,nowait > -mon chardev=charmonitor,id=monitor,mode=control -rtc > base=utc,driftfix=slew -global > kvm-pit.lost_tick_policy=discard -no-hpet -no-reboot > -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 > -boot strict=on -kernel > /var/lib/libvirt/boot/virtinst-vmlinuz.9Ax4zt -initrd > /var/lib/libvirt/boot/virtinst-initrd.img.ALJE43 -append > 'ip=en0:dhcp ks=http://util1.mydomain.com/ks/ks.cfg.vm. 
> console=ttyS0 ksdevice=eth0 > inst.repo=http://10.0.10.5/http://util1.mydomain.com/centos/7/os/x86_64 > method=http://util1.mydomain.com/centos/7/os/x86_64' > -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 > -device > ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 > -device > ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 > -device > ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 > -device > virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 > -drive > file=/dev/rbd/security-class/secclass2.mydomain.com,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native > -device > virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 > -netdev tap,fd=55,id=hostnet0,vhost=on,vhostfd=57 -device > virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:87:d2:12,bus=pci.0,addr=0x3 > -chardev pty,id=charserial0 -device > isa-serial,chardev=charserial0,id=serial0 -chardev > socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-127-secclass2.mydomain./org.qemu.guest_agent.0,server,nowait > -device > virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=ch
Re: [ceph-users] virt-install into rbd hangs during Anaconda package installation
On Tue, Feb 07, 2017 at 12:25:08AM PST, koukou73gr spake thusly: > On 2017-02-07 10:11, Tracy Reed wrote: > > Weird. Now the VMs that were hung in interruptable wait state have now > > disappeared. No idea why. > > Have you tried the same procedure but with local storage instead? Yes. I have local storage and iSCSI storage and they both install just fine. -- Tracy Reed signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] virt-install into rbd hangs during Anaconda package installation
On Wed, Feb 08, 2017 at 10:57:38AM PST, Shinobu Kinjo spake thusly: > If you would be able to reproduce the issue intentionally under > particular condition which I have no idea about at the moment, it > would be helpful. The issue is very reproducible. It hangs every time. Any install I do with virt-install causes a hang at some point during the install. I have reproduced it 3 times this morning already. > There were some MLs previously regarding to *similar* issue. > > # google "libvirt rbd issue" I found: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-September/004179.html which suggested file descriptors as the problem. That's good to know for when my cluster gets bigger, but I have only 70 OSDs and the number of fds used did not exceed 90 when the soft limit is 1024. My problem also manifests itself a little differently than described in that post. I can dd large machine images into rbd all day long with no problems. In fact, I am considering bypassing Anaconda kickstart installs for the moment and just copying in a machine image (an install does succeed occasionally), but that is not our normal deployment workflow, so it is not ideal. Plus I'm still concerned there is an actual underlying problem, or something I am not understanding, which may bite us later. That post also mentions jumbo frames. We have jumbo frames enabled everywhere. We did have a problem months ago with getting ceph up and running initially because we forgot to tell the switch to use jumbo frames and learned our lesson on that. Not sure what else I can look at. I'm not seeing any clues. -- Tracy Reed signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
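For reference, the file-descriptor check mentioned above can be run directly against the live qemu process (a sketch; qemu-kvm is the process name on this CentOS 7 host, and the VM under test is assumed to be the only one, or pick the right pid):

$ pid=$(pgrep -f qemu-kvm | head -1)
$ grep 'Max open files' /proc/$pid/limits   # soft/hard fd limits for the VM process
$ ls /proc/$pid/fd | wc -l                  # how many fds it actually has open right now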
[ceph-users] How safe is ceph pg repair these days?
I have a 3-replica cluster. A couple of times I have run into inconsistent
PGs. I googled it, and the ceph docs and various blogs say to run a repair
first. But a couple of people on IRC, and a mailing list thread from 2015, say
that ceph blindly copies the primary over the secondaries and calls it good:

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-May/001370.html

I sure hope that isn't the case. If so, it would seem highly irresponsible to
call such a naive command "repair". I have recently learned how to properly
analyze the OSD logs and fix these things manually, but not before having run
repair on a dozen inconsistent PGs, so now I'm worried about what sort of
corruption I may have introduced.

Repairing things by hand is a simple heuristic: compare the size or checksum
(whichever the logs indicate) for each of the 3 copies and work out which is
correct. Presumably two matching copies out of three should win and the odd
object out should be deleted, since having the exact same kind of error on two
different OSDs is highly improbable. I don't understand why ceph repair
wouldn't have done this all along.

What is the current best practice in the use of ceph repair?

Thanks!

-- Tracy Reed
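For context, the by-hand procedure described above looks roughly like the following on filestore OSDs; the PG id, OSD number, object name pattern and paths are purely illustrative, and the bad copy should only be removed with the affected OSD stopped:

    # Identify the inconsistent PG and the OSDs that hold it.
    ceph health detail | grep inconsistent
    ceph pg map 0.6

    # On each of the three OSD hosts, find the object and record its
    # size and checksum; with 3 replicas the odd one out is the suspect.
    find /var/lib/ceph/osd/ceph-77/current/0.6_head -name '*rb.0.1234*' \
        -exec stat -c '%n %s' {} \; -exec md5sum {} \;

    # Then stop the OSD holding the bad copy, move that one file aside,
    # restart the OSD, and deep-scrub/repair the PG so the object is
    # rewritten from a good replica.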
Re: [ceph-users] How safe is ceph pg repair these days?
Well, that's the question... is that safe? The mailing list post linked above
(possibly outdated) says that what you just suggested is definitely NOT safe.
Is that post wrong? Has the situation changed? Exactly what does ceph repair
do now? I suppose I could go dig into the code, but I'm not an expert and
would hate to get it wrong and post possibly bogus info to the list for other
newbies to find, worry about, and possibly lose their data over.

On Fri, Feb 17, 2017 at 06:08:39PM PST, Shinobu Kinjo spake thusly:
> if ``ceph pg deep-scrub `` does not work
> then
> do
> ``ceph pg repair
>
> On Sat, Feb 18, 2017 at 10:02 AM, Tracy Reed wrote:
> > I have a 3 replica cluster. A couple times I have run into inconsistent
> > PGs. I googled it and ceph docs and various blogs say run a repair
> > first. But a couple people on IRC and a mailing list thread from 2015
> > say that ceph blindly copies the primary over the secondaries and calls
> > it good.
> >
> > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-May/001370.html
> >
> > I sure hope that isn't the case. If so it would seem highly
> > irresponsible to implement such a naive command called "repair". I have
> > recently learned how to properly analyze the OSD logs and manually fix
> > these things but not before having run repair on a dozen inconsistent
> > PGs. Now I'm worried about what sort of corruption I may have
> > introduced. Repairing things by hand is a simple heuristic based on
> > comparing the size or checksum (as indicated by the logs) for each of
> > the 3 copies and figuring out which is correct. Presumably matching two
> > out of three should win and the odd object out should be deleted since
> > having the exact same kind of error on two different OSDs is highly
> > improbable. I don't understand why ceph repair wouldn't have done this
> > all along.
> >
> > What is the current best practice in the use of ceph repair?
> >
> > Thanks!
> >
> > --
> > Tracy Reed

-- Tracy Reed
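In other words, the quoted suggestion amounts to the following two commands run in that order, where the PG id is a placeholder:

    # Re-run a deep scrub on the PG and see whether it comes back clean.
    ceph pg deep-scrub 0.6
    ceph health detail        # re-check once the scrub has finished

    # Only if the PG is still flagged inconsistent, ask it to repair.
    ceph pg repair 0.6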
Re: [ceph-users] How safe is ceph pg repair these days?
On Mon, Feb 20, 2017 at 02:12:52PM PST, Gregory Farnum spake thusly:
> Hmm, I went digging in and sadly this isn't quite right.

Thanks for looking into this! This is the answer I was afraid of. Aren't all
of those blog entries which talk about using repair, and the ceph docs
themselves, putting people's data at risk? It seems like the only responsible
way to deal with inconsistent PGs is to dig into the OSD log, look at the
reason for the inconsistency, examine the data on disk, determine which copy
is good and which is bad, and delete the bad one?

> The code has a lot of internal plumbing to allow more smarts than were
> previously feasible and the erasure-coded pools make use of them for
> noticing stuff like local corruption. Replicated pools make an attempt
> but it's not as reliable as one would like and it still doesn't
> involve any kind of voting mechanism.

This is pretty surprising. I would have thought a best-two-out-of-three
voting mechanism in a triple-replicated setup would be the obvious way to go.
It must be more difficult to implement than I suppose.

> A self-inconsistent replicated primary won't get chosen. A primary is
> self-inconsistent when its digest doesn't match the data, which
> happens when:
> 1) the object hasn't been written since it was last scrubbed, or
> 2) the object was written in full, or
> 3) the object has only been appended to since the last time its digest
> was recorded, or
> 4) something has gone terribly wrong in/under LevelDB and the omap
> entries don't match what the digest says should be there.

At least there's some sort of basic heuristic which attempts to do the right
thing, even if the whole process isn't as thorough as it could be.

> David knows more and correct if I'm missing something. He's also
> working on interfaces for scrub that are more friendly in general and
> allow administrators to make more fine-grained decisions about
> recovery in ways that cooperate with RADOS.

These will be very welcome improvements!

-- Tracy Reed
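It is perhaps worth noting that Jewel already exposes the last scrub's per-shard findings through the rados tool, which helps with exactly this "decide which copy is bad" step; the pool name and PG id below are placeholders, and the exact output fields vary by version:

    # List PGs in a pool that scrub has flagged as inconsistent.
    rados list-inconsistent-pg rbd

    # Dump what the last deep scrub recorded for one of those PGs:
    # per-shard sizes, data digests and error flags.
    rados list-inconsistent-obj 0.6 --format=json-pretty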