Re: [ceph-users] getting pg inconsistent periodically
On Wed, 24 Apr 2019 at 08:46, Zhenshi Zhou wrote:

> Hi,
>
> I've been running a cluster for a while now, and recently it keeps
> running into an unhealthy state.
>
> With 'ceph health detail', one or two PGs are inconsistent. What's
> more, the PGs in the wrong state are not on the same disk from one day
> to the next, so I don't think it's a disk problem.
>
> The cluster is running version 12.2.5. Any idea about this strange issue?

There were lots of fixes in the releases around that version; do read
https://ceph.com/releases/12-2-7-luminous-released/ and the later release
notes in the 12.2.x series.

--
May the most significant bit of your life be positive.
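For readers hitting the same symptom, a minimal diagnostic sketch (assuming
the stock Luminous-era ceph/rados CLI; the pool name and PG id below are
placeholders):

    # List the PGs currently flagged inconsistent
    ceph health detail
    rados list-inconsistent-pg <pool-name>

    # Show what exactly is inconsistent inside one PG (which object, which shard)
    rados list-inconsistent-obj <pg-id> --format=json-pretty

    # For a plain scrub mismatch, a repair re-reads the authoritative copy
    ceph pg repair <pg-id>

Whether a repair is safe depends on what list-inconsistent-obj reports, so the
release notes linked above are still the first thing to check.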
Re: [ceph-users] getting pg inconsistent periodically
Hi,

I remember there was a bug affecting CephFS when upgrading from 12.2.5.
Is it safe to upgrade the cluster now?

Thanks

Janne Johansson wrote on Wed, 24 Apr 2019 at 16:06:

> On Wed, 24 Apr 2019 at 08:46, Zhenshi Zhou wrote:
>
>> Hi,
>>
>> I've been running a cluster for a while now, and recently it keeps
>> running into an unhealthy state.
>>
>> With 'ceph health detail', one or two PGs are inconsistent. What's
>> more, the PGs in the wrong state are not on the same disk from one day
>> to the next, so I don't think it's a disk problem.
>>
>> The cluster is running version 12.2.5. Any idea about this strange issue?
>
> There were lots of fixes in the releases around that version; do read
> https://ceph.com/releases/12-2-7-luminous-released/ and the later release
> notes in the 12.2.x series.
>
> --
> May the most significant bit of your life be positive.
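Before upgrading, it is worth confirming what every daemon and client is
actually running; a short sketch using standard Luminous admin commands
(nothing here is specific to this cluster):

    # Versions reported by each daemon type (mon, mgr, osd, mds)
    ceph versions

    # Feature bits reported by connected clients, handy for spotting old
    # CephFS kernel clients before an upgrade
    ceph features

    # Make sure the cluster is healthy before starting the upgrade
    ceph health detail

If I recall correctly, the 12.2.7 release notes called out extra care for
clusters that had run 12.2.5 with erasure-coded pools, so reading them end to
end before upgrading is still the safest answer to the question above.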
[ceph-users] unable to manually flush cache: failed to flush /xxx: (2) No such file or directory
Hi,

we're having an issue on one of our clusters. We want to remove a cache
tier, but manually flushing the cache always ends up with errors:

rados -p ssd-cache cache-flush-evict-all
. . .
failed to flush /rb.0.965780.238e1f29.1641: (2) No such file or directory
rb.0.965780.238e1f29.02c8
failed to flush /rb.0.965780.238e1f29.02c8: (2) No such file or directory
rb.0.965780.238e1f29.9113
failed to flush /rb.0.965780.238e1f29.9113: (2) No such file or directory
rb.0.965780.238e1f29.9b0f
failed to flush /rb.0.965780.238e1f29.9b0f: (2) No such file or directory
rb.0.965780.238e1f29.62b6
failed to flush /rb.0.965780.238e1f29.62b6: (2) No such file or directory
rb.0.965780.238e1f29.030c
. . .

The cluster is healthy and runs 13.2.5.

Any idea what might be wrong? If I should provide more details, please let
me know.

BR

nik

--
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.: +420 591 166 214
fax: +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
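For context, the usual sequence for draining a writeback cache tier before
removal looks roughly like the sketch below. This is an assumption about the
setup (pool names other than ssd-cache are placeholders), not a diagnosis of
the errors above:

    # Stop new writes being promoted into the cache tier
    ceph osd tier cache-mode ssd-cache proxy

    # Flush dirty objects and evict everything still held in the cache pool
    rados -p ssd-cache cache-flush-evict-all

    # Once the cache pool reports zero objects, detach it from the base pool
    ceph osd tier remove-overlay <base-pool>
    ceph osd tier remove <base-pool> ssd-cache

Why individual objects fail to flush varies (snapshots and already-deleted
heads are common suspects on this list), so this is only the general shape of
the procedure, not an explanation of the ENOENT errors.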
[ceph-users] rbd omap disappeared
My cluster hit a big problem this morning: many OSDs committed suicide because
of heartbeat_map timeouts. After I started all the OSDs manually the cluster
looked fine, but running 'rbd info' on an image listed by 'rbd ls' returned
"No such file or directory". I then used the procedure in
https://fnordahl.com/2017/04/17/ceph-rbd-volume-header-recovery/ to recover
the omap data. Most RBD images work again now, but 5 images still cannot be
recovered:

2019-04-24 16:00:40.045342 7fd5fdffb700 -1 librbd::image::OpenRequest: failed to retreive immutable metadata: (5) Input/output error
2019-04-24 16:00:40.045419 7fd5fd7fa700 -1 librbd::ImageState: 0x5593262fb830 failed to open image: (5) Input/output error
rbd: error opening image volume-bef74858-0fcb-4a0b-b197-9618e6824c46: (5) Input/output error

I tried to locate the image's omap head object, stopped the primary OSD,
deleted that PG's data on the primary, and restarted it. The three replicas
look identical when I inspect them with attr, but I still cannot get the rbd
info. The VM backed by this RBD image is still alive and working well, though.

ceph -s
    cluster 2bec9425-ea5f-4a48-b56a-fe88e126bced
     health HEALTH_WARN
            noout flag(s) set
     monmap e1: 3 mons at {a=10.191.175.249:6789/0,b=10.191.175.250:6789/0,c=10.191.175.251:6789/0}
            election epoch 26, quorum 0,1,2 a,b,c
     osdmap e22551: 1080 osds: 1078 up, 1078 in
            flags noout,sortbitwise,require_jewel_osds
      pgmap v29327873: 90112 pgs, 3 pools, 69753 GB data, 30081 kobjects
            214 TB used, 1500 TB / 1715 TB avail
               90111 active+clean
                   1 active+clean+scrubbing+deep
  client io 57082 kB/s rd, 207 MB/s wr, 1091 op/s rd, 7658 op/s wr

Below is the log of osd.219, which was the first OSD to report missed
heartbeats:

2019-04-24 04:00:54.905504 7f5eb7fab700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f5e97182700' had timed out after 15
...
2019-04-24 04:00:54.905536 7f5eb7fab700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f5e9b18a700' had timed out after 15
2019-04-24 04:00:57.499903 7f5df2997700 0 -- 10.191.175.20:6855/3169631 >> 10.191.175.33:6947/3001469 pipe(0x55d93b470800 sd=1914 :6855 s=2 pgs=514958 cs=17 l=0 c=0x55d91a3c1200).fault with nothing to send, going to standby
... (many similar lines repeated)
2019-04-24 04:00:57.651603 7f5e77735700 0 -- 10.191.175.20:6855/3169631 >> 10.191.175.36:6845/1062649 pipe(0x55d92d2cb400 sd=105 :45295 s=2 pgs=493749 cs=11 l=0 c=0x55d943755200).fault with nothing to send, going to standby
2019-04-24 04:00:59.905846 7f5eb7fab700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f5e97182700' had timed out after 15
...
2019-04-24 04:01:04.666159 7f5dbb6d7700 0 -- 10.191.175.20:6856/3169631 >> :/0 pipe(0x55d9234b1400 sd=1926 :6856 s=0 pgs=0 cs=0 l=0 c=0x55d924743600).accept failed to getpeername (107) Transport endpoint is not connected
2019-04-24 04:01:04.905958 7f5eb7fab700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f5e97182700' had timed out after 15
2019-04-24 04:01:14.550812 7f5db7793700 0 -- 10.191.175.20:6856/3169631 >> :/0 pipe(0x55d9234b2800 sd=1927 :6856 s=0 pgs=0 cs=0 l=0 c=0x55d924742a00).accept failed to getpeername (107) Transport endpoint is not connected
...
2019-04-24 04:01:14.691430 7f5db7793700 0 -- 10.191.175.20:6858/3169631 >> :/0 pipe(0x55d936a78800 sd=1383 :6858 s=0 pgs=0 cs=0 l=0 c=0x55d94ecf5a80).accept failed to getpeername (107) Transport endpoint is not connected
2019-04-24 04:01:14.692176 7f5e90469700 0 -- 10.191.175.20:6855/3169631 >> 10.191.175.31:6835/2092007 pipe(0x55d93c41a800 sd=67 :17328 s=2 pgs=571754 cs=83 l=0 c=0x55d939913180).fault, initiating reconnect
2019-04-24 04:01:14.693742 7f5e4c889700 0 -- 10.191.175.20:6855/3169631 >> 10.191.175.31:6835/2092007 pipe(0x55d93c41a800 sd=67 :17766 s=1 pgs=571754 cs=84 l=0 c=0x55d939913180).connect got RESETSESSION
2019-04-24 04:01:14.697098 7f5db256c700 0 -- 10.191.175.20:6908/169631 >> :/0 pipe(0x55d95606c800 sd=352 :6908 s=0 pgs=0 cs=0 l=0 c=0x55d920d79500).accept failed to getpeername (107) Transport endpoint is not connected
2019-04-24 04:01:14.697516 7f5e08523700 0 -- 10.191.175.20:6908/169631 >> :/0 pipe(0x55d91b30c000 sd=1926 :6908 s=0 pgs=0 cs=0 l=0 c=0x55d9263cb780).accept failed to getpeername (107) Transport endpoint is not connected
...
2019-04-24 04:01:14.704225 7f5dd7790700 0 -- 10.191.175.20:6908/169631 >> :/0 pipe(0x55d9531c5400 sd=1927 :6908 s=0 pgs=0 cs=0 l=0 c=0x55d9263ca280).accept failed to getpeername (107) Transport endpoint is not connected
2019-04-24 04:01:14.704511 7f5e78c4a700 0 -- 10.191.175.20:6855/3169631 >> 10.191.175.37:6801/2131766 pipe(0x55d939274800 sd=507 :10332 s=1 pgs=604906 cs=3 l=0 c=0x55d925714600).connect got RESETSESSION
...
2019-04-24 04:01:14.705970 7f5dc4980700 0 -- 10.191.175.20:6855/3169631 >> 10.191.175.38:6833/3181256 pipe(0x55d950fed400 sd=455 :58907 s=1 pgs=563194 cs=17 l=0 c=0x55d9320a2480).connect got RESETSESSION
2019-04-24 04:01:14.696315 7f5db548e700 0 -- 10.191.175.20:6908/169631 >> :/0 pipe(0x55d939e45400 sd=1929 :6908 s=0
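For anyone attempting a similar recovery, the kind of omap inspection the
linked blog post walks through looks roughly like this. Pool and image ids are
placeholders, and this is only a sketch of the read-only inspection side, not
the full recovery procedure:

    # The pool's rbd_directory object maps image names to ids and back
    rados -p <pool> listomapvals rbd_directory

    # The per-image header object holds the metadata librbd is failing to read
    # (size, order, features, object_prefix, snapshot info, ...)
    rados -p <pool> listomapkeys rbd_header.<image-id>
    rados -p <pool> listomapvals rbd_header.<image-id>

    # Dump a single key and compare it with the same key on a healthy image
    rados -p <pool> getomapval rbd_header.<image-id> features
    rados -p <pool> getomapval rbd_header.<healthy-image-id> features

If the header omap really is gone on all replicas, re-creating its keys from a
comparable healthy image is what the blog post describes; the data objects
themselves (prefixed rbd_data.<image-id>) should be left untouched, especially
while the VM is still running.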
Re: [ceph-users] VM management setup
Hello,

I would also recommend Proxmox. It is very easy to install and to manage your
KVM/LXC guests with, and it supports a huge range of storage backends.

Just my 2 cents

Hth
- Mehmet

On 6 April 2019 17:48:32 CEST, Marc Roos wrote:
>
> We also have a hybrid ceph/libvirt-kvm setup, using some scripts to do
> live migration. Do you have auto failover in your setup?
>
>
> -----Original Message-----
> From: jes...@krogh.cc [mailto:jes...@krogh.cc]
> Sent: 05 April 2019 21:34
> To: ceph-users
> Subject: [ceph-users] VM management setup
>
> Hi. Knowing this is a bit off-topic, but seeking recommendations and
> advice anyway.
>
> We're seeking a "management" solution for VMs - currently 40-50 VMs -
> and we would like better tooling for managing them and potentially
> migrating them across multiple hosts, setting up block devices, etc.
>
> This is only to be used internally in a department where a bunch of
> engineering people will manage it; no customers or that kind of thing.
>
> Up until now we have been using virt-manager with KVM and have been
> quite satisfied while we had "few VMs", but it seems like the time to
> move on.
>
> Thus we're looking for something "simple" that can help manage a
> ceph+kvm based setup - the simpler and more to the point the better.
>
> Any recommendations?
>
> .. found a lot of names already ..
> OpenStack
> CloudStack
> Proxmox
> ..
>
> But recommendations are truly welcome.
>
> Thanks.
[ceph-users] msgr2 and cephfs
Hi,

I'm standing up a new cluster on Nautilus to play with some of the new
features, and I've somehow got my monitors listening only on the msgr2 port
(3300) and not the legacy port (6789). I'm running kernel 4.15 on my clients.
Can I mount CephFS via port 3300, or do I have to figure out how to get my
mons listening on both?

Thanks,
Aaron
Re: [ceph-users] msgr2 and cephfs
AFAIK, the kernel clients for CephFS and RBD do not support msgr2 yet.

On Wed, Apr 24, 2019 at 4:19 PM Aaron Bassett wrote:
>
> Hi,
>
> I'm standing up a new cluster on Nautilus to play with some of the new
> features, and I've somehow got my monitors listening only on the msgr2 port
> (3300) and not the legacy port (6789). I'm running kernel 4.15 on my clients.
> Can I mount CephFS via port 3300, or do I have to figure out how to get my
> mons listening on both?
>
> Thanks,
> Aaron

--
Jason
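To spell that out for the archive: a 4.15 kernel client has to reach the mons
on the v1 port, so a mount ends up looking like the sketch below. The mon
address is taken from the log later in this thread, and the secret-file path
is a placeholder:

    # Kernel CephFS mount over the legacy (msgr1) port 6789; a 4.15 kernel
    # client cannot speak msgr2 on 3300
    sudo mount -t ceph 172.17.40.143:6789:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret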
Re: [ceph-users] msgr2 and cephfs
Yeah, OK, that's what I guessed. I'm struggling to get my mons to listen on
both ports. On startup they report:

2019-04-24 19:58:43.652 7fcf9cd3c040 -1 WARNING: 'mon addr' config option [v2:172.17.40.143:3300/0,v1:172.17.40.143:6789/0] does not match monmap file
         continuing with monmap configuration
2019-04-24 19:58:43.652 7fcf9cd3c040 0 starting mon.bos-r1-r3-head1 rank 0 at public addrs v2:172.17.40.143:3300/0 at bind addrs v2:172.17.40.143:3300/0 mon_data /var/lib/ceph/mon/ceph-bos-r1-r3-head1 fsid 4a361f9c-e28b-4b6b-ab59-264dcb51da97

Does that mean I have to jump through the add/remove-mon hoops, or should I
just burn it down and start over? FWIW, the docs seem to indicate the mons
will listen on both by default (in Nautilus).

Aaron

> On Apr 24, 2019, at 4:29 PM, Jason Dillaman wrote:
>
> AFAIK, the kernel clients for CephFS and RBD do not support msgr2 yet.
Re: [ceph-users] msgr2 and cephfs
Ah, never mind, I found 'ceph mon set-addrs' and I'm good to go.

Aaron

> On Apr 24, 2019, at 4:36 PM, Aaron Bassett wrote:
>
> Yeah, OK, that's what I guessed. I'm struggling to get my mons to listen on
> both ports. On startup they report:
>
> 2019-04-24 19:58:43.652 7fcf9cd3c040 -1 WARNING: 'mon addr' config option [v2:172.17.40.143:3300/0,v1:172.17.40.143:6789/0] does not match monmap file
>          continuing with monmap configuration
> 2019-04-24 19:58:43.652 7fcf9cd3c040 0 starting mon.bos-r1-r3-head1 rank 0 at public addrs v2:172.17.40.143:3300/0 at bind addrs v2:172.17.40.143:3300/0 mon_data /var/lib/ceph/mon/ceph-bos-r1-r3-head1 fsid 4a361f9c-e28b-4b6b-ab59-264dcb51da97
>
> Does that mean I have to jump through the add/remove-mon hoops, or should I
> just burn it down and start over? FWIW, the docs seem to indicate the mons
> will listen on both by default (in Nautilus).
>
> Aaron
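For completeness, the shape of that fix is roughly the following, assuming the
Nautilus mon commands and reusing the address from the startup log above
(verify the exact syntax against the msgr2 migration docs before running it):

    # Advertise both the msgr2 and the legacy msgr1 address for this mon
    ceph mon set-addrs bos-r1-r3-head1 [v2:172.17.40.143:3300,v1:172.17.40.143:6789]

    # Confirm the monmap now lists both protocols
    ceph mon dump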
Re: [ceph-users] VM management setup
+1 for Proxmox.

(I'm a contributor, and I can say that the Ceph support is very good.)

----- Original message -----
From: jes...@krogh.cc
To: "ceph-users"
Sent: Friday, 5 April 2019 21:34:02
Subject: [ceph-users] VM management setup

Hi. Knowing this is a bit off-topic, but seeking recommendations and advice
anyway.

We're seeking a "management" solution for VMs - currently 40-50 VMs - and we
would like better tooling for managing them and potentially migrating them
across multiple hosts, setting up block devices, etc.

This is only to be used internally in a department where a bunch of
engineering people will manage it; no customers or that kind of thing.

Up until now we have been using virt-manager with KVM and have been quite
satisfied while we had "few VMs", but it seems like the time to move on.

Thus we're looking for something "simple" that can help manage a ceph+kvm
based setup - the simpler and more to the point the better.

Any recommendations?

.. found a lot of names already ..
OpenStack
CloudStack
Proxmox
..

But recommendations are truly welcome.

Thanks.