[ceph-users] Re: Nautilus 14.2.6 ceph-volume bluestore _read_fsid unparsable uuid
On Tue, Jan 28, 2020 at 08:03:35PM +0100, bauen1 wrote:
>Hi,
>
>I've run into the same issue while testing:
>
>ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)
>
>debian bullseye
>
>Ceph was installed using ceph-ansible on a vm from the repo http://download.ceph.com/debian-nautilus
>
>The output of `sudo sh -c 'CEPH_VOLUME_DEBUG=true ceph-volume --cluster test lvm batch --bluestore /dev/vdb'` has been attached.

Thx, I opened https://tracker.ceph.com/issues/43868. This looks like a bluestore/OSD issue to me, though it might end up being ceph-volume's fault.

>Also worth noting might be that '/var/lib/ceph/osd/test-0/fsid' is empty (but I don't know too much about the internals)
>
>- bauen1
>
>On 1/28/20 4:54 PM, Dave Hall wrote:
>>Jan,
>>
>>Unfortunately I'm under immense pressure right now to get some form of Ceph into production, so it's going to be Luminous for now, or maybe a live upgrade to Nautilus without recreating the OSDs (if that's possible).
>>
>>The good news is that in the next couple of months I expect to add more hardware that should be nearly identical. I will gladly give it a go at that time and see if I can recreate the problem. (Or, if I manage to thoroughly crash my current fledgling cluster, I'll give it another go on one node while I'm up all night recovering.)
>>
>>If you could tell me where to look I'd gladly read some code and see if I can find anything that way. Or if there's any sort of design document describing the deep internals I'd be glad to scan it to see if I've hit a corner case of some sort. Actually, I'd be interested in reading those documents anyway if I could.
>>
>>Thanks.
>>
>>-Dave
>>
>>Dave Hall
>>
>>On 1/28/2020 3:05 AM, Jan Fajerski wrote:
>>>On Mon, Jan 27, 2020 at 03:23:55PM -0500, Dave Hall wrote:
All, I've just spent a significant amount of time unsuccessfully chasing the _read_fsid unparsable uuid error on Debian 10 / Nautilus 14.2.6. Since this is a brand new cluster, last night I gave up and moved back to Debian 9 / Luminous 12.2.11. In both cases I'm using the packages from Debian Backports with ceph-ansible as my deployment tool.

Note that above I said 'the _read_fsid unparsable uuid' error. I've searched around a bit and found some previously reported issues, but I did not see any conclusive resolutions. I would like to get to Nautilus as quickly as possible, so I'd gladly provide additional information to help track down the cause of this symptom.

I can confirm that, looking at the ceph-volume.log on the OSD host, I see no difference between the ceph-volume lvm batch command generated by the ceph-ansible versions associated with these two Ceph releases:

ceph-volume --cluster ceph lvm batch --bluestore --yes --block-db-size 133358734540 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/nvme0n1

Note that I'm using --block-db-size to divide my NVMe into 12 segments as I have 4 empty drive bays on my OSD servers that I may eventually be able to fill.
My OSD hardware is:

Disk /dev/nvme0n1: 1.5 TiB, 1600321314816 bytes, 3125627568 sectors
Disk /dev/sdc: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdd: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sde: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdf: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdg: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdh: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdi: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdj: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors

I'd send the output of ceph-volume inventory on Luminous, but I'm getting -->: KeyError: 'human_readable_size'. Please let me know if I can provide any further information.

>>>Mind re-running your ceph-volume command with debug output enabled:
>>>CEPH_VOLUME_DEBUG=true ceph-volume --cluster ceph lvm batch --bluestore ...
>>>
>>>Ideally you could also open a bug report here:
>>>https://tracker.ceph.com/projects/ceph-volume/issues/new
>>>
>>>Thanks!

Thanks.

-Dave

--
Dave Hall
Binghamton University

>sysadmin@ceph-test:~$ sudo setenforce 0
>sysadmin@ceph-test:~$ sudo sh -c 'CEPH_VOLUME_DEBUG=true ceph-volume --cluster test lvm batch --bluestore /dev/vdb'
>
>Total OSDs: 1
>
> TypePath
[ceph-users] Re: Ceph MDS specific perf info disappeared in Nautilus
Hi,

Quoting Dan van der Ster (d...@vanderster.com):
> Maybe you're checking a standby MDS ?

Looks like it. The active MDS does have performance metrics.

Thanks,
Stefan

--
| BIT BV  https://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6  +31 318 648 688 / i...@bit.nl
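A quick way to tell which MDS is active before pulling counters (a minimal sketch; the daemon name is a placeholder):

# show active vs. standby MDS daemons
ceph fs status
# dump perf counters from the active daemon (run on its host)
ceph daemon mds.<active-name> perf dump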
[ceph-users] Write i/o in CephFS metadata pool
Hi!

I've been running CephFS for a while now and ever since setting it up, I've seen unexpectedly large write i/o on the CephFS metadata pool. The filesystem is otherwise stable and I'm seeing no usage issues.

I'm in a read-intensive environment from the clients' perspective, and throughput for the metadata pool is consistently larger than that of the data pool. For example:

# ceph osd pool stats
pool cephfs_data id 1
  client io 7.6 MiB/s rd, 19 KiB/s wr, 404 op/s rd, 1 op/s wr

pool cephfs_metadata id 2
  client io 338 KiB/s rd, 43 MiB/s wr, 84 op/s rd, 26 op/s wr

I realise, of course, that this is a momentary display of statistics, but I see this unbalanced r/w activity consistently when monitoring it live.

I would like some insight into what may be causing this large imbalance in r/w, especially since I'm in a read-intensive (web hosting) environment. Some of it may be expected when considering details of my environment and CephFS implementation specifics, so please ask away if more details are needed.

With my experience using NFS, I would start by looking at client io stats, like `nfsstat`, and tuning e.g. mount options, but I haven't been able to find such statistics for CephFS clients. Is there anything of the sort for CephFS? Are similar stats obtainable in some other way?

This might be a somewhat broad question and shallow description, so yeah, let me know if there's anything you would like more details on.

Thanks a lot,
Samy
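For reference, a few places where CephFS I/O counters can be inspected. This is a hedged sketch rather than a full `nfsstat` equivalent; names in angle brackets are placeholders:

# MDS-side counters; the objecter section shows I/O the MDS itself issues to the metadata pool
ceph daemon mds.<name> perf dump objecter
# live, top-like view of MDS request and journal activity
ceph daemonperf mds.<name>
# kernel clients expose their in-flight MDS requests under debugfs
cat /sys/kernel/debug/ceph/<fsid>.client<id>/mdsc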
[ceph-users] Re: getting rid of incomplete pg errors
There should be docs on how to mark an OSD lost, which I would expect to be linked from the troubleshooting PGs page. There is also a command to force create PGs but I don’t think that will help in this case since you already have at least one copy. On Tue, Jan 28, 2020 at 5:15 PM Hartwig Hauschild wrote: > Hi. > > before I descend into what happened and why it happened: I'm talking about > a > test-cluster so I don't really care about the data in this case. > > We've recently started upgrading from luminous to nautilus, and for us that > means we're retiring ceph-disk in favour of ceph-volume with lvm and > dmcrypt. > > Our setup is in containers and we've got DBs separated from Data. > When testing our upgrade-path we discovered that running the host on > ubuntu-xenial and the containers on centos-7.7 leads to lvm inside the > containers not using lvmetad because it's too old. That in turn means that > not running `vgscan --cache` on the host before adding a LV to a VG > essentially zeros the metadata for all LVs in that VG. > > That happened on two out of three hosts for a bunch of OSDs and those OSDs > are gone. I have no way of getting them back, they've been overwritten > multiple times trying to figure out what went wrong. > > So now I have a cluster that's got 16 pgs in 'incomplete', 14 of them with > 0 > objects, 2 with about 150 objects each. > > I have found a couple of howtos that tell me to use ceph-objectstore-tool > to > find the pgs on the active osds and I've given that a try, but > ceph-objectstore-tool always tells me it can't find the pg I am looking > for. > > Can I tell ceph to re-init the pgs? Do I have to delete the pools and > recreate them? > > There's no data I can't get back in there, I just don't feel like > scrapping and redeploying the whole cluster. > > > -- > Cheers, > Hardy > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
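For reference, the commands Gregory mentions are roughly the following (a hedged sketch; <osd-id> and <pg-id> are placeholders, and both operations can discard data, so only use them on a cluster you can afford to lose):

# declare a dead OSD permanently lost so peering can proceed without it
ceph osd lost <osd-id> --yes-i-really-mean-it
# recreate an empty PG whose copies are all gone (its objects are lost)
ceph osd force-create-pg <pg-id> --yes-i-really-mean-it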
[ceph-users] Re: Concurrent append operations
The core RADOS api will order these on the osd as it receives the operations from clients, and nothing will break if you submit 2 in parallel. I’m less familiar with the S3 interface but I believe appends there will be ordered by the rgw daemon and so will be much slower. Or maybe it works the same as a normal s3 object overwrite and the second one will win and the first one goes into oblivion? Either way it probably won’t fit your needs. -Greg On Mon, Jan 20, 2020 at 3:42 PM David Bell wrote: > Hello, > > I am currently evaluating Ceph for our needs and I have a question > about the 'object append' feature. I note that the rados core API > supports an 'append' operation, and the S3-compatible interface has > too. > > My question is: does Ceph support concurrent append? I would like to > use Ceph as a temporary store, a "buffer" if you will, for incoming > data from a variety of sources. Each object would hold data for a > particular identifier. I'd like to know if two or more different > clients can 'append' to the same object, and the data doesn't > overwrite each other, and each 'append' is added to the end of the > object? > > Performance wise we'd likely be performing 15-20 thousand writes per > second, so we'd be building a pretty big cluster on very fast flash > disk. Data would only reside on the system for about an hour at most > before being read and deleted. > > Cheers, > > David Bell > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: High CPU usage by ceph-mgr in 14.2.6
On 2020-01-29 01:19, jbardg...@godaddy.com wrote:
> We feel this is related to the size of the cluster, similarly to the
> previous report.
>
> Anyone else experiencing this and/or can provide some direction on
> how to go about resolving this?

What Manager modules are enabled on that node? Have you tried disabling some of them, e.g. the Dashboard or Balancer module?

Lenz

--
SUSE Software Solutions Germany GmbH - Maxfeldstr. 5 - 90409 Nuernberg
GF: Felix Imendörffer, HRB 36809 (AG Nürnberg)
[ceph-users] Re: Luminous Bluestore OSDs crashing with ASSERT
Hi Stefan, the proper Ceph way of sending log for developer analysis is ceph-post-file but I'm not good in retrieving them from there... Ideally I'd prefer to start with log snippets covering 20K lines prior to crash. 3 or 4 of them. This wouldn't take so much space and you can send them by email, attach to a ticket or share via some publicly available URL. Accessing the whole logs also works for me. If this doesn't work then let's go ceph-post-file way, I have to pass this trail one day... Thanks, Igor On 1/28/2020 3:09 PM, Stefan Priebe - Profihost AG wrote: Hello Igor, i updated all servers to latest 4.19.97 kernel but this doesn't fix the situation. I can provide you with all those logs - any idea where to upload / how to sent them to you? Greets, Stefan Am 20.01.20 um 13:12 schrieb Igor Fedotov: Hi Stefan, these lines are result of transaction dump performed on a failure during transaction submission (which is shown as "submit_transaction error: Corruption: block checksum mismatch code = 2" Most probably they are out of interest (checksum errors are unlikely to be caused by transaction content) and hence we need earlier stuff to learn what caused that checksum mismatch. It's hard to give any formal overview of what you should look for, from my troubleshooting experience generally one may try to find: - some previous error/warning indications (e.g. allocation, disk access, etc) - prior OSD crashes (sometimes they might have different causes/stack traces/assertion messages) - any timeout or retry indications - any uncommon log patterns which aren't present during regular running but happen each time before the crash/failure. Anyway I think the inspection depth should be much(?) deeper than presumably it is (from what I can see from your log snippets). Ceph keeps last 1 log events with an increased log level and dumps them on crash with negative index starting at - up to -1 as a prefix. -1> 2020-01-16 01:10:13.404090 7f3350a14700 -1 rocksdb: It would be great If you share several log snippets for different crashes containing these last 1 lines. Thanks, Igor On 1/19/2020 9:42 PM, Stefan Priebe - Profihost AG wrote: Hello Igor, there's absolutely nothing in the logs before. What do those lines mean: Put( Prefix = O key = 0x7f8001cc45c881217262'd_data.4303206b8b4567.9632!='0xfffe6f0012'x' Value size = 480) Put( Prefix = O key = 0x7f8001cc45c881217262'd_data.4303206b8b4567.9632!='0xfffe'o' Value size = 510) on the right size i always see 0xfffe on all failed OSDs. greets, Stefan Am 19.01.20 um 14:07 schrieb Stefan Priebe - Profihost AG: Yes, except that this happens on 8 different clusters with different hw but same ceph version and same kernel version. Greets, Stefan Am 19.01.2020 um 11:53 schrieb Igor Fedotov : So the intermediate summary is: Any OSD in the cluster can experience interim RocksDB checksum failure. Which isn't present after OSD restart. No HW issues observed, no persistent artifacts (except OSD log) afterwards. And looks like the issue is rather specific to the cluster as no similar reports from other users seem to be present. Sorry, I'm out of ideas other then collect all the failure logs and try to find something common in them. May be this will shed some light.. BTW from my experience it might make sense to inspect OSD log prior to failure (any error messages and/or prior restarts, etc) sometimes this might provide some hints. Thanks, Igor On 1/17/2020 2:30 PM, Stefan Priebe - Profihost AG wrote: HI Igor, Am 17.01.20 um 12:10 schrieb Igor Fedotov: hmmm.. 
Just in case - suggest to check H/W errors with dmesg. this happens on around 80 nodes - i don't expect all of those have not identified hw errors. Also all of them are monitored - no dmesg outpout contains any errors. Also there are some (not very much though) chances this is another incarnation of the following bug: https://tracker.ceph.com/issues/22464 https://github.com/ceph/ceph/pull/24649 The corresponding PR works around it for main device reads (user data only!) but theoretically it might still happen either for DB device or DB data at main device. Can you observe any bluefs spillovers? Are there any correlation between failing OSDs and spillover presence if any, e.g. failing OSDs always have a spillover. While OSDs without spillovers never face the issue... To validate this hypothesis one can try to monitor/check (e.g. once a day for a week or something) "bluestore_reads_with_retries" counter over OSDs to learn if the issue is happening in the system. Non-zero values mean it's there for user data/main device and hence is likely to happen for DB ones as well (which doesn't have any workaround yet). OK i checked bluestore_reads_with_retries on 360 osds but all of them say 0. Additionally you might want to
[ceph-users] ceph fs dir-layouts and sub-directory mounts
I would like to (in this order):

- set the data pool for the root "/" of a ceph-fs to a custom value, say "P" (not the initial data pool used in fs new)
- create a sub-directory of "/", for example "/a"
- mount the sub-directory "/a" with a client key whose access is restricted to "/a"

The client will not be able to see the dir layout attribute set at "/", since it's not mounted. Will the data of this client still go to the pool "P"? That is, does "/a" inherit the dir layout transparently to the client when following the steps above?

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
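For context, the setup described above can be expressed roughly as follows. This is a sketch using the pool "P", path "/a" and a hypothetical client.a from the question, with "/" mounted at /mnt/cephfs on an admin client:

# make the pool known to the filesystem, then set it as the layout of "/"
ceph fs add_data_pool <fsname> P
setfattr -n ceph.dir.layout.pool -v P /mnt/cephfs
mkdir /mnt/cephfs/a
# create a key whose MDS access is restricted to "/a"
ceph fs authorize <fsname> client.a /a rw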
[ceph-users] Re: getting rid of incomplete pg errors
Hi, I had looked at the output of `ceph health detail` which told me to search for 'incomplete' in the docs. Since that said to file a bug (and I was sure that filing a bug did not help) I continued to purge the Disks that we hat overwritten and ceph then did some magic and told me that the PGs were again available on three OSDs but were incomplete. I have now gone ahead and marked all three of the OSDs where one of my incomplete PGs is (according to `ceph pg ls incomplete`) as lost one by one, waiting for ceph status to settle in between and that lead to the PG now being incomplete on three different OSDs. Also, force-create-pg tells me "already created". Am 29.01.2020 schrieb Gregory Farnum: > There should be docs on how to mark an OSD lost, which I would expect to be > linked from the troubleshooting PGs page. > > There is also a command to force create PGs but I don’t think that will > help in this case since you already have at least one copy. > > On Tue, Jan 28, 2020 at 5:15 PM Hartwig Hauschild > wrote: > > > Hi. > > > > before I descend into what happened and why it happened: I'm talking about > > a > > test-cluster so I don't really care about the data in this case. > > > > We've recently started upgrading from luminous to nautilus, and for us that > > means we're retiring ceph-disk in favour of ceph-volume with lvm and > > dmcrypt. > > > > Our setup is in containers and we've got DBs separated from Data. > > When testing our upgrade-path we discovered that running the host on > > ubuntu-xenial and the containers on centos-7.7 leads to lvm inside the > > containers not using lvmetad because it's too old. That in turn means that > > not running `vgscan --cache` on the host before adding a LV to a VG > > essentially zeros the metadata for all LVs in that VG. > > > > That happened on two out of three hosts for a bunch of OSDs and those OSDs > > are gone. I have no way of getting them back, they've been overwritten > > multiple times trying to figure out what went wrong. > > > > So now I have a cluster that's got 16 pgs in 'incomplete', 14 of them with > > 0 > > objects, 2 with about 150 objects each. > > > > I have found a couple of howtos that tell me to use ceph-objectstore-tool > > to > > find the pgs on the active osds and I've given that a try, but > > ceph-objectstore-tool always tells me it can't find the pg I am looking > > for. > > > > Can I tell ceph to re-init the pgs? Do I have to delete the pools and > > recreate them? > > > > There's no data I can't get back in there, I just don't feel like > > scrapping and redeploying the whole cluster. > > > > > > -- > > Cheers, > > Hardy > > ___ > > ceph-users mailing list -- ceph-users@ceph.io > > To unsubscribe send an email to ceph-users-le...@ceph.io > > > > -- Cheers, Hardy ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Write i/o in CephFS metadata pool
On 29/01/2020 10:24, Samy Ascha wrote: > I've been running CephFS for a while now and ever since setting it up, I've > seen unexpectedly large write i/o on the CephFS metadata pool. > > The filesystem is otherwise stable and I'm seeing no usage issues. > > I'm in a read-intensive environment, from the clients' perspective and > throughput for the metadata pool is consistently larger than that of the data > pool. > > [...] > > This might be a somewhat broad question and shallow description, so yeah, let > me know if there's anything you would like more details on. No explanation, but chiming in, as I've seen something similar happen on my single node "cluster" at home, where I'm exposing a cephfs through Samba using vfs_ceph, mostly for time machine backups. Running ceph 14.2.6 on debian buster. I can easily perform debugging operations there, no SLA in place :) Jasper ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: cephfs : write error: Operation not permitted
Hello,

Sorry, this should be:

ceph osd pool application set cephfs_data cephfs data cephfs
ceph osd pool application set cephfs_metadata cephfs metadata cephfs

so that the JSON output looks like:

"cephfs_data" { "cephfs": { "data": "cephfs" } }
"cephfs_metadata" { "cephfs": { "metadata": "cephfs" } }

Thanks a lot, that has fixed my issue!

Best,

--
Yoann Moulin
EPFL IC-IT
[ceph-users] Network performance checks
After having upgraded my ceph cluster from Luminous to Nautilus 14.2.6, from time to time "ceph health detail" claims about some "Long heartbeat ping times on front/back interface seen". As far as I can understand (after having read https://docs.ceph.com/docs/nautilus/rados/operations/monitoring/), this means that the ping from one OSD to another one exceeded 1 s.

I have some questions on these network performance checks:

1) What is meant exactly with front and back interface?

2) I can see the involved OSDs only in the output of "ceph health detail" (when there is the problem) but I can't find this information in the log files. In the mon log file I can only see messages such as:

2020-01-28 11:14:07.641 7f618e644700 0 log_channel(cluster) log [WRN] : Health check failed: Long heartbeat ping times on back interface seen, longest is 1416.618 msec (OSD_SLOW_PING_TIME_BACK)

but the involved OSDs are not reported in this log. Do I just need to increase the verbosity of the mon log?

3) Is 1 s a reasonable value for this threshold? How could this value be changed? What is the relevant configuration variable?

4) https://docs.ceph.com/docs/nautilus/rados/operations/monitoring/ suggests to use the dump_osd_network command. I think there is an error in that page: it says that the command should be issued on ceph-mgr.x.asok, while I think that instead the ceph-osd-x.asok should be used.

I have another ceph cluster (running nautilus 14.2.6 as well) where there aren't OSD_SLOW_PING_* error messages in the mon logs, but:

ceph daemon /var/run/ceph/ceph-osd..asok dump_osd_network 1

reports a lot of entries (i.e. pings exceeded 1 s). How can this be explained?

Thanks, Massimo
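Regarding question 3), the warning threshold is configurable. A hedged sketch of the knobs and the per-OSD dump as I understand them in 14.2.x (values in milliseconds; double-check the option names against your version):

# recent heartbeat ping times above the given threshold, per OSD
ceph daemon osd.<id> dump_osd_network 1000
# raise the health-warning threshold, e.g. to 2 s
ceph config set global mon_warn_on_slow_ping_time 2000
# or keep it relative to osd_heartbeat_grace via the ratio setting
ceph config set global mon_warn_on_slow_ping_ratio 0.10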
[ceph-users] Re: Write i/o in CephFS metadata pool
Sammy; I had a thought; since you say the FS has high read activity, but you're seeing large write I/O... is it possible that this is related to atime (Linux last access time)? If I remember my Linux FS basics, atime is stored in the file entry for the file in the directory, and I believe directory information is stored in the metadata pool (dentries?). As a test; you might try mounting the CephFS with the noatime flag. Then see if the write I/O is reduced. I honestly don't know if CephFS supports atime, but I would expect it would. Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International Inc. dhils...@performair.com www.PerformAir.com -Original Message- From: Samy Ascha [mailto:s...@xel.nl] Sent: Wednesday, January 29, 2020 2:25 AM To: ceph-users@ceph.io Subject: [ceph-users] Write i/o in CephFS metadata pool Hi! I've been running CephFS for a while now and ever since setting it up, I've seen unexpectedly large write i/o on the CephFS metadata pool. The filesystem is otherwise stable and I'm seeing no usage issues. I'm in a read-intensive environment, from the clients' perspective and throughput for the metadata pool is consistently larger than that of the data pool. For example: # ceph osd pool stats pool cephfs_data id 1 client io 7.6 MiB/s rd, 19 KiB/s wr, 404 op/s rd, 1 op/s wr pool cephfs_metadata id 2 client io 338 KiB/s rd, 43 MiB/s wr, 84 op/s rd, 26 op/s wr I realise, of course, that this is a momentary display of statistics, but I see this unbalanced r/w activity consistently when monitoring it live. I would like some insight into what may be causing this large imbalance in r/w, especially since I'm in a read-intensive (web hosting) environment. Some of it may be expected in when considering details of my environment and CephFS implementation specifics, so please ask away if more details are needed. With my experience using NFS, I would start by looking at client io stats, like `nfsstat` and tuning e.g. mount options, but I haven't been able to find such statistics for CephFS clients. Is there anything of the sort for CephFS? Are similar stats obtainable in some other way? This might be a somewhat broad question and shallow description, so yeah, let me know if there's anything you would like more details on. Thanks a lot, Samy ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
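A minimal example of the suggested mount, assuming the kernel client (monitor address, client name and secret file are placeholders):

mount -t ceph mon1:6789:/ /mnt/cephfs -o name=webclient,secretfile=/etc/ceph/webclient.secret,noatime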
[ceph-users] Re: Write i/o in CephFS metadata pool
Hi Dominic, I should have mentioned that I've set noatime already. I have not found any obvious other mount options that would contribute to 'write on read' behaviour.. Thx Samy > On 29 Jan 2020, at 15:43, dhils...@performair.com wrote: > > Sammy; > > I had a thought; since you say the FS has high read activity, but you're > seeing large write I/O... is it possible that this is related to atime > (Linux last access time)? If I remember my Linux FS basics, atime is stored > in the file entry for the file in the directory, and I believe directory > information is stored in the metadata pool (dentries?). > > As a test; you might try mounting the CephFS with the noatime flag. Then see > if the write I/O is reduced. > > I honestly don't know if CephFS supports atime, but I would expect it would. > > Thank you, > > Dominic L. Hilsbos, MBA > Director - Information Technology > Perform Air International Inc. > dhils...@performair.com > www.PerformAir.com > > > > -Original Message- > From: Samy Ascha [mailto:s...@xel.nl] > Sent: Wednesday, January 29, 2020 2:25 AM > To: ceph-users@ceph.io > Subject: [ceph-users] Write i/o in CephFS metadata pool > > Hi! > > I've been running CephFS for a while now and ever since setting it up, I've > seen unexpectedly large write i/o on the CephFS metadata pool. > > The filesystem is otherwise stable and I'm seeing no usage issues. > > I'm in a read-intensive environment, from the clients' perspective and > throughput for the metadata pool is consistently larger than that of the data > pool. > > For example: > > # ceph osd pool stats > pool cephfs_data id 1 > client io 7.6 MiB/s rd, 19 KiB/s wr, 404 op/s rd, 1 op/s wr > > pool cephfs_metadata id 2 > client io 338 KiB/s rd, 43 MiB/s wr, 84 op/s rd, 26 op/s wr > > I realise, of course, that this is a momentary display of statistics, but I > see this unbalanced r/w activity consistently when monitoring it live. > > I would like some insight into what may be causing this large imbalance in > r/w, especially since I'm in a read-intensive (web hosting) environment. > > Some of it may be expected in when considering details of my environment and > CephFS implementation specifics, so please ask away if more details are > needed. > > With my experience using NFS, I would start by looking at client io stats, > like `nfsstat` and tuning e.g. mount options, but I haven't been able to find > such statistics for CephFS clients. > > Is there anything of the sort for CephFS? Are similar stats obtainable in > some other way? > > This might be a somewhat broad question and shallow description, so yeah, let > me know if there's anything you would like more details on. > > Thanks a lot, > Samy > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: High CPU usage by ceph-mgr in 14.2.6
Modules that are normally enabled:

ceph mgr module ls | jq -r '.enabled_modules'
[
  "dashboard",
  "prometheus",
  "restful"
]

We did test with all modules disabled, restarted the mgrs and saw no difference.

Joe
[ceph-users] Servicing multiple OpenStack clusters from the same Ceph cluster
Hello, We have a medium-sized Ceph Luminous cluster that, up til now, has been the RBD image backend solely for an OpenStack Newton cluster that's marked for upgrade to Stein later this year. Recently we deployed a brand new Stein cluster however, and I'm curious whether the idea of pointing the new OpenStack cluster at the same RBD pools for Cinder/Glance/Nova as the Luminous cluster would be considered bad practice, or even potentially dangerous. One argument for doing it may be that multiple CInder/Glance/Nova pools serving disparate groups of clients would come at a PG cost to the cluster, though the separation of multiple, distinct pools also has its advantages. The UUIDs generated for RBD images in the pools by OpenStack services *should* be unique and collision-less between the 2 OpenStack clusters, in theory. One other point I was curious about was RBD image feature sets; Stein Ceph clients will be running later versions of Ceph libraries than Newton clients. If the 2 sets of clients were to share pools, would that itself cause problems (in the case that neither set needed to share RBD images within pools, only the pool itself) with some images in the pool having different feature lists? -- *** Paul Browne Research Computing Platforms University Information Services Roger Needham Building JJ Thompson Avenue University of Cambridge Cambridge United Kingdom E-Mail: pf...@cam.ac.uk Tel: 0044-1223-746548 *** ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Servicing multiple OpenStack clusters from the same Ceph cluster [EXT]
Hi,

On 29/01/2020 16:40, Paul Browne wrote:
> Recently we deployed a brand new Stein cluster however, and I'm curious
> whether the idea of pointing the new OpenStack cluster at the same RBD
> pools for Cinder/Glance/Nova as the Luminous cluster would be considered
> bad practice, or even potentially dangerous.

I think that would be pretty risky - here we have a Ceph cluster that provides backing for our OpenStacks, and each OpenStack has its own set of pools -metrics, -images, -volumes, -vms (and its own credential).

Regards,

Matthew
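For what it's worth, the per-cluster separation described above could look roughly like this. A sketch with made-up pool and client names, following the usual OpenStack/RBD cap profiles:

ceph osd pool create stein-volumes 256
ceph osd pool create stein-images 64
ceph osd pool create stein-vms 256
# one credential per OpenStack cluster, limited to that cluster's pools
ceph auth get-or-create client.stein-cinder mon 'profile rbd' \
  osd 'profile rbd pool=stein-volumes, profile rbd pool=stein-vms, profile rbd-read-only pool=stein-images'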
[ceph-users] Re: Servicing multiple OpenStack clusters from the same Ceph cluster
Hello,

We have recently deployed that and it's working fine. We have deployed different keys for the different OpenStack clusters, of course, and they are using the same cinder/nova/glance pools.

The only risk is if a client from one OpenStack cluster creates a volume and the id that gets generated ends up being the same as that of an existing volume from the other OpenStack cluster. But that's a possibility of something like 1 in 5 billion, so we took the risk.

Regards
[ceph-users] Re: Servicing multiple OpenStack clusters from the same Ceph cluster
You should have used separate pool name schemes for each OpenStack cluster.

From: tda...@hotmail.com
Sent: Wednesday, January 29, 2020 12:29 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Servicing multiple OpenStack clusters from the same Ceph cluster

Hello,

We have recently deployed that and it's working fine. We have deployed different keys for the different openstack clusters ofcourse and they are using the same cinder/nova/glance pools. The only risk is if a client from one openstack cluster creates a volume and the id that will be generated ends up being the same on an existing volume from the other openstack cluster. But that's like possibility of 1 in 5 billion or something. We took the risk.

Regards
[ceph-users] Re: Servicing multiple OpenStack clusters from the same Ceph cluster
Yes, but we are offering our RBD volumes in another cloud product, which lets its users migrate their volumes to OpenStack when they want.

Sent from my iPhone

On 29 Jan 2020, at 18:38, Matthew H wrote:

You should have used separate pool name schemes for each OpenStack cluster.

From: tda...@hotmail.com
Sent: Wednesday, January 29, 2020 12:29 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Servicing multiple OpenStack clusters from the same Ceph cluster

Hello,

We have recently deployed that and it's working fine. We have deployed different keys for the different openstack clusters ofcourse and they are using the same cinder/nova/glance pools. The only risk is if a client from one openstack cluster creates a volume and the id that will be generated ends up being the same on an existing volume from the other openstack cluster. But that's like possibility of 1 in 5 billion or something. We took the risk.

Regards
[ceph-users] Re: High CPU usage by ceph-mgr in 14.2.6
Hi Joe,

Can you grab a wallclock profiler dump from the mgr process and share it with us? This was useful for us to get to the root cause of the issue in 14.2.5. Quoting Mark's suggestion from "[ceph-users] High CPU usage by ceph-mgr in 14.2.5" below.

If you can get a wallclock profiler on the mgr process we might be able to figure out specifics of what's taking so much time (i.e. processing pg_summary or something else). Assuming you have gdb with the python bindings and the ceph debug packages installed, if you (or anyone) could try gdbpmp on the 100% mgr process that would be fantastic.

https://github.com/markhpc/gdbpmp

gdbpmp.py -p `pidof ceph-mgr` -n 1000 -o mgr.gdbpmp

If you want to view the results:

gdbpmp.py -i mgr.gdbpmp -t 1

Thanks,
Neha

On Wed, Jan 29, 2020 at 7:35 AM wrote:
>
> Modules that are normally enabled:
>
> ceph mgr module ls | jq -r '.enabled_modules'
> [
> "dashboard",
> "prometheus",
> "restful"
> ]
>
> We did test with all modules disabled, restarted the mgrs and saw no
> difference.
>
> Joe
[ceph-users] Re: Nautilus 14.2.6 ceph-volume bluestore _read_fsid unparsable uuid
Jan, I have something new on this topic. I had gone back to Debian 9 backports and Luminous (distro packages). I had all of my OSDs working and I was about to deploy an MDS. But I noticed that the same Luminous packages where in Debian 10 (not backports), so I upgraded my OS to Debian 10. The OSDs, MONs, and MGRs survived the trip, although a couple of the OSDs needed me to 'systemctl start ceph-volume@lvm' before they came online. Then I couldn't resist, so I did one further upgrade to Debian 10 Backports, which moved my Ceph to Nautilus. What could go wrong? I did refer to https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus even though it's not exactly equivalent. After the dist-upgrade the MONs and MGRs were all good, but 17 of my 24 OSDs are down and don't seem to want to come up: root@ceph00:~# ceph versions { "mon": { "ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)": 3 }, "mgr": { "ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)": 3 }, "osd": { "ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)": 7 }, "mds": {}, "overall": { "ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)": 7, "ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)": 6 } } root@ceph00:~# ceph osd tree ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF -1 261.93823 root default -7 87.31274 host ceph00 15 hdd 10.91409 osd.15 down 1.0 1.0 16 hdd 10.91409 osd.16 down 1.0 1.0 17 hdd 10.91409 osd.17 down 1.0 1.0 18 hdd 10.91409 osd.18 down 1.0 1.0 19 hdd 10.91409 osd.19 down 1.0 1.0 20 hdd 10.91409 osd.20 down 1.0 1.0 21 hdd 10.91409 osd.21 down 1.0 1.0 22 hdd 10.91409 osd.22 down 1.0 1.0 -5 87.31274 host ceph01 7 hdd 10.91409 osd.7 down 1.0 1.0 8 hdd 10.91409 osd.8 down 1.0 1.0 9 hdd 10.91409 osd.9 down 1.0 1.0 10 hdd 10.91409 osd.10 down 1.0 1.0 11 hdd 10.91409 osd.11 down 1.0 1.0 12 hdd 10.91409 osd.12 down 1.0 1.0 13 hdd 10.91409 osd.13 down 1.0 1.0 14 hdd 10.91409 osd.14 down 1.0 1.0 -3 87.31274 host ceph02 0 hdd 10.91409 osd.0 down 1.0 1.0 1 hdd 10.91409 osd.1 up 1.0 1.0 2 hdd 10.91409 osd.2 up 1.0 1.0 3 hdd 10.91409 osd.3 up 1.0 1.0 4 hdd 10.91409 osd.4 up 1.0 1.0 5 hdd 10.91409 osd.5 up 1.0 1.0 6 hdd 10.91409 osd.6 up 1.0 1.0 23 hdd 10.91409 osd.23 up 1.0 1.0 root@ceph00:~# ceph-volume inventory Device Path Size rotates available Model name /dev/md0 186.14 GB False False /dev/md1 37.27 GB False False /dev/nvme0n1 1.46 TB False False SAMSUNG MZPLL1T6HEHP-3 /dev/sda 223.57 GB False False Samsung SSD 883 /dev/sdb 223.57 GB False False Samsung SSD 883 /dev/sdc 10.91 TB True False ST12000NM0027 /dev/sdd 10.91 TB True False ST12000NM0027 /dev/sde 10.91 TB True False ST12000NM0027 /dev/sdf 10.91 TB True False ST12000NM0027 /dev/sdg 10.91 TB True False ST12000NM0027 /dev/sdh 10.91 TB True False ST12000NM0027 /dev/sdi 10.91 TB True False ST12000NM0027 /dev/sdj 10.91 TB True False ST12000NM0027 I'm going to try a couple things on one of the two nodes, but I will save the other until I hear from you on any further information I could collect. Note that all 3 nodes are identical hardware and software. Since I don't have any data on these OSDs yet I don't have any problem with destroying and rebuilding them. What would be really interesting would be a sequence of low-level commands that could be issued to manually create these OSDs. There's some evidence of this in /var/log/ceph/ceph-volume.log, but there's some detail missing and it's really hard to follow. 
If you can provide this list I'd gladly give it a try and let you know how it goes. Thanks. -Dave Dave Hall Binghamton University On 1/29/2
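(For anyone following along: `ceph-volume lvm create` is essentially `prepare` followed by `activate`, so the manual sequence is roughly the one below. A sketch; the VG/LV names and ids are placeholders.)

# write the bluestore metadata and register the OSD with the cluster
ceph-volume lvm prepare --bluestore --data <block-vg>/<block-lv> --block.db <db-vg>/<db-lv>
# note the OSD id and fsid of what was just created
ceph-volume lvm list
# mount the tmpfs, link the devices and enable the systemd unit
ceph-volume lvm activate <osd-id> <osd-fsid>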
[ceph-users] health_warn: slow_ops 4 slow ops
Hi Ceph Community (I'm new here :), I'm learning Ceph in a Virtual Environment Vagrant/Virtualbox (I understand this is far from a real environment in several ways, mainly performance, but I'm ok with that at this point :) I've 3 nodes, and after few *vagrant halt/up*, when I do *ceph -s*, I got the following message: [vagrant@ceph-node1 ~]$ sudo ceph -s cluster: id: 7f8cb5f0-1989-4ab1-8fb9-d5c08aa96658 health: *HEALTH_WARN* Reduced data availability: 512 pgs inactive 4 slow ops, oldest one blocked for 1576 sec, daemons [osd.6,osd.7,osd.8] have slow ops. services: mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3 (age 7m) mgr: ceph-node1(active, since 26m), standbys: ceph-node2, ceph-node3 osd: 9 osds: 9 up (since 25m), 9 in (since 2d) data: pools: 1 pools, 512 pgs objects: 0 objects, 0 B usage: 9.1 GiB used, 162 GiB / 171 GiB avail pgs: 100.000% pgs unknown 512 unknown Here the output of *ceph health detail*: [vagrant@ceph-node1 ~]$ sudo ceph health detail HEALTH_WARN Reduced data availability: 512 pgs inactive; 4 slow ops, oldest one blocked for 1810 sec, daemons [osd.6,osd.7,osd.8] have slow ops. PG_AVAILABILITY Reduced data availability: 512 pgs inactive pg 2.1cd is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1ce is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1cf is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d0 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d1 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d2 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d3 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d4 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d5 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d6 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d7 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d8 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d9 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1da is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1db is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1dc is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1dd is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1de is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1df is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e0 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e1 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e2 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e3 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e4 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e5 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e6 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e7 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e8 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e9 is stuck inactive for 1815.881027, current state unknown, last 
acting [] pg 2.1ea is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1eb is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1ec is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1ed is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1ee is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1ef is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f0 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f1 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f2 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f3 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f4 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f5 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f6 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f7 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f8 is stuc
[ceph-users] Re: ceph fs dir-layouts and sub-directory mounts
On 1/29/20 6:03 PM, Frank Schilder wrote: I would like to (in this order) - set the data pool for the root "/" of a ceph-fs to a custom value, say "P" (not the initial data pool used in fs new) - create a sub-directory of "/", for example "/a" - mount the sub-directory "/a" with a client key with access restricted to "/a" The client will not be able to see the dir layout attribute set at "/", its not mounted. Will the data of this client still go to the pool "P", that is, does "/a" inherit the dir layout transparently to the client when following the steps above? AFAIU: "/" (the cephfs root) is "pool_A" "/folder" is "pool_B" If you want for your client to put data to "/folder" only, you should set "caps: [mds] allow rw path=/folder" and yes, client data will be putted to "pool_B". k ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
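One way to verify the inheritance from the restricted client (a sketch; a directory without an explicit layout reports no ceph.dir.layout, but files created under it show the inherited pool):

# on the client that has only /folder mounted at /mnt/folder
touch /mnt/folder/testfile
getfattr -n ceph.file.layout /mnt/folder/testfile
# expected output ends in something like "... pool=pool_B"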
[ceph-users] Re: Nautilus 14.2.6 ceph-volume bluestore _read_fsid unparsable uuid
Jan, In trying to recover my OSDs after the upgrade from Nautilus described earlier, I eventually managed to make things worse to the point where I'm going to scrub and fully reinstall. So I zapped all of the devices on one of my nodes and reproduced the ceph-volume lvm create error I mentioned earlier, using the procedure from https://docs.ceph.com/docs/mimic/rados/configuration/bluestore-config-ref/ to lay out the LVs and issue ceph-volume lvm create. As I was concerned that maybe it was a size thing, I only create a 4TB block LV for my first attempt, and the full 12TB drive for my second attempt. The output is: root@ceph01:~# ceph-volume lvm create --bluestore --data ceph-block-0/block-0 --block.db ceph-db-0/db-0 Running command: /usr/bin/ceph-authtool --gen-print-key Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 6441f236-8694-46b9-9c6a-bf82af89765d Running command: /usr/bin/ceph-authtool --gen-print-key Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-24 --> Absolute path not found for executable: selinuxenabled --> Ensure $PATH environment variable contains common executable locations Running command: /bin/chown -h ceph:ceph /dev/ceph-block-0/block-0 Running command: /bin/chown -R ceph:ceph /dev/dm-0 Running command: /bin/ln -s /dev/ceph-block-0/block-0 /var/lib/ceph/osd/ceph-24/block Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-24/activate.monmap stderr: got monmap epoch 4 Running command: /usr/bin/ceph-authtool /var/lib/ceph/osd/ceph-24/keyring --create-keyring --name osd.24 --add-key AQAuMjJe5OGHBRAAP94+1E7CzV5Rv9HFj9WVqA== stdout: creating /var/lib/ceph/osd/ceph-24/keyring added entity osd.24 auth(key=AQAuMjJe5OGHBRAAP94+1E7CzV5Rv9HFj9WVqA==) Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-24/keyring Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-24/ Running command: /bin/chown -h ceph:ceph /dev/ceph-db-0/db-0 Running command: /bin/chown -R ceph:ceph /dev/dm-1 Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 24 --monmap /var/lib/ceph/osd/ceph-24/activate.monmap --keyfile - --bluestore-block-db-path /dev/ceph-db-0/db-0 --osd-data /var/lib/ceph/osd/ceph-24/ --osd-uuid 6441f236-8694-46b9-9c6a-bf82af89765d --setuser ceph --setgroup ceph stderr: 2020-01-29 20:32:33.054 7ff4c24abc80 -1 bluestore(/var/lib/ceph/osd/ceph-24/) _read_fsid unparsable uuid stderr: terminate called after throwing an instance of 'boost::exception_detail::clone_impl >' stderr: what(): boost::bad_get: failed value get using boost::get stderr: *** Caught signal (Aborted) ** stderr: in thread 7ff4c24abc80 thread_name:ceph-osd stderr: ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable) stderr: 1: (()+0x12730) [0x7ff4c2f54730] stderr: 2: (gsignal()+0x10b) [0x7ff4c2a377bb] stderr: 3: (abort()+0x121) [0x7ff4c2a22535] stderr: 4: (()+0x8c983) [0x7ff4c2dea983] stderr: 5: (()+0x928c6) [0x7ff4c2df08c6] stderr: 6: (()+0x92901) [0x7ff4c2df0901] stderr: 7: (()+0x92b34) [0x7ff4c2df0b34] stderr: 8: (()+0x5a3f53) [0x564eed1c4f53] stderr: 9: (Option::size_t const md_config_t::get_val(ConfigValues const&, std::__cxx11::basic_string, std::allocator > const&) const+0x81) [0x564eed1cac91] stderr: 10: (BlueStore::_set_cache_sizes()+0x15a) [0x564eed645d8a] stderr: 11: (BlueStore::_open_bdev(bool)+0x173) 
[0x564eed648b23] stderr: 12: (BlueStore::mkfs()+0x42b) [0x564eed6adeab] stderr: 13: (OSD::mkfs(CephContext*, ObjectStore*, uuid_d, int)+0xd5) [0x564eed1e4bf5] stderr: 14: (main()+0x1796) [0x564eed191366] stderr: 15: (__libc_start_main()+0xeb) [0x7ff4c2a2409b] stderr: 16: (_start()+0x2a) [0x564eed1c4c6a] stderr: 2020-01-29 20:32:33.062 7ff4c24abc80 -1 *** Caught signal (Aborted) ** stderr: in thread 7ff4c24abc80 thread_name:ceph-osd stderr: ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable) stderr: 1: (()+0x12730) [0x7ff4c2f54730] stderr: 2: (gsignal()+0x10b) [0x7ff4c2a377bb] stderr: 3: (abort()+0x121) [0x7ff4c2a22535] stderr: 4: (()+0x8c983) [0x7ff4c2dea983] stderr: 5: (()+0x928c6) [0x7ff4c2df08c6] stderr: 6: (()+0x92901) [0x7ff4c2df0901] stderr: 7: (()+0x92b34) [0x7ff4c2df0b34] stderr: 8: (()+0x5a3f53) [0x564eed1c4f53] stderr: 9: (Option::size_t const md_config_t::get_val(ConfigValues const&, std::__cxx11::basic_string, std::allocator > const&) const+0x81) [0x564eed1cac91] stderr: 10: (BlueStore::_set_cache_sizes()+0x15a) [0x564eed645d8a] stderr: 11: (BlueStore::_open_bdev(bool)+0x173) [0x564eed648b23] stderr: 12: (BlueStore::mkfs()+0x42b) [0x564eed6adeab] stderr: 13: (OSD::mkfs(CephContext*, ObjectStore*, uuid_d, int)+0xd5) [0x564eed1e4bf5] stderr: 14: (main()+0x1796) [0x564eed191366] stderr: 15: (__libc_start_main()+0xeb) [0x7ff4c2a2409b] stderr: 16: (_s
[ceph-users] Re: Nautilus 14.2.6 ceph-volume bluestore _read_fsid unparsable uuid
Hi, Installing ceph from the debian unstable repository (ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable), debian package: 14.2.6-6) has fixed things form me. (See also the bug report and the duplicate of it and the changelog of 14.2.6-6 - bauen1 On 1/30/20 3:31 AM, Dave Hall wrote: Jan, In trying to recover my OSDs after the upgrade from Nautilus described earlier, I eventually managed to make things worse to the point where I'm going to scrub and fully reinstall. So I zapped all of the devices on one of my nodes and reproduced the ceph-volume lvm create error I mentioned earlier, using the procedure from https://docs.ceph.com/docs/mimic/rados/configuration/bluestore-config-ref/ to lay out the LVs and issue ceph-volume lvm create. As I was concerned that maybe it was a size thing, I only create a 4TB block LV for my first attempt, and the full 12TB drive for my second attempt. The output is: root@ceph01:~# ceph-volume lvm create --bluestore --data ceph-block-0/block-0 --block.db ceph-db-0/db-0 Running command: /usr/bin/ceph-authtool --gen-print-key Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 6441f236-8694-46b9-9c6a-bf82af89765d Running command: /usr/bin/ceph-authtool --gen-print-key Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-24 --> Absolute path not found for executable: selinuxenabled --> Ensure $PATH environment variable contains common executable locations Running command: /bin/chown -h ceph:ceph /dev/ceph-block-0/block-0 Running command: /bin/chown -R ceph:ceph /dev/dm-0 Running command: /bin/ln -s /dev/ceph-block-0/block-0 /var/lib/ceph/osd/ceph-24/block Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-24/activate.monmap stderr: got monmap epoch 4 Running command: /usr/bin/ceph-authtool /var/lib/ceph/osd/ceph-24/keyring --create-keyring --name osd.24 --add-key AQAuMjJe5OGHBRAAP94+1E7CzV5Rv9HFj9WVqA== stdout: creating /var/lib/ceph/osd/ceph-24/keyring added entity osd.24 auth(key=AQAuMjJe5OGHBRAAP94+1E7CzV5Rv9HFj9WVqA==) Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-24/keyring Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-24/ Running command: /bin/chown -h ceph:ceph /dev/ceph-db-0/db-0 Running command: /bin/chown -R ceph:ceph /dev/dm-1 Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 24 --monmap /var/lib/ceph/osd/ceph-24/activate.monmap --keyfile - --bluestore-block-db-path /dev/ceph-db-0/db-0 --osd-data /var/lib/ceph/osd/ceph-24/ --osd-uuid 6441f236-8694-46b9-9c6a-bf82af89765d --setuser ceph --setgroup ceph stderr: 2020-01-29 20:32:33.054 7ff4c24abc80 -1 bluestore(/var/lib/ceph/osd/ceph-24/) _read_fsid unparsable uuid stderr: terminate called after throwing an instance of 'boost::exception_detail::clone_impl ' stderr: what(): boost::bad_get: failed value get using boost::get stderr: *** Caught signal (Aborted) ** stderr: in thread 7ff4c24abc80 thread_name:ceph-osd stderr: ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable) stderr: 1: (()+0x12730) [0x7ff4c2f54730] stderr: 2: (gsignal()+0x10b) [0x7ff4c2a377bb] stderr: 3: (abort()+0x121) [0x7ff4c2a22535] stderr: 4: (()+0x8c983) [0x7ff4c2dea983] stderr: 5: (()+0x928c6) [0x7ff4c2df08c6] stderr: 6: (()+0x92901) [0x7ff4c2df0901] stderr: 7: (()+0x92b34) [0x7ff4c2df0b34] stderr: 8: 
(()+0x5a3f53) [0x564eed1c4f53] stderr: 9: (Option::size_t const md_config_t::get_val(ConfigValues const&, std::__cxx11::basic_string, std::allocator > const&) const+0x81) [0x564eed1cac91] stderr: 10: (BlueStore::_set_cache_sizes()+0x15a) [0x564eed645d8a] stderr: 11: (BlueStore::_open_bdev(bool)+0x173) [0x564eed648b23] stderr: 12: (BlueStore::mkfs()+0x42b) [0x564eed6adeab] stderr: 13: (OSD::mkfs(CephContext*, ObjectStore*, uuid_d, int)+0xd5) [0x564eed1e4bf5] stderr: 14: (main()+0x1796) [0x564eed191366] stderr: 15: (__libc_start_main()+0xeb) [0x7ff4c2a2409b] stderr: 16: (_start()+0x2a) [0x564eed1c4c6a] stderr: 2020-01-29 20:32:33.062 7ff4c24abc80 -1 *** Caught signal (Aborted) ** stderr: in thread 7ff4c24abc80 thread_name:ceph-osd stderr: ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable) stderr: 1: (()+0x12730) [0x7ff4c2f54730] stderr: 2: (gsignal()+0x10b) [0x7ff4c2a377bb] stderr: 3: (abort()+0x121) [0x7ff4c2a22535] stderr: 4: (()+0x8c983) [0x7ff4c2dea983] stderr: 5: (()+0x928c6) [0x7ff4c2df08c6] stderr: 6: (()+0x92901) [0x7ff4c2df0901] stderr: 7: (()+0x92b34) [0x7ff4c2df0b34] stderr: 8: (()+0x5a3f53) [0x564eed1c4f53] stderr: 9: (Option::size_t const md_config_t::get_val(ConfigValues const&, std::__cxx11::basic_string, std::allocator > const&) const+0x81) [0x564eed1cac91] stderr: 10: (BlueStore::_set_ca
[ceph-users] Re: health_warn: slow_ops 4 slow ops
Quoting Ignacio Ocampo (naf...@gmail.com):
> Hi Ceph Community (I'm new here :),

Welcome!

> Do you have any guidance on how to proceed with this? I'm trying to
> understand why the cluster is HEALTH_WARN and what I need to do in order to
> make it healthy again.

This might be because there is no CRUSH rule [1] that matches your layout. Can you provide the output of "ceph osd tree", "ceph osd crush rule dump" and "ceph osd pool get $pool-name all"?

It might also be that the PGs cannot peer with each other because of networking / firewall issues.

G. Stefan

[1]: https://docs.ceph.com/docs/master/rados/operations/crush-map/

--
| BIT BV  https://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6  +31 318 648 688 / i...@bit.nl