[ceph-users] Re: Nautilus 14.2.6 ceph-volume bluestore _read_fsid unparsable uuid
On Tue, Jan 28, 2020 at 08:03:35PM +0100, bauen1 wrote:
>Hi,
>
>I've run into the same issue while testing:
>
>ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)
>
>debian bullseye
>
>Ceph was installed using ceph-ansible on a vm from the repo http://download.ceph.com/debian-nautilus
>
>The output of `sudo sh -c 'CEPH_VOLUME_DEBUG=true ceph-volume --cluster test lvm batch --bluestore /dev/vdb'` has been attached.

Thx, I opened https://tracker.ceph.com/issues/43868. This looks like a bluestore/OSD issue to me, though it might end up being ceph-volume's fault.

>Also worth noting might be that '/var/lib/ceph/osd/test-0/fsid' is empty (but I don't know too much about the internals)
>
>- bauen1
>
>On 1/28/20 4:54 PM, Dave Hall wrote:
>>Jan,
>>
>>Unfortunately I'm under immense pressure right now to get some form of Ceph into production, so it's going to be Luminous for now, or maybe a live upgrade to Nautilus without recreating the OSDs (if that's possible).
>>
>>The good news is that in the next couple of months I expect to add more hardware that should be nearly identical. I will gladly give it a go at that time and see if I can recreate the problem. (Or, if I manage to thoroughly crash my current fledgling cluster, I'll give it another go on one node while I'm up all night recovering.)
>>
>>If you could tell me where to look I'd gladly read some code and see if I can find anything that way. Or if there's any sort of design document describing the deep internals I'd be glad to scan it to see if I've hit a corner case of some sort. Actually, I'd be interested in reading those documents anyway if I could.
>>
>>Thanks.
>>
>>-Dave
>>
>>Dave Hall
>>
>>On 1/28/2020 3:05 AM, Jan Fajerski wrote:
>>>On Mon, Jan 27, 2020 at 03:23:55PM -0500, Dave Hall wrote:
All, I've just spent a significant amount of time unsuccessfully chasing the _read_fsid unparsable uuid error on Debian 10 / Nautilus 14.2.6. Since this is a brand new cluster, last night I gave up and moved back to Debian 9 / Luminous 12.2.11. In both cases I'm using the packages from Debian Backports with ceph-ansible as my deployment tool.

Note that above I said 'the _read_fsid unparsable uuid' error. I've searched around a bit and found some previously reported issues, but I did not see any conclusive resolutions. I would like to get to Nautilus as quickly as possible, so I'd gladly provide additional information to help track down the cause of this symptom.

I can confirm that, looking at the ceph-volume.log on the OSD host, I see no difference between the ceph-volume lvm batch command generated by the ceph-ansible versions associated with these two Ceph releases:

ceph-volume --cluster ceph lvm batch --bluestore --yes --block-db-size 133358734540 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/nvme0n1

Note that I'm using --block-db-size to divide my NVMe into 12 segments as I have 4 empty drive bays on my OSD servers that I may eventually be able to fill.
My OSD hardware is:

Disk /dev/nvme0n1: 1.5 TiB, 1600321314816 bytes, 3125627568 sectors
Disk /dev/sdc: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdd: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sde: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdf: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdg: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdh: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdi: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdj: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors

I'd send the output of ceph-volume inventory on Luminous, but I'm getting -->: KeyError: 'human_readable_size'. Please let me know if I can provide any further information.

>>>Mind re-running your ceph-volume command with debug output enabled:
>>>CEPH_VOLUME_DEBUG=true ceph-volume --cluster ceph lvm batch --bluestore ...
>>>
>>>Ideally you could also open a bug report here:
>>>https://tracker.ceph.com/projects/ceph-volume/issues/new
>>>
>>>Thanks!

Thanks.

-Dave

--
Dave Hall
Binghamton University

>sysadmin@ceph-test:~$ sudo setenforce 0
>sysadmin@ceph-test:~$ sudo sh -c 'CEPH_VOLUME_DEBUG=true ceph-volume --cluster test lvm batch --bluestore /dev/vdb'
>
>Total OSDs: 1
>
> TypePath
[ceph-users] Re: Ceph MDS specific perf info disappeared in Nautilus
Hi,

Quoting Dan van der Ster (d...@vanderster.com):
> Maybe you're checking a standby MDS ?

Looks like it. The active MDS does have performance metrics.

Thanks,
Stefan

--
| BIT BV  https://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6  +31 318 648 688 / i...@bit.nl
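A quick way to tell which MDS is active before pulling counters (a minimal sketch; the daemon name is a placeholder):

# show active vs. standby MDS daemons
ceph fs status
# dump perf counters from the active daemon (run on its host)
ceph daemon mds.<active-name> perf dump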
[ceph-users] Write i/o in CephFS metadata pool
Hi!

I've been running CephFS for a while now and ever since setting it up, I've seen unexpectedly large write i/o on the CephFS metadata pool. The filesystem is otherwise stable and I'm seeing no usage issues.

I'm in a read-intensive environment from the clients' perspective, and throughput for the metadata pool is consistently larger than that of the data pool. For example:

# ceph osd pool stats
pool cephfs_data id 1
  client io 7.6 MiB/s rd, 19 KiB/s wr, 404 op/s rd, 1 op/s wr

pool cephfs_metadata id 2
  client io 338 KiB/s rd, 43 MiB/s wr, 84 op/s rd, 26 op/s wr

I realise, of course, that this is a momentary display of statistics, but I see this unbalanced r/w activity consistently when monitoring it live.

I would like some insight into what may be causing this large imbalance in r/w, especially since I'm in a read-intensive (web hosting) environment. Some of it may be expected when considering details of my environment and CephFS implementation specifics, so please ask away if more details are needed.

With my experience using NFS, I would start by looking at client io stats, like `nfsstat`, and tuning e.g. mount options, but I haven't been able to find such statistics for CephFS clients. Is there anything of the sort for CephFS? Are similar stats obtainable in some other way?

This might be a somewhat broad question and shallow description, so yeah, let me know if there's anything you would like more details on.

Thanks a lot,
Samy
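For reference, a few places where CephFS I/O counters can be inspected. This is a hedged sketch rather than a full `nfsstat` equivalent; names in angle brackets are placeholders:

# MDS-side counters; the objecter section shows I/O the MDS itself issues to the metadata pool
ceph daemon mds.<name> perf dump objecter
# live, top-like view of MDS request and journal activity
ceph daemonperf mds.<name>
# kernel clients expose their in-flight MDS requests under debugfs
cat /sys/kernel/debug/ceph/<fsid>.client<id>/mdsc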
[ceph-users] Re: getting rid of incomplete pg errors
There should be docs on how to mark an OSD lost, which I would expect to be linked from the troubleshooting PGs page. There is also a command to force create PGs but I don’t think that will help in this case since you already have at least one copy. On Tue, Jan 28, 2020 at 5:15 PM Hartwig Hauschild wrote: > Hi. > > before I descend into what happened and why it happened: I'm talking about > a > test-cluster so I don't really care about the data in this case. > > We've recently started upgrading from luminous to nautilus, and for us that > means we're retiring ceph-disk in favour of ceph-volume with lvm and > dmcrypt. > > Our setup is in containers and we've got DBs separated from Data. > When testing our upgrade-path we discovered that running the host on > ubuntu-xenial and the containers on centos-7.7 leads to lvm inside the > containers not using lvmetad because it's too old. That in turn means that > not running `vgscan --cache` on the host before adding a LV to a VG > essentially zeros the metadata for all LVs in that VG. > > That happened on two out of three hosts for a bunch of OSDs and those OSDs > are gone. I have no way of getting them back, they've been overwritten > multiple times trying to figure out what went wrong. > > So now I have a cluster that's got 16 pgs in 'incomplete', 14 of them with > 0 > objects, 2 with about 150 objects each. > > I have found a couple of howtos that tell me to use ceph-objectstore-tool > to > find the pgs on the active osds and I've given that a try, but > ceph-objectstore-tool always tells me it can't find the pg I am looking > for. > > Can I tell ceph to re-init the pgs? Do I have to delete the pools and > recreate them? > > There's no data I can't get back in there, I just don't feel like > scrapping and redeploying the whole cluster. > > > -- > Cheers, > Hardy > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
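For reference, the commands Gregory mentions are roughly the following (a hedged sketch; <osd-id> and <pg-id> are placeholders, and both operations can discard data, so only use them on a cluster you can afford to lose):

# declare a dead OSD permanently lost so peering can proceed without it
ceph osd lost <osd-id> --yes-i-really-mean-it
# recreate an empty PG whose copies are all gone (its objects are lost)
ceph osd force-create-pg <pg-id> --yes-i-really-mean-it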
[ceph-users] Re: Concurrent append operations
The core RADOS api will order these on the osd as it receives the operations from clients, and nothing will break if you submit 2 in parallel. I’m less familiar with the S3 interface but I believe appends there will be ordered by the rgw daemon and so will be much slower. Or maybe it works the same as a normal s3 object overwrite and the second one will win and the first one goes into oblivion? Either way it probably won’t fit your needs. -Greg On Mon, Jan 20, 2020 at 3:42 PM David Bell wrote: > Hello, > > I am currently evaluating Ceph for our needs and I have a question > about the 'object append' feature. I note that the rados core API > supports an 'append' operation, and the S3-compatible interface has > too. > > My question is: does Ceph support concurrent append? I would like to > use Ceph as a temporary store, a "buffer" if you will, for incoming > data from a variety of sources. Each object would hold data for a > particular identifier. I'd like to know if two or more different > clients can 'append' to the same object, and the data doesn't > overwrite each other, and each 'append' is added to the end of the > object? > > Performance wise we'd likely be performing 15-20 thousand writes per > second, so we'd be building a pretty big cluster on very fast flash > disk. Data would only reside on the system for about an hour at most > before being read and deleted. > > Cheers, > > David Bell > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: High CPU usage by ceph-mgr in 14.2.6
On 2020-01-29 01:19, jbardg...@godaddy.com wrote:
> We feel this is related to the size of the cluster, similarly to the
> previous report.
>
> Anyone else experiencing this and/or can provide some direction on
> how to go about resolving this?

What Manager modules are enabled on that node? Have you tried disabling some of them, e.g. the Dashboard or Balancer module?

Lenz

--
SUSE Software Solutions Germany GmbH - Maxfeldstr. 5 - 90409 Nuernberg
GF: Felix Imendörffer, HRB 36809 (AG Nürnberg)
[ceph-users] Re: Luminous Bluestore OSDs crashing with ASSERT
Hi Stefan, the proper Ceph way of sending log for developer analysis is ceph-post-file but I'm not good in retrieving them from there... Ideally I'd prefer to start with log snippets covering 20K lines prior to crash. 3 or 4 of them. This wouldn't take so much space and you can send them by email, attach to a ticket or share via some publicly available URL. Accessing the whole logs also works for me. If this doesn't work then let's go ceph-post-file way, I have to pass this trail one day... Thanks, Igor On 1/28/2020 3:09 PM, Stefan Priebe - Profihost AG wrote: Hello Igor, i updated all servers to latest 4.19.97 kernel but this doesn't fix the situation. I can provide you with all those logs - any idea where to upload / how to sent them to you? Greets, Stefan Am 20.01.20 um 13:12 schrieb Igor Fedotov: Hi Stefan, these lines are result of transaction dump performed on a failure during transaction submission (which is shown as "submit_transaction error: Corruption: block checksum mismatch code = 2" Most probably they are out of interest (checksum errors are unlikely to be caused by transaction content) and hence we need earlier stuff to learn what caused that checksum mismatch. It's hard to give any formal overview of what you should look for, from my troubleshooting experience generally one may try to find: - some previous error/warning indications (e.g. allocation, disk access, etc) - prior OSD crashes (sometimes they might have different causes/stack traces/assertion messages) - any timeout or retry indications - any uncommon log patterns which aren't present during regular running but happen each time before the crash/failure. Anyway I think the inspection depth should be much(?) deeper than presumably it is (from what I can see from your log snippets). Ceph keeps last 1 log events with an increased log level and dumps them on crash with negative index starting at - up to -1 as a prefix. -1> 2020-01-16 01:10:13.404090 7f3350a14700 -1 rocksdb: It would be great If you share several log snippets for different crashes containing these last 1 lines. Thanks, Igor On 1/19/2020 9:42 PM, Stefan Priebe - Profihost AG wrote: Hello Igor, there's absolutely nothing in the logs before. What do those lines mean: Put( Prefix = O key = 0x7f8001cc45c881217262'd_data.4303206b8b4567.9632!='0xfffe6f0012'x' Value size = 480) Put( Prefix = O key = 0x7f8001cc45c881217262'd_data.4303206b8b4567.9632!='0xfffe'o' Value size = 510) on the right size i always see 0xfffe on all failed OSDs. greets, Stefan Am 19.01.20 um 14:07 schrieb Stefan Priebe - Profihost AG: Yes, except that this happens on 8 different clusters with different hw but same ceph version and same kernel version. Greets, Stefan Am 19.01.2020 um 11:53 schrieb Igor Fedotov : So the intermediate summary is: Any OSD in the cluster can experience interim RocksDB checksum failure. Which isn't present after OSD restart. No HW issues observed, no persistent artifacts (except OSD log) afterwards. And looks like the issue is rather specific to the cluster as no similar reports from other users seem to be present. Sorry, I'm out of ideas other then collect all the failure logs and try to find something common in them. May be this will shed some light.. BTW from my experience it might make sense to inspect OSD log prior to failure (any error messages and/or prior restarts, etc) sometimes this might provide some hints. Thanks, Igor On 1/17/2020 2:30 PM, Stefan Priebe - Profihost AG wrote: HI Igor, Am 17.01.20 um 12:10 schrieb Igor Fedotov: hmmm.. 
Just in case - suggest to check H/W errors with dmesg. this happens on around 80 nodes - i don't expect all of those have not identified hw errors. Also all of them are monitored - no dmesg outpout contains any errors. Also there are some (not very much though) chances this is another incarnation of the following bug: https://tracker.ceph.com/issues/22464 https://github.com/ceph/ceph/pull/24649 The corresponding PR works around it for main device reads (user data only!) but theoretically it might still happen either for DB device or DB data at main device. Can you observe any bluefs spillovers? Are there any correlation between failing OSDs and spillover presence if any, e.g. failing OSDs always have a spillover. While OSDs without spillovers never face the issue... To validate this hypothesis one can try to monitor/check (e.g. once a day for a week or something) "bluestore_reads_with_retries" counter over OSDs to learn if the issue is happening in the system. Non-zero values mean it's there for user data/main device and hence is likely to happen for DB ones as well (which doesn't have any workaround yet). OK i checked bluestore_reads_with_retries on 360 osds but all of them say 0. Additionally you might want to
[ceph-users] ceph fs dir-layouts and sub-directory mounts
I would like to (in this order):

- set the data pool for the root "/" of a ceph-fs to a custom value, say "P" (not the initial data pool used in fs new)
- create a sub-directory of "/", for example "/a"
- mount the sub-directory "/a" with a client key whose access is restricted to "/a"

The client will not be able to see the dir layout attribute set at "/", since it's not mounted. Will the data of this client still go to the pool "P"? That is, does "/a" inherit the dir layout transparently to the client when following the steps above?

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
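For context, the setup described above can be expressed roughly as follows. This is a sketch using the pool "P", path "/a" and a hypothetical client.a from the question, with "/" mounted at /mnt/cephfs on an admin client:

# make the pool known to the filesystem, then set it as the layout of "/"
ceph fs add_data_pool <fsname> P
setfattr -n ceph.dir.layout.pool -v P /mnt/cephfs
mkdir /mnt/cephfs/a
# create a key whose MDS access is restricted to "/a"
ceph fs authorize <fsname> client.a /a rw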
[ceph-users] Re: getting rid of incomplete pg errors
Hi, I had looked at the output of `ceph health detail` which told me to search for 'incomplete' in the docs. Since that said to file a bug (and I was sure that filing a bug did not help) I continued to purge the Disks that we hat overwritten and ceph then did some magic and told me that the PGs were again available on three OSDs but were incomplete. I have now gone ahead and marked all three of the OSDs where one of my incomplete PGs is (according to `ceph pg ls incomplete`) as lost one by one, waiting for ceph status to settle in between and that lead to the PG now being incomplete on three different OSDs. Also, force-create-pg tells me "already created". Am 29.01.2020 schrieb Gregory Farnum: > There should be docs on how to mark an OSD lost, which I would expect to be > linked from the troubleshooting PGs page. > > There is also a command to force create PGs but I don’t think that will > help in this case since you already have at least one copy. > > On Tue, Jan 28, 2020 at 5:15 PM Hartwig Hauschild > wrote: > > > Hi. > > > > before I descend into what happened and why it happened: I'm talking about > > a > > test-cluster so I don't really care about the data in this case. > > > > We've recently started upgrading from luminous to nautilus, and for us that > > means we're retiring ceph-disk in favour of ceph-volume with lvm and > > dmcrypt. > > > > Our setup is in containers and we've got DBs separated from Data. > > When testing our upgrade-path we discovered that running the host on > > ubuntu-xenial and the containers on centos-7.7 leads to lvm inside the > > containers not using lvmetad because it's too old. That in turn means that > > not running `vgscan --cache` on the host before adding a LV to a VG > > essentially zeros the metadata for all LVs in that VG. > > > > That happened on two out of three hosts for a bunch of OSDs and those OSDs > > are gone. I have no way of getting them back, they've been overwritten > > multiple times trying to figure out what went wrong. > > > > So now I have a cluster that's got 16 pgs in 'incomplete', 14 of them with > > 0 > > objects, 2 with about 150 objects each. > > > > I have found a couple of howtos that tell me to use ceph-objectstore-tool > > to > > find the pgs on the active osds and I've given that a try, but > > ceph-objectstore-tool always tells me it can't find the pg I am looking > > for. > > > > Can I tell ceph to re-init the pgs? Do I have to delete the pools and > > recreate them? > > > > There's no data I can't get back in there, I just don't feel like > > scrapping and redeploying the whole cluster. > > > > > > -- > > Cheers, > > Hardy > > ___ > > ceph-users mailing list -- ceph-users@ceph.io > > To unsubscribe send an email to ceph-users-le...@ceph.io > > > > -- Cheers, Hardy ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Write i/o in CephFS metadata pool
On 29/01/2020 10:24, Samy Ascha wrote: > I've been running CephFS for a while now and ever since setting it up, I've > seen unexpectedly large write i/o on the CephFS metadata pool. > > The filesystem is otherwise stable and I'm seeing no usage issues. > > I'm in a read-intensive environment, from the clients' perspective and > throughput for the metadata pool is consistently larger than that of the data > pool. > > [...] > > This might be a somewhat broad question and shallow description, so yeah, let > me know if there's anything you would like more details on. No explanation, but chiming in, as I've seen something similar happen on my single node "cluster" at home, where I'm exposing a cephfs through Samba using vfs_ceph, mostly for time machine backups. Running ceph 14.2.6 on debian buster. I can easily perform debugging operations there, no SLA in place :) Jasper ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: cephfs : write error: Operation not permitted
Hello,

Sorry, this should be:

ceph osd pool application set cephfs_data cephfs data cephfs
ceph osd pool application set cephfs_metadata cephfs metadata cephfs

so that the JSON output looks like:

"cephfs_data" { "cephfs": { "data": "cephfs" } }
"cephfs_metadata" { "cephfs": { "metadata": "cephfs" } }

Thanks a lot, that has fixed my issue!

Best,

--
Yoann Moulin
EPFL IC-IT
[ceph-users] Network performance checks
After having upgraded my ceph cluster from Luminous to Nautilus 14.2.6, from time to time "ceph health detail" claims about some "Long heartbeat ping times on front/back interface seen". As far as I can understand (after having read https://docs.ceph.com/docs/nautilus/rados/operations/monitoring/), this means that the ping from one OSD to another one exceeded 1 s.

I have some questions on these network performance checks:

1) What is meant exactly with front and back interface?

2) I can see the involved OSDs only in the output of "ceph health detail" (when there is the problem) but I can't find this information in the log files. In the mon log file I can only see messages such as:

2020-01-28 11:14:07.641 7f618e644700 0 log_channel(cluster) log [WRN] : Health check failed: Long heartbeat ping times on back interface seen, longest is 1416.618 msec (OSD_SLOW_PING_TIME_BACK)

but the involved OSDs are not reported in this log. Do I just need to increase the verbosity of the mon log?

3) Is 1 s a reasonable value for this threshold? How could this value be changed? What is the relevant configuration variable?

4) https://docs.ceph.com/docs/nautilus/rados/operations/monitoring/ suggests to use the dump_osd_network command. I think there is an error in that page: it says that the command should be issued on ceph-mgr.x.asok, while I think that instead the ceph-osd-x.asok should be used.

I have another ceph cluster (running nautilus 14.2.6 as well) where there aren't OSD_SLOW_PING_* error messages in the mon logs, but:

ceph daemon /var/run/ceph/ceph-osd..asok dump_osd_network 1

reports a lot of entries (i.e. pings exceeded 1 s). How can this be explained?

Thanks, Massimo
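Regarding question 3), the warning threshold is configurable. A hedged sketch of the knobs and the per-OSD dump as I understand them in 14.2.x (values in milliseconds; double-check the option names against your version):

# recent heartbeat ping times above the given threshold, per OSD
ceph daemon osd.<id> dump_osd_network 1000
# raise the health-warning threshold, e.g. to 2 s
ceph config set global mon_warn_on_slow_ping_time 2000
# or keep it relative to osd_heartbeat_grace via the ratio setting
ceph config set global mon_warn_on_slow_ping_ratio 0.10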
[ceph-users] Re: Write i/o in CephFS metadata pool
Sammy; I had a thought; since you say the FS has high read activity, but you're seeing large write I/O... is it possible that this is related to atime (Linux last access time)? If I remember my Linux FS basics, atime is stored in the file entry for the file in the directory, and I believe directory information is stored in the metadata pool (dentries?). As a test; you might try mounting the CephFS with the noatime flag. Then see if the write I/O is reduced. I honestly don't know if CephFS supports atime, but I would expect it would. Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International Inc. dhils...@performair.com www.PerformAir.com -Original Message- From: Samy Ascha [mailto:s...@xel.nl] Sent: Wednesday, January 29, 2020 2:25 AM To: ceph-users@ceph.io Subject: [ceph-users] Write i/o in CephFS metadata pool Hi! I've been running CephFS for a while now and ever since setting it up, I've seen unexpectedly large write i/o on the CephFS metadata pool. The filesystem is otherwise stable and I'm seeing no usage issues. I'm in a read-intensive environment, from the clients' perspective and throughput for the metadata pool is consistently larger than that of the data pool. For example: # ceph osd pool stats pool cephfs_data id 1 client io 7.6 MiB/s rd, 19 KiB/s wr, 404 op/s rd, 1 op/s wr pool cephfs_metadata id 2 client io 338 KiB/s rd, 43 MiB/s wr, 84 op/s rd, 26 op/s wr I realise, of course, that this is a momentary display of statistics, but I see this unbalanced r/w activity consistently when monitoring it live. I would like some insight into what may be causing this large imbalance in r/w, especially since I'm in a read-intensive (web hosting) environment. Some of it may be expected in when considering details of my environment and CephFS implementation specifics, so please ask away if more details are needed. With my experience using NFS, I would start by looking at client io stats, like `nfsstat` and tuning e.g. mount options, but I haven't been able to find such statistics for CephFS clients. Is there anything of the sort for CephFS? Are similar stats obtainable in some other way? This might be a somewhat broad question and shallow description, so yeah, let me know if there's anything you would like more details on. Thanks a lot, Samy ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
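A minimal example of the suggested mount, assuming the kernel client (monitor address, client name and secret file are placeholders):

mount -t ceph mon1:6789:/ /mnt/cephfs -o name=webclient,secretfile=/etc/ceph/webclient.secret,noatime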
[ceph-users] Re: Write i/o in CephFS metadata pool
Hi Dominic, I should have mentioned that I've set noatime already. I have not found any obvious other mount options that would contribute to 'write on read' behaviour.. Thx Samy > On 29 Jan 2020, at 15:43, dhils...@performair.com wrote: > > Sammy; > > I had a thought; since you say the FS has high read activity, but you're > seeing large write I/O... is it possible that this is related to atime > (Linux last access time)? If I remember my Linux FS basics, atime is stored > in the file entry for the file in the directory, and I believe directory > information is stored in the metadata pool (dentries?). > > As a test; you might try mounting the CephFS with the noatime flag. Then see > if the write I/O is reduced. > > I honestly don't know if CephFS supports atime, but I would expect it would. > > Thank you, > > Dominic L. Hilsbos, MBA > Director - Information Technology > Perform Air International Inc. > dhils...@performair.com > www.PerformAir.com > > > > -Original Message- > From: Samy Ascha [mailto:s...@xel.nl] > Sent: Wednesday, January 29, 2020 2:25 AM > To: ceph-users@ceph.io > Subject: [ceph-users] Write i/o in CephFS metadata pool > > Hi! > > I've been running CephFS for a while now and ever since setting it up, I've > seen unexpectedly large write i/o on the CephFS metadata pool. > > The filesystem is otherwise stable and I'm seeing no usage issues. > > I'm in a read-intensive environment, from the clients' perspective and > throughput for the metadata pool is consistently larger than that of the data > pool. > > For example: > > # ceph osd pool stats > pool cephfs_data id 1 > client io 7.6 MiB/s rd, 19 KiB/s wr, 404 op/s rd, 1 op/s wr > > pool cephfs_metadata id 2 > client io 338 KiB/s rd, 43 MiB/s wr, 84 op/s rd, 26 op/s wr > > I realise, of course, that this is a momentary display of statistics, but I > see this unbalanced r/w activity consistently when monitoring it live. > > I would like some insight into what may be causing this large imbalance in > r/w, especially since I'm in a read-intensive (web hosting) environment. > > Some of it may be expected in when considering details of my environment and > CephFS implementation specifics, so please ask away if more details are > needed. > > With my experience using NFS, I would start by looking at client io stats, > like `nfsstat` and tuning e.g. mount options, but I haven't been able to find > such statistics for CephFS clients. > > Is there anything of the sort for CephFS? Are similar stats obtainable in > some other way? > > This might be a somewhat broad question and shallow description, so yeah, let > me know if there's anything you would like more details on. > > Thanks a lot, > Samy > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: High CPU usage by ceph-mgr in 14.2.6
Modules that are normally enabled:

ceph mgr module ls | jq -r '.enabled_modules'
[
  "dashboard",
  "prometheus",
  "restful"
]

We did test with all modules disabled, restarted the mgrs and saw no difference.

Joe
[ceph-users] Servicing multiple OpenStack clusters from the same Ceph cluster
Hello, We have a medium-sized Ceph Luminous cluster that, up til now, has been the RBD image backend solely for an OpenStack Newton cluster that's marked for upgrade to Stein later this year. Recently we deployed a brand new Stein cluster however, and I'm curious whether the idea of pointing the new OpenStack cluster at the same RBD pools for Cinder/Glance/Nova as the Luminous cluster would be considered bad practice, or even potentially dangerous. One argument for doing it may be that multiple CInder/Glance/Nova pools serving disparate groups of clients would come at a PG cost to the cluster, though the separation of multiple, distinct pools also has its advantages. The UUIDs generated for RBD images in the pools by OpenStack services *should* be unique and collision-less between the 2 OpenStack clusters, in theory. One other point I was curious about was RBD image feature sets; Stein Ceph clients will be running later versions of Ceph libraries than Newton clients. If the 2 sets of clients were to share pools, would that itself cause problems (in the case that neither set needed to share RBD images within pools, only the pool itself) with some images in the pool having different feature lists? -- *** Paul Browne Research Computing Platforms University Information Services Roger Needham Building JJ Thompson Avenue University of Cambridge Cambridge United Kingdom E-Mail: pf...@cam.ac.uk Tel: 0044-1223-746548 *** ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Servicing multiple OpenStack clusters from the same Ceph cluster [EXT]
Hi,

On 29/01/2020 16:40, Paul Browne wrote:
> Recently we deployed a brand new Stein cluster however, and I'm curious
> whether the idea of pointing the new OpenStack cluster at the same RBD
> pools for Cinder/Glance/Nova as the Luminous cluster would be considered
> bad practice, or even potentially dangerous.

I think that would be pretty risky - here we have a Ceph cluster that provides backing for our OpenStacks, and each OpenStack has its own set of pools -metrics, -images, -volumes, -vms (and its own credential).

Regards,

Matthew
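For what it's worth, the per-cluster separation described above could look roughly like this. A sketch with made-up pool and client names, following the usual OpenStack/RBD cap profiles:

ceph osd pool create stein-volumes 256
ceph osd pool create stein-images 64
ceph osd pool create stein-vms 256
# one credential per OpenStack cluster, limited to that cluster's pools
ceph auth get-or-create client.stein-cinder mon 'profile rbd' \
  osd 'profile rbd pool=stein-volumes, profile rbd pool=stein-vms, profile rbd-read-only pool=stein-images'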
[ceph-users] Re: Servicing multiple OpenStack clusters from the same Ceph cluster
Hello,

We have recently deployed that and it's working fine. We have deployed different keys for the different OpenStack clusters, of course, and they are using the same cinder/nova/glance pools.

The only risk is if a client from one OpenStack cluster creates a volume and the id that gets generated ends up being the same as that of an existing volume from the other OpenStack cluster. But that's a possibility of something like 1 in 5 billion, so we took the risk.

Regards
[ceph-users] Re: Servicing multiple OpenStack clusters from the same Ceph cluster
You should have used separate pool name schemes for each OpenStack cluster.

From: tda...@hotmail.com
Sent: Wednesday, January 29, 2020 12:29 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Servicing multiple OpenStack clusters from the same Ceph cluster

Hello,

We have recently deployed that and it's working fine. We have deployed different keys for the different openstack clusters ofcourse and they are using the same cinder/nova/glance pools. The only risk is if a client from one openstack cluster creates a volume and the id that will be generated ends up being the same on an existing volume from the other openstack cluster. But that's like possibility of 1 in 5 billion or something. We took the risk.

Regards
[ceph-users] Re: Servicing multiple OpenStack clusters from the same Ceph cluster
Yes, but we are offering our RBD volumes in another cloud product, which lets its users migrate their volumes to OpenStack when they want.

Sent from my iPhone

On 29 Jan 2020, at 18:38, Matthew H wrote:

You should have used separate pool name schemes for each OpenStack cluster.

From: tda...@hotmail.com
Sent: Wednesday, January 29, 2020 12:29 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Servicing multiple OpenStack clusters from the same Ceph cluster

Hello,

We have recently deployed that and it's working fine. We have deployed different keys for the different openstack clusters ofcourse and they are using the same cinder/nova/glance pools. The only risk is if a client from one openstack cluster creates a volume and the id that will be generated ends up being the same on an existing volume from the other openstack cluster. But that's like possibility of 1 in 5 billion or something. We took the risk.

Regards
[ceph-users] Re: High CPU usage by ceph-mgr in 14.2.6
Hi Joe,

Can you grab a wallclock profiler dump from the mgr process and share it with us? This was useful for us to get to the root cause of the issue in 14.2.5. Quoting Mark's suggestion from "[ceph-users] High CPU usage by ceph-mgr in 14.2.5" below.

If you can get a wallclock profiler on the mgr process we might be able to figure out specifics of what's taking so much time (i.e. processing pg_summary or something else). Assuming you have gdb with the python bindings and the ceph debug packages installed, if you (or anyone) could try gdbpmp on the 100% mgr process that would be fantastic.

https://github.com/markhpc/gdbpmp

gdbpmp.py -p `pidof ceph-mgr` -n 1000 -o mgr.gdbpmp

If you want to view the results:

gdbpmp.py -i mgr.gdbpmp -t 1

Thanks,
Neha

On Wed, Jan 29, 2020 at 7:35 AM wrote:
>
> Modules that are normally enabled:
>
> ceph mgr module ls | jq -r '.enabled_modules'
> [
> "dashboard",
> "prometheus",
> "restful"
> ]
>
> We did test with all modules disabled, restarted the mgrs and saw no
> difference.
>
> Joe
[ceph-users] Re: Nautilus 14.2.6 ceph-volume bluestore _read_fsid unparsable uuid
Jan, I have something new on this topic. I had gone back to Debian 9 backports and Luminous (distro packages). I had all of my OSDs working and I was about to deploy an MDS. But I noticed that the same Luminous packages where in Debian 10 (not backports), so I upgraded my OS to Debian 10. The OSDs, MONs, and MGRs survived the trip, although a couple of the OSDs needed me to 'systemctl start ceph-volume@lvm' before they came online. Then I couldn't resist, so I did one further upgrade to Debian 10 Backports, which moved my Ceph to Nautilus. What could go wrong? I did refer to https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus even though it's not exactly equivalent. After the dist-upgrade the MONs and MGRs were all good, but 17 of my 24 OSDs are down and don't seem to want to come up: root@ceph00:~# ceph versions { "mon": { "ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)": 3 }, "mgr": { "ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)": 3 }, "osd": { "ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)": 7 }, "mds": {}, "overall": { "ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)": 7, "ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)": 6 } } root@ceph00:~# ceph osd tree ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF -1 261.93823 root default -7 87.31274 host ceph00 15 hdd 10.91409 osd.15 down 1.0 1.0 16 hdd 10.91409 osd.16 down 1.0 1.0 17 hdd 10.91409 osd.17 down 1.0 1.0 18 hdd 10.91409 osd.18 down 1.0 1.0 19 hdd 10.91409 osd.19 down 1.0 1.0 20 hdd 10.91409 osd.20 down 1.0 1.0 21 hdd 10.91409 osd.21 down 1.0 1.0 22 hdd 10.91409 osd.22 down 1.0 1.0 -5 87.31274 host ceph01 7 hdd 10.91409 osd.7 down 1.0 1.0 8 hdd 10.91409 osd.8 down 1.0 1.0 9 hdd 10.91409 osd.9 down 1.0 1.0 10 hdd 10.91409 osd.10 down 1.0 1.0 11 hdd 10.91409 osd.11 down 1.0 1.0 12 hdd 10.91409 osd.12 down 1.0 1.0 13 hdd 10.91409 osd.13 down 1.0 1.0 14 hdd 10.91409 osd.14 down 1.0 1.0 -3 87.31274 host ceph02 0 hdd 10.91409 osd.0 down 1.0 1.0 1 hdd 10.91409 osd.1 up 1.0 1.0 2 hdd 10.91409 osd.2 up 1.0 1.0 3 hdd 10.91409 osd.3 up 1.0 1.0 4 hdd 10.91409 osd.4 up 1.0 1.0 5 hdd 10.91409 osd.5 up 1.0 1.0 6 hdd 10.91409 osd.6 up 1.0 1.0 23 hdd 10.91409 osd.23 up 1.0 1.0 root@ceph00:~# ceph-volume inventory Device Path Size rotates available Model name /dev/md0 186.14 GB False False /dev/md1 37.27 GB False False /dev/nvme0n1 1.46 TB False False SAMSUNG MZPLL1T6HEHP-3 /dev/sda 223.57 GB False False Samsung SSD 883 /dev/sdb 223.57 GB False False Samsung SSD 883 /dev/sdc 10.91 TB True False ST12000NM0027 /dev/sdd 10.91 TB True False ST12000NM0027 /dev/sde 10.91 TB True False ST12000NM0027 /dev/sdf 10.91 TB True False ST12000NM0027 /dev/sdg 10.91 TB True False ST12000NM0027 /dev/sdh 10.91 TB True False ST12000NM0027 /dev/sdi 10.91 TB True False ST12000NM0027 /dev/sdj 10.91 TB True False ST12000NM0027 I'm going to try a couple things on one of the two nodes, but I will save the other until I hear from you on any further information I could collect. Note that all 3 nodes are identical hardware and software. Since I don't have any data on these OSDs yet I don't have any problem with destroying and rebuilding them. What would be really interesting would be a sequence of low-level commands that could be issued to manually create these OSDs. There's some evidence of this in /var/log/ceph/ceph-volume.log, but there's some detail missing and it's really hard to follow. 
If you can provide this list I'd gladly give it a try and let you know how it goes. Thanks. -Dave Dave Hall Binghamton University On 1/29/2
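(For anyone following along: `ceph-volume lvm create` is essentially `prepare` followed by `activate`, so the manual sequence is roughly the one below. A sketch; the VG/LV names and ids are placeholders.)

# write the bluestore metadata and register the OSD with the cluster
ceph-volume lvm prepare --bluestore --data <block-vg>/<block-lv> --block.db <db-vg>/<db-lv>
# note the OSD id and fsid of what was just created
ceph-volume lvm list
# mount the tmpfs, link the devices and enable the systemd unit
ceph-volume lvm activate <osd-id> <osd-fsid>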
[ceph-users] health_warn: slow_ops 4 slow ops
Hi Ceph Community (I'm new here :), I'm learning Ceph in a Virtual Environment Vagrant/Virtualbox (I understand this is far from a real environment in several ways, mainly performance, but I'm ok with that at this point :) I've 3 nodes, and after few *vagrant halt/up*, when I do *ceph -s*, I got the following message: [vagrant@ceph-node1 ~]$ sudo ceph -s cluster: id: 7f8cb5f0-1989-4ab1-8fb9-d5c08aa96658 health: *HEALTH_WARN* Reduced data availability: 512 pgs inactive 4 slow ops, oldest one blocked for 1576 sec, daemons [osd.6,osd.7,osd.8] have slow ops. services: mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3 (age 7m) mgr: ceph-node1(active, since 26m), standbys: ceph-node2, ceph-node3 osd: 9 osds: 9 up (since 25m), 9 in (since 2d) data: pools: 1 pools, 512 pgs objects: 0 objects, 0 B usage: 9.1 GiB used, 162 GiB / 171 GiB avail pgs: 100.000% pgs unknown 512 unknown Here the output of *ceph health detail*: [vagrant@ceph-node1 ~]$ sudo ceph health detail HEALTH_WARN Reduced data availability: 512 pgs inactive; 4 slow ops, oldest one blocked for 1810 sec, daemons [osd.6,osd.7,osd.8] have slow ops. PG_AVAILABILITY Reduced data availability: 512 pgs inactive pg 2.1cd is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1ce is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1cf is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d0 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d1 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d2 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d3 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d4 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d5 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d6 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d7 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d8 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1d9 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1da is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1db is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1dc is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1dd is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1de is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1df is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e0 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e1 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e2 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e3 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e4 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e5 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e6 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e7 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e8 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1e9 is stuck inactive for 1815.881027, current state unknown, last 
acting [] pg 2.1ea is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1eb is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1ec is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1ed is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1ee is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1ef is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f0 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f1 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f2 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f3 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f4 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f5 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f6 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f7 is stuck inactive for 1815.881027, current state unknown, last acting [] pg 2.1f8 is stuc
[ceph-users] Re: ceph fs dir-layouts and sub-directory mounts
On 1/29/20 6:03 PM, Frank Schilder wrote: I would like to (in this order) - set the data pool for the root "/" of a ceph-fs to a custom value, say "P" (not the initial data pool used in fs new) - create a sub-directory of "/", for example "/a" - mount the sub-directory "/a" with a client key with access restricted to "/a" The client will not be able to see the dir layout attribute set at "/", its not mounted. Will the data of this client still go to the pool "P", that is, does "/a" inherit the dir layout transparently to the client when following the steps above? AFAIU: "/" (the cephfs root) is "pool_A" "/folder" is "pool_B" If you want for your client to put data to "/folder" only, you should set "caps: [mds] allow rw path=/folder" and yes, client data will be putted to "pool_B". k ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
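One way to verify the inheritance from the restricted client (a sketch; a directory without an explicit layout reports no ceph.dir.layout, but files created under it show the inherited pool):

# on the client that has only /folder mounted at /mnt/folder
touch /mnt/folder/testfile
getfattr -n ceph.file.layout /mnt/folder/testfile
# expected output ends in something like "... pool=pool_B"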
[ceph-users] Re: Nautilus 14.2.6 ceph-volume bluestore _read_fsid unparsable uuid
Jan, In trying to recover my OSDs after the upgrade from Nautilus described earlier, I eventually managed to make things worse to the point where I'm going to scrub and fully reinstall. So I zapped all of the devices on one of my nodes and reproduced the ceph-volume lvm create error I mentioned earlier, using the procedure from https://docs.ceph.com/docs/mimic/rados/configuration/bluestore-config-ref/ to lay out the LVs and issue ceph-volume lvm create. As I was concerned that maybe it was a size thing, I only create a 4TB block LV for my first attempt, and the full 12TB drive for my second attempt. The output is: root@ceph01:~# ceph-volume lvm create --bluestore --data ceph-block-0/block-0 --block.db ceph-db-0/db-0 Running command: /usr/bin/ceph-authtool --gen-print-key Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 6441f236-8694-46b9-9c6a-bf82af89765d Running command: /usr/bin/ceph-authtool --gen-print-key Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-24 --> Absolute path not found for executable: selinuxenabled --> Ensure $PATH environment variable contains common executable locations Running command: /bin/chown -h ceph:ceph /dev/ceph-block-0/block-0 Running command: /bin/chown -R ceph:ceph /dev/dm-0 Running command: /bin/ln -s /dev/ceph-block-0/block-0 /var/lib/ceph/osd/ceph-24/block Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-24/activate.monmap stderr: got monmap epoch 4 Running command: /usr/bin/ceph-authtool /var/lib/ceph/osd/ceph-24/keyring --create-keyring --name osd.24 --add-key AQAuMjJe5OGHBRAAP94+1E7CzV5Rv9HFj9WVqA== stdout: creating /var/lib/ceph/osd/ceph-24/keyring added entity osd.24 auth(key=AQAuMjJe5OGHBRAAP94+1E7CzV5Rv9HFj9WVqA==) Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-24/keyring Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-24/ Running command: /bin/chown -h ceph:ceph /dev/ceph-db-0/db-0 Running command: /bin/chown -R ceph:ceph /dev/dm-1 Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 24 --monmap /var/lib/ceph/osd/ceph-24/activate.monmap --keyfile - --bluestore-block-db-path /dev/ceph-db-0/db-0 --osd-data /var/lib/ceph/osd/ceph-24/ --osd-uuid 6441f236-8694-46b9-9c6a-bf82af89765d --setuser ceph --setgroup ceph stderr: 2020-01-29 20:32:33.054 7ff4c24abc80 -1 bluestore(/var/lib/ceph/osd/ceph-24/) _read_fsid unparsable uuid stderr: terminate called after throwing an instance of 'boost::exception_detail::clone_impl >' stderr: what(): boost::bad_get: failed value get using boost::get stderr: *** Caught signal (Aborted) ** stderr: in thread 7ff4c24abc80 thread_name:ceph-osd stderr: ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable) stderr: 1: (()+0x12730) [0x7ff4c2f54730] stderr: 2: (gsignal()+0x10b) [0x7ff4c2a377bb] stderr: 3: (abort()+0x121) [0x7ff4c2a22535] stderr: 4: (()+0x8c983) [0x7ff4c2dea983] stderr: 5: (()+0x928c6) [0x7ff4c2df08c6] stderr: 6: (()+0x92901) [0x7ff4c2df0901] stderr: 7: (()+0x92b34) [0x7ff4c2df0b34] stderr: 8: (()+0x5a3f53) [0x564eed1c4f53] stderr: 9: (Option::size_t const md_config_t::get_val(ConfigValues const&, std::__cxx11::basic_string, std::allocator > const&) const+0x81) [0x564eed1cac91] stderr: 10: (BlueStore::_set_cache_sizes()+0x15a) [0x564eed645d8a] stderr: 11: (BlueStore::_open_bdev(bool)+0x173) 
[0x564eed648b23] stderr: 12: (BlueStore::mkfs()+0x42b) [0x564eed6adeab] stderr: 13: (OSD::mkfs(CephContext*, ObjectStore*, uuid_d, int)+0xd5) [0x564eed1e4bf5] stderr: 14: (main()+0x1796) [0x564eed191366] stderr: 15: (__libc_start_main()+0xeb) [0x7ff4c2a2409b] stderr: 16: (_start()+0x2a) [0x564eed1c4c6a] stderr: 2020-01-29 20:32:33.062 7ff4c24abc80 -1 *** Caught signal (Aborted) ** stderr: in thread 7ff4c24abc80 thread_name:ceph-osd stderr: ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable) stderr: 1: (()+0x12730) [0x7ff4c2f54730] stderr: 2: (gsignal()+0x10b) [0x7ff4c2a377bb] stderr: 3: (abort()+0x121) [0x7ff4c2a22535] stderr: 4: (()+0x8c983) [0x7ff4c2dea983] stderr: 5: (()+0x928c6) [0x7ff4c2df08c6] stderr: 6: (()+0x92901) [0x7ff4c2df0901] stderr: 7: (()+0x92b34) [0x7ff4c2df0b34] stderr: 8: (()+0x5a3f53) [0x564eed1c4f53] stderr: 9: (Option::size_t const md_config_t::get_val(ConfigValues const&, std::__cxx11::basic_string, std::allocator > const&) const+0x81) [0x564eed1cac91] stderr: 10: (BlueStore::_set_cache_sizes()+0x15a) [0x564eed645d8a] stderr: 11: (BlueStore::_open_bdev(bool)+0x173) [0x564eed648b23] stderr: 12: (BlueStore::mkfs()+0x42b) [0x564eed6adeab] stderr: 13: (OSD::mkfs(CephContext*, ObjectStore*, uuid_d, int)+0xd5) [0x564eed1e4bf5] stderr: 14: (main()+0x1796) [0x564eed191366] stderr: 15: (__libc_start_main()+0xeb) [0x7ff4c2a2409b] stderr: 16: (_s
[ceph-users] Re: Nautilus 14.2.6 ceph-volume bluestore _read_fsid unparsable uuid
Hi, Installing ceph from the debian unstable repository (ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable), debian package: 14.2.6-6) has fixed things form me. (See also the bug report and the duplicate of it and the changelog of 14.2.6-6 - bauen1 On 1/30/20 3:31 AM, Dave Hall wrote: Jan, In trying to recover my OSDs after the upgrade from Nautilus described earlier, I eventually managed to make things worse to the point where I'm going to scrub and fully reinstall. So I zapped all of the devices on one of my nodes and reproduced the ceph-volume lvm create error I mentioned earlier, using the procedure from https://docs.ceph.com/docs/mimic/rados/configuration/bluestore-config-ref/ to lay out the LVs and issue ceph-volume lvm create. As I was concerned that maybe it was a size thing, I only create a 4TB block LV for my first attempt, and the full 12TB drive for my second attempt. The output is: root@ceph01:~# ceph-volume lvm create --bluestore --data ceph-block-0/block-0 --block.db ceph-db-0/db-0 Running command: /usr/bin/ceph-authtool --gen-print-key Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 6441f236-8694-46b9-9c6a-bf82af89765d Running command: /usr/bin/ceph-authtool --gen-print-key Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-24 --> Absolute path not found for executable: selinuxenabled --> Ensure $PATH environment variable contains common executable locations Running command: /bin/chown -h ceph:ceph /dev/ceph-block-0/block-0 Running command: /bin/chown -R ceph:ceph /dev/dm-0 Running command: /bin/ln -s /dev/ceph-block-0/block-0 /var/lib/ceph/osd/ceph-24/block Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-24/activate.monmap stderr: got monmap epoch 4 Running command: /usr/bin/ceph-authtool /var/lib/ceph/osd/ceph-24/keyring --create-keyring --name osd.24 --add-key AQAuMjJe5OGHBRAAP94+1E7CzV5Rv9HFj9WVqA== stdout: creating /var/lib/ceph/osd/ceph-24/keyring added entity osd.24 auth(key=AQAuMjJe5OGHBRAAP94+1E7CzV5Rv9HFj9WVqA==) Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-24/keyring Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-24/ Running command: /bin/chown -h ceph:ceph /dev/ceph-db-0/db-0 Running command: /bin/chown -R ceph:ceph /dev/dm-1 Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 24 --monmap /var/lib/ceph/osd/ceph-24/activate.monmap --keyfile - --bluestore-block-db-path /dev/ceph-db-0/db-0 --osd-data /var/lib/ceph/osd/ceph-24/ --osd-uuid 6441f236-8694-46b9-9c6a-bf82af89765d --setuser ceph --setgroup ceph stderr: 2020-01-29 20:32:33.054 7ff4c24abc80 -1 bluestore(/var/lib/ceph/osd/ceph-24/) _read_fsid unparsable uuid stderr: terminate called after throwing an instance of 'boost::exception_detail::clone_impl ' stderr: what(): boost::bad_get: failed value get using boost::get stderr: *** Caught signal (Aborted) ** stderr: in thread 7ff4c24abc80 thread_name:ceph-osd stderr: ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable) stderr: 1: (()+0x12730) [0x7ff4c2f54730] stderr: 2: (gsignal()+0x10b) [0x7ff4c2a377bb] stderr: 3: (abort()+0x121) [0x7ff4c2a22535] stderr: 4: (()+0x8c983) [0x7ff4c2dea983] stderr: 5: (()+0x928c6) [0x7ff4c2df08c6] stderr: 6: (()+0x92901) [0x7ff4c2df0901] stderr: 7: (()+0x92b34) [0x7ff4c2df0b34] stderr: 8: 
(()+0x5a3f53) [0x564eed1c4f53] stderr: 9: (Option::size_t const md_config_t::get_val(ConfigValues const&, std::__cxx11::basic_string, std::allocator > const&) const+0x81) [0x564eed1cac91] stderr: 10: (BlueStore::_set_cache_sizes()+0x15a) [0x564eed645d8a] stderr: 11: (BlueStore::_open_bdev(bool)+0x173) [0x564eed648b23] stderr: 12: (BlueStore::mkfs()+0x42b) [0x564eed6adeab] stderr: 13: (OSD::mkfs(CephContext*, ObjectStore*, uuid_d, int)+0xd5) [0x564eed1e4bf5] stderr: 14: (main()+0x1796) [0x564eed191366] stderr: 15: (__libc_start_main()+0xeb) [0x7ff4c2a2409b] stderr: 16: (_start()+0x2a) [0x564eed1c4c6a] stderr: 2020-01-29 20:32:33.062 7ff4c24abc80 -1 *** Caught signal (Aborted) ** stderr: in thread 7ff4c24abc80 thread_name:ceph-osd stderr: ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable) stderr: 1: (()+0x12730) [0x7ff4c2f54730] stderr: 2: (gsignal()+0x10b) [0x7ff4c2a377bb] stderr: 3: (abort()+0x121) [0x7ff4c2a22535] stderr: 4: (()+0x8c983) [0x7ff4c2dea983] stderr: 5: (()+0x928c6) [0x7ff4c2df08c6] stderr: 6: (()+0x92901) [0x7ff4c2df0901] stderr: 7: (()+0x92b34) [0x7ff4c2df0b34] stderr: 8: (()+0x5a3f53) [0x564eed1c4f53] stderr: 9: (Option::size_t const md_config_t::get_val(ConfigValues const&, std::__cxx11::basic_string, std::allocator > const&) const+0x81) [0x564eed1cac91] stderr: 10: (BlueStore::_set_ca
[ceph-users] Re: health_warn: slow_ops 4 slow ops
Quoting Ignacio Ocampo (naf...@gmail.com):
> Hi Ceph Community (I'm new here :),

Welcome!

> Do you have any guidance on how to proceed with this? I'm trying to
> understand why the cluster is HEALTH_WARN and what I need to do in order to
> make it healthy again.

This might be because there is no CRUSH rule [1] that matches your layout. Can you provide the output of "ceph osd tree", "ceph osd crush rule dump" and "ceph osd pool get $pool-name all"?

It might also be that the PGs cannot peer with each other because of networking / firewall issues.

G. Stefan

[1]: https://docs.ceph.com/docs/master/rados/operations/crush-map/

--
| BIT BV  https://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6  +31 318 648 688 / i...@bit.nl