[ceph-users] Monitors' election failed on VMs : e4 handle_auth_request failed to assign global_id

2020-03-10 Thread Yoann Moulin
Hello, On a Nautilus cluster, I'd like to move the monitors from bare-metal servers to VMs to prepare a migration. I have added 3 new monitors on 3 VMs and I'd like to stop the 3 old monitor daemons. But as soon as I stop the 3rd old monitor, the cluster gets stuck because the election of a new monitor fails.

[ceph-users] Re: Monitors' election failed on VMs : e4 handle_auth_request failed to assign global_id

2020-03-10 Thread Paul Emmerich
On Tue, Mar 10, 2020 at 8:18 AM Yoann Moulin wrote: > I have added 3 new monitors on 3 VMs and I'd like to stop the 3 old monitor > daemons. But as soon as I stop the 3rd old monitor, the cluster gets stuck > because the election of a new monitor fails. By "stop" you mean "stop and then immediately re

[ceph-users] Re: Monitors' election failed on VMs : e4 handle_auth_request failed to assign global_id

2020-03-10 Thread Håkan T Johansson
Note that with 6 monitors, quorum requires 4. So if only 3 are running, the system cannot work. With one old monitor removed there would be 5 in total, with a quorum of 3. Best regards, Håkan On Tue, 10 Mar 2020, Paul Emmerich wrote: On Tue, Mar 10, 2020 at 8:18 AM Yoann Moulin wrote: I have
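For reference, a minimal sketch of the majority rule described above: a monitor quorum needs floor(N/2) + 1 voters, so 6 mons need 4 and 5 mons need 3. The loop below is just the standard formula, not cluster output:

    # quorum size for a few monitor counts
    for n in 3 4 5 6 7; do echo "$n mons -> quorum of $((n / 2 + 1))"; done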

[ceph-users] Re: Monitors' election failed on VMs : e4 handle_auth_request failed to assign global_id

2020-03-10 Thread Yoann Moulin
Hello, > Note that with 6 monitors, quorum requires 4. > > So if only 3 are running, the system cannot work. > > With one old removed there would be 5 possible, then with quorum of 3. Good point! I hadn't thought of that. Looks like it works if I remove one, thanks a lot! Best, Yoann > On Tue,
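A hedged sketch of retiring one of the old monitors so that 5 remain (hypothetical monitor name mon-old-1; adapt to the real mon IDs):

    systemctl stop ceph-mon@mon-old-1        # on the old monitor host
    ceph mon remove mon-old-1                # drop it from the monmap, leaving 5 mons (quorum of 3)
    ceph quorum_status --format json-pretty  # confirm the remaining monitors form a quorum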

[ceph-users] Re: Radosgw dynamic sharding jewel -> luminous

2020-03-10 Thread Robert LeBlanc
I don't know if it is that specifically, but they are all running the latest version of Luminous and I set the cluster to only allow Luminous OSDs in. All services have been upgraded to Luminous. Do I need to run a command to activate the cls_rgw API? Robert LeBlanc PGP Fingerp
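For context, "only allow Luminous OSDs in" usually refers to the cluster-wide release flag rather than a cls_rgw switch; a hedged way to check it (this is an assumption about what was set, not a confirmed fix for the sharding issue):

    ceph osd dump | grep require_osd_release   # should report "luminous" after the upgrade
    ceph osd require-osd-release luminous      # only needed if it still reports an older release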

[ceph-users] Re: ceph-mon store.db disk usage increase on OSD-Host fail

2020-03-10 Thread Hartwig Hauschild
Hi, I've done a bit more testing ... On 05.03.2020, Hartwig Hauschild wrote: > Hi, > > I'm (still) testing upgrading from Luminous to Nautilus and ran into the > following situation: > > The lab setup I'm testing in has three OSD hosts. > If one of those hosts dies the store.db in /var/lib/
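A hedged sketch for watching the monitor store during such an outage (hypothetical mon ID "a"); the growth is usually expected, since the monitors retain old maps while PGs are not clean:

    du -sh /var/lib/ceph/mon/*/store.db   # watch store.db grow while the OSD host is down
    ceph tell mon.a compact               # manual compaction once the cluster is healthy again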

[ceph-users] Re: ceph-mon store.db disk usage increase on OSD-Host fail

2020-03-10 Thread Wido den Hollander
On 3/10/20 10:48 AM, Hartwig Hauschild wrote: > Hi, > > I've done a bit more testing ... > > On 05.03.2020, Hartwig Hauschild wrote: >> Hi, >> >> I'm (still) testing upgrading from Luminous to Nautilus and ran into the >> following situation: >> >> The lab setup I'm testing in has three OSD

[ceph-users] reset pgs not deep-scrubbed in time

2020-03-10 Thread Stefan Priebe - Profihost AG
Hello, is there any way to reset the deep-scrub time for PGs? The cluster was accidentally in the nodeep-scrub state and is now unable to deep-scrub fast enough. Is there any way to force-mark all PGs as deep-scrubbed, to start from 0 again? Greets, Stefan
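A hedged sketch of the usual workaround: as far as I know there is no command that simply marks a PG as deep-scrubbed, so the options are to let scrubs catch up faster or to kick the oldest PGs manually (the value and PG ID below are hypothetical):

    ceph osd unset nodeep-scrub              # re-enable deep scrubbing
    ceph config set osd osd_max_scrubs 2     # allow more concurrent scrubs per OSD
    ceph pg deep-scrub 2.1f                  # manually deep-scrub the PGs with the oldest stamps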

[ceph-users] cephfs snap mkdir strange timestamp

2020-03-10 Thread Marc Roos
If I make a directory in Linux, the directory has the date of now; why is this not the case when creating a snap dir? Is this not a bug? One expects this to be the same as in Linux, not: [ @ test]$ mkdir temp [ @os0 test]$ ls -arltn total 28 drwxrwxrwt. 27 0 0 20480 Mar 10 11:38 .. drwxrwxr-x

[ceph-users] Re: ceph: Can't lookup inode 1 (err: -13)

2020-03-10 Thread Marc Roos
Does nobody know where this is coming from, or has anyone seen something similar? -Original Message- To: ceph-users Subject: [ceph-users] ceph: Can't lookup inode 1 (err: -13) For testing purposes I swapped the 3.10 kernel for a 5.5, and now I am getting these messages. I assume the 3.10 was just never

[ceph-users] Re: ceph: Can't lookup inode 1 (err: -13)

2020-03-10 Thread Paul Mezzanini
We see this constantly, and the last time I looked into it I came to the conclusion that it's because I'm mounting below the root in CephFS and the kernel is trying to get quota information for the mount point. That means it's trying to go up one layer, and it can't. Annoying and h
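A hedged illustration of the situation described above (hypothetical client, path and monitor names): when the client's caps are restricted to a subtree, the kernel cannot stat the layer above the mount point and logs the -13 (EACCES) lookup error, while the mount itself apparently keeps working.

    ceph fs authorize cephfs client.vmhost /volumes/group1 rw   # caps limited to the subtree
    mount -t ceph mon1:6789:/volumes/group1 /mnt/ceph \
        -o name=vmhost,secretfile=/etc/ceph/vmhost.secret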

[ceph-users] rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Ml Ml
Hello List, when I initially enable journal/mirror on an image it gets bootstrapped to my site-b pretty quickly at 250MB/sec, which is about the IO write limit. Once it's up to date, the replay is very slow, about 15KB/sec, and entries_behind_master just keeps running away: root@ceph01:~# rbd --clus
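A hedged way to watch the replay lag from the command line (hypothetical pool/image names; journal-based mirroring reports the lag in the status description):

    rbd mirror image status rbd/vm-disk-1          # state, description, entries_behind_master
    rbd journal info --pool rbd --image vm-disk-1  # journal details for the image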

[ceph-users] Re: cephfs snap mkdir strange timestamp

2020-03-10 Thread Paul Emmerich
There's an xattr for this: ceph.snap.btime IIRC Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Tue, Mar 10, 2020 at 11:42 AM Marc Roos wrote: > > > > If I make a d
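A hedged example of reading the snapshot birth time through that xattr (hypothetical mount point and snapshot names; the attribute name is the one suggested above):

    getfattr -n ceph.snap.btime /mnt/cephfs/somedir/.snap/mysnap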

[ceph-users] Nautilus cephfs usage

2020-03-10 Thread Yoann Moulin
Hello, I have a Nautilus cluster with a CephFS volume. In Grafana it shows that the cephfs_data pool is almost full[1], but if I look at the pool usage it looks like I have plenty of space. Which metrics are used by Grafana? 1. https://framapic.org/5r7J86s55x6k/jGSIsjEUPYMU.png pool usage:
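A hedged way to compare the two views from the command line; the gap is often just replication overhead, i.e. raw USED versus client-visible STORED:

    ceph df detail                        # per-pool STORED vs USED and %USED
    ceph osd pool get cephfs_data size    # replica count that multiplies STORED into USED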

[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Jason Dillaman
On Tue, Mar 10, 2020 at 6:47 AM Ml Ml wrote: > > Hello List, > > when i initially enable journal/mirror on an image it gets > bootstrapped to my site-b pretty quickly with 250MB/sec which is about > the IO Write limit. > > Once its up2date, the replay is very slow. About 15KB/sec and the > entries

[ceph-users] Re: ceph: Can't lookup inode 1 (err: -13)

2020-03-10 Thread Marc Roos
Ok, thanks for letting me know. -Original Message- To: ceph-users; Subject: Re: [ceph-users] Re: ceph: Can't lookup inode 1 (err: -13) We see this constantly and the last time I looked for what it was I came to the conclusion that it's because I'm mounting below the root in cephfs and

[ceph-users] Re: cephfs snap mkdir strange timestamp

2020-03-10 Thread Marc Roos
Hmmm, but typing ls -lart is faster than having to look up in my manual how to get such a thing with xattr. I honestly do not get the logic of applying the same date as the parent folder everywhere. Totally useless information stored. Might as well store nothing. -Original Message

[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Jason Dillaman
On Tue, Mar 10, 2020 at 10:36 AM Ml Ml wrote: > > Hello Jason, > > thanks for that fast reply. > > This is now my /etc/ceph/ceph.conf > > [client] > rbd_mirror_journal_max_fetch_bytes = 4194304 > > > I stopped and started my rbd-mirror manually with: > rbd-mirror -d -c /etc/ceph/ceph.conf > > Stil

[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Ml Ml
Hello Jason, thanks for the fast reply. This is now my /etc/ceph/ceph.conf [client] rbd_mirror_journal_max_fetch_bytes = 4194304 I stopped and started my rbd-mirror manually with: rbd-mirror -d -c /etc/ceph/ceph.conf Still the same result: slow speed shown by iftop, and entries_behind_master keep

[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Ml Ml
Hello Jason, okay, good hint! I did not realize that it writes the journal 1:1, but that makes sense. I will benchmark it later. However, my backup cluster is the place where the old spinning rust serves out its last days, so it will never be as fast as the live cluster. Looking

[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Jason Dillaman
On Tue, Mar 10, 2020 at 11:53 AM Ml Ml wrote: > > Hello Jason, > > okay, good hint! > > I did not realize, that it will write the journal 1:1 but that makes > sense. I will benchmark it later. Yes, it's replaying the exact IOs again to ensure it's point-in-time consistent. > However, my backup c

[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Anthony D'Atri
FWIW when using rbd-mirror to migrate volumes between SATA SSD clusters, I found that rbd_mirror_journal_max_fetch_bytes: section: "client" value: "33554432" rbd_journal_max_payload_bytes: section: "client" value: "8388608" Made a world of difference in expediting journal
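A hedged rendering of those two settings in plain ceph.conf syntax on the client side (values taken from this thread; the comments summarize what the options are generally understood to do):

    [client]
    rbd_mirror_journal_max_fetch_bytes = 33554432   # 32 MiB fetched from the journal per request
    rbd_journal_max_payload_bytes = 8388608         # 8 MiB max payload per journal entry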

[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Jason Dillaman
On Tue, Mar 10, 2020 at 2:31 PM Anthony D'Atri wrote: > > FWIW when using rbd-mirror to migrate volumes between SATA SSD clusters, I > found that > > >rbd_mirror_journal_max_fetch_bytes: > section: "client" > value: "33554432" > > rbd_journal_max_payload_bytes: > section: "clien

[ceph-users] Possible bug with rbd export/import?

2020-03-10 Thread Matt Dunavant
Hello, I think I've been running into an rbd export/import bug and wanted to see if anybody else has any experience with it. We're using rbd images for VM drives, both with and without custom stripe sizes. When we try to export/import the drive to another ceph cluster, the VM always comes up in a busted state it can't recover from.

[ceph-users] Re: Possible bug with rbd export/import?

2020-03-10 Thread Jack
Hi, Are you exporting the rbd image while a VM is running on it? As far as I know, rbd export is not consistent. You should not export an image, but only snapshots: - create a snapshot of the image - export the snapshot (rbd export pool/image@snap - | ..) - drop the snapshot Regards, On 3/10/20
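A hedged sketch of that snapshot-based copy between clusters (hypothetical pool, image, snapshot and host names; if the source image used custom striping, the stripe settings likely need to be specified again on import):

    rbd snap create rbd/vm-disk-1@migrate
    rbd export rbd/vm-disk-1@migrate - | ssh dest-host rbd import - rbd/vm-disk-1
    rbd snap rm rbd/vm-disk-1@migrate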

[ceph-users] Re: Possible bug with rbd export/import?

2020-03-10 Thread Simon Ironside
On 10/03/2020 19:31, Matt Dunavant wrote: We're using rbd images for VM drives both with and without custom stripe sizes. When we try to export/import the drive to another ceph cluster, the VM always comes up in a busted state it can't recover from. Don't shoot me for asking but is the VM be

[ceph-users] Rados example: create namespace, user for this namespace, read and write objects with created namespace and user

2020-03-10 Thread Rodrigo Severo - Fábrica
Hi, I'm trying to create a namespace in rados, create a user that has access to this namespace, and, with the rados command-line utility, read and write objects in this namespace using the created user. I can't find an example of how to do it. Can someone point me to such an example or sh

[ceph-users] FW: Warning: could not send message for past 4 hours

2020-03-10 Thread Marc Roos
-Original Message- From: Mail Delivery Subsystem [mailto:MAILER-DAEMON] Sent: 10 March 2020 19:01 Subject: Warning: could not send message for past 4 hours ** ** THIS IS A WARNING MESSAGE ONLY ** ** YOU DO NOT NEED TO R

[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Anthony D'Atri
>> >> FWIW when using rbd-mirror to migrate volumes between SATA SSD clusters, I >> found that >> >> >> rbd_mirror_journal_max_fetch_bytes: >>section: "client" >>value: "33554432" >> >> rbd_journal_max_payload_bytes: >>section: "client" >>value: “8388608" > > Indeed, that'

[ceph-users] Re: Rados example: create namespace, user for this namespace, read and write objects with created namespace and user

2020-03-10 Thread JC Lopez
Hi, no need to create a namespace; you just specify the namespace you want to access. See https://docs.ceph.com/docs/nautilus/man/8/rados/ for the -N cli option. For access to a particular namespace, have a look at the example here: https://docs.ceph.com/docs/nautilus/rados/operations/user-management
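A hedged end-to-end sketch along those lines (hypothetical pool, namespace and user names; the osd cap restricts the user to one namespace of one pool):

    ceph auth get-or-create client.ns-user mon 'allow r' \
        osd 'allow rw pool=mypool namespace=myns'
    rados -p mypool -N myns --id ns-user put obj1 ./localfile   # write an object into the namespace
    rados -p mypool -N myns --id ns-user ls                     # list objects visible in that namespace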

[ceph-users] Bucket notification with kafka error

2020-03-10 Thread 曹 海旺
HI, I'm sorry to bother you again. I want to use Kafka to queue the notifications. I added a topic named "kafka" and put the notification config XML. The topic info (XML, tags stripped by the archive): xmlns https://sns.amazonaws.com/doc/2010-03-31/ ... sr kafka
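A hedged way to double-check how RGW stored the topic, assuming the radosgw-admin topic subcommands are available on this release (the topic name is the one mentioned above):

    radosgw-admin topic list                 # bucket-notification topics known to RGW
    radosgw-admin topic get --topic=kafka    # attributes, e.g. the kafka:// push-endpoint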