[ceph-users] Ceph failover cluster

2021-04-12 Thread Várkonyi János
Hi All, does anybody use a Windows file server with Ceph storage? I finally got the gateways working. We have a Ceph storage cluster with 3 nodes and can attach it to Windows via ceph-iscsi. I'd like to use it with 2 Windows 2019 servers in a failover cluster. I can connect to the storage from each side. But when I chec

[ceph-users] Re: Ceph failover cluster

2021-04-12 Thread Maged Mokhtar
Hello Varkonyi, Windows clustering requires the use of SCSI-3 persistent reservations. To support this with Ceph you could use our distribution PetaSAN (www.petasan.org), which supports this and passes the Windows clustering tests. /Maged On 12/04/2021 10:28, Várkonyi János wrote:

[ceph-users] Re: RGW failed to start after upgrade to pacific

2021-04-12 Thread Robert Sander
On 06.04.21 at 18:53, Casey Bodley wrote: > thanks for the details. this is a regression from changes to the > datalog storage for multisite - this -5 error is coming from the new > 'fifo' backend. as a workaround, you can set the new > 'rgw_data_log_backing' config variable back to 'omap' > > Ad
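A minimal sketch of that workaround, assuming a centralized-config deployment; the client.rgw section name and the systemd target are assumptions to adapt to your own RGW instances:

  # switch the multisite datalog backend back to omap (option name from the thread)
  ceph config set client.rgw rgw_data_log_backing omap
  # restart the RGW daemons so they pick up the change (unit name varies by deployment)
  systemctl restart ceph-radosgw.target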

[ceph-users] Re: rbd info error opening image

2021-04-12 Thread Eugen Block
Hi, have you checked if the rbd_header object still exists for that volume? If it's indeed missing you could rebuild it as described in [1]; I haven't done that myself, though. It would help if you knew the block_name_prefix of that volume; if not, you could figure that out by matching all ex
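A rough sketch of how one might check for the header object, assuming an RBD format-2 image in the pool named in the thread; the image id is a placeholder:

  # the rbd_directory omap maps image names to ids (and back)
  rados -p openstack-volumes listomapvals rbd_directory
  # then check whether the header object for that id still exists
  rados -p openstack-volumes stat rbd_header.<image_id>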

[ceph-users] cephadm custom mgr modules

2021-04-12 Thread Rob Haverkamp
Hi there, I'm developing a custom ceph-mgr module and have issues deploying this on a cluster deployed with cephadm. With a cluster deployed with ceph-deploy, I can just put my code under /usr/share/ceph/mgr/ and load the module. This works fine. I think I found 2 options to do this with cephad

[ceph-users] Re: Nautilus, Ceph-Ansible, existing OSDs, and ceph.conf updates [EXT]

2021-04-12 Thread Matthew Vernon
On 10/04/2021 13:03, Dave Hall wrote: Hello, A while back I asked about the troubles I was having with Ceph-Ansible when I kept existing OSDs in my inventory file when managing my Nautilus cluster. At the time it was suggested that once the OSDs have been configured they should be excluded from

[ceph-users] rbd info error opening image

2021-04-12 Thread Marcel Kuiper
I hope someone can help out. I cannot run 'rbd info' on any image. # rbd ls openstack-volumes volume-628efc47-fc57-4630-8661-a13210a4e02c volume-e4fe1e24-fb26-4abc-a458-f936a4e75715 volume-1ce1439d-767b-4b1d-8217-51464a11c5cc volume-0a01d7e3-2c8f-4fab-9f9f-d84bbc7fa3c7 volume-a4aeb848-7283-4cd0-

[ceph-users] Re: cephadm custom mgr modules

2021-04-12 Thread Sebastian Wagner
You want to build a custom container for that use case indeed. On Mon, Apr 12, 2021 at 2:18 PM Rob Haverkamp wrote: > Hi there, > > I'm developing a custom ceph-mgr module and have issues deploying this on > a cluster deployed with cephadm. > With a cluster deployed with ceph-deploy, I can just
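A hypothetical sketch of such a custom container; the base image, tag and module name are placeholders, not part of the thread:

  # Dockerfile: layer the custom mgr module on top of an upstream Ceph image
  FROM docker.io/ceph/ceph:v15.2.10
  COPY my_module/ /usr/share/ceph/mgr/my_module/

cephadm would then be pointed at the resulting image, for example with: ceph orch upgrade start --image <registry>/<custom-image>:<tag>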

[ceph-users] Re: OSDs RocksDB corrupted when upgrading nautilus->octopus: unknown WriteBatch tag

2021-04-12 Thread Dan van der Ster
Too bad. Let me continue trying to invoke Cunningham's Law for you ... ;) Have you excluded any possible hardware issues? 15.2.10 has a new option to check for all zero reads; maybe try it with true? Option("bluefs_check_for_zeros", Option::TYPE_BOOL, Option::LEVEL_DEV) .set_default(fals
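For reference, turning that option on would look roughly like the following; it is a DEV-level option, so treat it as a debugging aid rather than a recommendation:

  ceph config set osd bluefs_check_for_zeros true
  # restart the affected OSDs so BlueFS re-reads the setting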

[ceph-users] Re: OSDs RocksDB corrupted when upgrading nautilus->octopus: unknown WriteBatch tag

2021-04-12 Thread Igor Fedotov
Sorry for being too late to the party... I think the root cause is related to the high amount of repairs made during the first post-upgrade fsck run. The check (and fix) for zombie spanning blobs was backported to v15.2.9 (here is the PR https://github.com/ceph/ceph/pull/39256). And I p

[ceph-users] Re: OSDs RocksDB corrupted when upgrading nautilus->octopus: unknown WriteBatch tag

2021-04-12 Thread DHilsbos
Is there a way to check for these zombie blobs, and other issues needing repair, prior to the upgrade? That would allow us to know that issues might be coming, and perhaps address them before they result in corrupt OSDs. I'm considering upgrading our clusters from 14 to 15, and would really lik

[ceph-users] has anyone enabled bdev_enable_discard?

2021-04-12 Thread Dan van der Ster
Hi all, bdev_enable_discard has been in ceph for several major releases now but it is still off by default. Did anyone try it recently -- is it safe to use? And do you have perf numbers before and after enabling? Cheers, Dan
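For anyone who wants to test it, enabling it would look something like this; whether it is safe or helps performance is exactly the open question in this thread:

  ceph config set osd bdev_enable_discard true
  # then restart the OSDs so the block devices are reopened with discard enabled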

[ceph-users] Re: Ceph osd Reweight command in octopus

2021-04-12 Thread Brent Kennedy
Yes, I ended up doing that and you are right, it was just being stubborn. I had to drop all the way down to .9 to get those moving. In Nautilus, I don't have to tick that down so low before things start moving. Been on Ceph since firefly, so I try not to go too low. Based on what I was reading
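For context, the manual per-OSD override weight being discussed is set roughly like this; the OSD id and value are illustrative:

  # temporary override weight, range 0.0-1.0
  ceph osd reweight <osd-id> 0.9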

[ceph-users] Re: OSDs RocksDB corrupted when upgrading nautilus->octopus: unknown WriteBatch tag

2021-04-12 Thread Igor Fedotov
The workaround would be to disable bluestore_fsck_quick_fix_on_mount, do an upgrade and then do a regular fsck. Depending on fsck results, either proceed with a repair or not. Thanks, Igor On 4/12/2021 6:35 PM, dhils...@performair.com wrote: Is there a way to check for these zombie blobs,
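Spelled out, that workaround might look like this; the OSD path is illustrative and the OSD has to be stopped before running the offline tools:

  # skip the automatic quick-fix on the first post-upgrade start
  ceph config set osd bluestore_fsck_quick_fix_on_mount false
  # after upgrading, run a regular fsck per OSD (offline)
  ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-<id>
  # only if fsck reports repairable errors
  ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-<id>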

[ceph-users] Re: cephadm custom mgr modules

2021-04-12 Thread Robert Sander
Hi, this is one of the use cases mentioned in Tim Serong's talk: https://youtu.be/pPZsN_urpqw Containers are great for deploying a fixed state of a software project (a release), but not so much for the development of plugins etc. Regards -- Robert Sander Heinlein Support GmbH Schwedter Str. 8

[ceph-users] Re: OSDs RocksDB corrupted when upgrading nautilus->octopus: unknown WriteBatch tag

2021-04-12 Thread DHilsbos
Igor; Does this only impact CephFS then? Thank you, Dominic L. Hilsbos, MBA Director – Information Technology Perform Air International Inc. dhils...@performair.com www.PerformAir.com -Original Message- From: Igor Fedotov [mailto:ifedo...@suse.de] Sent: Monday, April 12, 2021 9:16

[ceph-users] Re: HEALTH_WARN - Recovery Stuck?

2021-04-12 Thread Marc
You know you can play a bit with the ratios? ceph tell osd.* injectargs '--mon_osd_full_ratio=0.95' ceph tell osd.* injectargs '--mon_osd_backfillfull_ratio=0.90' > -Original Message- > From: Ml Ml > Sent: 12 April 2021 19:31 > To: ceph-users > Subject: [ceph-users] HEALTH_WA
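Note that since Luminous the effective ratios are stored in the OSDMap, so the persistent equivalents of the injectargs above would be along these lines (values are examples only):

  ceph osd set-nearfull-ratio 0.85
  ceph osd set-backfillfull-ratio 0.92
  ceph osd set-full-ratio 0.96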

[ceph-users] Re: HEALTH_WARN - Recovery Stuck?

2021-04-12 Thread Andrew Walker-Brown
If you increase the number of pgs, effectively each one is smaller so the backfill process may be able to ‘squeeze’ them onto the nearly full osds while it sorts things out. I’ve had something similar before and this def helped. Sent from my iPhone On 12 Apr 2021, at 19:11, Marc wrote: Y
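A sketch of the PG increase being suggested, with a placeholder pool name and target count:

  ceph osd pool set <pool> pg_num <new_pg_num>
  # on pre-Nautilus releases pgp_num has to be raised as well
  ceph osd pool set <pool> pgp_num <new_pg_num>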

[ceph-users] Re: OSDs RocksDB corrupted when upgrading nautilus->octopus: unknown WriteBatch tag

2021-04-12 Thread Jonas Jelten
Hi Igor! I have plenty of OSDs to lose, as long as the recovery works well afterward, so I can go ahead with it :D What debug flags should I activate? osd=10, bluefs=20, bluestore=20, rocksdb=10, ...? I'm not sure it's really the transaction size, since the broken WriteBatch is dumped, and t
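For completeness, the debug levels being discussed could be injected roughly like this for an OSD that still starts; for one that crashes on startup they would go into ceph.conf or the config database instead:

  ceph tell osd.<id> injectargs '--debug_osd=10 --debug_bluefs=20 --debug_bluestore=20 --debug_rocksdb=10'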

[ceph-users] Re: rbd info error opening image

2021-04-12 Thread Marcel Kuiper
Hi Eugen, Thanks for your response Apparently we ran into network troubles where sometimes traffic was delivered to the wrong firewall over L2 Regards Marcel Eugen Block schreef op 2021-04-12 12:09: Hi, have you checked if the rbd_header object still exists for that volume? If it's indeed

[ceph-users] Re: OSDs RocksDB corrupted when upgrading nautilus->octopus: unknown WriteBatch tag

2021-04-12 Thread Igor Fedotov
No, it has absolutely no relation to CephFS. I presume it's a generic Bluestore/BlueFS issue. On 4/12/2021 9:07 PM, dhils...@performair.com wrote: Igor; Does this only impact CephFS then? Thank you, Dominic L. Hilsbos, MBA Director – Information Technology Perform Air International Inc. dhi

[ceph-users] Re: HEALTH_WARN - Recovery Stuck?

2021-04-12 Thread Michael Thomas
I recently had a similar issue when reducing the number of PGs on a pool. A few OSDs became backfillfull even though there was enough space; the OSDs were just not balanced well. To fix, I reweighted the most-full OSDs: ceph osd reweight-by-utilization 120 After it finished (~1 hour), I had f
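The same command has a dry-run variant, which may be worth running first to see which OSDs would be touched:

  # report the proposed weight changes without applying them
  ceph osd test-reweight-by-utilization 120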

[ceph-users] HEALTH_WARN - Recovery Stuck?

2021-04-12 Thread Ml Ml
Hello, I kind of ran out of disk space, so I added another host with osd.37. But it does not seem to move much data onto it (85MB in 2h). Any idea why the recovery process seems to be stuck? Should I fix the 4 backfillfull osds first (by changing the weight)? root@ceph01:~# ceph -s cluster:

[ceph-users] Re: Nautilus 14.2.19 mon 100% CPU

2021-04-12 Thread Brad Hubbard
On Mon, Apr 12, 2021 at 11:35 AM Robert LeBlanc wrote: > > On Sun, Apr 11, 2021 at 4:19 PM Brad Hubbard wrote: > > > > PSA. > > > > https://docs.ceph.com/en/latest/releases/general/#lifetime-of-stable-releases > > > > https://docs.ceph.com/en/latest/releases/#ceph-releases-index > > I'm very well

[ceph-users] Re: Nautilus 14.2.19 mon 100% CPU

2021-04-12 Thread Robert LeBlanc
On Mon, Apr 12, 2021 at 3:41 PM Brad Hubbard wrote: > > Sure Robert, > > I understand the realities of maintaining large installations which > may have many reasons holding them back from upgrading any of the > interdependent software they run. The other side of the page however > is that we can n

[ceph-users] ceph rgw why are reads faster for larger than 64kb object size

2021-04-12 Thread Ronnie Puthukkeril
Environment: Ceph Nautilus 14.2.8 Object Storage Data nodes: 12 * HDD OSD drives each with a 12TB capacity + 2 * SSD OSD drives for the rgw bucket index pool & rgw meta pool. Custom configs (since we're dealing with a majority of smaller-sized objects): bluestore_min_alloc_size_ssd 4096 bluestor
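As an aside (not from the thread): bluestore_min_alloc_size_* only takes effect when an OSD is created, so existing OSDs keep whatever value they were built with; the currently configured value can be checked on an OSD host with something like:

  ceph daemon osd.0 config get bluestore_min_alloc_size_ssd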

[ceph-users] Re: ceph rgw why are reads faster for larger than 64kb object size

2021-04-12 Thread Ronnie Puthukkeril
Sorry about the formatting in the earlier email. Hope this one works. Below are the read response times from cosbench Stage Op-Name Op-Type Op-Count Byte-Count Avg-ResTime s7-read1KB 48W read read

[ceph-users] Re: ceph rgw why are reads faster for larger than 64kb object size

2021-04-12 Thread Ronnie Puthukkeril
Environment: Ceph Nautilus 14.2.8 Object Storage Data nodes: 12 * HDD OSD drives each with a 12TB capacity + 2 * SSD OSD drives for the rgw bucket index pool & rgw meta pool. Custom configs (since we're dealing with a majority of smaller-sized objects): bluestore_min_alloc_size_ssd 4096 bluestore_min_allo

[ceph-users] Re: Nautilus 14.2.19 mon 100% CPU

2021-04-12 Thread Brad Hubbard
On Tue, Apr 13, 2021 at 8:40 AM Robert LeBlanc wrote: > > Do you think it would be possible to build Nautilus FUSE or newer on > 14.04, or do you think the toolchain has evolved too much since then? > An interesting question. # cat /etc/os-release NAME="Ubuntu" VERSION="14.04.6 LTS, Trusty Tahr"

[ceph-users] BADAUTHORIZER in Nautilus, unknown PGs, slow peering, very slow client I/O

2021-04-12 Thread Nico Schottelius
Good morning, I've looked somewhat intensively through the list and it seems we are rather hard hit by this. Originally this started yesterday on a mixed 14.2.9 and 14.2.16 cluster (osds, mons were all 14.2.16). We started phasing in 7 new osds, 6 of them throttled by reweighting to 0.1. Symptoms are

[ceph-users] Re: BADAUTHORIZER in Nautilus, unknown PGs, slow peering, very slow client I/O

2021-04-12 Thread Nico Schottelius
Update, posting information from other posts before: [08:09:40] server3.place6:~# ceph config-key dump | grep config/ "config/global/auth_client_required": "cephx", "config/global/auth_cluster_required": "cephx", "config/global/auth_service_required": "cephx", "config/global/clus