[ceph-users] Re: OSD spend too much time on "waiting for readable" -> slow ops -> laggy pg -> rgw stop -> worst case osd restart

2021-11-04 Thread Manuel Lausch
On Tue, 2 Nov 2021 09:02:31 -0500 Sage Weil wrote: > > Just to be clear, you should try > osd_fast_shutdown = true > osd_fast_shutdown_notify_mon = false I added some logs to the tracker ticket with these options set. > You write if the osd rejects messenger connections, because it is > >
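For readers following along, a minimal sketch (not from the original thread) of applying the two suggested options through the centralized config store, assuming a release where both options exist:
```
# Apply the suggested shutdown settings cluster-wide via the config database.
ceph config set osd osd_fast_shutdown true
ceph config set osd osd_fast_shutdown_notify_mon false

# Verify what a running OSD actually picked up (osd.0 is an arbitrary example).
ceph config show osd.0 | grep osd_fast_shutdown
```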

[ceph-users] Re: High ceph_osd_commit_latency_ms on Toshiba MG07ACA14TE HDDs

2021-11-04 Thread Dan van der Ster
Hello Benoît, (and others in this great thread), Apologies for replying to this ancient thread. We have been debugging similar issues during an ongoing migration to new servers with TOSHIBA MG07ACA14TE hdds. We see a similar commit_latency_ms issue on the new drives (~60ms in our env vs ~20ms fo

[ceph-users] Re: High ceph_osd_commit_latency_ms on Toshiba MG07ACA14TE HDDs

2021-11-04 Thread Mark Nelson
Hi Dan, I can't speak for those specific Toshiba drives, but we have absolutely seen very strange behavior (sometimes with cache enabled and sometimes not) with different drives and firmwares over the years from various manufacturers.  There was one especially bad case from back in the Inkta

[ceph-users] Re: Multisite replication is on gateway layer right?

2021-11-04 Thread Janne Johansson
On Thu, 4 Nov 2021 at 13:37, Szabo, Istvan (Agoda) wrote: > Hi, > > In case of bucket replication, is the replication happening on the osd level or the > gateway layer? bucket == gateway layer. > Could that be a problem, that in my 3-cluster multisite environment the > cluster networks are in 2 clus

[ceph-users] Re: High ceph_osd_commit_latency_ms on Toshiba MG07ACA14TE HDDs

2021-11-04 Thread Dan van der Ster
Thanks Mark. With the help of the crowd on Telegram, we found that (at least here) the drive cache needs to be disabled like this:
```
for x in /sys/class/scsi_disk/*/cache_type; do echo 'write through' > $x; done
```
This disables the cache (confirmed afterwards with hdparm) but more importantl
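A small verification sketch (not part of the original message), with /dev/sdX as a placeholder device:
```
# hdparm -W reports whether the drive's volatile write cache is enabled.
hdparm -W /dev/sdX

# Note: the sysfs cache_type setting above is not persistent across reboots;
# re-applying it at boot (e.g. via a udev rule) is site-specific and not shown here.
```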

[ceph-users] Re: High cephfs MDS latency and CPU load

2021-11-04 Thread Patrick Donnelly
Hi Andras, On Wed, Nov 3, 2021 at 10:18 AM Andras Pataki wrote: > > Hi cephers, > > Recently we've started using cephfs snapshots more - and seem to be > running into a rather annoying performance issue with the MDS. The > cluster in question is on Nautilus 14.2.20. > > Typically, the MDS proces

[ceph-users] large bucket index in multisite environement (how to deal with large omap objects warning)?

2021-11-04 Thread Boris Behrens
Hi everybody, we maintain three ceph clusters (2x octopus, 1x nautilus) that use three zonegroups to sync metadata, without syncing the actual data (only one zone per zonegroup). Some customers have buckets with >4m objects in our largest cluster (the other two are very fresh with close to 0 data in

[ceph-users] Re: OSD spend too much time on "waiting for readable" -> slow ops -> laggy pg -> rgw stop -> worst case osd restart

2021-11-04 Thread Gregory Farnum
On Tue, Nov 2, 2021 at 7:03 AM Sage Weil wrote: > On Tue, Nov 2, 2021 at 8:29 AM Manuel Lausch > wrote: > > > Hi Sage, > > > > The "osd_fast_shutdown" is set to "false" > > As we upgraded to luminous I also had blocked IO issues with this > > enabled. > > > > Some weeks ago I tried out the opti

[ceph-users] Re: large bucket index in multisite environement (how to deal with large omap objects warning)?

2021-11-04 Thread Teoman Onay
AFAIK dynamic resharding is not supported for multisite setups, but you can reshard manually. Note that this is a very expensive process which requires you to: - disable the sync of the bucket you want to reshard - stop all the RGWs (no more access to your Ceph cluster) - on a node of the master z
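A hedged sketch of those steps with radosgw-admin (bucket name and shard count are placeholders; the exact multisite procedure, including cleanup on secondary zones, is release-specific and should be checked against the docs):
```
# On the master zone, with all RGWs stopped:
radosgw-admin bucket sync disable --bucket=mybucket          # stop replication for this bucket
radosgw-admin bucket reshard --bucket=mybucket --num-shards=101
radosgw-admin bucket sync enable --bucket=mybucket           # re-enable sync when done
```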

[ceph-users] How to setup radosgw with https on pacific?

2021-11-04 Thread Scharfenberg, Carsten
Hello everybody, I'm quite new to ceph and I'm facing a myriad of issues trying to use it. So I've subscribed to this mailing list. Hopefully you guys can help me with some of those issues. My current goal is to set up local S3 storage -- i.e. a ceph "cluster" with radosgw. In my test environ
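A minimal sketch of one way to do this with a cephadm RGW service spec (service id, host name and certificate content are placeholders; this assumes the Pacific spec fields ssl, rgw_frontend_port and rgw_frontend_ssl_certificate):
```
# Write an SSL-enabled RGW service spec (placeholders throughout) and apply it.
cat > rgw-ssl.yaml <<'EOF'
service_type: rgw
service_id: local-s3
placement:
  hosts:
    - ceph-node1
spec:
  ssl: true
  rgw_frontend_port: 443
  rgw_frontend_ssl_certificate: |
    -----BEGIN CERTIFICATE-----
    (PEM-encoded certificate followed by the private key)
    -----END CERTIFICATE-----
EOF
ceph orch apply -i rgw-ssl.yaml
```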

[ceph-users] Re: large bucket index in multisite environement (how to deal with large omap objects warning)?

2021-11-04 Thread Сергей Процун
If sharding is not an option at all, then you can increase the osd_deep_scrub_large_omap_object_key_threshold, which is not the best idea. I would still go with resharding, which might result in taking at least the slave sites offline. In the future you can set a higher number of shards during initial creatio
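If one did take the threshold route despite the caveat, a sketch (the value is an arbitrary example; raising it only silences the warning, it does not remove the cost of large omap objects):
```
# Raise the per-object omap key count at which deep scrub flags LARGE_OMAP_OBJECTS.
ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 500000
```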

[ceph-users] fresh pacific installation does not detect available disks

2021-11-04 Thread Scharfenberg, Carsten
Hello everybody, as a ceph newbie I've tried out setting up ceph pacific according to the official documentation: https://docs.ceph.com/en/latest/cephadm/install/ The intention was to set up a single-node "cluster" with radosgw to provide local S3 storage. This failed because my ceph "cluster" woul

[ceph-users] Grafana embed in dashboard no longer functional

2021-11-04 Thread Zach Heise (SSCC)
We're using cephadm with all 5 nodes on 16.2.6. Until today, grafana has been running only on ceph05. Before the 16.2.6 update, the embedded frames would pop up an expected security error for self-signed certificates, but after accepting would work. After the 16.2

[ceph-users] Re: fresh pacific installation does not detect available disks

2021-11-04 Thread Zach Heise
Hi Carsten, When I had problems on my physical hosts (recycled systems that we wanted to just use in a test cluster) I found that I needed to use sgdisk --zap-all /dev/sd{letter} to clean all partition maps off the disks before ceph would recognize them as available. Worth a shot in your case, eve
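For reference, a sketch of that cleanup (device letter is a placeholder; double-check the device before wiping):
```
# Wipe GPT/MBR partition structures from the recycled disk, then ask the orchestrator to rescan.
sgdisk --zap-all /dev/sdX
ceph orch device ls --refresh
```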

[ceph-users] Re: fresh pacific installation does not detect available disks

2021-11-04 Thread Yury Kirsanov
Hi, You should erase any partitions or LVM groups on the disks and restart the OSD hosts so Ceph is able to detect the drives. I usually just do 'dd if=/dev/zero of=/dev/ bs=1M count=1024' and then reboot the host to make sure it will definitely be clean. Or, alternatively, you can zap the drives, or you
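A hedged sketch of the zap alternative (device path and hostname are placeholders):
```
# Let ceph-volume wipe the device, including any LVM metadata on it.
ceph-volume lvm zap --destroy /dev/sdX

# Or do the same via the orchestrator from the admin node.
ceph orch device zap ceph-node1 /dev/sdX --force
```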

[ceph-users] Re: fresh pacific installation does not detect available disks

2021-11-04 Thread Сергей Процун
Hello, I agree with that point. When ceph creates lvm volumes it adds lvm tags to them. That's how ceph finds that they are occupied by ceph. So you should remove the lvm volumes and, even better, clean all data on those lvm volumes. Usually it's enough to clean just the head of the lvm partition where
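A sketch of how one might inspect and remove such leftover LVM volumes (the volume group name and device are placeholders; make sure you are wiping the right disk):
```
# Show LVM tags so you can spot ceph-osd volumes left over from a previous installation.
lvs -o lv_name,vg_name,lv_tags

# Remove the leftover volume group and wipe remaining signatures from the raw device.
VG=ceph-xxxxxxxx   # placeholder: the leftover ceph volume group name from 'lvs' above
vgremove -f "$VG"
wipefs -a /dev/sdX
```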

[ceph-users] Re: Grafana embed in dashboard no longer functional

2021-11-04 Thread Zach Heise (SSCC)
Argh - that was it. Tested in Microsoft Edge and it worked fine. I was using Firefox as my primary browser, and the "enhanced tracking protection" setting was the issue killing the iframe loading. Once I disabled that for the mgr daemon's URL the embeds started loadi

[ceph-users] Re: OSD spend too much time on "waiting for readable" -> slow ops -> laggy pg -> rgw stop -> worst case osd restart

2021-11-04 Thread Sage Weil
Can you try setting paxos_propose_interval to a smaller number, like .3 (by default it is 2 seconds), and see if that has any effect? It sounds like the problem is not related to getting the OSD marked down (or at least that is not the only thing going on). My next guess is that the peering proces
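A sketch of trying that via the centralized config store (revert once the test is done):
```
# Lower the monitor proposal interval as suggested, from its default to 0.3 seconds.
ceph config set mon paxos_propose_interval 0.3

# Remove the override to return to the default.
ceph config rm mon paxos_propose_interval
```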

[ceph-users] One cephFS snapshot kills performance

2021-11-04 Thread Sebastian Mazza
Hi all! I’m new to cephFS. My test file system uses a replicated pool on NVMe SSDs for metadata and an erasure-coded pool on HDDs for data. All OSDs use bluestore. I used ceph version 16.2.6 for all daemons - created with this version and running this version. The linux kernel that I used f

[ceph-users] Optimal Erasure Code profile?

2021-11-04 Thread Zakhar Kirpichenko
Hi! I've got a CEPH 16.2.6 cluster, the hardware is 6 x Supermicro SSG-6029P nodes, each equipped with:
- 2 x Intel(R) Xeon(R) Gold 5220R CPUs
- 384 GB RAM
- 2 x boot drives
- 2 x 1.6 TB enterprise NVME drives (DB/WAL)
- 2 x 6.4 TB enterprise drives (storage tier)
- 9 x 9TB HDDs (storage tier)
- 2 x Intel XL71
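Not part of the original question, but a hedged sketch of creating one EC profile that fits 6 hosts (k=4, m=2 with a host failure domain is a common fit for six nodes; profile and pool names are placeholders):
```
# Define a 4+2 erasure-code profile with host-level failure domain and inspect it.
ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-failure-domain=host
ceph osd erasure-code-profile get ec-4-2

# Create an erasure-coded pool using that profile (pool name and PG count are placeholders).
ceph osd pool create ecpool 128 128 erasure ec-4-2
```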

[ceph-users] Are setting 'ceph auth caps' and/or adding a cache pool I/O-disruptive operations?

2021-11-04 Thread Zakhar Kirpichenko
Hi, I'm trying to figure out if setting auth caps and/or adding a cache pool are I/O-disruptive operations, i.e. if caps reset to 'none' for a brief moment or client I/O momentarily stops for other reasons. For example, I had the following auth setting in my 16.2.x cluster: client.cinder
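For context, a sketch of the kind of caps change being discussed (client, profiles and pool names are illustrative, not the poster's actual settings):
```
# 'auth caps' replaces the caps on an existing client key in a single call;
# the key itself is not regenerated by this command.
ceph auth caps client.cinder mon 'profile rbd' osd 'profile rbd pool=volumes, profile rbd pool=images'
ceph auth get client.cinder   # verify the resulting caps
```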

[ceph-users] Stale monitoring alerts in UI

2021-11-04 Thread Zakhar Kirpichenko
Hi, I seem to have some stale monitoring alerts in my Mgr UI, which do not want to go away. For example (I'm also attaching an image for your convenience): MTU Mismatch: Node ceph04 has a different MTU size (9000) than the median value on device storage-int. The alert appears to be active, but

[ceph-users] Re: Are setting 'ceph auth caps' and/or adding a cache pool I/O-disruptive operations?

2021-11-04 Thread Zakhar Kirpichenko
Yes, it was an attempt to address poor performance, which didn't go well. Btw, this isn't the first time I'm reading that the cache tier is "kind of deprecated", but the documentation doesn't really say this; instead it explains how to make a cache tier. Perhaps it should be made clearer that addin