[ceph-users] Re: CephFS performance degradation in root directory

2022-08-15 Thread Gregory Farnum
I was wondering if it had something to do with quota enforcement. The other possibility that occurs to me is that if other clients are monitoring the system, or an admin panel (e.g. the dashboard) is displaying per-volume or per-client stats, they may be poking at the mountpoint and interrupting exclusive
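For reference, one quick check from any client is whether quota xattrs are actually set on the root directory; a minimal sketch, assuming a CephFS mount at /mnt/cephfs (the path is only an example):

  # a value of 0 (or a missing attribute) means no quota is set on that directory
  getfattr -n ceph.quota.max_bytes /mnt/cephfs
  getfattr -n ceph.quota.max_files /mnt/cephfs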

[ceph-users] Re: What is client request_load_avg? Troubleshooting MDS issues on Luminous

2022-08-15 Thread Chris Smart
On Mon, 2022-08-15 at 09:00 +, Frank Schilder wrote: > Hi Chris, > Hi Frank, thanks for the reply. > I also have serious problems identifying problematic ceph-fs clients > (using mimic). I don't think that even in the newest ceph version > there are useful counters for that. Just last week
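As a rough way to rank clients by that metric: request_load_avg shows up per session in the MDS session listing on recent releases (its exact placement on Luminous/Mimic may differ), so a sketch like the following can surface the busiest clients. The daemon name mds.<name> and the use of jq are assumptions here:

  # run on the host of the active MDS; sorts sessions by request_load_avg, highest first
  ceph daemon mds.<name> session ls | \
    jq -r '.[] | [.id, (.request_load_avg // "n/a"), .client_metadata.hostname] | @tsv' | \
    sort -k2 -rn | head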

[ceph-users] Ceph User + Dev Monthly August Meetup

2022-08-15 Thread Neha Ojha
Hi everyone, This month's Ceph User + Dev Monthly meetup is on August 18, 14:00-15:00 UTC. We are planning to get some user feedback on BlueStore compression modes. Please add other topics to the agenda: https://pad.ceph.com/p/ceph-user-dev-monthly-minutes. Hope to see you there! Thanks, Neha
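For context on the compression topic: BlueStore compression modes are configured per pool (none, passive, aggressive, force). A minimal sketch, using a hypothetical pool name mypool:

  # enable compression on one pool; the mode and algorithm values shown are examples
  ceph osd pool set mypool compression_mode aggressive
  ceph osd pool set mypool compression_algorithm snappy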

[ceph-users] Re: What is client request_load_avg? Troubleshooting MDS issues on Luminous

2022-08-15 Thread Chris Smart
On Tue, 2022-08-16 at 13:21 +1000, distro...@gmail.com wrote: > > I'm not quite sure of the relationship of operations between MDS and > OSD data. The MDS gets written to nvme pool and clients access data > directly on OSD nodes, but do MDS operations also need to wait for > OSDs > to perform oper

[ceph-users] Re: What is client request_load_avg? Troubleshooting MDS issues on Luminous

2022-08-15 Thread distroguy
On Mon, 2022-08-15 at 08:33 +, Eugen Block wrote: > Hi, > > do you see high disk utilization on the OSD nodes?  Hi Eugen, thanks for the reply, much appreciated. > How is the load on  > the active MDS? Yesterday I rebooted the three MDS nodes one at a time (which obviously included a failo

[ceph-users] Re: CephFS performance degradation in root directory

2022-08-15 Thread Xiubo Li
On 8/9/22 4:07 PM, Robert Sander wrote: Hi, we have a cluster with 7 nodes each with 10 SSD OSDs providing CephFS to a CloudStack system as primary storage. When copying a large file into the root directory of the CephFS the bandwidth drops from 500MB/s to 50MB/s after around 30 seconds. W

[ceph-users] Re: Ceph needs your help with defining availability!

2022-08-15 Thread Kamoltat Sirivadhna
Hi guys, thank you so much for filling out the Ceph Cluster Availability survey! We have received a total of 59 responses from various groups of people, which is enough to help us understand more profoundly what availability means to everyone. As promised, here is the link to the results of the

[ceph-users] Re: Quincy: Corrupted devicehealth sqlite3 database from MGR crashing bug

2022-08-15 Thread Daniel Williams
ceph-post-file: a9802e30-0096-410e-b5c0-f2e6d83acfd6 On Tue, Aug 16, 2022 at 3:13 AM Patrick Donnelly wrote: > On Mon, Aug 15, 2022 at 11:39 AM Daniel Williams > wrote: > > > > Using ubuntu with apt repository from ceph. > > > > Ok that helped me figure out that it's .mgr not mgr. > > # ceph -v

[ceph-users] Re: The next quincy point release

2022-08-15 Thread Patrick Donnelly
This must go in the next quincy release: https://github.com/ceph/ceph/pull/47288 but we're still waiting on reviews and final tests before merging into main. On Mon, Aug 15, 2022 at 11:02 AM Yuri Weinstein wrote: > > We plan to start QE validation for the next quincy point release this week. >

[ceph-users] Re: Quincy: Corrupted devicehealth sqlite3 database from MGR crashing bug

2022-08-15 Thread Patrick Donnelly
On Mon, Aug 15, 2022 at 11:39 AM Daniel Williams wrote: > > Using ubuntu with apt repository from ceph. > > Ok that helped me figure out that it's .mgr not mgr. > # ceph -v > ceph version 17.2.3 (dff484dfc9e19a9819f375586300b3b79d80034d) quincy (stable) > # export CEPH_CONF='/etc/ceph/ceph.conf' >

[ceph-users] Re: Quincy: Corrupted devicehealth sqlite3 database from MGR crashing bug

2022-08-15 Thread Daniel Williams
Using ubuntu with apt repository from ceph. Ok that helped me figure out that it's .mgr not mgr. # ceph -v ceph version 17.2.3 (dff484dfc9e19a9819f375586300b3b79d80034d) quincy (stable) # export CEPH_CONF='/etc/ceph/ceph.conf' # export CEPH_KEYRING='/etc/ceph/ceph.client.admin.keyring' # export CE
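The truncated commands above are setting up the environment so the sqlite3 shell can reach the RADOS-backed database through libcephsqlite. A rough sketch of what the rest of such a session might look like (the library name/path is distro-dependent, and poking at a live mgr database should be done with care):

  export CEPH_CONF='/etc/ceph/ceph.conf'
  export CEPH_KEYRING='/etc/ceph/ceph.client.admin.keyring'
  # load the ceph VFS and open the devicehealth DB in the .mgr pool (drops into an interactive sqlite shell)
  sqlite3 -cmd '.load libcephsqlite.so' -cmd '.open file:///.mgr:devicehealth/main.db?vfs=ceph'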

[ceph-users] Re: Quincy: Corrupted devicehealth sqlite3 database from MGR crashing bug

2022-08-15 Thread Patrick Donnelly
Hello Daniel, On Mon, Aug 15, 2022 at 10:38 AM Daniel Williams wrote: > > My managers are crashing reading the sqlite database for devicehealth: > .mgr:devicehealth/main.db-journal > debug -2> 2022-08-15T11:14:09.184+ 7fa5721b7700 5 cephsqlite: > Read: (client.53284882) [.mgr:deviceheal

[ceph-users] Re: Some odd results while testing disk performance related to write caching

2022-08-15 Thread Dan van der Ster
Hi, We have some docs about this in the Ceph hardware recommendations: https://docs.ceph.com/en/latest/start/hardware-recommendations/#write-caches I added some responses inline. On Fri, Aug 5, 2022 at 7:23 PM Torbjörn Jansson wrote: > > Hello > > I got a small 3 node Ceph cluster and I'm doin
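The linked section is about volatile write caches on the drives themselves; checking and disabling them usually looks something like the sketch below (the device name is a placeholder, and the exact tool depends on whether the drive is SATA, SAS or NVMe):

  # query the volatile write cache state, then disable it (SATA example)
  hdparm -W /dev/sda
  hdparm -W 0 /dev/sda
  # smartctl can also report the write cache setting
  smartctl -g wcache /dev/sda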

[ceph-users] Quincy: Corrupted devicehealth sqlite3 database from MGR crashing bug

2022-08-15 Thread Daniel Williams
My managers are crashing reading the sqlite database for devicehealth: .mgr:devicehealth/main.db-journal debug -2> 2022-08-15T11:14:09.184+ 7fa5721b7700 5 cephsqlite: Read: (client.53284882) [.mgr:devicehealth/main.db-journal] 0x5601da0c0008 4129788~65536 debug -1> 2022-08-15T11:14:09

[ceph-users] Re: Recovery very slow after upgrade to quincy

2022-08-15 Thread Torkil Svensgaard
On 15-08-2022 08:24, Satoru Takeuchi wrote: On Sat, Aug 13, 2022 at 1:35, Robert W. Eckert wrote: Interesting, a few weeks ago I added a new disk to each node of my 3 node cluster and saw the same 2 Mb/s recovery. What I had noticed was that one OSD was using very high CPU and seems to have been the primary no

[ceph-users] Re: What is client request_load_avg? Troubleshooting MDS issues on Luminous

2022-08-15 Thread Eugen Block
Hi, do you see high disk utilization on the OSD nodes? How is the load on the active MDS? How much RAM is configured for the MDS (mds_cache_memory_limit)? You can list all MDS sessions with 'ceph daemon mds.<name> session ls' to identify all your clients and 'ceph daemon mds.<name> dump_blocked_ops'
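Spelled out, assuming the active MDS daemon is named mds.a and the commands are run on its host (the name is only an example):

  ceph daemon mds.a config get mds_cache_memory_limit   # current MDS cache size limit
  ceph daemon mds.a session ls                           # list all client sessions
  ceph daemon mds.a dump_blocked_ops                     # show currently blocked operations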

[ceph-users] Re: CephFS performance degradation in root directory

2022-08-15 Thread Robert Sander
On 09.08.22 at 10:07, Robert Sander wrote: When copying the same file to a subdirectory of the CephFS the performance stays at 500MB/s for the whole time. MDS activity does not seem to influence the performance here. There is a new datapoint: When mounting the subdirectory (and not CephFS'
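For reference, mounting only a subdirectory of a CephFS with the kernel client looks roughly like this; the monitor address, path and credentials below are placeholders:

  # mount a CephFS subdirectory instead of the filesystem root
  mount -t ceph 192.0.2.1:6789:/subdir /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret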