[ceph-users] Re: Linux Laptop Losing CephFS mounts on Sleep/Hibernate

2024-03-28 Thread Jos Collin
We would need logs to check this further. Or just call a script during wakeup? On Tue, 26 Mar 2024 at 04:16, wrote: > Hi All, > > So I've got a Ceph Reef Cluster (latest version) with a CephFS system set > up with a number of directories on it. > > On a Laptop (running Rocky Linux (latest version)) I've
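If you go the script route, a systemd system-sleep hook is one way to remount on resume. This is only a rough sketch; the mount point /mnt/cephfs and the fstab-managed mount are assumptions, not details from the thread:

    #!/bin/bash
    # /usr/lib/systemd/system-sleep/cephfs-remount.sh (must be executable)
    # systemd calls this with $1 = "pre"/"post" and $2 = "suspend"/"hibernate"/...
    if [ "$1" = "post" ]; then
        umount -f -l /mnt/cephfs 2>/dev/null   # drop the possibly stale mount
        mount /mnt/cephfs                      # assumes an fstab entry for the CephFS mount
    fi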

[ceph-users] PG's stuck incomplete on EC pool after multiple drive failure

2024-03-28 Thread Malcolm Haak
Hello all. I have a cluster with ~80TB of spinning disk. Its primary role is CephFS. Recently I had a multiple-drive failure (it was not simultaneous), and it has left me with 20 incomplete PGs. I know this data is toast, but I need to be able to get what isn't toast out of the CephFS. Well out of t

[ceph-users] Re: MDS Behind on Trimming...

2024-03-28 Thread Xiubo Li
On 3/29/24 04:57, Erich Weiler wrote: MDS logs show: Mar 28 13:42:29 pr-md-02.prism ceph-mds[1464328]: log_channel(cluster) log [WRN] : 16 slow requests, 0 included below; oldest blocked for > 3676.400077 secs Mar 28 13:42:30 pr-md-02.prism ceph-mds[1464328]: mds.slugfs.pr-md-02.sbblqq Updat

[ceph-users] Re: MDS Behind on Trimming...

2024-03-28 Thread Burkhard Linke
Hi, we have similar problems from time to time. We are running Reef on the servers and the latest Ubuntu 20.04 HWE kernel on the clients. There are probably two scenarios with slightly different observations: 1. MDS reports slow ops. Some client is holding caps for a certain file/directory and blocks o

[ceph-users] Re: MDS Behind on Trimming...

2024-03-28 Thread Alexander E. Patrakov
Hello Erich, What you are experiencing is definitely a bug - but possibly a client bug. Not sure. Upgrading Ceph packages on the clients, though, will not help, because the actual CephFS client is the kernel. You can try upgrading it to the latest 6.8.x (or, better, trying the same workload from d

[ceph-users] ceph orchestrator for osds

2024-03-28 Thread Jeffrey Turmelle
Running on Octopus: While attempting to install a bunch of new OSDs on multiple hosts, I ran some ceph orchestrator commands to install them, such as: ceph orch apply osd --all-available-devices and ceph orch apply osd -i HDD_drive_group.yaml. I assumed these were just helper processes, and they wou
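For anyone hitting the same surprise: `ceph orch apply` records a declarative service spec that cephadm keeps enforcing, so it will keep consuming eligible devices until told otherwise. A minimal sketch of how to inspect and pause that behaviour (exact output and flags may differ slightly on Octopus):

    # show the OSD service specs the orchestrator is tracking
    ceph orch ls osd --export
    # stop the orchestrator from automatically creating OSDs on new devices
    ceph orch apply osd --all-available-devices --unmanaged=true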

[ceph-users] Re: MDS Behind on Trimming...

2024-03-28 Thread Erich Weiler
Could there be an issue with the fact that the servers (MDS, MGR, MON, OSD) are running reef and all the clients are running quincy? I can easily enough get the new reef repo in for all our clients (Ubuntu 22.04) and upgrade the clients to reef if that might help..? On 3/28/24 3:05 PM, Erich

[ceph-users] Re: MDS Behind on Trimming...

2024-03-28 Thread Erich Weiler
I asked the user and they said no, no rsync involved. That said, I rsync'd 500TB into this filesystem in the beginning without incident, so hopefully it's not a big deal here. I'm asking the user what their workflow does to try and pin this down. Are there any other known reasons why a slow requ

[ceph-users] Re: MDS Behind on Trimming...

2024-03-28 Thread Alexander E. Patrakov
Hello Erich, Does the workload, by any chance, involve rsync? It is unfortunately well-known for triggering such issues. A workaround is to export the directory via NFS and run rsync against the NFS mount instead of directly against CephFS. On Fri, Mar 29, 2024 at 4:58 AM Erich Weiler wrote: > >
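One way to set that workaround up is the cephadm-managed NFS (Ganesha) service. This is only a sketch; the cluster id "mynfs", host "hostA", filesystem name "cephfs" and the paths are placeholders, and the export-create arguments vary between releases (check `ceph nfs export create cephfs --help`):

    # create a small NFS cluster and export the CephFS subtree
    ceph nfs cluster create mynfs "1 hostA"
    ceph nfs export create cephfs --cluster-id mynfs --pseudo-path /data --fsname cephfs --path /data
    # on the client: mount the export and run rsync against it instead of CephFS
    mount -t nfs hostA:/data /mnt/nfs
    rsync -a /source/ /mnt/nfs/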

[ceph-users] Re: MDS Behind on Trimming...

2024-03-28 Thread Erich Weiler
MDS logs show: Mar 28 13:42:29 pr-md-02.prism ceph-mds[1464328]: log_channel(cluster) log [WRN] : 16 slow requests, 0 included below; oldest blocked for > 3676.400077 secs Mar 28 13:42:30 pr-md-02.prism ceph-mds[1464328]: mds.slugfs.pr-md-02.sbblqq Updating MDS map to version 22775 from mon.3

[ceph-users] Re: MDS Behind on Trimming...

2024-03-28 Thread Erich Weiler
Wow those are extremely useful commands. Next time this happens I'll be sure to use them. A quick test shows they work just great! cheers, erich On 3/28/24 11:16 AM, Alexander E. Patrakov wrote: Hi Erich, Here is how to map the client ID to some extra info: ceph tell mds.0 client ls id=994

[ceph-users] Re: Call for Interest: Managed SMB Protocol Support

2024-03-28 Thread Bailey Allison
Hey, We make use of the ctdb_mutex_ceph_rados_helper, so the lock just gets stored within the CephFS metadata pool rather than as a file on a shared CephFS mount. We don't recommend storing it directly on CephFS because, if the mount hosting the lock file goes down, we have seen the mds mark as stal
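For context, the helper is wired in as CTDB's recovery/cluster lock provider in ctdb.conf. A rough sketch only; the helper path, cephx user, pool name and object name below are assumptions and differ per distro and Samba/CTDB version:

    # /etc/ctdb/ctdb.conf
    [cluster]
        # ctdb_mutex_ceph_rados_helper <cluster name> <cephx user> <rados pool> <object> [lock duration]
        recovery lock = !/usr/libexec/ctdb/ctdb_mutex_ceph_rados_helper ceph client.samba cephfs_metadata ctdb_reclock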

[ceph-users] Re: MDS Behind on Trimming...

2024-03-28 Thread Alexander E. Patrakov
Hi Erich, Here is how to map the client ID to some extra info: ceph tell mds.0 client ls id=99445 Here is how to map inode ID to the path: ceph tell mds.0 dump inode 0x100081b9ceb | jq -r .path On Fri, Mar 29, 2024 at 1:12 AM Erich Weiler wrote: > > Here are some of the MDS logs: > > Mar 27 1

[ceph-users] Re: MDS Behind on Trimming...

2024-03-28 Thread Erich Weiler
Here are some of the MDS logs: Mar 27 11:58:25 pr-md-01.prism ceph-mds[1296468]: log_channel(cluster) log [WRN] : slow request 511.703289 seconds old, received at 2024-03-27T18:49:53.623192+: client_request(client.99375:459393 getattr AsXsFs #0x100081b9ceb 2024-03-27T18:49:53.620806+

[ceph-users] Re: Failed adding back a node

2024-03-28 Thread Alex
Thanks so much! I'll give it a shot.

[ceph-users] Re: RGW Data Loss Bug in Octopus 15.2.0 through 15.2.6

2024-03-28 Thread Jonas Nemeiksis
Hello, Maybe your issue is related to this: https://tracker.ceph.com/issues/63642 On Wed, Mar 27, 2024 at 7:31 PM xu chenhui wrote: > Hi, Eric Ivancich > I have a similar problem in ceph version 16.2.5. Has this problem been > completely resolved in the Pacific version? > Our bucket has no lifecy

[ceph-users] Re: Failed adding back a node

2024-03-28 Thread Adam King
No, you can't use the image id for the upgrade command; it has to be the image name. So, based on what you have, it should start with registry.redhat.io/rhceph/. As for the full name, it depends which image you want to go with. As for trying this on an OSD first, there is `ceph orch daemon redeploy --i
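Putting that together, a hedged sketch of both commands (the image name, tag and the osd.12 daemon name are placeholders; `ceph orch daemon redeploy --help` and `ceph orch upgrade start --help` will show the exact flags for your release):

    # test the image on a single daemon first
    ceph orch daemon redeploy osd.12 --image registry.redhat.io/rhceph/<image>:<tag>
    # then run the cluster-wide upgrade with the full image name
    ceph orch upgrade start --image registry.redhat.io/rhceph/<image>:<tag>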

[ceph-users] Re: 1x port from bond down causes all osd down in a single machine

2024-03-28 Thread Alwin Antreich
On March 26, 2024 5:02:16 PM GMT+01:00, "Szabo, Istvan (Agoda)" wrote: >Hi, > >Wondering what we are missing from the netplan configuration on Ubuntu for Ceph to tolerate this properly. >We are using this bond configuration on Ubuntu 20.04 with Octopus Ceph: > > >bond1: > macaddress: x
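For comparison, a netplan bond that survives a single port failure generally needs a link monitor configured. A sketch only; the interface names, address and bond mode below are placeholders, not the poster's configuration:

    network:
      version: 2
      ethernets:
        eno1: {}
        eno2: {}
      bonds:
        bond1:
          interfaces: [eno1, eno2]
          parameters:
            mode: 802.3ad              # or active-backup if the switch side is not LACP
            mii-monitor-interval: 100  # detect a failed member link quickly
            lacp-rate: fast
          addresses: [192.0.2.10/24]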

[ceph-users] Re: Can setting mds_session_blocklist_on_timeout to false minize the session eviction?

2024-03-28 Thread Kotresh Hiremath Ravishankar
On Tue, Mar 26, 2024 at 7:30 PM Yongseok Oh wrote: > Hi, > > CephFS is provided as a shared file system service in the private cloud > environment of our company, LINE. The number of sessions is more than > 5,000, and session evictions occur several times a day. When > session evictio
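For reference, the options in question can be inspected and changed through the config interface (whether disabling blocklisting on timeout is wise for your workload is a separate question; this only shows the mechanics):

    # current behaviour when a session times out
    ceph config get mds mds_session_blocklist_on_timeout
    # disable blocklisting of timed-out clients
    ceph config set mds mds_session_blocklist_on_timeout false
    # the companion option used for explicit evictions
    ceph config get mds mds_session_blocklist_on_evict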

[ceph-users] Re: Linux Laptop Losing CephFS mounts on Sleep/Hibernate

2024-03-28 Thread Suyash Dongre
The best way I would recommend is to disable sleep when the laptop lid closes. Open the “logind.conf” file (sudo nano /etc/systemd/logind.conf), then change #HandleLidSwitch=suspend to HandleLidSwitch=ignore (i.e. remove the #). Remember to save and exit. You ca
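As a quick sketch of those steps (restarting systemd-logind is an assumption about how to apply the change without rebooting; a reboot also works, and on some desktops HandleLidSwitchExternalPower/HandleLidSwitchDocked may need the same treatment):

    sudo nano /etc/systemd/logind.conf
    # change:
    #   #HandleLidSwitch=suspend
    # to:
    #   HandleLidSwitch=ignore
    sudo systemctl restart systemd-logind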

[ceph-users] Re: Linux Laptop Losing CephFS mounts on Sleep/Hibernate

2024-03-28 Thread Dhairya Parmar
So the client session was dropped when the laptop went into sleep mode. What could've happened is that, since the client was silent, it failed to renew its caps in time, hit `session_autoclose` (defaults to 300 secs) and thus got evicted. As Kotresh mentioned, client logs would reveal better
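To check how that timeout is set on a given cluster, and to see the kernel client's view on the laptop side, something like the following (purely illustrative) should do:

    # MDS side: the idle timeout after which a silent client is evicted
    ceph config get mds session_autoclose
    # client side: kernel CephFS messages around the suspend/resume window
    dmesg | grep -i ceph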

[ceph-users] Re: Linux Laptop Losing CephFS mounts on Sleep/Hibernate

2024-03-28 Thread Kotresh Hiremath Ravishankar
I think the client should reconnect when it's out of sleep. Could you please share the client logs to check what's happening? On Tue, Mar 26, 2024 at 4:16 AM wrote: > Hi All, > > So I've got a Ceph Reef Cluster (latest version) with a CephFS system set > up with a number of directories on it.

[ceph-users] Re: mclock and massive reads

2024-03-28 Thread Luis Domingues
Luis Domingues Proton AG On Thursday, 28 March 2024 at 10:10, Sridhar Seshasayee wrote: > Hi Luis, > > > So our question is: does mClock take into account the reads as well as the > > writes? Or are the reads calculated to be less expensive than the writes? > > mClock treats both reads and

[ceph-users] Re: mclock and massive reads

2024-03-28 Thread Sridhar Seshasayee
Hi Luis, > So our question is: does mClock take into account the reads as well as the > writes? Or are the reads calculated to be less expensive than the writes? > > mClock treats both reads and writes equally. When you say "massive reads", do you mean a predominantly read workload? Also, the size of
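For anyone checking their own setup, the active mClock profile and the values it derives can be read through the config interface (osd.0 below is just an example daemon, and switching profiles is shown purely as an illustration):

    # which profile is in effect, and what it sets on one OSD
    ceph config get osd osd_mclock_profile
    ceph config show osd.0 | grep osd_mclock
    # e.g. prioritise client I/O over recovery/backfill
    ceph config set osd osd_mclock_profile high_client_ops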