[ceph-users] Re: How to recover from an MDs rank in state 'failed'

2024-05-30 Thread Dhairya Parmar
Hi Noe, If the MDS has failed and you're sure there are no pending tasks or sessions associated with the failed MDS, you can try to make use of `ceph mds rmfailed`, but beware: make sure this MDS is really doing nothing and doesn't link to any file system, otherwise things can go wrong and can
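
A minimal sketch of the check-then-remove sequence hinted at above, assuming the affected filesystem is called cephfs and rank 0 is the failed one (both names hypothetical; the exact role syntax can vary between releases):

    # confirm which rank is actually marked failed before touching anything
    ceph fs status
    ceph fs dump | grep -i failed

    # only then remove the failed rank; requires explicit confirmation
    ceph mds rmfailed cephfs:0 --yes-i-really-mean-it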

[ceph-users] How to setup NVMeoF?

2024-05-30 Thread Robert Sander
Hi, I am trying to follow the documentation at https://docs.ceph.com/en/reef/rbd/nvmeof-target-configure/ to deploy an NVMe over Fabric service. Step 2b of the configuration section is currently the showstopper. First the command says: error: the following arguments are required: --host-nam
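
For context, a rough sketch of the gateway deployment step that precedes the failing CLI call, roughly following the Reef documentation referenced above; pool and host names are hypothetical:

    # create and initialise the pool the gateway will serve
    ceph osd pool create nvmeof_pool
    rbd pool init nvmeof_pool

    # deploy the nvmeof gateway service on a chosen host
    ceph orch apply nvmeof nvmeof_pool --placement="node1"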

[ceph-users] Re: How to setup NVMeoF?

2024-05-30 Thread Robert Sander
Hi, On 5/30/24 11:58, Robert Sander wrote: I am trying to follow the documentation at https://docs.ceph.com/en/reef/rbd/nvmeof-target-configure/ to deploy an NVMe over Fabric service. It looks like the cephadm orchestrator in this 18.2.2 cluster uses the image quay.io/ceph/nvmeof:0.0.2 whic
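
One way to see which image cephadm is actually using for the gateway daemons (a sketch; the config key is the one discussed later in this thread):

    # show the image the orchestrator will pull for nvmeof daemons
    ceph config get mgr mgr/cephadm/container_image_nvmeof

    # list the nvmeof daemons that are currently deployed
    ceph orch ps | grep nvmeof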

[ceph-users] Re: Missing ceph data

2024-05-30 Thread Eugen Block
Hi, I've never heard of automatic data deletion. Maybe just some snapshots were removed? Or someone deleted data on purpose because of the nearfull state of some OSDs? And there's no trash function for cephfs (for rbd there is). Do you use cephfs snapshots? Quoting Prabu GJ: Hi Team
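
A quick, non-destructive way to check the two hypotheses above (snapshots and plain deletion), assuming the filesystem is mounted at /mnt/cephfs (hypothetical path):

    # cephfs snapshots live in a hidden .snap directory inside each folder
    ls /mnt/cephfs/.snap

    # compare logical usage with what the pools report
    ceph df detail
    ceph fs status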

[ceph-users] Re: How to setup NVMeoF?

2024-05-30 Thread Frédéric Nass
Hello Robert, You could try: ceph config set mgr mgr/cephadm/container_image_nvmeof "quay.io/ceph/nvmeof:1.2.13" or whatever image tag you need (1.2.13 is currently the latest). Another way to run the image is by editing the unit.run file of the service or by directly running the container with pod
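
After changing the config key, the running gateway daemons still need to be redeployed to pick up the new image; a sketch, with nvmeof.rbd_pool.gw as a hypothetical service name:

    # point cephadm at the newer gateway image
    ceph config set mgr mgr/cephadm/container_image_nvmeof "quay.io/ceph/nvmeof:1.2.13"

    # redeploy the existing service so it picks up the new image
    ceph orch redeploy nvmeof.rbd_pool.gw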

[ceph-users] Re: How to setup NVMeoF?

2024-05-30 Thread John Mulligan
On Thursday, May 30, 2024 7:03:44 AM EDT Robert Sander wrote: > Hi, > > On 5/30/24 11:58, Robert Sander wrote: > > > > I am trying to follow the documentation at > > https://docs.ceph.com/en/reef/rbd/nvmeof-target-configure/ to deploy an > > NVMe over Fabric service. > > > It looks like the

[ceph-users] RBD-Images are not shown in the Dashboard: Failed to execute RBD [errno 19] error generating diff from snapshot None

2024-05-30 Thread Maximilian Dauer
Dear Community, I hope you can guide me to solve this error, or I can assist in solving a bug: RBD images are not shown in my Dashboard. - When accessing the dashboard page (block -> images) no images are listed and the error "Failed to execute RBD [errno 19] error generating diff from snapshot
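
To see whether the error comes from the rbd layer itself rather than the dashboard, the same diff can be attempted with the CLI against one of the affected images (pool, image and snapshot names hypothetical):

    # list snapshots of the image the dashboard chokes on
    rbd snap ls rbd_pool/affected_image

    # try the diff the dashboard performs; errno 19 here would point at rbd, not the dashboard
    rbd diff rbd_pool/affected_image
    rbd diff --from-snap some_snap rbd_pool/affected_image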

[ceph-users] Re: How to setup NVMeoF?

2024-05-30 Thread Dino Yancey
I've never used this feature, but I wanted to point out a difference between your command and the error message: gateway-name vs. gateway_name (dash versus underscore). On Thu, May 30, 2024 at 5:07 AM Robert Sander wrote: > Hi, > > I am trying to follow the documentation at > https://docs.ceph.com/en/reef/rbd/nvmeof

[ceph-users] Re: How to setup NVMeoF?

2024-05-30 Thread Robert Sander
Hi, On 5/30/24 14:18, Frédéric Nass wrote: ceph config set mgr mgr/cephadm/container_image_nvmeof "quay.io/ceph/nvmeof:1.2.13" Thanks for the hint. With that the orchestrator deploys the current container image. But: It suddenly listens on port 5499 instead of 5500 and: # podman run -it q
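
A quick way to confirm on the gateway host which port the daemon actually binds (standard tooling, no Ceph-specific assumptions):

    # show listening TCP sockets for the two candidate ports
    ss -tlnp | grep -E ':5499|:5500'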

[ceph-users] RBD Mirror - Failed to unlink peer

2024-05-30 Thread Scott Cairns
Hi, Following the introduction of an additional node to our Ceph cluster, we've started to see unlink errors when taking an RBD mirror snapshot. We've had RBD mirroring configured for over a year now and it's been working flawlessly; however, after we created OSDs on a new node we've been receiving t
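
When chasing unlink/peer errors like this, the mirror state of the pool and of a single image is usually the first thing to inspect; a sketch with hypothetical pool and image names:

    # overall mirroring health for the pool, including per-image details
    rbd mirror pool status rbd_pool --verbose

    # state and last snapshot timestamps for one image
    rbd mirror image status rbd_pool/some_image

    # mirror snapshots show up in the snapshot list with --all
    rbd snap ls --all rbd_pool/some_image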

[ceph-users] Re: How to setup NVMeoF?

2024-05-30 Thread Gregory Farnum
There's a major NVMe effort underway but it's not even merged to master yet, so I'm not sure how docs would have ended up in the Reef doc tree. :/ Zac, any idea? Can we pull this out? -Greg On Thu, May 30, 2024 at 7:03 AM Robert Sander wrote: > > Hi, > > On 5/30/24 14:18, Frédéric Nass wrote: >

[ceph-users] Re: MDS Abort druing FS scrub

2024-05-30 Thread Patrick Donnelly
On Fri, May 24, 2024 at 7:09 PM Malcolm Haak wrote: > > When running a cephfs scrub the MDS will crash with the following backtrace > > -1> 2024-05-25T09:00:23.028+1000 7ef2958006c0 -1 > /usr/src/debug/ceph/ceph-18.2.2/src/mds/MDSRank.cc: In function 'void > MDSRank::abort(std::string_view)' t
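
For reference, a sketch of the scrub invocations involved; the filesystem name cephfs and rank 0 are hypothetical:

    # start a recursive scrub from the filesystem root
    ceph tell mds.cephfs:0 scrub start / recursive

    # check progress / whether a scrub is still running
    ceph tell mds.cephfs:0 scrub status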

[ceph-users] Re: Help needed! First MDs crashing, then MONs. How to recover ?

2024-05-30 Thread Patrick Donnelly
On Tue, May 28, 2024 at 8:54 AM Noe P. wrote: > > Hi, > > we ran into a bigger problem today with our ceph cluster (Quincy, > Alma8.9). > We have 4 filesystems and a total of 6 MDs, the largest fs having > two ranks assigned (i.e. one standby). > > Since we often have the problem of MDs lagging be
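
When ranks get stuck like this, the state of every rank and standby can be read from the FSMap before attempting any recovery; a minimal sketch:

    # per-filesystem ranks, their state and the standby pool
    ceph fs status
    ceph health detail

    # full FSMap, including ranks marked as failed or damaged
    ceph fs dump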

[ceph-users] Re: CephFS HA: mgr finish mon failed to return metadata for mds

2024-05-30 Thread Patrick Donnelly
The fix was actually backported to v18.2.3. The tracker was wrong. On Wed, May 29, 2024 at 3:26 PM wrote: > > Hi, > > we have a stretched cluster (Reef 18.2.1) with 5 nodes (2 nodes on each side > + witness). You can see our daemon placement below. > > [admin] > ceph-admin01 labels="['_admin', 'm
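
The metadata the mgr fails to fetch in that bug can also be queried by hand, which helps confirm the fix after upgrading (the daemon name below is hypothetical):

    # metadata for all MDS daemons, or for a single named one
    ceph mds metadata
    ceph mds metadata ceph-node01.abcdef

    # confirm all daemons run the release that carries the backport
    ceph versions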

[ceph-users] Re: Ceph Reef v18.2.3 - release date?

2024-05-30 Thread Pierre Riteau
Hi Peter, The upcoming Reef minor release is delayed due to important bugs: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/FMFUZHKNFH4Z5DWS5BAYBPENHTNJCAYS/ On Wed, 29 May 2024 at 21:03, Peter Razumovsky wrote: > Hello! We're waiting for the brand new minor 18.2.3 due to > https://git

[ceph-users] Re: reef 18.2.3 QE validation status

2024-05-30 Thread Yuri Weinstein
I reran rados on the fix https://github.com/ceph/ceph/pull/57794/commits and am seeking approvals from Radek and Laure https://tracker.ceph.com/issues/65393#note-1 On Tue, May 28, 2024 at 2:12 PM Yuri Weinstein wrote: > > We have discovered some issues (#1 and #2) during the final stages of > testi