[ceph-users] Re: The snaptrim queue of PGs has not decreased for several days.

2024-08-21 Thread Giovanna Ratini

Hello Eugen,


Hi (please don't drop the ML from your responses),

Sorry. I didn't pay attention. I will.



All PGs of pool cephfs are affected and they are in all OSDs


then just pick a random one and check if anything stands out. I'm not 
sure if you mentioned it already, did you also try restarting OSDs?


Yes, I've done everything, including compaction, reducing defaults, and 
OSD restarts.


The growth seems to have stopped, but there hasn't been a decrease. It 
appears that only the CephFS pool is problematic. I'm an Oracle admin 
and I don't have much experience with Ceph, so perhaps my questions 
might seem a bit naive.


I have a lot of space in this cluster. Could I create a new cephfs pool 
(cephfs01) and copy the data over to it?
Then I would change the name of the pool in Rook and hope that the pods 
will find their PVs.


Regards,

Gio


Oh, not yesterday. I'll do it now, then I'll compact all OSDs with nosnaptrim set.
Should I add OSDs?


Let's wait for the other results first (compaction, reducing defaults, 
OSD restart). If that doesn't change anything, I would probably try to 
add three more OSDs. I assume you have three hosts?
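
For reference, a minimal sketch of how an OSD restart can be done in a Rook
cluster; the namespace and deployment names below are assumptions based on
Rook's defaults, so adjust them to your setup:

  # Rook runs each OSD as its own deployment (assumed default naming rook-ceph-osd-<ID>)
  kubectl -n rook-ceph rollout restart deployment rook-ceph-osd-0
  kubectl -n rook-ceph rollout status deployment rook-ceph-osd-0

  # let the cluster settle before touching the next OSD
  ceph -s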



Quoting Giovanna Ratini :


Hello Eugen,

On 20.08.2024 at 09:44, Eugen Block wrote:
 You could also look into the historic_ops of the primary OSD for 
one affected PG:


All PGs of pool cephfs are affected and they are in all OSDs :-(


Did you reduce the default values I mentioned?


Oh, not yesterday. I'll do it now, then I'll compact all OSDs with nosnaptrim set.

Should I add OSDs?

Regards,

Gio




ceph tell osd.<id> dump_historic_ops_by_duration

But I'm not sure if that can actually help here. There are plenty of 
places to look at, you could turn on debug logs on one primary OSD 
and inspect the output.
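
A rough sketch of what that could look like for a single PG (the PG id and
OSD id below are placeholders, not values from this cluster):

  # find the primary OSD of one affected PG
  ceph pg map <pgid>

  # dump the slowest recent ops on that primary OSD
  ceph tell osd.<id> dump_historic_ops_by_duration

  # temporarily raise the OSD debug level, then restore the default afterwards
  ceph tell osd.<id> config set debug_osd 10
  ceph tell osd.<id> config set debug_osd 1/5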


I just get the feeling that this is one of the corner cases with too 
few OSDs, although the cluster load seems to be low.


Quoting Giovanna Ratini :


Hello Eugen,

yesterday, after stopping and restarting snaptrim, the queues decreased a 
little and then got stuck.

They didn't grow and didn't decrease.

Is that good or bad?


On 19.08.2024 at 15:43, Eugen Block wrote:
There's a lengthy thread [0] where several approaches are 
proposed. The worst is an OSD recreation, but that's the last 
resort, of course.


What are the current values for these configs?

ceph config get osd osd_pg_max_concurrent_snap_trims
ceph config get osd osd_max_trimming_pgs

Maybe decrease them to 1 each while the nosnaptrim flag is set, then 
unset it. You could also try online (and/or offline) OSD compaction 
before unsetting the flag. Are the OSD processes utilizing an entire CPU?


[0] https://www.spinics.net/lists/ceph-users/msg75626.html
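
Put together, the sequence could look roughly like this (a sketch, not a
prescription; the offline compaction line assumes the default OSD data path
and a stopped OSD):

  ceph osd set nosnaptrim
  ceph config set osd osd_pg_max_concurrent_snap_trims 1
  ceph config set osd osd_max_trimming_pgs 1

  # online compaction of all OSDs
  ceph tell osd.* compact

  # optionally, offline compaction of a single stopped OSD
  # ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-<id> compact

  ceph osd unset nosnaptrim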

Quoting Giovanna Ratini :


Hello Eugen,

yes, the load is not too high for now.

I stopped snaptrim and now this is the output. No changes in the queue.

root@kube-master02:~# k ceph -s
Info: running 'ceph' command with args: [-s]
  cluster:
    id: 3a35629a-6129-4daf-9db6-36e0eda637c7
    health: HEALTH_WARN
    nosnaptrim flag(s) set
    32 pgs not deep-scrubbed in time
    32 pgs not scrubbed in time

  services:
    mon: 3 daemons, quorum bx,bz,ca (age 30h)
    mgr: a(active, since 29h), standbys: b
    mds: 1/1 daemons up, 1 hot standby
    osd: 6 osds: 6 up (since 21h), 6 in (since 6d)
         flags nosnaptrim

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 4.21M objects, 2.5 TiB
    usage:   7.7 TiB used, 76 TiB / 84 TiB avail
    pgs: 65 active+clean
         32 active+clean+snaptrim_wait

  io:
    client:   7.4 MiB/s rd, 7.9 MiB/s wr, 11 op/s rd, 35 op/s wr

On 19.08.2024 at 14:54, Eugen Block wrote:

What happens when you disable snaptrimming entirely?

ceph osd set nosnaptrim

So the load on your cluster seems low, but are the OSDs heavily 
utilized? Have you checked iostat?
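
For example (assuming sysstat is installed on the OSD hosts):

  # per-device utilization and latency on an OSD host
  iostat -x 2 5

  # commit/apply latency as reported by the OSDs themselves
  ceph osd perf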


Quoting Giovanna Ratini :


Hello Eugen,

root@kube-master02:~# k ceph -s

Info: running 'ceph' command with args: [-s]
  cluster:
    id: 3a35629a-6129-4daf-9db6-36e0eda637c7
    health: HEALTH_WARN
    32 pgs not deep-scrubbed in time
    32 pgs not scrubbed in time

  services:
    mon: 3 daemons, quorum bx,bz,ca (age 13h)
    mgr: a(active, since 13h), standbys: b
    mds: 1/1 daemons up, 1 hot standby
    osd: 6 osds: 6 up (since 5h), 6 in (since 5d)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 4.20M objects, 2.5 TiB
    usage:   7.7 TiB used, 76 TiB / 84 TiB avail
    pgs: 65 active+clean
         20 active+clean+snaptrim_wait
         12 active+clean+snaptrim

  io:
    client:   3.5 MiB/s rd, 3.6 MiB/s wr, 6 op/s rd, 12 op/s wr

If I understand the documentation correctly, I will never have 
a scrub unless the PGs (Placement Groups) are active and clean.


All 32 PGs of the CephFS pool have been in this status for 
several days:


 * 20 active+clean+snaptrim_wait
 * 12 active+clean+snaptrim

Today, I restarted the MON, MGR, and MDS, but nothing changed.
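
One way to see whether the trim queues are moving at all is to watch the
per-PG queue lengths; the state and field names below are as in recent Ceph
releases, so adjust if your version differs:

  # PGs currently in a snaptrim state
  ceph pg ls snaptrim snaptrim_wait

  # per-PG snapshot trim queue length
  ceph pg dump pgs -f json | jq '.pg_stats[] | {pgid, snaptrimq_len}'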

[ceph-users] Re: Unable to recover cluster, error: unable to read magic from mon data

2024-08-21 Thread Eugen Block

Hi,

Is there any command-line history available to get at least some sort  
of history of events?

Are all MONs down or has one survived?
Could he have tried to change IP addresses or something? There's an  
old blog post [0] explaining how to clean up. And here's some more 
reading [1] on how to modify a monmap in a cephadm-managed cluster.

I assume none of the ceph commands work, can you confirm?

[0] https://ceph.io/en/news/blog/2015/ceph-monitor-troubleshooting/
[1]  
https://docs.ceph.com/en/latest/rados/operations/add-or-rm-mons/#rados-mon-remove-from-unhealthy
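
If it does come to inspecting or editing the monmap, the first steps on a
surviving (stopped) mon in a cephadm setup might look roughly like this;
names and paths are placeholders, and [1] above has the full procedure:

  # open a shell in the mon's container environment
  cephadm shell --name mon.<id>

  # extract and inspect the current monmap
  ceph-mon -i <id> --extract-monmap /tmp/monmap
  monmaptool --print /tmp/monmap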


Quoting RIT Computer Science House :


Hello,

Our cluster has become unresponsive after a teammate's work on the cluster.
We are currently unable to get the full story on what he did to fully
understand what is going on, and the only error we are able to see in any
of the logs is the following:
2024-08-20T03:12:34.183+ 7f3670246b80 -1 unable to read magic from mon
data

Any help would be greatly appreciated!
I can provide any information necessary to help debugging.

Thanks,
Tyler
System Administrator @ Computer Science House


[ceph-users] Re: Ceph XFS deadlock with Rook

2024-08-21 Thread Satoru Takeuchi
On Wed, Aug 14, 2024 at 8:23 Raphaël Ducom  wrote:
>
> Hi
>
> I'm reaching out to check on the status of the XFS deadlock issue with RBD
> in hyperconverged environments, as detailed in Ceph tracker issue #43910 (
> https://tracker.ceph.com/issues/43910?tab=history). It looks like there
> hasn’t been much activity on this for a while, and I'm wondering if there
> have been any updates or if it’s just been lost in the issue volume.
>
> The issue is a deadlock when using XFS, leading Rook to recommend using
> EXT4 instead.
> However, kernel 5.6 introduced PR_SET_IO_FLUSHER, which should allow Ceph
> to handle this scenario better. Versions lacking PR_SET_IO_FLUSHER are
> either already EOL or soon will be, with the exception of kernel 5.4.
>
> While EXT4 works according to the Ceph documentation, the limitations with
> XATTR on EXT4 are still a concern.
> It’s a bit unfortunate that Rook has to move away from XFS where Ceph
> recommends it.
> The issue on the Rook side :
> https://github.com/rook/rook/issues/3132#issuecomment-580508760
>
> Could anyone provide an update on this Ceph issue or suggest how we might
> push it forward?
> Any insights would be really appreciated, as this impacts the broader use
> of XFS in hyperconverged Ceph setups deployed with Rook.

I've also encountered this issue. My team hasn't been able to use XFS for
RBD due to this problem.

I hope that someone will fix this problem, in other words, call
`prctl(PR_SET_IO_FLUSHER)` somewhere in the Ceph daemons' initialization
code. Or if someone can point me to the appropriate place to insert this
call, I'll create a patch.
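
As a quick sanity check that a given node can use it at all (assuming the
kernel UAPI headers are installed):

  # PR_SET_IO_FLUSHER needs kernel >= 5.6
  uname -r
  grep -n PR_SET_IO_FLUSHER /usr/include/linux/prctl.h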

Best,
Satoru


[ceph-users] Re: Pull failed on cluster upgrade

2024-08-21 Thread Nicola Mori
In the end I built an image based on Ubuntu 22.04, which does not 
mandate x86-64-v2. I installed the official Ceph packages and hacked 
here and there (e.g. it was necessary to set the uid and gid of the Ceph 
user and group identical to those used by the CentOS Stream 8 image, to 
avoid messing with filesystem permissions). Now I'm upgrading the 
cluster and hopefully it will go well.
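
For anyone attempting the same, the key steps inside such an image build
might look roughly like this; the uid/gid value 167 is an assumption based
on what the stock CentOS-based Ceph images use, so check `id ceph` inside
your currently deployed image first:

  # inside the Ubuntu 22.04 image build, after adding the Ceph apt repository
  apt-get update && apt-get install -y ceph

  # re-align the ceph user/group ids with the old image (assumed 167) and fix ownership
  groupmod -g 167 ceph
  usermod -u 167 -g 167 ceph
  chown -R ceph:ceph /var/lib/ceph /var/log/ceph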


I hope that this way, if Red Hat continues to provide Ceph packages 
for Ubuntu, I will be able to upgrade to Ceph 19 and beyond, since 
Ubuntu 24.04 also does not require x86-64-v2. I'd have to figure out how 
to upgrade cephadm on the hosts (which run Rocky Linux 8) now that the 
official build is discontinued, but I'll cope with that later.


Thanks to everybody for the help; if anyone is interested I can share 
the Dockerfile.

Cheers,

Nicola




[ceph-users] Re: Pull failed on cluster upgrade

2024-08-21 Thread Nicola Mori

The upgrade ended successfully, but now the cluster reports this error:

  MDS_CLIENTS_BROKEN_ROOTSQUASH: 1 MDS report clients with broken 
root_squash implementation


From what I understood, this is due to a new feature meant to fix a bug 
in the root_squash implementation that will be released with version 19. 
I didn't find anything about a backport to 18.2.4. Can someone share 
some info, please? Especially about whether and how it can be fixed.
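
To see which clients the MDS is flagging, something like the following
might help (the glob form of ceph tell should work on recent releases):

  ceph health detail

  # list connected clients with their metadata (kernel / client version)
  ceph tell mds.* client ls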

Thanks in advance,

Nicola




[ceph-users] Re: Unable to recover cluster, error: unable to read magic from mon data

2024-08-21 Thread RIT Computer Science House
One mon survived (it took us a while to find it since it was in a damaged
state), and we have since been able to create a new second mon where an old
mon was; quorum has been re-established. We are not able to use `ceph
orch` to deploy new mons now, though; it is giving us a keyring error.

The ceph commands were not working while we had only the single damaged mon,
but now most ceph commands function normally with two mons.
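
Since `ceph orch` is involved, the actual keyring error is often easier to
read from the cephadm log on the active mgr; a couple of places to look,
assuming a cephadm-managed cluster:

  ceph health detail
  ceph log last 100 info cephadm

  # the key cephadm uses when deploying mons
  ceph auth get mon.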

On Wed, Aug 21, 2024 at 6:27 AM Eugen Block  wrote:

> Hi,
>
> Is there any command-line history available to get at least some sort
> of history of events?
> Are all MONs down or has one survived?
> Could he have tried to change IP addresses or something? There's an
> old blog post [0] explaining how to clean up. And here's some more
> reading [1] how to modify a monmap in a cephadm managed cluster.
> I assume none of the ceph commands work, can you confirm?
>
> [0] https://ceph.io/en/news/blog/2015/ceph-monitor-troubleshooting/
> [1]
>
> https://docs.ceph.com/en/latest/rados/operations/add-or-rm-mons/#rados-mon-remove-from-unhealthy
>
> Quoting RIT Computer Science House :
>
> > Hello,
> >
> > Our cluster has become unresponsive after a teammate's work on the
> cluster.
> > We are currently unable to get the full story on what he did to fully
> > understand what is going on, and the only error we are able to see in any
> > of the logs is the following:
> > 2024-08-20T03:12:34.183+ 7f3670246b80 -1 unable to read magic from
> mon
> > data
> >
> > Any help would be greatly appreciated!
> > I can provide any information necessary to help debugging.
> >
> > Thanks,
> > Tyler
> > System Administrator @ Computer Science House


[ceph-users] Re: Pull failed on cluster upgrade

2024-08-21 Thread Frédéric Nass
Hi Nicola,

You might want to post in the ceph-dev list about this or discuss it with devs 
in the ceph-devel slack channel for quicker help.

Bests,
Frédéric.


From: Nicola Mori 
Sent: Wednesday, August 21, 2024 15:52
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Pull failed on cluster upgrade

The upgrade ended successfully, but now the cluster reports this error: 

   MDS_CLIENTS_BROKEN_ROOTSQUASH: 1 MDS report clients with broken 
root_squash implementation 

From what I understood, this is due to a new feature meant to fix a bug 
in the root_squash implementation that will be released with version 19. 
I didn't find anything about a backport to 18.2.4. Can someone share 
some info, please? Especially about whether and how it can be fixed. 
Thanks in advance, 

Nicola