[ceph-users] Continuous spurious repairs without cause?

2023-09-05 Thread Christian Theune
Hi,

this is a bit older cluster (Nautilus, bluestore only).

We’ve noticed that the cluster is almost continuously repairing PGs. However, 
they all finish successfully with “0 fixed”. We cannot see what triggers Ceph 
to repair the PGs, and it’s happening for a lot of PGs, not any specific 
individual one.

Deep-scrubs are generally running, but currently a bit late as we had some 
recoveries in the last week.

Logs look regular aside from the number of repairs. Here are the last few weeks 
from the perspective of a single PG. There’s one repair, but the same thing seems 
to happen for all PGs.

2023-08-06 16:08:17.870 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-06 16:08:18.270 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-07 21:52:22.299 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-07 21:52:22.711 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-09 00:33:42.587 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-09 00:33:43.049 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-10 09:36:00.590 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 deep-scrub starts
2023-08-10 09:36:28.811 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 deep-scrub ok
2023-08-11 12:59:14.219 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-11 12:59:14.567 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-12 13:52:44.073 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-12 13:52:44.483 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-14 01:51:04.774 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 deep-scrub starts
2023-08-14 01:51:33.113 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 deep-scrub ok
2023-08-15 05:18:16.093 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-15 05:18:16.520 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-16 09:47:38.520 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-16 09:47:38.930 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-17 19:25:45.352 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-17 19:25:45.775 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-19 05:40:43.663 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-19 05:40:44.073 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-20 12:06:54.343 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-20 12:06:54.809 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-21 19:23:10.801 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 deep-scrub starts
2023-08-21 19:23:39.936 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 deep-scrub ok
2023-08-23 03:43:21.391 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-23 03:43:21.844 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-24 04:21:17.004 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 deep-scrub starts
2023-08-24 04:21:47.972 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 deep-scrub ok
2023-08-25 06:55:13.588 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-25 06:55:14.087 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-26 09:26:01.174 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-26 09:26:01.561 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-27 11:18:10.828 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-27 11:18:11.264 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-28 19:05:42.104 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-28 19:05:42.693 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-30 07:03:10.327 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-08-30 07:03:10.805 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-08-31 14:43:23.849 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 deep-scrub starts
2023-08-31 14:43:50.723 7fc49b1de640  0 log_channel(cluster) log [DBG] : 
278.2f3 deep-scrub ok
2023-09-01 20:53:42.749 7f37ca268640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-09-01 20:53:43.389 7f37c6260640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-09-02 22:57:49.542 7f37ca268640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub starts
2023-09-02 22:57:50.065 7f37c6260640  0 log_channel(cluster) log [DBG] : 
278.2f3 scrub ok
2023-09-04 03:16:14.754 7f37ca268640  0 log_channel(cluster) lo

[ceph-users] Re: Continuous spurious repairs without cause?

2023-09-05 Thread Eugen Block

Hi,

it sounds like you have auto-repair enabled (osd_scrub_auto_repair). I  
guess you could disable that to see what's going on with the PGs and  
their replicas. And/or you could enable debug logs. Are all daemons  
running the same ceph (minor) version? I remember a customer case  
where different ceph minor versions (but overall Octopus) caused  
damaged PGs; a repair fixed them every time. After they updated all  
daemons to the same minor version, those errors were gone.
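
For reference, something along these lines should show and toggle that setting  
and compare daemon versions (an untested sketch; please double-check the  
commands against your Nautilus release):

  # check whether automatic repair after scrub errors is enabled
  ceph config get osd osd_scrub_auto_repair
  # disable it temporarily to see plain scrub results instead of auto-repairs
  ceph config set osd osd_scrub_auto_repair false
  # confirm that all daemons run the same version
  ceph versions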


Regards,
Eugen

Zitat von Christian Theune :


[... quoted original message and scrub log snipped ...]

[ceph-users] Re: Continuous spurious repairs without cause?

2023-09-05 Thread Manuel Lausch
Hi,

in older versions of Ceph with the auto-repair feature, the PG state of
scrubbing PGs always included the repair state as well.
In later versions (I don't know exactly which version) Ceph
differentiates scrubbing and repair in the PG state again.

I think as long as no errors are logged, all should be fine. If
you disable auto-repair, the issue should disappear as well. In case of
scrub errors you will then see the appropriate states.
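
If it helps, a quick illustrative way to watch this is to look at the PG
states directly (exact state names can differ between releases):

  # list PGs whose state currently includes scrubbing or repair
  ceph pg ls | egrep 'repair|scrubbing'
  # or inspect a single PG, e.g. the one from the log above
  ceph pg 278.2f3 query | grep -w state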

Regards
Manuel

On Tue, 05 Sep 2023 14:14:56 +
Eugen Block  wrote:

> Hi,
> 
> it sounds like you have auto-repair enabled (osd_scrub_auto_repair). I  
> guess you could disable that to see what's going on with the PGs and  
> their replicas. And/or you could enable debug logs. Are all daemons  
> running the same ceph (minor) version? I remember a customer case  
> where different ceph minor versions (but overall Octopus) caused  
> damaged PGs, a repair fixed them everytime. After they updated all  
> daemons to the same minor version those errors were gone.
> 
> Regards,
> Eugen
> 
> Zitat von Christian Theune :
> 
> > [... original message and scrub log snipped ...]

[ceph-users] Re: RGW Lua - writable response header/field

2023-09-05 Thread Yuval Lifshitz
Hi Ondřej,
As you said, you can't add a new header to the response, but maybe you can
override one of the existing response fields, e.g. Request.Response.Message?
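
As a rough, untested sketch (the file name and the postRequest context are just
examples), that could look like:

  cat > rgwop.lua <<'EOF'
  -- copy the RGW op name into the response Message field
  Request.Response.Message = Request.RGWOp
  EOF
  radosgw-admin script put --infile=rgwop.lua --context=postRequest

Whether and where Message actually surfaces in the HTTP response for successful
requests is something to verify.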

let me know if that works for you?

Yuval



On Mon, Sep 4, 2023 at 1:33 PM Ondřej Kukla  wrote:

> Hello,
>
> We have an RGW setup with a bunch of Nginx instances in front of the RGWs to
> work as an LB. I’m currently working on some metrics and log analysis from the
> LB logs.
>
> At the moment I’m looking at possibilities to recognise the type of S3
> request on the LB. I know that matching the format shouldn’t be extremely
> hard, but I was looking into a possibility to extract the information from
> RGW, as that’s the part that’s aware of it.
>
> I was working with the Lua part of RGW before, so I know that the
> Request.RGWOp field is a great fit.
>
>
> I would like to add this as some kind of response header, but
> unfortunately that’s not possible at the moment, if I’m not wrong.
>
> Has anyone looked into this (wink wink Yuval :))? Or do you have a
> recommendation how to do it?
>
> Thanks a lot.
>
> Regards,
>
> Ondrej


[ceph-users] Re: lack of RGW_API_HOST in ceph dashboard, 17.2.6, causes ceph mgr dashboard problems

2023-09-05 Thread Christopher Durham
 I solved this.
It has multiple layers.
1. RGW_API_HOST is no longer available in 17.2.6 as a configuration option for 
the ceph mgr. (I was wrong below when I said it could be queried on an 
*upgraded* host with:
# ceph dashboard get-rgw-api-host
You *can* still see it with:
# ceph config dump | grep mgr | grep dashboard
but because the mgr doesn't use it, its existence in the config doesn't matter.)
My setup uses a DNS record that points to multiple IP addresses for the rgw 
servers. This record is s3.my.dom, and these IPs are VIPs controlled by 
keepalived and shared amongst the radosgw servers, so that if one of the radosgw 
servers dies, one of the others will answer for it.
Each radosgw server also has a name of its own, rgw1.my.dom, rgw2.my.dom, 
rgw3.my.dom; in DNS each of these points to the IP of the actual server (not 
the VIPs above).
In 17.2.6, I *can* do this:

$ aws --endpoint https://s3.my.dom s3 ls s3://mybucket

But, in 17.2.6, I can no longer do:
$ aws --endpoint https://rgw1.my.dom s3 ls s3://mybucket

It returns a "NoSuchBucket" error with a 403 error.
This looks to be the same error that the ceph mgr is returning in the GUI, as 
it tries to query the individual radosgw server name, NOT s3.my.dom as it did 
before the upgrade to 17.2.6 (it was using RGW_API_HOST, which I have set to 
s3.my.dom).
In the radosgw zonegroup (radosgw-admin zonegroup get) I have:

    "endpoints": ["https://s3.my.dom"],
    "hostnames": ["s3.my.dom"]

and in the individual zone definition I have:

    "endpoints": ["https://s3.my.dom"]
The code in rgw_rest.cc has this section:

    // If the Host header is not recognized as a configured rgw hostname, is not
    // an IP address, and validates as a bucket name, assume virtual-hosted-style
    // (bucket-in-hostname) addressing.
    if (subdomain.empty()
        && (domain.empty() || domain != info.host)
        && !looks_like_ip_address(info.host.c_str())
        && RGWHandler_REST::validate_bucket_name(info.host) == 0
        && !(hostnames_set.empty() && hostnames_s3website_set.empty())) {
      subdomain.append(info.host);
      in_hosted_domain = 1;
    }

and later this:

    // In that case the presumed bucket name (taken from the Host header) is
    // prepended to the request URI, which is what changes the bucket name.
    if (in_hosted_domain && !subdomain.empty()) {
      string encoded_bucket = "/";
      encoded_bucket.append(subdomain);
      if (s->info.request_uri[0] != '/')
        encoded_bucket.append("/");
      encoded_bucket.append(s->info.request_uri);
      s->info.request_uri = encoded_bucket;
    }


In my situation, this ends up with the bucket name being changed, resulting in 
the NoSuchBucket error.
The way to fix this is to have "rgw dns name" set separately for each rgw host 
in ceph.conf and to restart the radosgw servers.
That way, the rgw hostname is set and the bucket name does not get changed. It 
appears that in my setup, without "rgw dns name" set, along with the other 
settings I have, radosgw ASSUMES? we are using the newer domain-style bucket 
names (we are not), and that the radosgw servers no longer recognize (with my 
settings) their own server name as valid. Adding "rgw dns name" for each server 
to ceph.conf allows the aws cli to work with --endpoint set to the individual 
server name and the ceph-mgr to work when querying radosgw information.
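
For reference, a sketch of what that could look like in ceph.conf (the section 
names depend on how the radosgw instances are named in your deployment; the 
hostnames follow the example names above):

  [client.rgw.rgw1]
  rgw dns name = rgw1.my.dom

  [client.rgw.rgw2]
  rgw dns name = rgw2.my.dom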

In pacific 16.x.y this was not a problem.
-Chris




On Wednesday, August 30, 2023 at 05:08:36 AM MDT, Eugen Block 
 wrote:  
 
 Hi,

there have been multiple discussions on this list without any  
satisfying solution for all possible configurations. One of the  
changes [1] made in Pacific was to use the hostname instead of the IP,  
but it only uses the shortname (you can check the "hostname" in  
'ceph service dump' output). But this seems to only impact the  
dashboard access if you have ssl-verify set to true. I'm still waiting  
for a solution as well for a customer cluster which uses wildcard  
certificates only; until then we leave ssl-verify disabled. But I  
didn't check the tracker for any pending tickets, so someone might be  
working on it.
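
For reference, that dashboard setting can be changed with something like the  
following (please verify against your version):

  ceph dashboard set-rgw-api-ssl-verify False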

Regards,
Eugen

[1] https://github.com/ceph/ceph/pull/47207/files

Zitat von Christopher Durham :

> Hi,
> I am using 17.2.6 on Rocky Linux 8
> The ceph mgr dashboard, in my situation, (bare metal install,  
> upgraded from 15->16-> 17.2.6), can no longer hit the  
> ObjectStore->(Daemons,Users,Buckets) pages.
>
> When I try to hit those pages, it gives an error:
> RGW REST API failed request with status code 403 {"Code":  
> "AccessDenied", RequestId: "xxx", HostId: "-"}
>
> The log of the rgw server it hit has:
>
> "GET /admin/metadata/user?myself HTTP/1.1" 403 125
>
> It appears that the mgr dashboard setting RGW_API_HOST is no longer  
> an option that can be set, nor does that name exist anywhere under  
> /usr/share/ceph/mgr/dashboard, and:
>
> # ceph dashboard set-rgw-api-host 
>
> is no longer in existence in 17.2.6
>
> However, since my situation is an upgrade, the config value still  
> exists in my config, and I can retrieve it with:
>
> # ceph dashboard get-rgw-api-host
>
> To get the  to work in my situation, I have modified  
> /usr/share/ceph/mgr/dashboard/settings.py and re-added RGW_API_HOST  
> to the Options class using
>
> R

[ceph-users] insufficient space ( 10 extents) on vgs lvm detected locked

2023-09-05 Thread absankar89
ceph orch device ls output shows: "Insufficient space (<10 extents) on vgs, LVM 
detected, locked". This is on Quincy. Is this just a warning, or should any 
action be taken?
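
A sketch of how one might check further on the affected host (assuming the  
cephadm orchestrator and standard LVM tooling; adapt as needed):

  # show the full device report, including the recorded rejection reasons
  ceph orch device ls --format json-pretty
  # free space and free extent count per volume group on the host
  vgs -o vg_name,vg_size,vg_free,vg_free_count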


[ceph-users] Re: ceph-dashboard python warning with new pyo3 0.17 lib (debian12)

2023-09-05 Thread Max Carrara
Hello there,

could you perhaps provide some more information on how (or where) this
got fixed? It doesn't seem to be fixed yet on the latest Ceph Quincy
and Reef versions, but maybe I'm mistaken. I've provided some more
context regarding this below, in case that helps.


On Ceph Quincy 17.2.6 I'm encountering the following error when trying
to enable the dashboard (so, the same error that was posted above):

  root@ceph-01:~# ceph --version
  ceph version 17.2.6 (810db68029296377607028a6c6da1ec06f5a2b27) quincy (stable)

  root@ceph-01:~#  ceph mgr module enable dashboard
  Error ENOENT: module 'dashboard' reports that it cannot run on the active 
manager daemon: PyO3 modules may only be initialized once per interpreter 
process (pass --force to force enablement)

I was then able to find this Python traceback in the systemd journal:

  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.438+0200 7fecdc91e000 -1 mgr[py] Traceback (most recent call last):
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/__init__.py", line 60, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:     from .module import Module, StandbyModule  # noqa: F401
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:     ^
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/module.py", line 30, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:     from .controllers import Router, json_error_page
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/controllers/__init__.py", line 1, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:     from ._api_router import APIRouter
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/controllers/_api_router.py", line 1, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:     from ._router import Router
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/controllers/_router.py", line 7, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:     from ._base_controller import BaseController
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/controllers/_base_controller.py", line 11, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:     from ..services.auth import AuthManager, JwtManager
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/services/auth.py", line 12, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:     import jwt
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/jwt/__init__.py", line 1, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:     from .api_jwk import PyJWK, PyJWKSet
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/jwt/api_jwk.py", line 6, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:     from .algorithms import get_default_algorithms
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/jwt/algorithms.py", line 6, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:     from .utils import (
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/jwt/utils.py", line 7, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:     from cryptography.hazmat.primitives.asymmetric.ec import EllipticCurve
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/cryptography/hazmat/primitives/asymmetric/ec.py", line 11, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:     from cryptography.hazmat._oid import ObjectIdentifier
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/cryptography/hazmat/_oid.py", line 7, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:     from cryptography.hazmat.bindings._rust import (
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: ImportError: PyO3 modules may only be initialized once per interpreter process
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.438+0200 7fecdc91e000 -1 mgr[py] Class not found in module 'dashboard'
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.438+0200 7fecdc91e000 -1 mgr[py] Error loading module 'dashboard': (2) No such file or directory
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.470+0200 7fecdc91e000 -1 mgr[py] Module progress has missing NOTIFY_TYPES member
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.502+0200 7fecdc91e000 -1 mgr[py] Module iostat has missing NOTIFY_TYPES member
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.502+0200 7fecdc91e000 -1 log_channel(cluster) log [ERR] : Failed to load ceph-mgr modules: dashboard


As the traceback above reveals, the dashboard uses `PyJWT`, which in
turn uses `cryptography`, and `cryptography` uses `PyO3`.

That led me to an issue[0] regarding this on `cryptography`'s side;
the Ceph Dashboard is apparently not the only thing that's affected
by this.
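
In case it is useful, a quick way to check which distro packages and versions
are involved (package names here are my assumption for Debian 12):

  dpkg -l python3-jwt python3-cryptography
  python3 -c 'import cryptography; print(cryptography.__version__)'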

As it turns out, the maintainer of the Ceph AUR package has also
recently stumbled a