[ceph-users] PG increase / data movement fine tuning

2023-02-06 Thread Szabo, Istvan (Agoda)
Hi,

I've increased the placement group count in my Octopus cluster, first on the
index pool, and it caused almost 2.5 hours of bad performance for the users. I'm
planning to increase the data pool as well, but first I'd like to know whether
there is any way to make it smoother.

At the moment I have these values:

osd_max_backfills = 1
osd_recovery_max_active = 1
osd_recovery_op_priority = 1

But it seems like this still generates slow ops.

Should I turn off scrubbing, or is there another way to make it even smoother?


Some information about the setup:

  *   I have 9 nodes, each of which has 2x NVMe drives with 4 OSDs on them, and
this is where the index pool lives.
  *   The index pool currently has 2048 PGs.
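
A few knobs are commonly used to soften the impact of a PG increase. This is only a sketch (option names as of Nautilus/Octopus; verify against your version before applying anything):

```
# Pause scrubbing while the PG increase is in progress:
ceph osd set noscrub
ceph osd set nodeep-scrub

# Since Nautilus, pg_num increases are applied gradually; this mgr option
# caps how much data may be misplaced at any one time (default 0.05 = 5%):
ceph config set mgr target_max_misplaced_ratio 0.01

# Optionally add a small sleep between recovery ops on flash OSDs:
ceph config set osd osd_recovery_sleep_ssd 0.1

# Re-enable scrubbing afterwards:
ceph osd unset noscrub
ceph osd unset nodeep-scrub
```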

Thank you


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Removing Rados Gateway in ceph cluster

2023-02-06 Thread Michel Niyoyita
Hello team,

I have a Ceph cluster deployed using ceph-ansible, running on Ubuntu 20.04,
which has 6 hosts: 3 hosts for OSDs and 3 hosts used as monitors and
managers. I have deployed RGW on all of those hosts and RGWLOADBALANCER on
top of them. For testing purposes, I switched off one OSD to check whether
the rest could keep working. The test went well as expected; unfortunately,
after bringing the OSD back up, the RGW failed to connect through the
dashboard. Below is the message:

The Object Gateway Service is not configured. Error connecting to Object
Gateway. Please consult the documentation on how to configure and enable
the Object Gateway management functionality.

I would like to ask how to solve that issue, or how I can proceed to
completely remove RGW and redeploy it afterwards.


root@ceph-mon1:~# ceph -s
  cluster:
id: cb0caedc-eb5b-42d1-a34f-96facfda8c27
health: HEALTH_OK

  services:
mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 72m)
mgr: ceph-mon2(active, since 71m), standbys: ceph-mon3, ceph-mon1
osd: 48 osds: 48 up (since 79m), 48 in (since 3d)
rgw: 6 daemons active (6 hosts, 1 zones)

  data:
pools:   9 pools, 257 pgs
objects: 59.49k objects, 314 GiB
usage:   85 TiB used, 348 TiB / 433 TiB avail
pgs: 257 active+clean

  io:
client:   2.0 KiB/s wr, 0 op/s rd, 0 op/s wr

Kindly help

Best Regards

Michel


[ceph-users] Re: Inconsistency in rados ls

2023-02-06 Thread Eugen Block

Hi,

did you check if your cluster has many "shadow" or "multipart" objects
in the pool? Those are taken into account when calculating the total
number of objects in a pool, but they do not appear in the radosgw user
stats. Here's an example of a small rgw setup:


rados -p  ls | grep -vE "shadow|multipart" | wc -l
455

and it matches the number of files reported by user stats:

radosgw-admin user stats --uid  2>/dev/null | jq -r '.stats |  
select(.num_objects > 0) | .num_objects'

455

Although the total number of objects is much higher because of the  
mentioned shadow objects:


rados -p  ls | grep -cE "shadow|multipart"
2248

And the total number (2248 + 455) matches the stats from ceph (2.70k):

ceph df | grep 
598   7.7 GiB   2.70k   21 GiB   0.57   1.4 TiB

Regards,
Eugen

Zitat von Ramin Najjarbashi :


On Thu, Feb 2, 2023 at 7:56 PM Eugen Block  wrote:


Hi,

> I have a cluster with approximately one billion objects and when I run a
PG
> query, it shows that I have 27,000 objects per PG.

which query is that, can you provide more details about that cluster and
pool?



Thanks for your response

ceph df | grep mypoo

--- POOLS ---

POOL OBJECTS

mypool   1.11G

---

and from this, I got 8.8M objects:

for item in `radosgw-admin user list | jq -r ".[]" | head`; do
B_OBJ=$(radosgw-admin user stats --uid $item 2>/dev/null | jq -r '.stats |
select(.num_objects > 0) | .num_objects'); SUM=$((SUM + B_OBJ)); done
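
One thing worth noting about the loop above: the `head` after `radosgw-admin user list` limits the sum to the first ten users only, which by itself could explain a low total. A hedged, untested variant without that limit (same jq expression as above):

```
SUM=0
for item in $(radosgw-admin user list | jq -r ".[]"); do
  B_OBJ=$(radosgw-admin user stats --uid "$item" 2>/dev/null \
    | jq -r '.stats | select(.num_objects > 0) | .num_objects')
  # guard against users with no stats at all
  SUM=$((SUM + ${B_OBJ:-0}))
done
echo "$SUM"
```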



> However, when I run the same command per pg, the results are much
> less, with only 20 million objects being reported. For example,
> "rados -p  --pgid 1.xx ls | wc -l" shows only three objects in the
> specified PG.

It seems like your PG and object distribution might not be balanced
very well. Did you check each PG of that pool? The individual PG's
numbers should add up to the total number of objects. Here's a quick
example from an almost empty pool with 8 PGs:

storage01:~ # ceph pg ls-by-pool volumes | awk '{print $1" - "$2}'
PG - OBJECTS
3.0 - 2
3.1 - 0
3.2 - 2
3.3 - 1
3.4 - 3
3.5 - 0
3.6 - 2
3.7 - 0



```
~# ceph pg ls-by-pool mypool  | awk '{print $1" - "$2}'
...
9.ff2 - 271268
9.ff3 - 271046
9.ff4 - 271944
9.ff5 - 270864
9.ff6 - 272122
9.ff7 - 272244
9.ff8 - 271638
9.ff9 - 271702
9.ffa - 270906
9.ffb - 271114
9.ffc - 271986
9.ffd - 270766
9.ffe - 271702
9.fff - 271693
...
```



and it sums up to 10 objects, which matches the total stats:

storage01:~ # ceph df | grep -E "OBJECTS|volumes"
POOL      ID  PGS  STORED  OBJECTS     USED  %USED  MAX AVAIL
volumes    3    8   379 B       10  960 KiB      0    7.5 GiB
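
For larger pools, the per-PG numbers can be summed directly instead of adding them by hand. A small sketch (the heredoc replays the 8-PG sample above; on a live cluster, feed it `ceph pg ls-by-pool <pool>` output instead):

```shell
# Sum the OBJECTS column, skipping the header line.
total=$(awk 'NR > 1 { sum += $2 } END { print sum + 0 }' <<'EOF'
PG OBJECTS
3.0 2
3.1 0
3.2 2
3.3 1
3.4 3
3.5 0
3.6 2
3.7 0
EOF
)
echo "$total"
```

If the per-PG sum and the pool total diverge, that points at uncounted objects (e.g. shadow/multipart) rather than data loss.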

Regards,
Eugen

Zitat von Ramin Najjarbashi :

> Hi
> I hope this email finds you well. I am reaching out to you because I have
> encountered an issue with my CEPH Bluestore cluster and I am seeking your
> assistance.
> I have a cluster with approximately one billion objects and when I run a
PG
> query, it shows that I have 27,000 objects per PG.
> I have run the following command: "rados -p  ls | wc -l" which
> returns the correct number of one billion objects. However, when I run
the
> same command per pg, the results are much less, with only 20 million
> objects being reported. For example, "rados -p  --pgid 1.xx ls |
wc
> -l" shows only three objects in the specified PG.
> This is a significant discrepancy and I am concerned about the integrity
of
> my data.
> Do you have any idea about this discrepancy?
>
> p.s:
> I have a total of 30 million objects in a single bucket and versioning
has
> not been enabled for this particular bucket.
>
> Thank you for your time and I look forward to your response.
>
> Best regards,
> Ramin









[ceph-users] Re: Removing Rados Gateway in ceph cluster

2023-02-06 Thread Eugen Block

Hi,

can you paste the output of:

ceph config dump | grep mgr/dashboard/RGW_API_HOST

Does it match your desired setup? Depending on the ceph version (and  
how ceph-ansible deploys the services) you could also check:


ceph dashboard get-rgw-api-host

I'm not familiar with ceph-ansible, but if you shared your rgw  
definitions and the respective ceph output we might be able to assist  
resolving this.


Regards,
Eugen

Zitat von Michel Niyoyita :


Hello team,

I have a ceph cluster deployed using ceph-ansible , running on ubuntu 20.04
OS which have 6 hosts , 3 hosts for OSD  and 3 hosts used as monitors and
managers , I have deployed RGW on all those hosts  and RGWLOADBALENCER on
top of them , for testing purpose , I have switched off one OSD , to check
if the rest can work properly , The test went well as expected,
unfortunately while coming back an OSD , the RGW failed to connect through
the dashboard. below is the message :
The Object Gateway Service is not configuredError connecting to Object
GatewayPlease consult the documentation

on
how to configure and enable the Object Gateway management functionality.

would like to ask how to solve that issue or how can I proceed to remove
completely RGW and redeploy it after .


root@ceph-mon1:~# ceph -s
  cluster:
id: cb0caedc-eb5b-42d1-a34f-96facfda8c27
health: HEALTH_OK

  services:
mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 72m)
mgr: ceph-mon2(active, since 71m), standbys: ceph-mon3, ceph-mon1
osd: 48 osds: 48 up (since 79m), 48 in (since 3d)
rgw: 6 daemons active (6 hosts, 1 zones)

  data:
pools:   9 pools, 257 pgs
objects: 59.49k objects, 314 GiB
usage:   85 TiB used, 348 TiB / 433 TiB avail
pgs: 257 active+clean

  io:
client:   2.0 KiB/s wr, 0 op/s rd, 0 op/s wr

Kindly help

Best Regards

Michel





[ceph-users] Re: Removing Rados Gateway in ceph cluster

2023-02-06 Thread Eugen Block

Please send responses to the mailing-list.

If the orchestrator is available, please share also this output (mask  
sensitive data):


ceph orch ls rgw --export --format yaml

Which ceph version is this? The command 'ceph dashboard  
get-rgw-api-host' was removed between Octopus and Pacific, that's why  
I asked for your ceph version.


I also forgot that mgr/dashboard/RGW_API_HOST was used until Octopus,  
in Pacific it's not applied anymore. I'll need to check how it is  
determined now.
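
In Pacific, the dashboard discovers RGW endpoints from the service map, but it still needs RGW credentials stored in the dashboard module. A hedged sketch of re-setting them (Pacific reads the keys from files via `-i`; `<dashboard-user>` and the key placeholders are assumptions, substitute your own):

```
# Inspect the system user whose keys the dashboard should use:
radosgw-admin user info --uid=<dashboard-user>

# Store its keys in the dashboard module:
echo -n "<access-key>" > /tmp/ak
echo -n "<secret-key>" > /tmp/sk
ceph dashboard set-rgw-api-access-key -i /tmp/ak
ceph dashboard set-rgw-api-secret-key -i /tmp/sk
```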


Zitat von Michel Niyoyita :


Hello Eugen,

Thanks for your reply ,

I tried the shared command, but there is no output.

root@ceph-mon1:~# ceph config dump | grep mgr/dashboard/RGW_API_HOST
root@ceph-mon1:~# ceph config dump | grep mgr/dashboard/
  mgradvanced  mgr/dashboard/ALERTMANAGER_API_HOST
http://10.10.110.196:9093
*
  mgradvanced  mgr/dashboard/GRAFANA_API_PASSWORD 

   *
  mgradvanced  mgr/dashboard/GRAFANA_API_SSL_VERIFY   false

  *
  mgradvanced  mgr/dashboard/GRAFANA_API_URL
https://10.10.110.198:3000
 *
  mgradvanced  mgr/dashboard/GRAFANA_API_USERNAME admin

  *
  mgradvanced  mgr/dashboard/PROMETHEUS_API_HOST
http://10.10.110.196:9092
*
  mgradvanced  mgr/dashboard/RGW_API_ACCESS_KEY

 *
  mgradvanced  mgr/dashboard/RGW_API_SECRET_KEY

 *
  mgradvanced  mgr/dashboard/RGW_API_SSL_VERIFY   false

  *
  mgradvanced  mgr/dashboard/ceph-mon1/server_addr10.10.110.196

  *
  mgradvanced  mgr/dashboard/ceph-mon2/server_addr10.10.110.197

  *
  mgradvanced  mgr/dashboard/ceph-mon3/server_addr10.10.110.198

  *
  mgradvanced  mgr/dashboard/motd {"message":
"WELCOME TO AOS ZONE 3 STORAGE CLUSTER", "md5":
"87149a6798ce42a7e990bc8584a232cd", "severity": "info", "expires": ""}  *
  mgradvanced  mgr/dashboard/server_port  8443

   *
  mgradvanced  mgr/dashboard/ssl  true

   *
  mgradvanced  mgr/dashboard/ssl_server_port  8443


For the second one, it seems the command is not valid:

root@ceph-mon1:~# ceph dashboard get-rgw-api-host
no valid command found; 10 closest matches:
dashboard set-jwt-token-ttl 
dashboard get-jwt-token-ttl
dashboard create-self-signed-cert
dashboard grafana dashboards update
dashboard get-account-lockout-attempts
dashboard set-account-lockout-attempts 
dashboard reset-account-lockout-attempts
dashboard get-alertmanager-api-host
dashboard set-alertmanager-api-host 
dashboard reset-alertmanager-api-host
Error EINVAL: invalid command
root@ceph-mon1:~#


Kindly check the output.

Best Regards

Michel

On Mon, Feb 6, 2023 at 2:06 PM Eugen Block  wrote:


Hi,

can you paste the output of:

ceph config dump | grep mgr/dashboard/RGW_API_HOST

Does it match your desired setup? Depending on the ceph version (and
how ceph-ansible deploys the services) you could also check:

ceph dashboard get-rgw-api-host

I'm not familiar with ceph-ansible, but if you shared your rgw
definitions and the respective ceph output we might be able to assist
resolving this.

Regards,
Eugen

Zitat von Michel Niyoyita :

> Hello team,
>
> I have a ceph cluster deployed using ceph-ansible , running on ubuntu
20.04
> OS which have 6 hosts , 3 hosts for OSD  and 3 hosts used as monitors and
> managers , I have deployed RGW on all those hosts  and RGWLOADBALENCER on
> top of them , for testing purpose , I have switched off one OSD , to
check
> if the rest can work properly , The test went well as expected,
> unfortunately while coming back an OSD , the RGW failed to connect
through
> the dashboard. below is the message :
> The Object Gateway Service is not configuredError connecting to Object
> GatewayPlease consult the documentation
> <
https://docs.ceph.com/en/latest/mgr/dashboard/#enabling-the-object-gateway-management-frontend
>
> on
> how to configure and enable the Object Gateway management functionality.
>
> would like to ask how to solve that issue or how can I proceed to remove
> completely RGW and redeploy it after .
>
>
> root@ceph-mon1:~# ceph -s
>   cluster:
> id: cb0caedc-eb5b-42d1-a34f-96facfda8c27
> health: HEALTH_OK
>
>   services:
> mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 72m)
> mgr: ceph-mon2(active, since 71m), stand

[ceph-users] Re: Any ceph constants available?

2023-02-06 Thread Robert Sander

On 04.02.23 00:02, Thomas Cannon wrote:


Boreal-01 - the host - 17.2.5:



Boreal-02 - 15.2.6:



Boreal-03 - 15.2.8:



And the host I added - Boreal-04 - 17.2.5:


This is a wild mix of versions. Such a situation may exist during an 
upgrade but not when operating normally or extending the cluster.


Please show the output of "ceph versions".

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin


[ceph-users] Re: Removing Rados Gateway in ceph cluster

2023-02-06 Thread Michel Niyoyita
Hello Eugen,

below is the version of Ceph I am running:

root@ceph-mon1:~# ceph -v
ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific
(stable)
root@ceph-mon1:~# ceph orch ls rgw --export --format yaml
Error ENOENT: No orchestrator configured (try `ceph orch set backend`)
root@ceph-mon1:~#


I tried 'ceph orch set backend', but nothing changed either.

Best Regards

On Mon, Feb 6, 2023 at 2:37 PM Eugen Block  wrote:

> Please send responses to the mailing-list.
>
> If the orchestrator is available, please share also this output (mask
> sensitive data):
>
> ceph orch ls rgw --export --format yaml
>
> Which ceph version is this? The command 'ceph dashboard
> get-rgw-api-host' was removed between Octopus and Pacific, that's why
> I asked for your ceph version.
>
> I also forgot that mgr/dashboard/RGW_API_HOST was used until Octopus,
> in Pacific it's not applied anymore. I'll need to check how it is
> determined now.
>
> Zitat von Michel Niyoyita :
>
> > Hello Eugen,
> >
> > Thanks for your reply ,
> >
> > I am trying the shared command but no output .
> >
> > root@ceph-mon1:~# ceph config dump | grep mgr/dashboard/RGW_API_HOST
> > root@ceph-mon1:~# ceph config dump | grep mgr/dashboard/
> >   mgradvanced  mgr/dashboard/ALERTMANAGER_API_HOST
> > http://10.10.110.196:9093
> > *
> >   mgradvanced  mgr/dashboard/GRAFANA_API_PASSWORD 
> >
> >*
> >   mgradvanced  mgr/dashboard/GRAFANA_API_SSL_VERIFY   false
> >
> >   *
> >   mgradvanced  mgr/dashboard/GRAFANA_API_URL
> > https://10.10.110.198:3000
> >  *
> >   mgradvanced  mgr/dashboard/GRAFANA_API_USERNAME admin
> >
> >   *
> >   mgradvanced  mgr/dashboard/PROMETHEUS_API_HOST
> > http://10.10.110.196:9092
> > *
> >   mgradvanced  mgr/dashboard/RGW_API_ACCESS_KEY
> > 
> >  *
> >   mgradvanced  mgr/dashboard/RGW_API_SECRET_KEY
> > 
> >  *
> >   mgradvanced  mgr/dashboard/RGW_API_SSL_VERIFY   false
> >
> >   *
> >   mgradvanced  mgr/dashboard/ceph-mon1/server_addr
> 10.10.110.196
> >
> >   *
> >   mgradvanced  mgr/dashboard/ceph-mon2/server_addr
> 10.10.110.197
> >
> >   *
> >   mgradvanced  mgr/dashboard/ceph-mon3/server_addr
> 10.10.110.198
> >
> >   *
> >   mgradvanced  mgr/dashboard/motd {"message":
> > "WELCOME TO AOS ZONE 3 STORAGE CLUSTER", "md5":
> > "87149a6798ce42a7e990bc8584a232cd", "severity": "info", "expires": ""}  *
> >   mgradvanced  mgr/dashboard/server_port  8443
> >
> >*
> >   mgradvanced  mgr/dashboard/ssl  true
> >
> >*
> >   mgradvanced  mgr/dashboard/ssl_server_port  8443
> >
> >
> > for the second one it seems is not valid
> >
> > root@ceph-mon1:~# ceph dashboard get-rgw-api-host
> > no valid command found; 10 closest matches:
> > dashboard set-jwt-token-ttl 
> > dashboard get-jwt-token-ttl
> > dashboard create-self-signed-cert
> > dashboard grafana dashboards update
> > dashboard get-account-lockout-attempts
> > dashboard set-account-lockout-attempts 
> > dashboard reset-account-lockout-attempts
> > dashboard get-alertmanager-api-host
> > dashboard set-alertmanager-api-host 
> > dashboard reset-alertmanager-api-host
> > Error EINVAL: invalid command
> > root@ceph-mon1:~#
> >
> >
> > Kindly check the output .
> >
> > Best Regards
> >
> > Michel
> > *
> >
> > On Mon, Feb 6, 2023 at 2:06 PM Eugen Block  wrote:
> >
> >> Hi,
> >>
> >> can you paste the output of:
> >>
> >> ceph config dump | grep mgr/dashboard/RGW_API_HOST
> >>
> >> Does it match your desired setup? Depending on the ceph version (and
> >> how ceph-ansible deploys the services) you could also check:
> >>
> >> ceph dashboard get-rgw-api-host
> >>
> >> I'm not familiar with ceph-ansible, but if you shared your rgw
> >> definitions and the respective ceph output we might be able to assist
> >> resolving this.
> >>
> >> Regards,
> >> Eugen
> >>
> >> Zitat von Michel Niyoyita :
> >>
> >> > Hello team,
> >> >
> >> > I have a ceph cluster deployed using ceph-ansible , running on ubuntu
> >> 20.04
> >> > OS which have 6 hosts , 3 hosts for OSD  and 3 hosts used as monitors
> and
> >> > managers , I have deployed RGW on all those hosts  and
> RGWLOADBALENCER on
> >> > top of them , for testing purpose , 

[ceph-users] Re: Inconsistency in rados ls

2023-02-06 Thread Robert Sander

On 04.02.23 20:54, Ramin Najjarbashi wrote:


ceph df | grep mypoo

--- POOLS ---

POOL OBJECTS

mypool   1.11G

---

  and from this, I got 8.8M objects :

for item in `radosgw-admin user list | jq -r ".[]" | head`; do
B_OBJ=$(radosgw-admin user stats --uid $item 2>/dev/null | jq -r '.stats |
select(.num_objects > 0) | .num_objects'); SUM=$((SUM + B_OBJ)); done


You have mixed up RADOS objects and S3 objects.

These are two different layers. Only small (< 4 MB) S3 objects are stored
in a single RADOS object. Larger S3 objects are split into multiple 4 MB
sized RADOS objects by the RADOS Gateway.

This is why you see many more RADOS objects than S3 objects.
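
As a rough worked example (assuming the default 4 MiB stripe size; the real layout also involves a head object and bucket-index entries, so treat this as an approximation):

```shell
# Approximate RADOS object count for a single large S3 object.
s3_size_mib=100   # hypothetical 100 MiB S3 object
stripe_mib=4      # default rgw stripe/chunk size
rados_objects=$(( (s3_size_mib + stripe_mib - 1) / stripe_mib ))
echo "$rados_objects"
```

So one 100 MiB S3 object alone accounts for roughly 25 RADOS objects in the data pool.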

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin


[ceph-users] Re: Removing Rados Gateway in ceph cluster

2023-02-06 Thread Robert Sander

On 06.02.23 13:48, Michel Niyoyita wrote:


root@ceph-mon1:~# ceph -v
ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific
(stable)


This is the version of the command line tool "ceph".

Please run "ceph versions" to show the version of the running Ceph daemons.

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin


[ceph-users] Re: Removing Rados Gateway in ceph cluster

2023-02-06 Thread Michel Niyoyita
Hello Robert

below is the output of ceph versions command

root@ceph-mon1:~# ceph versions
{
"mon": {
"ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 3
},
"mgr": {
"ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 3
},
"osd": {
"ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 48
},
"mds": {},
"rgw": {
"ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 6
},
"overall": {
"ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 60
}
}
root@ceph-mon1:~#

Best Regards

Michel

On Mon, Feb 6, 2023 at 2:57 PM Robert Sander 
wrote:

> On 06.02.23 13:48, Michel Niyoyita wrote:
>
> > root@ceph-mon1:~# ceph -v
> > ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific
> > (stable)
>
> This is the version of the command line tool "ceph".
>
> Please run "ceph versions" to show the version of the running Ceph daemons.
>
> Regards
> --
> Robert Sander
> Heinlein Support GmbH
> Linux: Akademie - Support - Hosting
> http://www.heinlein-support.de
>
> Tel: 030-405051-43
> Fax: 030-405051-19
>
> Zwangsangaben lt. §35a GmbHG:
> HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
> Geschäftsführer: Peer Heinlein  -- Sitz: Berlin


[ceph-users] Re: Removing Rados Gateway in ceph cluster

2023-02-06 Thread Eugen Block
What does the active mgr log when you try to access the dashboard?  
Please paste your rgw config settings as well.


Zitat von Michel Niyoyita :


Hello Robert

below is the output of ceph versions command

root@ceph-mon1:~# ceph versions
{
"mon": {
"ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 3
},
"mgr": {
"ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 3
},
"osd": {
"ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 48
},
"mds": {},
"rgw": {
"ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 6
},
"overall": {
"ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 60
}
}
root@ceph-mon1:~#

Best Regards

Michel

On Mon, Feb 6, 2023 at 2:57 PM Robert Sander 
wrote:


On 06.02.23 13:48, Michel Niyoyita wrote:

> root@ceph-mon1:~# ceph -v
> ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific
> (stable)

This is the version of the command line tool "ceph".

Please run "ceph versions" to show the version of the running Ceph daemons.

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin







[ceph-users] Re: Removing Rados Gateway in ceph cluster

2023-02-06 Thread Eugen Block
Just a quick edit: what does the active mgr log when you try to access  
the rgw page in the dashboard?


With 'ceph service dump' you can see the rgw daemons that are  
registered to the mgr. If the daemons are not shown in the dashboard  
you'll have to check the active mgr logs for errors or hints.


Zitat von Eugen Block :

What does the active mgr log when you try to access the dashboard?  
Please paste your rgw config settings as well.


Zitat von Michel Niyoyita :


Hello Robert

below is the output of ceph versions command

root@ceph-mon1:~# ceph versions
{
   "mon": {
   "ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 3
   },
   "mgr": {
   "ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 3
   },
   "osd": {
   "ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 48
   },
   "mds": {},
   "rgw": {
   "ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 6
   },
   "overall": {
   "ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 60
   }
}
root@ceph-mon1:~#

Best Regards

Michel

On Mon, Feb 6, 2023 at 2:57 PM Robert Sander 
wrote:


On 06.02.23 13:48, Michel Niyoyita wrote:


root@ceph-mon1:~# ceph -v
ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific
(stable)


This is the version of the command line tool "ceph".

Please run "ceph versions" to show the version of the running Ceph daemons.

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin







[ceph-users] Re: Removing Rados Gateway in ceph cluster

2023-02-06 Thread Michel Niyoyita
Hello Eugen

Below are the rgw configs and logs while I am accessing the dashboard:

root@ceph-mon1:/var/log/ceph# tail -f /var/log/ceph/ceph-mgr.ceph-mon1.log
2023-02-06T15:25:30.037+0200 7f68b15cd700  0 [prometheus INFO
cherrypy.access.140087714875184] :::10.10.110.134 - -
[06/Feb/2023:15:25:30] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"
2023-02-06T15:25:45.033+0200 7f68b0dcc700  0 [prometheus INFO
cherrypy.access.140087714875184] :::10.10.110.133 - -
[06/Feb/2023:15:25:45] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"
2023-02-06T15:25:45.033+0200 7f68b1dce700  0 [prometheus INFO
cherrypy.access.140087714875184] :::127.0.0.1 - -
[06/Feb/2023:15:25:45] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"
2023-02-06T15:25:45.037+0200 7f68b35d1700  0 [prometheus INFO
cherrypy.access.140087714875184] :::10.10.110.134 - -
[06/Feb/2023:15:25:45] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"
2023-02-06T15:26:00.033+0200 7f68b3dd2700  0 [prometheus INFO
cherrypy.access.140087714875184] :::127.0.0.1 - -
[06/Feb/2023:15:26:00] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"
2023-02-06T15:26:00.033+0200 7f68b25cf700  0 [prometheus INFO
cherrypy.access.140087714875184] :::10.10.110.133 - -
[06/Feb/2023:15:26:00] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"
2023-02-06T15:26:00.037+0200 7f68b2dd0700  0 [prometheus INFO
cherrypy.access.140087714875184] :::10.10.110.134 - -
[06/Feb/2023:15:26:00] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"
2023-02-06T15:26:15.033+0200 7f68afdca700  0 [prometheus INFO
cherrypy.access.140087714875184] :::127.0.0.1 - -
[06/Feb/2023:15:26:15] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"
2023-02-06T15:26:15.033+0200 7f68b45d3700  0 [prometheus INFO
cherrypy.access.140087714875184] :::10.10.110.133 - -
[06/Feb/2023:15:26:15] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"
2023-02-06T15:26:15.037+0200 7f68b05cb700  0 [prometheus INFO
cherrypy.access.140087714875184] :::10.10.110.134 - -
[06/Feb/2023:15:26:15] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"
2023-02-06T15:26:30.033+0200 7f68b15cd700  0 [prometheus INFO
cherrypy.access.140087714875184] :::127.0.0.1 - -
[06/Feb/2023:15:26:30] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"
2023-02-06T15:26:30.033+0200 7f68b0dcc700  0 [prometheus INFO
cherrypy.access.140087714875184] :::10.10.110.133 - -
[06/Feb/2023:15:26:30] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"
2023-02-06T15:26:30.037+0200 7f68b1dce700  0 [prometheus INFO
cherrypy.access.140087714875184] :::10.10.110.134 - -
[06/Feb/2023:15:26:30] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"
2023-02-06T15:26:45.033+0200 7f68b35d1700  0 [prometheus INFO
cherrypy.access.140087714875184] :::127.0.0.1 - -
[06/Feb/2023:15:26:45] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"
2023-02-06T15:26:45.033+0200 7f68b3dd2700  0 [prometheus INFO
cherrypy.access.140087714875184] :::10.10.110.133 - -
[06/Feb/2023:15:26:45] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"
2023-02-06T15:26:45.037+0200 7f68b25cf700  0 [prometheus INFO
cherrypy.access.140087714875184] :::10.10.110.134 - -
[06/Feb/2023:15:26:45] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.7.2"



[mons]
ceph-mon1
ceph-mon2
ceph-mon3

[osds]
ceph-osd1
ceph-osd2
ceph-osd3

[mgrs]
ceph-mon1
ceph-mon2
ceph-mon3

[grafana-server]
ceph-mon1
ceph-mon2
ceph-mon3

[rgws]
ceph-osd1
ceph-osd2
ceph-osd3
ceph-mon1
ceph-mon2
ceph-mon3

[rgwloadbalancers]
ceph-osd1
ceph-osd2
ceph-osd3
ceph-mon1
ceph-mon2
ceph-mon3


ceph.conf:

[client]
rbd_default_features = 1

[client.rgw.ceph-mon1.rgw0]
host = ceph-mon1
keyring = /var/lib/ceph/radosgw/ceph-rgw.ceph-mon1.rgw0/keyring
log file = /var/log/ceph/ceph-rgw-ceph-mon1.rgw0.log
rgw frontends = beast endpoint=10.10.110.198:8080
rgw frontends = beast endpoint=10.10.110.196:8080
rgw thread pool size = 512

[client.rgw.ceph-osd1]
rgw_dns_name = ceph-osd1

[client.rgw.ceph-osd2]
rgw_dns_name = ceph-osd2

[client.rgw.ceph-osd3]
rgw_dns_name = ceph-osd3

# Please do not change this file directly since it is managed by Ansible
and will be overwritten
[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster network = 10.10.110.128/26
fsid = cb0caedc-eb5b-42d1-a34f-96facfda8c27
mon host =
mon initial members = ceph-mon1,ceph-mon2,ceph-mon3
mon_allow_pool_delete = True
mon_max_pg_per_osd = 400
osd pool default crush rule = -1
osd_pool_default_min_size = 2
osd_pool_default_size = 3
public network =


Best Regards








On Mon, Feb 6, 2023 at 3:13 PM Eugen Block  wrote:

> What does the active mgr log when you try to access the dashboard?
> Please paste your rgw config settings as well.
>
> Zitat von Michel Niyoyita :
>
> > Hello Robert
> >
> > below is the output of ceph versions command
> >
> > root@ceph-mon1:~# ceph versions
> > {
> > "mon": {
> > "ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
> > pacifi

[ceph-users] Re: Removing Rados Gateway in ceph cluster

2023-02-06 Thread Michel Niyoyita
Hello Eugen,

The output shows that all daemons are configured. I would also like to know
whether it is possible to remove those RGWs and redeploy them, to see if
that changes anything.
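
Regarding removing and redeploying the RGWs with ceph-ansible, here is only a hedged sketch, since playbook names and required extra vars differ between ceph-ansible branches (check your checkout and the playbook prompts before running anything):

```
# Remove an RGW instance (ceph-ansible ships a shrink playbook for this;
# it typically asks which daemon to remove via extra vars):
ansible-playbook -i hosts infrastructure-playbooks/shrink-rgw.yml

# Then re-run the site playbook limited to the rgws group to redeploy:
ansible-playbook -i hosts site.yml --limit rgws
```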

root@ceph-mon1:~# ceph service dump
{
"epoch": 1740,
"modified": "2023-02-06T15:21:42.235595+0200",
"services": {
"rgw": {
"daemons": {
"summary": "",
"479626": {
"start_epoch": 1265,
"start_stamp": "2023-02-03T11:41:58.680359+0200",
"gid": 479626,
"addr": "10.10.110.199:0/1880864062",
"metadata": {
"arch": "x86_64",
"ceph_release": "pacific",
"ceph_version": "ceph version 16.2.11
(3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific (stable)",
"ceph_version_short": "16.2.11",
"cpu": "Intel(R) Xeon(R) Gold 5215 CPU @ 2.50GHz",
"distro": "ubuntu",
"distro_description": "Ubuntu 20.04.5 LTS",
"distro_version": "20.04",
"frontend_config#0": "beast endpoint=
10.10.110.199:8080",
"frontend_type#0": "beast",
"hostname": "ceph-osd1",
"id": "ceph-osd1.rgw0",
"kernel_description": "#154-Ubuntu SMP Thu Jan 5
17:03:22 UTC 2023",
"kernel_version": "5.4.0-137-generic",
"mem_swap_kb": "8388604",
"mem_total_kb": "263556752",
"num_handles": "1",
"os": "Linux",
"pid": "47369",
"realm_id": "",
"realm_name": "",
"zone_id": "689f9b30-4380-439e-8e7c-3c2046079a2b",
"zone_name": "default",
"zonegroup_id":
"c2d060fe-bd6c-4bfb-a0cd-596124765015",
"zonegroup_name": "default"
},
"task_status": {}
},
"489542": {
"start_epoch": 1267,
"start_stamp": "2023-02-03T11:42:30.711278+0200",
"gid": 489542,
"addr": "10.10.110.200:0/3909810130",
"metadata": {
"arch": "x86_64",
"ceph_release": "pacific",
"ceph_version": "ceph version 16.2.11
(3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific (stable)",
"ceph_version_short": "16.2.11",
"cpu": "Intel(R) Xeon(R) Gold 5215 CPU @ 2.50GHz",
"distro": "ubuntu",
"distro_description": "Ubuntu 20.04.5 LTS",
"distro_version": "20.04",
"frontend_config#0": "beast endpoint=
10.10.110.200:8080",
"frontend_type#0": "beast",
"hostname": "ceph-osd2",
"id": "ceph-osd2.rgw0",
"kernel_description": "#154-Ubuntu SMP Thu Jan 5
17:03:22 UTC 2023",
"kernel_version": "5.4.0-137-generic",
"mem_swap_kb": "8388604",
"mem_total_kb": "263556752",
"num_handles": "1",
"os": "Linux",
"pid": "392257",
"realm_id": "",
"realm_name": "",
"zone_id": "689f9b30-4380-439e-8e7c-3c2046079a2b",
"zone_name": "default",
"zonegroup_id":
"c2d060fe-bd6c-4bfb-a0cd-596124765015",
"zonegroup_name": "default"
},
"task_status": {}
},
"489605": {
"start_epoch": 1268,
"start_stamp": "2023-02-03T11:42:58.724973+0200",
"gid": 489605,
"addr": "10.10.110.201:0/59797695",
"metadata": {
"arch": "x86_64",
"ceph_release": "pacific",
"ceph_version": "ceph version 16.2.11
(3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific (stable)",
"ceph_version_short": "16.2.11",
"cpu": "Intel(R) Xeon(R) Gold 5215 CPU @ 2.50GHz",
"distro": "ubuntu",
"distro_description": "Ubuntu 20.04.5 LTS",
"distro_version": "20.04",
"frontend_config#0": "beast endpoint=
10.10.110.201:8080",
"frontend_type#0": "beast",
"hostname": "ceph-osd3",
"id": "ce

[ceph-users] Re: Inconsistency in rados ls

2023-02-06 Thread Ramin Najjarbashi
Thank you for your email and for providing the commands to check for shadow
and multipart objects in Ceph. I have checked the objects in my Ceph
cluster and found the following results:

The command rados -p  ls | grep --text -vE "shadow|multipart" | wc -l
returns about 80 million objects.
The command radosgw-admin user stats --uid  2>/dev/null | jq -r
'.stats | select(.num_objects > 0) | .num_objects' returns 889684340
objects, including all buckets for all users.
The data match and seem sensible, but I still see an inconsistency when
counting objects per PG with the rados command: the sum over all PGs,
excluding shadow and multipart objects, is only 18 million.

It appears that S3 objects are stored in the RADOS layer as follows: small
objects (less than 4 MB) are stored as a single RADOS object without any
prefix. If an object is larger, it is split into multiple 4 MB RADOS
objects, and the remaining part (less than 4 MB) is stored as a shadow
object. In any case, every S3 object has a corresponding head object in
RADOS that holds the object's metadata.

https://access.redhat.com/solutions/4177821
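To make that layering concrete, here is a small sketch of how many RADOS objects a single S3 object maps to. It assumes the default 4 MiB head/stripe size described above and no multipart upload; adjust the stripe value if your rgw_obj_stripe_size differs:

```shell
# Sketch: RADOS object count for one S3 object of SIZE bytes, assuming a
# 4 MiB head object plus 4 MiB tail stripes (shadow objects), no multipart.
rados_objects_for() {
  local size=$1 stripe=$((4 * 1024 * 1024))
  if [ "$size" -le "$stripe" ]; then
    echo 1                                              # head holds all the data
  else
    # head object + ceil(tail bytes / stripe) shadow objects
    echo $(( 1 + (size - stripe + stripe - 1) / stripe ))
  fi
}

rados_objects_for $((3 * 1024 * 1024))    # 3 MiB  -> 1
rados_objects_for $((10 * 1024 * 1024))   # 10 MiB -> 3 (head + two shadows)
```

This is also why the per-PG sums only line up with S3 object counts after shadow and multipart objects are filtered out.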

Please let me know if you have any further suggestions or if there is
anything else I can assist with.

On Mon, Feb 6, 2023 at 4:24 PM Robert Sander 
wrote:

> On 04.02.23 20:54, Ramin Najjarbashi wrote:
>
> > ceph df | grep mypoo
> >
> > --- POOLS ---
> >
> > POOL OBJECTS
> >
> > mypool   1.11G
> >
> > ---
> >
> >   and from this, I got 8.8M objects :
> >
> > for item in `radosgw-admin user list | jq -r ".[]" | head`; do
> > B_OBJ=$(radosgw-admin user stats --uid $item 2>/dev/null | jq -r '.stats
> |
> > select(.num_objects > 0) | .num_objects'); SUM=$((SUM + B_OBJ)); done
>
> You have mixed up RADOS objects and S3 objects.
>
> These are two different layers. Only small (< 4MB) S3 objects are stored
> in a single RADOS object. Larger S3 objects are split into multiple 4MB
> sized RADOS objects by the rados-gateway.
>
> This is why you see many more RADOS objects than S3 objects.
>
> Regards
> --
> Robert Sander
> Heinlein Support GmbH
> Linux: Akademie - Support - Hosting
> http://www.heinlein-support.de
>
> Tel: 030-405051-43
> Fax: 030-405051-19
>
> Zwangsangaben lt. §35a GmbHG:
> HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
> Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Telemetry service is temporarily down

2023-02-06 Thread Yaarit Hatuka
Hi everyone,

Network issues are resolved.
The telemetry endpoints are available again.

Thanks for your patience and contribution!
Yaarit


On Thu, Feb 2, 2023 at 6:11 PM Yaarit Hatuka  wrote:

> Hi everyone,
>
> Our telemetry endpoints are temporarily unavailable due to network issues.
> We apologize for the inconvenience. We will update once the service is
> restored.
>
> Yaarit
>
>
> On Fri, Jan 13, 2023 at 12:05 PM Yaarit Hatuka  wrote:
>
>> Hi everyone,
>>
>> Our telemetry service is up and running again.
>> Thanks Adam Kraitman and Dan Mick for restoring the service.
>>
>> We thank you for your patience and appreciate your contribution to the
>> project!
>>
>> Thanks,
>> Yaarit
>>
>> On Tue, Jan 3, 2023 at 3:14 PM Yaarit Hatuka  wrote:
>>
>>> Hi everyone,
>>>
>>> We are having some infrastructure issues with our telemetry backend, and
>>> we are working on fixing it.
>>> Thanks Jan Horacek for opening this issue
>>>  [1]. We will update once the
>>> service is back up.
>>> We are sorry for any inconvenience you may be experiencing, and
>>> appreciate your patience.
>>>
>>> Thanks,
>>> Yaarit
>>>
>>> [1] https://tracker.ceph.com/issues/58371
>>>
>>


[ceph-users] Ceph Pacific 16.2.11 : ceph-volume does not like LV with the same name in different VG

2023-02-06 Thread Gilles Mocellin

Hello !

It seems that ceph-volume from Ceph Pacific 16.2.11 has a problem with
identical LV names in different VGs.

I use ceph-ansible (stable-6), with a pre-existing LVM configuration.
Here's the error :

TASK [ceph-osd : include_tasks scenarios/lvm.yml] 
**
Monday 06 February 2023  16:13:55 +0100 (0:00:00.065)   0:03:41.576 
***
included: 
/home/cephadmin/ceph-ansible/roles/ceph-osd/tasks/scenarios/lvm.yml for 
fidcllabs-sto-01.labs.fidcl.cloud, fidcllabs-sto-02.labs.fidcl.cloud


TASK [ceph-osd : use ceph-volume to create bluestore osds] 
*
Monday 06 February 2023  16:13:55 +0100 (0:00:00.121)   0:03:41.698 
***
failed: [fidcllabs-sto-01.labs.fidcl.cloud] (item={'data': 'data-lv1', 
'data_vg': 'data-vg1', 'crush_device_class': 'sas15k'}) => changed=false

  ansible_loop_var: item
  item:
crush_device_class: sas15k
data: data-lv1
data_vg: data-vg1
  msg: 'Could not decode json output:  from the command 
[''ceph-volume'', ''--cluster'', ''ceph'', ''lvm'', ''list'', 
''data-vg1/data-lv1'', ''--format=json'']'

  rc: 1


If I execute the ceph-volume command myself on one target host, I get :

fcadmin@fidcllabs-sto-01:~$ sudo ceph-volume --cluster ceph lvm list 
data-vg1/data-lv1 --format=json
-->  RuntimeError: Filters {'lv_name': 'data-lv1'} matched more than 1 
LV present on this host.


My LVs :

fcadmin@fidcllabs-sto-01:~$ sudo lvs
  LV   VG   Attr   LSize Pool Origin Data%  Meta%  Move 
Log Cpy%Sync Convert

  data-lv1 data-vg1 -wi-ao <1024.00g
  data-lv1 data-vg2 -wi-ao <1024.00g
  data-lv1 data-vg3 -wi-ao<2.00t
  data-lv1 data-vg4 -wi-ao<2.00t
  logs sys  -wi-ao 3.81g
  root sys  -wi-ao   <15.26g
  swap sys  -wi-ao<7.63g
  unused   sys  -wi-a-   <43.30g

Yes, all data LVs have the same name, but under a different VG.

I looked at the tracker but didn't find a clear corresponding issue.
I will certainly open one if there's nothing known around here.
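One possible workaround, offered purely as an untested sketch: make every LV name unique on the host, so that ceph-volume's lv_name-only filter matches exactly one LV. The loop below only prints the lvrename commands for the layout shown above (data-vg1/data-lv1 can keep its name); run them by hand, and only if the OSDs on those LVs have not been created yet. For existing OSDs, check first how ceph-volume recorded the LV path, since a rename could break activation:

```shell
# Sketch: print lvrename commands that give each data LV a host-unique name,
# sidestepping the "matched more than 1 LV present on this host" filter error.
# Nothing is executed here; review and run the printed lines manually.
for i in 2 3 4; do
  echo lvrename data-vg$i data-lv1 data-lv$i
done
```

Afterwards the ceph-ansible lvm_volumes entries would need to reference the new names (data-vg2/data-lv2, and so on).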


[ceph-users] Re: Removing Rados Gateway in ceph cluster

2023-02-06 Thread Gilles Mocellin

Le 2023-02-06 14:11, Eugen Block a écrit :

What does the active mgr log when you try to access the dashboard?
Please paste your rgw config settings as well.


Ah, Sorry to hijack, but I also can't access Object Storage menus in the 
Dashboard since upgrading from 16.2.10 to 16.2.11.


Here are the MGR logs :

fcadmin@fidcllabs-oct-01:~$ sudo grep 8080 
/var/log/ceph/ceph-mgr.fidcllabs-oct-01.log
2023-02-06T08:12:58.179+ 7ffad4910700  0 [dashboard INFO rgw_client] 
Found RGW daemon with configuration: host=fidcllabs-oct-03, port=8080, 
ssl=False
2023-02-06T08:12:58.179+ 7ffad4910700  0 [dashboard INFO rgw_client] 
Found RGW daemon with configuration: host=fidcllabs-oct-01, port=8080, 
ssl=False
2023-02-06T08:12:58.179+ 7ffad4910700  0 [dashboard INFO rgw_client] 
Found RGW daemon with configuration: host=fidcllabs-oct-02, port=8080, 
ssl=False
2023-02-06T08:12:58.275+ 7ffad4910700  0 [dashboard ERROR 
rest_client] RGW REST API failed GET, connection error 
(url=http://fidcllabs-oct-03:8080/admin/metadata/user?myself): [errno: 
111] Connection refused
urllib3.exceptions.MaxRetryError: 
HTTPConnectionPool(host='fidcllabs-oct-03', port=8080): Max retries 
exceeded with url: /admin/metadata/user?myself (Caused by 
NewConnectionError('<urllib3.connection.HTTPConnection object at 
0x7ffac1e75160>: Failed to establish a new connection: [Errno 111] 
Connection refused',))
requests.exceptions.ConnectionError: 
HTTPConnectionPool(host='fidcllabs-oct-03', port=8080): Max retries 
exceeded with url: /admin/metadata/user?myself (Caused by 
NewConnectionError('<urllib3.connection.HTTPConnection object at 
0x7ffac1e75160>: Failed to establish a new connection: [Errno 111] 
Connection refused',))


The hostname is resolvable, but not to the same IP (management network) 
as my RGW endpoints (public network).
In another cluster still on 16.2.10, I can see IPs in the corresponding 
logs, not the hostname.


--
Gilles


[ceph-users] Re: Ceph Bluestore tweaks for Bcache

2023-02-06 Thread Richard Bade
Hi Matthias,
I've done a bit of testing on my Nautilus (14.2.22) test cluster and I
can confirm that what you're seeing, the rotational flag going back to '0'
in the osd metadata, also happens for me in Nautilus after rebooting the host.

> Alternatively, can I manually set and persist the relevant bluestore
> tunables (per OSD / per device class) so as to make the bcache
> rotational flag irrelevant after the OSD is first created?
In my experience you can. I've changed a few parameters as mentioned
in my original post in this thread. I don't think this sets as many
things as the rotational flag does, but it definitely improved latency for me.
In ceph config db I have:
  osd  class:hdd  advanced  bluestore_compression_max_blob_size  524288
  osd  class:hdd  advanced  bluestore_deferred_batch_ops         64
  osd  class:hdd  dev       bluestore_max_blob_size              524288
  osd  class:hdd  advanced  bluestore_min_alloc_size             65536   *
  osd  class:hdd  advanced  bluestore_prefer_deferred_size       32768
  osd  class:hdd  advanced  bluestore_throttle_cost_per_io       67
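If someone wants to replay these hdd-class values into their own config db, a sketch follows. It only prints the `ceph config set` commands rather than executing them; the `osd/class:hdd` mask form is my reading of the listing above, so double-check it against your Ceph release before running:

```shell
# Sketch: print `ceph config set` commands targeting only OSDs in the hdd
# device class. Review the printed lines, then run them by hand on a mon node.
while read -r opt val; do
  echo ceph config set osd/class:hdd "$opt" "$val"
done <<'EOF'
bluestore_prefer_deferred_size 32768
bluestore_deferred_batch_ops 64
bluestore_throttle_cost_per_io 67
EOF
```

Note that creation-time settings such as bluestore_min_alloc_size only take effect for OSDs built after the change, as discussed further down the thread.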

Also, in ceph.conf I have:
osd_op_queue_cut_off = high

This is because it was unable to be changed after OSD startup, when the
OSD grabs settings from the config db, but it could be set at the point
where the OSD reads the config file.
This probably doesn't impact you though as it's now the default from
Octopus, but it's in my Nautilus cluster.

Regards,
Richard

On Thu, 2 Feb 2023 at 12:20, Matthias Ferdinand  wrote:
>
> ceph version: 17.2.0 on Ubuntu 22.04
>   non-containerized ceph from Ubuntu repos
>   cluster started on luminous
>
> I have been using bcache on filestore on rotating disks for many years
> without problems.  Now converting OSDs to bluestore, there are some
> strange effects.
>
> If I create the bcache device, set its rotational flag to '1', then do
> ceph-volume lvm create ... --crush-device-class=hdd
> the OSD comes up with the right parameters and much improved latency
> compared to OSD directly on /dev/sdX.
>
> ceph osd metadata ...
> shows
> "bluestore_bdev_type": "hdd",
> "rotational": "1"
>
> But after reboot, bcache rotational flag is set '0' again, and the OSD
> now comes up with "rotational": "0"
> Latency immediately starts to increase (and continually increases over
> the next days, possibly due to accumulating fragmentation).
>
> These wrong settings stay in place even if I stop the OSD, set the
> bcache rotational flag to '1' again and restart the OSD. I have found no
> way to get back to the original settings other than destroying and
> recreating the OSD. I guess I am just not seeing something obvious, like
> from where these settings get pulled at OSD startup.
>
> I even created udev rules to set bcache rotational=1 at boot time,
> before any ceph daemon starts, but it did not help. Something running
> after these rules reset the bcache rotationl flags back to 0.
> Haven't found the culprit yet, but not sure if it even matters.
>
> Are these OSD settings (bluestore_bdev_type, rotational) persisted
> somewhere and can they be edited and pinned?
>
> Alternatively, can I manually set and persist the relevant bluestore
> tunables (per OSD / per device class) so as to make the bcache
> rotational flag irrelevant after the OSD is first created?
>
> Regards
> Matthias
>
>
> On Fri, Apr 08, 2022 at 03:05:38PM +0300, Igor Fedotov wrote:
> > Hi Frank,
> >
> > in fact this parameter impacts OSD behavior both at build time and during
> > regular operation. It simply replaces hdd/ssd auto-detection with a manual
> > specification, and the relevant config parameters are applied accordingly.
> > If e.g. min_alloc_size is persisted at OSD creation, it won't be updated;
> > but a setting that can be changed at run-time will be altered.
> >
> > So the proper usage would definitely be manual ssd/hdd mode selection before
> > the first OSD creation, keeping that mode for the whole OSD lifecycle. But
> > technically one can change the mode at any arbitrary point in time, which
> > would leave the run-time settings out of sync with the creation-time ones,
> > with some unclear side effects.
> >
> > Please also note that this setting was originally intended mostly for
> > development/testing purposes, not regular usage. Hence it's flexible but
> > rather unsafe if used improperly.
> >
> >
> > Thanks,
> >
> > Igor
> >
> > On 4/7/2022 2:40 PM, Frank Schilder wrote:
> > > Hi Richard and Igor,
> > >
> > > are these tweaks required at build-time (osd prepare) only or are they 
> > > required for every restart?
> > >
> > > Is this setting "bluestore debug enforce settings=hdd" in the ceph config 
> > > data base or set somewhere else? How does this work if deploying HDD- and 
> > > SSD-OSDs at the same time?
> > >
> > > Ideally, all these tweaks should be applicable and settable at creation 
> > > time only without affecting generic sett

[ceph-users] Re: OSD upgrade problem nautilus->octopus - snap_mapper upgrade stuck

2023-02-06 Thread Mark Schouten

Hi,

I’m seeing the same thing …

With debug logging enabled I see this:
2023-02-07T00:35:51.853+0100 7fdab9930e00 10 snap_mapper.convert_legacy 
converted 1410 keys
2023-02-07T00:35:51.853+0100 7fdab9930e00 10 snap_mapper.convert_legacy 
converted 1440 keys
2023-02-07T00:35:51.853+0100 7fdab9930e00 10 snap_mapper.convert_legacy 
converted 1470 keys
2023-02-07T00:35:51.853+0100 7fdab9930e00 10 snap_mapper.convert_legacy 
converted 1500 keys


It ends at 1500 keys. And nothing happens.

I'm now stuck with a cluster that has 4 OSDs on Octopus, 10 on 
Nautilus, and one down. A hint on how to work around this is welcome 
:)


—
Mark Schouten, CTO
Tuxis B.V.
m...@tuxis.nl / +31 318 200208


-- Original Message --
From "Jan Pekař - Imatic" 
To ceph-users@ceph.io
Date 1/12/2023 5:53:02 PM
Subject [ceph-users] OSD upgrade problem nautilus->octopus - snap_mapper 
upgrade stuck



Hi all,

I have problem upgrading nautilus to octopus on my OSD.

Upgrade mon and mgr was OK and first OSD stuck on

2023-01-12T09:25:54.122+0100 7f49ff3eae00  1 osd.0 126556 init upgrade 
snap_mapper (first start as octopus)

and there were no activity after that for more than 48 hours. No disk activity.

I restarted OSD many times and nothing changed.

It is an old, filestore OSD based on an XFS filesystem. Is the upgrade to 
snap_mapper v2 reliable? What is the OSD waiting for? Can I start the OSD 
without the upgrade and get the cluster healthy with the old snap structure? Or 
should I skip the Octopus upgrade and go to Pacific directly (is some bug 
backport missing?).

Thank you for help, I'm sending some logs below..

Log shows

2023-01-09T19:12:49.471+0100 7f41f60f1e00  0 ceph version 15.2.17 
(694d03a6f6c6e9f814446223549caf9a9f60dba0) octopus (stable), process ceph-osd, 
pid 2566563
2023-01-09T19:12:49.471+0100 7f41f60f1e00  0 pidfile_write: ignore empty 
--pid-file
2023-01-09T19:12:49.499+0100 7f41f60f1e00 -1 missing 'type' file, inferring 
filestore from current/ dir
2023-01-09T19:12:49.531+0100 7f41f60f1e00  0 starting osd.0 osd_data 
/var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
2023-01-09T19:12:49.531+0100 7f41f60f1e00 -1 Falling back to public interface
2023-01-09T19:12:49.871+0100 7f41f60f1e00  0 load: jerasure load: lrc load: isa
2023-01-09T19:12:49.875+0100 7f41f60f1e00  0 
filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:0.OSDShard using op 
scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:1.OSDShard using op 
scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:2.OSDShard using op 
scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:3.OSDShard using op 
scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:4.OSDShard using op 
scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 
filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2023-01-09T19:12:49.927+0100 7f41f60f1e00  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl 
is disabled via 'filestore fiemap' config option
2023-01-09T19:12:49.927+0100 7f41f60f1e00  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: 
SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2023-01-09T19:12:49.927+0100 7f41f60f1e00  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice() is 
disabled via 'filestore splice' config option
2023-01-09T19:12:49.983+0100 7f41f60f1e00  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) 
syscall fully supported (by glibc and kernel)
2023-01-09T19:12:49.983+0100 7f41f60f1e00  0 
xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is 
disabled by conf
2023-01-09T19:12:50.015+0100 7f41f60f1e00  0 
filestore(/var/lib/ceph/osd/ceph-0) start omap initiation
2023-01-09T19:12:50.079+0100 7f41f60f1e00  1 leveldb: Recovering log #165531
2023-01-09T19:12:50.083+0100 7f41f60f1e00  1 leveldb: Level-0 table #165533: 
started
2023-01-09T19:12:50.235+0100 7f41f60f1e00  1 leveldb: Level-0 table #165533: 
1598 bytes OK
2023-01-09T19:12:50.583+0100 7f41f60f1e00  1 leveldb: Delete type=0 #165531

2023-01-09T19:12:50.615+0100 7f41f60f1e00  1 leveldb: Delete type=3 #165529

2023-01-09T19:12:51.339+0100 7f41f60f1e00  0 
filestore(/var/lib/ceph/osd/ceph-0) mount(1861): enabling WRITEAHEAD journal 
mode: checkpoint is not enabled
2023-01-09T19:12:51.379+0100 7f41f60f1e00  1 journal _open 
/var/lib/ceph/osd/ceph-0/journal fd 35: 2998927360 bytes, block size 4096 
bytes, directio = 1, aio = 1
2023-01-09T19:12:51.931+0100 7f41f60f1e00 -1 journal do_read_entry(243675136): 
bad header magic
2023-01-09T19:12:51.9

[ceph-users] Rotate lockbox keyring

2023-02-06 Thread Zhongzhou Cai
Hi,

I'm on Ceph 16.2.10, and I'm trying to rotate the ceph lockbox keyring. I
used ceph-authtool to create a new keyring, and used `ceph auth import -i
` to update the lockbox keyring. I also updated the keyring
file, which is /var/lib/ceph/osd/ceph-/lockbox.keyring. I tried
`systemctl restart ceph-volume@lvm--.service`, the
command succeeded. Then I rebooted the node, ceph-volume failed because the
lockbox.keyring file was overwritten with the old key, which doesn't match
the lockbox keyring in `ceph auth get`. Does anyone know where it gets the
lockbox.keyring during reboot?

Thanks,
Zhongzhou Cai


[ceph-users] Re: Nautilus to Octopus when RGW already on Octopus

2023-02-06 Thread Richard Bade
Hi,
We're actually on a very similar setup to you, with 18.04 and Nautilus,
and are thinking about the 20.04 upgrade process.

As for your RGW, I think I would not consider the downgrade. I believe
the order is about avoiding issues with newer RGW connecting to older
mons and osds. Since you're already in this situation and not having
any issues, I would probably continue forward with the upgrade on
Mons, then Managers, then osds as per documentation. Then just restart
the RGW at the end.
I think that trying to downgrade at this point may introduce new
issues that you don't currently have.

This is just my opinion though, as I have not actually tried this. Do
you have a test cluster you could practice on?
I would be keen to hear how your upgrade goes.

Regards,
Richard

On Sat, 4 Feb 2023 at 22:10,  wrote:
>
> We are finally going to upgrade our Ceph from Nautilus to Octopus, before 
> looking at moving onward.  We are still on Ubuntu 18.04, so once on Octopus, 
> we will then upgrade the OS to 20.04, ready for the next upgrade.
>
> Unfortunately, we have already upgraded our rados gateways to Ubuntu 20.04, 
> last Sept, which had the side effect of upgrading the RGWs to Octopus. So I'm 
> looking to downgrade the rados gateways, back to Nautilus, just to be safe.  
> We can then do the upgrade in the right order.
>
> I have no idea if the newer Octopus rados gateways will have altered any 
> metadata, that would affect a downgrade back to Nautilus.
>
> Any advice?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io