[ceph-users] Questions about rbd flatten command

2024-04-01 Thread Henry lol
Hello,

I executed multiple 'rbd flatten' commands simultaneously on a client.
The elapsed time of each flatten job increased as the number of jobs
increased, and network I/O was nearly saturated.

So, I have two questions:
1. Isn't the flatten job running within the Ceph cluster? Why is
client-side network I/O so high?
2. How can I apply QoS to each flatten job to reduce network I/O?
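For what it's worth, 'rbd flatten' run from a client is driven by librbd on that client (it reads the parent's objects and writes them into the child), which is why the copy traffic shows up on the client's network. A minimal sketch of per-image librbd QoS follows; this is not from the thread, the pool/image names are hypothetical, and the commands are printed rather than executed:

```shell
# Hedged sketch (not from the thread): librbd per-image QoS can cap the
# bandwidth a flatten consumes. Pool/image names are hypothetical.
POOL=rbd
IMAGE=child-img
LIMIT_BPS=$((100 * 1024 * 1024))   # 100 MiB/s expressed in bytes

# Printed rather than executed; drop the 'echo' on a real client node.
echo rbd config image set "$POOL/$IMAGE" rbd_qos_bps_limit "$LIMIT_BPS"
echo rbd config image get "$POOL/$IMAGE" rbd_qos_bps_limit
```

Since the limit applies per image, running several flattens concurrently would still need each image capped (or a lower per-image limit) to bound the total.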

Sincerely,
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Drained A Single Node Host On Accident

2024-04-01 Thread adam.ther

Hello,

I have a single-node host with a VM as a backup MON, MGR, etc.

I accidentally drained the host, which has caused all OSDs to be pending as
'deleting'. Can I safely cancel this deletion request?


Regards,

Adam


[ceph-users] Re: Can setting mds_session_blocklist_on_timeout to false minimize the session eviction?

2024-04-01 Thread Yongseok Oh
The solution can be found here: 
https://ceph.io/en/news/blog/2020/automatic-cephfs-recovery-after-blacklisting/

When a session eviction occurs on the client side and recovery is then 
performed, the client's caps and dirty data are dropped and the server also 
releases the client's caps, so there is no problem with metadata consistency.
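For reference, the recovery the blog describes amounts to remounting after eviction; on recent kernel clients the recover_session=clean mount option gives a similar effect. A sketch follows; the monitor address and mount point are hypothetical, and the commands are printed rather than executed:

```shell
# Hedged sketch: recover a kernel CephFS mount after blocklisting/eviction
# by remounting with recover_session=clean (kernel >= 5.4).
# Monitor address and mount point are hypothetical.
MNT=/mnt/cephfs
MON=10.0.0.1:6789

# Printed rather than executed; drop the 'echo' on an affected client.
echo umount -l "$MNT"
echo mount -t ceph "$MON:/" "$MNT" -o name=admin,recover_session=clean
```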

Is there a reason why the auto-recovery option is not turned on by default?

We are running various tests so that we can turn on the auto-recovery option 
by default for the storage we operate.


[ceph-users] ceph status not showing correct monitor services

2024-04-01 Thread Adiga, Anantha
Hi

Why is "ceph -s" showing only two monitors while three monitor services are 
running?

# ceph versions
{
    "mon": { "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 2 },
    "mgr": { "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 2 },
    "osd": { "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 36 },
    "mds": { "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 1 },
    "rgw": { "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 3 },
    "overall": { "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 44 }
}

# ceph orch ls
NAME             PORTS                  RUNNING  REFRESHED  AGE  PLACEMENT
crash                                   3/3      8m ago     2y   label:ceph
ingress.nfs.nfs  10.45.128.8:2049,9049  4/4      8m ago     2y   count:2
mds.cephfs                              3/3      8m ago     2y   count:3;label:mdss
mgr                                     3/3      8m ago     23M  a001s016;a001s017;a001s018;count:3
mon                                     3/3      8m ago     16h  a001s016;a001s017;a001s018;count:3   <== [3 monitor services running]
nfs.nfs          ?:12049                3/3      8m ago     2y   a001s016;a001s017;a001s018;count:3
node-exporter    ?:9100                 3/3      8m ago     2y   *
osd.unmanaged                           36/36    8m ago     -
prometheus       ?:9095                 1/1      10s ago    23M  count:1
rgw.ceph         ?:8080                 3/3      8m ago     19h  count-per-host:1;label:rgws
root@a001s017:~# ceph -s
  cluster:
id: 604d56db-2fab-45db-a9ea-c418f9a8cca8
health: HEALTH_OK

  services:
    mon: 2 daemons, quorum a001s018,a001s017 (age 16h)   <== [shows ONLY 2 monitors running]
mgr: a001s017.bpygfm(active, since 13M), standbys: a001s016.ctmoay
mds: 1/1 daemons up, 2 standby
osd: 36 osds: 36 up (since 54s), 36 in (since 2y)
rgw: 3 daemons active (3 hosts, 1 zones)

  data:
volumes: 1/1 healthy
pools:   43 pools, 1633 pgs
objects: 51.81M objects, 77 TiB
usage:   120 TiB used, 131 TiB / 252 TiB avail
    pgs:     1631 active+clean
             2    active+clean+scrubbing+deep

  io:
client:   220 MiB/s rd, 448 MiB/s wr, 251 op/s rd, 497 op/s wr

# ceph orch ls --service_name=mon
NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
mon          3/3      8m ago     16h  a001s016;a001s017;a001s018;count:3   <== [3 monitors running]

# ceph orch ps --daemon_type=mon
NAME          HOST      PORTS  STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mon.a001s016  a001s016         running (19h)  9m ago     19h  706M     2048M    16.2.5   6e73176320aa  8484a912f96a
mon.a001s017  a001s017         running (16h)  66s ago    19h  949M     2048M    16.2.5   6e73176320aa  e5e5cb6c256c   <== [3 mon daemons running]
mon.a001s018  a001s018         running (5w)   2m ago     2y   1155M    2048M    16.2.5   6e73176320aa  7d2bb6d41f54

a001s016# systemctl --type=service | grep @mon
ceph-604d56db-2fab-45db-a9ea-c418f9a8cca8@mon.a001s016.service  loaded active running  Ceph mon.a001s016 for 604d56db-2fab-45db-a9ea-c418f9a8cca8
a001s017# systemctl --type=service | grep @mon
ceph-604d56db-2fab-45db-a9ea-c418f9a8cca8@mon.a001s017.service  loaded active running  Ceph mon.a001s017 for 604d56db-2fab-45db-a9ea-c418f9a8cca8
a001s018# systemctl --type=service | grep @mon
ceph-604d56db-2fab-45db-a9ea-c418f9a8cca8@mon.a001s018.service  loaded active running  Ceph mon.a001s018 for 604d56db-2fab-45db-a9ea-c418f9a8cca8


Thank you,
Anantha


[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Anthony D'Atri



>  a001s017.bpygfm(active, since 13M), standbys: a001s016.ctmoay

Looks like you just had an mgr failover?  Could be that the secondary mgr 
hasn't caught up with current events.
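If a stale secondary is indeed the cause, one common nudge (a sketch, not something Anthony prescribed here) is to force another mgr failover and re-check; the daemon name is taken from the "ceph -s" output in this thread, and the commands are printed rather than executed:

```shell
# Hedged sketch: fail the active mgr so a freshly elected one rebuilds
# its view of the cluster, then re-check status.
ACTIVE=a001s017.bpygfm   # active mgr name from the thread's "ceph -s"

# Printed rather than executed; drop the 'echo' to actually fail over.
echo ceph mgr fail "$ACTIVE"
echo ceph -s
```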


[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Adiga, Anantha


Hi Anthony,

Seeing it since yesterday afternoon. It is the same with the mgr services: 
"ceph -s" is reporting only TWO instead of THREE.

Also, mon and mgr show "is_active": false; see below.

# ceph orch ps --daemon_type=mgr
NAME                 HOST      PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mgr.a001s016.ctmoay  a001s016  *:8443  running (18M)  3m ago     23M  206M     -        16.2.5   6e73176320aa  169cafcbbb99
mgr.a001s017.bpygfm  a001s017  *:8443  running (19M)  3m ago     23M  332M     -        16.2.5   6e73176320aa  97257195158c
mgr.a001s018.hcxnef  a001s018  *:8443  running (20M)  3m ago     23M  113M     -        16.2.5   6e73176320aa  21ba5896cee2

# ceph orch ls --service_name=mgr
NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
mgr  3/3  3m ago 23M  a001s016;a001s017;a001s018;count:3


# ceph orch ps --daemon_type=mon --format=json-pretty

[
  {
    "container_id": "8484a912f96a",
    "container_image_digests": [
      "docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586"
    ],
    "container_image_id": "6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
    "container_image_name": "docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586",
    "created": "2024-03-31T23:55:16.164155Z",
    "daemon_id": "a001s016",
    "daemon_type": "mon",
    "hostname": "a001s016",
    "is_active": false,                        <== why is it false
    "last_refresh": "2024-04-01T19:38:30.929014Z",
    "memory_request": 2147483648,
    "memory_usage": 761685606,
    "ports": [],
    "service_name": "mon",
    "started": "2024-03-31T23:55:16.268266Z",
    "status": 1,
    "status_desc": "running",
    "version": "16.2.5"
  },


Thank you,
Anantha



[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Eugen Block

Maybe it’s just not in the monmap? Can you show the output of:

ceph mon dump

Did you do any maintenance (apparently OSDs restarted recently) and  
maybe accidentally removed a MON from the monmap?





[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Adiga, Anantha
Hi Eugen,

Yes, that is it. OSDs were restarted since mon a001s017 was reporting it is 
low on available space. How do I update the mon map to add mon.a001s016, as 
it is already online?
And how do I update the mgr map to include standby mgr.a001s018, as it is 
also running?


ceph mon dump
dumped monmap epoch 6
epoch 6
fsid 604d56db-2fab-45db-a9ea-c418f9a8cca8
last_changed 2024-03-31T23:54:18.692983+
created 2021-09-30T16:15:12.884602+
min_mon_release 16 (pacific)
election_strategy: 1
0: [v2:10.45.128.28:3300/0,v1:10.45.128.28:6789/0] mon.a001s018
1: [v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0] mon.a001s017


Thank you,
Anantha



[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Eugen Block
I have two approaches in mind. The first (and preferred) one would be to 
edit the mon spec to first remove mon.a001s016 and get to a clean state. 
Get the current spec with:

ceph orch ls mon --export > mon-edit.yaml

Edit the spec file so that mon.a001s016 is not part of it, then apply:

ceph orch apply -i mon-edit.yaml

This should remove the mon.a001s016 daemon. Then wait a few minutes or so 
(until the daemon is actually gone; check locally on the node with 
'cephadm ls' and in /var/lib/ceph//removed) and add it back to the spec 
file, then apply again. I would expect a third MON to be deployed. If that 
doesn't work for some reason you'll need to inspect the logs to find the 
root cause.


The second approach would be to remove and add the daemon manually:

ceph orch daemon rm mon.a001s016

Wait until it's really gone, then add it:

ceph orch daemon add mon a001s016

I'm not entirely sure about the 'daemon add mon' command; you might need to 
provide something else, as I'm typing this from memory.




[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Adiga, Anantha
Thank you. I will try the  export and import method first.

Thank you,
Anantha


[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Adiga, Anantha
Neither method updated the mon map. Is there a way to inject mon.a001s016 
into the current mon map?
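The classic manual route is monmaptool injection; on a cephadm cluster the stop/inject steps have to run against the mon container, so the following is an outline rather than a recipe. The IP comes from the extra_probe_peers output earlier in the thread, and the commands are printed rather than executed:

```shell
# Hedged sketch of manual monmap injection for the missing monitor.
NAME=a001s016
ADDR=10.45.128.26:6789   # address seen in extra_probe_peers in this thread

# Printed rather than executed; drop the 'echo' to run for real.
echo ceph mon getmap -o /tmp/monmap          # fetch the current map
echo monmaptool --print /tmp/monmap          # sanity-check: 2 mons listed
echo monmaptool --add "$NAME" "$ADDR" /tmp/monmap
# stop the mon daemon on $NAME first, then inject and restart it:
echo ceph-mon -i "$NAME" --inject-monmap /tmp/monmap
```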

# ceph mon dump
dumped monmap epoch 6
epoch 6
fsid 604d56db-2fab-45db-a9ea-c418f9a8cca8
last_changed 2024-03-31T23:54:18.692983+
created 2021-09-30T16:15:12.884602+
min_mon_release 16 (pacific)
election_strategy: 1
0: [v2:10.45.128.28:3300/0,v1:10.45.128.28:6789/0] mon.a001s018
1: [v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0] mon.a001s017

# ceph tell mon.a001s016 mon_status
Error ENOENT: problem getting command descriptions from mon.a001s016

# ceph tell mon.a001s017 mon_status
{
"name": "a001s017",
"rank": 1,
"state": "peon",
"election_epoch": 162,
"quorum": [
0,
1
],
"quorum_age": 69551,
"features": {
..
..


# ceph orch ls --service_name=mon --export > mon3.yml
service_type: mon
service_name: mon
placement:
  count: 3
  hosts:
  - a001s016
  - a001s017
  - a001s018

# cp mon3.yml mon2.yml
# vi mon2.yml
# cat mon2.yml
service_type: mon
service_name: mon
placement:
  count: 2
  hosts:
  - a001s017
  - a001s018

# ceph orch apply -i mon2.yml --dry-run
WARNING! Dry-Runs are snapshots of a certain point in time and are bound
to the current inventory setup. If any on these conditions changes, the
preview will be invalid. Please make sure to have a minimal
timeframe between planning and applying the specs.

SERVICESPEC PREVIEWS

+-+--++--+
|SERVICE  |NAME  |ADD_TO  |REMOVE_FROM   |
+-+--++--+
|mon  |mon   ||mon.a001s016  |
+-+--++--+

OSDSPEC PREVIEWS

+-+--+--+--++-+
|SERVICE  |NAME  |HOST  |DATA  |DB  |WAL  |
+-+--+--+--++-+
+-+--+--+--++-+

# ceph orch ls --service_name=mon --refresh
NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
mon  3/3  5m ago 18h  a001s016;a001s017;a001s018;count:3

# ceph orch ps --refresh | grep mon
mon.a001s016  a001s016  running (21h)  2s ago   21h  734M   2048M  16.2.5  6e73176320aa  8484a912f96a
mon.a001s017  a001s017  running (18h)  2s ago   21h  976M   2048M  16.2.5  6e73176320aa  e5e5cb6c256c
mon.a001s018  a001s018  running (5w)   2s ago   2y   1164M  2048M  16.2.5  6e73176320aa  7d2bb6d41f54
# ceph orch ps --refresh | grep mon
mon.a001s016  a001s016  running (21h)  37s ago  21h  734M   2048M  16.2.5  6e73176320aa  8484a912f96a
mon.a001s017  a001s017  running (18h)  37s ago  21h  977M   2048M  16.2.5  6e73176320aa  e5e5cb6c256c
mon.a001s018  a001s018  running (5w)   38s ago  2y   1166M  2048M  16.2.5  6e73176320aa  7d2bb6d41f54

# ceph orch apply -i mon2.yml
Scheduled mon update...
# ceph orch ps --refresh | grep mon
mon.a001s016  a001s016  running (21h)  21s ago  21h  734M   2048M  16.2.5  6e73176320aa  8484a912f96a
mon.a001s017  a001s017  running (18h)  20s ago  21h  962M   2048M  16.2.5  6e73176320aa  e5e5cb6c256c
mon.a001s018  a001s018  running (5w)   21s ago  2y   1156M  2048M  16.2.5  6e73176320aa  7d2bb6d41f54
# ceph orch ps --refresh | grep mon
mon.a001s017  a001s017  running (18h)  23s ago  21h  962M   2048M  16.2.5  6e73176320aa  e5e5cb6c256c
mon.a001s018  a001s018  running (5w)   24s ago  2y   1156M  2048M  16.2.5  6e73176320aa  7d2bb6d41f54
# ceph orch ps --refresh | grep mon
mon.a001s017  a001s017  running (18h)  27s ago  21h  962M   2048M  16.2.5  6e73176320aa  e5e5cb6c256c
mon.a001s018  a001s018  running (5w)   0s ago   2y   1154M  2048M  16.2.5  6e73176320aa  7d2bb6d41f54
# ceph orch ps --refresh | grep mon
mon.a001s017  a001s017  running (18h)  2s ago   21h  960M   2048M  16.2.5  6e73176320aa  e5e5cb6c256c
mon.a001s018  a001s018  running (5w)   3s ago   2y   1154M  2048M  16.2.5  6e73176320aa  7d2bb6d41f54
# ceph orch ps --refresh | grep mon
mon.a001s017  a001s017  running (18h)  5s ago   21h  960M   2048M  16.2.5  6e73176320aa  e5e5cb6c256c
mon.a001s018  a001s018  running (

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Adiga, Anantha
root@a001s016:/var/run/ceph/604d56db-2fab-45db-a9ea-c418f9a8cca8# ceph tell mon.a001s016 mon_status
Error ENOENT: problem getting command descriptions from mon.a001s016

a001s016 is outside the quorum; see below.

root@a001s016:/var/run/ceph/604d56db-2fab-45db-a9ea-c418f9a8cca8# ceph tell mon.a001s017 mon_status
{
"name": "a001s017",
"rank": 1,
"state": "peon",
"election_epoch": 162,
"quorum": [
0,
1
],
"quorum_age": 79938,
"features": {
"required_con": "2449958747317026820",
"required_mon": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus",
"octopus",
"pacific",
"elector-pinging"
],
"quorum_con": "4540138297136906239",
"quorum_mon": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus",
"octopus",
"pacific",
"elector-pinging"
]
},
"outside_quorum": [],
"extra_probe_peers": [
{
"addrvec": [
{
"type": "v2",
"addr": "10.45.128.26:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "10.45.128.26:6789",
"nonce": 0
}
]
}
],
"sync_provider": [],
"monmap": {
"epoch": 6,
"fsid": "604d56db-2fab-45db-a9ea-c418f9a8cca8",
"modified": "2024-03-31T23:54:18.692983Z",
"created": "2021-09-30T16:15:12.884602Z",
"min_mon_release": 16,
"min_mon_release_name": "pacific",
"election_strategy": 1,
"disallowed_leaders: ": "",
"stretch_mode": false,
"features": {
"persistent": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus",
"octopus",
"pacific",
"elector-pinging"
],
"optional": []
},
"mons": [
{
"rank": 0,
"name": "a001s018",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "10.45.128.28:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "10.45.128.28:6789",
"nonce": 0
}
]
},
"addr": "10.45.128.28:6789/0",
"public_addr": "10.45.128.28:6789/0",
"priority": 0,
"weight": 0,
"crush_location": "{}"
},
{
"rank": 1,
"name": "a001s017",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "10.45.128.27:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "10.45.128.27:6789",
"nonce": 0
}
]
},
"addr": "10.45.128.27:6789/0",
"public_addr": "10.45.128.27:6789/0",
"priority": 0,
"weight": 0,
"crush_location": "{}"
}
]
},
"feature_map": {
"mon": [
{
"features": "0x3f01cfb9fffd",
"release": "luminous",
"num": 1
}
],
"mds": [
{
"features": "0x3f01cfb9fffd",
"release": "luminous",
"num": 3
}
],
"osd": [
{
"features": "0x3f01cfb9fffd",
"release": "luminous",
"num": 15
}
],
"client": [
{
"features": "0x2f018fb86aa42ada",
"release": "luminous",
"num": 50
},
{
"features": "0x2f018fb87aa4aafe",
"release": "luminous",
"num": 40
},
{
"features": "0x3f01cfb8ffed",
"release": "luminous",
"num": 1
},
{
"features": "0x3f01cfb9fffd",
"release": "luminous",
"num": 72
}
]
},
"stretch_mode": false
}

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Adiga, Anantha
# ceph mon stat
e6: 2 mons at {a001s017=[v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0],a001s018=[v2:10.45.128.28:3300/0,v1:10.45.128.28:6789/0]}, election epoch 162, leader 0 a001s018, quorum 0,1 a001s018,a001s017

# ceph orch ps | grep mon
mon.a001s016  a001s016  running (3h)   6m ago   3h  527M   2048M  16.2.5  6e73176320aa  39db8cfba7e1
mon.a001s017  a001s017  running (22h)  47s ago  1h  993M   2048M  16.2.5  6e73176320aa  e5e5cb6c256c
mon.a001s018  a001s018  running (5w)   48s ago  2y  1167M  2048M  16.2.5  6e73176320aa  7d2bb6d41f54

# ceph mgr stat
{
"epoch": 1130365,
"available": true,
"active_name": "a001s016.ctmoay",
"num_standby": 1
}

# ceph orch ps | grep mgr
mgr.a001s016.ctmoay  a001s016  *:8443  running (18M)  109s ago  23M  518M  -  16.2.5  6e73176320aa  169cafcbbb99
mgr.a001s017.bpygfm  a001s017  *:8443  running (19M)  5m ago    23M  501M  -  16.2.5  6e73176320aa  97257195158c
mgr.a001s018.hcxnef  a001s018  *:8443  running (20M)  5m ago    23M  113M  -  16.2.5  6e73176320aa  21ba5896cee2

# ceph orch ls --service_name=mgr --export
service_type: mgr
service_name: mgr
placement:
  count: 3
  hosts:
  - a001s016
  - a001s017
  - a001s018

# ceph orch ls --service_name=mon --export
service_type: mon
service_name: mon
placement:
  count: 3
  hosts:
  - a001s016
  - a001s017
  - a001s018


[ceph-users] Re: S3 Partial Reads from Erasure Pool

2024-04-01 Thread Joshua Baergen
I think it depends what you mean by rados objects and s3 objects here. If
you're talking about an object that was uploaded via MPU, and thus may
comprise many rados objects, I don't think there's a difference in read
behaviors based on pool type. If you're talking about reading a subset byte
range from a single rados object stored on an EC pool, yes, the whole
object is read from the pool in order to serve that subset read, something
that https://github.com/ceph/ceph/pull/55196 endeavours to address.
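
For anyone wanting to reproduce the subset-read pattern against RGW, a minimal sketch using the AWS CLI (endpoint, bucket, and key names here are placeholders) is:

```shell
# Fetch only the first 1 MiB of an S3 object via a ranged GET.
# Against an EC data pool (prior to the PR above), each rados object
# touched by the range is read in full on the OSD side before the
# requested slice is returned to the client.
aws s3api get-object \
  --endpoint-url http://rgw.example.com:8080 \
  --bucket mybucket \
  --key large-object \
  --range bytes=0-1048575 \
  /tmp/first-mib.bin
```

Comparing throughput of many such ranged GETs on a replicated pool versus an EC pool should make the read-amplification visible.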

Josh

On Mon, Mar 25, 2024, 4:27 p.m.  wrote:

> I am dealing with a cluster that is having terrible performance with
> partial reads from an erasure coded pool. Warp tests and s3bench tests
> result in acceptable performance, but when the application hits the data,
> performance plummets. Can anyone clear this up for me: when radosgw gets a
> partial read, does it have to assemble all the rados objects that make up
> the s3 object before returning the range? With a replicated pool I am
> seeing 6 to 7 GiB/s of read performance and only 1 GiB/s of read from the
> erasure coded pool, which leads me to believe that the replicated pool is
> returning just the rados objects for the partial s3 object and the erasure
> coded pool is not.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Replace block drives of combined NVME+HDD OSDs

2024-04-01 Thread Zakhar Kirpichenko
Hi,

Unfortunately, some of our HDDs failed and we need to replace these drives
which are parts of "combined" OSDs (DB/WAL on NVME, block storage on HDD).
All OSDs are defined with a service definition similar to this one:

```
service_type: osd
service_id: ceph02_combined_osd
service_name: osd.ceph02_combined_osd
placement:
  hosts:
  - ceph02
spec:
  data_devices:
paths:
- /dev/sda
- /dev/sdb
- /dev/sdc
- /dev/sdd
- /dev/sde
- /dev/sdf
- /dev/sdg
- /dev/sdh
- /dev/sdi
  db_devices:
paths:
- /dev/nvme0n1
- /dev/nvme1n1
  filter_logic: AND
  objectstore: bluestore
```

In the above example, HDDs `sda` and `sdb` are not readable and data cannot
be copied over to new HDDs. NVME partitions of `nvme0n1` with DB/WAL data
are intact, but I guess that data is useless. I think the best approach is
to replace the dead drives and completely rebuild each affected OSD. How
should we go about this, preferably in a way that other OSDs on the node
remain unaffected and operational?
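
For what it's worth, the usual cephadm flow for this (a sketch — the OSD id 12 below is a placeholder; map the dead drives to their OSD ids first with `ceph osd metadata` or `ceph device ls`) looks like:

```shell
# Schedule removal, preserving the OSD id for reuse; --zap also clears
# the matching DB/WAL LV on the NVMe so it can be re-created for the
# replacement drive. Other OSDs on the host are not touched.
ceph orch osd rm 12 --replace --zap

# Watch the removal progress.
ceph orch osd rm status

# After physically swapping the HDD, refresh the device inventory; the
# existing spec (osd.ceph02_combined_osd) should then pick up the new
# device and rebuild the OSD with its DB/WAL back on the NVMe.
ceph orch device ls ceph02 --refresh
```

Since the failed HDDs are unreadable, the data cannot be migrated; the rebuilt OSDs will simply backfill from the remaining replicas.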

I would appreciate any advice or pointers to the relevant documentation.

Best regards,
Zakhar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions about rbd flatten command

2024-04-01 Thread Henry lol
I'm not sure, but it seems that read and write operations are
performed for every object in the image.
If so, is there any way to apply QoS to the flatten operation?
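
That matches how flatten works: it is driven by the client through librbd, which reads each backing object from the parent and writes it into the clone — hence the client-side network load. Two knobs that may help (a sketch; option names as in recent releases, and whether librbd QoS throttles flatten's internal copy-ups can depend on the version) are the per-image librbd QoS limits and the management-op concurrency:

```shell
# Cap client-side throughput for one image (here 100 MiB/s); librbd QoS
# applies to I/O issued through this client.
rbd config image set mypool/myimage rbd_qos_bps_limit 104857600

# Or cap IOPS instead of bytes per second.
rbd config image set mypool/myimage rbd_qos_iops_limit 500

# Reduce how many objects a single flatten copies in parallel
# (the default is 10), which directly limits its network footprint.
rbd flatten mypool/myimage --rbd-concurrent-management-ops 4
```

Lowering `rbd_concurrent_management_ops` is the surer lever for flatten specifically, since it bounds the in-flight copy-up operations per job.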

On Mon, Apr 1, 2024 at 11:59 PM Henry lol wrote:
>
> Hello,
>
> I executed multiple 'rbd flatten' commands simultaneously on a client.
> The elapsed time of each flatten job increased as the number of jobs
> increased, and network I/O was nearly full.
>
> so, I have two questions.
> 1. isn’t the flatten job running within the ceph cluster? Why is
> client-side network I/O so high?
> 2. How can I apply qos for each flatten job to reduce network I/O?
>
> Sincerely,
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io