[ceph-users] Failed cephadm Upgrade - ValueError

2021-05-03 Thread Ashley Merrick
Hello,Wondering if anyone had any feedback on some commands I could try to 
manually update the current OSD that is down to 16.2.1 so I can at least get 
around this upgrade bug and back to 100%?If there is any log's or if it seems a 
new bug and I should create a bugzilla report do let me know.Thanks
> On Fri Apr 30 2021 21:54:30 GMT+0800 (Singapore Standard Time), Ashley 
> Merrick  wrote:
> Hello All,I was running 15.2.8 via cephadm on docker Ubuntu 20.04I just 
> attempted to upgrade to 16.2.1 via the automated method, it successfully 
> upgraded the mon/mgr/mds and some OSD's, however it then failed on an OSD and 
> hasn't been able to pass even after stopping and restarting the upgrade.It 
> reported the following ""message": "Error: UPGRADEREDEPLOYDAEMON: Upgrading 
> daemon osd.35 on host sn-s01 failed.""If I run 'ceph health detail' I get 
> lot's of the following error : "ValueError: not enough values to unpack 
> (expected 2, got 1)" throughout the detail reportUpon googling, it looks like 
> I am hitting something along the lines of https://158.69.68.89/issues/48924 & 
> https://tracker.ceph.com/issues/49522What do I need to do to either get 
> around this bug, or a way I can manually upgrade the remaining ceph OSD's to 
> 16.2.1, currently my cluster is working but the last OSD it failed to upgrade 
> is currently offline (I guess as no image attached to it now as it failed to 
> pull it), and I 
 have a cluster with OSD's from not 15.2.8 and 16.2.1Thanks
>  
> Sent via MXlogin

 
Sent via MXlogin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-05-03 Thread Ilya Dryomov
On Mon, May 3, 2021 at 9:20 AM Magnus Harlander  wrote:
>
> Am 03.05.21 um 00:44 schrieb Ilya Dryomov:
>
> On Sun, May 2, 2021 at 11:15 PM Magnus Harlander  wrote:
>
> Hi,
>
> I know there is a thread about problems with mounting cephfs with 5.11 
> kernels.
>
> ...
>
> Hi Magnus,
>
> What is the output of "ceph config dump"?
>
> Instead of providing those lines, can you run "ceph osd getmap 64281 -o
> osdmap.64281" and attach osdmap.64281 file?
>
> Thanks,
>
> Ilya
>
> Hi Ilya,
>
> [root@s1 ~]# ceph config dump
> WHO MASK  LEVEL OPTION VALUE RO
> globalbasic device_failure_prediction_mode local
> globaladvanced  ms_bind_ipv4   false
>   mon advanced  auth_allow_insecure_global_id_reclaim  false
>   mon advanced  mon_lease  8.00
>   mgr advanced  mgr/devicehealth/enable_monitoring true
>
> getmap output is attached,

I see the problem, but I don't understand the root cause yet.  It is
related to the two missing OSDs:

> May 02 22:54:05 islay kernel: libceph: no match of type 1 in addrvec
> May 02 22:54:05 islay kernel: libceph: corrupt full osdmap (-2) epoch 64281 
> off 3154 (a90fe1d7 of 0083f4bd-c03bdc9b)

> max_osd 12

> osd.0 up   in  ... 
> [v2:192.168.200.141:6804/3027,v1:192.168.200.141:6805/3027] ... exists,up 
> 631bc170-45fd-4948-9a5e-4c278569c0bc
> osd.1 up   in  ... 
> [v2:192.168.200.140:6811/3066,v1:192.168.200.140:6813/3066] ... exists,up 
> 660a762c-001d-4160-a9ee-d0acd078e776
> osd.2 up   in  ... 
> [v2:192.168.200.141:6815/3008,v1:192.168.200.141:6816/3008] ... exists,up 
> e4d94d3a-ec58-46a1-b61c-c47dd39012ed
> osd.3 up   in  ... 
> [v2:192.168.200.140:6800/3067,v1:192.168.200.140:6801/3067] ... exists,up 
> 26d25060-fd99-4d15-a1b2-ebb77646671e
> osd.4 up   in  ... 
> [v2:192.168.200.140:6804/3049,v1:192.168.200.140:6806/3049] ... exists,up 
> 238f197d-ecbc-4588-8a99-6a63c9bb1a17
> osd.5 up   in  ... 
> [v2:192.168.200.140:6816/3073,v1:192.168.200.140:6817/3073] ... exists,up 
> a9dcb26f-0f1c-4067-a26b-a29939285e0b
> osd.6 up   in  ... 
> [v2:192.168.200.141:6808/3020,v1:192.168.200.141:6809/3020] ... exists,up 
> f399b47d-063f-4b2f-bd93-289377dc9945
> osd.7 up   in  ... 
> [v2:192.168.200.141:6800/3023,v1:192.168.200.141:6801/3023] ... exists,up 
> 3557ceca-7bd8-401e-abd3-59bee168e8f6
> osd.8 up   in  ... 
> [v2:192.168.200.141:6812/3017,v1:192.168.200.141:6813/3017] ... exists,up 
> 7f9cad3f-163d-4bb7-85b2-fffd46982fff
> osd.9 up   in  ... 
> [v2:192.168.200.140:6805/3053,v1:192.168.200.140:6807/3053] ... exists,up 
> c543b12a-f9bf-4b83-af16-f6b8a3926e69

The kernel client is failing to parse addrvec entries for non-existent
osd10 and osd11.  It is probably being too stringent, but before fixing
it I'd like to understand what happened to those OSDs.  It looks like
they were removed but not completely.

What let to their removal?  What commands were used?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cannot create issue in bugtracker

2021-05-03 Thread Tobias Urdin
Hello,

Anybody, still error?


Best regards

-


Internal error
An error occurred on the page you were trying to access.
If you continue to experience problems please contact your Redmine 
administrator for assistance.

If you are the Redmine administrator, check your log files for details about 
the error.


From: Tobias Urdin 
Sent: Friday, April 30, 2021 2:52:57 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Cannot create issue in bugtracker

Hello,


Is it only me that's getting Internal error when trying to create issues in the 
bugtracker for some day or two?

https://tracker.ceph.com/issues/new


Best regards
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cannot create issue in bugtracker

2021-05-03 Thread David Caro

I created an issue during the weekend without problems:

https://tracker.ceph.com/issues/50604


On 05/03 09:36, Tobias Urdin wrote:
> Hello,
> 
> Anybody, still error?
> 
> 
> Best regards
> 
> -
> 
> 
> Internal error
> An error occurred on the page you were trying to access.
> If you continue to experience problems please contact your Redmine 
> administrator for assistance.
> 
> If you are the Redmine administrator, check your log files for details about 
> the error.
> 
> 
> From: Tobias Urdin 
> Sent: Friday, April 30, 2021 2:52:57 PM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Cannot create issue in bugtracker
> 
> Hello,
> 
> 
> Is it only me that's getting Internal error when trying to create issues in 
> the bugtracker for some day or two?
> 
> https://tracker.ceph.com/issues/new
> 
> 
> Best regards
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

-- 
David Caro
SRE - Cloud Services
Wikimedia Foundation 
PGP Signature: 7180 83A2 AC8B 314F B4CE  1171 4071 C7E1 D262 69C3

"Imagine a world in which every single human being can freely share in the
sum of all knowledge. That's our commitment."


signature.asc
Description: PGP signature
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-05-03 Thread Ilya Dryomov
On Mon, May 3, 2021 at 12:00 PM Magnus Harlander  wrote:
>
> Am 03.05.21 um 11:22 schrieb Ilya Dryomov:
>
> max_osd 12
>
> I never had more then 10 osds on the two osd nodes of this cluster.
>
> I was running a 3 osd-node cluster earlier with more than 10
> osds, but the current cluster has been setup from scratch and
> I definitely don't remember having ever more than 10 osds!
> Very strange!
>
> I had to replace 2 disks because of DOA-Problems, but for that
> I removed 2 osds and created new ones.
>
> I used ceph-deploy do create new osds.
>
> To delete osd.8 I used:
>
> # take it out
> ceph osd out 8
>
> # wait for rebalancing to finish
>
> systemctl stop ceph-osd@8
>
> # wait for a healthy cluster
>
> ceph osd purge 8 --yes-i-really-mean-it
>
> # edit ceph.conf and remove osd.8
>
> ceph-deploy --overwrie-conf admin s0 s1
>
> # Add the new disk and:
> ceph-deploy osd create --data /dev/sdc s0
> ...
>
> it get's created with the next free osd num (8) because purge releases 8 for 
> reuse

It would be nice to track it down, but for the immediate issue of
kernel 5.11 not working, "ceph osd setmaxosd 10" should fix it.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-05-03 Thread Ilya Dryomov
On Mon, May 3, 2021 at 12:27 PM Magnus Harlander  wrote:
>
> Am 03.05.21 um 12:25 schrieb Ilya Dryomov:
>
> ceph osd setmaxosd 10
>
> Bingo! Mount works again.
>
> Vry strange things are going on here (-:
>
> Thanx a lot for now!! If I can help to track it down, please let me know.

Good to know it helped!  I'll think about this some more and probably
plan to patch the kernel client to be less stringent and not choke on
this sort of misconfiguration.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-05-03 Thread Ilya Dryomov
On Mon, May 3, 2021 at 12:24 PM Magnus Harlander  wrote:
>
> Am 03.05.21 um 11:22 schrieb Ilya Dryomov:
>
> There is a 6th osd directory on both machines, but it's empty
>
> [root@s0 osd]# ll
> total 0
> drwxrwxrwt. 2 ceph ceph 200  2. Mai 16:31 ceph-1
> drwxrwxrwt. 2 ceph ceph 200  2. Mai 16:31 ceph-3
> drwxrwxrwt. 2 ceph ceph 200  2. Mai 16:31 ceph-4
> drwxrwxrwt. 2 ceph ceph 200  2. Mai 16:31 ceph-5
> drwxr-xr-x. 2 ceph ceph   6  3. Apr 19:50 ceph-8 <===
> drwxrwxrwt. 2 ceph ceph 200  2. Mai 16:31 ceph-9
> [root@s0 osd]# pwd
> /var/lib/ceph/osd
>
> [root@s1 osd]# ll
> total 0
> drwxrwxrwt  2 ceph ceph 200 May  2 15:39 ceph-0
> drwxr-xr-x. 2 ceph ceph   6 Mar 13 17:54 ceph-1 <===
> drwxrwxrwt  2 ceph ceph 200 May  2 15:39 ceph-2
> drwxrwxrwt  2 ceph ceph 200 May  2 15:39 ceph-6
> drwxrwxrwt  2 ceph ceph 200 May  2 15:39 ceph-7
> drwxrwxrwt  2 ceph ceph 200 May  2 15:39 ceph-8
> [root@s1 osd]# pwd
> /var/lib/ceph/osd
>
> The bogus directories are empty and they are
> used on the other machine for a real osd!
>
> How is that?
>
> Should I remove them and restart ceph.target?

I don't think empty directories matter at this point.  You may not have
had 12 OSDs at any point in time, but the max_osd value appears to have
gotten bumped when you were replacing those disks.

Note that max_osd being greater than the number of OSDs is not a big
problem by itself.  The osdmap is going to be larger and require more
memory but that's it.  You can test by setting it back to 12 and trying
to mount -- it should work.  The issue is specific to how to those OSDs
were replaced -- something went wrong and the osdmap somehow ended up
with rather bogus addrvec entries.  Not sure if it's ceph-deploy's
fault, something weird in ceph.conf (back then) or a an actual ceph
bug.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] OSD slow ops warning not clearing after OSD down

2021-05-03 Thread Frank Schilder
Dear cephers,

I have a strange problem. An OSD went down and recovery finished. For some 
reason, I have a slow ops warning for the failed OSD stuck in the system:

health: HEALTH_WARN
430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops

The OSD is auto-out:

| 580 | ceph-22 |0  |0  |0   | 0   |0   | 0   | 
autoout,exists |

It is probably a warning dating back to just before the fail. How can I clear 
it?

Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSD slow ops warning not clearing after OSD down

2021-05-03 Thread Vladimir Sigunov
Hi Frank.
Check your cluster for inactive/incomplete placement groups. I saw similar 
behavior on Octopus when some pgs stuck in incomplete/inactive or peering state.


From: Frank Schilder 
Sent: Monday, May 3, 2021 3:42:48 AM
To: ceph-users@ceph.io 
Subject: [ceph-users] OSD slow ops warning not clearing after OSD down

Dear cephers,

I have a strange problem. An OSD went down and recovery finished. For some 
reason, I have a slow ops warning for the failed OSD stuck in the system:

health: HEALTH_WARN
430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops

The OSD is auto-out:

| 580 | ceph-22 |0  |0  |0   | 0   |0   | 0   | 
autoout,exists |

It is probably a warning dating back to just before the fail. How can I clear 
it?

Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How can I get tail information a parted rados object

2021-05-03 Thread Rob Haverkamp
Hi Morphin,

There are multiple ways you can do this.

  1.  run a radosgw-admin bucket radoslist --bucket  write 
that output to a file, grep all entries containing the object name ' 
im034113.jpg', sort that list and download them.
  2.  run a radosgw-admin object stat --bucket  --object  this will output a json document. With the information in the manifest 
key you can find out what rados objects belong to the RGW object.


Kind regards,

Rob
https://www.42on.com/



From: by morphin 
Sent: Saturday, May 1, 2021 11:09 PM
To: Ceph Users 
Subject: [ceph-users] How can I get tail information a parted rados object

Hello.

I'm trying to export objects from rados with rados get. Some objects
bigger than 4M and they have tails. Is there any easy way to get tail
information an object?

For example this is an object:
- c106b26b.3_Img/2017/12/im034113.jpg
These are the objet parts:
- 
c106b26b.3__multipart_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.1
- 
c106b26b.3__shadow_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.1_1
- 
c106b26b.3__shadow_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.1_2
- 
c106b26b.3__multipart_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.2

As you can see the object has 2 multipart and 2 shadow object.
This jpg only works when I get all the parts and make it one with the order.
order: "cat 9.1 9.1_1 9.1_2 9.2 > im034113.jpg"

I'm trying to write a code and the code gonna read objects from a list
and find all the parts, bring it together with the order...  But I
couldn't find a good way to get part information.

I followed the link https://www.programmersought.com/article/31497869978/
and I get the object manifest with getxattr and decode it with
"ceph-dencoder type RGWBucketEnt  decode dump_json"
But in the manifest I can not find a path to code it. It's not useful.
Is there any different place that I can take the part information an
object?

Or better! Is there any tool to export an object with its tails?

btw: these objects created by RGW using s3. RGW can not access these
files. Because of that I'm trying to export it from rados and send it
to different RGW.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSD slow ops warning not clearing after OSD down

2021-05-03 Thread Vladimir Sigunov
Hi Frank,
Yes, I would purge the osd. The cluster looks absolutely healthy except of this 
osd.584 Probably,  the purge will help the cluster to forget this faulty one. 
Also, I would restart monitors, too.
With the amount of data you maintain in your cluster, I don't think your 
ceph.conf contains any information about some particular osds, but if it does, 
don't forget to remove the configuration of osd.584 from the ceph.conf

Get Outlook for Android


From: Frank Schilder 
Sent: Monday, May 3, 2021 8:37:09 AM
To: Vladimir Sigunov ; ceph-users@ceph.io 

Subject: Re: OSD slow ops warning not clearing after OSD down

Hi Vladimir,

thanks for your reply. I did, the cluster is healthy:

[root@gnosis ~]# ceph status
  cluster:
id: ---
health: HEALTH_WARN
430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops

  services:
mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
mgr: ceph-01(active), standbys: ceph-02, ceph-03
mds: con-fs2-2/2/2 up  {0=ceph-08=up:active,1=ceph-12=up:active}, 2 
up:standby
osd: 584 osds: 578 up, 578 in

  data:
pools:   11 pools, 3215 pgs
objects: 610.3 M objects, 1.2 PiB
usage:   1.5 PiB used, 4.6 PiB / 6.0 PiB avail
pgs: 3191 active+clean
 13   active+clean+scrubbing+deep
 9active+clean+snaptrim_wait
 2active+clean+snaptrim

  io:
client:   358 MiB/s rd, 56 MiB/s wr, 2.35 kop/s rd, 1.32 kop/s wr

[root@gnosis ~]# ceph health detail
HEALTH_WARN 430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops
SLOW_OPS 430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops

OSD 580 is down+out and the message does not even increment the seconds. Its 
probably stuck in some part of the health checking that tries to query 580 and 
doesn't understand that the OSD being down means there are no ops.

I tried to restart the OSD on this disk, but it seems completely rigged. The 
iDRAC log on the server says that the disk was removed during operation 
possibly due to a physical connection fail on the SAS lanes. I somehow need to 
get rid of this message and am wondering of purging the OSD would help.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Vladimir Sigunov 
Sent: 03 May 2021 13:45:19
To: ceph-users@ceph.io; Frank Schilder
Subject: Re: OSD slow ops warning not clearing after OSD down

Hi Frank.
Check your cluster for inactive/incomplete placement groups. I saw similar 
behavior on Octopus when some pgs stuck in incomplete/inactive or peering state.


From: Frank Schilder 
Sent: Monday, May 3, 2021 3:42:48 AM
To: ceph-users@ceph.io 
Subject: [ceph-users] OSD slow ops warning not clearing after OSD down

Dear cephers,

I have a strange problem. An OSD went down and recovery finished. For some 
reason, I have a slow ops warning for the failed OSD stuck in the system:

health: HEALTH_WARN
430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops

The OSD is auto-out:

| 580 | ceph-22 |0  |0  |0   | 0   |0   | 0   | 
autoout,exists |

It is probably a warning dating back to just before the fail. How can I clear 
it?

Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSD slow ops warning not clearing after OSD down

2021-05-03 Thread Dan van der Ster
Wait, first just restart the leader mon.

See: https://tracker.ceph.com/issues/47380 for a related issue.

-- dan

On Mon, May 3, 2021 at 2:55 PM Vladimir Sigunov
 wrote:
>
> Hi Frank,
> Yes, I would purge the osd. The cluster looks absolutely healthy except of 
> this osd.584 Probably,  the purge will help the cluster to forget this faulty 
> one. Also, I would restart monitors, too.
> With the amount of data you maintain in your cluster, I don't think your 
> ceph.conf contains any information about some particular osds, but if it 
> does, don't forget to remove the configuration of osd.584 from the ceph.conf
>
> Get Outlook for Android
>
> 
> From: Frank Schilder 
> Sent: Monday, May 3, 2021 8:37:09 AM
> To: Vladimir Sigunov ; ceph-users@ceph.io 
> 
> Subject: Re: OSD slow ops warning not clearing after OSD down
>
> Hi Vladimir,
>
> thanks for your reply. I did, the cluster is healthy:
>
> [root@gnosis ~]# ceph status
>   cluster:
> id: ---
> health: HEALTH_WARN
> 430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops
>
>   services:
> mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
> mgr: ceph-01(active), standbys: ceph-02, ceph-03
> mds: con-fs2-2/2/2 up  {0=ceph-08=up:active,1=ceph-12=up:active}, 2 
> up:standby
> osd: 584 osds: 578 up, 578 in
>
>   data:
> pools:   11 pools, 3215 pgs
> objects: 610.3 M objects, 1.2 PiB
> usage:   1.5 PiB used, 4.6 PiB / 6.0 PiB avail
> pgs: 3191 active+clean
>  13   active+clean+scrubbing+deep
>  9active+clean+snaptrim_wait
>  2active+clean+snaptrim
>
>   io:
> client:   358 MiB/s rd, 56 MiB/s wr, 2.35 kop/s rd, 1.32 kop/s wr
>
> [root@gnosis ~]# ceph health detail
> HEALTH_WARN 430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops
> SLOW_OPS 430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops
>
> OSD 580 is down+out and the message does not even increment the seconds. Its 
> probably stuck in some part of the health checking that tries to query 580 
> and doesn't understand that the OSD being down means there are no ops.
>
> I tried to restart the OSD on this disk, but it seems completely rigged. The 
> iDRAC log on the server says that the disk was removed during operation 
> possibly due to a physical connection fail on the SAS lanes. I somehow need 
> to get rid of this message and am wondering of purging the OSD would help.
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: Vladimir Sigunov 
> Sent: 03 May 2021 13:45:19
> To: ceph-users@ceph.io; Frank Schilder
> Subject: Re: OSD slow ops warning not clearing after OSD down
>
> Hi Frank.
> Check your cluster for inactive/incomplete placement groups. I saw similar 
> behavior on Octopus when some pgs stuck in incomplete/inactive or peering 
> state.
>
> 
> From: Frank Schilder 
> Sent: Monday, May 3, 2021 3:42:48 AM
> To: ceph-users@ceph.io 
> Subject: [ceph-users] OSD slow ops warning not clearing after OSD down
>
> Dear cephers,
>
> I have a strange problem. An OSD went down and recovery finished. For some 
> reason, I have a slow ops warning for the failed OSD stuck in the system:
>
> health: HEALTH_WARN
> 430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops
>
> The OSD is auto-out:
>
> | 580 | ceph-22 |0  |0  |0   | 0   |0   | 0   | 
> autoout,exists |
>
> It is probably a warning dating back to just before the fail. How can I clear 
> it?
>
> Thanks and best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Lokendra Rathour
Hi Team,
I was setting up the ceph cluster with

   - Node Details:3 Mon,2 MDS, 2 Mgr, 2 RGW
   - Deployment Type: Active Standby
   - Testing Mode: Failover of MDS Node
   - Setup : Octopus (15.2.7)
   - OS: centos 8.3
   - hardware: HP
   - Ram:  128 GB on each Node
   - OSD: 2 ( 1 tb each)
   - Operation: Normal I/O with mkdir on every 1 second.

T*est Case: Power-off any active MDS Node for failover to happen*

*Observation:*
We have observed that whenever an active MDS Node is down it takes around*
40 seconds* to activate the standby MDS Node.
on further checking the logs for the new-handover MDS Node we have seen
delay on the basis of following inputs:

   1. 10 second delay after which Mon calls for new Monitor election
  1.  [log]  0 log_channel(cluster) log [INF] : mon.cephnode1 calling
  monitor election
   2. 5 second delay in which newly elected Monitor is elected
  1. [log] 0 log_channel(cluster) log [INF] : mon.cephnode1 is new
  leader, mons cephnode1,cephnode3 in quorum (ranks 0,2)
  3. the addition beacon grace time for which the system waits before
   which it enables standby MDS node activation. (approx delay of 19 seconds)
  1. defaults :  sudo ceph config get mon mds_beacon_grace
  15.00
  2. sudo ceph config get mon mds_beacon_interval
  5.00
  3. [log] - 2021-04-30T18:23:10.136+0530 7f4e3925c700  1
  mon.cephnode2@1(leader).mds e776 no beacon from mds.0.771 (gid:
  639443 addr: [v2:
  10.0.4.10:6800/2172152716,v1:10.0.4.10:6801/2172152716] state:
  up:active)* since 18.7951*
   4. *in Total it takes around 40 seconds to handover and activate passive
   standby node. *

*Query:*

   1. Can these variables be configured ?  which we have tried,but are not
   aware of the overall impact on the ceph cluster because of these changes
  1. By tuning these values we could reach the minimum time of 12
  seconds in which the active node comes up.
  2. Values taken to get the said time :
 1. *mon_election_timeout* (default 5) - configured as 1
 2. *mon_lease*(default 5)  - configured as 2
 3.  *mds_beacon_grace* (default 15) - configured as 5
 4.  *mds_beacon_interval* (default 5) - configured as 1

We need to tune this setup to get the failover duration as low as 5-7
seconds.

Please suggest/support and share your inputs, my setup is ready and already
we are testing with multiple scenarios so that we are able to achive min
failover duration.

-- 
~ Lokendra
www.inertiaspeaks.com
www.inertiagroups.com
skype: lokendrarathour
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Olivier AUDRY
hello

perhaps you should have more than one MDS active.

mds: cephfs:3 {0=cephfs-d=up:active,1=cephfs-e=up:active,2=cephfs-
a=up:active} 1 up:standby-replay

I got 3 active mds and one standby.

I'm using rook in kubernetes for this setup.

oau

Le lundi 03 mai 2021 à 19:06 +0530, Lokendra Rathour a écrit :
> Hi Team,
> I was setting up the ceph cluster with
> 
>- Node Details:3 Mon,2 MDS, 2 Mgr, 2 RGW
>- Deployment Type: Active Standby
>- Testing Mode: Failover of MDS Node
>- Setup : Octopus (15.2.7)
>- OS: centos 8.3
>- hardware: HP
>- Ram:  128 GB on each Node
>- OSD: 2 ( 1 tb each)
>- Operation: Normal I/O with mkdir on every 1 second.
> 
> T*est Case: Power-off any active MDS Node for failover to happen*
> 
> *Observation:*
> We have observed that whenever an active MDS Node is down it takes
> around*
> 40 seconds* to activate the standby MDS Node.
> on further checking the logs for the new-handover MDS Node we have
> seen
> delay on the basis of following inputs:
> 
>1. 10 second delay after which Mon calls for new Monitor election
>   1.  [log]  0 log_channel(cluster) log [INF] : mon.cephnode1
> calling
>   monitor election
>2. 5 second delay in which newly elected Monitor is elected
>   1. [log] 0 log_channel(cluster) log [INF] : mon.cephnode1 is
> new
>   leader, mons cephnode1,cephnode3 in quorum (ranks 0,2)
>   3. the addition beacon grace time for which the system waits
> before
>which it enables standby MDS node activation. (approx delay of 19
> seconds)
>   1. defaults :  sudo ceph config get mon mds_beacon_grace
>   15.00
>   2. sudo ceph config get mon mds_beacon_interval
>   5.00
>   3. [log] - 2021-04-30T18:23:10.136+0530 7f4e3925c700  1
>   mon.cephnode2@1(leader).mds e776 no beacon from mds.0.771 (gid:
>   639443 addr: [v2:
>   10.0.4.10:6800/2172152716,v1:10.0.4.10:6801/2172152716] state:
>   up:active)* since 18.7951*
>4. *in Total it takes around 40 seconds to handover and activate
> passive
>standby node. *
> 
> *Query:*
> 
>1. Can these variables be configured ?  which we have tried,but
> are not
>aware of the overall impact on the ceph cluster because of these
> changes
>   1. By tuning these values we could reach the minimum time of 12
>   seconds in which the active node comes up.
>   2. Values taken to get the said time :
>  1. *mon_election_timeout* (default 5) - configured as 1
>  2. *mon_lease*(default 5)  - configured as 2
>  3.  *mds_beacon_grace* (default 15) - configured as 5
>  4.  *mds_beacon_interval* (default 5) - configured as 1
> 
> We need to tune this setup to get the failover duration as low as 5-7
> seconds.
> 
> Please suggest/support and share your inputs, my setup is ready and
> already
> we are testing with multiple scenarios so that we are able to achive
> min
> failover duration.
> 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Failed cephadm Upgrade - ValueError

2021-05-03 Thread Ashley Merrick
Just checked cluster logs and they are full of:cephadm exited with an error 
code: 1, stderr:Reconfig daemon osd.16 ... Traceback (most recent call last): 
File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 7931, in  main() File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 7919, in main r = ctx.func(ctx) File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 1717, in defaultimage return func(ctx) File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 4162, in command_deploy c = get_container(ctx, ctx.fsid, daemon_type, 
daemon_id, File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b
 697d119482", line 2451, in get_container 
volume_mounts=get_container_mounts(ctx, fsid, daemon_type, daemon_id), File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 2292, in get_container_mounts if HostFacts(ctx).selinux_enabled: File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 6451, in selinux_enabled return (self.kernel_security['type'] == 
'SELinux') and \ File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 6434, in kernel_security ret = _fetch_apparmor() File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 6415, in _fetch_apparmor item, mode = line.split(' ') ValueError: not 
enough values to unpack (expected 2, got 1) Traceback (most recent call
  last): File "/usr/share/ceph/mgr/cephadm/serve.py", line 1172, in 
_remote_connection yield (conn, connr) File 
"/usr/share/ceph/mgr/cephadm/serve.py", line 1087, in _run_cephadm code, 
'\n'.join(err))) orchestrator._interface.OrchestratorError: cephadm exited with 
an error code: 1, stderr:Reconfig daemon osd.16 ... Traceback (most recent call 
last): File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 7931, in  main() File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 7919, in main r = ctx.func(ctx) File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 1717, in _default_image return func(ctx) File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 
 4162, in command_deploy c = get_container(ctx, ctx.fsid, daemon_type, 
daemon_id, File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 2451, in get_container volume_mounts=get_container_mounts(ctx, fsid, 
daemon_type, daemon_id), File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 2292, in get_container_mounts if HostFacts(ctx).selinux_enabled: File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 6451, in selinux_enabled return (self.kernel_security['type'] == 
'SELinux') and \ File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 6434, in kernel_security ret = _fetch_apparmor() File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bd
 c911a9c50d6408adfca696c2faaa65c018d660a3b697d119482", line 6415, in 
_fetch_apparmor item, mode = line.split(' ') ValueError: not enough values to 
unpack (expected 2, got 1)being repeated over and over again for each OSD.Again 
listing "ValueError: not enough values to unpack (expected 2, got 1)"
> On Mon May 03 2021 17:20:59 GMT+0800 (Singapore Standard Time), Ashley 
> Merrick  wrote:
> Hello,Wondering if anyone had any feedback on some commands I could try to 
> manually update the current OSD that is down to 16.2.1 so I can at least get 
> around this upgrade bug and back to 100%?If there is any log's or if it seems 
> a new bug and I should create a bugzilla report do let me know.Thanks
>> On Fri Apr 30 2021 21:54:30 GMT+0800 (Singapore Standard Time), Ashley 
>> Merrick  wrote:
>> Hello All,I was running 15.2.8 via cephadm on docker Ubuntu 20.04I just 
>> attempted to upgrade to 16.2.1 via the automated method, it successfully 
>> upgraded the mon/m

[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Eugen Block
Also there's a difference between 'standby-replay' (hot standby) and  
just 'standby'. We use CephFS for a couple of years now with  
standby-replay and the failover takes a couple of seconds max,  
depending on the current load. Have you tried to enable the  
standby-replay config and tested the failover?


ceph fs set cephfs allow_standby_replay true


Zitat von Olivier AUDRY :


hello

perhaps you should have more than one MDS active.

mds: cephfs:3 {0=cephfs-d=up:active,1=cephfs-e=up:active,2=cephfs-
a=up:active} 1 up:standby-replay

I got 3 active mds and one standby.

I'm using rook in kubernetes for this setup.

oau

Le lundi 03 mai 2021 à 19:06 +0530, Lokendra Rathour a écrit :

Hi Team,
I was setting up the ceph cluster with

   - Node Details:3 Mon,2 MDS, 2 Mgr, 2 RGW
   - Deployment Type: Active Standby
   - Testing Mode: Failover of MDS Node
   - Setup : Octopus (15.2.7)
   - OS: centos 8.3
   - hardware: HP
   - Ram:  128 GB on each Node
   - OSD: 2 ( 1 tb each)
   - Operation: Normal I/O with mkdir on every 1 second.

T*est Case: Power-off any active MDS Node for failover to happen*

*Observation:*
We have observed that whenever an active MDS Node is down it takes
around*
40 seconds* to activate the standby MDS Node.
on further checking the logs for the new-handover MDS Node we have
seen
delay on the basis of following inputs:

   1. 10 second delay after which Mon calls for new Monitor election
  1.  [log]  0 log_channel(cluster) log [INF] : mon.cephnode1
calling
  monitor election
   2. 5 second delay in which newly elected Monitor is elected
  1. [log] 0 log_channel(cluster) log [INF] : mon.cephnode1 is
new
  leader, mons cephnode1,cephnode3 in quorum (ranks 0,2)
  3. the addition beacon grace time for which the system waits
before
   which it enables standby MDS node activation. (approx delay of 19
seconds)
  1. defaults :  sudo ceph config get mon mds_beacon_grace
  15.00
  2. sudo ceph config get mon mds_beacon_interval
  5.00
  3. [log] - 2021-04-30T18:23:10.136+0530 7f4e3925c700  1
  mon.cephnode2@1(leader).mds e776 no beacon from mds.0.771 (gid:
  639443 addr: [v2:
  10.0.4.10:6800/2172152716,v1:10.0.4.10:6801/2172152716] state:
  up:active)* since 18.7951*
   4. *in Total it takes around 40 seconds to handover and activate
passive
   standby node. *

*Query:*

   1. Can these variables be configured ?  which we have tried,but
are not
   aware of the overall impact on the ceph cluster because of these
changes
  1. By tuning these values we could reach the minimum time of 12
  seconds in which the active node comes up.
  2. Values taken to get the said time :
 1. *mon_election_timeout* (default 5) - configured as 1
 2. *mon_lease*(default 5)  - configured as 2
 3.  *mds_beacon_grace* (default 15) - configured as 5
 4.  *mds_beacon_interval* (default 5) - configured as 1

We need to tune this setup to get the failover duration as low as 5-7
seconds.

Please suggest/support and share your inputs, my setup is ready and
already
we are testing with multiple scenarios so that we are able to achive
min
failover duration.


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Failed cephadm Upgrade - ValueError

2021-05-03 Thread Ashley Merrick
Created BugTicket : https://tracker.ceph.com/issues/50616
> On Mon May 03 2021 21:49:41 GMT+0800 (Singapore Standard Time), Ashley 
> Merrick  wrote:
> Just checked cluster logs and they are full of:cephadm exited with an error 
> code: 1, stderr:Reconfig daemon osd.16 ... Traceback (most recent call last): 
> File 
> "/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
>  line 7931, in  main() File 
> "/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
>  line 7919, in main r = ctx.func(ctx) File 
> "/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
>  line 1717, in defaultimage return func(ctx) File 
> "/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
>  line 4162, in command_deploy c = get_container(ctx, ctx.fsid, daemon_type, 
> daemon_id, File 
> "/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a
 3b697d119482", line 2451, in get_container 
volume_mounts=get_container_mounts(ctx, fsid, daemon_type, daemon_id), File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 2292, in get_container_mounts if HostFacts(ctx).selinux_enabled: File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 6451, in selinux_enabled return (self.kernel_security['type'] == 
'SELinux') and \ File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 6434, in kernel_security ret = _fetch_apparmor() File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 6415, in _fetch_apparmor item, mode = line.split(' ') ValueError: not 
enough values to unpack (expected 2, got 1) Traceback (most recent ca
 ll last): File "/usr/share/ceph/mgr/cephadm/serve.py", line 1172, in 
_remote_connection yield (conn, connr) File 
"/usr/share/ceph/mgr/cephadm/serve.py", line 1087, in _run_cephadm code, 
'\n'.join(err))) orchestrator._interface.OrchestratorError: cephadm exited with 
an error code: 1, stderr:Reconfig daemon osd.16 ... Traceback (most recent call 
last): File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 7931, in  main() File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 7919, in main r = ctx.func(ctx) File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 1717, in _default_image return func(ctx) File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 lin
 e 4162, in command_deploy c = get_container(ctx, ctx.fsid, daemon_type, 
daemon_id, File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 2451, in get_container volume_mounts=get_container_mounts(ctx, fsid, 
daemon_type, daemon_id), File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 2292, in get_container_mounts if HostFacts(ctx).selinux_enabled: File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 6451, in selinux_enabled return (self.kernel_security['type'] == 
'SELinux') and \ File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482",
 line 6434, in kernel_security ret = _fetch_apparmor() File 
"/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484
 bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482", line 6415, in 
_fetch_apparmor item, mode = line.split(' ') ValueError: not enough values to 
unpack (expected 2, got 1)being repeated over and over again for each OSD.Again 
listing "ValueError: not enough values to unpack (expected 2, got 1)"
>> On Mon May 03 2021 17:20:59 GMT+0800 (Singapore Standard Time), Ashley 
>> Merrick  wrote:
>> Hello,Wondering if anyone had any feedback on some commands I could try to 
>> manually update the current OSD that is down to 16.2.1 so I can at least get 
>> around this upgrade bug and back to 100%?If there is any log's or if it 
>> seems a new bug and I should create a bugzilla report do let me know.Thanks
>>> On Fri Apr 30 2021 21:54:30 GMT+0800 (Singapore Standard Time), Ashley 
>>> Merric

[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Lokendra Rathour
Hello Eugen,

Thankyou for the response.
Yes we tried  ceph-standby-replay but could not see much difference in the
handover time. It was comming as 35 to 40 seconds in either case.
Did you also changed these variables (as mentioned above) along with the
hot-standby ?
Couple of seconds is something we wish to achieve using cephfs.



Best Regards,
Lokendra




On Mon, 3 May 2021, 19:27 Eugen Block,  wrote:

> Also there's a difference between 'standby-replay' (hot standby) and
> just 'standby'. We use CephFS for a couple of years now with
> standby-replay and the failover takes a couple of seconds max,
> depending on the current load. Have you tried to enable the
> standby-replay config and tested the failover?
>
> ceph fs set cephfs allow_standby_replay true
>
>
> Zitat von Olivier AUDRY :
>
> > hello
> >
> > perhaps you should have more than one MDS active.
> >
> > mds: cephfs:3 {0=cephfs-d=up:active,1=cephfs-e=up:active,2=cephfs-
> > a=up:active} 1 up:standby-replay
> >
> > I got 3 active mds and one standby.
> >
> > I'm using rook in kubernetes for this setup.
> >
> > oau
> >
> > Le lundi 03 mai 2021 à 19:06 +0530, Lokendra Rathour a écrit :
> >> Hi Team,
> >> I was setting up the ceph cluster with
> >>
> >>- Node Details:3 Mon,2 MDS, 2 Mgr, 2 RGW
> >>- Deployment Type: Active Standby
> >>- Testing Mode: Failover of MDS Node
> >>- Setup : Octopus (15.2.7)
> >>- OS: centos 8.3
> >>- hardware: HP
> >>- Ram:  128 GB on each Node
> >>- OSD: 2 ( 1 tb each)
> >>- Operation: Normal I/O with mkdir on every 1 second.
> >>
> >> T*est Case: Power-off any active MDS Node for failover to happen*
> >>
> >> *Observation:*
> >> We have observed that whenever an active MDS Node is down it takes
> >> around*
> >> 40 seconds* to activate the standby MDS Node.
> >> on further checking the logs for the new-handover MDS Node we have
> >> seen
> >> delay on the basis of following inputs:
> >>
> >>1. 10 second delay after which Mon calls for new Monitor election
> >>   1.  [log]  0 log_channel(cluster) log [INF] : mon.cephnode1
> >> calling
> >>   monitor election
> >>2. 5 second delay in which newly elected Monitor is elected
> >>   1. [log] 0 log_channel(cluster) log [INF] : mon.cephnode1 is
> >> new
> >>   leader, mons cephnode1,cephnode3 in quorum (ranks 0,2)
> >>   3. the addition beacon grace time for which the system waits
> >> before
> >>which it enables standby MDS node activation. (approx delay of 19
> >> seconds)
> >>   1. defaults :  sudo ceph config get mon mds_beacon_grace
> >>   15.00
> >>   2. sudo ceph config get mon mds_beacon_interval
> >>   5.00
> >>   3. [log] - 2021-04-30T18:23:10.136+0530 7f4e3925c700  1
> >>   mon.cephnode2@1(leader).mds e776 no beacon from mds.0.771 (gid:
> >>   639443 addr: [v2:
> >>   10.0.4.10:6800/2172152716,v1:10.0.4.10:6801/2172152716] state:
> >>   up:active)* since 18.7951*
> >>4. *in Total it takes around 40 seconds to handover and activate
> >> passive
> >>standby node. *
> >>
> >> *Query:*
> >>
> >>1. Can these variables be configured ?  which we have tried,but
> >> are not
> >>aware of the overall impact on the ceph cluster because of these
> >> changes
> >>   1. By tuning these values we could reach the minimum time of 12
> >>   seconds in which the active node comes up.
> >>   2. Values taken to get the said time :
> >>  1. *mon_election_timeout* (default 5) - configured as 1
> >>  2. *mon_lease*(default 5)  - configured as 2
> >>  3.  *mds_beacon_grace* (default 15) - configured as 5
> >>  4.  *mds_beacon_interval* (default 5) - configured as 1
> >>
> >> We need to tune this setup to get the failover duration as low as 5-7
> >> seconds.
> >>
> >> Please suggest/support and share your inputs, my setup is ready and
> >> already
> >> we are testing with multiple scenarios so that we are able to achive
> >> min
> >> failover duration.
> >>
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Troubleshoot MDS failure

2021-05-03 Thread Alessandro Piazza
Dear all,

I'm having a hard time troubleshooting a file-system failure on my 3 node 
cluster (deployed with cephadm + docker). After moving some files between 
folders, the cluster became laggy and Metadata Servers started failing and got 
stuck in rejoin state. Of course I already tried to restart the cluster 
multiple times.

The mds units are now in a failed state because of too many restarts, the 
file-system is degraded and cannot be mounted because no mds is up. I think the 
data pool is ok because I can get files using rados.
I can trigger the standby mds to become the "major" with ceph orch daemon rm 
mds  or deploy a new one but the new "major" mds go again in 
error state.

I don't find the mds logs really helpful but you can find one in the 
attachments for someone more expert than me.
I am hesitant to follow the guide 
https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/ because of 
the warnings and because the ceph-journal-tool is poorly documented.

The following might be useful

seppia:~# ceph fs status
starfs - 0 clients
==
RANK  STATE MDS ACTIVITY   DNSINOS   DIRS   
CAPS
 0rejoin(laggy)  starfs.polposition.njarir 539 25 17
  0
   POOL   TYPE USED  AVAIL
cephfs.starfs.meta  metadata  9900M  1027G
cephfs.starfs.datadata12.1T  1027G
MDS version: ceph version 16.2.0 (0c2054e95bcd9b30fdd908a79ac1d8bbc3394442) 
pacific (stable)

seppia:~ # ceph health detail
HEALTH_WARN 2 failed cephadm daemon(s); 1 filesystem is degraded; insufficient 
standby MDS daemons available; 7 pgs not deep-scrubbed in time
[WRN] CEPHADM_FAILED_DAEMON: 2 failed cephadm daemon(s)
daemon mds.starfs.polposition.njarir on polposition.starfleet.sns.it is in 
error state
daemon mds.starfs.seppia.wdwrho on seppia.starfleet.sns.it is in error state
[WRN] FS_DEGRADED: 1 filesystem is degraded
fs starfs is degraded
[WRN] MDS_INSUFFICIENT_STANDBY: insufficient standby MDS daemons available
have 0; want 1 more
[WRN] PG_NOT_DEEP_SCRUBBED: 7 pgs not deep-scrubbed in time
pg 3.a8 not deep-scrubbed since 2021-04-20T20:07:48.346677+
pg 3.a2 not deep-scrubbed since 2021-04-21T08:10:55.220263+
pg 3.7 not deep-scrubbed since 2021-04-21T07:24:20.073569+
pg 2.0 not deep-scrubbed since 2021-04-21T05:01:18.439456+
pg 9.1a not deep-scrubbed since 2021-04-21T05:18:20.171151+
pg 3.1cb not deep-scrubbed since 2021-04-20T21:54:38.251349+
pg 3.1ef not deep-scrubbed since 2021-04-21T07:07:18.842132+

Thanks for any suggestions,
Alessandro Piazza
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Eugen Block

Hi,


Yes we tried  ceph-standby-replay but could not see much difference in the
handover time. It was comming as 35 to 40 seconds in either case.
Did you also changed these variables (as mentioned above) along with the
hot-standby ?


no, we barely differ from the default configs and haven't changed much.
But we're still running Nautilus so I can't really tell if Octopus  
makes a difference.



Zitat von Lokendra Rathour :


Hello Eugen,

Thankyou for the response.
Yes we tried  ceph-standby-replay but could not see much difference in the
handover time. It was comming as 35 to 40 seconds in either case.
Did you also changed these variables (as mentioned above) along with the
hot-standby ?
Couple of seconds is something we wish to achieve using cephfs.



Best Regards,
Lokendra




On Mon, 3 May 2021, 19:27 Eugen Block,  wrote:


Also there's a difference between 'standby-replay' (hot standby) and
just 'standby'. We use CephFS for a couple of years now with
standby-replay and the failover takes a couple of seconds max,
depending on the current load. Have you tried to enable the
standby-replay config and tested the failover?

ceph fs set cephfs allow_standby_replay true


Zitat von Olivier AUDRY :

> hello
>
> perhaps you should have more than one MDS active.
>
> mds: cephfs:3 {0=cephfs-d=up:active,1=cephfs-e=up:active,2=cephfs-
> a=up:active} 1 up:standby-replay
>
> I got 3 active mds and one standby.
>
> I'm using rook in kubernetes for this setup.
>
> oau
>
> Le lundi 03 mai 2021 à 19:06 +0530, Lokendra Rathour a écrit :
>> Hi Team,
>> I was setting up the ceph cluster with
>>
>>- Node Details:3 Mon,2 MDS, 2 Mgr, 2 RGW
>>- Deployment Type: Active Standby
>>- Testing Mode: Failover of MDS Node
>>- Setup : Octopus (15.2.7)
>>- OS: centos 8.3
>>- hardware: HP
>>- Ram:  128 GB on each Node
>>- OSD: 2 ( 1 tb each)
>>- Operation: Normal I/O with mkdir on every 1 second.
>>
>> T*est Case: Power-off any active MDS Node for failover to happen*
>>
>> *Observation:*
>> We have observed that whenever an active MDS Node is down it takes
>> around*
>> 40 seconds* to activate the standby MDS Node.
>> on further checking the logs for the new-handover MDS Node we have
>> seen
>> delay on the basis of following inputs:
>>
>>1. 10 second delay after which Mon calls for new Monitor election
>>   1.  [log]  0 log_channel(cluster) log [INF] : mon.cephnode1
>> calling
>>   monitor election
>>2. 5 second delay in which newly elected Monitor is elected
>>   1. [log] 0 log_channel(cluster) log [INF] : mon.cephnode1 is
>> new
>>   leader, mons cephnode1,cephnode3 in quorum (ranks 0,2)
>>   3. the addition beacon grace time for which the system waits
>> before
>>which it enables standby MDS node activation. (approx delay of 19
>> seconds)
>>   1. defaults :  sudo ceph config get mon mds_beacon_grace
>>   15.00
>>   2. sudo ceph config get mon mds_beacon_interval
>>   5.00
>>   3. [log] - 2021-04-30T18:23:10.136+0530 7f4e3925c700  1
>>   mon.cephnode2@1(leader).mds e776 no beacon from mds.0.771 (gid:
>>   639443 addr: [v2:
>>   10.0.4.10:6800/2172152716,v1:10.0.4.10:6801/2172152716] state:
>>   up:active)* since 18.7951*
>>4. *in Total it takes around 40 seconds to handover and activate
>> passive
>>standby node. *
>>
>> *Query:*
>>
>>1. Can these variables be configured ?  which we have tried,but
>> are not
>>aware of the overall impact on the ceph cluster because of these
>> changes
>>   1. By tuning these values we could reach the minimum time of 12
>>   seconds in which the active node comes up.
>>   2. Values taken to get the said time :
>>  1. *mon_election_timeout* (default 5) - configured as 1
>>  2. *mon_lease*(default 5)  - configured as 2
>>  3.  *mds_beacon_grace* (default 15) - configured as 5
>>  4.  *mds_beacon_interval* (default 5) - configured as 1
>>
>> We need to tune this setup to get the failover duration as low as 5-7
>> seconds.
>>
>> Please suggest/support and share your inputs, my setup is ready and
>> already
>> we are testing with multiple scenarios so that we are able to achive
>> min
>> failover duration.
>>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Patrick Donnelly
On Mon, May 3, 2021 at 6:36 AM Lokendra Rathour
 wrote:
>
> Hi Team,
> I was setting up the ceph cluster with
>
>- Node Details:3 Mon,2 MDS, 2 Mgr, 2 RGW
>- Deployment Type: Active Standby
>- Testing Mode: Failover of MDS Node
>- Setup : Octopus (15.2.7)
>- OS: centos 8.3
>- hardware: HP
>- Ram:  128 GB on each Node
>- OSD: 2 ( 1 tb each)
>- Operation: Normal I/O with mkdir on every 1 second.
>
> T*est Case: Power-off any active MDS Node for failover to happen*
>
> *Observation:*
> We have observed that whenever an active MDS Node is down it takes around*
> 40 seconds* to activate the standby MDS Node.
> on further checking the logs for the new-handover MDS Node we have seen
> delay on the basis of following inputs:
>
>1. 10 second delay after which Mon calls for new Monitor election
>   1.  [log]  0 log_channel(cluster) log [INF] : mon.cephnode1 calling
>   monitor election

In the process of killing the active MDS, are you also killing a monitor?

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSD slow ops warning not clearing after OSD down

2021-05-03 Thread Frank Schilder
Hi Vladimir,

thanks for your reply. I did, the cluster is healthy:

[root@gnosis ~]# ceph status
  cluster:
id: ---
health: HEALTH_WARN
430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops

  services:
mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
mgr: ceph-01(active), standbys: ceph-02, ceph-03
mds: con-fs2-2/2/2 up  {0=ceph-08=up:active,1=ceph-12=up:active}, 2 
up:standby
osd: 584 osds: 578 up, 578 in

  data:
pools:   11 pools, 3215 pgs
objects: 610.3 M objects, 1.2 PiB
usage:   1.5 PiB used, 4.6 PiB / 6.0 PiB avail
pgs: 3191 active+clean
 13   active+clean+scrubbing+deep
 9active+clean+snaptrim_wait
 2active+clean+snaptrim

  io:
client:   358 MiB/s rd, 56 MiB/s wr, 2.35 kop/s rd, 1.32 kop/s wr

[root@gnosis ~]# ceph health detail
HEALTH_WARN 430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops
SLOW_OPS 430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops

OSD 580 is down+out and the message does not even increment the seconds. Its 
probably stuck in some part of the health checking that tries to query 580 and 
doesn't understand that the OSD being down means there are no ops.

I tried to restart the OSD on this disk, but it seems completely rigged. The 
iDRAC log on the server says that the disk was removed during operation 
possibly due to a physical connection fail on the SAS lanes. I somehow need to 
get rid of this message and am wondering of purging the OSD would help.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Vladimir Sigunov 
Sent: 03 May 2021 13:45:19
To: ceph-users@ceph.io; Frank Schilder
Subject: Re: OSD slow ops warning not clearing after OSD down

Hi Frank.
Check your cluster for inactive/incomplete placement groups. I saw similar 
behavior on Octopus when some pgs stuck in incomplete/inactive or peering state.


From: Frank Schilder 
Sent: Monday, May 3, 2021 3:42:48 AM
To: ceph-users@ceph.io 
Subject: [ceph-users] OSD slow ops warning not clearing after OSD down

Dear cephers,

I have a strange problem. An OSD went down and recovery finished. For some 
reason, I have a slow ops warning for the failed OSD stuck in the system:

health: HEALTH_WARN
430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops

The OSD is auto-out:

| 580 | ceph-22 |0  |0  |0   | 0   |0   | 0   | 
autoout,exists |

It is probably a warning dating back to just before the fail. How can I clear 
it?

Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Lokendra Rathour
Ok,
Will try with nautilus as well.
But we are really configuring too many variables to achieve 10 seconds of
failover time.
Is it possible for you to share the setup details.
Like we are using
2 node ceph cluster in health ok (configured replication factor and related
variables)
Hardware is HP, and to mount the fs on the client machine we are using
native file system driver. During mount we are passing IP of both the nodes
of ceph.

Please share further details  or related details of your setup if possible
which might help us with the failover delay issue.

Thanks again for your inputs and your response.
We also need to achieve couple of seconds failover delay.

Best regards,
Lokendra


On Mon, 3 May 2021, 20:47 Eugen Block,  wrote:

> Hi,
>
> > Yes we tried  ceph-standby-replay but could not see much difference in
> the
> > handover time. It was comming as 35 to 40 seconds in either case.
> > Did you also changed these variables (as mentioned above) along with the
> > hot-standby ?
>
> no, we barely differ from the default configs and haven't changed much.
> But we're still running Nautilus so I can't really tell if Octopus
> makes a difference.
>
>
> Zitat von Lokendra Rathour :
>
> > Hello Eugen,
> >
> > Thankyou for the response.
> > Yes we tried  ceph-standby-replay but could not see much difference in
> the
> > handover time. It was comming as 35 to 40 seconds in either case.
> > Did you also changed these variables (as mentioned above) along with the
> > hot-standby ?
> > Couple of seconds is something we wish to achieve using cephfs.
> >
> >
> >
> > Best Regards,
> > Lokendra
> >
> >
> >
> >
> > On Mon, 3 May 2021, 19:27 Eugen Block,  wrote:
> >
> >> Also there's a difference between 'standby-replay' (hot standby) and
> >> just 'standby'. We use CephFS for a couple of years now with
> >> standby-replay and the failover takes a couple of seconds max,
> >> depending on the current load. Have you tried to enable the
> >> standby-replay config and tested the failover?
> >>
> >> ceph fs set cephfs allow_standby_replay true
> >>
> >>
> >> Zitat von Olivier AUDRY :
> >>
> >> > hello
> >> >
> >> > perhaps you should have more than one MDS active.
> >> >
> >> > mds: cephfs:3 {0=cephfs-d=up:active,1=cephfs-e=up:active,2=cephfs-
> >> > a=up:active} 1 up:standby-replay
> >> >
> >> > I got 3 active mds and one standby.
> >> >
> >> > I'm using rook in kubernetes for this setup.
> >> >
> >> > oau
> >> >
> >> > Le lundi 03 mai 2021 à 19:06 +0530, Lokendra Rathour a écrit :
> >> >> Hi Team,
> >> >> I was setting up the ceph cluster with
> >> >>
> >> >>- Node Details:3 Mon,2 MDS, 2 Mgr, 2 RGW
> >> >>- Deployment Type: Active Standby
> >> >>- Testing Mode: Failover of MDS Node
> >> >>- Setup : Octopus (15.2.7)
> >> >>- OS: centos 8.3
> >> >>- hardware: HP
> >> >>- Ram:  128 GB on each Node
> >> >>- OSD: 2 ( 1 tb each)
> >> >>- Operation: Normal I/O with mkdir on every 1 second.
> >> >>
> >> >> T*est Case: Power-off any active MDS Node for failover to happen*
> >> >>
> >> >> *Observation:*
> >> >> We have observed that whenever an active MDS Node is down it takes
> >> >> around*
> >> >> 40 seconds* to activate the standby MDS Node.
> >> >> on further checking the logs for the new-handover MDS Node we have
> >> >> seen
> >> >> delay on the basis of following inputs:
> >> >>
> >> >>1. 10 second delay after which Mon calls for new Monitor election
> >> >>   1.  [log]  0 log_channel(cluster) log [INF] : mon.cephnode1
> >> >> calling
> >> >>   monitor election
> >> >>2. 5 second delay in which newly elected Monitor is elected
> >> >>   1. [log] 0 log_channel(cluster) log [INF] : mon.cephnode1 is
> >> >> new
> >> >>   leader, mons cephnode1,cephnode3 in quorum (ranks 0,2)
> >> >>   3. the addition beacon grace time for which the system waits
> >> >> before
> >> >>which it enables standby MDS node activation. (approx delay of 19
> >> >> seconds)
> >> >>   1. defaults :  sudo ceph config get mon mds_beacon_grace
> >> >>   15.00
> >> >>   2. sudo ceph config get mon mds_beacon_interval
> >> >>   5.00
> >> >>   3. [log] - 2021-04-30T18:23:10.136+0530 7f4e3925c700  1
> >> >>   mon.cephnode2@1(leader).mds e776 no beacon from mds.0.771
> (gid:
> >> >>   639443 addr: [v2:
> >> >>   10.0.4.10:6800/2172152716,v1:10.0.4.10:6801/2172152716] state:
> >> >>   up:active)* since 18.7951*
> >> >>4. *in Total it takes around 40 seconds to handover and activate
> >> >> passive
> >> >>standby node. *
> >> >>
> >> >> *Query:*
> >> >>
> >> >>1. Can these variables be configured ?  which we have tried,but
> >> >> are not
> >> >>aware of the overall impact on the ceph cluster because of these
> >> >> changes
> >> >>   1. By tuning these values we could reach the minimum time of 12
> >> >>   seconds in which the active node comes up.
> >> >>   2. Values taken to get the said time :
> >> >>  1. *mon_ele

[ceph-users] Re: OSD slow ops warning not clearing after OSD down

2021-05-03 Thread Frank Schilder
Hi Dan,

just restarted all MONs, no change though :(

Thanks for looking at this. I will wait until tomorrow. My plan is to get the 
disk up again with the same OSD ID and would expect that this will eventually 
allow the message to be cleared.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Dan van der Ster 
Sent: 03 May 2021 15:08:03
To: Vladimir Sigunov
Cc: ceph-users@ceph.io; Frank Schilder
Subject: Re: [ceph-users] Re: OSD slow ops warning not clearing after OSD down

Wait, first just restart the leader mon.

See: https://tracker.ceph.com/issues/47380 for a related issue.

-- dan

On Mon, May 3, 2021 at 2:55 PM Vladimir Sigunov
 wrote:
>
> Hi Frank,
> Yes, I would purge the osd. The cluster looks absolutely healthy except of 
> this osd.584 Probably,  the purge will help the cluster to forget this faulty 
> one. Also, I would restart monitors, too.
> With the amount of data you maintain in your cluster, I don't think your 
> ceph.conf contains any information about some particular osds, but if it 
> does, don't forget to remove the configuration of osd.584 from the ceph.conf
>
> Get Outlook for Android
>
> 
> From: Frank Schilder 
> Sent: Monday, May 3, 2021 8:37:09 AM
> To: Vladimir Sigunov ; ceph-users@ceph.io 
> 
> Subject: Re: OSD slow ops warning not clearing after OSD down
>
> Hi Vladimir,
>
> thanks for your reply. I did, the cluster is healthy:
>
> [root@gnosis ~]# ceph status
>   cluster:
> id: ---
> health: HEALTH_WARN
> 430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops
>
>   services:
> mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
> mgr: ceph-01(active), standbys: ceph-02, ceph-03
> mds: con-fs2-2/2/2 up  {0=ceph-08=up:active,1=ceph-12=up:active}, 2 
> up:standby
> osd: 584 osds: 578 up, 578 in
>
>   data:
> pools:   11 pools, 3215 pgs
> objects: 610.3 M objects, 1.2 PiB
> usage:   1.5 PiB used, 4.6 PiB / 6.0 PiB avail
> pgs: 3191 active+clean
>  13   active+clean+scrubbing+deep
>  9active+clean+snaptrim_wait
>  2active+clean+snaptrim
>
>   io:
> client:   358 MiB/s rd, 56 MiB/s wr, 2.35 kop/s rd, 1.32 kop/s wr
>
> [root@gnosis ~]# ceph health detail
> HEALTH_WARN 430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops
> SLOW_OPS 430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops
>
> OSD 580 is down+out and the message does not even increment the seconds. Its 
> probably stuck in some part of the health checking that tries to query 580 
> and doesn't understand that the OSD being down means there are no ops.
>
> I tried to restart the OSD on this disk, but it seems completely rigged. The 
> iDRAC log on the server says that the disk was removed during operation 
> possibly due to a physical connection fail on the SAS lanes. I somehow need 
> to get rid of this message and am wondering of purging the OSD would help.
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: Vladimir Sigunov 
> Sent: 03 May 2021 13:45:19
> To: ceph-users@ceph.io; Frank Schilder
> Subject: Re: OSD slow ops warning not clearing after OSD down
>
> Hi Frank.
> Check your cluster for inactive/incomplete placement groups. I saw similar 
> behavior on Octopus when some pgs stuck in incomplete/inactive or peering 
> state.
>
> 
> From: Frank Schilder 
> Sent: Monday, May 3, 2021 3:42:48 AM
> To: ceph-users@ceph.io 
> Subject: [ceph-users] OSD slow ops warning not clearing after OSD down
>
> Dear cephers,
>
> I have a strange problem. An OSD went down and recovery finished. For some 
> reason, I have a slow ops warning for the failed OSD stuck in the system:
>
> health: HEALTH_WARN
> 430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops
>
> The OSD is auto-out:
>
> | 580 | ceph-22 |0  |0  |0   | 0   |0   | 0   | 
> autoout,exists |
>
> It is probably a warning dating back to just before the fail. How can I clear 
> it?
>
> Thanks and best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How can I get tail information a parted rados object

2021-05-03 Thread by morphin
Hi Rob.

I think I wasn't clear enough with the first mail.
I'm having issues with the RGW. radosgw-admin or s3 can not access
some objects in the bucket. These objects are exist in the "RADOS" and
I can export with "rados get -p $pooll $object".
But the problem ise 4M chunk and multiparts. I have to find all the
parts with the part number order and after export I need to bring all
the parts together.
I've write a code with golang and I took all the partition info from
rados attr "user.rgw.manifest" and some objects only have
"user.rgw.olh.info".
I can now decode these attrs with "ceph-dencoder  RGWOLHInfo  and
RGWObjManifest decode dump_json" and I can find the parts and order.

Now I can export anything in the rados.

I was looking a tool to do that for me but I think its not exist.
Because I couldn't find.
RGW has t many problems and Sharding issues with Multisite
setup! And the worst part is somehow RGW losing object records... It
could be Sharding, Multisite, Versioning or Lifecycle. I don't know
exactly why! But TO rescue these objects in the rados and write them
again with S3 I had to write special program for it.
I know ceph-dencoder for Developers but the documentation is poor for
RGW and I had to read code, understand, and write a new program
because I need a BRIDGE between RGW and the RADOS.

I think I should publish the program for the community. It can
"Directly rados download and upload to RGW or local" It can upload
1.5M object in 30 minute. (small files)
But  the program is designed for one job. It needs edit for public use.
If there is a option or program please let me know.
If there is no option and if it is a good idea to make one for the
community. I'm ready to publish and work on it.


Now I have Multisite sync error and I don't use these buckets anymore.
I changed the master zone and remove the secondary zone from
zonegroup. I dont have multisite anymore.
After that I migrate buckets to new one and I want to trim the sync
errors but "radosgw-admin error trim" not working.
What Should I do? I Really Really help on this! I'm using 14.2.16 nautilus.


Do you know anything about deleting old periods? What will the effect
on a cluster?



Rob Haverkamp , 3 May 2021 Pzt, 15:15 tarihinde
şunu yazdı:
>
> Hi Morphin,
>
> There are multiple ways you can do this.
>
> run a radosgw-admin bucket radoslist --bucket  write that 
> output to a file, grep all entries containing the object name ' 
> im034113.jpg', sort that list and download them.
> run a radosgw-admin object stat --bucket  --object  
> this will output a json document. With the information in the manifest key 
> you can find out what rados objects belong to the RGW object.
>
>
>
> Kind regards,
>
> Rob
> https://www.42on.com/
>
>
> 
> From: by morphin 
> Sent: Saturday, May 1, 2021 11:09 PM
> To: Ceph Users 
> Subject: [ceph-users] How can I get tail information a parted rados object
>
> Hello.
>
> I'm trying to export objects from rados with rados get. Some objects
> bigger than 4M and they have tails. Is there any easy way to get tail
> information an object?
>
> For example this is an object:
> - c106b26b.3_Img/2017/12/im034113.jpg
> These are the objet parts:
> - 
> c106b26b.3__multipart_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.1
> - 
> c106b26b.3__shadow_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.1_1
> - 
> c106b26b.3__shadow_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.1_2
> - 
> c106b26b.3__multipart_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.2
>
> As you can see the object has 2 multipart and 2 shadow object.
> This jpg only works when I get all the parts and make it one with the order.
> order: "cat 9.1 9.1_1 9.1_2 9.2 > im034113.jpg"
>
> I'm trying to write a code and the code gonna read objects from a list
> and find all the parts, bring it together with the order...  But I
> couldn't find a good way to get part information.
>
> I followed the link https://www.programmersought.com/article/31497869978/
> and I get the object manifest with getxattr and decode it with
> "ceph-dencoder type RGWBucketEnt  decode dump_json"
> But in the manifest I can not find a path to code it. It's not useful.
> Is there any different place that I can take the part information an
> object?
>
> Or better! Is there any tool to export an object with its tails?
>
> btw: these objects created by RGW using s3. RGW can not access these
> files. Because of that I'm trying to export it from rados and send it
> to different RGW.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Lokendra Rathour
Yes Patric,
In the process of killing MDS we are also *killing Monitor along with
OSD,Mgr and RGW*. we are performing Poweroff/Reboot the complete node (with
MDS,Mon,RGW,OSD,Mgr daemon).
Cluster: 2 Nodes with MDS|Mon|RGW|OSD each and third node with 1 Mon.

Note : when I am only stopping the MDS service it takes 4-7 Seconds to
activate and resume the standy MDS Node.

Thanks for your inputs.

 Best Regards,
Lokendra


On Mon, May 3, 2021 at 8:50 PM Patrick Donnelly  wrote:

> On Mon, May 3, 2021 at 6:36 AM Lokendra Rathour
>  wrote:
> >
> > Hi Team,
> > I was setting up the ceph cluster with
> >
> >- Node Details:3 Mon,2 MDS, 2 Mgr, 2 RGW
> >- Deployment Type: Active Standby
> >- Testing Mode: Failover of MDS Node
> >- Setup : Octopus (15.2.7)
> >- OS: centos 8.3
> >- hardware: HP
> >- Ram:  128 GB on each Node
> >- OSD: 2 ( 1 tb each)
> >- Operation: Normal I/O with mkdir on every 1 second.
> >
> > T*est Case: Power-off any active MDS Node for failover to happen*
> >
> > *Observation:*
> > We have observed that whenever an active MDS Node is down it takes
> around*
> > 40 seconds* to activate the standby MDS Node.
> > on further checking the logs for the new-handover MDS Node we have seen
> > delay on the basis of following inputs:
> >
> >1. 10 second delay after which Mon calls for new Monitor election
> >   1.  [log]  0 log_channel(cluster) log [INF] : mon.cephnode1 calling
> >   monitor election
>
> In the process of killing the active MDS, are you also killing a monitor?
>
> --
> Patrick Donnelly, Ph.D.
> He / Him / His
> Principal Software Engineer
> Red Hat Sunnyvale, CA
> GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
>
>

-- 
~ Lokendra
www.inertiaspeaks.com
www.inertiagroups.com
skype: lokendrarathour
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Lokendra Rathour
Hello Frank,
Thanks for your inputs.

*Responding to your Queries , Kindly refer below:*

   - *Do you have services co-located? *
   - [loke] : Yes they are colocated:
 - Cephnode1 : MDS,MGR,MON,RGW,OSD,MDS
 - Cephnode2: MDS,MGR,MON,RGW,OSD,MDS
 - Cephnode3: MON
  - Which of the times (1) or (2) are you referring to?
   - For you part One (1) : we can say it like by counting the time since
  the I/O is stopped till the I/O is resumed, which includes

  - call for new mon election
 - election of mon leader
 - calling of MDS standby acting by new Mon Leader
 - resuming I/O stuck threads
 - and other internal process.(i am only point what i could read
 from logs)

  - *How many FS Clients do you have?*
  - we are testing with only one client mounting using native fs driver
  at the moment, where we pass both IP Address of both the MDS
Daemon(in our
  case both the Ceph Nodes) using following method:
 - sudo mount -t ceph 10.0.4.10,10.0.4.11:6789:/volumes/path/
 /mnt/cephconf -o
name=foo,secret=AQAus49gdCHvIxAAB89BcDYqYSqJ8yOJBg5grw==


*one input*:if we only shut-down MDS Active Daemon, we only get 4-7
Seconds, i.e if we are not rebooting the physical node but only the service
MDS.
When we reboot Physical node , Cephnode1 or Cephnode2 ( Mon,Mgr,RGW,OSD
also gets rebooted along with MDS)  we realizing around 40 seconds.

Best Regards,
Lokendra


On Mon, May 3, 2021 at 10:30 PM Frank Schilder  wrote:

> Following up on this and other comments, there are 2 different time
> delays. One (1)  is the time it takes from killing an MDS until a stand-by
> is made an active rank, and (2) the time it takes for the new active rank
> to restore all client sessions. My experience is that (1) takes close to 0
> seconds while (2) can take between 20-30 seconds depending on how busy the
> clients are; the MDS will go through various states before reaching active.
> We usually have ca. 1600 client connections to our FS. With fewer clients,
> MDS fail-over is practically instantaneous. We are using latest mimic.
>
> From what you write, you seem to have a 40 seconds window for (1), which
> points to a problem different to MON config values. This is supported by
> your description including a MON election (??? this should never happen).
> Do you have have services co-located? Which of the times (1) or (2) are you
> referring to? How many FS clients do you have?
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: Patrick Donnelly 
> Sent: 03 May 2021 17:19:37
> To: Lokendra Rathour
> Cc: Ceph Development; dev; ceph-users
> Subject: [ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay
> issue
>
> On Mon, May 3, 2021 at 6:36 AM Lokendra Rathour
>  wrote:
> >
> > Hi Team,
> > I was setting up the ceph cluster with
> >
> >- Node Details:3 Mon,2 MDS, 2 Mgr, 2 RGW
> >- Deployment Type: Active Standby
> >- Testing Mode: Failover of MDS Node
> >- Setup : Octopus (15.2.7)
> >- OS: centos 8.3
> >- hardware: HP
> >- Ram:  128 GB on each Node
> >- OSD: 2 ( 1 tb each)
> >- Operation: Normal I/O with mkdir on every 1 second.
> >
> > T*est Case: Power-off any active MDS Node for failover to happen*
> >
> > *Observation:*
> > We have observed that whenever an active MDS Node is down it takes
> around*
> > 40 seconds* to activate the standby MDS Node.
> > on further checking the logs for the new-handover MDS Node we have seen
> > delay on the basis of following inputs:
> >
> >1. 10 second delay after which Mon calls for new Monitor election
> >   1.  [log]  0 log_channel(cluster) log [INF] : mon.cephnode1 calling
> >   monitor election
>
> In the process of killing the active MDS, are you also killing a monitor?
>
> --
> Patrick Donnelly, Ph.D.
> He / Him / His
> Principal Software Engineer
> Red Hat Sunnyvale, CA
> GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
~ Lokendra
www.inertiaspeaks.com
www.inertiagroups.com
skype: lokendrarathour
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Eugen Block

I wouldn't recommend a colocated MDS in a production environment.


Zitat von Lokendra Rathour :


Hello Frank,
Thanks for your inputs.

*Responding to your Queries , Kindly refer below:*

   - *Do you have services co-located? *
   - [loke] : Yes they are colocated:
 - Cephnode1 : MDS,MGR,MON,RGW,OSD,MDS
 - Cephnode2: MDS,MGR,MON,RGW,OSD,MDS
 - Cephnode3: MON
  - Which of the times (1) or (2) are you referring to?
   - For you part One (1) : we can say it like by counting the time since
  the I/O is stopped till the I/O is resumed, which includes

  - call for new mon election
 - election of mon leader
 - calling of MDS standby acting by new Mon Leader
 - resuming I/O stuck threads
 - and other internal process.(i am only point what i could read
 from logs)

  - *How many FS Clients do you have?*
  - we are testing with only one client mounting using native fs driver
  at the moment, where we pass both IP Address of both the MDS
Daemon(in our
  case both the Ceph Nodes) using following method:
 - sudo mount -t ceph 10.0.4.10,10.0.4.11:6789:/volumes/path/
 /mnt/cephconf -o
name=foo,secret=AQAus49gdCHvIxAAB89BcDYqYSqJ8yOJBg5grw==


*one input*:if we only shut-down MDS Active Daemon, we only get 4-7
Seconds, i.e if we are not rebooting the physical node but only the service
MDS.
When we reboot Physical node , Cephnode1 or Cephnode2 ( Mon,Mgr,RGW,OSD
also gets rebooted along with MDS)  we realizing around 40 seconds.

Best Regards,
Lokendra


On Mon, May 3, 2021 at 10:30 PM Frank Schilder  wrote:


Following up on this and other comments, there are 2 different time
delays. One (1)  is the time it takes from killing an MDS until a stand-by
is made an active rank, and (2) the time it takes for the new active rank
to restore all client sessions. My experience is that (1) takes close to 0
seconds while (2) can take between 20-30 seconds depending on how busy the
clients are; the MDS will go through various states before reaching active.
We usually have ca. 1600 client connections to our FS. With fewer clients,
MDS fail-over is practically instantaneous. We are using latest mimic.

From what you write, you seem to have a 40 seconds window for (1), which
points to a problem different to MON config values. This is supported by
your description including a MON election (??? this should never happen).
Do you have have services co-located? Which of the times (1) or (2) are you
referring to? How many FS clients do you have?

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Patrick Donnelly 
Sent: 03 May 2021 17:19:37
To: Lokendra Rathour
Cc: Ceph Development; dev; ceph-users
Subject: [ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay
issue

On Mon, May 3, 2021 at 6:36 AM Lokendra Rathour
 wrote:
>
> Hi Team,
> I was setting up the ceph cluster with
>
>- Node Details:3 Mon,2 MDS, 2 Mgr, 2 RGW
>- Deployment Type: Active Standby
>- Testing Mode: Failover of MDS Node
>- Setup : Octopus (15.2.7)
>- OS: centos 8.3
>- hardware: HP
>- Ram:  128 GB on each Node
>- OSD: 2 ( 1 tb each)
>- Operation: Normal I/O with mkdir on every 1 second.
>
> T*est Case: Power-off any active MDS Node for failover to happen*
>
> *Observation:*
> We have observed that whenever an active MDS Node is down it takes
around*
> 40 seconds* to activate the standby MDS Node.
> on further checking the logs for the new-handover MDS Node we have seen
> delay on the basis of following inputs:
>
>1. 10 second delay after which Mon calls for new Monitor election
>   1.  [log]  0 log_channel(cluster) log [INF] : mon.cephnode1 calling
>   monitor election

In the process of killing the active MDS, are you also killing a monitor?

--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




--
~ Lokendra
www.inertiaspeaks.com
www.inertiagroups.com
skype: lokendrarathour
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: using ec pool with rgw

2021-05-03 Thread David Orman
We haven't found a more 'elegant' way, but the process we follow: we
pre-create all the pools prior to creating the realm/zonegroup/zone, then
we period apply, then we remove the default zonegroup/zone, period apply,
then remove the default pools.

Hope this is at least somewhat helpful,
David

On Sat, May 1, 2021 at 11:39 AM Marco Savoca  wrote:

> Hi,
>
> I’m currently deploying a new cluster for cold storage with rgw.
>
> Is there actually a more elegant method to get the bucket data on an
> erasure coding pool other than moving the pool or creating the bucket.data
> pool prior to data upload?
>
> Thanks,
>
> Marco Savoca
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Frank Schilder
Following up on this and other comments, there are 2 different time delays. One 
(1)  is the time it takes from killing an MDS until a stand-by is made an 
active rank, and (2) the time it takes for the new active rank to restore all 
client sessions. My experience is that (1) takes close to 0 seconds while (2) 
can take between 20-30 seconds depending on how busy the clients are; the MDS 
will go through various states before reaching active. We usually have ca. 1600 
client connections to our FS. With fewer clients, MDS fail-over is practically 
instantaneous. We are using latest mimic.

>From what you write, you seem to have a 40 seconds window for (1), which 
>points to a problem different to MON config values. This is supported by your 
>description including a MON election (??? this should never happen). Do you 
>have have services co-located? Which of the times (1) or (2) are you referring 
>to? How many FS clients do you have?

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Patrick Donnelly 
Sent: 03 May 2021 17:19:37
To: Lokendra Rathour
Cc: Ceph Development; dev; ceph-users
Subject: [ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

On Mon, May 3, 2021 at 6:36 AM Lokendra Rathour
 wrote:
>
> Hi Team,
> I was setting up the ceph cluster with
>
>- Node Details:3 Mon,2 MDS, 2 Mgr, 2 RGW
>- Deployment Type: Active Standby
>- Testing Mode: Failover of MDS Node
>- Setup : Octopus (15.2.7)
>- OS: centos 8.3
>- hardware: HP
>- Ram:  128 GB on each Node
>- OSD: 2 ( 1 tb each)
>- Operation: Normal I/O with mkdir on every 1 second.
>
> T*est Case: Power-off any active MDS Node for failover to happen*
>
> *Observation:*
> We have observed that whenever an active MDS Node is down it takes around*
> 40 seconds* to activate the standby MDS Node.
> on further checking the logs for the new-handover MDS Node we have seen
> delay on the basis of following inputs:
>
>1. 10 second delay after which Mon calls for new Monitor election
>   1.  [log]  0 log_channel(cluster) log [INF] : mon.cephnode1 calling
>   monitor election

In the process of killing the active MDS, are you also killing a monitor?

--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Certificat format for the SSL dashboard

2021-05-03 Thread Fabrice Bacchella
Once activated the dashboard, I try to import certificates, but it fails:

$ ceph dashboard set-ssl-certificate-key -i /data/ceph/conf/ceph.key 
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1337, in _handle_command
return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 389, in call
return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/dashboard/module.py", line 385, in 
set_ssl_certificate_key
self.set_store('key', inbuf.decode())
AttributeError: 'str' object has no attribute 'decode'

$ ceph dashboard set-ssl-certificate  -i /data/ceph/conf/ceph.crt
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1337, in _handle_command
return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 389, in call
return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/dashboard/module.py", line 372, in 
set_ssl_certificate
self.set_store('crt', inbuf.decode())
AttributeError: 'str' object has no attribute 'decode'


They are both PEM encoded files:
file /data/ceph/conf/ceph.key /data/ceph/conf/ceph.crt
/data/ceph/conf/ceph.key: PEM RSA private key
/data/ceph/conf/ceph.crt: PEM certificate

What format does this command expect ?

That error happens on Centos 8.3.2011 with ceph-mgr-16.2.1-0.el8.x86_64, 
downloaded directly from ceph.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Frank Schilder
I concur, having this heavily collocated set-up will not perform any better 
than you observe. Do you really have 2 MDS daemons per host? I just saw hat you 
have only 2 disks, probably 1 per node. In this set-up, you cannot really 
expect good fail-over times due to the amount of simultaneous fails that need 
to be handled:

- MON fail
- OSD fail
- MDS fail
- MGR fail
- (all!) PGs become degraded
- FS data and meta data are on the same disks, so replaying journals, data IO 
and meta data IO all go to the same drive(s)

There are plenty of bottle-necks in this set-up that render this test highly 
unrealistic. You should try to get more production ready hardware, a 2-disk 
ceph cluster isn't. I wouldn't waste time trying to adjust configs to a small 
test case, these config changes will not do any good for proper production 
systems. The set-up you have is good for learning to administrate ceph, it is 
not providing a point for comparison with a production system and will have 
heavily degraded performance. Ceph requires a not exactly small minimum size 
before it starts working well.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Eugen Block 
Sent: 03 May 2021 20:53:51
To: ceph-users@ceph.io
Subject: [ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

I wouldn't recommend a colocated MDS in a production environment.


Zitat von Lokendra Rathour :

> Hello Frank,
> Thanks for your inputs.
>
> *Responding to your Queries , Kindly refer below:*
>
>- *Do you have services co-located? *
>- [loke] : Yes they are colocated:
>  - Cephnode1 : MDS,MGR,MON,RGW,OSD,MDS
>  - Cephnode2: MDS,MGR,MON,RGW,OSD,MDS
>  - Cephnode3: MON
>   - Which of the times (1) or (2) are you referring to?
>- For you part One (1) : we can say it like by counting the time since
>   the I/O is stopped till the I/O is resumed, which includes
>
>   - call for new mon election
>  - election of mon leader
>  - calling of MDS standby acting by new Mon Leader
>  - resuming I/O stuck threads
>  - and other internal process.(i am only point what i could read
>  from logs)
>
>   - *How many FS Clients do you have?*
>   - we are testing with only one client mounting using native fs driver
>   at the moment, where we pass both IP Address of both the MDS
> Daemon(in our
>   case both the Ceph Nodes) using following method:
>  - sudo mount -t ceph 10.0.4.10,10.0.4.11:6789:/volumes/path/
>  /mnt/cephconf -o
> name=foo,secret=AQAus49gdCHvIxAAB89BcDYqYSqJ8yOJBg5grw==
>
>
> *one input*:if we only shut-down MDS Active Daemon, we only get 4-7
> Seconds, i.e if we are not rebooting the physical node but only the service
> MDS.
> When we reboot Physical node , Cephnode1 or Cephnode2 ( Mon,Mgr,RGW,OSD
> also gets rebooted along with MDS)  we realizing around 40 seconds.
>
> Best Regards,
> Lokendra
>
>
> On Mon, May 3, 2021 at 10:30 PM Frank Schilder  wrote:
>
>> Following up on this and other comments, there are 2 different time
>> delays. One (1)  is the time it takes from killing an MDS until a stand-by
>> is made an active rank, and (2) the time it takes for the new active rank
>> to restore all client sessions. My experience is that (1) takes close to 0
>> seconds while (2) can take between 20-30 seconds depending on how busy the
>> clients are; the MDS will go through various states before reaching active.
>> We usually have ca. 1600 client connections to our FS. With fewer clients,
>> MDS fail-over is practically instantaneous. We are using latest mimic.
>>
>> From what you write, you seem to have a 40 seconds window for (1), which
>> points to a problem different to MON config values. This is supported by
>> your description including a MON election (??? this should never happen).
>> Do you have have services co-located? Which of the times (1) or (2) are you
>> referring to? How many FS clients do you have?
>>
>> Best regards,
>> =
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> 
>> From: Patrick Donnelly 
>> Sent: 03 May 2021 17:19:37
>> To: Lokendra Rathour
>> Cc: Ceph Development; dev; ceph-users
>> Subject: [ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay
>> issue
>>
>> On Mon, May 3, 2021 at 6:36 AM Lokendra Rathour
>>  wrote:
>> >
>> > Hi Team,
>> > I was setting up the ceph cluster with
>> >
>> >- Node Details:3 Mon,2 MDS, 2 Mgr, 2 RGW
>> >- Deployment Type: Active Standby
>> >- Testing Mode: Failover of MDS Node
>> >- Setup : Octopus (15.2.7)
>> >- OS: centos 8.3
>> >- hardware: HP
>> >- Ram:  128 GB on each Node
>> >- OSD: 2 ( 1 tb each)
>> >- Operation: Normal I/O with mkdir on every 1 second.
>> >
>> > T*est Case: Power-off any active MDS Node for failover to happen*
>> >
>> > *Observation:*
>> >

[ceph-users] Spam from Chip Cox

2021-05-03 Thread Frank Schilder
Does anyone else receive unsolicited replies from sender "Chip Cox 
" to e-mails posted on this list?

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to set bluestore_rocksdb_options_annex

2021-05-03 Thread ceph
For the records i have tried/done this like:

Ceph config set osd bluestore_rocksdb_options_annex option1=8,option2=4

But i am not sure if it is necessary to restart the osds... cause

Ceph config dump

Shows 

... .. .
Osd advanced option1=8,option2=4 *
... .. .

The "*" is shown in the "RO" field..

Any Suggestions?

Thanks Mehmet

Am 28. April 2021 21:25:51 MESZ schrieb c...@elchaka.de:
>Hello Anthony,
>
> it was introduced in octopus 15.2.10
>See:  https://docs.ceph.com/en/latest/releases/octopus/
>
>Do you know how you would set it in pacific? :)
>Guess, there shouldnt be much difference...
>
>Thank you
>Mehmet
>
>Am 28. April 2021 19:21:19 MESZ schrieb Anthony D'Atri
>:
>>I think that’s new with Pacific.
>>
>>> On Apr 28, 2021, at 1:26 AM, c...@elchaka.de wrote:
>>> 
>>> 
>>> 
>>> Hello,
>>> 
>>> I have an octopus cluster and want to change some values - but i
>>cannot find any documentation on how to set values(multiple) with
>>> 
>>> bluestore_rocksdb_options_annex
>>> 
>>> Could someone give me some examples.
>>> I would like to do this like ceph config set ...
>>> 
>>> Thanks in advice
>>> Mehmet
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>___
>ceph-users mailing list -- ceph-users@ceph.io
>To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io