: Webert de Souza Lima ; CUZA Frédéric
Cc: ceph-users
Subject: RE: [ceph-users] Whole cluster flapping
Hi again Frederic,
It may be worth looking at a recovery sleep.
osd recovery sleep
Description:
Time in seconds to sleep before next recovery or backfill op. Increasing this
value will slow down
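As a rough illustration of the option described above, a ceph.conf fragment might look like the following. The value 0.1 is only an assumed starting point, not a figure given in this thread; adjust it against your own cluster load.

```ini
[osd]
# Assumed example value: sleep 0.1 s before each recovery or backfill op
osd recovery sleep = 0.1
```

A larger value throttles recovery more, leaving more disk time for client I/O and heartbeats at the cost of a longer recovery.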
s On Behalf Of Webert de
Souza Lima
Sent: 08 August 2018 15:06
To: frederic.c...@sib.fr
Cc: ceph-users
Subject: Re: [ceph-users] Whole cluster flapping
So your OSDs are really too busy to respond to heartbeats.
You'll be facing this for some time, until the cluster load gets lower.
I would set `ceph osd
p is_healthy 'OSD::osd_op_tp thread 0x7fdabd897700' had
> timed out after 90
>
>
>
> (I updated it to 90s instead of 15s)
>
>
>
> Regards,
>
> *From:* ceph-users *On Behalf Of*
> Webert de Souza Lima
> *Sent:* 07 August
8
To: ceph-users
Subject: Re: [ceph-users] Whole cluster flapping
oops, my bad, you're right.
I don't know how much you can see, but maybe you can dig around the performance
counters and see what's happening on those OSDs. Try these:
~# ceph daemonperf osd.XX
~# ceph daemon osd.XX perf du
a Lima
> *Sent:* 07 August 2018 15:08
> *To:* ceph-users
> *Subject:* Re: [ceph-users] Whole cluster flapping
>
>
>
> Frédéric,
>
>
>
> see if the number of objects is decreasing in the pool with `ceph df
> [detail]`
>
>
>
> Regards,
>
>
>
>
Pool is already deleted and no longer present in stats.
Regards,
From: ceph-users On Behalf Of Webert de
Souza Lima
Sent: 07 August 2018 15:08
To: ceph-users
Subject: Re: [ceph-users] Whole cluster flapping
Frédéric,
see if the number of objects is decreasing in the pool with `ceph df
> Regards,
>
>
>
> *From:* ceph-users *On Behalf Of*
> Webert de Souza Lima
> *Sent:* 31 July 2018 16:25
> *To:* ceph-users
> *Subject:* Re: [ceph-users] Whole cluster flapping
>
>
>
> The pool deletion might have triggered a lot of IO operations on the
.
Regards,
From: ceph-users On Behalf Of Webert de
Souza Lima
Sent: 31 July 2018 16:25
To: ceph-users
Subject: Re: [ceph-users] Whole cluster flapping
The pool deletion might have triggered a lot of IO operations on the disks and
the process might be too busy to respond to heartbeats, so the
hanks for all.
Regards,
From: Brent Kennedy
Sent: 31 July 2018 23:36
To: CUZA Frédéric ; 'ceph-users'
Subject: RE: [ceph-users] Whole cluster flapping
I have had this happen during large data movements. It stopped happening after I
went to 10Gb though (from 1Gb). What I had done is inj
...@lists.ceph.com] On Behalf Of
CUZA Frédéric
Sent: Tuesday, July 31, 2018 5:06 AM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Whole cluster flapping
Hi Everyone,
I just upgraded our cluster to Luminous 12.2.7 and deleted a quite large
pool that we had (120 TB).
Our cluster is made of 14 Nodes
The pool deletion might have triggered a lot of IO operations on the disks,
and the OSD processes might be too busy to respond to heartbeats, so the mons
mark them as down due to no response.
Also check the OSD logs to see whether they are actually crashing and
restarting, and check disk IO usage (e.g. with iostat).
Rega
Hi Everyone,
I just upgraded our cluster to Luminous 12.2.7 and deleted a quite large pool
that we had (120 TB).
Our cluster is made of 14 nodes, each with 12 OSDs (1 HDD -> 1 OSD),
and we have SSDs for the journals.
After I deleted the large pool, my cluster started flapping on all OSDs.
Os