the memory can balloon for some unknown reason.
> The devs have asked a couple of times for dumps of those logs replaying
> huge-memory-causing pglogs.
>
> In this case -- Benjamin's issue -- I'm trying to understand if this
> is related to:
> * a huge pg log -- would need t
Oh jeez, sorry about the subject line - I forgot to change it after asking
a coworker to review the message. This is not a draft.
On Mon, Jan 24, 2022 at 6:44 PM Benjamin Staffin wrote:
> I have a cluster where 46 out of 120 OSDs have begun crash looping with
> the same stack trace (see
I have a cluster where 46 out of 120 OSDs have begun crash looping with the
same stack trace (see pasted output below). The cluster is in a very bad
state with this many OSDs down, unsurprisingly.
The day before this problem showed up, the k8s cluster was under extreme
memory pressure and a lot o