Hi Dan @ co.,
Thanks for the support (moral and technical).

That sounds like a good guess, but there seems to be nothing alarming 
here. In all our pools, some PGs are a bit over 3100 entries, but none at 
exceptional values.

jq '.pg_map.pg_stats[] |
select(.ondisk_log_size > 3100)' pgdumpfull.txt | grep -E "pgid|ondisk_log_size"
  "pgid": "37.2b9",
  "ondisk_log_size": 3103,
  "pgid": "33.e",
  "ondisk_log_size": 3229,
  "pgid": "7.2",
  "ondisk_log_size": 3111,
  "pgid": "26.4",
  "ondisk_log_size": 3185,
  "pgid": "33.4",
  "ondisk_log_size": 3311,
  "pgid": "33.8",
  "ondisk_log_size": 3278,
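(For what it's worth, the grep step can be folded into jq itself so each 
matching PG comes out as one "pgid size" line; this is just a cosmetic 
variant of the same query, still assuming pgdumpfull.txt holds the 
`ceph pg dump -f json` output.)

```shell
# Emit "pgid ondisk_log_size" pairs directly from jq, no grep pass needed.
jq -r '.pg_map.pg_stats[]
       | select(.ondisk_log_size > 3100)
       | "\(.pgid) \(.ondisk_log_size)"' pgdumpfull.txt
```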

I also have no idea what the average size of a pg log entry should be; in our 
case it seems to be around 7 MB (22 GB / 3000 entries).
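(Written out, the back-of-the-envelope division looks like this; the 
160 PGs-per-OSD figure in the second estimate is just a rough guess from 
our ~20k PGs, not something I've measured.)

```shell
# Rough arithmetic on the pg_log memory figures from this thread.
pglog_bytes=$((22 * 1024 * 1024 * 1024))   # ~22 GB of osd_pglog memory per OSD
entries=3000                                # default pg_log length per PG
echo "$((pglog_bytes / entries / 1024 / 1024)) MB per entry if a single PG's log dominated"

# Averaged over all PGs on the OSD the per-entry size is much smaller.
pgs=160                                     # hypothetical PGs per OSD, not measured
echo "$((pglog_bytes / (entries * pgs) / 1024)) KB per entry averaged over ~160 PGs"
```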

Cheers,
Kalle

----- Original Message -----
> From: "Dan van der Ster" <d...@vanderster.com>
> To: "Kalle Happonen" <kalle.happo...@csc.fi>
> Cc: "ceph-users" <ceph-users@ceph.io>, "xie xingguo" 
> <xie.xing...@zte.com.cn>, "Samuel Just" <sj...@redhat.com>
> Sent: Tuesday, 17 November, 2020 12:22:28
> Subject: Re: [ceph-users] osd_pglog memory hoarding - another case

> Hi Kalle,
> 
> Do you have active PGs now with huge pglogs?
> You can do something like this to find them:
> 
>   ceph pg dump -f json | jq '.pg_map.pg_stats[] |
> select(.ondisk_log_size > 3000)'
> 
> If you find some, could you increase to debug_osd = 10 then share the osd log.
> I am interested in the debug lines from calc_trim_to_aggressively (or
> calc_trim_to if you didn't enable pglog_hardlimit), but the whole log
> might show other issues.
> 
> Cheers, dan
> 
> 
> On Tue, Nov 17, 2020 at 9:55 AM Dan van der Ster <d...@vanderster.com> wrote:
>>
>> Hi Kalle,
>>
>> Strangely and luckily, in our case the memory explosion didn't reoccur
>> after that incident. So I can mostly only offer moral support.
>>
>> But if this bug indeed appeared between 14.2.8 and 14.2.13, then I
>> think this is suspicious:
>>
>>    b670715eb4 osd/PeeringState: do not trim pg log past last_update_ondisk
>>
>>    https://github.com/ceph/ceph/commit/b670715eb4
>>
>> Given that it adds a case where the pg_log is not trimmed, I wonder if
>> there could be an unforeseen condition where `last_update_ondisk`
>> isn't being updated correctly, and therefore the osd stops trimming
>> the pg_log altogether.
>>
>> Xie or Samuel: does that sound possible?
>>
>> Cheers, Dan
>>
>> On Tue, Nov 17, 2020 at 9:35 AM Kalle Happonen <kalle.happo...@csc.fi> wrote:
>> >
>> > Hello all,
>> > wrt:
>> > https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/7IMIWCKIHXNULEBHVUIXQQGYUDJAO2SF/
>> >
>> > Yesterday we hit a problem with osd_pglog memory, similar to the thread 
>> > above.
>> >
>> > We have a 56 node object storage (S3+SWIFT) cluster with 25 OSD disk per 
>> > node.
>> > We run 8+3 EC for the data pool (metadata is on replicated nvme pool).
>> >
>> > The cluster has been running fine, and (as relevant to the post) the memory
>> > usage has been stable at 100 GB / node. We've had the default pg_log of 
>> > 3000.
>> > The user traffic doesn't seem to have been exceptional lately.
>> >
>> > Last Thursday we updated the OSDs from 14.2.8 -> 14.2.13. On Friday the 
>> > memory
>> > usage on OSD nodes started to grow. On each node it grew steadily about 30
>> > GB/day, until the servers started OOM killing OSD processes.
>> >
>> > After a lot of debugging we found that the pg_logs were huge. Each OSD 
>> > process
>> > pg_log had grown to ~22GB, which we naturally didn't have memory for, and 
>> > then
>> > the cluster was in an unstable situation. This is significantly more than 
>> > the
>> > 1.5 GB in the post above. We do have ~20k PGs, which may directly affect 
>> > the
>> > size.
>> >
>> > We've reduced the pg_log to 500, and started offline trimming it where we 
>> > can, and also just waited. The pg_log size dropped to ~1.2 GB on at least 
>> > some nodes, but we're still recovering, and still have a lot of OSDs down 
>> > and out.
>> >
>> > We're unsure if version 14.2.13 triggered this, or if the osd restarts 
>> > triggered
>> > this (or something unrelated we don't see).
>> >
>> > This mail is mostly to figure out if there are good guesses why the pg_log 
>> > size
>> > per OSD process exploded? Any technical (and moral) support is appreciated.
>> > Also, currently we're not sure if 14.2.13 triggered this, so this is also 
>> > to
>> > put a data point out there for other debuggers.
>> >
>> > Cheers,
>> > Kalle Happonen
>> > _______________________________________________
>> > ceph-users mailing list -- ceph-users@ceph.io
>> > To unsubscribe send an email to ceph-users-le...@ceph.io