Hello Igor

>> It looks like you're right.
>>
>> ceph tell osd.1750 status
>> {
>>     "cluster_fsid": "bec60cda-a306-11ed-abd9-75488d4e8f4a",
>>     "osd_fsid": "388906f8-1df8-45a2-9895-067ee2e0c055",
>>     "whoami": 1750,
>>     "state": "active",
>>     "maps": "[1502335~261370]",
>>     "oldest_map": "1502335",
>>     "newest_map": "1763704",
>>     "cluster_osdmap_trim_lower_bound": 1502335,
>>     "num_pgs": 0
>> }
>>
>> One of the OSDs I've checked has about 261K maps
>> Could this cause bluestore to grow to ~380GiB?
>
> You better check objects size/count in meta pool using
> ceph-objectstore-tool and estimate the total numbers:
>
> ceph-objectstore-tool --data-path <path-to-osd> --op meta-list > meta_list; cat meta_list | wc
>
> ceph-objectstore-tool --data-path <path-to-osd> --pgid meta <oid> dump | grep size
>
> The latter command can obtain onode size for a given object - just use a
> few oids corresponding to specific onode types (osdmap* and inc_osdmap*
> are of particular interest, other types to be checked if they are in
> bulky counts) from meta_list file.
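[Editor's sketch, not part of Igor's mail: one way to combine the two commands above. The ./osd.1750 path, the 5-object sample and the ~1.7 MB average map size are assumptions, and it presumes each meta_list line can be passed back verbatim as the <oid> argument.]

OSD=./osd.1750                                   # assumed data path of a stopped OSD

ceph-objectstore-tool --data-path "$OSD" --op meta-list > meta_list
wc -l meta_list                                  # total meta objects
grep -c '"osdmap\.' meta_list                    # full osdmaps
grep -c inc_osdmap meta_list                     # incremental osdmaps

# sample a few onode sizes
head -n 5 meta_list | while read -r oid; do
  ceph-objectstore-tool --data-path "$OSD" --pgid meta "$oid" dump | grep '"size"'
done

# very rough total: 261370 maps reported by "ceph tell osd.1750 status" above,
# assuming ~1.7 MB per full map (check against the sampled sizes)
awk 'BEGIN { printf "%.0f GiB\n", 261370 * 1.7e6 / 2^30 }'   # ~414 GiB

That back-of-envelope total is in the same ballpark as the ~380 GiB of growth mentioned above.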
There are a lot of objects on the OSD:

ceph-objectstore-tool --data-path ./osd.1750 --op meta-list | wc -l
528023

Almost all of them are osdmap or inc_osdmap objects. On a smaller, healthy cluster the same command gives me about 1400 objects. I've also checked a few osdmap.* objects; they are 6-7 times bigger than the osdmap.* objects on the smaller healthy cluster.
Examples:

ceph-objectstore-tool --data-path ./osd.1750 --pgid meta '{"oid":"osdmap.1612379","key":"","snapid":0,"hash":4233999933,"max":0,"pool":-1,"namespace":"","max":0}' dump | grep size
Error getting attr on : meta,#-1:bc6dba3f:::osdmap.1612379:0#, (61) No data available
    "size": 1622339,
    "blksize": 4096,
    "size": 1622339,
    "expected_object_size": 0,
    "expected_write_size": 0,

ceph-objectstore-tool --data-path ./osd.1750 --pgid meta '{"oid":"osdmap.1524595","key":"","snapid":0,"hash":4208913981,"max":0,"pool":-1,"namespace":"","max":0}' dump | grep size
Error getting attr on : meta,#-1:bc777b5f:::osdmap.1524595:0#, (61) No data available
    "size": 1922137,
    "blksize": 4096,
    "size": 1922137,
    "expected_object_size": 0,
    "expected_write_size": 0,

>> What settings could affect the number of maps stored on an OSD?
>> The only thing that comes to mind is mon_min_osdmap_epochs, which I
>> configured to 2000 a while ago.
>>
> osdmaps should be trimmed automatically in a healthy cluster. Perhaps
> an ongoing rebalancing or some other issue prevents that. The first
> question would be how the osdmap epochs evolve. Is oldest_map increasing?
> Is the delta decreasing?

oldest_map is not increasing. The delta has slightly increased since yesterday, by about 2763. SSD OSD usage increased by about 1% yesterday.

We are quite close to finishing the backfill. I'm expecting to get an active+clean cluster before the OSDs fill up.

  data:
    volumes: 1/1 healthy
    pools:   8 pools, 20641 pgs
    objects: 2.15G objects, 7.6 PiB
    usage:   11 PiB used, 16 PiB / 27 PiB avail
    pgs:     196301608/17155230197 objects misplaced (1.144%)
             17576 active+clean
             1723  active+clean+scrubbing
             779   active+clean+scrubbing+deep
             381   active+remapped+backfill_wait
             182   active+remapped+backfilling

  io:
    client:   232 B/s rd, 0 op/s rd, 0 op/s wr
    recovery: 2.4 GiB/s, 639 objects/s

I guess we will wait 3-4 days for the cluster recovery to finish, and I will update you if something happens once it reaches HEALTH_OK.
Is there a way to force epoch/osdmap trimming on a rebalancing cluster? Afaik there is no way to do it. It would be nice if we had such an ability. It's not the first time we've hit this class of issue (big mon DBs). I would consider it a major flaw of Ceph: the cluster cannot clean up after itself until it is in a perfectly healthy state.
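[Editor's sketch, not from the thread: one way to watch whether oldest_map starts moving and the delta shrinking once the backfill completes. It assumes jq is installed, and the OSD ids below are placeholders.]

for id in 1750 1751 1752; do        # placeholder OSD ids
  ceph tell osd.$id status |
    jq -r '"osd.\(.whoami): oldest=\(.oldest_map) newest=\(.newest_map) delta=\((.newest_map|tonumber)-(.oldest_map|tonumber))"'
done

If trimming resumes as expected once the cluster is active+clean, the delta should eventually shrink back toward the configured mon_min_osdmap_epochs value.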
Best regards
Adam Prycki