We're lucky in that we're already in the process of expanding the cluster
anyway - instead of expanding it, we'll just build a new BlueStore cluster
and migrate the data to it.
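For clusters that do have to convert in place instead, the upstream
BlueStore migration docs describe a one-OSD-at-a-time approach. A rough
sketch, where osd.1 and /dev/sdb are placeholders for your own OSD ID
and device:

  # drain the OSD, then wait until it can be removed without
  # dropping below the desired durability
  ceph osd out 1
  while ! ceph osd safe-to-destroy osd.1 ; do sleep 60 ; done

  # stop the daemon and destroy the OSD, keeping its ID so the
  # CRUSH map keeps its shape
  systemctl stop ceph-osd@1
  ceph osd destroy 1 --yes-i-really-mean-it

  # wipe the device and redeploy it as BlueStore under the same ID
  ceph-volume lvm zap /dev/sdb
  ceph-volume lvm create --bluestore --data /dev/sdb --osd-id 1

Reusing the OSD ID means each conversion only backfills that one disk's
data instead of triggering a cluster-wide rebalance.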
-----Original Message-----
From: Dan van der Ster <d...@vanderster.com>
Sent: Tuesday, July 14, 2020 9:17 AM
To: Eric Smith <eric.sm...@vecima.com>
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: Luminous 12.2.12 - filestore OSDs take an hour to boot

Thanks for this info -- adding it to our list of reasons never to use
FileStore again.

In your case, are you able to migrate?

On Tue, Jul 14, 2020 at 3:13 PM Eric Smith <eric.sm...@vecima.com> wrote:
>
> FWIW, BlueStore is not affected by this problem!
>
> -----Original Message-----
> From: Eric Smith <eric.sm...@vecima.com>
> Sent: Saturday, July 11, 2020 6:40 AM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: Luminous 12.2.12 - filestore OSDs take an hour to boot
>
> It does appear that long file names and FileStore are a real problem.
> We have a cluster where 99% of the objects have names longer than the
> limit (220+? characters), so FileStore truncates the on-disk file name
> (as seen below with "_<sha-sum>_0_long") and stores the full object
> name in the file's xattrs. During boot, the OSD goes out to lunch for
> an amount of time that grows with the number of on-disk objects that
> meet this criterion (with roughly 2.4 million such objects, the OSD
> takes over an hour to boot). I plan on testing this same scenario with
> BlueStore to see whether it's also susceptible to these boot/read
> issues.
>
> Eric
>
> -----Original Message-----
> From: Eric Smith <eric.sm...@vecima.com>
> Sent: Friday, July 10, 2020 1:46 PM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: Luminous 12.2.12 - filestore OSDs take an hour to boot
>
> For what it's worth - all of our objects are generating LONG-named
> object files like so:
>
> \uABCD\ucontent.\srecording\swzdchd\u\utnda-trg-1008007-wzdchd-216203706303281120-230932949-1593482400-159348660000000001\swzdchd\u\utpc2-tp1-1008007-wzdchd-216203706303281120-230932949-1593482400-159348660000000001\u\uwzdchd3._0bfd7c716b839cb7b3ad_0_long
>
> Does this matter? AFAICT FileStore sees this as a long file name and
> has to look up the object name in the xattrs? Is that bad?
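A cheap way to gauge how exposed a given FileStore OSD is: count the
truncated names directly on disk. A minimal sketch, assuming the default
data path for osd.1 and the "_long" suffix shown above (adjust both for
your layout):

  # count objects whose on-disk names were truncated; each of these
  # costs the OSD an xattr lookup to recover the full object name
  find /var/lib/ceph/osd/ceph-1/current -type f -name '*_long' | wc -l

  # rough per-PG breakdown (the PG directory is the 8th path component
  # in the default layout), to predict which PGs will list slowest
  find /var/lib/ceph/osd/ceph-1/current -type f -name '*_long' \
    | cut -d/ -f8 | sort | uniq -c | sort -rn | head

Since boot time grows with this count, the totals here should roughly
predict which OSDs will take longest to start.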
> -----Original Message-----
> From: Eric Smith <eric.sm...@vecima.com>
> Sent: Friday, July 10, 2020 6:59 AM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Luminous 12.2.12 - filestore OSDs take an hour to boot
>
> I have a cluster running Luminous 12.2.12 with FileStore and it takes
> my OSDs somewhere around an hour to start (they do start successfully -
> eventually). I have the following log entries that seem to show the OSD
> process attempting to descend into the PG directory on disk and create
> an object list of some sort:
>
> 2020-07-09 18:29:28.017207 7f3b680afd80 20 osd.1 137390 clearing temps in 8.14ads3_head pgid 8.14ads3
> 2020-07-09 18:29:28.017211 7f3b680afd80 20 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5012): pool is 8 shard is 3 pgid 8.14ads3
> 2020-07-09 18:29:28.017213 7f3b680afd80 10 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5020): first checking temp pool
> 2020-07-09 18:29:28.017215 7f3b680afd80 20 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5012): pool is -10 shard is 3 pgid 8.14ads3
> 2020-07-09 18:29:28.017221 7f3b680afd80 20 _collection_list_partial start:GHMIN end:GHMAX-64 ls.size 0
> 2020-07-09 18:29:28.017263 7f3b680afd80 20 filestore(/var/lib/ceph/osd/ceph-1) objects: []
> 2020-07-09 18:29:28.017268 7f3b680afd80 10 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5028): fall through to non-temp collection, start 3#-1:00000000::::0#
> 2020-07-09 18:29:28.017272 7f3b680afd80 20 _collection_list_partial start:3#-1:00000000::::0# end:GHMAX-64 ls.size 0
> 2020-07-09 18:29:28.038124 7f3b680afd80 20 list_by_hash_bitwise prefix D
> 2020-07-09 18:29:28.058679 7f3b680afd80 20 list_by_hash_bitwise prefix DA
> 2020-07-09 18:29:28.069432 7f3b680afd80 20 list_by_hash_bitwise prefix DA4
> 2020-07-09 18:29:29.789598 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000074
> 2020-07-09 18:29:29.789634 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
> 2020-07-09 18:29:29.789639 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
> 2020-07-09 18:29:29.789641 7f3b51a87700 10 journal commit_start nothing to do
> 2020-07-09 18:29:29.789663 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994): waiting for max_interval 5.000000
> 2020-07-09 18:29:34.789815 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000109
> 2020-07-09 18:29:34.789898 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
> 2020-07-09 18:29:34.789902 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
> 2020-07-09 18:29:34.789906 7f3b51a87700 10 journal commit_start nothing to do
> 2020-07-09 18:29:34.789939 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994): waiting for max_interval 5.000000
> 2020-07-09 18:29:38.651689 7f3b680afd80 20 list_by_hash_bitwise prefix DA41
> 2020-07-09 18:29:39.790069 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000128
> 2020-07-09 18:29:39.790090 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
> 2020-07-09 18:29:39.790092 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
> 2020-07-09 18:29:39.790093 7f3b51a87700 10 journal commit_start nothing to do
> 2020-07-09 18:29:39.790102 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994): waiting for max_interval 5.000000
> 2020-07-09 18:29:44.790200 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000095
> 2020-07-09 18:29:44.790256 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
> 2020-07-09 18:29:44.790265 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
> 2020-07-09 18:29:44.790268 7f3b51a87700 10 journal commit_start nothing to do
> 2020-07-09 18:29:44.790286 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994): waiting for max_interval 5.000000
> 2020-07-09 18:29:49.790353 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000066
> 2020-07-09 18:29:49.790374 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
> 2020-07-09 18:29:49.790376 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
> 2020-07-09 18:29:49.790378 7f3b51a87700 10 journal commit_start nothing to do
> 2020-07-09 18:29:49.790387 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994): waiting for max_interval 5.000000
> 2020-07-09 18:29:50.564479 7f3b680afd80 20 list_by_hash_bitwise prefix DA410000
> 2020-07-09 18:29:50.564501 7f3b680afd80 20 list_by_hash_bitwise prefix DA410000 ob 3#8:b5280000::::head#
> 2020-07-09 18:29:50.564508 7f3b680afd80 20 list_by_hash_bitwise prefix DA41002A
>
> Any idea what's going on here? I can run a find of every file on the
> filesystem in under 12 minutes, so I'm not sure what's taking so long.
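One way to test where the time goes: a bare find only walks directory
entries, while the collection listing above also has to read each
truncated object's full name back from its xattrs. Comparing the two on
the same OSD (again assuming the default data path for osd.1) separates
traversal cost from per-object xattr cost:

  # directory walk alone - roughly what the 12-minute find measures
  time find /var/lib/ceph/osd/ceph-1/current -type f > /dev/null

  # the same walk, but also dumping every truncated object's xattrs,
  # which is the extra work the OSD does to recover full object names
  time find /var/lib/ceph/osd/ceph-1/current -type f -name '*_long' \
    -exec getfattr -d -m '.' {} + > /dev/null 2>&1

If the second run is dramatically slower, the millions of per-object
xattr reads, not the directory walk itself, account for the gap between
a 12-minute find and an hour-long boot.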