We're lucky in that we're already in the process of expanding the cluster
anyway - instead of expanding it, we'll just build a new BlueStore cluster
and migrate the data to it.
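For clusters that do have to convert in place instead, the upstream
BlueStore migration docs describe a one-OSD-at-a-time approach. A rough
sketch, where osd.1 and /dev/sdb are placeholders for your own OSD ID
and device:

  # drain the OSD, then wait until it can be removed without
  # dropping below the desired durability
  ceph osd out 1
  while ! ceph osd safe-to-destroy osd.1 ; do sleep 60 ; done

  # stop the daemon and destroy the OSD, keeping its ID so the
  # CRUSH map keeps its shape
  systemctl stop ceph-osd@1
  ceph osd destroy 1 --yes-i-really-mean-it

  # wipe the device and redeploy it as BlueStore under the same ID
  ceph-volume lvm zap /dev/sdb
  ceph-volume lvm create --bluestore --data /dev/sdb --osd-id 1

Reusing the OSD ID means each conversion only backfills that one disk's
data instead of triggering a cluster-wide rebalance.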
-----Original Message-----
From: Dan van der Ster <d...@vanderster.com>
Sent: Tuesday, July 14, 2020 9:17 AM
To: Eric Smith <eric.sm...@vecima.com>
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: Luminous 12.2.12 - filestore OSDs take an hour to boot

Thanks for this info -- adding it to our list of reasons never to use
FileStore again.

In your case, are you able to migrate?

On Tue, Jul 14, 2020 at 3:13 PM Eric Smith <eric.sm...@vecima.com> wrote:
>
> FWIW, BlueStore is not affected by this problem!
>
> -----Original Message-----
> From: Eric Smith <eric.sm...@vecima.com>
> Sent: Saturday, July 11, 2020 6:40 AM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: Luminous 12.2.12 - filestore OSDs take an hour to boot
>
> It does appear that long file names and FileStore are a real problem.
> We have a cluster where 99% of the objects have names longer than the
> limit (220+? characters), so FileStore truncates the on-disk file name
> (as seen below with "_<sha-sum>_0_long") and stores the full object
> name in the file's xattrs. During boot, the OSD goes out to lunch for
> an amount of time that grows with the number of on-disk objects that
> meet this criterion (with roughly 2.4 million such objects, the OSD
> takes over an hour to boot). I plan on testing this same scenario with
> BlueStore to see whether it's also susceptible to these boot/read
> issues.
>
> Eric
>
> -----Original Message-----
> From: Eric Smith <eric.sm...@vecima.com>
> Sent: Friday, July 10, 2020 1:46 PM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: Luminous 12.2.12 - filestore OSDs take an hour to boot
>
> For what it's worth - all of our objects are generating LONG-named
> object files like so:
>
> \uABCD\ucontent.\srecording\swzdchd\u\utnda-trg-1008007-wzdchd-216203706303281120-230932949-1593482400-159348660000000001\swzdchd\u\utpc2-tp1-1008007-wzdchd-216203706303281120-230932949-1593482400-159348660000000001\u\uwzdchd3._0bfd7c716b839cb7b3ad_0_long
>
> Does this matter? AFAICT FileStore sees this as a long file name and
> has to look up the object name in the xattrs? Is that bad?
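A cheap way to gauge how exposed a given FileStore OSD is: count the
truncated names directly on disk. A minimal sketch, assuming the default
data path for osd.1 and the "_long" suffix shown above (adjust both for
your layout):

  # count objects whose on-disk names were truncated; each of these
  # costs the OSD an xattr lookup to recover the full object name
  find /var/lib/ceph/osd/ceph-1/current -type f -name '*_long' | wc -l

  # rough per-PG breakdown (the PG directory is the 8th path component
  # in the default layout), to predict which PGs will list slowest
  find /var/lib/ceph/osd/ceph-1/current -type f -name '*_long' \
    | cut -d/ -f8 | sort | uniq -c | sort -rn | head

Since boot time grows with this count, the totals here should roughly
predict which OSDs will take longest to start.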
> -----Original Message-----
> From: Eric Smith <eric.sm...@vecima.com>
> Sent: Friday, July 10, 2020 6:59 AM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Luminous 12.2.12 - filestore OSDs take an hour to boot
>
> I have a cluster running Luminous 12.2.12 with FileStore and it takes
> my OSDs somewhere around an hour to start (they do start successfully -
> eventually). I have the following log entries that seem to show the OSD
> process attempting to descend into the PG directory on disk and create
> an object list of some sort:
>
> 2020-07-09 18:29:28.017207 7f3b680afd80 20 osd.1 137390 clearing temps in 8.14ads3_head pgid 8.14ads3
> 2020-07-09 18:29:28.017211 7f3b680afd80 20 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5012): pool is 8 shard is 3 pgid 8.14ads3
> 2020-07-09 18:29:28.017213 7f3b680afd80 10 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5020): first checking temp pool
> 2020-07-09 18:29:28.017215 7f3b680afd80 20 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5012): pool is -10 shard is 3 pgid 8.14ads3
> 2020-07-09 18:29:28.017221 7f3b680afd80 20 _collection_list_partial start:GHMIN end:GHMAX-64 ls.size 0
> 2020-07-09 18:29:28.017263 7f3b680afd80 20 filestore(/var/lib/ceph/osd/ceph-1) objects: []
> 2020-07-09 18:29:28.017268 7f3b680afd80 10 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5028): fall through to non-temp collection, start 3#-1:00000000::::0#
> 2020-07-09 18:29:28.017272 7f3b680afd80 20 _collection_list_partial start:3#-1:00000000::::0# end:GHMAX-64 ls.size 0
> 2020-07-09 18:29:28.038124 7f3b680afd80 20 list_by_hash_bitwise prefix D
> 2020-07-09 18:29:28.058679 7f3b680afd80 20 list_by_hash_bitwise prefix DA
> 2020-07-09 18:29:28.069432 7f3b680afd80 20 list_by_hash_bitwise prefix DA4
> 2020-07-09 18:29:29.789598 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000074
> 2020-07-09 18:29:29.789634 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
> 2020-07-09 18:29:29.789639 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
> 2020-07-09 18:29:29.789641 7f3b51a87700 10 journal commit_start nothing to do
> 2020-07-09 18:29:29.789663 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994): waiting for max_interval 5.000000
> 2020-07-09 18:29:34.789815 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000109
> 2020-07-09 18:29:34.789898 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
> 2020-07-09 18:29:34.789902 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
> 2020-07-09 18:29:34.789906 7f3b51a87700 10 journal commit_start nothing to do
> 2020-07-09 18:29:34.789939 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994): waiting for max_interval 5.000000
> 2020-07-09 18:29:38.651689 7f3b680afd80 20 list_by_hash_bitwise prefix DA41
> 2020-07-09 18:29:39.790069 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000128
> 2020-07-09 18:29:39.790090 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
> 2020-07-09 18:29:39.790092 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
> 2020-07-09 18:29:39.790093 7f3b51a87700 10 journal commit_start nothing to do
> 2020-07-09 18:29:39.790102 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994): waiting for max_interval 5.000000
> 2020-07-09 18:29:44.790200 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000095
> 2020-07-09 18:29:44.790256 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
> 2020-07-09 18:29:44.790265 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
> 2020-07-09 18:29:44.790268 7f3b51a87700 10 journal commit_start nothing to do
> 2020-07-09 18:29:44.790286 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994): waiting for max_interval 5.000000
> 2020-07-09 18:29:49.790353 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000066
> 2020-07-09 18:29:49.790374 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
> 2020-07-09 18:29:49.790376 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
> 2020-07-09 18:29:49.790378 7f3b51a87700 10 journal commit_start nothing to do
> 2020-07-09 18:29:49.790387 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994): waiting for max_interval 5.000000
> 2020-07-09 18:29:50.564479 7f3b680afd80 20 list_by_hash_bitwise prefix DA410000
> 2020-07-09 18:29:50.564501 7f3b680afd80 20 list_by_hash_bitwise prefix DA410000 ob 3#8:b5280000::::head#
> 2020-07-09 18:29:50.564508 7f3b680afd80 20 list_by_hash_bitwise prefix DA41002A
>
> Any idea what's going on here? I can run a find of every file on the
> filesystem in under 12 minutes, so I'm not sure what's taking so long.
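One way to test where the time goes: a bare find only walks directory
entries, while the collection listing above also has to read each
truncated object's full name back from its xattrs. Comparing the two on
the same OSD (again assuming the default data path for osd.1) separates
traversal cost from per-object xattr cost:

  # directory walk alone - roughly what the 12-minute find measures
  time find /var/lib/ceph/osd/ceph-1/current -type f > /dev/null

  # the same walk, but also dumping every truncated object's xattrs,
  # which is the extra work the OSD does to recover full object names
  time find /var/lib/ceph/osd/ceph-1/current -type f -name '*_long' \
    -exec getfattr -d -m '.' {} + > /dev/null 2>&1

If the second run is dramatically slower, the millions of per-object
xattr reads, not the directory walk itself, account for the gap between
a 12-minute find and an hour-long boot.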