[ceph-users] Re: Luminous 12.2.12 - filestore OSDs take an hour to boot
It does appear that long file names and filestore are a real problem. We have a cluster where 99% of the objects have names longer than N (220+?) characters, so filestore truncates the on-disk file name (as seen below with "__0_long") and stores the full object name in xattrs on the object. During boot the OSD goes out to lunch for an amount of time that grows with the number of on-disk objects meeting this criterion (with roughly 2.4 million such objects, the OSD takes over an hour to boot). I plan on testing the same scenario with BlueStore to see whether it is also susceptible to these boot/read issues.

Eric

-----Original Message-----
From: Eric Smith
Sent: Friday, July 10, 2020 1:46 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Luminous 12.2.12 - filestore OSDs take an hour to boot

For what it's worth - all of our objects are generating LONG-named object files like so:

\uABCD\ucontent.\srecording\swzdchd\u\utnda-trg-1008007-wzdchd-216203706303281120-230932949-1593482400-1593486601\swzdchd\u\utpc2-tp1-1008007-wzdchd-216203706303281120-230932949-1593482400-1593486601\u\uwzdchd3._0bfd7c716b839cb7b3ad_0_long

Does this matter? AFAICT filestore sees this as a long file name and has to look up the object name in the xattrs? Is that bad?

-----Original Message-----
From: Eric Smith
Sent: Friday, July 10, 2020 6:59 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Luminous 12.2.12 - filestore OSDs take an hour to boot

I have a cluster running Luminous 12.2.12 with filestore, and it takes my OSDs somewhere around an hour to start (they do start successfully - eventually).
I have the following log entries that seem to show the OSD process descending into the PG directory on disk and building an object listing of some sort:

2020-07-09 18:29:28.017207 7f3b680afd80 20 osd.1 137390 clearing temps in 8.14ads3_head pgid 8.14ads3
2020-07-09 18:29:28.017211 7f3b680afd80 20 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5012): pool is 8 shard is 3 pgid 8.14ads3
2020-07-09 18:29:28.017213 7f3b680afd80 10 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5020): first checking temp pool
2020-07-09 18:29:28.017215 7f3b680afd80 20 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5012): pool is -10 shard is 3 pgid 8.14ads3
2020-07-09 18:29:28.017221 7f3b680afd80 20 _collection_list_partial start:GHMIN end:GHMAX-64 ls.size 0
2020-07-09 18:29:28.017263 7f3b680afd80 20 filestore(/var/lib/ceph/osd/ceph-1) objects: []
2020-07-09 18:29:28.017268 7f3b680afd80 10 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5028): fall through to non-temp collection, start 3#-1:0#
2020-07-09 18:29:28.017272 7f3b680afd80 20 _collection_list_partial start:3#-1:0# end:GHMAX-64 ls.size 0
2020-07-09 18:29:28.038124 7f3b680afd80 20 list_by_hash_bitwise prefix D
2020-07-09 18:29:28.058679 7f3b680afd80 20 list_by_hash_bitwise prefix DA
2020-07-09 18:29:28.069432 7f3b680afd80 20 list_by_hash_bitwise prefix DA4
2020-07-09 18:29:29.789598 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.74
2020-07-09 18:29:29.789634 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
2020-07-09 18:29:29.789639 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
2020-07-09 18:29:29.789641 7f3b51a87700 10 journal commit_start nothing to do
2020-07-09 18:29:29.789663 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994): waiting for max_interval 5.00
2020-07-09 18:29:34.789815 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000109
2020-07-09 18:29:34.789898 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
2020-07-09 18:29:34.789902 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
2020-07-09 18:29:34.789906 7f3b51a87700 10 journal commit_start nothing to do
2020-07-09 18:29:34.789939 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994): waiting for max_interval 5.00
2020-07-09 18:29:38.651689 7f3b680afd80 20 list_by_hash_bitwise prefix DA41
2020-07-09 18:29:39.790069 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000128
2020-07-09 18:29:39.790090 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
2020-07-09 18:29:39.790092 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
2020-07-09 18:29:39.790093 7f3b51a87700 10 journal commit_start nothing to do
2020-07-09 18:29:39.790102 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994): waiting for max_interval 5.00
2020-07-09 18:29:44.790200 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.95
2020-07-09 18:29:44.790256 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
2020-07-09 18:29:44.790265 7f3b51a87700 10 journal commit_start blocked, all o
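For sizing the long-name problem described in this thread, the truncated objects are visible directly on a filestore OSD's data directory: each one ends in "_long", with the full object name kept in user xattrs. A rough sketch, assuming the default filestore layout under /var/lib/ceph/osd (paths and the xattr prefix may differ on your deployment):

```shell
# Count truncated long-name object files on one OSD (sketch; assumes the
# default filestore data layout; adjust OSD_DIR for your cluster).
OSD_DIR=${OSD_DIR:-/var/lib/ceph/osd/ceph-1}
find "$OSD_DIR/current" -name '*_long' 2>/dev/null | wc -l

# The full object name lives in user xattrs on the truncated file; the
# exact attribute name varies by filestore index version, e.g.:
#   getfattr -d -m 'user.cephos' "$OSD_DIR/current/<pg>_head/<...>__0_long"
```

Comparing that count across OSDs should correlate with the boot times observed above.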
[ceph-users] Radosgw activity in cephadmin
Hi,

I have been trying to get an S3-compliant gateway (RGW) up and running on Ceph using cephadm, but I am unable to proceed with radosgw-admin.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
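One thing that often trips people up with the question above: in a cephadm deployment the Ceph CLI tools, including radosgw-admin, normally run inside the management container, which already has the cluster's ceph.conf and admin keyring; on the bare host they may be missing or unable to reach the cluster. A sketch, assuming cephadm is installed on the host:

```shell
# Run radosgw-admin inside the cephadm shell container, which has
# ceph.conf and the admin keyring mounted (sketch; requires cephadm).
if command -v cephadm >/dev/null 2>&1; then
    cephadm shell -- radosgw-admin user list
else
    echo "cephadm not found on PATH"
fi
```

`cephadm shell` drops you into (or runs one command in) a container with the cluster config, so any radosgw-admin subcommand can be substituted for `user list`.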
[ceph-users] RGW multi-object delete failing with 403 denied
Hi

An RGW access denied problem that I can't get anywhere with...

 * Bucket mybucket is owned by user "c"
 * Bucket policy grants s3:ListBucket on mybucket, and s3:PutObject & s3:DeleteObject on mybucket/* to user "j", and s3:GetObject to * (I even granted s3:* on mybucket/* to "j" with no effect)
 * User "j" can create objects in mybucket, and can delete individual objects (using DELETE)
 * User "j" gets 403 when trying to do a multi-object delete (POST /mybucket/?delete with a list of 4 object keys)

Code is a Java servlet running in Wildfly, loading its credentials from the default ~/.aws/credentials file. It enables path-style access. If I change the credentials in there to those of the bucket owner "c" it works...

What's different about permissioning for multi-object delete?

The log file shows access has been granted, but further down there is a suspicious "Permissions for user not found" (I don't know if that is expected or not).

Thanks, Chris

---

Extract from RGW log with debugging at level 20:

2020-07-11T17:55:54.038+0100 7f45adad7700 20 req 15 0.00402s s3:multi_object_delete rgw::auth::s3::LocalEngine granted access
2020-07-11T17:55:54.038+0100 7f45adad7700 20 req 15 0.00402s s3:multi_object_delete rgw::auth::s3::AWSAuthStrategy granted access
2020-07-11T17:55:54.038+0100 7f45adad7700 2 req 15 0.00402s s3:multi_object_delete normalizing buckets and tenants
2020-07-11T17:55:54.038+0100 7f45adad7700 10 s->object= s->bucket=mybucket
2020-07-11T17:55:54.038+0100 7f45adad7700 2 req 15 0.00402s s3:multi_object_delete init permissions
2020-07-11T17:55:54.038+0100 7f45adad7700 20 get_system_obj_state: rctx=0x7f45adacc288 obj=default.rgw.meta:root:mybucket state=0x5628b912e9a0 s->prefetch_data=0
2020-07-11T17:55:54.038+0100 7f45adad7700 10 cache get: name=default.rgw.meta+root+mybucket : hit (requested=0x16, cached=0x17)
2020-07-11T17:55:54.038+0100 7f45adad7700 20 get_system_obj_state: s->obj_tag was set empty
2020-07-11T17:55:54.038+0100 7f45adad7700 10 cache get: name=default.rgw.meta+root+mybucket : hit (requested=0x11, cached=0x17)
2020-07-11T17:55:54.038+0100 7f45adad7700 15 decode_policy Read AccessControlPolicy<AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>c</ID><DisplayName>C</DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>c</ID><DisplayName>C</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy>
2020-07-11T17:55:54.038+0100 7f45adad7700 20 get_system_obj_state: rctx=0x7f45adacc668 obj=default.rgw.meta:users.uid:j state=0x5628b912e9a0 s->prefetch_data=0
2020-07-11T17:55:54.038+0100 7f45adad7700 10 cache get: name=default.rgw.meta+users.uid+j : hit (requested=0x6, cached=0x17)
2020-07-11T17:55:54.038+0100 7f45adad7700 20 get_system_obj_state: s->obj_tag was set empty
2020-07-11T17:55:54.038+0100 7f45adad7700 20 Read xattr: user.rgw.idtag
2020-07-11T17:55:54.038+0100 7f45adad7700 10 cache get: name=default.rgw.meta+users.uid+j : hit (requested=0x3, cached=0x17)
2020-07-11T17:55:54.038+0100 7f45adad7700 2 req 15 0.00402s s3:multi_object_delete recalculating target
2020-07-11T17:55:54.038+0100 7f45adad7700 2 req 15 0.00402s s3:multi_object_delete reading permissions
2020-07-11T17:55:54.038+0100 7f45adad7700 2 req 15 0.00402s s3:multi_object_delete init op
2020-07-11T17:55:54.038+0100 7f45adad7700 2 req 15 0.00402s s3:multi_object_delete verifying op mask
2020-07-11T17:55:54.038+0100 7f45adad7700 20 req 15 0.00402s s3:multi_object_delete required_mask= 4 user.op_mask=7
2020-07-11T17:55:54.038+0100 7f45adad7700 2 req 15 0.00402s s3:multi_object_delete verifying op permissions
2020-07-11T17:55:54.038+0100 7f45adad7700 20 req 15 0.00402s s3:multi_object_delete -- Getting permissions begin with perm_mask=50
2020-07-11T17:55:54.038+0100 7f45adad7700 5 req 15 0.00402s s3:multi_object_delete Searching permissions for identity=rgw::auth::SysReqApplier -> rgw::auth::LocalApplier(acct_user=j, acct_name=J, subuser=, perm_mask=15, is_admin=0) mask=50
2020-07-11T17:55:54.038+0100 7f45adad7700 5 Searching permissions for uid=j
2020-07-11T17:55:54.038+0100 7f45adad7700 5 Permissions for user not found
2020-07-11T17:55:54.038+0100 7f45adad7700 5 Searching permissions for group=1 mask=50
2020-07-11T17:55:54.038+0100 7f45adad7700 5 Permissions for group not found
2020-07-11T17:55:54.038+0100 7f45adad7700 5 Searching permissions for group=2 mask=50
2020-07-11T17:55:54.038+0100 7f45adad7700 5 Permissions for group not found
2020-07-11T17:55:54.038+0100 7f45adad7700 5 req 15 0.00402s s3:multi_object_delete -- Getting permissions done for identity=rgw::auth::SysReqApplier -> rgw::auth::LocalApplier(acct_user=j, acct_name=J, subuser=, perm_mask=15, is_admin=0), owner=c, perm=0
2020-07-11T17:55:54.038+0100 7f45adad7700 10 req 15 0.00402s s3:multi_object_delete identity=rgw::auth::SysReqApplier -> rgw::auth::LocalApplier(acct_user=j, acct_name=J, subuser=, perm_mask=15, is_admin=0) requested perm (type)=2, policy perm=0, user_perm_mask=2,
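One detail worth cross-checking against the policy described above: multi-object delete is still the object-level s3:DeleteObject operation, but the request itself is a POST against the bucket resource, so it may be worth testing whether granting s3:DeleteObject on the bucket ARN as well as on the objects changes the result. A hypothetical policy sketch (the Sid values and ARN forms are placeholders for illustration; the second statement's bucket ARN is the speculative addition):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowJList",
      "Effect": "Allow",
      "Principal": {"AWS": ["arn:aws:iam:::user/j"]},
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::mybucket"]
    },
    {
      "Sid": "AllowJWriteDelete",
      "Effect": "Allow",
      "Principal": {"AWS": ["arn:aws:iam:::user/j"]},
      "Action": ["s3:PutObject", "s3:DeleteObject"],
      "Resource": [
        "arn:aws:s3:::mybucket",
        "arn:aws:s3:::mybucket/*"
      ]
    }
  ]
}
```

If that makes the 403 go away, it would suggest RGW is evaluating the POST /mybucket/?delete request against the bucket resource rather than the individual object keys.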
[ceph-users] [errno 2] RADOS object not found (error connecting to the cluster)
Hi,

I have executed the `ceph -n osd.0 --show-config` command, but it replied with this error:

[errno 2] RADOS object not found (error connecting to the cluster)

Could someone point me in the right direction as to what the problem could be? Thanks. Regards.

ceph version 15.2.4

I copied the client.admin key to the hosts, and here is my ceph.conf file:

# minimal ceph.conf for 4372945a-b43d-11ea-b1b7-49709def22d4
[global]
fsid = 4372945a-b43d-11ea-b1b7-49709def22d4
mon_host = 192.168.1.10,192.168.1.11,192.168.1.12
mon_initial_members = 192.168.1.10,192.168.1.11
public network = 192.168.1.0/24
cluster network = 10.10.10.0/24
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
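One thing to check for the error above: `ceph -n osd.0` authenticates to the cluster as the osd.0 entity, so it needs osd.0's own keyring; the copied client.admin key only covers the default client.admin identity, and the ceph.conf shown has no keyring path. A sketch (the keyring path below is a common default and is an assumption; cephadm deployments keep daemon keyrings under /var/lib/ceph/<fsid>/osd.0/ instead):

```shell
# `ceph -n osd.0` needs osd.0's keyring, not client.admin's
# (sketch; keyring locations vary between package and cephadm installs).
KEYRING=${KEYRING:-/var/lib/ceph/osd/ceph-0/keyring}
if [ -r "$KEYRING" ]; then
    ceph -n osd.0 --keyring "$KEYRING" --show-config
else
    echo "no readable keyring at $KEYRING" >&2
fi
```

Alternatively, entering the daemon's container on the OSD host (`cephadm enter --name osd.0`) and running `ceph daemon osd.0 config show` queries the running daemon's config over its admin socket without needing cephx at all.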
[ceph-users] Re: RGW multi-object delete failing with 403 denied
Hi Chris,

I don't see the problem offhand, so could you create a tracker issue?

thanks,

Matt

On Sat, Jul 11, 2020 at 2:00 PM Chris Palmer wrote:
>
> Hi
>
> An RGW access denied problem that I can't get anywhere with...
>
>  * Bucket mybucket is owned by user "c"
>  * Bucket policy grants s3:ListBucket on mybucket, and s3:PutObject &
>    s3:DeleteObject on mybucket/* to user "j", and s3:GetObject to * (I
>    even granted s3:* on mybucket/* to "j" with no effect)
>  * User "j" can create objects in mybucket, and can delete individual
>    objects (using DELETE)
>  * User "j" gets 403 when trying to do a multi-object delete (POST
>    /mybucket/?delete with a list of 4 object keys)
>
> Code is a Java servlet running in Wildfly, loading its credentials from
> the default ~/.aws/credentials file. It enables path-style access. If I
> change the credentials in there to those of the bucket owner "c" it works...
>
> What's different about permissioning for multi-object delete?
>
> The log file shows access has been granted, but further down there is a
> suspicious "Permissions for user not found" (I don't know if that is
> expected or not).
>
> Thanks, Chris
>
> [quoted log extract snipped]