Chris, this looks like a bug where "lfs getstripe -M" is not using supplementary groups, or something similar. You wrote that the directory has GID=130817, which is not the primary GID of the user accessing it, so access must depend on the supplementary group permissions. The "regular" ls access *is* using the supplementary GID to allow access, and once the directory is cached on the client, "lfs getstripe -M" gets this information out of the client-side cache (where the client VFS checks the GID locally for access permission).
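As an illustration of the check described above, here is a minimal Python sketch of the usual POSIX owner/group/other decision, showing where a supplementary group enters. It is not Lustre code: the function names are hypothetical, ACLs and root overrides are ignored, and any mode passed in is the caller's assumption, since the directory's actual mode is not shown in this thread.

```python
import os
import stat


def access_class(st_uid, st_gid, st_mode, uid, gid, groups):
    """Return (class, readable) for a simplified POSIX permission check.

    class is 'owner', 'primary group', 'supplementary group' or 'other';
    readable says whether that class has its read bit set in st_mode.
    """
    if uid == st_uid:
        return "owner", bool(st_mode & stat.S_IRUSR)
    if st_gid == gid:
        return "primary group", bool(st_mode & stat.S_IRGRP)
    if st_gid in groups:
        return "supplementary group", bool(st_mode & stat.S_IRGRP)
    return "other", bool(st_mode & stat.S_IROTH)


def describe(path):
    """Apply access_class() to a real path for the calling process."""
    st = os.stat(path)
    return access_class(st.st_uid, st.st_gid, st.st_mode,
                        os.geteuid(), os.getegid(), os.getgroups())
```

If the supplementary group list is missing the directory's GID at the point where the check is made, the same user falls through to the "other" class, which would give "Permission denied" on a directory without world-read permission.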
I suspect this hasn't really been an issue in the past because few users use "lfs getstripe -M", and most of those are root or are accessing their own files/directories, so do not need a supplementary group to access this information. It also seems (but isn't shown) that the directory does not have world-read permission? What does "stat" on this directory show?

Could you please file a ticket in Jira with the details so that this issue can be tracked? I don't know how easy/hard it will be to fix this, since this information is obtained via ioctl(), and we don't necessarily want non-owners of files to be able to call every ioctl on the file/directory.

Note, it is recommended to use "lfs getdirstripe -m" (or "--mdt-index") instead of "-M" to get the MDT index of a file, since the "-M" option is deprecated to This would imply you are running a Lustre 2.10 client? The "-m" option is already available in 2.10, and "-M" will print a warning in 2.12 and later.

I tested this on master and was not able to reproduce the problem. If I set the directory mode=0640 I got permission denied for directories that I didn't have supplementary group access on, but it worked on the first try (after flushing all client locks and dropping all caches). That means the problem seems to already be fixed in master, and possibly 2.12 also.

Cheers, Andreas

On Jul 2, 2020, at 10:26, Chang, Christopher <[email protected]> wrote:

Hi Andreas,

It doesn’t appear to be this issue. I verified the client “id” and server “l_getidentity -d” views before and after issuing an “ls” as the user to get getstripe working, and there’s no change.

Client:

el3:~> id
uid=131364(***) gid=131364(***) groups=131364(***),130033(globus-access),130774(eagle-users),130808(ewer),130817(naris),131016(esp-wps-inputs),131178(lex-access),131237(naermpcm),249837(aces),249945(hpcapps),249996(n-apps) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
el3:~> lfs getstripe -M /projects/naris/pcm_110819/NARIS_TechBreak2050_missingDPV/StageC_RT5min
error opening /projects/naris/pcm_110819/NARIS_TechBreak2050_missingDPV/StageC_RT5min: Permission denied (13)
…
el3:~> ls /projects/naris/pcm_110819/NARIS_TechBreak2050_missingDPV/StageC_RT5min
~Model ( c_RT5min_...
el3:~> lfs getstripe -M /projects/naris/pcm_110819/NARIS_TechBreak2050_missingDPV/StageC_RT5min
1
el3:~> id
uid=131364(***) gid=131364(***) groups=131364(***),130033(globus-access),130774(eagle-users),130808(ewer),130817(naris),131016(esp-wps-inputs),131178(lex-access),131237(naermpcm),249837(aces),249945(hpcapps),249996(n-apps) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

Server:

[root@mds02 ~]# l_getidentity -d 131364
uid=131364 gid=131364,130808,130817,131016,131237,249837,249945,249996
permissions: nid perm
(client does an ls)
[root@mds02 ~]# l_getidentity -d 131364
uid=131364 gid=131364,130808,130817,131016,131237,249837,249945,249996
permissions: nid perm

The relevant gid for the target directory is 130817. I verified that all 3 of our MDSs had the same view before and after the “ls”.
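
For anyone repeating this comparison, the client and server views can also be diffed mechanically. The following is a minimal Python sketch, assuming the two output formats look like the "id" and "l_getidentity -d" lines quoted above; the function names are hypothetical, and the parsing has not been checked against other versions of either tool.

```python
import re


def parse_id_groups(id_output):
    """Extract numeric GIDs from the 'groups=' field of `id` output."""
    match = re.search(r"groups=(\S+)", id_output)
    if not match:
        return set()
    # Each entry is 'GID(name)' or a bare 'GID'; take the leading digits.
    return {int(g) for g in re.findall(r"(?:^|,)(\d+)", match.group(1))}


def parse_l_getidentity_groups(output):
    """Extract GIDs from the 'gid=' list printed by `l_getidentity -d`."""
    match = re.search(r"\bgid=([\d,]+)", output)
    if not match:
        return set()
    return {int(g) for g in match.group(1).split(",") if g}


def missing_on_server(id_output, l_getidentity_output):
    """Return GIDs the client reports that the server list lacks, sorted."""
    client = parse_id_groups(id_output)
    server = parse_l_getidentity_groups(l_getidentity_output)
    return sorted(client - server)
```

Fed the two outputs quoted above, this reports GIDs 130033, 130774 and 131178 as present on the client but absent from the MDS list, while 130817 appears in both, which matches the observation that the relevant group is known on both sides.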

Thanks;
Chris

From: Andreas Dilger <[email protected]>
Date: Sunday, June 28, 2020 at 5:11 PM
To: Christopher Chang <[email protected]>
Cc: "[email protected]" <[email protected]>, "Kaiser, Timothy" <[email protected]>
Subject: Re: [lustre-discuss] Permission denied on lfs getstripe

On Jun 26, 2020, at 10:45, Chang, Christopher <[email protected]> wrote:

Hi,

We’re running into an error with a particular directory. It is weird because it can be resolved in an unexpected way, but only for a time. The error manifests as:

el3:out> lfs getstripe -M /projects/naris/pcm_110819/NARIS_TechBreak2050_missingDPV/StageC_RT5min
error opening /projects/naris/pcm_110819/NARIS_TechBreak2050_missingDPV/StageC_RT5min: Permission denied (13)
llapi_semantic_traverse: Failed to open '/projects/naris/pcm_110819/NARIS_TechBreak2050_missingDPV/StageC_RT5min': Permission denied (13)
error: getstripe failed for /projects/naris/pcm_110819/NARIS_TechBreak2050_missingDPV/StageC_RT5min.

The temporary resolution is:

el3:out> ls /projects/naris/pcm_110819/NARIS_TechBreak2050_missingDPV/StageC_RT5min
~Model ( c_RT5min_TechBreak2050_092P_OLd000_001 ) Log.txt Model c_RT5min_TechBreak2050_092P_OLd000_033 Solution.h5 Model c_RT5min_TechBreak2050_092P_OLd000_062 Solution.h5
…

Then

el3:out> lfs getstripe -M /projects/naris/pcm_110819/NARIS_TechBreak2050_missingDPV/StageC_RT5min
1
el3:out>

It looks like the user might only have supplementary group access to this file? You could check on the client by running "id" to list the primary user ID and supplementary groups, then "ls -ln" on the file to see what group it is owned by.

If that is the case, it would indicate that the MDS /etc/group (or other source of supplementary group information, like NIS or LDAP, via /etc/nsswitch.conf) is not up-to-date with what is on the clients, or you have mdt.*.identity_upcall=NONE on the MDS instead of =l_getidentity. You can test what l_getidentity on the MDS thinks the supplementary groups are for a particular user by running "l_getidentity -d <uid>" and comparing the result with what "id" returns on the client.

Cheers, Andreas

However, the getstripe command will only continue to work for about 10 minutes, then it goes back to the permission denied errors. It only happens with a selection of files or directories, so we were thinking it might be connected to a particular OSS or MDT, but not sure what to look for.

I am not the Lustre admin, so please forgive incomplete information. If folks can request specific command output, preferably from user space, that would accelerate my ability to answer questions. If something needs to get run while logged into a particular Lustre component (MDT, OSS, etc.), please do not hesitate to assume that I don’t know that.

We’re running Lustre 2.10.7 provided by DDN on CentOS 7.4.

All help appreciated, thanks!

Chris

--
Christopher H. Chang, Ph.D.
Computational Scientist
National Renewable Energy Laboratory
15013 Denver West Pkwy., MS ESIF301
Golden, CO 80401

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

Cheers, Andreas
--
Andreas Dilger
Principal Lustre Architect
Whamcloud
