On Thu, Mar 27, 2025 at 10:13 PM Pali Rohár <p...@kernel.org> wrote: > > On Thursday 27 March 2025 21:57:34 Amir Goldstein wrote: > > On Thu, Mar 27, 2025 at 8:26 PM Pali Rohár <p...@kernel.org> wrote: > > > > > > On Thursday 27 March 2025 12:47:02 Amir Goldstein wrote: > > > > On Sun, Mar 23, 2025 at 11:32 AM Pali Rohár <p...@kernel.org> wrote: > > > > > > > > > > On Sunday 23 March 2025 09:45:06 Amir Goldstein wrote: > > > > > > On Fri, Mar 21, 2025 at 8:50 PM Andrey Albershteyn > > > > > > <aalbe...@redhat.com> wrote: > > > > > > > > > > > > > > This patchset introduced two new syscalls getfsxattrat() and > > > > > > > setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR > > > > > > > ioctl() > > > > > > > except they use *at() semantics. Therefore, there's no need to > > > > > > > open the > > > > > > > file to get an fd. > > > > > > > > > > > > > > These syscalls allow userspace to set filesystem inode attributes > > > > > > > on > > > > > > > special files. One of the usage examples is XFS quota projects. > > > > > > > > > > > > > > XFS has project quotas which could be attached to a directory. All > > > > > > > new inodes in these directories inherit project ID set on parent > > > > > > > directory. > > > > > > > > > > > > > > The project is created from userspace by opening and calling > > > > > > > FS_IOC_FSSETXATTR on each inode. This is not possible for special > > > > > > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left > > > > > > > with empty project ID. Those inodes then are not shown in the > > > > > > > quota > > > > > > > accounting but still exist in the directory. This is not critical > > > > > > > but in > > > > > > > the case when special files are created in the directory with > > > > > > > already > > > > > > > existing project quota, these new inodes inherit extended > > > > > > > attributes. > > > > > > > This creates a mix of special files with and without attributes. > > > > > > > Moreover, special files with attributes don't have a possibility > > > > > > > to > > > > > > > become clear or change the attributes. This, in turn, prevents > > > > > > > userspace > > > > > > > from re-creating quota project on these existing files. > > > > > > > > > > > > > > Christian, if this get in some mergeable state, please don't > > > > > > > merge it > > > > > > > yet. Amir suggested these syscalls better to use updated struct > > > > > > > fsxattr > > > > > > > with masking from Pali Rohár patchset, so, let's see how it goes. > > > > > > > > > > > > Andrey, > > > > > > > > > > > > To be honest I don't think it would be fair to delay your syscalls > > > > > > more > > > > > > than needed. > > > > > > > > > > I agree. > > > > > > > > > > > If Pali can follow through and post patches on top of your syscalls > > > > > > for > > > > > > next merge window that would be great, but otherwise, I think the > > > > > > minimum requirement is that the syscalls return EINVAL if fsx_pad > > > > > > is not zero. we can take it from there later. > > > > > > > > > > IMHO SYS_getfsxattrat is fine in this form. > > > > > > > > > > For SYS_setfsxattrat I think there are needed some modifications > > > > > otherwise we would have problem again with backward compatibility as > > > > > is with ioctl if the syscall wants to be extended in future. > > > > > > > > > > I would suggest for following modifications for SYS_setfsxattrat: > > > > > > > > > > - return EINVAL if fsx_xflags contains some reserved or unsupported > > > > > flag > > > > > > > > > > - add some flag to completely ignore fsx_extsize, fsx_projid, and > > > > > fsx_cowextsize fields, so SYS_setfsxattrat could be used just to > > > > > change fsx_xflags, and so could be used without the preceding > > > > > SYS_getfsxattrat call. > > > > > > > > > > What do you think about it? > > > > > > > > I think all Andrey needs to do now is return -EINVAL if fsx_pad is not > > > > zero. > > > > > > > > You can use this later to extend for the semantics of flags/fields mask > > > > and we can have a long discussion later on what this semantics should > > > > be. > > > > > > > > Right? > > > > > > > > Amir. > > > > > > It is really enough? > > > > I don't know. Let's see... > > > > > All new extensions later would have to be added > > > into fsx_pad fields, and currently unused bits in fsx_xflags would be > > > unusable for extensions. > > > > I am working under the assumption that the first extension would be > > to support fsx_xflags_mask and from there, you could add filesystem > > flags support checks and then new flags. Am I wrong? > > > > Obviously, fsx_xflags_mask would be taken from fsx_pad space. > > After that extension is implemented, calling SYS_setfsxattrat() with > > a zero fsx_xflags_mask would be silly for programs that do not do > > the legacy get+set. > > > > So when we introduce fsx_xflags_mask, we could say that a value > > of zero means that the mask is not being checked at all and unknown > > flags in set syscall are ignored (a.k.a legacy ioctl behavior). > > > > Programs that actually want to try and set without get will have to set > > a non zero fsx_xflags_mask to do something useful. > > Here we need to also solve the problem that without GET call we do not > have valid values for fsx_extsize, fsx_projid, and fsx_cowextsize. So > maybe we would need some flag in fsx_pad that fsx_extsize, fsx_projid, > or fsx_cowextsize are ignored/masked. > > > I don't think this is great. > > I would rather that the first version of syscalls will require the mask > > and will always enforce filesystems supported flags. > > It is not great... But what about this? In a first step (part of this > syscall patch series) would be just a check that fsx_pad is zero. > Non-zero will return -EINVAL. > > In next changes would added fsx_filter bit field, which for each > fsx_xflags and also for fsx_extsize, fsx_projid, and fsx_cowextsize > fields would add a new bit flag which would say (when SET) that the > particular thing has to be ignored.
1. I don't like the inverse mask. statx already has the stx_mask and stx_attributes_mask, so I rather stick to same semantics because some of those attributes are exposed via statx as well 2. fsx_*extsize already have a bit that says if that the particular attribute is valid or not, so setting a zero fsx_cowextsize with the flag FS_XFLAG_COWEXTSIZE has no effect in xfs: /* * Only set the extent size hint if we've already determined that the * extent size hint should be set on the inode. If no extent size flags * are set on the inode then unconditionally clear the extent size hint. */ if (ip->i_diflags & (XFS_DIFLAG_EXTSIZE | XFS_DIFLAG_EXTSZINHERIT)) ip->i_extsize = XFS_B_TO_FSB(mp, fa->fsx_extsize); else ip->i_extsize = 0; if (xfs_has_v3inodes(mp)) { if (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) ip->i_cowextsize = XFS_B_TO_FSB(mp, fa->fsx_cowextsize); else ip->i_cowextsize = 0; } I think we need to enforce this logic in fileattr_set_prepare() and I think we need to add a flag FS_XFLAG_PROJID that will be set in GET when fsx_projid != 0 and similarly required when setting fsx_projid != 0. Probably will need to add some backward compat glue for this flag in GET ioctl to avoid breaking out of tree fs and fuse. > > So when fsx_pad is all-zeros then fsx_filter (first field in fsx_pad) > would say that nothing in fsx_xflags, fsx_extsize, fsx_projid, and > fsx_cowextsize is ignored, and hence behave like before. > > And when something in fsx_pad/fsx_filter is set then it says which > fields are ignored/filtered-out. > > > If you can get those patches (on top of current series) posted and > > reviewed in time for the next merge window, including consensus > > on the actual semantics, that would be the best IMO. > > I think that this starting to be more complicated to rebase my patches > in a way that they do not affect IOCTL path but implement it properly > for new syscall path. It does not sounds like a trivial thing which I > would finish in merge window time and having proper review and consensus > on this. > Yes, it is better to separate the two efforts. wrt erroring on unsupported SET flags, all fs other than xfs already have some variant of fileattr_has_fsx(), so xfs is the only filesystem that requires special care with the new syscalls. It's easier to write a patch than it is to explain what I mean, so I'll try to write a patch. Thanks, Amir.