On Wed, Mar 13, 2024 at 1:57 AM Bernd Schubert <bernd.schub...@fastmail.fm> wrote: > > Hi Amir, > > thanks for your help! > > On 3/9/24 03:46, Amir Goldstein wrote: > > On Thu, Mar 7, 2024 at 8:47 PM Bernd Schubert > > <bernd.schub...@fastmail.fm> wrote: > >> > >> Hi all, > >> > >> this is certainly not kind of the mail I was hoping for as a new libfuse > >> maintainer. > >> > >> As you can see from the title and from discussion below (sorry this is > >> not typical ML discussion style), we have a bit of of problem with > >> libfuse ABI compatibility. > >> > >> While scanning through git history, Ashley found a commit that was adding > >> members in the middle of struct - doesn't break API, but binary > >> compatibility. That commit already landed a year ago in these releases: > >> > >> git tag --contains a5eb7f2a0117ab43119ef5724cf5f4f2f181804a > >> fuse-3.14.1 > >> fuse-3.15.0 > >> fuse-3.15.1 > >> fuse-3.16.1 > >> fuse-3.16.2 > >> > >> > >> Obviously this needs improved testing for, but right now we wonder what > >> would be the preferred action to avoid issues. > >> > >> a) We could fix it and move bits to the right place. Fixes everything > >> compiled with old versions, but breaks everything compiled with the > >> versions above > >> > >> b) Increase the .so version - enforces recompilation of all binaries. > >> Intrusive, especially for distributions, but probably safest. > >> > >> c) Other ideas? > >> > > > > Heuristically, you can detect most of the shifted flags at runtime > > because... > > > >> > >> > >> I don't think there is anything in libfuse that would allow us to > >> detect which version of libfuse a library was linked to. > >> > > > > I think we do know for sure if fs was linked with libfuse < 3.12 > > without fuse_loop_mt_312? > > so only left to detect 3.12..3.14 vs. 3.14..3.16 > > Hmm, I guess I miss something, but how would I know if it was linked > with fuse_loop_mt_312? That needs an elf reader? Assuming we would put > this into the library, somehow, how does this work with stripped binaries? > > bschubert2@imesrv6 example>nm passthrough_ll | head -n1 > 00000000000003b4 r __abi_tag > bschubert2@imesrv6 example>strip passthrough_ll > bschubert2@imesrv6 example>nm passthrough_ll | head -n1 > nm: passthrough_ll: no symbols > >
I thought that invoking fuse_loop_mt_*() indicated the lib version that binary was built with. I wasn't taking FUSE_USE_VERSION into account. > > > > >> > >> The commit shifted these members in struct fuse_file_info { > >> > >> struct fuse_file_info { > >> ... > >> /** Can be filled by open/create, to allow parallel direct writes > >> on this > >> * file */ > >> unsigned int parallel_direct_writes : 1; --> introduced the shift > > > > Not expected in flush/release, so can be heuristically interpreted as flush > > > >> > >> /** Indicates a flush operation. Set in flush operation, also > >> maybe set in highlevel lock operation and lowlevel release > >> operation. */ > >> unsigned int flush : 1; > >> > > > > Not expected in open/create, so can be heuristically interpreted as > > nonseekable > > > >> /** Can be filled in by open, to indicate that the file is not > >> seekable. */ > >> unsigned int nonseekable : 1; > >> > > > > Not expected in release, so can be heuristically interpreted as > > flock_release > > > >> /* Indicates that flock locks for this file should be > >> released. If set, lock_owner shall contain a valid value. > >> May only be set in ->release(). */ > >> unsigned int flock_release : 1; > >> > > > > Not expected in opendir, so can be heuristically interpreted as > > cache_readdir > > > >> /** Can be filled in by opendir. It signals the kernel to > >> enable caching of entries returned by readdir(). Has no > >> effect when set in other contexts (in particular it does > >> nothing when set by open()). */ > >> unsigned int cache_readdir : 1; > >>I am not sure I know how versioned symbols work, but doesn't new lib always have all the functions fuse_loop_mt_*() and old binary will be invoking the No, I was not looking closely. > > > > Ignored in open, but based on the comment above, it may be > > implied that some fs may set it in open() reply > > > >> /** Can be filled in by open, to indicate that flush is not needed > >> on close. */ > >> unsigned int noflush : 1; > > > > noflush is just an optimization, which the kernel ignores anyway > > when writeback cache is enabled, so no harm done if it is wrongly > > interpreted as readdir_cache in open() and ignored. > > It is also quite recent (3.11) so not very likely used. > > > >> }; > >> > >> I.e. setting flush would actually set parallel_direct_writes > >> (note: with binaries compiled against older libfuse versions) > >> > > > > It would be suspicious if fs sets parallel_direct_writes flush at > > flush() or release(), so that can be used as a hint of old ABI > > interpreted as flush and issue a warning. > > > > It would be pretty ugly to use these heuristics in the library forever, > > but perhaps as safety measure for binaries built with a range of > > libfuse version that is not likely to be found in the wild (3.14..3.16) > > it is an acceptable compromise? > > I really like your idea about heuristics, I just don't know how to limit > the libfuse version range. > > > > > and perhaps the next libfuse version can pass the header version > > fs was compiled with as argument to fuse_loop_mt_317()? > > > I think adding it to fuse_loop_mt_317 would be easy and could be > auto-added with a macro. I think more difficult is to add it to the old > api/abi. I.e. it would take a few years until that api version gets > mostly used? > I guess we would add __fuse_loop_mt_*() that take an auto lib version argument. Old binaries are old binaries. But binaries rebuilt with new lib will always end up calling __fuse_loop_mt_*() functions with the lib version. No? Thanks, Amir.