Hi Amir, thanks for your help!
On 3/9/24 03:46, Amir Goldstein wrote: > On Thu, Mar 7, 2024 at 8:47 PM Bernd Schubert > <bernd.schub...@fastmail.fm> wrote: >> >> Hi all, >> >> this is certainly not kind of the mail I was hoping for as a new libfuse >> maintainer. >> >> As you can see from the title and from discussion below (sorry this is >> not typical ML discussion style), we have a bit of of problem with >> libfuse ABI compatibility. >> >> While scanning through git history, Ashley found a commit that was adding >> members in the middle of struct - doesn't break API, but binary >> compatibility. That commit already landed a year ago in these releases: >> >> git tag --contains a5eb7f2a0117ab43119ef5724cf5f4f2f181804a >> fuse-3.14.1 >> fuse-3.15.0 >> fuse-3.15.1 >> fuse-3.16.1 >> fuse-3.16.2 >> >> >> Obviously this needs improved testing for, but right now we wonder what >> would be the preferred action to avoid issues. >> >> a) We could fix it and move bits to the right place. Fixes everything >> compiled with old versions, but breaks everything compiled with the >> versions above >> >> b) Increase the .so version - enforces recompilation of all binaries. >> Intrusive, especially for distributions, but probably safest. >> >> c) Other ideas? >> > > Heuristically, you can detect most of the shifted flags at runtime > because... > >> >> >> I don't think there is anything in libfuse that would allow us to >> detect which version of libfuse a library was linked to. >> > > I think we do know for sure if fs was linked with libfuse < 3.12 > without fuse_loop_mt_312? > so only left to detect 3.12..3.14 vs. 3.14..3.16 Hmm, I guess I miss something, but how would I know if it was linked with fuse_loop_mt_312? That needs an elf reader? Assuming we would put this into the library, somehow, how does this work with stripped binaries? bschubert2@imesrv6 example>nm passthrough_ll | head -n1 00000000000003b4 r __abi_tag bschubert2@imesrv6 example>strip passthrough_ll bschubert2@imesrv6 example>nm passthrough_ll | head -n1 nm: passthrough_ll: no symbols > >> >> The commit shifted these members in struct fuse_file_info { >> >> struct fuse_file_info { >> ... >> /** Can be filled by open/create, to allow parallel direct writes on >> this >> * file */ >> unsigned int parallel_direct_writes : 1; --> introduced the shift > > Not expected in flush/release, so can be heuristically interpreted as flush > >> >> /** Indicates a flush operation. Set in flush operation, also >> maybe set in highlevel lock operation and lowlevel release >> operation. */ >> unsigned int flush : 1; >> > > Not expected in open/create, so can be heuristically interpreted as > nonseekable > >> /** Can be filled in by open, to indicate that the file is not >> seekable. */ >> unsigned int nonseekable : 1; >> > > Not expected in release, so can be heuristically interpreted as flock_release > >> /* Indicates that flock locks for this file should be >> released. If set, lock_owner shall contain a valid value. >> May only be set in ->release(). */ >> unsigned int flock_release : 1; >> > > Not expected in opendir, so can be heuristically interpreted as cache_readdir > >> /** Can be filled in by opendir. It signals the kernel to >> enable caching of entries returned by readdir(). Has no >> effect when set in other contexts (in particular it does >> nothing when set by open()). */ >> unsigned int cache_readdir : 1; >> > > Ignored in open, but based on the comment above, it may be > implied that some fs may set it in open() reply > >> /** Can be filled in by open, to indicate that flush is not needed >> on close. */ >> unsigned int noflush : 1; > > noflush is just an optimization, which the kernel ignores anyway > when writeback cache is enabled, so no harm done if it is wrongly > interpreted as readdir_cache in open() and ignored. > It is also quite recent (3.11) so not very likely used. > >> }; >> >> I.e. setting flush would actually set parallel_direct_writes >> (note: with binaries compiled against older libfuse versions) >> > > It would be suspicious if fs sets parallel_direct_writes flush at > flush() or release(), so that can be used as a hint of old ABI > interpreted as flush and issue a warning. > > It would be pretty ugly to use these heuristics in the library forever, > but perhaps as safety measure for binaries built with a range of > libfuse version that is not likely to be found in the wild (3.14..3.16) > it is an acceptable compromise? I really like your idea about heuristics, I just don't know how to limit the libfuse version range. > > and perhaps the next libfuse version can pass the header version > fs was compiled with as argument to fuse_loop_mt_317()? I think adding it to fuse_loop_mt_317 would be easy and could be auto-added with a macro. I think more difficult is to add it to the old api/abi. I.e. it would take a few years until that api version gets mostly used?