Hi, Daniel P. Berrangé wrote on Tue, Jul 20, 2021 at 06:14:53PM +0100: > > > When attempting to read a file (other than a pipe or FIFO) that supports > > > non-blocking reads and has no data currently available: > > > > > > - If O_NONBLOCK is set, read() shall return -1 and set errno to > > > [EAGAIN]. > > > > > > - If O_NONBLOCK is clear, read() shall block the calling thread until > > > some data becomes available. > > > > > > - The use of the O_NONBLOCK flag has no effect if there is some data > > > available. > > > > > > and > > > > > > [EAGAIN] > > > The file is neither a pipe, nor a FIFO, nor a socket, the O_NONBLOCK > > > flag is set for the file descriptor, and the thread would be delayed in > > > the read operation.
Not much time to reply now (will follow up in more details tomorrow), but that was (unfortunately?) a voluntary change: http://lkml.kernel.org/r/20200205003457.24340-2-l2...@cock.li with the argument that if some program goes out of its way to use O_NONBLOCK on open it can also handle short reads (which by the way can also happen without O_NONBLOCK as bugs on other filesystems, so in my opinion is something that should be handled regardless of what we do here -- I've seen enough data being eaten by programs that take shortcuts on IO as soon as something goes wrong instead of erroring or taking care of these) Unfortunately GNU tar doesn't, and Nikos Tsipinakis (added to Ccs) sent a patch to bug-tar@ in ... september last year which looked good enough to me but didn't seem to be taken? I didn't check. I agree this isn't posix, but it was discussed as a quirk that seemed better than yet another weird mount option that e.g. NFS has for special netfs behaviour. The argument was for synthetic fs having a file that would stream things and implementing tail -f like behaviour on it, problem being that if they would make the fake-file a pipe it would stay local to the linux client and not go through the 9p server, so it was deemed difficult to emulate the behaviour. If you have a practical way of doing this then I'll be happy to revert Sergey's commit (also added to cc), as I can agree sticking to posix when possible is better. > > I was thinking to something like that (not tested yet): > > > > --- a/fs/9p/vfs_file.c > > +++ b/fs/9p/vfs_file.c > > @@ -389,8 +389,22 @@ v9fs_file_read_iter(struct kiocb *iocb, struct > > iov_iter *t> > > p9_debug(P9_DEBUG_VFS, "count %zu offset %lld\n", > > iov_iter_count(to), iocb->ki_pos); > > > > - if (iocb->ki_filp->f_flags & O_NONBLOCK) > > + if (iocb->ki_filp->f_flags & O_NONBLOCK) { > > + size_t count = iov_iter_count(to); > > + > > ret = p9_client_read_once(fid, iocb->ki_pos, to, &err); > > + if (!ret) > > + return err; > > + > > + /* > > + * POSIX requires to ignore O_NONBLOCK if some data is > > + * already available. > > + */ > > + if (ret != count) { > > + iocb->ki_pos += ret; > > + ret = p9_client_read(fid, iocb->ki_pos, to, &err); > > + } That seems overly complicated, just use p9_client_read as per the else (e.g. remove the condition) if that's what you want. -- Dominique