On Fri, 10 May 2013, John Baldwin wrote:

On Thursday, May 09, 2013 11:05:46 pm Bruce Evans wrote:
On Mon, 6 May 2013, John Baldwin wrote:
...
static int
vn_ioctl(fp, com, data, active_cred, td)
        struct file *fp;
        u_long com;
        void *data;
        struct ucred *active_cred;
        struct thread *td;
{
        struct vnode *vp = fp->f_vnode;
        struct vattr vattr;

        switch (vp->v_type) {
        case VREG:
        case VDIR:
                switch (com) {
                case FIONREAD:
                        vn_lock(vp, LK_SHARED | LK_RETRY);
                        error = VOP_GETATTR(vp, &vattr, active_cred);
                        VOP_UNLOCK(vp, 0);
                        if (!error)
                                *(int *)data = vattr.va_size - fp->f_offset;
                        return (error);
                case FIONBIO:
                case FIOASYNC:
                        return (0);     /* XXX */
                default:
                        return (VOP_IOCTL(vp, com, data, fp->f_flag,
                            active_cred, td));
                }
        default:
                return (ENOTTY);
        }
}
...
(The 'XXX' comment could perhaps be expanded to something along the lines of
'Allow fcntl() to toggle FNONBLOCK and FASYNC.')

Is that what it is about?  IIRC, upper layers do some partial handling
and then call lower layers to possibly do some more handling.  But here
and in some other places there is nothing more to be done.  No comment
is needed, but maybe the XXX's were reminders to clean up the layering.
FreeBSD has cleaned up the layering a bit, by using differnt fops for
different file types.

The problem is this code in fcntl() which I ran into when working on a
tutorial for writing character devices:

        case F_SETFL:
                error = fget_unlocked(fdp, fd, CAP_FCNTL, F_SETFL, &fp, NULL);
                if (error != 0)
                        break;
                do {
                        tmp = flg = fp->f_flag;
                        tmp &= ~FCNTLFLAGS;
                        tmp |= FFLAGS(arg & ~O_ACCMODE) & FCNTLFLAGS;
                } while(atomic_cmpset_int(&fp->f_flag, flg, tmp) == 0);
                tmp = fp->f_flag & FNONBLOCK;
                error = fo_ioctl(fp, FIONBIO, &tmp, td->td_ucred, td);
                if (error != 0) {
                        fdrop(fp, td);
                        break;
                }
                tmp = fp->f_flag & FASYNC;
                error = fo_ioctl(fp, FIOASYNC, &tmp, td->td_ucred, td);
                if (error == 0) {
                        fdrop(fp, td);
                        break;
                }
                atomic_clear_int(&fp->f_flag, FNONBLOCK);
                tmp = 0;
                (void)fo_ioctl(fp, FIONBIO, &tmp, td->td_ucred, td);
                fdrop(fp, td);
                break;

Hmm, this seems to have the bug that if you had FNONBLOCK set and
tried to set FASYNC via fcntl() but FIOASYNC isn't supported,
FNONBLOCK is cleared.  It seems we should only clear FNONBLOCK
if it wasn't set in 'flg'.  I think this would fix that:

sys_socket.c mishandles this differently by voiding the result of
all fo_ioctls().  The above only assumes that since the first
fo_ioctl() for FIONBIO succeeded, the second one will too.

I now remember more about this bad code:
- it is still very broken for devices.  ioctls are inherently per-device,
   but file flags aren't even per-file -- they are per-file-descriptor
   -- so the device state becomes inconsistent with the file descriptor
   state if more than 1 fd is open on a device and any non-null change
   of the file flags is made using one of the fd's (applications might
   be able to keep the file flags fairly consistent by doing fcntl for
   all the fd's (including ones shared across processes), but the file
   flags would be at least transiently inconsistent.  Only the FNONBLOCK
   (O_NONBLOCK) and FASYNC (O_ASYNC) flags are passed down.  O_NONBLOCK
   works right for some devices because it is also passed to open() and
   read()/write() (so it will work right for ioctl() if the device
   driver ignores it then).  There are related not so bad cases for
   F_GETFL, F_GETOWN and F_SETOWN.  F_GETFL returns per-fd flags.  There
   are no per-fd ownerships, so F_GETOWN has to use an ioctl to fetch
   a non-per-fd ownership (the one set by F_SETOWN) which is at least
   consistently not per-fd.

Most drivers/file types are actually non-broken for FIONBIO and do nothing
except return 0 for it (I only grepped for FIONBIO; not for O_NONBLOCK
or FNONBLOCK).  This return is sometimes XXX'ed, but it really
shouldn't be.  Returning 0 and not doing anything means that the driver
supports O_NONBLOCK correctly (by checking it per-fd in the flag passed
to open() and read/write()).  It is the drivers that do something which
deserve an XXX.  The only broken ioctl routines for FIONBIO are now:
- subr_bus.c: devioctl()
- sys_socket.c: soo_ioctl()
- pcm/dsp.c: dsp_ioctl()
- audit/audit_pipe.c: audit_pipe_ioctl()
- cbus/fdc.c: fdioctl().  This is different from fdc/fdc.c.  The latter
   just returns 0 and claims that this is for backwards compatibility
   with old utilities.  Apparently the pc98 version hasn't been moved
   to userland as much as the pc98 version.  But the comment is wrong,
   since old utilities are just broken since they depend on kernel
   features that were removed.
IIRC, not much more than soo_ioctl() had this bug in 4.4BSD, except
F_SETOWN as  more complicated and broken (but not supported for as
many devices/file types).  truckman@ did a lot of work to fix F_SETOWN
and assoicated signal handling.

Index: kern_descrip.c
===================================================================
--- kern_descrip.c      (revision 250451)
+++ kern_descrip.c      (working copy)
@@ -539,7 +539,7 @@ kern_fcntl(struct thread *td, int fd, int cmd, int
                }
                tmp = fp->f_flag & FASYNC;
                error = fo_ioctl(fp, FIOASYNC, &tmp, td->td_ucred, td);
-               if (error == 0) {
+               if (error == 0 || (flg & FNONBLOCK)) {
                        fdrop(fp, td);
                        break;
                }

This would mean you would no longer have to support the FIOASYNC
ioctl just to allow FIONBLOCK to be set.

Looks good, but...

Note, btw, that kern_ioctl() doesn't enforce quite the same
requirement as F_SETFL in that FIONBIO doesn't check for FASYNC and vice
versa.  ioctl() also doesn't revert the change if the backing fo_ioctl
method fails.

... it's bogus to back out of just the FIONBIO setting.  Other flags
changes (to O_APPEND and O_DIRECT) are not backed out of.

Apparently the idea in fcntl was to back out of all changes.  This
may have even worked before O_APPEND was supported.  O_DIRECT didn't
exist until later.  ioctl() is too complicated to have ever attempted
this.  This causes problems for applications programming: I don't know
what the spec for fcntl() or ioctl() says, but for tcsetattr() (which
is implemented using ioctl() in FreeBSD), POSIX says that the syscall
shall (?) succeed if at least one attribute was "set" (changed?).  So
when you ask tcsetattr() to change lots of attributes, the only correct
way to determine if it succeeded is to call tcgetattr() and check that
all the changes that you care about actually occurred).

I actually think that fcntl should probably ignore ENOTTY errors
if the flag is not set.  Otherwise using fcntl to toggle some other
flag can succeed but still return an error.

I think ENOTTY errors for FIONBIO are rare, since too many places copy
the code that returns 0 for FIONBIO even when O_NONBLOCK is not supported.
E.g., the non-pc98 fdc driver doesn't support O_NONBLOCK, but it has a
compatibiity excuse for claiming support.

I think fo_ioctl() shouldn't be called for null changes (especially
from the unset case to the unset case).  Calling it should fail quite
often and thus give ENOTTY even for callers that have no interest in
FNONBLOCK or FASYNC but want to toggle some other changes.  However,
always calling it minimises the inconsistencies for drivers(...) that
act on it -- this syncs the driver state with the fd state from the
last fcntl() on the set of fd's referencing the device, irrespective
of whether the current fcntl() is toggling the state.

Bruce
_______________________________________________
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"

Reply via email to