On Thu, 2007-10-11 at 17:08 +0200, Miklos Szeredi wrote:
> > diff -puN fs/namei.c~get-write-in-__dentry_open fs/namei.c
> > --- lxc/fs/namei.c~get-write-in-__dentry_open 2007-10-03 14:44:52.000000000 -0700
> > +++ lxc-dave/fs/namei.c 2007-10-04 18:02:48.000000000 -0700
> > @@ -1621,14 +1621,6 @@ int may_open(struct nameidata *nd, int a
> >                  return -EACCES;
> >
> >          flag &= ~O_TRUNC;
> > -        } else if (flag & FMODE_WRITE) {
> > -                /*
> > -                 * effectively: !special_file()
> > -                 * balanced by __fput()
> > -                 */
> > -                error = mnt_want_write(nd->mnt);
> > -                if (error)
> > -                        return error;
> >          }
>
> Maybe readonly should still be checked here, so that the order of
> error checking doesn't change.  If racing with a read-only remount the
> order is irrelevant anyway.  Something like this?
>
>         } else if (flag & FMODE_WRITE && __mnt_is_readonly(nd->mnt)) {
>                 return -EROFS
>         }

I think that would be a bug if anything actually managed to trip that
code.  All of the may_open() calls should have been covered by the
__dentry_open() mnt writer.

> >          error = vfs_permission(nd, acc_mode);
> > @@ -1778,11 +1770,7 @@ do_last:
> >
> >          /* Negative dentry, just create the file */
> >          if (!path.dentry->d_inode) {
> > -                error = mnt_want_write(nd->mnt);
> > -                if (error)
> > -                        goto exit_mutex_unlock;
> >                  error = open_namei_create(nd, &path, flag, mode);
> > -                mnt_drop_write(nd->mnt);
>
> This is still needed, isn't it?

Yes, it is.  I'll add a big fat comment this time about why we need it.

> And they should be added around do_truncate() as well, since you
> remove the protection from may_open().
>
> This one introduces an interesting race between ro-remount and
> open(O_TRUNC), where the truncate can succeed but the open fail with
> EROFS.  Is that a problem?

You're right, this does introduce that race, and it is relatively hard
to fix properly.  But the 'return a filp' patch makes it easy to fix.
I've put a temporary kludge in the updated version of this patch, and
fixed it properly in that later patch.

> >  cleanup_all:
> >          fops_put(f->f_op);
> > -        if (f->f_mode & FMODE_WRITE)
> > +        if (f->f_mode & FMODE_WRITE) {
> >                  put_write_access(inode);
> > +                mnt_drop_write(mnt);
>
> Shouldn't this be conditional on !special_file()?

It certainly should.

-- Dave
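
For anyone following the thread without the tree handy, here is a rough user-space sketch of the balancing rule discussed above: write access on the mount is taken only for non-special files opened for write, and dropped under exactly the same condition. Everything prefixed `toy_` is hypothetical and is not the kernel code; in particular, refusing the read-only remount with -EBUSY while writers are outstanding is an assumption of this model.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a mount; NOT the kernel's struct vfsmount. */
struct toy_mount {
    bool readonly;
    int writers;    /* outstanding write references */
};

/* Take a write reference; fails with -EROFS on a read-only mount. */
static int toy_mnt_want_write(struct toy_mount *m)
{
    if (m->readonly)
        return -EROFS;
    m->writers++;
    return 0;
}

/* Drop a write reference taken by toy_mnt_want_write(). */
static void toy_mnt_drop_write(struct toy_mount *m)
{
    m->writers--;
}

/* Assumption: a read-only remount is refused while writers exist. */
static int toy_remount_ro(struct toy_mount *m)
{
    if (m->writers > 0)
        return -EBUSY;
    m->readonly = true;
    return 0;
}

/* Open path: only a write open of a non-special file takes a reference. */
static int toy_open(struct toy_mount *m, bool want_write, bool special)
{
    if (want_write && !special)
        return toy_mnt_want_write(m);
    return 0;
}

/* Close/cleanup path: must mirror the condition used in toy_open(). */
static void toy_close(struct toy_mount *m, bool want_write, bool special)
{
    if (want_write && !special)
        toy_mnt_drop_write(m);
}
```

If toy_close() dropped the reference unconditionally for write opens, a special file opened for write would decrement a count it never incremented, which is the imbalance the !special_file() question points at.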