Re: Update of file offset on write() etc. is non-atomic with I/O

2014-05-04 Thread Michael Kerrisk
Ouch! I've just seen that trimming the CC on this reply took me out of a large part of the subsequent conversation. PLEASE don't trim CCs, and especially not the address of the OP!! On Mon, Mar 3, 2014 at 10:03 PM, George Spelvin wrote: >> struct fd { >> struct file *file; >> - int nee

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-10 Thread Al Viro
On Wed, Mar 05, 2014 at 12:04:11AM +, Al Viro wrote: > There's also a pile of crap around sockfd_lookup/sockfd_put, related > to that. Moreover, there's net/compat.c, which probably ought to > have the compat syscalls themselves moved to net/socket.c (under > ifdef CONFIG_COMPAT) and switched

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-06 Thread Yongzhi Pan
2014-03-06 23:03 GMT+08:00 Michael Kerrisk (man-pages) : > > The original code (where Yongzhi Pan reported an issue) and > the multi_writer.c test code where both mine actually. Anyway, > I applied the patch to 3.14-rc5, and I (not withstanding the other > points raised by Al about the patch) I con

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-06 Thread Michael Kerrisk (man-pages)
On 03/03/2014 06:36 PM, Linus Torvalds wrote: > Ok, sorry for the long delay, I was distracted (and hoping that Al > would come up with a patch). > > Anyway, attached is the patch I think we should do for this issue. It > is fairly simple: > > - it adds a "f_pos_mutex" to the "struct file". > >

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-04 Thread Al Viro
On Tue, Mar 04, 2014 at 01:17:50PM -0800, Linus Torvalds wrote: > On Tue, Mar 4, 2014 at 12:00 PM, Al Viro wrote: > > > > OK, with the attached set (the first one is essentially unchanged from > > your first one), it seems to work and produce better code on all targets > > I've tried. Comments? >

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-04 Thread Linus Torvalds
On Tue, Mar 4, 2014 at 12:00 PM, Al Viro wrote: > > OK, with the attached set (the first one is essentially unchanged from > your first one), it seems to work and produce better code on all targets > I've tried. Comments? I'm certainly ok with it. You seem to have left the fput_light() function

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-04 Thread Cedric Blancher
On 4 March 2014 00:39, Al Viro wrote: > On Mon, Mar 03, 2014 at 03:23:55PM -0800, Linus Torvalds wrote: > >> This just uses a "flags" field, and we currently only have two bits >> that we use: FDPUT_FPUT and FDPUT_POS_UNLOCK. The first patch knows >> that "fget_light()" writes 0/1 for that, which

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-04 Thread Al Viro
On Tue, Mar 04, 2014 at 01:05:17AM +, Al Viro wrote: > It's probably not worth replacing struct fd with typedef to unsigned long - > too easy to have it confused with a file descriptor itself and pass to > something that expects e.g. int. In any case, since we leave fdget() > inlined, compile

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Al Viro
On Mon, Mar 03, 2014 at 04:42:36PM -0800, Linus Torvalds wrote: > On Mon, Mar 3, 2014 at 4:23 PM, Al Viro wrote: > > > > I wonder if something like > > > > static inline struct fd fdget(int fd) > > { > > unsigned long v = __fdget(fd); > > return (struct fd){(struct file *)(v & ~1),

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Linus Torvalds
On Mon, Mar 3, 2014 at 4:23 PM, Al Viro wrote: > > I wonder if something like > > static inline struct fd fdget(int fd) > { > unsigned long v = __fdget(fd); > return (struct fd){(struct file *)(v & ~1), v & 1}; > } Make it use "&3" so that we have the two bits we need, and I'll ha

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Al Viro
On Mon, Mar 03, 2014 at 03:59:36PM -0800, Linus Torvalds wrote: > I doubt it's worth caring about. Even when passing things in memory, > the end result isn't that much worse than the fget_light() model that > passes just one of the two fields in memory. I'm not sure if that's the right approach,

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread George Spelvin
Linus Torvalds wrote: On Mon, Mar 3, 2014 at 1:26 PM, Al Viro wrote: >> Most of the cases have it kept separately in registers, actually - there's >> a reason why fdget() and friends are inlined. > Yes. And bit test and set ops on registers are actually cheaper than > playing around with bytes.

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Linus Torvalds
On Mon, Mar 3, 2014 at 3:42 PM, Al Viro wrote: > On Mon, Mar 03, 2014 at 03:34:43PM -0800, Linus Torvalds wrote: >> >> (Side note: I think sparc or something doesn't support it, and may >> return things in memory. I can't really seem to find it in myself to >> care) > > sparc64 actually does suppo

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Al Viro
On Mon, Mar 03, 2014 at 11:39:02PM +, Al Viro wrote: > On Mon, Mar 03, 2014 at 03:23:55PM -0800, Linus Torvalds wrote: > > > This just uses a "flags" field, and we currently only have two bits > > that we use: FDPUT_FPUT and FDPUT_POS_UNLOCK. The first patch knows > > that "fget_light()" write

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Linus Torvalds
On Mon, Mar 3, 2014 at 3:39 PM, Al Viro wrote: > > do_sendfile() is also there and this one is even more unpleasant ;-/ > We probably can ignore that one (until POSIX learns of its existence), > thouhg... Yeah, I saw the do_sendfile one and decided we don't care. Not only is is out of POSIX spec

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Al Viro
On Mon, Mar 03, 2014 at 03:34:43PM -0800, Linus Torvalds wrote: > On Mon, Mar 3, 2014 at 3:28 PM, Al Viro wrote: > > > > Umm... I would be very surprised if that worked well. You have just > > forced fdget() to comply to the ABI. And unless that has such structs > > returned in register pairs,

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Al Viro
On Mon, Mar 03, 2014 at 03:23:55PM -0800, Linus Torvalds wrote: > This just uses a "flags" field, and we currently only have two bits > that we use: FDPUT_FPUT and FDPUT_POS_UNLOCK. The first patch knows > that "fget_light()" writes 0/1 for that, which is the same as the > FDPUT_FPUT bit. I didn't

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Linus Torvalds
On Mon, Mar 3, 2014 at 3:28 PM, Al Viro wrote: > > Umm... I would be very surprised if that worked well. You have just > forced fdget() to comply to the ABI. And unless that has such structs > returned in register pairs, there's no way for compiler to do about > that. Register pairs are very

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Al Viro
On Mon, Mar 03, 2014 at 02:17:55PM -0800, Linus Torvalds wrote: > Something like the attached untested patch. This gets rid of > "fget_light()", and instead makes "fdget()" the native interface (same > for the "raw" version). Umm... I would be very surprised if that worked well. You have just

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Linus Torvalds
On Mon, Mar 3, 2014 at 2:55 PM, Linus Torvalds wrote: > > I think I'll respin this with the compat readv/writev case fixed and > with the bitfield replaced with an "unsigned int" that we just do > bitops by hand on. Because that code generation makes me feel slightly > ill. Ok, how about this ver

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Linus Torvalds
On Mon, Mar 3, 2014 at 1:52 PM, Linus Torvalds wrote: >> >> Most of the cases have it kept separately in registers, actually - there's >> a reason why fdget() and friends are inlined. > > Yes. And bit test and set ops on registers are actually cheaper than > playing around with bytes. Ugh. gcc ge

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Linus Torvalds
On Mon, Mar 3, 2014 at 2:10 PM, Al Viro wrote: > > It can - compat read/write/read/writev are also there these days. There is no "compat_read/write()" - or I would have noticed it when grepping. Regular read/write is fine for the compat case. It's only readv/writev that have compat versions, bec

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Linus Torvalds
On Mon, Mar 3, 2014 at 2:09 PM, Al Viro wrote: > > Yes, but in that case fdget() has grabbed a reference to that sucker, > so the only way to end with refcount 1 is to have the damn thing gone > from descriptor table in between. And AFAICS in that case we are just > fine without f_pos_lock. Ahh.

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Linus Torvalds
On Mon, Mar 3, 2014 at 2:01 PM, Al Viro wrote: > > The thing is, the callers in there do *not* keep struct file * at all - > they keep struct socket * and use sock->file to get struct file * back > when they need it. Not a problem. Just make sockfd_lookup_light() use the broken inefficient callin

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Al Viro
On Mon, Mar 03, 2014 at 02:01:33PM -0800, Linus Torvalds wrote: > On Mon, Mar 3, 2014 at 1:45 PM, Al Viro wrote: > > > > And it looks like you've missed compat counterparts. > > Grr. Yes, I missed them for compat_readv/writev - even though I > grepped for various users, but I grepped for all the

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Al Viro
On Mon, Mar 03, 2014 at 01:56:31PM -0800, Linus Torvalds wrote: > On Mon, Mar 3, 2014 at 1:45 PM, Al Viro wrote: > > > > Um... That's odd - we *could* get there with f.need_put and > > file_count(file) equal to 1, but why would we want to take > > f_pos_lock in that case? > > Because that means

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Al Viro
On Mon, Mar 03, 2014 at 01:52:13PM -0800, Linus Torvalds wrote: > On Mon, Mar 3, 2014 at 1:26 PM, Al Viro wrote: > > On Mon, Mar 03, 2014 at 04:03:59PM -0500, George Spelvin wrote: > >> > >> (If you want to use bits, why not use the two lsbits of the file pointer > >> for the purpose? That would

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Linus Torvalds
On Mon, Mar 3, 2014 at 1:45 PM, Al Viro wrote: > > And it looks like you've missed compat counterparts. Grr. Yes, I missed them for compat_readv/writev - even though I grepped for various users, but I grepped for all the *other* cases (ie vfs_{read|write|llseek}). That means that the locker helpe

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Linus Torvalds
On Mon, Mar 3, 2014 at 1:45 PM, Al Viro wrote: > > Um... That's odd - we *could* get there with f.need_put and > file_count(file) equal to 1, but why would we want to take > f_pos_lock in that case? Because that means that the file table is shared among threads. So another thread can access the

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Linus Torvalds
On Mon, Mar 3, 2014 at 1:26 PM, Al Viro wrote: > On Mon, Mar 03, 2014 at 04:03:59PM -0500, George Spelvin wrote: >> >> (If you want to use bits, why not use the two lsbits of the file pointer >> for the purpose? That would save a lot of space.) > > Most of the cases have it kept separately in reg

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Al Viro
On Mon, Mar 03, 2014 at 09:36:58AM -0800, Linus Torvalds wrote: > +static struct fd fdget_pos(int fd) > +{ > + struct fd f = fdget(fd); > + struct file *file = f.file; > + > + if (file && (file->f_mode & FMODE_ATOMIC_POS)) { > + if (f.need_put || file_count(file) > 1) { Um.

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Al Viro
On Mon, Mar 03, 2014 at 04:03:59PM -0500, George Spelvin wrote: > > struct fd { > > struct file *file; > > - int need_put; > > + unsigned need_put:1, need_pos_unlock:1; > > }; > > Since we're rounding up to 2*sizeof(struct file *) anyway, is this a case > where wasting space on a couple

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread George Spelvin
> struct fd { > struct file *file; > - int need_put; > + unsigned need_put:1, need_pos_unlock:1; > }; Since we're rounding up to 2*sizeof(struct file *) anyway, is this a case where wasting space on a couple of char (or bool) flags would generate better code than a bitfield? In pa

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-03-03 Thread Linus Torvalds
Ok, sorry for the long delay, I was distracted (and hoping that Al would come up with a patch). Anyway, attached is the patch I think we should do for this issue. It is fairly simple: - it adds a "f_pos_mutex" to the "struct file". - it adds a new FMODE_ATOMIC_POS flag to the file mode flags t

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-02-22 Thread Michael Kerrisk (man-pages)
On Sun, Feb 23, 2014 at 2:18 AM, Kevin Easton wrote: > On Fri, Feb 21, 2014 at 07:01:31AM +0100, Michael Kerrisk (man-pages) wrote: >> Here's the fulll list from POSIX.1-2008/SUSv4 Section XSI 2.9.7: >> >> [[ >> 2.9.7 Thread Interactions with Regular File Operations >> >> All of the following func

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-02-22 Thread Kevin Easton
On Fri, Feb 21, 2014 at 07:01:31AM +0100, Michael Kerrisk (man-pages) wrote: > Here's the fulll list from POSIX.1-2008/SUSv4 Section XSI 2.9.7: > > [[ > 2.9.7 Thread Interactions with Regular File Operations > > All of the following functions shall be atomic with respect to each > other in the ef

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-02-20 Thread Michael Kerrisk (man-pages)
On Thu, Feb 20, 2014 at 7:29 PM, Al Viro wrote: > On Thu, Feb 20, 2014 at 06:15:15PM +, Zuckerman, Boris wrote: >> Hi, >> >> You probably already considered that - sorry, if so... >> >> Instead of the mutex Windows use ExecutiveResource with shared and exclusive >> semantics. Readers serializ

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-02-20 Thread Al Viro
On Thu, Feb 20, 2014 at 06:15:15PM +, Zuckerman, Boris wrote: > Hi, > > You probably already considered that - sorry, if so… > > Instead of the mutex Windows use ExecutiveResource with shared and exclusive > semantics. Readers serialize by taking the resource shared and writers take > it ex

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-02-20 Thread Zuckerman, Boris
quot; , Yongzhi Pan Sent: Thu, Feb 20, 2014 17:15:07 GMT+00:00 Subject: Re: Update of file offset on write() etc. is non-atomic with I/O Yes, I do think we violate POSIX here because of how we handle f_pos (the earlier thread from 2006 you point to talks about the "thread safe" part, the poi

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-02-20 Thread Linus Torvalds
Yes, I do think we violate POSIX here because of how we handle f_pos (the earlier thread from 2006 you point to talks about the "thread safe" part, the point here about the actual wording of "atomic" is a separate issue). Long long ago we used to just pass in the pointer to f_pos directly, and the

Re: Update of file offset on write() etc. is non-atomic with I/O

2014-02-18 Thread Michael Kerrisk
[expanding the CC list a little more to bring in some previously interested parties] On Mon, Feb 17, 2014 at 4:41 PM, Michael Kerrisk (man-pages) wrote: > Hello all, > > A note from Yongzhi Pan about some of my own code led me to dig deeper > and discover behavior that is surprising and also seem

Update of file offset on write() etc. is non-atomic with I/O

2014-02-17 Thread Michael Kerrisk (man-pages)
Hello all, A note from Yongzhi Pan about some of my own code led me to dig deeper and discover behavior that is surprising and also seems to be a fairly clear violation of POSIX requirements. It appears that write() (and, presumably read() and other similar system calls) are not atomic with re