On Sat, Jan 14, 2017 at 01:57:57AM +, Al Viro wrote:
> OK. Let's wait for Alan to confirm that the last variant works and
> I'll send a pull request (with Cc: stable # v4.9).
... and here it is.
The following changes since commit b4b8664d291ac1998e0f0bcdc96b6397f0fe68b3:
arm64: don't pul
On Sat, Jan 14, 2017 at 8:29 AM, Alan J. Wylie wrote:
> at 13:16 on Sat 14-Jan-2017 Alan J. Wylie (a...@wylie.me.uk) wrote:
>
>> I'll run this for a bit, then apply it to my workstation (which I'm
>> rather fond of) and make sure there are no new regressions.
>
> Looking good.
Thanks for the pinp
at 13:16 on Sat 14-Jan-2017 Alan J. Wylie (a...@wylie.me.uk) wrote:
> I'll run this for a bit, then apply it to my workstation (which I'm
> rather fond of) and make sure there are no new regressions.
Looking good.
--
Alan J. Wylie http://www.wylie.me.uk/
at 22:50 on Fri 13-Jan-2017 Al Viro (v...@zeniv.linux.org.uk) wrote:
> Or, even better, we can get rid of all wraparound-related crap if we
> calculate the final value of pipe->nrbufs and watch for _that_ as
> loop condition:
>
> diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> index 25f572303801..
On Fri, Jan 13, 2017 at 05:46:34PM -0800, Linus Torvalds wrote:
> On Fri, Jan 13, 2017 at 5:24 PM, Al Viro wrote:
> >
> > Why would advance by 0 change ->iov_offset here?
>
> That's not my worry. Advancing by zero obviously doesn't change the position.
>
> But the _truncation_ of the rest requir
On Fri, Jan 13, 2017 at 5:24 PM, Al Viro wrote:
>
> Why would advance by 0 change ->iov_offset here?
That's not my worry. Advancing by zero obviously doesn't change the position.
But the _truncation_ of the rest requires iov_offset to be zero in
order to actually truncate everything.
So I was w
On Sat, Jan 14, 2017 at 01:24:28AM +, Al Viro wrote:
> On Fri, Jan 13, 2017 at 04:59:37PM -0800, Linus Torvalds wrote:
>
> > EXCEPT.
> >
> > I don't think "i->iov_offset" is actually correct. If you truncated
> > the whole thing, you should have cleared iov_offset too, and that
> > never happ
On Fri, Jan 13, 2017 at 04:59:37PM -0800, Linus Torvalds wrote:
> EXCEPT.
>
> I don't think "i->iov_offset" is actually correct. If you truncated
> the whole thing, you should have cleared iov_offset too, and that
> never happened. So truncating everything will not empty the buffers
> for us, we'
On Fri, Jan 13, 2017 at 2:50 PM, Al Viro wrote:
>
> Or, even better, we can get rid of all wraparound-related crap if we
> calculate the final value of pipe->nrbufs and watch for _that_ as
> loop condition:
Ok, I like the 'expected nrbufs' approach. It makes the actual freeing
loop trivial, and g
On Fri, Jan 13, 2017 at 10:13:08PM +, Al Viro wrote:
> On Fri, Jan 13, 2017 at 09:59:19PM +, Al Viro wrote:
> > + if (off) {
> > + pipe->bufs[idx].len = off - pipe->bufs[idx].offset;
> > + /* free all after idx; n can't be 0 */
>
> Yes, it can
On Fri, Jan 13, 2017 at 09:59:19PM +, Al Viro wrote:
> + if (off) {
> + pipe->bufs[idx].len = off - pipe->bufs[idx].offset;
> + /* free all after idx; n can't be 0 */
Yes, it can - full pipe, truncation to the part of the first buffer ;-/
OTO
On Fri, Jan 13, 2017 at 09:55:04PM +, Al Viro wrote:
> On Fri, Jan 13, 2017 at 08:47:31PM +, Al Viro wrote:
> > On Fri, Jan 13, 2017 at 12:32:37PM -0800, Linus Torvalds wrote:
> >
> > > Ugh. I still think your patch is butt-ugly, and the index comparisons
> > > make me nervous, but..
> >
On Fri, Jan 13, 2017 at 08:47:31PM +, Al Viro wrote:
> On Fri, Jan 13, 2017 at 12:32:37PM -0800, Linus Torvalds wrote:
>
> > Ugh. I still think your patch is butt-ugly, and the index comparisons
> > make me nervous, but..
>
> No arguments here - 6am on 20-odd hours of uptime is _not_ a good t
On Fri, Jan 13, 2017 at 12:32:37PM -0800, Linus Torvalds wrote:
> Ugh. I still think your patch is butt-ugly, and the index comparisons
> make me nervous, but..
No arguments here - 6am on 20-odd hours of uptime is _not_ a good time
for writing, especially since the data structure needs better doc
On Fri, Jan 13, 2017 at 12:11 PM, Al Viro wrote:
>
> PS: 'size' argument of iov_iter_advance() is the second "some" in the
> above - we tell it how much we want to advance by and everything past
> that point is, in case of PIPE_ITER, discarded.
Ok. The naming threw me. It would be more logical to
On Fri, Jan 13, 2017 at 12:08:44PM -0800, Linus Torvalds wrote:
> On Fri, Jan 13, 2017 at 11:33 AM, Linus Torvalds
> wrote:
> > This function looks so broken that I must be missing something. Why
> > doesn't pipe_advance() just look like the following:
> >
> > static void pipe_advance(struct iov
On Fri, Jan 13, 2017 at 08:08:27PM +, Al Viro wrote:
> Because it's "truncate to size", not "throw everything up to that point
> out".
>
> We have some amount of data pushed into pipe (in this case - 0) and we
> have some buffers allocated by ..._get_pages() past the end of it.
> Some of that
On Fri, Jan 13, 2017 at 11:33 AM, Linus Torvalds
wrote:
> This function looks so broken that I must be missing something. Why
> doesn't pipe_advance() just look like the following:
>
> static void pipe_advance(struct iov_iter *i, size_t size)
> {
...
> pipe_buf_release(pipe, bu
On Fri, Jan 13, 2017 at 11:33:18AM -0800, Linus Torvalds wrote:
> What am I missing?
The fact that we want to free the _tail_, not the beginning of the
damn thing.
> Why is "pipe_advance()" written in that incomprehensible form? Why
> don't we do the pipe_buf_release() as we advance through it,
On Fri, Jan 13, 2017 at 3:18 AM, Al Viro wrote:
>
> in case when pipe->nrbufs == pipe->buffers and idx == pipe->curbuf. IOW,
> we have a full pipe and want to empty it entirely; the fun question is,
> of course, telling that case from having nothing to free with the same
> full pipe...
That's ju
On Fri, Jan 13, 2017 at 10:32:47AM +, Alan J. Wylie wrote:
> # dmesg | tail
> ...
> [ 141.553455] nr: 0->16, cur: 5->5, buffers: 16->16
> [ 141.553461] copied: 0, count:2147479552, idx:5, offs:0
OK, that looks like pipe_advance() fencepost (hopefully) dealt with in
the lib/iov_iter.c patch
On Fri, Jan 13, 2017 at 10:20:19AM +, Al Viro wrote:
> OK, so it is iov_iter_advance() failing to free the shit allocated, either
> due to some breakage in pipe_advance() or buggered 'copied'... Let's
> see which one; could you apply the following and run your reproducer? The
> only differen
at 10:20 on Fri 13-Jan-2017 Al Viro (v...@zeniv.linux.org.uk) wrote:
> OK, so it is iov_iter_advance() failing to free the shit allocated, either
> due to some breakage in pipe_advance() or buggered 'copied'... Let's
> see which one; could you apply the following and run your reproducer? The
> o
On Fri, Jan 13, 2017 at 09:54:02AM +, Alan J. Wylie wrote:
> root 1669 0.0 0.0 76144 5412 ?S09:51 0:00 \_
> /usr/sbin/postdrop -r
>
> Another hang.
>
> # dmesg | tail
> [ 22.352442] r8169 :03:00.0: loading
> /lib/firmware/4.9.3-dirty/rtl_nic/rtl8168e-3
at 09:33 on Fri 13-Jan-2017 Al Viro (v...@zeniv.linux.org.uk) wrote:
> > 1735 splice(5, NULL, 1, NULL, 9223372036854775807, 0) = -1 EAGAIN
> > (Resource temporarily unavailable)
>
> Lovely... So it was getting -EAGAIN all along. Just in case - could you
> try the delta below and see if it tri
On Fri, Jan 13, 2017 at 07:23:51AM +, Alan J. Wylie wrote:
> at 15:28 on Thu 12-Jan-2017 Linus Torvalds (torva...@linux-foundation.org)
> wrote:
>
> > So assuming Al hasn't figured it out by the time you get back, can
> > you try to send us the same strace for the working case?
>
> Full outp
at 04:00 on Fri 13-Jan-2017 Al Viro (v...@zeniv.linux.org.uk) wrote:
> That, and check if it is reproducible on commit 523ac9afc73a with
> commit 8e54cadab447 cherry-picked on top of that.
$ git checkout 523ac9afc73a
$ git cherry-pick 8e54cadab447
[detached HEAD 3826f27f6830] fix default_file_sp
at 15:28 on Thu 12-Jan-2017 Linus Torvalds (torva...@linux-foundation.org)
wrote:
> So assuming Al hasn't figured it out by the time you get back, can
> you try to send us the same strace for the working case?
Full output at https://wylie.me.uk/tmp/strace1.txt
# uname -a
Linux frodo 4.8.17 #1 S
On Thu, Jan 12, 2017 at 03:28:32PM -0800, Linus Torvalds wrote:
> On Thu, Jan 12, 2017 at 2:46 PM, Alan J. Wylie wrote:
> >
> > I'm off to bed now with a stinking head cold I'm afraid.
>
> So assuming Al hasn't figured it out by the time you get back, can you
> try to send us the same strace for
On Thu, Jan 12, 2017 at 2:46 PM, Alan J. Wylie wrote:
>
> I'm off to bed now with a stinking head cold I'm afraid.
So assuming Al hasn't figured it out by the time you get back, can you
try to send us the same strace for the working case?
That fd5 (ptmx) does have O_NONBLOCK, so maybe the EAGAIN
On Thu, Jan 12, 2017 at 03:14:41PM -0800, Linus Torvalds wrote:
> On Thu, Jan 12, 2017 at 3:02 PM, Linus Torvalds
> wrote:
> >
> > Looking at the callers of "do_splice_to()", we already have the
> > wait_for_space() in do_splice(), but we do *not* have it in the
> > do_splice_from() case when both
On Thu, Jan 12, 2017 at 3:02 PM, Linus Torvalds
wrote:
>
> Looking at the callers of "do_splice_to()", we already have the
> wait_for_space() in do_splice(), but we do *not* have it in the
> do_splice_from() case when both the input and output file descriptors
> are pipes.
Bah. That case doesn't
On Thu, Jan 12, 2017 at 03:02:13PM -0800, Linus Torvalds wrote:
> splice(5, NULL, 1, NULL, 9223372036854775807, 0) = -1 EAGAIN
> (Resource temporarily unavailable)
>
> and note that the commit in question introduces that -EAGAIN error code.
>
> The old code never returned EAGAIN at all (well
On Thu, Jan 12, 2017 at 2:46 PM, Al Viro wrote:
>
> PS: what about the /proc/mounts contents? If it's something 9p-backed kvm,
> your bisect might have been caught on the bug I'd mentioned - if the breakage
> you are seeing in 4.9.3 has started after that commit and before the
> backport of the f
On Thu, Jan 12, 2017 at 10:46:06PM +, Alan J. Wylie wrote:
> at 14:26 on Thu 12-Jan-2017 Linus Torvalds (torva...@linux-foundation.org)
> wrote:
> 13761 copy_file_range(5, NULL, 1, NULL, 9223372036854775807, 0) = -1 EXDEV
> (Invalid cross-device link)
> 13761 sendfile(1, 5, NULL, 92233720368
On Thu, Jan 12, 2017 at 10:37:18PM +, Al Viro wrote:
> On Thu, Jan 12, 2017 at 02:26:42PM -0800, Linus Torvalds wrote:
> > On Thu, Jan 12, 2017 at 12:26 PM, Alan J. Wylie wrote:
> > >
> > > Strace shows that the processes are hanging in write() and read() calls.
> >
> > If this is splice-rela
at 14:26 on Thu 12-Jan-2017 Linus Torvalds (torva...@linux-foundation.org)
wrote:
> > Strace shows that the processes are hanging in write() and read() calls.
>
> If this is splice-related, I'm assuming that they aren't actually the
> two ends of the same pipe, and there is somebody doing splice
On Thu, Jan 12, 2017 at 02:26:42PM -0800, Linus Torvalds wrote:
> On Thu, Jan 12, 2017 at 12:26 PM, Alan J. Wylie wrote:
> >
> > Strace shows that the processes are hanging in write() and read() calls.
>
> If this is splice-related, I'm assuming that they aren't actually the
> two ends of the sam
On Thu, Jan 12, 2017 at 12:26 PM, Alan J. Wylie wrote:
>
> Strace shows that the processes are hanging in write() and read() calls.
If this is splice-related, I'm assuming that they aren't actually the
two ends of the same pipe, and there is somebody doing splice in the
middle.
I'm not seeing th
at 20:31 on Thu 12-Jan-2017 Al Viro (v...@zeniv.linux.org.uk) wrote:
> On Thu, Jan 12, 2017 at 08:26:52PM +, Alan J. Wylie wrote:
> >
> > Some time after 4.9.0 was released, I noticed that a cron job running
> > systemd-nspawn was hanging trying to send mail.
>
> Check if 8e54cadab447 (fix de
Some time after 4.9.0 was released, I noticed that a cron job running
systemd-nspawn was hanging trying to send mail.
I have trimmed it down to a minimal demo:
/* from crontab */
48 19 * * * date; /work/chroot-shared/test.sh; date
/* script --8<--8<--8<--
On Thu, Jan 12, 2017 at 08:26:52PM +, Alan J. Wylie wrote:
>
> Some time after 4.9.0 was released, I noticed that a cron job running
> systemd-nspawn was hanging trying to send mail.
Check if 8e54cadab447 (fix default_file_splice_read()) fixes it.
42 matches
Mail list logo