2015-07-03 18:00 GMT+03:00 Al Viro :
> Could you verify that the patch below deals with your setup as well?
> If it does, I'm going to put it into tonight's pull request, after I get
> some sleep... Right now I'm about to crawl in direction of bed - 25 hours
> of uptime is a bit too much... ;-/
>
On Fri, Jul 03, 2015 at 10:42:10AM +0100, Al Viro wrote:
> AFAICS, we get occasional stray responses from somewhere. And no, it doesn't
> seem to be related to flushes or to dropping chan->lock in req_done() (this
> run had been with chan->lock taken on the outside of the loop).
>
> What I reall
On Fri, Jul 03, 2015 at 11:19:31AM +0300, Andrey Ryabinin wrote:
> On 07/02/2015 07:49 PM, Al Viro wrote:
> > On Thu, Jul 02, 2015 at 05:43:32PM +0100, Al Viro wrote:
> >> req->tc->tag = tag-1;
> >> +	if (req->status != REQ_STATUS_IDLE)
> >> +		pr_err("using tag %d with odd status (%d)"
On 07/02/2015 07:49 PM, Al Viro wrote:
> On Thu, Jul 02, 2015 at 05:43:32PM +0100, Al Viro wrote:
>> req->tc->tag = tag-1;
>> +	if (req->status != REQ_STATUS_IDLE)
>> +		pr_err("using tag %d with odd status (%d)", tag, req->status);
>
> Should be tag - 1 here, actually.
So, with
Al Viro wrote on Thu, Jul 02, 2015:
> On Thu, Jul 02, 2015 at 07:56:29PM +0200, Dominique Martinet wrote:
> Using cache=none here so behaviour is likely different with cache, but
> > basically you can't get more than one tag per user thread accessing the
> > 9P mount...
>
> Yes, and...? You can
On Thu, Jul 02, 2015 at 12:16:14PM -0700, Linus Torvalds wrote:
> On Thu, Jul 2, 2015 at 11:40 AM, Al Viro wrote:
> >
> > All they are used for is matching response to request. Basically, you
> > can have up to 65535 pending requests. Reusing it right after getting
> > the response is fine.
>
>
2015-07-02 20:56 GMT+03:00 Dominique Martinet :
>
> Still definitely needs fixing, but I think the issue is somewhere
> else... If Andrey could share the workload he uses I can try with other
> servers, would be nice if we can rule a qemu bug out completely :)
>
I simply run trinity from 9p rootfs
2015-07-02 19:43 GMT+03:00 Al Viro :
> On Thu, Jul 02, 2015 at 03:19:57PM +0300, Andrey Ryabinin wrote:
>
>> Added:
>> +	if (total > count)
>> +		*(char *)0 = 0;
>>
>> and never hit this condition.
>>
>
> OK, so it's definitely a mismatched response.
>
>> req->tc->tag = tag-1;
>>
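The *(char *)0 = 0 line quoted above is a deliberate crash-on-condition
probe: had the suspect condition ever been true, the NULL write would have
oopsed on the spot and the backtrace would have pinpointed it. Since it never
fired, the overrun is not produced on that path. A minimal sketch of the
idiom (debug hack only; the exact placement in the 9p code isn't shown in the
thread):

	/* force an immediate oops (with backtrace) if the condition holds */
	if (total > count)
		*(char *)0 = 0;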
On Thu, Jul 2, 2015 at 11:40 AM, Al Viro wrote:
>
> All they are used for is matching response to request. Basically, you
> can have up to 65535 pending requests. Reusing it right after getting
> the response is fine.
Reusing a tag right after getting the completion may be fine in
theory, but i
On Thu, 2 Jul 2015 19:56:29 +0200
Dominique Martinet wrote:
> Jeff Layton wrote on Thu, Jul 02, 2015:
> > So p9_idpool_create should take an argument for the "end" value, and
> > then store that in a new field in p9_idpool. Then they can pass that in
> > as the "end" parm in idr_alloc. Or, they c
On Thu, Jul 02, 2015 at 07:56:29PM +0200, Dominique Martinet wrote:
> Using cache=none here so behaviour is likely different with cache, but
> basically you can't get more than one tag per user thread accessing the
> 9P mount...
Yes, and...? You can get a lot more than one user thread... Andrey
On Thu, Jul 02, 2015 at 01:01:39PM -0400, Jeff Layton wrote:
> So p9_idpool_create should take an argument for the "end" value, and
> then store that in a new field in p9_idpool. Then they can pass that in
> as the "end" parm in idr_alloc. Or, they could give up using the same
> function there and
Jeff Layton wrote on Thu, Jul 02, 2015:
> So p9_idpool_create should take an argument for the "end" value, and
> then store that in a new field in p9_idpool. Then they can pass that in
> as the "end" parm in idr_alloc. Or, they could give up using the same
> function there and use a different one f
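A minimal sketch of what Jeff is suggesting, modelled on net/9p/util.c: give
p9_idpool_create() an upper bound, remember it in struct p9_idpool, and pass
it as the "end" argument of idr_alloc() instead of the current 0 ("no
limit"). The field name, GFP flags and error handling below are guesses, not
the actual patch:

	struct p9_idpool {
		spinlock_t lock;
		struct idr pool;
		int end;		/* new: exclusive upper bound for ids */
	};

	struct p9_idpool *p9_idpool_create(int end)
	{
		struct p9_idpool *p;

		p = kmalloc(sizeof(*p), GFP_KERNEL);
		if (!p)
			return ERR_PTR(-ENOMEM);

		spin_lock_init(&p->lock);
		idr_init(&p->pool);
		p->end = end;		/* e.g. P9_NOTAG for the tag pool */
		return p;
	}

	int p9_idpool_get(struct p9_idpool *p)
	{
		int i;
		unsigned long flags;

		idr_preload(GFP_NOFS);
		spin_lock_irqsave(&p->lock, flags);
		/* was: idr_alloc(&p->pool, p, 0, 0, GFP_NOWAIT) - unbounded */
		i = idr_alloc(&p->pool, p, 0, p->end, GFP_NOWAIT);
		spin_unlock_irqrestore(&p->lock, flags);
		idr_preload_end();

		return i;		/* negative on failure */
	}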
On Thu, 2 Jul 2015 17:45:35 +0100
Al Viro wrote:
> On Thu, Jul 02, 2015 at 08:07:38AM -0400, Jeff Layton wrote:
>
> > Erm...and why is it passing in '0' to idr_alloc for the end value if it
> > can't deal with more than 16 bits? That seems like a plain old bug...
>
> Because they are using the
On Thu, Jul 02, 2015 at 05:43:32PM +0100, Al Viro wrote:
> req->tc->tag = tag-1;
> + if (req->status != REQ_STATUS_IDLE)
> + pr_err("using tag %d with odd status (%d)", tag, req->status);
Should be tag - 1 here, actually.
diff --git a/net/9p/client.c b/net/9p/client.c
index
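The diff is cut off above; put together, the check being discussed lands in
p9_tag_alloc() (where that req->tc->tag assignment lives) and, with the
correction applied, looks roughly like this - a sketch, not the exact hunk
from the mail:

	/* the wire tag stored in the request is tag - 1, so that is what the
	 * message should print; a non-idle status here means the previous
	 * user of this tag has not been fully retired yet */
	req->tc->tag = tag-1;
	if (req->status != REQ_STATUS_IDLE)
		pr_err("using tag %d with odd status (%d)\n",
		       tag - 1, req->status);
	req->status = REQ_STATUS_ALLOC;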
On Thu, Jul 02, 2015 at 08:07:38AM -0400, Jeff Layton wrote:
> Erm...and why is it passing in '0' to idr_alloc for the end value if it
> can't deal with more than 16 bits? That seems like a plain old bug...
Because they are using the same function (with different pool, obviously)
for FID allocati
On Thu, Jul 02, 2015 at 03:19:57PM +0300, Andrey Ryabinin wrote:
> Added:
> +	if (total > count)
> +		*(char *)0 = 0;
>
> and never hit this condition.
>
OK, so it's definitely a mismatched response.
> req->tc->tag = tag-1;
> + if (WARN_ON(req->status != REQ_STATUS_IDL
On 07/02/2015 11:42 AM, Al Viro wrote:
> On Thu, Jul 02, 2015 at 09:25:30AM +0100, Al Viro wrote:
>> On Thu, Jul 02, 2015 at 11:19:03AM +0300, Andrey Ryabinin wrote:
>>> Besides qemu, I've also tried kvmtool with the same result. IOW I'm seeing
>>> this under kvmtool as well. It just takes a bit lo
On Thu, 2 Jul 2015 04:20:42 +0100
Al Viro wrote:
> On Wed, Jul 01, 2015 at 07:44:08PM +0100, Al Viro wrote:
> > Mismatched reply could also be a possibility, but only if we end up with
> > sending more than one request with the same tag without waiting for response
> > for the first one.
>
> ...
On Thu, 2 Jul 2015 08:00:26 -0400
Jeff Layton wrote:
> On Thu, 2 Jul 2015 04:20:42 +0100
> Al Viro wrote:
>
> > On Wed, Jul 01, 2015 at 07:44:08PM +0100, Al Viro wrote:
> > > Mismatched reply could also be a possibility, but only if we end up with
> > > sending more than one request with the sa
On Thu, Jul 02, 2015 at 09:25:30AM +0100, Al Viro wrote:
> On Thu, Jul 02, 2015 at 11:19:03AM +0300, Andrey Ryabinin wrote:
> > Besides qemu, I've also tried kvmtool with the same result. IOW I'm seeing
> > this under kvmtool as well. It just takes a bit longer to reproduce
> > this in kvmtool.
> >
On Thu, Jul 02, 2015 at 11:19:03AM +0300, Andrey Ryabinin wrote:
> Besides qemu, I've also tried kvmtool with the same result. IOW I'm seeing
> this under kvmtool as well. It just takes a bit longer to reproduce
> this in kvmtool.
>
> > The bug I suspected to be the cause of that is in tag allocat
On 07/02/2015 10:59 AM, Al Viro wrote:
> On Thu, Jul 02, 2015 at 10:50:05AM +0300, Andrey Ryabinin wrote:
>
and see if it triggers. I'm not sure if failing with ENOMEM is the
right response (another variant is to sleep there until the pile
gets cleaned or until we get killed), and W
On Thu, Jul 02, 2015 at 10:50:05AM +0300, Andrey Ryabinin wrote:
> >> and see if it triggers. I'm not sure if failing with ENOMEM is the
> >> right response (another variant is to sleep there until the pile
> >> gets cleaned or until we get killed), and WARN_ON_ONCE() is definitely
> >> not for t
On Thu, Jul 02, 2015 at 10:19:07AM +0300, Andrey Ryabinin wrote:
> On 07/02/2015 07:10 AM, Al Viro wrote:
> >>
> >> It should be easy to confirm - in p9_client_prepare_req() add
> >>if (WARN_ON_ONCE(tag != (u16)tag)) {
> >>p9_idpool_put(tag, c->tagpool);
> >>
[repeating, since my previous email didn't reach mailing lists]
2015-07-02 7:10 GMT+03:00 Al Viro :
>> It should be easy to confirm - in p9_client_prepare_req() add
>> if (WARN_ON_ONCE(tag != (u16)tag)) {
>> p9_idpool_put(tag, c->tagpool);
>>
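A sketch of where that overflow check would sit in p9_client_prepare_req(),
assuming the -ENOMEM failure mode that gets debated elsewhere in the thread;
the surrounding lines are paraphrased from net/9p/client.c of that vintage
and may not match the tree exactly:

	tag = P9_NOTAG;
	if (type != P9_TVERSION) {
		tag = p9_idpool_get(c->tagpool);
		if (tag < 0)
			return ERR_PTR(-ENOMEM);
		/* the idr-backed pool hands out ints; anything that does not
		 * survive truncation to the 16-bit wire tag would alias a
		 * live request, so give the id back and fail instead */
		if (WARN_ON_ONCE(tag != (u16)tag)) {
			p9_idpool_put(tag, c->tagpool);
			return ERR_PTR(-ENOMEM);
		}
	}

	req = p9_tag_alloc(c, tag, req_size);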
[9p and sunrpc folks added to Cc]
On Thu, Jul 02, 2015 at 04:20:42AM +0100, Al Viro wrote:
> On Wed, Jul 01, 2015 at 07:44:08PM +0100, Al Viro wrote:
> > Mismatched reply could also be a possibility, but only if we end up with
> > sending more than one request with the same tag without waiting for
On Wed, Jul 01, 2015 at 07:44:08PM +0100, Al Viro wrote:
> Mismatched reply could also be a possibility, but only if we end up with
> sending more than one request with the same tag without waiting for response
> for the first one.
... and I think I see what's going on. Tags are 16bit. Suppose t
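Concretely, the kind of collision being described: the pool hands out an int,
but only the low 16 bits ever go on the wire, so once allocations pass 65535
two distinct pool ids can end up as the same tag. A tiny illustration of the
arithmetic (not 9p code):

	int id_a = 5;			/* early allocation */
	int id_b = 65536 + 5;		/* later allocation, pool not capped at 16 bits */
	u16 wire_a = id_a;		/* 5 */
	u16 wire_b = id_b;		/* also 5: two in-flight requests now carry
					 * the same wire tag, so a reply can be
					 * matched against the wrong request */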
On Wed, Jul 01, 2015 at 02:25:43PM +0300, Andrey Ryabinin wrote:
> I've attached gdb instead.
> So, after message "bogus RWRITE: 93 -> 4096"
> I've got this:
>
> (gdb) p *req->rc
> $11 = {size = 11, id = 119 'w', tag = 3, offset = 11, capacity = 8192, sdata
> = 0x8802347b8020 "\v"}
> (gdb) p
On 07/01/2015 11:55 AM, Al Viro wrote:
> On Wed, Jul 01, 2015 at 11:41:04AM +0300, Andrey Ryabinin wrote:
>> On 07/01/2015 11:27 AM, Al Viro wrote:
>>>
>>> Could you check if 3.19 was getting anything similar? I.e. in
>>> p9_client_write() there add
>>> if (count > rsize)
>>> prin
On Wed, Jul 01, 2015 at 11:41:04AM +0300, Andrey Ryabinin wrote:
> On 07/01/2015 11:27 AM, Al Viro wrote:
> >
> > Could you check if 3.19 was getting anything similar? I.e. in
> > p9_client_write() there add
> > if (count > rsize)
> > printk(KERN_ERR "bogus RWRITE: %d -> %d\n", r
On 07/01/2015 11:27 AM, Al Viro wrote:
>
> Could you check if 3.19 was getting anything similar? I.e. in
> p9_client_write() there add
> if (count > rsize)
> printk(KERN_ERR "bogus RWRITE: %d -> %d\n", rsize, count);
> just before
> p9_debug(P9_DEBUG_9P, "<<< RWRITE cou
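In context the suggested instrumentation sits in p9_client_write()'s
per-chunk loop, right after the RWRITE reply is unpacked - roughly as below
(surrounding lines paraphrased from the client.c of that era, details may
differ):

	*err = p9pdu_readf(req->rc, clnt->proto_version, "d", &count);
	if (*err) {
		trace_9p_protocol_dump(clnt, req->rc);
		p9_free_req(clnt, req);
		break;
	}
	/* a sane server can never claim to have written more than the rsize
	 * bytes it was handed in the TWRITE */
	if (count > rsize)
		printk(KERN_ERR "bogus RWRITE: %d -> %d\n", rsize, count);
	p9_debug(P9_DEBUG_9P, "<<< RWRITE count %d\n", count);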
On Wed, Jul 01, 2015 at 10:50:59AM +0300, Andrey Ryabinin wrote:
> # dmesg | grep fucked
>
> [ 114.732166] fucked: sent 2037, server says it got 2047 (err = 0)
> [ 124.937105] fucked: sent 27, server says it got 4096 (err = 0)
> [ 154.075400] fucked: sent 19, server says it got 4096 (err = 0)
On 07/01/2015 09:27 AM, Al Viro wrote:
> On Mon, Jun 22, 2015 at 03:02:11PM +0300, Andrey Ryabinin wrote:
>> On 06/22/2015 12:12 AM, Al Viro wrote:
>>> On Thu, Apr 23, 2015 at 01:16:15PM +0300, Andrey Ryabinin wrote:
This change caused the following:
>>>
This could happen when p9pdu_readf() c
On Mon, Jun 22, 2015 at 03:02:11PM +0300, Andrey Ryabinin wrote:
> On 06/22/2015 12:12 AM, Al Viro wrote:
> > On Thu, Apr 23, 2015 at 01:16:15PM +0300, Andrey Ryabinin wrote:
> >> This change caused the following:
> >
> >> This could happen when p9pdu_readf() changes 'count' to some value >
> >> iov_
On 06/22/2015 12:12 AM, Al Viro wrote:
> On Thu, Apr 23, 2015 at 01:16:15PM +0300, Andrey Ryabinin wrote:
>> This change caused the following:
>
>> This could happen when p9pdu_readf() changes 'count' to some value >
>> iov_iter_count(from):
>>
>> p9_client_write():
>> <...>
>> int count
On Sun, Jun 21, 2015 at 02:16:15PM -0700, Linus Torvalds wrote:
> On Sun, Jun 21, 2015 at 2:12 PM, Al Viro wrote:
> > + if (count > rsize) {
> > + WARN_ON(1);
> > + count = rsize;
> > + }
>
> So if we'd actually want to merge
On Sun, Jun 21, 2015 at 2:12 PM, Al Viro wrote:
> + if (count > rsize) {
> + WARN_ON(1);
> + count = rsize;
> + }
So if we'd actually want to merge it with the warning, I'd prefer writing it as
if (WARN_ON_ONCE(count
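Spelled out, the merged form Linus is after is the usual idiom: WARN_ON_ONCE()
evaluates to the condition, so the test and the (one-shot) warning collapse
into a single statement. A sketch, not the literal text of his mail:

	/* clamp a reply claiming more bytes than were submitted, and warn
	 * once rather than flooding the log every time a server does it */
	if (WARN_ON_ONCE(count > rsize))
		count = rsize;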
On Thu, Apr 23, 2015 at 01:16:15PM +0300, Andrey Ryabinin wrote:
> This change caused the following:
> This could happen when p9pdu_readf() changes 'count' to some value >
> iov_iter_count(from):
>
> p9_client_write():
> <...>
> int count = iov_iter_count(from);
> <...>
>
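The sequence Andrey is pointing at, condensed (paraphrased, not the literal
client.c text): count starts out as the number of bytes the caller wants to
send, gets silently overwritten with whatever the RWRITE reply claims, and
that value is then fed straight to iov_iter_advance() - which is where KASan
catches the overrun when the reply is bogus:

	int count = iov_iter_count(from);	/* bytes we intend to send */
	...
	*err = p9pdu_readf(req->rc, clnt->proto_version, "d", &count);
	...					/* count now = server's claim */
	iov_iter_advance(from, count);		/* if that exceeds what was left
						 * in 'from', this walks past the
						 * end of the iovec -> KASan */
	total += count;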
On 04/23/2015 01:16 PM, Andrey Ryabinin wrote:
> On 04/15/2015 09:14 PM, Al Viro wrote:
>> 9p: switch p9_client_write() to passing it struct iov_iter *
>
> Hi Al,
>
> This change caused the following:
>
> [ 91.637917]
> ==
> [
On 04/15/2015 09:14 PM, Al Viro wrote:
> 9p: switch p9_client_write() to passing it struct iov_iter *
Hi Al,
This change caused the following:
[ 91.637917]
==
[ 91.639252] BUG: KASan: out of bounds on stack in
iov_iter_advan
Now that net-next went in... Here's the next big chunk - killing
->aio_read() and ->aio_write(). There'll be one more pile today (direct_IO
changes and generic_write_checks() cleanups/fixes), but I'd prefer to keep
that one separate. Please, pull from
git://git.kernel.org/pub/scm/linux/k