On Sun Jun 8 14:17:16 EDT 2014, quans...@quanstro.net wrote:
> On Sun Jun 8 13:55:52 EDT 2014, cinap_len...@felloff.net wrote:
> > right. the question is, how did it vanish from the image cache.
>
> i think it is in the image cache, but .ref >1.
perhaps independent of your question,
my assumpti
On Tue Jun 10 09:58:18 EDT 2014, st...@quintile.net wrote:
> > if a process exits and is then run again, it will always be re-read
> > from storage. (since channel comparisons factor in to finding
> > an image.) only if the lifetime overlaps will the cached image be
> > used.
>
> The one place w
no. my attachimage() compares qid + mountid (which is unique) and
reattaches the passed in channel if image->c was nil. when
a process exits, the segments are released, decrementing ref of
the pages and the images. the image has an additional field pgref where
it counts the number of page referenc
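
The lookup described above can be sketched roughly as follows. This is a minimal user-space model, not the 9front code: the struct layouts, the fixed-size cache, and the names attachimage_sketch and putimage_sketch are all hypothetical, and keeping an image findable while pgref is non-zero is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
typedef struct Chan Chan;
struct Chan {
	unsigned long long qidpath;
	int mountid;
	int ref;
};

typedef struct Image Image;
struct Image {
	unsigned long long qidpath;
	int mountid;
	Chan *c;	/* NULL while the image is not in use */
	int ref;	/* segments using the image */
	int pgref;	/* cached pages still pointing at the image */
	int used;
};

enum { Nimage = 8 };
Image cache[Nimage];

/* Find an image by qid + mountid; reattach the passed-in chan if c was dropped. */
Image*
attachimage_sketch(Chan *c)
{
	int i;
	Image *im;

	for(i = 0; i < Nimage; i++){
		im = &cache[i];
		if(im->used && im->qidpath == c->qidpath && im->mountid == c->mountid){
			if(im->c == NULL){
				im->c = c;
				c->ref++;
			}
			im->ref++;
			return im;
		}
	}
	for(i = 0; i < Nimage; i++){
		im = &cache[i];
		if(!im->used){
			im->used = 1;
			im->qidpath = c->qidpath;
			im->mountid = c->mountid;
			im->c = c;
			c->ref++;
			im->ref = 1;
			im->pgref = 0;
			return im;
		}
	}
	return NULL;	/* cache full; a real kernel would reclaim here */
}

/* Drop one segment reference; release the chan once nobody uses the image. */
void
putimage_sketch(Image *im)
{
	assert(im->ref > 0);
	if(--im->ref > 0)
		return;
	if(im->c != NULL){
		im->c->ref--;
		im->c = NULL;
	}
	if(im->pgref == 0)	/* assumption: stay findable while pages are cached */
		im->used = 0;
}
```

In this model a second run of the same binary after the first has exited still finds the image, because the match does not depend on the old chan being alive.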
> if a process exits and is then run again, it will always be re-read
> from storage. (since channel comparisons factor in to finding
> an image.) only if the lifetime overlaps will the cached image be
> used.
The one place where I can imagine lots of cache hits is when running
parallel mk jobs
On Mon Jun 9 23:55:00 EDT 2014, cinap_len...@felloff.net wrote:
> while you're at it, take a look at 9front imageattach() code.
> it allows the chan attached to the image to be released when the
> image is not in use. this avoids all these chans and mounts
> being kept around until the image is re
while you're at it, take a look at 9front imageattach() code.
it allows the chan attached to the image to be released when the
image is not in use. this avoids all these chans and mounts
being kept around until the image is reclaimed. the problem
is worked around in iostats by killing the filesyst
On Mon Jun 9 04:25:00 EDT 2014, charles.fors...@gmail.com wrote:
> On 8 June 2014 19:37, Charles Forsyth wrote:
>
> > On 8 June 2014 19:15, erik quanstrom wrote:
> >
> >> i think it is in the image cache, but .ref >1.
> >
> >
> > but in that case it will still not pio, but make a local writabl
> On 8 June 2014 19:37, Charles Forsyth wrote:
>
> > On 8 June 2014 19:15, erik quanstrom wrote:
> >
> >> i think it is in the image cache, but .ref >1.
> >
> >
> > but in that case it will still not pio, but make a local writable copy.
>
>
> in fact ref > 1 is the copy-on-write case and in a
On 8 June 2014 19:37, Charles Forsyth wrote:
> On 8 June 2014 19:15, erik quanstrom wrote:
>
>> i think it is in the image cache, but .ref >1.
>
>
> but in that case it will still not pio, but make a local writable copy.
in fact ref > 1 is the copy-on-write case and in a sense the usual one,
w
On 8 June 2014 19:15, erik quanstrom wrote:
> i think it is in the image cache, but .ref >1.
but in that case it will still not pio, but make a local writable copy.
On Sun Jun 8 13:55:52 EDT 2014, cinap_len...@felloff.net wrote:
> right. the question is, how did it vanish from the image cache.
i think it is in the image cache, but .ref >1.
- erik
right. the question is, how did it vanish from the image cache.
--
cinap
On Sun Jun 8 13:51:18 EDT 2014, charles.fors...@gmail.com wrote:
> On 8 June 2014 18:34, erik quanstrom wrote:
>
> > well, those are the measurements. do you think they are misleading?
> > perhaps
> > with the pio happening in another context? i haven't hunted this down.
> >
>
> the differe
On 8 June 2014 18:34, erik quanstrom wrote:
> well, those are the measurements. do you think they are misleading?
> perhaps
> with the pio happening in another context? i haven't hunted this down.
>
the difference is only how fault makes the copy (easy or hard), there
shouldn't be any call to
duppage() causes the freelist to be shuffled differently. without
stuffing cached pages at the freelist tail, the tail accumulates
an uncached "stopper" page which breaks the invariant of imagereclaim
which just scans from the tail backwards as long as the pages are
cached.
imagereclaim does not mo
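
The "stopper" effect can be illustrated with a small model. This is only a sketch under assumptions: the Page layout and the name reclaimscan_sketch are hypothetical, and the function merely counts the pages a tail-first scan would visit rather than reclaiming anything.

```c
#include <stddef.h>

/* Hypothetical, simplified free-list page. */
typedef struct Page Page;
struct Page {
	int cached;	/* non-zero if the page still belongs to an image */
	Page *prev;	/* towards the head of the free list */
	Page *next;	/* towards the tail of the free list */
};

/*
 * Scan from the tail backwards for as long as the pages are cached,
 * and return how many pages the scan reached.
 */
int
reclaimscan_sketch(Page *tail)
{
	int n;
	Page *p;

	n = 0;
	for(p = tail; p != NULL && p->cached; p = p->prev)
		n++;
	return n;
}
```

With three cached pages followed by one uncached page at the tail, the scan stops immediately and reaches none of the cached pages, even though they are all on the free list.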
> that doesn't make any sense. duppage copied the page the wrong way
> round (used the image page and put another copy in). eliminating
> duppage simply copies the page from the image cache instead of using
> that page. there isn't any i/o in either case.
well, those are the measurements. do y
On 8 June 2014 15:53, erik quanstrom wrote:
> i was experimenting a bit with cinap's version of dropping duppage, and for
> the lame build the kernel tests there's quite a bit more i/o
>
that doesn't make any sense. duppage copied the page the wrong way round
(used the image page and put another
i get consistent results with iostats for building pc64.
(on amd64)
166 192 10491600 /bin/rc
4 90 34330800 /bin/awk
37 515128000 /bin/echo
17 43 10378600 /bin/sed
3 1751567
i was experimenting a bit with cinap's version of dropping duppage, and for
the lame build the kernel tests there's quite a bit more i/o
        duppage     no duppage
read    45976291    53366962
rpc     73674       75718
you can see below that both end up rea
with panic's duppage fix, i get hits.
Tue Mar 13 17:31:34: minooka# duppage: p->ref 2 != 1
Tue Mar 13 17:31:34: duppage: p->ref 2 != 1
Tue Mar 13 17:31:34: duppage: p->ref 3 != 1
Tue Mar 13 17:31:34: duppage: p->ref 2 != 1
Tue Mar 13 17:31:34: duppage: p->ref 2 != 1
Tue Mar 13 17:31:34: duppage: p
> Now I understand. Before returning 1 (failure) duppage
> always calls uncachepage first, so no harm is done.
exactly!
> How about submitting a patch(1)?
i think geoff is working on it.
i wanted to verify this first. the code is subtle and
there might be even better ways to fix this.
modifi
> in 9front, i changed the return type of duppage to void.
Now I understand. Before returning 1 (failure) duppage
always calls uncachepage first, so no harm is done.
Good analysis. How about submitting a patch(1)?
fixfault() in fault.c is the only user of duppage().
in 9front, i changed the return type of duppage to void.
fixfault() now looks like this:
	...
	if(lkp->image == &swapimage)
		ref = lkp->ref + swapcount(lkp->daddr);
	else
> the problem we have is that we temporarily unlock
> p to acquire palloc lock, which opens a chance for
> someone to take a ref on p, but duppage doesn't
> recheck after reacquiring the p lock.
>
> if this happens, duppage() has to return and let
> fixfault() make a copy for its segment.
That mak
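
The race in the quoted text, and the recheck that closes it, can be modelled in a few lines. This is a single-threaded sketch, not the kernel code: the Page layout, pagelock/pageunlock, window_hook and duppage_sketch are hypothetical names, and the hook merely stands in for another process running while p is unlocked.

```c
#include <stddef.h>

/* Hypothetical, simplified page with a lock flag and a reference count. */
typedef struct Page Page;
struct Page {
	int locked;
	int ref;
};

/* Stand-in for whatever another process does while p is unlocked. */
void (*window_hook)(Page*);

void pagelock(Page *p) { p->locked = 1; }
void pageunlock(Page *p) { p->locked = 0; }
void takeref(Page *p) { p->ref++; }

/*
 * Called with p locked. Returns 0 if the caller may keep using p as its
 * own page, 1 if someone else holds a reference and the caller has to
 * make a copy for its segment instead.
 */
int
duppage_sketch(Page *p)
{
	if(p->ref != 1)
		return 1;

	pageunlock(p);		/* window: the palloc lock would be taken here */
	if(window_hook != NULL)
		window_hook(p);
	pagelock(p);

	if(p->ref != 1)		/* the recheck that was missing */
		return 1;
	return 0;
}
```

Setting window_hook to takeref reproduces the failure mode: the reference count is 1 on entry and 2 after the lock is reacquired, so without the second test the caller would treat a shared page as private.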
no, this is different.
the "XXX - here's a bug" comment is about the new page that
duppage() makes. it explains a race where the new
page is on the freelist and the image cache.
what's important is that when someone (lookpage)
locks the page, its refcount and image/daddr are
consistent, which should be the cas
> anyone with a mp system can confirm this?
Yes, I've confirmed by experiment that duppage(lkp) can return
with lkp->ref > 1.
Is what you've found related to the "XXX - here's a bug" comment
in duppage? Maybe that's what really needs fixing.
> a change that rechecks the refcount after calling duppage() in
> fixfault() and doing a copy like for the ref > 1 case seems to have
> made the problem go away. (system has been running for 8 days now)
>
> anyone with a mp system can confirm this?
the description sounds logical, but i haven't seen th
I think we should institute a Sherlock Holmes award at iwp9.
(It wouldn't mean you need to throw yourself off a building.)
discovered odd behaviour on a mp system. it was running rchttpd and
werc and after like 3 or 7 days of load, broken sed and grep processes
appeared in the process table. inspecting the processes with acid
yields a strange picture. the processes crashed (or aborted themselves)
before any data was rea