> AFAIK the disk is doing just fine.  Moreover, even during the period when
> fossil is complaining, venti/read on 9fs's score works just fine.  So I
> don't believe the fault is venti's.

i don't believe that conclusion is warranted.
/sys/src/cmd/fossil/cache.c:683,684
is where this condition gets set.  so either
the read fails or the score or length is bad.
%r is not set (see a few lines down) so when
combined with this report:

> This is likely too large a hammer, but when this happens I rebuild the venti
> index so that I can get past the issue.  I see this more under Plan 9 than
> p9p.  The block in error always exists in an arena and a checkarenas reports
> no errors.  The problem usually persists across reboots until I reconstitute
> the index.

it's reasonable to guess that the block returned
might not be the right one.

in principle, this could be a drive failure,
bad memory or a venti bug. i don't have a
lot of venti experience, but i think this
/sys/src/cmd/venti/srv/lump.c:226,230
is where venti reads and it seems to ensure
that the initial read double-checks scores.
it would be 1e-80 hard for a drive error
to sneak by, so that leaves us with memory
errors or venti cache bugs.

it's hard to see how reindexing would fix
a cache bug, though.  so maybe i'm all wet.

it would be interesting to know if the score
of the block returned by venti/read is correct.
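
a rough sketch of that check, not tested against a real
venti.  it assumes the score is the plain sha-1 of the block
bytes exactly as venti/read writes them, and that the score
is given as 40 hex digits.  the function names and the file
name block.out are made up for illustration.

```python
import hashlib


def score_of(block: bytes) -> str:
    """return the sha-1 of the block as 40 lowercase hex digits."""
    return hashlib.sha1(block).hexdigest()


def score_matches(block: bytes, score: str) -> bool:
    """true if the block hashes to the score we asked for.

    assumption: the score is the sha-1 of the bytes as returned,
    with no extra padding or header added by the tool that
    fetched the block.
    """
    return score_of(block) == score.strip().lower()
```

for example, save the output of venti/read on the failing
score to block.out, then call
score_matches(open("block.out", "rb").read(), score).
a false result would mean venti handed back the wrong block.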

- erik
