Re: Race in svn_atomic_namespace__create

Stefan Fuhrmann Tue, 06 Nov 2012 02:48:26 -0800

On Mon, Nov 5, 2012 at 3:16 PM, Stefan Sperling <s...@elego.de> wrote:

> On Mon, Nov 05, 2012 at 02:54:07PM +0100, Stefan Fuhrmann wrote:
> > On Sun, Nov 4, 2012 at 10:40 AM, Stefan Sperling <s...@elego.de> wrote:
> > > I just came across something that reminded me of this thread.
> > > It seems PostgreSQL is doing something quite similar to what we
> > > want to do here:
> > >
> > >  When the first PostgreSQL process attaches to the shared memory segment, it
> > >  checks how many processes are attached.  If the result is anything other than
> > >  "one", it knows that there's another copy of PostgreSQL running which is
> > >  pointed at the same data directory, and it bails out.
> > > http://rhaas.blogspot.nl/2012/06/absurd-shared-memory-limits.html
> > >
> >
> > IIUC, the problems they are trying to solve are:
> >
> > * have only one process open / manage a given database
> > * have SHM of arbitrary size
> >
> > Currently, we use named SHM to make the values of
> > two 64-bit numbers per repo visible to all processes.
> > Also, we don't have a master process that would
> > channel access to a given repository.
> >
> > The "corruption" issue is only about how to behave
> > if someone wrote random data to one of our repo
> > files. That's being addressed now (don't crash, have
> > a predictable behavior in most cases).
> >
> > > If this works for postgres I wonder why it wouldn't work for us.
> > > Is this something we cannot do because APR doesn't provide the
> > > necessary abstractions?
> > >
> >
> > The postgres code / approach may be helpful when
> > we try to move the whole membuffer cache into a
> > SHM segment.
>
> Ah, I see.
>
> Next question: Why don't we use a single SHM segment for the revprop cache?
>
> Revprop values are usually small so mapping a small amount of memory
> would suffice. And using a single SHM segment would make updated values
> immediately visible in all processes, wouldn't it? And we wouldn't need the
> generation number dance to make sure all processes see up-to-date values.
> Whichever process updates a revprop value would update the corresponding
> section of shared memory.
>

First of all, I want to point out that we now have a
working implementation for 1.8 and what we are
discussing here is probably targeted at future releases.

If we want revprop-only caches (to keep things simple),
we still need to handle the following basic trade-off:
lifetime (effectiveness) vs. size. To be effective with
e.g. serf, the cache content should survive single
requests, i.e. live longer than an fs_t. We also need
several MB (~200 bytes/rev) per repo for decent hit rates.

OTOH, there may be hundreds of repositories on a
server and it is very hard to re-size the revprop cache
when the number of revs in a repo grows. It is thus
not quite feasible to keep fairly-sized per-repository
caches around indefinitely - even if they only contain
revprops.
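
To put rough numbers on that trade-off, here is a back-of-the-envelope
sketch in Python. Only the ~200 bytes per revision figure comes from the
estimate above; the repository and revision counts are made-up values
for illustration, and cache_bytes is a hypothetical helper.

```python
# Back-of-the-envelope sizing for per-repository revprop caches.
BYTES_PER_REV = 200          # rough estimate from the text above


def cache_bytes(revisions, bytes_per_rev=BYTES_PER_REV):
    """Memory needed to hold the revprops of every revision of one repo."""
    return revisions * bytes_per_rev


# Hypothetical server: 300 repositories with 50,000 revisions each.
per_repo = cache_bytes(50_000)
total = 300 * per_repo

print(f"per repo: {per_repo / 1e6:.0f} MB")
print(f"server:   {total / 1e9:.0f} GB")
```

With these assumed numbers a fully sized cache per repository adds up
to gigabytes across the server, which is why a shared cache looks more
attractive.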

That means that we should have one shared cache (or a
small number of them) for all repositories and let e.g.
some external process manage its lifetime etc. But
that is technically no different from having our membuffer
cache use shared memory instead of being process
local - which is a good thing.

The downside is that we need to address the following
3 issues when moving membuffer to SHM. From the
easiest to the hardest:

* make generations an integral feature of the cache
  (e.g. by tagging index entries and bumping the
   values upon "replace")
  This is necessary to get rid of the revprop generations.
  Race between revprop reader 1 and writer 2:
  1: lookup revprop in cache -> miss
  1: read revprop from disk -> "old content"
  2: store new revprop on disk and in cache
  1: store "old content" in cache
  Note that after step 3 the new content may or may not
  be cached, so reader 1 cannot detect the conflict in
  step 4 simply by checking for it.

* Have some SHM not bound to a repository or a
  parent / child (fork) process relationship. Make it
  work on most platforms.

* Provide portable, robust (lock owners may die), very
  low overhead (~1 µs) many-readers-one-writer locks
  on the cache content. I have some ideas on how
  to do that but this will be very hard to do correctly.
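
The first item can be illustrated with a small Python sketch. This is
not the membuffer code: GenerationCache and all of its methods are
hypothetical names, a plain in-process lock stands in for whatever
locking a SHM cache would need, and keeping the generation tag around
after the value is dropped is an assumption about how "tagging index
entries" might work.

```python
import threading


class GenerationCache:
    """Toy cache whose entries carry a generation tag (hypothetical API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}   # key -> cached value (may be dropped at any time)
        self._gen = {}    # key -> generation; kept even if the value is dropped

    def lookup(self, key):
        """Return (value or None, generation seen at lookup time)."""
        with self._lock:
            return self._data.get(key), self._gen.get(key, 0)

    def store_if_current(self, key, value, seen_gen):
        """Reader path: cache a value read from disk unless a writer
        replaced the entry since the lookup that returned seen_gen."""
        with self._lock:
            if self._gen.get(key, 0) != seen_gen:
                return False          # stale: a writer got in between
            self._data[key] = value
            return True

    def replace(self, key, value):
        """Writer path: bump the generation, then cache the new value."""
        with self._lock:
            self._gen[key] = self._gen.get(key, 0) + 1
            self._data[key] = value

    def evict(self, key):
        """Drop the value but keep the generation tag."""
        with self._lock:
            self._data.pop(key, None)


# Replay of the four steps of the race, plus a drop of the new value.
cache = GenerationCache()
disk = {"r42": "old content"}

value, gen = cache.lookup("r42")          # 1: lookup -> miss
stale = disk["r42"]                       # 1: read "old content" from disk
disk["r42"] = "new content"               # 2: store new revprop on disk ...
cache.replace("r42", "new content")       #    ... and in the cache
cache.evict("r42")                        # new content may be gone again
stored = cache.store_if_current("r42", stale, gen)   # 1: store is rejected

print(stored, cache.lookup("r42")[0])
```

The reader's late store is refused because the generation it saw at
lookup time no longer matches, regardless of whether the writer's value
is still cached.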

I'd like to see all that solved and SHM being used
for membuffer - which has been designed with that
goal in mind. It's the robustness part that makes it
so much harder to do than I thought back then.

-- Stefan^2.
