Re: Membuf "optimization" in r1532186

Stefan Fuhrmann Tue, 31 Dec 2013 08:26:37 -0800

On Sun, Dec 22, 2013 at 2:52 PM, Branko Čibej <br...@wandisco.com> wrote:

> On 22.12.2013 14:16, Stefan Fuhrmann wrote:
> >
> > On Mon, Dec 9, 2013 at 11:01 AM, Branko Čibej <br...@wandisco.com
> > <mailto:br...@wandisco.com>> wrote:
> >
> >     To clarify, the most often used pattern where the initial membuf
> >     size is 0 is when normalizing UTF-8 strings, where we let the
> >     utf8proc code determine how large the allocation has to be, based
> >     on its analysis of the string; the only alternative is to allocate
> >     a far larger buffer than you can ever need, and incidentally making
> >     assumptions about how the normalization is implemented. The extra
> >     allocation you introduced here does not speed anything up; rather
> >     the opposite.
> >
> >
> > It is not an extra allocation. For 0 bytes we simply get a valid pointer
> > but the next allocation will return the same pointer. So, there is no
> > waste.
>

[Last post to this topic as this is *really* a minor change.]

> How on earth do you know that? Do you have a crystal ball that tells you
> that there will be no intervening allocations from the same pool?

Even if the active block in the pool has been completely
allocated (zero free memory), allocating 0 extra bytes is
free in the current implementation.
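
To illustrate the point, here is a minimal sketch of a bump-pointer
pool. It is a toy model, not APR's actual code: toy_pool_t, toy_palloc
and the 8-byte rounding are all assumptions made up for this example.
It only shows why a 0-byte request can hand out a valid pointer without
consuming space, even when the active block is full.

```c
#include <assert.h>
#include <stddef.h>

/* Toy bump allocator (hypothetical; NOT the APR implementation). */
typedef struct toy_pool_t {
    char *first_avail;   /* next free byte in the active block */
    char *endp;          /* one past the end of the active block */
} toy_pool_t;

static char toy_block[64];

static void toy_pool_init(toy_pool_t *pool)
{
    pool->first_avail = toy_block;
    pool->endp = toy_block + sizeof(toy_block);
}

/* Round a request up to a multiple of 8 bytes (assumed alignment). */
static size_t toy_align(size_t n)
{
    return (n + 7u) & ~(size_t)7u;
}

static void *toy_palloc(toy_pool_t *pool, size_t size)
{
    size_t aligned = toy_align(size);
    void *mem;

    /* A real pool would fetch a new block here; the toy just fails. */
    if (aligned > (size_t)(pool->endp - pool->first_avail))
        return NULL;

    mem = pool->first_avail;
    pool->first_avail += aligned;   /* advances by 0 for a 0-byte request */
    return mem;
}
```

In this model, toy_palloc(pool, 0) followed directly by
toy_palloc(pool, 16) returns the same address twice. An intervening
allocation from the same pool would change the second address, but the
0-byte request itself still costs no space.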

>
> Or another one that tells you what will happen to APR's pool
> implementation in some future version?
>

I obviously can't tell - except that a major point of the APR
pool design is to be space efficient at the cost of being
unable to de-alloc selectively. If it ever were to add some
per-allocation overhead, its size would still be small relative
to the actual data buffer size.

> (On the other note about apr_palloc taking less time than a mispredicted
> conditional jump ... you're assuming that the apr_palloc code is in the
> L1 instruction cache,

Which it will be in most cases. If it is not, the initial allocation
will prime L1I for the following re-alloc. In general, SVN has
quite high L1I hit rates, i.e. high temporal code locality.

> and you're assuming that everyone uses Intel Core
> processors

apr_palloc latency is dominated by L1D latency. The latter
is usually subject to the same design forces as pipeline
depth. Even for embedded PPC, 2xL1D latency <= branch
misprediction latency.

> — and that everyone uses the same compiler you do. None of
> the above is likely to be true, in general.)
>

Well, with a good compiler, constant propagation will make
the old special-cased membuf_create() faster than the new one
calling apr_palloc (even if the latter gets a constant prop
code variant as well). The resize code is the place where
we can skip a NULL check.
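
As a rough sketch of that trade-off (hypothetical names throughout;
this is not the actual svn membuf code, and toy_alloc merely stands in
for a pool allocator whose 0-byte requests return a valid pointer):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static char arena[256];
static size_t arena_used = 0;

/* Toy allocator: a 0-byte request yields a valid pointer, uses no space. */
static void *toy_alloc(size_t size)
{
    void *mem = arena + arena_used;
    assert(size <= sizeof(arena) - arena_used);
    arena_used += size;
    return mem;
}

typedef struct toy_membuf_t {
    void *data;
    size_t size;
} toy_membuf_t;

/* Old style: size 0 is special-cased and leaves data == NULL. */
static void create_old(toy_membuf_t *buf, size_t size)
{
    buf->data = size ? toy_alloc(size) : NULL;
    buf->size = size;
}

/* New style: always allocate, so data is never NULL. */
static void create_new(toy_membuf_t *buf, size_t size)
{
    buf->data = toy_alloc(size);
    buf->size = size;
}

/* Old resize: must guard the copy, since the old buffer may be NULL. */
static void resize_old(toy_membuf_t *buf, size_t size)
{
    void *old = buf->data;
    size_t old_size = buf->size;

    if (size <= old_size)
        return;
    buf->data = toy_alloc(size);
    buf->size = size;
    if (old)
        memcpy(buf->data, old, old_size);
}

/* New resize: the copy is unconditional; a 0-byte copy is harmless. */
static void resize_new(toy_membuf_t *buf, size_t size)
{
    void *old = buf->data;
    size_t old_size = buf->size;

    if (size <= old_size)
        return;
    buf->data = toy_alloc(size);
    buf->size = size;
    memcpy(buf->data, old, old_size);
}
```

With a constant size of 0, create_old() can fold down to two stores,
while create_new() still has to call the allocator; resize_new() in
turn drops the conditional that resize_old() needs.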

-- Stefan^2.
