John Naylor <john.nay...@2ndquadrant.com> writes:
> v2 had an Assert that was only correct while experimenting with
> eliding right shift. Fixed in v3.

I think there must have been something wrong with your test that
said that eliminating the right shift from the non-CLZ code made
it slower.  It should be an unconditional win, just as it is for
the CLZ code path.  (Maybe some odd cache-line-boundary effect?)

Also, I think it's just weird to account for ALLOC_MINBITS one
way in the CLZ path and the other way in the other path.
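
To make the two ways of accounting for ALLOC_MINBITS concrete, here is a
minimal standalone sketch, not the committed aset.c code. The constant
values (ALLOC_MINBITS = 3, ALLOC_CHUNK_LIMIT = 8192) and all function names
are assumptions chosen for illustration, and the bit-position helper is a
plain loop rather than a CLZ instruction or lookup table. One variant shifts
the size right before finding the leftmost one bit; the other finds the bit
position of the unshifted value and subtracts the constant afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for this sketch only. */
#define ALLOC_MINBITS		3		/* smallest chunk is 1 << 3 = 8 bytes */
#define ALLOC_CHUNK_LIMIT	8192	/* callers promise size <= this */

/* Position of the most significant set bit; x must be nonzero. */
static int
leftmost_one_pos(uint32_t x)
{
	int			pos = 0;

	while (x >>= 1)
		pos++;
	return pos;
}

/* Variant A: shift away ALLOC_MINBITS first, then find the top bit. */
static int
free_index_with_shift(size_t size)
{
	assert(size <= ALLOC_CHUNK_LIMIT);
	if (size <= ((size_t) 1 << ALLOC_MINBITS))
		return 0;
	return leftmost_one_pos((uint32_t) (size - 1) >> ALLOC_MINBITS) + 1;
}

/* Variant B: find the top bit of the unshifted value, subtract afterwards. */
static int
free_index_no_shift(size_t size)
{
	assert(size <= ALLOC_CHUNK_LIMIT);
	if (size <= ((size_t) 1 << ALLOC_MINBITS))
		return 0;
	return leftmost_one_pos((uint32_t) (size - 1)) - ALLOC_MINBITS + 1;
}

/* Count sizes in 1..ALLOC_CHUNK_LIMIT where the two variants disagree. */
static int
count_mismatches(void)
{
	int			bad = 0;

	for (size_t sz = 1; sz <= ALLOC_CHUNK_LIMIT; sz++)
	{
		if (free_index_with_shift(sz) != free_index_no_shift(sz))
			bad++;
	}
	return bad;
}
```

Because size - 1 is at least 1 << ALLOC_MINBITS whenever the slow path is
taken, shifting first just lowers the leftmost-one position by ALLOC_MINBITS,
so the two variants should agree for every size in range. Variant B replaces
the shift with a subtraction of a compile-time constant.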

I decided that it might be a good idea to do performance testing
in-place rather than in a standalone test program.  I whipped up
the attached that just does a bunch of palloc/pfree cycles.
I got the following results on a non-cassert build (medians of
a number of tests; the times are repeatable to ~ 0.1% for me):

HEAD:           2429.431 ms
v3 CLZ:         2131.735 ms
v3 non-CLZ:     2477.835 ms
remove shift:   2266.755 ms

I didn't bother to try this on non-x86_64 architectures, as
previous testing convinces me the outcome should be about the
same.

Hence, pushed that way, with a bit of additional cosmetic foolery:
the static assertion made more sense to me in relation to the
documented assumption that size <= ALLOC_CHUNK_LIMIT, and I
thought the comment could use some work.

                        regards, tom lane

/*

create function drive_palloc(count int) returns void
strict volatile language c as '.../drive_palloc.so';

\timing

select drive_palloc(10000000);

 */

#include "postgres.h"

#include "fmgr.h"
#include "miscadmin.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"

PG_MODULE_MAGIC;

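/*
 * drive_palloc(count int) returns void
 *
 * Runs "count" iterations; each iteration does one palloc/pfree pair
 * for every power-of-2 request size from 1 to 8192 bytes.
 */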
PG_FUNCTION_INFO_V1(drive_palloc);
Datum
drive_palloc(PG_FUNCTION_ARGS)
{
	int32		count = PG_GETARG_INT32(0);

	while (count-- > 0)
	{
		for (size_t sz = 1; sz <= 8192; sz <<= 1)
		{
			void *p = palloc(sz);
			pfree(p);
		}

		CHECK_FOR_INTERRUPTS();
	}

	PG_RETURN_VOID();
}
