On 19.12.2011 22:12, Tom Lane wrote:
Noah Misch<n...@leadboat.com>  writes:
On Mon, Dec 19, 2011 at 05:09:11PM +0200, Heikki Linnakangas wrote:
That is not sufficient on platforms with a weak memory model, like Itanium.

Other processors may observe the lock as held after its release, but there's no
correctness problem.

How weak is the memory model, exactly?

A correctness problem would ensue if out-of-order stores are possible,
ie other processors could observe the lock being freed (and then acquire
it) before seeing changes to shared variables that had been made while
holding the lock.

I'm a little dubious that this applies to Itanium, because I don't see
any memory fence instruction in the TAS macro.  If we needed one in
S_UNLOCK I'd rather expect there to be one in TAS as well.

There's a pretty good manual called "Implementing Spinlocks on Itanium and PA-RISC" from HP at: http://h21007.www2.hp.com/portal/download/files/unprot/itanium/spinlocks.pdf. It says, in section "3.2.1 Try to get a spinlock":

> Note also, that the ‘xchg’ instruction always
> has the ‘acquire’ semantics, so it enforces the correct memory ordering.

But the current implementation seems to be safe, anyway. In section "3.2.3 Freeing a spinlock", that manual says:

> Note, that a simple assignment statement into a volatile variable will not work, as we are
> assuming that the +Ovolatile=__unordered compile option is being used.

So +Ovolatile=__unordered would allow the kind of optimization that I thought was possible, and we would have a problem if we used it. But the default is more conservative, and a simple assignment does work.

I compiled the attached test program on an HP Itanium box, using the same flags you get from PostgreSQL's configure on that box. The relevant assembler output is:

        xchg4           r14 = [r15], r14           // M [slocktest.c: 66/3]
//file/line/col slocktest.c/67/3
        ld1.acq         r16 = [r11]                // M [slocktest.c: 67/3]
        nop.i           0                          // I
//file/line/col slocktest.c/68/3
        st1.rel         [r11] = r10             ;; // M [slocktest.c: 68/3]
//file/line/col slocktest.c/69/3
        st4.rel         [r15] = r0                 // M [slocktest.c: 69/3]
//file/line/col slocktest.c/70/1


The trick I missed is that the compiler attaches .rel to all the stores and .acq to all the loads through a volatile pointer. gcc seems to do the same. So we're safe.


It would be interesting to test whether using +Ovolatile=__unordered would make a difference to performance. Tacking those memory fence modifiers to all the volatile loads and stores might be expensive.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
#include <stdio.h>

#if defined(__GNUC__) || defined(__INTEL_COMPILER)
#if defined(__ia64__) || defined(__ia64)	/* Intel Itanium */
#define HAS_TEST_AND_SET

typedef unsigned int slock_t;

#define TAS(lock) tas(lock)

/* On IA64, it's a win to use a non-locking test before the xchg proper */
#define TAS_SPIN(lock)	(*(lock) ? 1 : TAS(lock))

#ifndef __INTEL_COMPILER

static __inline__ int
tas(volatile slock_t *lock)
{
	long int	ret;

	__asm__ __volatile__(
		"	xchg4 	%0=%1,%2	\n"
:		"=r"(ret), "+m"(*lock)
:		"r"(1)
:		"memory");
	return (int) ret;
}

#else /* __INTEL_COMPILER */

static __inline__ int
tas(volatile slock_t *lock)
{
	int		ret;

	ret = _InterlockedExchange(lock,1);	/* this is a xchg asm macro */

	return ret;
}

#endif /* __INTEL_COMPILER */
#endif	 /* __ia64__ || __ia64 */
#endif	/* defined(__GNUC__) || defined(__INTEL_COMPILER) */

#if defined(__hpux) && defined(__ia64) && !defined(__GNUC__)

#define HAS_TEST_AND_SET

typedef unsigned int slock_t;

#include <ia64/sys/inline.h>
#define TAS(lock) _Asm_xchg(_SZ_W, lock, 1, _LDHINT_NONE)
/* On IA64, it's a win to use a non-locking test before the xchg proper */
#define TAS_SPIN(lock)	(*(lock) ? 1 : TAS(lock))

#endif	/* HPUX on IA64, non gcc */

slock_t lock;
char shared2;

int main(int argc, char **argv)
{
  volatile char *p = &shared2;
  char local;

  TAS(&lock);
  local = *p;
  *p = 123;
  *((volatile slock_t *) &lock) = 0;
}
-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Reply via email to