On 31/07/2024 08:52, Thomas Munro wrote:
On Tue, Jul 30, 2024 at 9:50 AM Thomas Munro <thomas.mu...@gmail.com> wrote:
I guess we should also consider reimplementing the spinlock on the
atomic API, but I can see that Andres is poking at spinlock code right
now so I'll keep out of his way...

Here is a first attempt at that.

Looks good, thanks!

I haven't compared the generated asm yet, but it seems to work OK.
The old __i386__ implementation of TAS() said:

         * When this was last tested, we didn't have separate TAS() and 
TAS_SPIN()
         * macros.  Nowadays it probably would be better to do a non-locking 
test
         * in TAS_SPIN() but not in TAS(), like on x86_64, but no-one's done the
         * testing to verify that.  Without some empirical evidence, better to
         * leave it alone.

It seems that you did what the comment suggested. That seems fine. For sake of completeness, if someone has an i386 machine lying around, it would be nice to verify that. Or an official CPU manufacturer's implementation guide, or references to other implementations or something.

2.  The pg_atomic_unlocked_test_flag() function was surprising to me:
it returns true if it's not currently set (according to a relaxed
load).  Most of this patch was easy, but figuring out that I had
reverse polarity here was a multi-coffee operation :-)  I can't call
it wrong though, as it's not based on <stdatomic.h>, and it's clearly
documented, so *shrug*.

Huh, yeah that's unexpected.

3.  As for why we have a function that <stdatomic.h> doesn't, I
speculate that it might have been intended for implementing this exact
patch, ie wanting to perform that relaxed load while spinning as
recommended by Intel.  (If we strictly had to use <stdatomic.h>
functions, we couldn't use atomic_flag due to the lack of a relaxed
load operation on that type, so we'd probably have to use atomic_char
instead.  Perhaps one day we will cross that bridge.)

As a side note, I remember when I've tried to use pg_atomic_flag in the past, I wanted to do an atomic compare-and-exchange on it, to clear the value and return the old value. Surprisingly, there's no function to do that. There's pg_atomic_test_set_flag(), but no pg_atomic_test_clear_flag(). C11 has both "atomic_flag" and "atomic_bool", and I guess what I actually wanted was atomic_bool.

- *     On platforms with weak memory ordering, the TAS(), TAS_SPIN(), and
- *     S_UNLOCK() macros must further include hardware-level memory fence
- *     instructions to prevent similar re-ordering at the hardware level.
- *     TAS() and TAS_SPIN() must guarantee that loads and stores issued after
- *     the macro are not executed until the lock has been obtained.  
Conversely,
- *     S_UNLOCK() must guarantee that loads and stores issued before the macro
- *     have been executed before the lock is released.

That old comment means that both SpinLockAcquire() and SpinLockRelease() acted as full memory barriers, and looking at the implementations, that was indeed so. With the new implementation, SpinLockAcquire() will have "acquire semantics" and SpinLockRelease will have "release semantics". That's very sensible, and I don't believe it will break anything, but it's a change in semantics nevertheless.

--
Heikki Linnakangas
Neon (https://neon.tech)



Reply via email to