Hello.

Thank you for your insights. The patch has been changed according to your
suggestions.

Radek

---
From b055fb898c8f09ee1ae598c4c7d85ab2673d7a4c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Radek=20Barto=C5=88?= <[email protected]>
Date: Thu, 5 Jun 2025 12:41:37 +0200
Subject: [PATCH v2] Cygwin: implement spinlock pause for AArch64

---
 winsup/cygwin/local_includes/cygtls.h           | 5 ++++-
 winsup/cygwin/thread.cc                         | 5 +++++
 winsup/testsuite/winsup.api/pthread/cpu_relax.h | 3 ++-
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/winsup/cygwin/local_includes/cygtls.h b/winsup/cygwin/local_includes/cygtls.h
index 4698352ae..0b8439475 100644
--- a/winsup/cygwin/local_includes/cygtls.h
+++ b/winsup/cygwin/local_includes/cygtls.h
@@ -242,8 +242,11 @@ public: /* Do NOT remove this public: line, it's a marker for gentls_offsets. */
   {
     while (InterlockedExchange (&stacklock, 1))
       {
-#ifdef __x86_64__
+#if defined(__x86_64__)
        __asm__ ("pause");
+#elif defined(__aarch64__)
+       __asm__ ("dmb ishst\n"
+                 "yield");
 #else
 #error unimplemented for this target
 #endif
diff --git a/winsup/cygwin/thread.cc b/winsup/cygwin/thread.cc
index fea6079b8..510e2be93 100644
--- a/winsup/cygwin/thread.cc
+++ b/winsup/cygwin/thread.cc
@@ -1968,7 +1968,12 @@ pthread_spinlock::lock ()
       else if (spins < FAST_SPINS_LIMIT)
         {
           ++spins;
+#if defined(__x86_64__)
           __asm__ volatile ("pause":::);
+#elif defined(__aarch64__)
+          __asm__ volatile ("dmb ishst\n"
+                            "yield":::);
+#endif
         }
       else
        {
diff --git a/winsup/testsuite/winsup.api/pthread/cpu_relax.h b/winsup/testsuite/winsup.api/pthread/cpu_relax.h
index 1936dc5f4..71cec0b2b 100644
--- a/winsup/testsuite/winsup.api/pthread/cpu_relax.h
+++ b/winsup/testsuite/winsup.api/pthread/cpu_relax.h
@@ -4,7 +4,8 @@
 #if defined(__x86_64__) || defined(__i386__)  // Check for x86 architectures
    #define CPU_RELAX() __asm__ volatile ("pause" :::)
 #elif defined(__aarch64__) || defined(__arm__)  // Check for ARM architectures
-   #define CPU_RELAX() __asm__ volatile ("yield" :::)
+   #define CPU_RELAX() __asm__ volatile ("dmb ishst\n" \
+                                         "yield" :::)
 #else
    #error unimplemented for this target
 #endif
--
2.49.0.vfs.0.3

________________________________________
From: Cygwin-patches <[email protected]> on behalf of Brian Inglis <[email protected]>
Sent: Thursday, June 12, 2025 11:54 PM
To: [email protected] <[email protected]>
Subject: [EXTERNAL] Re: [PATCH] Cygwin: implement spinlock pause for AArch64

On 2025-06-12 14:47, Jeremy Drake via Cygwin-patches wrote:
> On Thu, 12 Jun 2025, Brian Inglis wrote:
>> Rust apparently uses yield on arm32, and isb (instruction sync barrier) on
>> aarch64, as yield is effectively a NOP (although it could be implemented to
>> free up pipeline slots, SMT switch, or signal), while isb (with optional sy
>> operand) is more like pause on x86_64:
>
> I looked up what mingw-w64 does, and for both arm32 and aarch64 they use
> "dmb ishst" followed by "yield" for YieldProcessor().  I think this makes
> sense, since you'd want any pending stores to be available before
> re-checking the spin condition.

That may be better, depending on the load-acquire and store-release options
described in relation to barriers:

        https://github.com/eclipse-openj9/openj9/issues/6332

        https://devblogs.microsoft.com/oldnewthing/20220812-00/?p=106968
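
For illustration only, the "dmb ishst" followed by "yield" sequence discussed
above could be wrapped roughly as in the following minimal C sketch. The names
cpu_relax_hint and demo_spinlock are hypothetical and do not appear in the
patch. The fallback branch for other targets is also an assumption, added so
that the sketch compiles everywhere, whereas cygtls.h in the patch uses #error
there.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper (not from the patch): one spin-wait hint per target. */
static inline void cpu_relax_hint (void)
{
#if defined(__x86_64__) || defined(__i386__)
  /* x86: PAUSE tells the core it is inside a spin-wait loop. */
  __asm__ volatile ("pause" ::: "memory");
#elif defined(__aarch64__)
  /* AArch64: store barrier for the inner shareable domain, then YIELD. */
  __asm__ volatile ("dmb ishst\n\t"
                    "yield" ::: "memory");
#else
  /* Assumption: on other targets fall back to a compiler-only barrier. */
  atomic_signal_fence (memory_order_seq_cst);
#endif
}

/* Hypothetical test-and-set lock used only to show where the hint goes. */
typedef struct { atomic_int locked; } demo_spinlock;

/* Returns true if the lock was free and is now held by the caller. */
static inline bool demo_spin_trylock (demo_spinlock *l)
{
  return atomic_exchange_explicit (&l->locked, 1, memory_order_acquire) == 0;
}

/* Spin until the lock is acquired, relaxing the CPU between attempts. */
static inline void demo_spin_lock (demo_spinlock *l)
{
  while (!demo_spin_trylock (l))
    cpu_relax_hint ();
}

/* Release the lock so that a spinning thread can take it. */
static inline void demo_spin_unlock (demo_spinlock *l)
{
  atomic_store_explicit (&l->locked, 0, memory_order_release);
}
```

The "memory" clobber is a choice made for this sketch. It keeps the compiler
from caching the lock word across the hint, and the patch itself does not use
it.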

--
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retrancher  but when there is no more to cut
                                 -- Antoine de Saint-Exupéry

Attachment: v2-0001-Cygwin-implement-spinlock-pause-for-AArch64.patch
