I wrote:
> A more useful test would be to directly experiment with contended
> spinlocks.  As I recall, we had some test cases laying about when
> we were fooling with the spin delay stuff on Intel --- maybe
> resurrecting one of those would be useful?

The last really significant performance testing we did in this area
seems to have been in this thread:

https://www.postgresql.org/message-id/flat/CA%2BTgmoZvATZV%2BeLh3U35jaNnwwzLL5ewUU_-t0X%3DT0Qwas%2BZdA%40mail.gmail.com

A relevant point from that is Haas' comment

    I think optimizing spinlocks for machines with only a few CPUs is
    probably pointless.  Based on what I've seen so far, spinlock
    contention even at 16 CPUs is negligible pretty much no matter what
    you do.  Whether your implementation is fast or slow isn't going to
    matter, because even an inefficient implementation will account for
    only a negligible percentage of the total CPU time - much less than 1%
    - as opposed to a 64-core machine, where it's not that hard to find
    cases where spin-waits consume the *majority* of available CPU time
    (recall previous discussion of lseek).

So I wonder whether this patch is getting ahead of the game.  It does
seem that ARM systems with a couple dozen cores exist, but are they
common enough to optimize for yet?  Can we even find *one* to test on
and verify that this is a win and not a loss?  (Also, seeing that
there are so many different ARM vendors, results from just one
chipset might not be too trustworthy ...)

			regards, tom lane