On Mon, Jun 30, 2014 at 12:20 PM, Andres Freund <and...@2ndquadrant.com> wrote: > On 2014-06-30 11:38:31 -0400, Robert Haas wrote: >> On Mon, Jun 30, 2014 at 11:17 AM, Andres Freund <and...@2ndquadrant.com> >> wrote: >> >> Now, we want to make these >> >> operations compiler fences as well, and AIUI your proposal for that is >> >> to make NEW_S_UNLOCK(lock) = out of line function call + S_LOCK(dummy) >> >> + S_UNLOCK(dummy) + S_UNLOCK(lock). >> > >> > My proposal was to use barrier.h provided barriers as long as it >> > provides a native implementation. If neither a compiler nor a memory >> > barrier is provided my proposal was to fall back to the memory barrier >> > emulation we already have, but move it out of line (to avoid reordering >> > issues). The reason for doing so is that the release *has* to be a >> > release barrier. >> >> What do you mean by "the memory barrier emulation we already have"? >> The only memory barrier emulation we have, to my knowledge, is a >> spinlock acquire-release cycle. > > Yes. > > Unrelated, but why haven't we defined it as if (TAS(dummy)) > S_UNLOCK(dummy);? That's just about as strong and less of a performance > drain? > Hm, I guess platforms that do an unlocked test first would be > problematic :/
Yes; also, there's no requirement for S_LOCK() to be defined in terms of TAS(), and thus no requirement for TAS() to exist at all. >> But trying to use a spinlock >> acquire-release to shore up problems with the spinlock release >> implementation makes my head explode. > > Well, it actually makes some sense. Nearly any TAS() implementation is > going to have some memory barrier semantics - so using a TAS() as > fallback makes sense. That's why we're relying on it for use in memory > barrier emulation after all. As far as I know, all of our TAS() implementations prevent CPU reordering in the acquire direction. It is not clear that they provide CPU-reordering guarantees adequate for the release direction, even when paired with whatever S_UNLOCK() implementation they're mated with. And it's quite clear that many of them aren't adequate to prevent compiler-reordering in any case. > Also note that barrier.h assumes that a S_LOCK/S_UNLOCK acts as a > compiler barrier - which really isn't guaranteed by the current contract > of s_lock.h. Although the tas() implementations look safe. Ugh, you're right. That should really be moved out-of-line. >> >> On the other hand, the API contract change making >> >> these into compiler barriers is a master-only change. I think for >> >> purposes of this patch we should assume that the existing code is >> >> sufficient to prevent CPU reordering and just focus on fixing compiler >> >> ordering problems. Whatever needs to change on the CPU-reordering end >> >> of things is a separate commit. >> > >> > I'm not arguing against that it should be a separate commit. Just that >> > the proper memory barrier bit has to come first. >> >> I feel like you're speaking somewhat imprecisely in an area where very >> precise speech is needed to avoid confusion. If you're saying that >> you think we should fix the S_UNLOCK() implementations that fail to >> prevent CPU-ordering before we change all the S_UNLOCK() >> implementations to prevent compiler-reordering, then that is fine with >> me; > > Yes, that's what I think is needed. OK, let's do it that way then. >> Do you want to propose a patch that does *only* that first part before >> we go further here? Do you want me to try to put together such a >> patch based on the details mentioned on this thread? > > I'm fine with either - we're both going to flying pretty blind :/. > > I think the way S_UNLOCK's release memory barrier semantics are fixed > might influence the way the compiler barrier issue can be solved. I think I'm starting to understand the terminological confusion: to me, a memory barrier means a fence against both the compiler and the CPU. I now think you're using it to mean a fence against the CPU, as distinct from a fence against the compiler. That had me really confused in some previous replies. > Fixing > the release semantics will involve going through all tas() > implementations and see whether the default release semantics are > ok. The ones with broken semantics will need to grow their own > S_UNLOCK. I am wondering if that commit shouldn't just remove the > default S_UNLOCK and move it to the tas() implementation sites. So, I think that here you are talking about CPU behavior rather than compiler behavior. Next paragraph is on that basis. The implementations that don't currently have their own S_UNLOCK() are i386, x86_64, Itanium, ARM without GCC atomics, S/390 zSeries, Sparc, Linux m68k, VAX, m32r, SuperH, non-Linux m68k, Univel CC, HP/UX non-GCC, Sun Studio, and WIN32_ONLY_COMPILER. Because most of those are older platforms, I'm betting that more of them than not are OK; I think we should confine ourselves to trying to fix the ones we're sure are wrong, which if I understand you correctly are ARM without GCC atomics, Sparc, and MIPS. I think it'd be better to just add copies of S_UNLOCK to those three rather and NOT duplicate the definition in the other 12 places. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers