On Tue, Mar 3, 2026 at 10:10 AM Florian Weimer <[email protected]> wrote:
>
> * Jakub Jelinek:
>
> > Functionally, I think whether we do an 8-bit or 32-bit or 64-bit
> > or with 0 constant doesn't matter, we don't modify any values on the
> > stack, just pretend to modify it.  The 8-bit and 32-bit ors
> > are 1-byte shorter though than 64-bit one.  How the 3 behave
> > performance-wise is unknown, if the particular probed spot on the
> > stack hasn't been stored/read for a while and won't be for a while,
> > then I'd think it shouldn't matter, dunno if there can be store
> > forwarding effects if it has been e.g. written or read very recently
> > by some other function as say 32-bit access and now is 8-bit.  The
> > access after the probe (if it happens soon enough) should be in valid
> > programs a store (and again, dunno if there can be issues if the
> > sizes are different).
>
> I don't see a discussion in the code why a read-modify-write operation
> is used for probing.  Maybe because a plain load would need an extra
> register?  And a plain store could result in a valgrind false positive?
> But maybe we could use CMPL?  The register version is one byte shorter
> than ORB, too.

A write has the advantage (or disadvantage?) of causing COW faulting
of the zero page.  But IIRC the desire is to hit the stack guard page.

For STLF issues, reading a byte will always forward from earlier stores,
but writing a byte is problematic for all following reads larger than a
byte (though that would be reading uninitialized data, so we'd expect
another write there).

I'd say a byte is clearly superior to any larger size for performance
reasons.  Whether COW faulting is desired or not is another question,
I'd say probably not?

Richard.

> I think for -fstack-check, it may be desirable to trigger copy-on-write
> on some systems, but for -fstack-clash-protection, that does not seem
> necessary (no proactive triggering of stack overflow traps needed).
>
> Thanks,
> Florian
>
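
For reference, the two probe styles discussed in this thread can be sketched in portable C.  This is only an illustration, not GCC's implementation: the names probe_rmw_byte, probe_load_byte and probe_demo are hypothetical, the 4096-byte page size is an assumption, and a compiler is free to lower the volatile accesses differently from the ORB/CMPL forms mentioned above.

```c
#include <stddef.h>

/* Hypothetical helper names for illustration; none of this is GCC code. */

/* Read-modify-write probe: OR one byte with 0.  The value stored is the
   value loaded, so memory contents do not change, but the location is
   still written, which is the kind of access that can trigger a
   copy-on-write fault.  How this is lowered (for example to a single
   byte-sized OR on x86) is up to the compiler. */
static void probe_rmw_byte(volatile unsigned char *p)
{
    *p |= 0;
}

/* Read-only probe: load one byte and discard it.  This touches the page
   without writing to it. */
static void probe_load_byte(const volatile unsigned char *p)
{
    (void)*p;
}

/* Touch one byte per assumed 4096-byte page of a buffer with both probe
   styles, then check that no byte changed.  Returns 0 on success. */
int probe_demo(void)
{
    enum { PAGE = 4096, PAGES = 4 };   /* page size is an assumption */
    static unsigned char buf[PAGE * PAGES];
    size_t i, off;

    for (i = 0; i < sizeof buf; i++)
        buf[i] = (unsigned char)(i * 7u);

    for (off = 0; off < sizeof buf; off += PAGE) {
        probe_rmw_byte(buf + off);
        probe_load_byte(buf + off);
    }

    for (i = 0; i < sizeof buf; i++)
        if (buf[i] != (unsigned char)(i * 7u))
            return 1;   /* a probe modified memory */
    return 0;
}
```

Calling probe_demo() from a small main() and checking for a zero return confirms that neither probe alters the probed bytes; it says nothing about performance or store-forwarding behaviour, which would need measurement on real hardware.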
