Hello,

On Mon, 10 Jul 2023, Alexander Monakov wrote:

> I think the main question is why you're going with this (weak) form
> instead of the (strong) form "may only clobber the low XMM regs":

I want to provide both.  One of them allows more arbitrary function
definitions, the other allows more register (parts) to be preserved.
I feel both have their place.

> as Richi noted, surely for libcalls we'd like to know they preserve
> AVX-512 mask registers as well?

Yeah, mask registers.  I'm still pondering this.  We would need to
split the 8 maskregs into two parts.  Hmm.

> Note this interacts with anything that interposes between the caller
> and the callee, like the Glibc lazy binding stub (which used to
> zero out high halves of 512-bit arguments in ZMM registers).
> Not an immediate problem for the patch, just something to mind perhaps.

Yeah, needs to be kept in mind indeed.  Anything coming in between the
caller and a so-marked callee needs to preserve things.

> > I chose to make it possible to write function definitions with that
> > attribute with GCC adding the necessary callee save/restore code in
> > the xlogue itself.
>
> But you can't trivially restore if the callee is sibcalling -- what
> happens then (a testcase might be nice)?

I hoped early on that the generic code that prohibits sibcalls between
call sites of too "different" ABIs would deal with this, and then forgot
to check.  Turns out you had a good hunch here, it actually does a
sibcall, destroying the guarantees.  Thanks! :)

> > Carefully note that this is only possible for the SSE2 registers, as
> > other parts of them would need instructions that are only optional.
>
> What is supposed to happen on 32-bit x86 with -msse -mno-sse2?

Hmm.  I feel the best answer here is "that should error out".  I'll add
a test and adjust patch if necessary.

> > When a function doesn't contain calls to
> > unknown functions we can be a bit more lenient: we can make it so that
> > GCC simply doesn't touch xmm8-15 at all, then no save/restore is
> > necessary.
>
> What if the source code has a local register variable bound to xmm15,
> i.e. register double x asm("xmm15"); asm("..." : "+x"(x)); ?

Makes a good testcase as well.  My take: it's acceptable with the
only-sse2-preserved attribute (xmm15 will in this case be
saved/restored), and should be an error with the everything-preserved
attribute (maybe we can make an exception as here we only specify an
XMM reg, instead of larger parts).

> > To that end I introduce actually two related attributes (for naming
> > see below):
> > * nosseclobber: claims (and ensures) that xmm8-15 aren't clobbered
>
> This is the weak/active form; I'd suggest "preserve_high_sse".

But it preserves only the low parts :-)  You swapped the two in your
mind when writing the reply?

> > I would welcome any comments, about the names, the approach, the attempt
> > at documenting the intricacies of these attributes and anything.
>
> I hope the new attributes are supposed to be usable with function
> pointers? From the code it looks that way, but the documentation doesn't
> promise that.

Yes, like all ABI-influencing attributes they _have_ to be part of the
function's type (and hence transfer to function pointers), with
appropriate incompatible-conversion errors and warnings at the
appropriate places.  (I know that this isn't always the way we're
dealing with ABI-influencing attributes, and that we often refer to a
decl only.  All those are actual bugs.)  And yes, I will adjust the docu
to be explicit about this.


Ciao,
Michael.
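
A possible shape for the xmm15 register-variable testcase discussed
above, as a minimal sketch only.  It assumes x86-64 and a GNU-compatible
compiler; the names bump and bump_selfcheck are made up for
illustration.  The attributes proposed in the patch are not in released
compilers, so none is applied here; a real testcase would put one of
them on bump.

```c
/* Sketch only: assumes x86-64 and a GNU-compatible compiler.
   No attribute from the patch is applied; a real testcase would
   add one of the proposed attributes to bump().  */
#include <assert.h>

#if defined(__x86_64__) && defined(__GNUC__)
/* Doubles v through an asm operand that is pinned to xmm15.  */
double bump(double v)
{
    register double x __asm__("xmm15") = v;  /* explicit register variable */
    __asm__("addsd %0, %0" : "+x"(x));       /* %0 is xmm15: x = x + x */
    return x;
}
#else
/* Portable fallback so the file still builds on other targets.  */
double bump(double v)
{
    return v + v;
}
#endif

/* Hypothetical helper: checks the value computed through xmm15.  */
void bump_selfcheck(void)
{
    assert(bump(1.5) == 3.0);
}
```

With an attribute applied, the interesting part would be the diagnostic
or the save/restore code the compiler emits for xmm15, not the returned
value.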