Re: I've submitted 207175 for a clang 3.8.0 va_list handling problem for powerpc [fpr use also tested]

Mark Millard Thu, 18 Feb 2016 14:30:12 -0800

On 2016-Feb-17, at 9:23 PM, Mark Millard <[email protected]> wrote:
> 
> My fpr related notes/worries about the fix were wrong.
> 
> I finally got some time to look at this again and I see that I somehow missed 
> the following code when I looked before:
> 
>  // The calling convention either uses 1-2 GPRs or 1 FPR.
>  Address NumRegsAddr = Address::invalid();
>  if (isInt || IsSoftFloatABI) {
>    NumRegsAddr = Builder.CreateStructGEP(VAList, 0, CharUnits::Zero(), "gpr");
>  } else {
>    NumRegsAddr = Builder.CreateStructGEP(VAList, 1, CharUnits::One(), "fpr");
>  }
> 
> So the
> 
>    Builder.CreateStore(Builder.getInt8(8), NumRegsAddr);
> 
> in Case 2 is tracking gpr vs. fpr usage contexts as it should. Also:
> 
>  llvm::Value *NumRegs = Builder.CreateLoad(NumRegsAddr, "numUsedRegs");
> 
>  // "Align" the register count when TY is i64.
>  if (isI64 || (isF64 && IsSoftFloatABI)) {
>    NumRegs = Builder.CreateAdd(NumRegs, Builder.getInt8(1));
>    NumRegs = Builder.CreateAnd(NumRegs, Builder.getInt8((uint8_t) ~1U));
>  }
> 
>  llvm::Value *CC =
>      Builder.CreateICmpULT(NumRegs, Builder.getInt8(8), "cond");
> 
> is using the same bounds check figure (8) for gpr and fpr.
> 
> Apparently that common bound is one reason that the clang numbering is not 
> the same as the ABI document's numbering: clang's numbering allows using the 
> same figure for both contexts. (Given the prior alignment for isI64 (or isF64 
> with IsSoftFloatABI).)
> 
> Sorry for the prior noise about fpr.
> 
> It is still true that DOUBLE_OR_FLOAT is untested so far.
> 
> ===
> Mark Millard
> markmi at dsl-only.net

I finally got some time to apply to some basic testing involving double as well 
(for involving fpr use). . .

No problems with exceptions. Looking at the memory contents at various stages 
in gdb looks good. va_list's gpr, fpr, overflow_arg_area changes as its va_args 
use progresses look good. Values extracted by va_args use looks good. Both 
default and -O2.

The added

> Builder.CreateStore(Builder.getInt8(8), NumRegsAddr);

passes my checks. I've not observed any problems from buildworld materials, 
unlike when that line is missing.

[Note: I run with the signal delivery modified to have a "red zone" to deal 
with other aspects of clang 3.8.0 code generation that are not ABI compliant 
for when the stack pointer is moved. Having a "red zone" is still operationally 
correct for an ABI compliant code generation, it just temporarily wastes more 
bytes. Also: the kernel was built with gcc 4.2.1 but world was built with clang 
3.8.0.]

===
Mark Millard
markmi at dsl-only.net

. . . [bad fpr related material omitted] . . .

On 2016-Feb-16, at 2:45 AM, Mark Millard <[email protected]> wrote:

I used:

> Index: /usr/src/contrib/llvm/tools/clang/lib/CodeGen/TargetInfo.cpp
> ===================================================================
> --- /usr/src/contrib/llvm/tools/clang/lib/CodeGen/TargetInfo.cpp      
> (revision 295601)
> +++ /usr/src/contrib/llvm/tools/clang/lib/CodeGen/TargetInfo.cpp      
> (working copy)
> @@ -3569,6 +3569,8 @@
> {
>   CGF.EmitBlock(UsingOverflow);
> 
> +    Builder.CreateStore(Builder.getInt8(8), NumRegsAddr);
> +
>   // Everything in the overflow area is rounded up to a size of at least 4.
>   CharUnits OverflowAreaAlign = CharUnits::fromQuantity(4);

as my test change. (Get evidence of operation before potential cleanup of the 
duplicated 8's.)

After a full buildworld/installworld based on the updated compiler. . .

My simple example of the problem no longer fails.

"ls -l -n /" now works.

"svnlite update -r295601 /usr/src" now works.

So whatever you want to do for the details of any submitted code, the basics of 
the change do avoid the SEGVs and allow these programs to work.

===
Mark Millard
markmi at dsl-only.net

On 2016-Feb-15, at 4:27 PM, Mark Millard <[email protected]> wrote:

On 2016-Feb-15, at 1:20 PM, Mark Millard <markmi at dsl-only.net> wrote:
> 
> On 2016-Feb-15, at 12:18 PM, Roman Divacky <rdivacky at vlakno.cz> wrote:
>> 
>> On Mon, Feb 15, 2016 at 12:17:50PM -0800, Mark Millard wrote:
>>> On 2016-Feb-15, at 11:11 AM, Roman Divacky <rdivacky at vlakno.cz> wrote:
>>>> 
>>>> Mark, I believe you're right. What do you think about this patch?
>>>> 
>>>> Index: tools/clang/lib/CodeGen/TargetInfo.cpp
>>>> ===================================================================
>>>> --- tools/clang/lib/CodeGen/TargetInfo.cpp (revision 260852)
>>>> +++ tools/clang/lib/CodeGen/TargetInfo.cpp (working copy)
>>>> @@ -3599,6 +3599,8 @@
>>>> {
>>>> CGF.EmitBlock(UsingOverflow);
>>>> 
>>>> +    Builder.CreateStore(Builder.getInt8(11), NumRegsAddr);
>>>> +
>>>> // Everything in the overflow area is rounded up to a size of at least 4.
>>>> CharUnits OverflowAreaAlign = CharUnits::fromQuantity(4);
>>>> 
>>>> 
>>>> Can you test it?
>>> 
>>> It may be later today before I can start the the test process.
>>> 
>>> While your change is not wrong as presented, it does seem to be based on 
>>> the ABI document's numbering with the range 3 <= gr <12, where 3 <= gr < 11 
>>> cover r3-r10 use and gr=11 implies overflow stack area use. (gr being the 
>>> ABI documents name.)
>>> 
>>> The clang code generation that I saw while analyzing the problem and the 
>>> clang source that you had me look at did not use that numbering. Instead it 
>>> seems to be based on 0 <= gpr < 9, where 0 <= gpr < 8 cover r3-r10 use and 
>>> gpr=8 implies overflow stack area use. (gpr being what gdb showed me as I 
>>> remember.) In other words: clang counts the number of "parameter registers" 
>>> already in use as it goes along instead of tracking register numbers that 
>>> have been used.
>>> 
>>> So assigning any value that appears to be positive and >= 8 should work, 
>>> such as:
>>> 
>>> Builder.CreateStore(Builder.getInt8(8), NumRegsAddr);
>> 
>> Can you check what number gcc uses? We want to be interoperable with gcc.
>> 
>> Anyway, thanks for testing!
>> 
>> Roman
> 
> I'll do that check of gcc 4.2.1 code generation before starting the test 
> later today.
> 
> But if the clang numbering is different in gcc 4.2.1 then far more than just 
> adding a
> 
>> Builder.CreateStore(Builder.getInt8(?), NumRegsAddr)
> 
> 
> for some "?" would need to be involved in the changes in order to reach 
> compatibility.
> 
> 
> I'll note that for clang 3.8.0 the actual comparison instruction generated is 
> of the form
> 
>> cmplwi  r?,7
> 
> 
> for some r?, such as r5 or r4, and the conditional branch generated is a bgt 
> instruction.
> 
> ===
> Mark Millard
> markmi at dsl-only.net

gcc 4.2.1 generates comparison instructions for va_arg of the form:

cmplwi  cr7,r0,8

and the conditional branch generated is a "bge cr7, . . ." instruction.

So the same number range is in use by both compilers: They are compatible for 
the bounds checks for reg vs. overflow for how they count, equality 
inclusion/exclusion matching up with the specific number (8 vs. 7) used to make 
things the same overall.

Other aspects of the code generation distinctions would take me time to 
analyze. It will be a while before I will be looking at other points.

===
Mark Millard
markmi at dsl-only.net

_______________________________________________
[email protected] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain
To unsubscribe, send any mail to "[email protected]"

Re: I've submitted 207175 for a clang 3.8.0 va_list handling problem for powerpc [fpr use also tested]

Reply via email to