[llvm-commits] [llvm-gcc-4.2] r40580 - in /llvm-gcc-4.2/trunk/gcc: llvm-convert.cpp toplev.c

2007-07-29 Thread Anton Korobeynikov
Author: asl
Date: Sun Jul 29 11:46:07 2007
New Revision: 40580

URL: http://llvm.org/viewvc/llvm-project?rev=40580&view=rev
Log:
Unbreak C++ FE. libstdc++ can be compiled now and even tests from 
llvm-testsuite passed!

Modified:
llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp
llvm-gcc-4.2/trunk/gcc/toplev.c

Modified: llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp
URL: 
http://llvm.org/viewvc/llvm-project/llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp?rev=40580&r1=40579&r2=40580&view=diff

==
--- llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp (original)
+++ llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp Sun Jul 29 11:46:07 2007
@@ -69,6 +69,13 @@
 
 #define ITANIUM_STYLE_EXCEPTIONS
 
+// Check for GCC bug 17347: C++ FE sometimes creates bogus ctor trees
+// which we should throw out
+#define BOGUS_CTOR(exp)\
+  (DECL_INITIAL(exp) &&\
+   TREE_CODE(DECL_INITIAL(exp)) == CONSTRUCTOR &&  \
+   !TREE_TYPE(DECL_INITIAL(exp)))
+
 
//===--===//
 //   Matching LLVM Values with GCC DECL trees
 
//===--===//
@@ -5106,7 +5113,9 @@
 // If this is an aggregate, emit it to LLVM now.  GCC happens to
 // get this case right by forcing the initializer into memory.
 if (TREE_CODE(exp) == CONST_DECL || TREE_CODE(exp) == VAR_DECL) {
-  if ((DECL_INITIAL(exp) || !TREE_PUBLIC(exp)) && GV->isDeclaration()) {
+  if ((DECL_INITIAL(exp) || !TREE_PUBLIC(exp)) && !DECL_EXTERNAL(exp) &&
+  GV->isDeclaration() &&
+  !BOGUS_CTOR(exp)) {
 emit_global_to_llvm(exp);
 Decl = DECL_LLVM(exp); // Decl could have change if it changed type.
   }
@@ -6224,7 +6233,9 @@
   // If this is an aggregate, emit it to LLVM now.  GCC happens to
   // get this case right by forcing the initializer into memory.
   if (TREE_CODE(exp) == CONST_DECL || TREE_CODE(exp) == VAR_DECL) {
-if ((DECL_INITIAL(exp) || !TREE_PUBLIC(exp)) && Val->isDeclaration()) {
+if ((DECL_INITIAL(exp) || !TREE_PUBLIC(exp)) && !DECL_EXTERNAL(exp) &&
+Val->isDeclaration() &&
+!BOGUS_CTOR(exp)) {
   emit_global_to_llvm(exp);
   // Decl could have change if it changed type.
   Val = cast(DECL_LLVM(exp));

Modified: llvm-gcc-4.2/trunk/gcc/toplev.c
URL: 
http://llvm.org/viewvc/llvm-project/llvm-gcc-4.2/trunk/gcc/toplev.c?rev=40580&r1=40579&r2=40580&view=diff

==
--- llvm-gcc-4.2/trunk/gcc/toplev.c (original)
+++ llvm-gcc-4.2/trunk/gcc/toplev.c Sun Jul 29 11:46:07 2007
@@ -2135,6 +2135,7 @@
   /* LLVM LOCAL begin */
 #ifdef ENABLE_LLVM
   llvm_lang_dependent_init(name);
+  init_eh();
   return 1; /* don't initialize the RTL backend */
 #endif
   /* LLVM LOCAL end */
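
For readers following the llvm-convert.cpp change above, the patched guard can be modelled outside of GCC. This is only a sketch: MockDecl and its fields are hypothetical stand-ins for the tree accessors named in the diff (DECL_INITIAL, TREE_CODE, TREE_TYPE, TREE_PUBLIC, DECL_EXTERNAL) and for GV->isDeclaration(); none of these names exist in GCC or llvm-gcc.

```cpp
#include <cassert>

// Hypothetical stand-in for the properties the patch reads off a decl tree.
struct MockDecl {
  bool has_initial;        // models DECL_INITIAL(exp) != 0
  bool initial_is_ctor;    // models TREE_CODE(DECL_INITIAL(exp)) == CONSTRUCTOR
  bool ctor_has_type;      // models TREE_TYPE(DECL_INITIAL(exp)) != 0
  bool is_public;          // models TREE_PUBLIC(exp)
  bool is_external;        // models DECL_EXTERNAL(exp)
  bool gv_is_declaration;  // models GV->isDeclaration()
};

// Mirrors BOGUS_CTOR: the initializer is a CONSTRUCTOR that carries no type.
bool bogus_ctor(const MockDecl &d) {
  return d.has_initial && d.initial_is_ctor && !d.ctor_has_type;
}

// Mirrors the patched condition that guards emit_global_to_llvm(exp).
bool should_emit_now(const MockDecl &d) {
  return (d.has_initial || !d.is_public) && !d.is_external &&
         d.gv_is_declaration && !bogus_ctor(d);
}

// Small self-check covering the two cases the patch newly rejects.
void demo_emit_guard() {
  MockDecl ok{true, true, true, true, false, true};
  MockDecl bogus{true, true, false, true, false, true};
  MockDecl external{true, true, true, true, true, true};
  assert(should_emit_now(ok));
  assert(!should_emit_now(bogus));     // typeless CONSTRUCTOR is thrown out
  assert(!should_emit_now(external));  // external decls are not emitted here
}
```

Under this model, the two conditions added by the patch (the DECL_EXTERNAL check and the BOGUS_CTOR check) each veto emission on their own.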


___
llvm-commits mailing list
llvm-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits


[llvm-commits] PR1146 patch for llvm-gcc-4.0

2007-07-29 Thread Reid Spencer
All,

Attached is my patch for llvm-gcc-4.0. This rearranges the attribute
processing a bit and centralizes the K&R stuff in the
FunctionTypeConverter. The methods to convert a function type now take
an extra ParamAttrsList parameter that is generated for use with
Functions and call sites. This goes in conjunction with the previous
patch to llvm for PR1146 (sent a few days ago).

This is passing tests so I thought I'd send it out for review. I'll
commit this later today if it passes muster.

Reid.
Index: gcc/llvm-backend.cpp
===
--- gcc/llvm-backend.cpp	(revision 40580)
+++ gcc/llvm-backend.cpp	(working copy)
@@ -987,10 +987,12 @@
 Function *FnEntry = TheModule->getFunction(Name);
 if (FnEntry == 0) {
   unsigned CC;
+  const ParamAttrsList *PAL = 0;
   const FunctionType *Ty = 
-TheTypeConverter->ConvertFunctionType(TREE_TYPE(decl), NULL, CC);
+TheTypeConverter->ConvertFunctionType(TREE_TYPE(decl), NULL, CC, PAL);
   FnEntry = new Function(Ty, Function::ExternalLinkage, Name, TheModule);
   FnEntry->setCallingConv(CC);
+  FnEntry->setParamAttrs(PAL);
 
   // Check for external weak linkage
   if (DECL_EXTERNAL(decl) && DECL_WEAK(decl))
Index: gcc/llvm-convert.cpp
===
--- gcc/llvm-convert.cpp	(revision 40580)
+++ gcc/llvm-convert.cpp	(working copy)
@@ -503,11 +503,12 @@
   // allows C functions declared as "T foo() {}" to be treated like 
   // "T foo(void) {}" and allows us to handle functions with K&R-style
   // definitions correctly.
+  const ParamAttrsList *PAL = 0;
   if (TYPE_ARG_TYPES(TREE_TYPE(FnDecl)) == 0) {
 FTy = TheTypeConverter->ConvertArgListToFnType(TREE_TYPE(TREE_TYPE(FnDecl)),
DECL_ARGUMENTS(FnDecl),
static_chain,
-   CallingConv);
+   CallingConv, PAL);
 #ifdef TARGET_ADJUST_LLVM_CC
 TARGET_ADJUST_LLVM_CC(CallingConv, TREE_TYPE(FnDecl));
 #endif
@@ -515,7 +516,7 @@
 // Otherwise, just get the type from the function itself.
 FTy = TheTypeConverter->ConvertFunctionType(TREE_TYPE(FnDecl),
 		static_chain,
-		CallingConv);
+		CallingConv, PAL);
   }
   
   // If we've already seen this function and created a prototype, and if the
@@ -525,6 +526,7 @@
 Fn = cast(DECL_LLVM(FnDecl));
 assert(Fn->getCallingConv() == CallingConv &&
"Calling convention disagreement between prototype and impl!");
+
 // The visibility can be changed from the last time we've seen this
 // function. Set to current.
 if (TREE_PUBLIC(FnDecl)) {
@@ -566,6 +568,9 @@
   // The function should not already have a body.
   assert(Fn->empty() && "Function expanded multiple times!");
   
+  // Assign the parameter attributes that the function should use.
+  Fn->setParamAttrs(PAL);
+
   // Compute the linkage that the function should get.
   if (!TREE_PUBLIC(FnDecl) /*|| lang_hooks.llvm_is_in_anon(subr)*/) {
 Fn->setLinkage(Function::InternalLinkage);
@@ -2535,6 +2540,7 @@
   return Res;
   }
 
+  const ParamAttrsList *PAL = 0;
   Value *Callee = Emit(TREE_OPERAND(exp, 0), 0);
 
   if (TREE_OPERAND(exp, 2)) {
@@ -2547,12 +2553,12 @@
 unsigned CallingConv;
 const Type *Ty = TheTypeConverter->ConvertFunctionType(function_type,
static_chain,
-   CallingConv);
+   CallingConv, PAL);
 Callee = CastToType(Instruction::BitCast, Callee, PointerType::get(Ty));
   }
 
   //EmitCall(exp, DestLoc);
-  Value *Result = EmitCallOf(Callee, exp, DestLoc);
+  Value *Result = EmitCallOf(Callee, exp, DestLoc, PAL);
 
   // If the function has the volatile bit set, then it is a "noreturn" function.
   // Output an unreachable instruction right after the function to prevent LLVM
@@ -2611,7 +2617,7 @@
 
 /// HandleScalarResult - This callback is invoked if the function returns a
 /// simple scalar result value.
-void HandleScalarResult(const Type *RetTy) {
+void HandleScalarResult(const Type *RetTy, tree treeTy) {
   // There is nothing to do here if we return a scalar or void.
   assert(DestLoc == 0 &&
  "Call returns a scalar but caller expects aggregate!");
@@ -2678,7 +2684,8 @@
 /// EmitCallOf - Emit a call to the specified callee with the operands specified
 /// in the CALL_EXP 'exp'.  If the result of the call is a scalar, return the
 /// result, otherwise store it in DestLoc.
-Value *TreeToLLVM::EmitCallOf(Value *Callee, tree exp, Value *DestLoc) {
+Value *TreeToLLVM::EmitCallOf(Value *Callee, tree exp, Value *DestLoc,
+  const ParamAttrsList *PAL) {
   

[llvm-commits] [llvm] r40581 - /llvm/trunk/test/CFrontend/exact-div-expr.c

2007-07-29 Thread Reid Spencer
Author: reid
Date: Sun Jul 29 13:23:22 2007
New Revision: 40581

URL: http://llvm.org/viewvc/llvm-project?rev=40581&view=rev
Log:
Be explicit about which level of optimization is being asked for. The -O option
is equivalent to -O1.

Modified:
llvm/trunk/test/CFrontend/exact-div-expr.c

Modified: llvm/trunk/test/CFrontend/exact-div-expr.c
URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CFrontend/exact-div-expr.c?rev=40581&r1=40580&r2=40581&view=diff

==
--- llvm/trunk/test/CFrontend/exact-div-expr.c (original)
+++ llvm/trunk/test/CFrontend/exact-div-expr.c Sun Jul 29 13:23:22 2007
@@ -1,5 +1,5 @@
-// RUN: %llvmgcc -S %s -o - -O | grep ashr
-// RUN: %llvmgcc -S %s -o - -O | not grep sdiv
+// RUN: %llvmgcc -S %s -o - -O1 | grep ashr
+// RUN: %llvmgcc -S %s -o - -O1 | not grep sdiv
 
 long long test(int *A, int *B) {
   return A-B;
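
As background on why this test greps for ashr and for the absence of sdiv: subtracting two int pointers divides a byte distance by sizeof(int), and that division is exact. The sketch below is an illustration only, assuming sizeof(int) == 4 and an arithmetic right shift for negative values (guaranteed since C++20, and the behaviour of mainstream compilers before that); the function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Element distance computed with a signed division (rounds toward zero).
std::int64_t elems_by_div(std::int64_t byte_diff) { return byte_diff / 4; }

// Element distance computed with an arithmetic shift (rounds toward -inf).
std::int64_t elems_by_shift(std::int64_t byte_diff) { return byte_diff >> 2; }

// The two agree whenever byte_diff is a multiple of 4, which pointer
// subtraction guarantees; they differ for a negative non-multiple.
void demo_exact_div() {
  assert(elems_by_div(12) == elems_by_shift(12));
  assert(elems_by_div(-8) == elems_by_shift(-8));
  assert(elems_by_div(-6) != elems_by_shift(-6));
}
```

The last assertion shows why the replacement is only valid for an exact division: for a negative value that is not a multiple of 4, the two roundings disagree.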




[llvm-commits] [llvm-gcc-4.0] r40582 - /llvm-gcc-4.0/trunk/gcc/llvm-types.cpp

2007-07-29 Thread Christopher Lamb
Author: clamb
Date: Sun Jul 29 18:25:53 2007
New Revision: 40582

URL: http://llvm.org/viewvc/llvm-project?rev=40582&view=rev
Log:
Add support to emit noalias attribute on function parameters when the 
__restrict qualifier is used.

Modified:
llvm-gcc-4.0/trunk/gcc/llvm-types.cpp

Modified: llvm-gcc-4.0/trunk/gcc/llvm-types.cpp
URL: 
http://llvm.org/viewvc/llvm-project/llvm-gcc-4.0/trunk/gcc/llvm-types.cpp?rev=40582&r1=40581&r2=40582&view=diff

==
--- llvm-gcc-4.0/trunk/gcc/llvm-types.cpp (original)
+++ llvm-gcc-4.0/trunk/gcc/llvm-types.cpp Sun Jul 29 18:25:53 2007
@@ -1010,6 +1010,11 @@
   else
 Attributes |= ParamAttr::SExt;
 }
+
+// Compute noalias attributes.
+if (TREE_CODE(ArgTy) == POINTER_TYPE || TREE_CODE(ArgTy) == REFERENCE_TYPE)
+  if (TYPE_RESTRICT(ArgTy))
+Attributes |= ParamAttr::NoAlias;
 
 #ifdef LLVM_TARGET_ENABLE_REGPARM
 // Allow the target to mark this as inreg.
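
To illustrate the kind of source this change affects, here is a minimal sketch of a function whose pointer parameters are __restrict qualified. The function and its name are made up for the example; __restrict is a compiler extension in C++ (accepted by GCC, Clang and MSVC), and the sketch does not show the emitted IR.

```cpp
#include <cassert>
#include <cstddef>

// dst and src are promised not to overlap for the duration of the call.
// With the change above, parameters qualified like this are the ones that
// would be marked noalias.
void scale_into(int *__restrict dst, const int *__restrict src,
                std::size_t n, int factor) {
  assert(dst && src);
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[i] * factor;
}

// Usage with two distinct arrays, which honours the no-overlap promise.
void demo_restrict() {
  int src[3] = {1, 2, 3};
  int dst[3] = {0, 0, 0};
  scale_into(dst, src, 3, 2);
  assert(dst[0] == 2 && dst[1] == 4 && dst[2] == 6);
}
```

Calling such a function with overlapping buffers would break the promise the qualifier makes, so the caller has to keep the arguments distinct.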




[llvm-commits] [llvm] r40583 - in /llvm/trunk/test: C++Frontend/2007-07-29-RestrictPtrArg.cpp C++Frontend/2007-07-29-RestrictRefArg.cpp CFrontend/2007-07-29-RestrictPtrArg.c

2007-07-29 Thread Christopher Lamb
Author: clamb
Date: Sun Jul 29 18:29:16 2007
New Revision: 40583

URL: http://llvm.org/viewvc/llvm-project?rev=40583&view=rev
Log:
Add tests for generating noalias parameter attribute from __restrict qualified 
function parameters. C++ tests are currently XFAILing see PR1582.

Added:
llvm/trunk/test/C++Frontend/2007-07-29-RestrictPtrArg.cpp
llvm/trunk/test/C++Frontend/2007-07-29-RestrictRefArg.cpp
llvm/trunk/test/CFrontend/2007-07-29-RestrictPtrArg.c

Added: llvm/trunk/test/C++Frontend/2007-07-29-RestrictPtrArg.cpp
URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/C%2B%2BFrontend/2007-07-29-RestrictPtrArg.cpp?rev=40583&view=auto

==
--- llvm/trunk/test/C++Frontend/2007-07-29-RestrictPtrArg.cpp (added)
+++ llvm/trunk/test/C++Frontend/2007-07-29-RestrictPtrArg.cpp Sun Jul 29 18:29:16 2007
@@ -0,0 +1,8 @@
+// RUN: %llvmgxx -c -emit-llvm %s -o - | llvm-dis | grep noalias
+// XFAIL: i[1-9]86|alpha|ia64|arm|x86_64|amd64
+// NOTE: This should be un-XFAILed when the C++ type qualifiers are fixed
+
+void foo(int * __restrict myptr1, int * myptr2) {
+  myptr1[0] = 0;
+  myptr2[0] = 0;
+}

Added: llvm/trunk/test/C++Frontend/2007-07-29-RestrictRefArg.cpp
URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/C%2B%2BFrontend/2007-07-29-RestrictRefArg.cpp?rev=40583&view=auto

==
--- llvm/trunk/test/C++Frontend/2007-07-29-RestrictRefArg.cpp (added)
+++ llvm/trunk/test/C++Frontend/2007-07-29-RestrictRefArg.cpp Sun Jul 29 18:29:16 2007
@@ -0,0 +1,8 @@
+// RUN: %llvmgxx -c -emit-llvm %s -o - | llvm-dis | grep noalias
+// XFAIL: i[1-9]86|alpha|ia64|arm|x86_64|amd64
+// NOTE: This should be un-XFAILed when the C++ type qualifiers are fixed
+
+void foo(int & __restrict myptr1, int & myptr2) {
+  myptr1 = 0;
+  myptr2 = 0;
+}

Added: llvm/trunk/test/CFrontend/2007-07-29-RestrictPtrArg.c
URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CFrontend/2007-07-29-RestrictPtrArg.c?rev=40583&view=auto

==
--- llvm/trunk/test/CFrontend/2007-07-29-RestrictPtrArg.c (added)
+++ llvm/trunk/test/CFrontend/2007-07-29-RestrictPtrArg.c Sun Jul 29 18:29:16 2007
@@ -0,0 +1,6 @@
+// RUN: %llvmgxx -c -emit-llvm %s -o - | llvm-dis | grep noalias
+
+void foo(int * __restrict myptr1, int * myptr2) {
+  myptr1[0] = 0;
+  myptr2[0] = 0;
+}
\ No newline at end of file




Re: [llvm-commits] Patch for X86 to use subregs

2007-07-29 Thread Evan Cheng

Sent from my iPhone

On Jul 28, 2007, at 4:36 PM, Christopher Lamb <[EMAIL PROTECTED]> wrote:




On Jul 28, 2007, at 2:26 PM, Evan Cheng wrote:

On Jul 28, 2007, at 11:52 AM, Christopher Lamb <[EMAIL PROTECTED]> wrote:




On Jul 28, 2007, at 1:48 AM, Evan Cheng wrote:


Very cool! I need to read it more carefully.


But I see you are lowering zext to a single insert_subreg. Is  
that right? It won't zero out the top part, no?


It's only lowering (zext i32 to i64) to an insert_subreg on x86-64  
where all writes to 32-bit registers implicitly zero-extend into  
the upper 32-bits.




I know. But they mismatch semantically. An insert_subreg to the lower
part should not change the upper half. I think this is only legal
for anyext.
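
The three readings being contrasted here can be sketched as plain integer operations. This is an illustration only, not code from the patch: the function names are made up, and a 64-bit integer stands in for a 64-bit register.

```cpp
#include <cassert>
#include <cstdint>

// Target-independent reading of insert_subreg into the low half: replace the
// low 32 bits of `super` with `sub` and leave the upper 32 bits untouched.
std::uint64_t insert_low32(std::uint64_t super, std::uint32_t sub) {
  return (super & 0xFFFFFFFF00000000ULL) | sub;
}

// zext i32 to i64: the upper 32 bits are defined to be zero.
std::uint64_t zext32(std::uint32_t sub) { return sub; }

// Model of an x86-64 32-bit register write: whatever the register held
// before, the upper 32 bits end up zero.
std::uint64_t x86_64_write32(std::uint64_t old_reg, std::uint32_t sub) {
  (void)old_reg;  // previous contents are discarded
  return sub;
}

void demo_subreg_semantics() {
  const std::uint64_t super = 0xDEADBEEF00000000ULL;
  // insert_subreg keeps the old upper half, so it is not a zext in general.
  assert(insert_low32(super, 7) == 0xDEADBEEF00000007ULL);
  assert(insert_low32(super, 7) != zext32(7));
  // The x86-64 32-bit write does behave like zext.
  assert(x86_64_write32(super, 7) == zext32(7));
  // anyext only constrains the low 32 bits, which both results satisfy.
  assert(static_cast<std::uint32_t>(insert_low32(super, 7)) == 7u);
  assert(static_cast<std::uint32_t>(x86_64_write32(super, 7)) == 7u);
}
```

In this model the insert only coincides with a zero extension when the super-value's upper half already happens to be zero, which is the point under dispute in the thread.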


On x86-64 the semantics of a 2 operand i32 insert_subreg is that the  
input super-value is implicitly zero. So in this sense the insert  
isn't changing the upper half, it's just that the upper half is  
being set to zero implicitly rather than explicitly. If you'll  
notice the insert_subreg is a two operand (implicit super value) not  
a three operand version. If the insert were the three operand  
version, and the super value was coming from an implicit def I'd
agree with you, but it's not.


Ok, let's step back for a second. There are a couple of issues that  
should be addressed. Plz help me understand. :)


1: Semantics of insert_subreg should be the same across all targets,  
right?


2: two operand variant of insert_subreg should mean the superreg is
undef. If you insert a value into a low part, the rest of the superreg  
is still undef.


3: why is there a two operand variant in the first place? Why not use
undef for the superreg operand?


4: what's the benefit of isel a zext to insert_subreg and then xform  
it to a 32-bit move? Why not just isel the zext to the move? It's not  
legal to coalesce it away anyway.


Evan



Also the current behavior is to use a 32-bit mov instruction for  
both zeroext and for anyext, I don't see how this is any different.



--
Chris



On Jul 28, 2007, at 12:17 AM, Christopher Lamb <[EMAIL PROTECTED]> wrote:


This patch changes the X86 back end to use the new subreg  
operations for appropriate truncate and extend operations. This  
should allow regression testing of the subreg feature going  
forward, as it's now used in a public target.


The patch passed DejaGnu and all of SingleSource on my x86  
machine, but there are changes for x86-64 as well which I  
haven't been able to test. Output assembly for x86-64 appears  
sane, but I'd appreciate someone giving the patch a try on their  
x86-64 system. Other 32-bit x86 testing is also appreciated.


Thanks
--
Christopher Lamb






--
Christopher Lamb





--
Christopher Lamb





Re: [llvm-commits] Patch for X86 to use subregs

2007-07-29 Thread Christopher Lamb


On Jul 29, 2007, at 6:20 PM, Evan Cheng wrote:



On Jul 28, 2007, at 4:36 PM, Christopher Lamb  
<[EMAIL PROTECTED]> wrote:




On Jul 28, 2007, at 2:26 PM, Evan Cheng wrote:

On Jul 28, 2007, at 11:52 AM, Christopher Lamb  
<[EMAIL PROTECTED]> wrote:




On Jul 28, 2007, at 1:48 AM, Evan Cheng wrote:


Very cool! I need to read it more carefully.


But I see you are lowering zext to a single insert_subreg. Is  
that right? It won't zero out the top part, no?


It's only lowering (zext i32 to i64) to an insert_subreg on
x86-64 where all writes to 32-bit registers implicitly
zero-extend into the upper 32-bits.




I know. But they mismatch semantically. An insert_subreg to the
lower part should not change the upper half. I think this is only
legal for anyext.


On x86-64 the semantics of a 2 operand i32 insert_subreg is that  
the input super-value is implicitly zero. So in this sense the  
insert isn't changing the upper half, it's just that the upper  
half is being set to zero implicitly rather than explicitly. If  
you'll notice the insert_subreg is a two operand (implicit super  
value) not a three operand version. If the insert were the three  
operand version, and the super value was coming from an implicit
def I'd agree with you, but it's not.


Ok, let's step back for a second. There are a couple of issues that  
should be addressed. Plz help me understand. :)


1: Semantics of insert_subreg should be the same across all  
targets, right?


I'm not certain that this should be so. x86-64 clearly has a target  
specific semantics of a 32-bit into 64-bit insert.


2: two operand variant of insert_subreg should mean the superreg is
undef. If you insert a value into a low part, the rest of the  
superreg is still undef.


I think the meaning of insert_subreg instruction (both 2 and 3  
operand versions) must have semantics specific to the target. For  
example, on x86-64 there is no valid 3 operand insert_subreg for a
32-bit value into 64-bits, because the 32-bit result is always going to
be zero extended and overwrite the upper 32-bits.


3: why is there a two operand variant in the first place? Why not
use undef for the superreg operand?


To note, the two operand variant is of the MachineInstr. The DAG form  
would be to represent the superregister as coming from an undef node,  
but this gets isel'd to the two operand MachineInstr of insert_subreg.


The reason is that undef is typically selected to an implicit def of  
a register. This causes an unnecessary move to be generated later on.  
This move can be optimized away later with more difficulty during  
subreg lowering by checking whether the input register is defined by  
an implicit def pseudo instruction, but instead I decided to perform  
the optimization during ISel on the DAG form during instruction  
selection.


With what you're suggesting
reg1024 = ...
reg1026 = insert_subreg undef, reg1024, 1
reg1027 = insert_subreg reg1026, reg1025, 1
use reg1027

would be isel'd and then subreg lowered to:

R6 = ...
implicit def R01 <= this implicit def is unnecessary
R23 = R01 <= this copy is unnecessary
R2 = R6
R45 = R23
R5 = R6
use R45

4: what's the benefit of isel a zext to insert_subreg and then  
xform it to a 32-bit move?


The xform to a 32-bit move is only the conservative behavior. The  
zext can be implicit if regalloc can coalesce subreg_inserts.


Why not just isel the zext to the move? It's not legal to coalesce  
it away anyway.


Actually it is legal to coalesce it. On x86-64 any write to a 32-bit  
register zero extends the value to 64-bits. For the insert_subreg  
under discussion the inserted value is a 32-bit result that has in
fact already been zero-extended implicitly.




Also the current behavior is to use a 32-bit mov instruction for  
both zeroext and for anyext, I don't see how this is any different.



--
Chris



On Jul 28, 2007, at 12:17 AM, Christopher Lamb  
<[EMAIL PROTECTED]> wrote:


This patch changes the X86 back end to use the new subreg  
operations for appropriate truncate and extend operations.  
This should allow regression testing of the subreg feature  
going forward, as it's now used in a public target.


The patch passed DejaGnu and all of SingleSource on my x86  
machine, but there are changes for x86-64 as well which I  
haven't been able to test. Output assembly for x86-64 appears  
sane, but I'd appreciate someone giving the patch a try on  
their x86-64 system. Other 32-bit x86 testing is also  
appreciated.


Thanks
--
Christopher Lamb






--
Christopher Lamb




Re: [llvm-commits] Patch for X86 to use subregs

2007-07-29 Thread Evan Cheng


On Jul 29, 2007, at 9:37 PM, Christopher Lamb wrote:



On Jul 29, 2007, at 6:20 PM, Evan Cheng wrote:



On Jul 28, 2007, at 4:36 PM, Christopher Lamb  
<[EMAIL PROTECTED]> wrote:




On Jul 28, 2007, at 2:26 PM, Evan Cheng wrote:

On Jul 28, 2007, at 11:52 AM, Christopher Lamb  
<[EMAIL PROTECTED]> wrote:




On Jul 28, 2007, at 1:48 AM, Evan Cheng wrote:


Very cool! I need to read it more carefully.


But I see you are lowering zext to a single insert_subreg. Is  
that right? It won't zero out the top part, no?


It's only lowering (zext i32 to i64) to an insert_subreg on
x86-64 where all writes to 32-bit registers implicitly
zero-extend into the upper 32-bits.




I know. But they mismatch semantically. An insert_subreg to the
lower part should not change the upper half. I think this is
only legal for anyext.


On x86-64 the semantics of a 2 operand i32 insert_subreg is that  
the input super-value is implicitly zero. So in this sense the  
insert isn't changing the upper half, it's just that the upper  
half is being set to zero implicitly rather than explicitly. If  
you'll notice the insert_subreg is a two operand (implicit super  
value) not a three operand version. If the insert were the three  
operand version, and the super value was coming from an implicit
def I'd agree with you, but it's not.


Ok, let's step back for a second. There are a couple of issues  
that should be addressed. Plz help me understand. :)


1: Semantics of insert_subreg should be the same across all  
targets, right?


I'm not certain that this should be so. x86-64 clearly has a target  
specific semantics of a 32-bit into 64-bit insert.


No, that won't do. insert_subreg and extract_subreg are by definition  
target independent. They must have the same semantics. You are  
forcing x86-64 32-bit zero-extending move to fit insert_subreg when  
they are really not the same thing.


2: two operand variant of insert_subreg should mean the superreg
is undef. If you insert a value into a low part, the rest of the  
superreg is still undef.


I think the meaning of insert_subreg instruction (both 2 and 3  
operand versions) must have semantics specific to the target. For  
example, on x86-64 there is no valid 3 operand insert_subreg for a  
32-bit value into 64-bits, because the 32-bit result is always  
going to be zero extended and overwrite the upper 32-bits.


It just means there is no way to implement an insert_subreg with a
single instruction under x86-64. But that is perfectly ok. Apart from  
anyext, x86-64 just isn't going to benefit from it. It's also  
impossible to read or modify the higher 32-bits.


3: why is there a two operand variant in the first place? Why not
use undef for the superreg operand?


To note, the two operand variant is of the MachineInstr. The DAG  
form would be to represent the superregister as coming from an  
undef node, but this gets isel'd to the two operand MachineInstr of  
insert_subreg.


The reason is that undef is typically selected to an implicit def  
of a register. This causes an unnecessary move to be generated  
later on. This move can be optimized away later with more  
difficulty during subreg lowering by checking whether the input  
register is defined by an implicit def pseudo instruction, but  
instead I decided to perform the optimization during ISel on the  
DAG form during instruction selection.


With what you're suggesting
reg1024 = ...
reg1026 = insert_subreg undef, reg1024, 1
reg1027 = insert_subreg reg1026, reg1025, 1
use reg1027

would be isel'd and then subreg lowered to:

R6 = ...
implicit def R01 <= this implicit def is unnecessary


That's a pseudo instruction, it doesn't cost anything.


R23 = R01 <= this copy is unnecessary


It can be coalesced to:
R23 = undef


R2 = R6
R45 = R23
R5 = R6
use R45


Using undef explicitly is the right way to go. There is a good reason
it's there. Having the two operand version of insert_subreg that
implicitly uses an undef value doesn't fit into the overall llvm
philosophy.



4: what's the benefit of isel a zext to insert_subreg and then  
xform it to a 32-bit move?


The xform to a 32-bit move is only the conservative behavior. The  
zext can be implicit if regalloc can coalesce subreg_inserts.


Why not just isel the zext to the move? It's not legal to coalesce  
it away anyway.


Actually it is legal to coalesce it. On x86-64 any write to a
32-bit register zero extends the value to 64-bits. For the
insert_subreg under discussion the inserted value is a 32-bit
result that has in fact already been zero-extended implicitly.


It's not legal to coalesce away the 32-bit zero extending move.

Suppose RAX contains some value with top 32-bits non-zero.
mov EAX, EAX (zero extend top bits)
use RAX (expecting top bits to be zero)

Coalescing away the move is a miscompilation.
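
The RAX example above can be modelled with plain 64-bit integers. This is a sketch for illustration, not compiler code: the function names are made up, and the only hardware behaviour assumed is the one stated in the thread, namely that a 32-bit register write on x86-64 clears the upper 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Model of `mov EAX, EAX`: the low 32 bits are kept and the upper 32 bits
// of RAX become zero.
std::uint64_t mov_eax_eax(std::uint64_t rax) { return rax & 0xFFFFFFFFULL; }

// A use of RAX that expects the upper half to be zero.
std::uint64_t use_rax(std::uint64_t rax) { return rax; }

// The sequence as written, and the sequence with the move removed.
std::uint64_t with_move(std::uint64_t rax) { return use_rax(mov_eax_eax(rax)); }
std::uint64_t move_coalesced_away(std::uint64_t rax) { return use_rax(rax); }

void demo_coalescing() {
  // Upper half non-zero: dropping the move changes what the use observes.
  const std::uint64_t dirty = 0x1234567800000042ULL;
  assert(with_move(dirty) == 0x42ULL);
  assert(move_coalesced_away(dirty) == dirty);
  assert(with_move(dirty) != move_coalesced_away(dirty));
  // Upper half already zero: the move is redundant and both agree.
  const std::uint64_t clean = 0x42ULL;
  assert(with_move(clean) == move_coalesced_away(clean));
}
```

In this model the move is only removable when the value is already known to have a zero upper half, which is the extra fact a coalescer would need before dropping it.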

Evan




Also the current behavior is to use a 32-bit mov instruction for  
both zeroext and for anyext, I don't see how