[llvm-commits] llvm-gcc: better support for variable size struct fields

Duncan Sands Tue, 06 Mar 2007 08:26:33 -0800

This patch applies on top of the previously posted patch
"llvm-gcc: use component_ref_field_offset in component references",
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070226/045399.html.


This fixes wrong handling of structs containing more than one
variable sized field.  Two C testcases are attached.  Consider
for example the struct from VarSizeInStruct1:
  struct f { char w; char x[n]; char z[]; };
DecodeStructFields does the following: it creates a field for
w; it skips x; it doesn't skip z because it is the last field,
however it thinks it starts at byte offset 0 because it starts
at the variable offset n+1, and so pops w out of the struct
and replaces it with z, leading to the LLVM struct { [0 * i8] }.
This causes an assertion failure later on in EmitLV_COMPONENT_REF.
The other testcase is similar, and leads to a different assertion
failure.

I first prepared a minimal fix, but later noticed that the code for
handling variable sized struct fields was unnecessarily complicated
and came up with this more involved patch, which removes a bunch of
special casing.  The existing code seems rather fixated on whether
TYPE_SIZE is constant or not, when what really matters is whether the
field starts at a constant offset.  I simply skip over fields if and
only if they don't start at a constant offset, and let variable sized
fields be emitted and indexed like any others (no more ~0U field indices).
I remove the special casing for ~0U field indices everywhere in favour
of the generic code, for example in EmitLV_COMPONENT_REF.  The only
place where something still needs to be done is in ConvertRecordCONSTRUCTOR,
but there too the code simplifies.

I see no additional failures in the regression test suite or in
MultiSource with these changes.

Enjoy!

Duncan.

Index: gcc.llvm.master/gcc/llvm-convert.cpp
===================================================================
--- gcc.llvm.master.orig/gcc/llvm-convert.cpp	2007-03-05 13:05:41.000000000 +0100
+++ gcc.llvm.master/gcc/llvm-convert.cpp	2007-03-05 14:55:33.000000000 +0100
@@ -4635,12 +4635,6 @@
     assert(DECL_LLVM_SET_P(FieldDecl) && "Struct not laid out for LLVM?");
     ConstantInt *CI = cast<ConstantInt>(DECL_LLVM(FieldDecl));
     uint32_t MemberIndex = CI->getZExtValue();
-    if (MemberIndex == ~0U) {
-      assert(isStructWithVarSizeArrayAtEnd(StructTy) &&
-             "Isn't var sized array access!");
-      CI = ConstantInt::get(Type::Int32Ty, StructTy->getNumContainedTypes()-1);
-      MemberIndex = CI->getZExtValue();
-    }
     assert(MemberIndex < StructTy->getNumContainedTypes() &&
            "Field Idx out of range!");
     FieldPtr = new GetElementPtrInst(StructAddrLV.Ptr,
@@ -5451,35 +5445,32 @@
       // If not, things are much simpler.
       assert(DECL_LLVM_SET_P(Field) && "Struct not laid out for LLVM?");
       unsigned FieldNo = cast<ConstantInt>(DECL_LLVM(Field))->getZExtValue();
-      
+      assert(FieldNo < ResultElts.size() && "Invalid struct field number!");
+
+      // Example: struct X { int A; char C[]; } x = { 4, "foo" };
+      assert(TYPE_SIZE(TREE_TYPE(Field)) ||
+             (FieldNo == ResultElts.size()-1 &&
+              isStructWithVarSizeArrayAtEnd(STy))
+             && "field with no size is not array at end of struct!");
+
       // If this is an initialization of a global that ends with a variable
       // sized array at its end, and the initializer has a non-zero number of
-      // elements, we must handle this case now.  In this case, FieldNo is ~0U
-      // and Val contains the actual type for the array.  
-      if (FieldNo == ~0U) {
-        // Handle: struct X { int A; char C[]; } x = { 4, "foo" };
-        assert(isStructWithVarSizeArrayAtEnd(STy) &&
-               "Struct doesn't end with variable sized array!");
-        FieldNo = STy->getNumElements()-1;
-        ResultElts[FieldNo] = Val;
-      } else {
-        assert(FieldNo < ResultElts.size() && "Invalid struct field number!");
-
-        // Otherwise, we know that the initializer has to match the element type
-        // of the LLVM structure field.  If not, then there is something that is
-        // not straight-forward going on.  For example, we could be initializing
-        // an unaligned integer field (e.g. due to attribute packed) with an 
-        // integer.  The struct field will have type [4 x ubyte] instead of
-        // "int" for example.  If we ignored this, we would lay out the
-        // initializer wrong.
-        if (Val->getType() != STy->getElementType(FieldNo))
-          Val = ConvertStructFieldInitializerToType(Val, 
+      // elements, then Val contains the actual type for the array.  Otherwise,
+      // we know that the initializer has to match the element type of the LLVM
+      // structure field.  If not, then there is something that is not
+      // straight-forward going on.  For example, we could be initializing an
+      // unaligned integer field (e.g. due to attribute packed) with an
+      // integer.  The struct field will have type [4 x ubyte] instead of
+      // "int" for example.  If we ignored this, we would lay out the
+      // initializer wrong.
+      if (TYPE_SIZE(TREE_TYPE(Field)) &&
+          Val->getType() != STy->getElementType(FieldNo))
+        Val = ConvertStructFieldInitializerToType(Val,
                                                   STy->getElementType(FieldNo));
 
-        ResultElts[FieldNo] = Val;
-      }
+      ResultElts[FieldNo] = Val;
     }
-    
+
     NextField = TREE_CHAIN(Field);
   }
   
@@ -5691,31 +5682,18 @@
     ConstantInt *CI = cast<ConstantInt>(DECL_LLVM(FieldDecl));
     uint64_t MemberIndex = CI->getZExtValue();
     
-    if (MemberIndex  != ~0U) {
-      std::vector<Value*> Idxs;
-      Idxs.push_back(Constant::getNullValue(Type::Int32Ty));
-      Idxs.push_back(CI);
-      FieldPtr = ConstantExpr::getGetElementPtr(StructAddrLV, &Idxs[0], 
-                                                Idxs.size());
-      
-      // Now that we did an offset from the start of the struct, subtract off
-      // the offset from BitStart.
-      if (MemberIndex) {
-        const StructLayout *SL = TD.getStructLayout(cast<StructType>(StructTy));
-        BitStart -= SL->getElementOffset(MemberIndex) * 8;
-      }
-    } else {
-      // We were unable to make a nice offset, emit an ugly one.
-      Constant *Offset = Convert(field_offset);
-      FieldPtr = ConstantExpr::getPtrToInt(StructAddrLV, Offset->getType());
-      FieldPtr = ConstantExpr::getAdd(FieldPtr, Offset);
-      FieldPtr = ConstantExpr::getIntToPtr(FieldPtr, PointerType::get(FieldTy));
-      
-      // Do horrible pointer arithmetic to get the address of the field.
-      unsigned ByteOffset = TREE_INT_CST_LOW(field_offset);
-      BitStart -= ByteOffset * 8;
-    }
+    std::vector<Value*> Idxs;
+    Idxs.push_back(Constant::getNullValue(Type::Int32Ty));
+    Idxs.push_back(CI);
+    FieldPtr = ConstantExpr::getGetElementPtr(StructAddrLV, &Idxs[0],
+                                              Idxs.size());
     
+    // Now that we did an offset from the start of the struct, subtract off
+    // the offset from BitStart.
+    if (MemberIndex) {
+      const StructLayout *SL = TD.getStructLayout(cast<StructType>(StructTy));
+      BitStart -= SL->getElementOffset(MemberIndex) * 8;
+    }
   } else {
     Constant *Offset = Convert(field_offset);
     Constant *Ptr = ConstantExpr::getPtrToInt(StructAddrLV, Offset->getType());
Index: gcc.llvm.master/gcc/llvm-types.cpp
===================================================================
--- gcc.llvm.master.orig/gcc/llvm-types.cpp	2007-03-05 13:05:20.000000000 +0100
+++ gcc.llvm.master/gcc/llvm-types.cpp	2007-03-05 15:11:38.000000000 +0100
@@ -972,8 +972,11 @@
       while (CurFieldNo < ElementOffsetInBytes.size() &&
              getFieldEndOffsetInBytes(CurFieldNo)*8 <= FieldOffsetInBits)
         ++CurFieldNo;
-      if (CurFieldNo >= ElementOffsetInBytes.size()) return ~0U;
-      return CurFieldNo;
+      if (CurFieldNo < ElementOffsetInBytes.size())
+        return CurFieldNo;
+      // Otherwise, we couldn't find the field!
+      assert(0 && "Could not find field!");
+      return ~0U;
     }
 
     // Handle zero sized fields now.  If the next field is zero sized, return
@@ -1019,28 +1022,15 @@
   return Result;
 }
 
-/// isNotLastField - Return true if this is not the last field in its record.
-///
-static bool isNotLastField(tree Field) {
-  for (Field = TREE_CHAIN(Field); Field; Field = TREE_CHAIN(Field))
-    if (TREE_CODE(Field) == FIELD_DECL)
-      return true;
-  return false;
-}
-
-
 /// DecodeStructFields - This method decodes the specified field, if it is a
 /// FIELD_DECL, adding or updating the specified StructTypeConversionInfo to
 /// reflect it.  
 void TypeConverter::DecodeStructFields(tree Field,
                                        StructTypeConversionInfo &Info) {
   if (TREE_CODE(Field) != FIELD_DECL ||
-      // Don't include variable sized fields, doing so would break anything
-      // after them... unless this is the last field.
-      ((TYPE_SIZE(TREE_TYPE(Field)) == 0 ||
-        TREE_CODE(TYPE_SIZE(TREE_TYPE(Field))) != INTEGER_CST) &&
-       isNotLastField(Field))) return;
-  
+      TREE_CODE(DECL_FIELD_OFFSET(Field)) != INTEGER_CST)
+    return;
+
   // Handle bit-fields specially.
   if (DECL_BIT_FIELD_TYPE(Field)) {
     DecodeStructBitField(Field, Info);
@@ -1228,12 +1218,8 @@
   // variable offset.
   unsigned CurFieldNo = 0;
   for (tree Field = TYPE_FIELDS(type); Field; Field = TREE_CHAIN(Field))
-    if (TREE_CODE(Field) == FIELD_DECL) {
-      // If this field comes after a variable sized element, stop trying to
-      // assign DECL_LLVM's.
-      if (TREE_CODE(DECL_FIELD_OFFSET(Field)) != INTEGER_CST)
-        break;
-      
+    if (TREE_CODE(Field) == FIELD_DECL &&
+        TREE_CODE(DECL_FIELD_OFFSET(Field)) == INTEGER_CST) {
       unsigned FieldOffsetInBits = getFieldOffsetInBits(Field);
       tree FieldType = TREE_TYPE(Field);
       
@@ -1251,10 +1237,9 @@
       
       // Figure out if this field is zero bits wide, e.g. {} or [0 x int].  Do
       // not include variable sized fields here.
-      bool isZeroSizeField = 
-        TYPE_SIZE(FieldType) && TREE_CODE(TYPE_SIZE(FieldType)) == INTEGER_CST&&
-        TREE_INT_CST_LOW(TYPE_SIZE(FieldType)) == 0;
-        
+      bool isZeroSizeField = !TYPE_SIZE(FieldType) ||
+        integer_zerop(TYPE_SIZE(FieldType));
+
       unsigned FieldNo = 
         Info.getLLVMFieldFor(FieldOffsetInBits, CurFieldNo, isZeroSizeField);
       SET_DECL_LLVM(Field, ConstantInt::get(Type::Int32Ty, FieldNo));

// RUN: %llvmgcc %s -S -o -
void* p (int n) {
  struct f {
    char w; char x[n]; char z[];
  } F;
  F.x[0]='x';
  return &F;
}

// RUN: %llvmgcc %s -S -o -
char p (int n) {
  struct f {
    char w; char x[n]; char y[n];
  } F;

  return F.x[0];
}

_______________________________________________
llvm-commits mailing list
llvm-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

[llvm-commits] llvm-gcc: better support for variable size struct fields

Reply via email to