On Thu, 2014-04-24 at 17:53 +0200, Francisco Jerez wrote: > Jan Vesely <jan.ves...@rutgers.edu> writes: > > > On Thu, 2014-04-24 at 12:05 +0200, Francisco Jerez wrote: > > <snip> > >> >>> > >> >>> I don't think that using getTypeAllocSize() instead of > >> >>> getTypeStoreSize() to calculate clover::argument::size would be a > >> >>> satisfactory solution. By doing that you'd redefine the API argument > >> >>> size exposed to the host for *all* argument types to be the > >> >>> device-dependent aligned size, which is definitely not what you want. > >> >> > >> >>> AFAIK 3-element vectors are an exception because they are the only > >> >>> types > >> >>> that are defined to have a different API size from their actual usable > >> >>> size, so they probably deserve special handling in invocation.cpp (as > >> >>> you did in your first patch). As the API size is target-independent I > >> >>> don't think that the fix belongs in Clang or LLVM, Clover is likely at > >> >>> fault. > >> >> > >> >> The way I understood the ch 6.1.5 is that both API and OpenCL C 3 > >> >> element vectors are required to be 4*sizeof(component). So a > >> >> sizeof(float3) == sizeof(cl_float3) == 16, and should be both host and > >> >> target independent. > >> > > >> > Well, sizeof() is supposed to take into account the alignment, so this > >> > should be the case already. > > > > AFAIK sizeof only includes padding. I explain below. > > > > Yes, and 3-component vectors are being padded to 4 components already. > > >> > > >> >> That's why clang (or more precisely libclc) looked like a correct > >> >> place. > >> >> > >> > > >> > Right. I guess the other possibility would be to redefine cl_float3 as > >> > cl_float4 in libclc. > >> > >> Sorry, I meant to redefine "gentype3" as "gentype4" for every vector > >> element type "gentype". > >> > >> > You mentioned that it had undesired consequences. Which exactly? > > > > With this change: > > -typedef __attribute__((ext_vector_type(3))) float float3; > > +typedef __attribute__((ext_vector_type(4))) float float3; > > > > errors about duplicate ops (since type3 and type4 are now the same type) > > can be fixed. However, I don't know a clean way to fix the following: > > > > error: too few elements in vector initialization (expected 4 elements, > > have 3) > > > > Even if we can get rid of it, or reduce it to warning. > > having the same type for type3 and type4 causes problem that we won't be > > able to distinguish following situations: > > > > flaot4 A > > > > float3 B = A.xyz; // This is OK, should not produce warning/error; > > > > float4 C = A.xyz; // This should at least produce a warning. even if we > > allow using fewer elements for vector initialization. > > > > Right, I agree that this wouldn't be a good idea. > > > > >> > > >> >> I understand that target device can have stricter alignment rules, but I > >> >> don't see how it can have different type sizes (my reading of the specs > >> >> is that these are binding for the target as well). > >> >> > >> > > >> > In theory the sizes are binding for most built-in types, but R600 > >> > doesn't support certain integer types so clover has code to take into > >> > account the differing size, alignment and endianness between host and > >> > device. I guess that in most cases it would be possible to use the > >> > "official" ABI for passing kernel arguments to the device (and that's a > >> > requirement for your solution using DataLayout::getTypeAllocSize() to > >> > work reliably, otherwise you'll be taking into account device-specific > >> > padding), but you'll have to fix the R600 back-end so it's able to deal > >> > with (i.e. lower) all built-in types specified by OpenCL. I think that > >> > this would be useful on its own, Tom should know better than I how > >> > difficult it would be. > > > > I didn't know about the additional R600 type size magic, in this case I > > agree that the original approach (detect 3elem vectors and change size) > > is probably the best in this situation. > > > >> > > >> >> I can resend the original patch with debug output replaced by a comment. > >> >> > >> > A slightly more general solution than what you did could be to align the > >> > store size of scalar and vector arguments to the next power of two to > >> > calculate the API size (in particular, that would make 3- and 4-element > >> > vectors have the same size). From 6.1.5: "A built-in data type that is > >> > not a power of two bytes in size must be aligned to the next larger > >> > power of two." > > > > This part is what made me think that the _alignment_ only restricts > > starting address (and not size). > > It also restricts the size. From section 6.3 on the sizeof operator: > "The sizeof operator yields the size (in bytes) of its operand, > including any padding bytes (refer to section 6.1.5) needed for > alignment [...]. When applied to an operand that has structure or union > type, the result is the total number of bytes in such an object, > including internal and trailing padding." > > > Otherwise, the addition of 3 element vector special case would not be > > necessary. > > Maybe it wouldn't be strictly necessary, but it's still clarifying as an > illustrative corollary of the general rule. > > > i.e I think that the > > following mem layout is legal in OCL 1.0 but not in OCL1.1+ > > > > 0x10: float3 here > > 0x1c: int here // this should belong to float3 in ocl 1.1+ > > Does OpenCL 1.0 support 3-component vectors at all?
ah, I missed that. > > > > > while the following is illegal in both: > > > > 0x8: float3 here > > > > So, I don't think that the quoted text requires all builtin types to > > have 2^x size (although all but 3 element vectors do). > > > > Well, don't interpret it that way if you don't want to, but all OpenCL > built-in types do (or at least the ones you are allowed to use as kernel > argument types), so this solution seems more elegant to me than > special-casing 3-component vectors. with no 3 element vectors in ocl1.0 and the sizeof clarification. it all makes sense to me. Thanks for the explanation. I'll post a patch that checks the power-of-two rule. thanks again, Jan > > Thanks. > > > regards, > > Jan > > > > > >> > > >> >> regards, > >> >> Jan > >> >> > >> > > >> > Thanks. > >> > > >> >>> > >> >>> Thanks. > >> >>> > >> >>> > regards, > >> >>> > Jan > >> >>> > > >> >>> > > >> >>> > -- > >> >>> > Jan Vesely <jan.ves...@rutgers.edu> > >> >>> > _______________________________________________ > >> >>> > mesa-dev mailing list > >> >>> > mesa-dev@lists.freedesktop.org > >> >>> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > >> >> > >> >> -- > >> >> Jan Vesely <jan.ves...@rutgers.edu> -- Jan Vesely <jan.ves...@rutgers.edu>
signature.asc
Description: This is a digitally signed message part
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev