On Thu, 2016-06-23 at 18:03 -0700, Francisco Jerez wrote: > Jan Vesely <jan.ves...@rutgers.edu> writes: > > > On Wed, 2016-06-22 at 20:22 -0700, Francisco Jerez wrote: > > > Jan Vesely <jan.ves...@rutgers.edu> writes: > > > > > > > On Wed, 2016-06-22 at 17:07 -0700, Francisco Jerez wrote: > > > > > Jan Vesely <jan.ves...@rutgers.edu> writes: > > > > > > > > > > > On Mon, 2016-06-13 at 17:24 -0700, Francisco Jerez wrote: > > > > > > > Serge Martin <edb+m...@sigluy.net> writes: > > > > > > > > > > > > > > > This fix getting the size of a struct arg. vec3 types > > > > > > > > still > > > > > > > > work > > > > > > > > ok. > > > > > > > > Only buit-in args need to have power of two alignment, > > > > > > > > getTypeAllocSize > > > > > > > > reports the correct size. > > > > > > > > --- > > > > > > > > src/gallium/state_trackers/clover/llvm/invocation.cpp > > > > > > > > | 3 > > > > > > > > ++- > > > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > diff --git > > > > > > > > a/src/gallium/state_trackers/clover/llvm/invocation.cpp > > > > > > > > b/src/gallium/state_trackers/clover/llvm/invocation.cpp > > > > > > > > index 03487d6..9af51539 100644 > > > > > > > > --- > > > > > > > > a/src/gallium/state_trackers/clover/llvm/invocation.cpp > > > > > > > > +++ > > > > > > > > b/src/gallium/state_trackers/clover/llvm/invocation.cpp > > > > > > > > @@ -472,7 +472,8 @@ namespace { > > > > > > > > // aligned to the next larger power of > > > > > > > > two". We > > > > > > > > need > > > > > > > > this > > > > > > > > // alignment for three element vectors, which > > > > > > > > have > > > > > > > > // non-power-of-2 store size. > > > > > > > > - const unsigned arg_api_size = > > > > > > > > util_next_power_of_two(arg_store_size); > > > > > > > > + const unsigned arg_api_size = arg_type- > > > > > > > > > isStructTy() > > > > > > > > ? > > > > > > > > + arg_store_size : > > > > > > > > util_next_power_of_two(arg_store_size); > > > > > > > > > > > > > > > Hm... Isn't this still going to be broken if you pass a > > > > > > > struct > > > > > > > argument > > > > > > > to a kernel function and the alignment of any of the > > > > > > > struct > > > > > > > members > > > > > > > doesn't match the target-specific data layout? Not sure > > > > > > > we > > > > > > > can > > > > > > > fix > > > > > > > this > > > > > > > sensibly without requiring the target's data layout to > > > > > > > match > > > > > > > the > > > > > > > CL > > > > > > > API > > > > > > > exactly. Any suggestions Tom? > > > > > > > > > > > > according to 6.7.2.1 compilers can arbitrarily insert > > > > > > padding > > > > > > between > > > > > > struct members (except at the beginning). > > > > > > > > > > What spec version are you looking at? My CL spec doesn't > > > > > have > > > > > any > > > > > section labeled 6.7.2.1. > > > > > > > > c99 specs, I did not find anything specific for CLC (it might > > > > be > > > > that I > > > > just need to look harder). CLC 2.0 adds additional constraint > > > > that > > > > you > > > > can't use address space qualifiers. > > > > > > > > > > I'd expect that whatever the CL spec says regarding the memory > > > layout > > > of > > > CLC types (e.g. section 6.1.5 which specifies the usual alignment > > > rules > > > for CL types and section 6.11.1 and 6.11.3 which specify various > > > variable and type declaration attributes giving finer control > > > over > > > the > > > alignment of variable and struct member declarations) fully > > > overrides > > > the C99 spec. > > > > Right, even if we consider that none of the C99 6.7.2.1 apply (and > > at > > least CL2.0 6.5.6 does not make it sound so), it only gives us one > > side, we can check that the CLC struct layout follows what we would > > expect. We don't have means to check and enforce that the host side > > struct layout is compatible. > > > > Yes, exactly, the CL spec doesn't have anything to say about the > host-side memory layout, that's up to the host platform's ABI to > define. > > > > > > > > > > > > > > > Even if size/alignment of individual members match CL API > > > > > > exactly, > > > > > > there's no guarantee that the structure layout/size will be > > > > > > the > > > > > > same. > > > > > > > > > > > How can you exchange structured data with a CL kernel then, > > > > > assuming > > > > > that the layout of structure types in memory is fully > > > > > unspecified > > > > > as > > > > > you > > > > > say? > > > > > > > > that is my point. My understanding is that it relies on a > > > > silent > > > > assumption that both CLC and the host compiler will create the > > > > same > > > > structure layout given the same structure elements. > > > > > > > > big endian host can create: > > > > struct foo { > > > > cl_int a; > > > > // 16 bit padding; > > > > cl_short b; > > > > cl_int c; > > > > }; > > > > > > I don't think this is a valid representation of the structure > > > according > > > to CL rules, my understanding is if your host happens to lay out > > > structure fields in this way you have to either marshal things > > > manually > > > or use compiler-specific attributes to get the host compiler to > > > put > > > things at the right location (according to CL API rules). I > > > believe > > > that Khronos' cl_platform.h specifies alignment attributes in the > > > cl_* > > > host-side typedefs specifically for this purpose. > > > > maybe I'm missing something, but what rule does it break? shorts > > are by > > default 2 byte aligned, ints are 4 byte aligned, they cannot be > > reordered, so there needs to be a 2 byte hole somewhere. is there a > > rule that requires members to be at the nearest aligned location? > > That's how it works on most CPU ABIs I'm aware of, you take the first > available offset compatible with the alignment requirements of the > element datatype. I'm not sure if the CL spec defines the memory > layout > of structure types to that level of detail, but I would assume that's > by > omission rather than it being intentionally unspecified. > > > At least on the host side the compiler can insert stuff like '60 > > bytes > > padding to avoid false sharing'. > > That should also be a valid transformation in a (device-side) CL > program > if the compiler can prove that the memory layout of a struct type is > never visible by the host, so none of the (more or less explicitly > specified) CL ABI applies. The same goes for a host-side C compiler, > it > can lay out the program's data structures any way it likes as long as > the externally visible C ABI of the target platform is preserved. > > > If the application programmer needs to use specific attributes to > > get > > compatible layout we have no way to check that, other than checking > > the provided size. > > > > what I'm trying to says is; does it make sense to have stricter > > checks > > on device side if we can't enforce them on the host side? > > > We can't but the application can as long as any binary data exchanged > through the CL API follows some well-specified and device-independent > memory layout. It's the application's responsibility to make sure > that > any binary blobs passed to the CL API comply with the CL data layout: > one (somewhat non-portable) possibility is to rely on the host C > compiler to lay out your data in the same way the CL API expects by > using compiler-specific type annotations, but the CL implementation > cannot give you any guarantees that it will be possible to achieve > that > on all platforms, for maximum portability the application can always > marshal any data exchanged with CL kernels by hand.
makes sense, thanks for detailed reply. A lot of the stuff falls into "implicit assumptions" category for me. I knew that shared data needed to have rules for interoperability, but I expected they might be more explicit (similar to how endianness of pointers is handled) thanks again, Jan > > > Jan > > > > > > > > > while little endian device could create: > > > > struct foo { > > > > int a; > > > > short b; > > > > // 16 bit padding > > > > int c; > > > > }; > > > > > > > > If cl_short/short alignment is 2bytes, the above structures and > > > > all > > > > the > > > > members have the same size/alignment, yet are not compatible. > > > > > > > > Am I missing something that would prevent the above? > > > > > > > > Jan > > > > > > > > > > > > > > > Jan > > > > > > > > > > > > > > > > > > > > > llvm::Type *target_type = arg_type- > > > > > > > > >isIntegerTy() > > > > > > > > ? > > > > > > > > TD.getSmallestLegalIntType(mod- > > > > > > > > > getContext(), > > > > > > > > arg_store_size * 8) > > > > > > > > -- > > > > > > > > 2.5.5 > > > > > > -- > > > > > > > > > > > > Jan Vesely <jan.ves...@rutgers.edu> > > > > -- > > > > > > > > Jan Vesely <jan.ves...@rutgers.edu> > > -- > > > > Jan Vesely <jan.ves...@rutgers.edu> -- Jan Vesely <jan.ves...@rutgers.edu>
signature.asc
Description: This is a digitally signed message part
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev