b-sumner added a comment. In https://reviews.llvm.org/D37822#877903, @Anastasia wrote:
> In https://reviews.llvm.org/D37822#877572, @yaxunl wrote: > > > In https://reviews.llvm.org/D37822#873876, @Anastasia wrote: > > > > > In https://reviews.llvm.org/D37822#872446, @yaxunl wrote: > > > > > > > In https://reviews.llvm.org/D37822#872291, @Anastasia wrote: > > > > > > > > > Could you please explain a bit more why the alignment have to be put > > > > > explicitly in the struct? I am just not very convinced this is > > > > > general enough. > > > > > > > > > > > > The captured variables are fields of the block literal struct. Due to > > > > alignment requirement of these fields, there is alignment requirement of > > > > the block literal struct. The ISA of the block invoke function is > > > > generated with the assumption of these alignments. If the block literal > > > > is > > > > allocated at a memory address not satisfying the alignment > > > > requirement, the kernel behavior is undefined. > > > > > > > > Generally, __enqueue_kernel library function needs to prepare the > > > > kernel argument before launching the kernel. It usually does this by > > > > copying > > > > the block literal to some buffer then pass the address of the buffer > > > > to the kernel. Then the address of the buffer has to satisfy the > > > > alignment > > > > requirement. > > > > > > > > If this block literal struct is not general enough, how about add > > > > another field as target reserved size, and leave the remaining space of > > > > header for > > > > target specific use. And add a target hook to allow target fill the > > > > reserved space, e.g. > > > > > > > > struct __opencl_block_literal { > > > > int total_size; > > > > int align; > > > > __generic void *invoke; > > > > int target_reserved_size; /* round up to 4 bytes */ > > > > int target_reserved[]; > > > > /* captures */ > > > > }; > > > > > > > > > > > > > I like the idea of the target reserved part actually. But not sure how it > > > could be used without adding any target specific methods? > > > > > > If we decide to add target reserved fields, I can add target hooks to fill > > these fields. However I would suggest to leave this for future since I > > don't see there is need for other fields for now. > > > I could imagine it can be usefull for some vendor implementations. > > >> However, I am still not clear why the alignment of this struct has to be > >> different from any other struct Clang produces. Normally the alignment of > >> objects have to be known during IR generation to put them correctly in the > >> attributes of generated alloca, store and loads. But as a field inside > >> struct I don't know how it can be useful. I would imagine `enqueue_kernel` > >> would just operate on the block as if it would be an arbitrary buffer of > >> data. Also would size of the struct not account for any padding to make > >> sure the alignment can be deduced based on it correctly? > > > > enqueue_kernel needs to pass the block struct to the kernel. Let's assume > > it does this by copying the block struct to a buffer. If enqueue_kernel > > does not know the alignment of the struct, it can only put it at an > > arbitrary address in the buffer. Then the kernel has to copy the struct to > > an aligned private memory and load the fields. However, if the > > enqueued_kernel knows the alignment of the struct, it can put it at an > > address satisfying the alignment. Then the kernel can load the fields > > directly from the buffer, skips the step of copying to an aligned private > > memory. Therefore, alignment of the block struct is usually a useful > > information for enqueue_kernel. I think that's why in the SPIRV spec > > OpEnqueueKernel requires an alignment operand for the block context. > > Ok, I just think in C if you use `malloc` to obtain a pointer to some memory > location it doesn't take any alignment information. Then you can use the > pointer to copy any data including the struct into the location its pointed > to. And the pointer can be used later on correctly. I think the alignment is > deduced in this case from the type or the size of an object. Do you know > where the alignment information is used for SPIRV call? Also how is the block > represented in SPIRV? Actually malloc alignment is not sufficient more many uses such as CPU supported vectors, e.g. AVX512 or passed to create buffer with use-host-pointer. In such cases you need posix_memalign or some similar API. Having the alignment means it is available if needed. If an implementation doesn't need it, there is no harm is there? https://reviews.llvm.org/D37822 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits