mshockwave added a comment.

> From this picture I don't see how the flattening itself can help us to avoid
> using global memory. Surely in both cases the captures content will have to
> be copied into the memory accessible for the enqueued kernel (which is a
> global memory in a general case, but doesn't have to be in some cases I am
> guessing). Perhaps I am missing some extra step in the approach you are
> proposing. If you rely on the parameter passing using normal function call
> into the block_invoke then in both cases we can skip the memcpy of captures
> at all. Otherwise both approaches will need to make a copy of the captures.
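
To make the two copy strategies under discussion concrete, here is a minimal sketch in plain C (not OpenCL C). It is only an illustration: the struct layout and the names `block_literal_sketch`, `copy_whole_literal` and `copy_captures_only` are hypothetical, and they do not reflect Clang's actual block ABI or the code in this patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified block literal: header fields, then captures. */
struct block_literal_sketch {
    void *isa;                /* header fields the enqueued kernel may not need */
    int flags;
    int reserved;
    void (*invoke)(void *);
    int cap_a;                /* captured variables start here (assumed layout) */
    float cap_b;
};

/* Strategy 1: copy the entire literal into the destination buffer. */
size_t copy_whole_literal(void *dst, const struct block_literal_sketch *bl) {
    memcpy(dst, bl, sizeof *bl);
    return sizeof *bl;
}

/* Strategy 2: copy only the captures, starting at the first capture's offset. */
size_t copy_captures_only(void *dst, const struct block_literal_sketch *bl) {
    size_t off = offsetof(struct block_literal_sketch, cap_a);
    size_t len = sizeof *bl - off;   /* includes any trailing padding */
    memcpy(dst, (const char *)bl + off, len);
    return len;
}
```

Under these assumptions both strategies still perform a copy into the kernel-accessible buffer; the second one only reduces the number of bytes copied.
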
Now I agree with the necessity of global `accessable_memory`. After all, the host-side kernel enqueuing functions (clEnqueueNDRangeXXX) also require pre-allocated `__global` memory, so we should follow the same approach.

> What we can improve though is avoiding extra data copy using the copy helpers
> you are proposing (this can also be achieved by calling memcpy passing the
> capture offset pointer into block_literal and captures size instead of
> the whole block_literal as highlighted above). We can also potentially avoid
> reloading of the captures in the enqueued kernel through the capture
> flattening, but this depends on the calling convention (or rather enqueueing
> convention I would say).

It seems that the flattening approach leaves little room for the implementation, which violates the generic nature of LLVM. Also, although cap_copy_helper looks helpful, I think there is little chance that anyone would need to copy individual captured variables; copying the entire block_literal is sufficient. Of course, we could reduce the block_literal size by removing redundant fields as an optimization, but I think that is a separate discussion topic.

I would like to remove this code review.

@Anastasia thanks for patiently discussing this with a newbie LLVM/Clang developer like me :)

https://reviews.llvm.org/D24715

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits