Zoltán Gilián <zoltan.gil...@gmail.com> writes: >> auto img = dynamic_cast<image_argument &>(**(explicit_arg - 1)) > > Ok, so it should be (explicit_arg - 2) for the image format, I > presume.
Why? Your module::argument::image_size and ::image_format cases don't touch the explicit_arg iterator at all AFAICT, so it will be left pointing one past the last general semantic argument > This will be incorrect, however, if the targets that need implicit > argument for format metadata are indeed a strict superset of the ones > that need dimension, as you mentioned before. The targets that only > need format will break this code. Should I swap the order of the > format and dimension implicit args to make this approach work under > the aforementioned assumption? > Why would it be incorrect? The kernel::_args vector explicit_arg iterates in contains explicit arguments only so it shouldn't make a difference in which order you emit them as long as you don't interleave other explicit arguments in between. >> It also seems like you've got rid of the static casts you had >> previously? > > That is a mistake, I'll fix it. > >> My expectation here was that the compiler would be able to hard-code >> sampler indices in the shader without the API passing any actual data in >> the input buffer. Doesn't that work for you? > > Yes, that's correct, but this is also the case with images. If image > index is uploaded explicitly, I don't see why it can't be done with > sampler indices. But probably it's a better idea to send the sampler > value rather than the index, in case the kernel needs it (e.g. the > normalized or non-normalized nature of texture coordinates may have to > be specified in the fetch instruction itself, and not by hardware > registers), so I'll definitely change this. Some hardware (e.g. Intel's) will need an index (which can probably be hardcoded in the shader for now) into a table of sampler configurations which can be set-up later on by the driver at state-emit time. In any case we can always add a new argument semantic enum later on for the target to select sampler config vs index if different drivers turn out to need different things. > But the bigger problem is, that the byte offsets of the kernel > arguments are computed considering the sampler argument too, so the > binary expects it to be present in the input vector. Furthermore I > can't erase the sampler argument from the IR, because it is needed to > make it possible for get_kernel_args to detect the sampler. But if > sampler_argument::bind doesn't append 4 bytes (clang compiles > sampler_t to i32) to the input vector, the binary will try to load the > following arguments from wrong locations. Hmmm... So you only need it as padding? Wouldn't it be possible for you to declare samplers to be 0 bytes? > >> I don't think it's a good idea to use such obvious names for the >> implicit argument types. Couldn't they collide with some type declared > by the user that happens to have the same name? How about > "__clover_image_size" or something similar? > > Indeed, that's a good point. I'd go with something like > "__opencl_image_*" or "__llvm_image_*", because these strings will be > added to llvm, and non-clover code may depend on them in the future. > Sounds good to me. >> (Identifiers starting with >> double underscore are reserved for the implementation at least on C and >> GLSL, not sure about OpenCL-C) > > I believe this is true for OpenCL C too, since it is an extension to > C99. Identifiers starting with double underscores are reserved in C99. > IIRC identifiers starting with double underscore and single underscore followed by some upper-case letter were reserved even in the original ANSI C spec. > On Sat, Jul 25, 2015 at 1:06 PM, Francisco Jerez <curroje...@riseup.net> > wrote: >> Zoltan Gilian <zoltan.gil...@gmail.com> writes: >> >>> Read-only and write-only image arguments are recognized and >>> distinguished. >>> Attributes of the image arguments are passed to the kernel as implicit >>> arguments. >>> --- >>> src/gallium/state_trackers/clover/core/kernel.cpp | 46 ++++++- >>> src/gallium/state_trackers/clover/core/kernel.hpp | 13 +- >>> src/gallium/state_trackers/clover/core/memory.cpp | 2 +- >>> src/gallium/state_trackers/clover/core/module.hpp | 4 +- >>> .../state_trackers/clover/llvm/invocation.cpp | 147 >>> ++++++++++++++++++++- >>> 5 files changed, 198 insertions(+), 14 deletions(-) >>> >>> diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp >>> b/src/gallium/state_trackers/clover/core/kernel.cpp >>> index 0756f06..1a6c28f 100644 >>> --- a/src/gallium/state_trackers/clover/core/kernel.cpp >>> +++ b/src/gallium/state_trackers/clover/core/kernel.cpp >>> @@ -158,13 +158,18 @@ >>> kernel::exec_context::bind(intrusive_ptr<command_queue> _q, >>> auto margs = find(name_equals(kern.name()), m.syms).args; >>> auto msec = find(type_equals(module::section::text), m.secs); >>> auto explicit_arg = kern._args.begin(); >>> + image_argument *last_image_arg = nullptr; >>> >>> for (auto &marg : margs) { >>> switch (marg.semantic) { >>> - case module::argument::general: >>> + case module::argument::general: { >>> + auto image_arg = >>> dynamic_cast<image_argument*>(explicit_arg->get()); >>> + if (image_arg) { >>> + last_image_arg = image_arg; >>> + } >>> (*(explicit_arg++))->bind(*this, marg); >>> break; >>> - >>> + } >>> case module::argument::grid_dimension: { >>> const cl_uint dimension = grid_offset.size(); >>> auto arg = argument::create(marg); >>> @@ -182,6 +187,36 @@ >>> kernel::exec_context::bind(intrusive_ptr<command_queue> _q, >>> } >>> break; >>> } >>> + case module::argument::image_size: { >>> + assert(last_image_arg); >>> + auto img = last_image_arg->get(); >> >> Instead of carrying around an extra variable during the loop, you could >> achieve the same effect more locally by doing: >> >> | auto img = dynamic_cast<image_argument &>(**(explicit_arg - 1)) >> | .get(); >> >> The cast to reference would also make sure that the argument is of the >> specified type or otherwise throw std::bad_cast which is as good as an >> assertion failure. >> >>> + std::vector<cl_uint> image_size({ >>> + img->width(), >>> + img->height(), >>> + img->depth()}); >> >> Parentheses are not necessary around the curly braces, you could just: >> >> | std::vector<cl_uint> image_size { >> | img->width(), >> | img->height(), >> | img->depth() >> | }; >> >> It also seems like you've got rid of the static casts you had >> previously? These image methods return a size_t value which may be of >> different size than cl_uint and cause a warning. >> >>> + for (auto x: image_size) { >> >> Leave a space around the colon for consistency, also in the cases below. >> >>> + auto arg = argument::create(marg); >>> + >>> + arg->set(sizeof(x), &x); >>> + arg->bind(*this, marg); >>> + } >>> + break; >>> + } >>> + case module::argument::image_format: { >>> + assert(last_image_arg); >>> + auto img = last_image_arg->get(); >>> + cl_image_format fmt = img->format(); >>> + std::vector<cl_uint> image_format({ >>> + fmt.image_channel_data_type, >>> + fmt.image_channel_order}); >> >> Same comments apply here. >> >>> + for (auto x: image_format) { >>> + auto arg = argument::create(marg); >>> + >>> + arg->set(sizeof(x), &x); >>> + arg->bind(*this, marg); >>> + } >>> + break; >>> + } >>> } >>> } >>> >>> @@ -532,6 +567,13 @@ kernel::sampler_argument::set(size_t size, const void >>> *value) { >>> void >>> kernel::sampler_argument::bind(exec_context &ctx, >>> const module::argument &marg) { >>> + auto v = bytes(ctx.samplers.size()); >>> + >>> + extend(v, module::argument::zero_ext, marg.target_size); >>> + byteswap(v, ctx.q->device().endianness()); >>> + align(ctx.input, marg.target_align); >>> + insert(ctx.input, v); >>> + >> >> My expectation here was that the compiler would be able to hard-code >> sampler indices in the shader without the API passing any actual data in >> the input buffer. Doesn't that work for you? >> >>> st = s->bind(*ctx.q); >>> ctx.samplers.push_back(st); >>> } >>> diff --git a/src/gallium/state_trackers/clover/core/kernel.hpp >>> b/src/gallium/state_trackers/clover/core/kernel.hpp >>> index d6432a4..007ffcc 100644 >>> --- a/src/gallium/state_trackers/clover/core/kernel.hpp >>> +++ b/src/gallium/state_trackers/clover/core/kernel.hpp >>> @@ -190,7 +190,14 @@ namespace clover { >>> pipe_surface *st; >>> }; >>> >>> - class image_rd_argument : public argument { >>> + class image_argument : public argument { >>> + public: >>> + const image *get() const { return img; } >> >> Please write the return statement on its own line for consistency. >> >>> + protected: >>> + image *img; >>> + }; >>> + >>> + class image_rd_argument : public image_argument { >>> public: >>> virtual void set(size_t size, const void *value); >>> virtual void bind(exec_context &ctx, >>> @@ -198,11 +205,10 @@ namespace clover { >>> virtual void unbind(exec_context &ctx); >>> >>> private: >>> - image *img; >>> pipe_sampler_view *st; >>> }; >>> >>> - class image_wr_argument : public argument { >>> + class image_wr_argument : public image_argument { >>> public: >>> virtual void set(size_t size, const void *value); >>> virtual void bind(exec_context &ctx, >>> @@ -210,7 +216,6 @@ namespace clover { >>> virtual void unbind(exec_context &ctx); >>> >>> private: >>> - image *img; >>> pipe_surface *st; >>> }; >>> >>> diff --git a/src/gallium/state_trackers/clover/core/memory.cpp >>> b/src/gallium/state_trackers/clover/core/memory.cpp >>> index 055336a..b852e68 100644 >>> --- a/src/gallium/state_trackers/clover/core/memory.cpp >>> +++ b/src/gallium/state_trackers/clover/core/memory.cpp >>> @@ -189,7 +189,7 @@ image2d::image2d(clover::context &ctx, cl_mem_flags >>> flags, >>> const cl_image_format *format, size_t width, >>> size_t height, size_t row_pitch, >>> void *host_ptr) : >>> - image(ctx, flags, format, width, height, 0, >>> + image(ctx, flags, format, width, height, 1, >>> row_pitch, 0, height * row_pitch, host_ptr) { >>> } >>> >>> diff --git a/src/gallium/state_trackers/clover/core/module.hpp >>> b/src/gallium/state_trackers/clover/core/module.hpp >>> index 9d65688..5db0548 100644 >>> --- a/src/gallium/state_trackers/clover/core/module.hpp >>> +++ b/src/gallium/state_trackers/clover/core/module.hpp >>> @@ -72,7 +72,9 @@ namespace clover { >>> enum semantic { >>> general, >>> grid_dimension, >>> - grid_offset >>> + grid_offset, >>> + image_size, >>> + image_format >>> }; >>> >>> argument(enum type type, size_t size, >>> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp >>> b/src/gallium/state_trackers/clover/llvm/invocation.cpp >>> index 967284d..e5cd59c 100644 >>> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp >>> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp >>> @@ -340,18 +340,91 @@ namespace { >>> PM.run(*mod); >>> } >>> >>> + // Kernel metadata >>> + >>> + const llvm::MDNode * >>> + get_kernel_metadata(const llvm::Function *kernel_func) { >>> + auto mod = kernel_func->getParent(); >>> + auto kernels_node = mod->getNamedMetadata("opencl.kernels"); >>> + if (!kernels_node) { >>> + return nullptr; >>> + } >>> + >>> + const llvm::MDNode *kernel_node = nullptr; >>> + for (unsigned i = 0; i < kernels_node->getNumOperands(); ++i) { >>> +#if HAVE_LLVM >= 0x0306 >>> + auto func = llvm::mdconst::dyn_extract<llvm::Function>( >>> +#else >>> + auto func = llvm::dyn_cast<llvm::Function>( >>> +#endif >>> + >>> kernels_node->getOperand(i)->getOperand(0)); >>> + if (func == kernel_func) { >>> + kernel_node = kernels_node->getOperand(i); >>> + break; >>> + } >>> + } >>> + >>> + return kernel_node; >>> + } >>> + >>> + llvm::MDNode* >>> + node_from_op_checked(const llvm::MDOperand &md_operand, >>> + llvm::StringRef expect_name, >>> + unsigned expect_num_args) >>> + { >>> + auto node = llvm::cast<llvm::MDNode>(md_operand); >>> + assert(node->getNumOperands() == expect_num_args && >>> + "Wrong number of operands."); >>> + >>> + auto str_node = llvm::cast<llvm::MDString>(node->getOperand(0)); >>> + assert(str_node->getString() == expect_name && >>> + "Wrong metadata node name."); >>> + >>> + return node; >>> + } >>> + >>> + struct kernel_arg_md { >>> + llvm::StringRef type_name; >>> + llvm::StringRef access_qual; >>> + kernel_arg_md(llvm::StringRef type_name_, llvm::StringRef >>> access_qual_): >>> + type_name(type_name_), access_qual(access_qual_) {} >>> + }; >>> + >>> + std::vector<kernel_arg_md> >>> + get_kernel_arg_md(const llvm::Function *kernel_func) { >>> + auto num_args = kernel_func->getArgumentList().size(); >>> + >>> + auto kernel_node = get_kernel_metadata(kernel_func); >>> + auto aq = node_from_op_checked(kernel_node->getOperand(2), >>> + "kernel_arg_access_qual", num_args + >>> 1); >>> + auto ty = node_from_op_checked(kernel_node->getOperand(3), >>> + "kernel_arg_type", num_args + 1); >>> + >>> + auto res = std::vector<kernel_arg_md>(); >>> + res.reserve(num_args); >> >> std::vector<kernel_arg_md> res; >> >>> + for (unsigned i = 0; i < num_args; ++i) { >>> + res.push_back(kernel_arg_md( >>> + llvm::cast<llvm::MDString>(ty->getOperand(i+1))->getString(), >>> + llvm::cast<llvm::MDString>(aq->getOperand(i+1))->getString())); >>> + } >>> + >>> + return res; >>> + } >>> + >>> std::vector<module::argument> >>> get_kernel_args(const llvm::Module *mod, const std::string &kernel_name, >>> const clang::LangAS::Map &address_spaces) { >>> >>> std::vector<module::argument> args; >>> llvm::Function *kernel_func = mod->getFunction(kernel_name); >>> + assert(kernel_func && "Kernel name not found in module."); >>> + auto arg_md = get_kernel_arg_md(kernel_func); >>> >>> llvm::DataLayout TD(mod); >>> + llvm::Type *size_type = >>> + TD.getSmallestLegalIntType(mod->getContext(), sizeof(cl_uint) * >>> 8); >>> >>> - for (llvm::Function::const_arg_iterator I = kernel_func->arg_begin(), >>> - E = kernel_func->arg_end(); I != E; >>> ++I) { >>> - const llvm::Argument &arg = *I; >>> + for (const auto &arg: kernel_func->args()) { >>> >>> llvm::Type *arg_type = arg.getType(); >>> const unsigned arg_store_size = TD.getTypeStoreSize(arg_type); >>> @@ -369,6 +442,68 @@ namespace { >>> unsigned target_size = TD.getTypeStoreSize(target_type); >>> unsigned target_align = TD.getABITypeAlignment(target_type); >>> >>> + llvm::StringRef type_name = arg_md[arg.getArgNo()].type_name; >>> + llvm::StringRef access_qual = arg_md[arg.getArgNo()].access_qual; >>> + >>> + // Image >>> + bool is_image2d = type_name == "image2d_t"; >>> + bool is_image3d = type_name == "image3d_t"; >> >> These should be marked const. >> >>> + if (is_image2d || is_image3d) { >>> + bool is_write_only = access_qual == "write_only"; >>> + bool is_read_only = access_qual == "read_only"; >>> + >> >> And these too. >> >>> + typename module::argument::type marg_type; >>> + if (is_image2d && is_read_only) { >>> + marg_type = module::argument::image2d_rd; >>> + } else if (is_image2d && is_write_only) { >>> + marg_type = module::argument::image2d_wr; >>> + } else if (is_image3d && is_read_only) { >>> + marg_type = module::argument::image3d_rd; >>> + } else if (is_image3d && is_write_only) { >>> + marg_type = module::argument::image3d_wr; >>> + } else { >>> + assert(0 && "Wrong image access qualifier"); >>> + } >>> + >>> + args.push_back(module::argument(marg_type, >>> + arg_store_size, target_size, >>> + target_align, >>> + module::argument::zero_ext)); >>> + continue; >>> + } >>> + >>> + // Sampler >>> + if (type_name == "sampler_t") { >>> + args.push_back(module::argument(module::argument::sampler, >>> + arg_store_size, target_size, >>> + target_align, >>> + module::argument::zero_ext)); >>> + continue; >>> + } >>> + >>> + // Image size implicit argument >>> + if (type_name == "image_size") { >> >> I don't think it's a good idea to use such obvious names for the >> implicit argument types. Couldn't they collide with some type declared >> by the user that happens to have the same name? How about >> "__clover_image_size" or something similar? (Identifiers starting with >> double underscore are reserved for the implementation at least on C and >> GLSL, not sure about OpenCL-C) >> >>> + args.push_back(module::argument(module::argument::scalar, >>> + sizeof(cl_uint), >>> + TD.getTypeStoreSize(size_type), >>> + >>> TD.getABITypeAlignment(size_type), >>> + module::argument::zero_ext, >>> + module::argument::image_size)); >>> + continue; >>> + } >>> + >>> + // Image format implicit argument >>> + if (type_name == "image_format") { >>> + args.push_back(module::argument(module::argument::scalar, >>> + sizeof(cl_uint), >>> + TD.getTypeStoreSize(size_type), >>> + >>> TD.getABITypeAlignment(size_type), >>> + module::argument::zero_ext, >>> + >>> module::argument::image_format)); >>> + continue; >>> + } >>> + >>> + // Other types >>> if (llvm::isa<llvm::PointerType>(arg_type) && arg.hasByValAttr()) >>> { >>> arg_type = >>> >>> llvm::dyn_cast<llvm::PointerType>(arg_type)->getElementType(); >>> @@ -413,9 +548,6 @@ namespace { >>> // Append implicit arguments. XXX - The types, ordering and >>> // vector size of the implicit arguments should depend on the >>> // target according to the selected calling convention. >>> - llvm::Type *size_type = >>> - TD.getSmallestLegalIntType(mod->getContext(), sizeof(cl_uint) * >>> 8); >>> - >>> args.push_back( >>> module::argument(module::argument::scalar, sizeof(cl_uint), >>> TD.getTypeStoreSize(size_type), >>> @@ -744,6 +876,9 @@ clover::compile_program_llvm(const std::string &source, >>> std::vector<char> code = compile_native(mod, triple, processor, >>> get_debug_flags() & >>> DBG_ASM, >>> r_log); >>> + // Target-specific passes may alter functions >>> + kernels.clear(); >>> + find_kernels(mod, kernels); >> >> Bah... To make this sort of mistake impossible I'd get rid of the >> "kernels" argument of optimize() and build_module_native/llvm(), and >> make them call find_kernels() directly. It would also be nice to have >> find_kernels() return its result as return value instead of returning >> void, so you could declare kernel arrays as const and initialize them at >> the same point. >> >>> m = build_module_native(code, mod, kernels, address_spaces, >>> r_log); >>> break; >>> } >>> -- >>> 2.4.6
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev