But samplers aren't necessary for the image attributes, so I'll break this patch into two.
On Mon, Jul 27, 2015 at 10:25 AM, Zoltán Gilián <zoltan.gil...@gmail.com> wrote: >> > Hmmm... So you only need it as padding? Wouldn't it be possible for >> > you to declare samplers to be 0 bytes? >> Maybe it can be done by changing the type of the sampler arg from i32 >> to an empty struct. I'll have to try this, I don't know if it will >> work. > > But then it would only work for the target I'm working with. > >> Some hardware (e.g. Intel's) will need an index (which can probably be >> hardcoded in the shader for now) into a table of sampler configurations >> which can be set-up later on by the driver at state-emit time. > > Yes, this is also the case with r600 texture fetch instructions: the > sampler configurations are emitted, and are referenced by an index in > the kernel. But if the kernel code itself needs the sampler value, > e.g. it branches and calls normalized or non-normalized texture fetch > based on the appropriate bit of the sampler, it needs the data in a > kernel-accessible place. And AFAIK, the hardware registers which set > up samplers for fetches are not accessible by the kernel. > >> In any >> case we can always add a new argument semantic enum later on for the >> target to select sampler config vs index if different drivers turn out >> to need different things. > > The index can indeed be hard-coded by the compiler, as you pointed it > out, so uploading the index will not be useful for any target (and > adding the code to do it is a mistake on my end). However the kernel > may need the sampler value in an accessible place. So we are in a > similar situation as image attribute metadata. But instead of adding > implicit arguments after the 0-sized sampler arg, the sampler value > could simply be uploaded as the explicit argument. The benefit of the > latter approach is simplicity, while the former only saves 4 bytes for > some targets at a cost of complexity. Does adding the sampler value to > the input vector breaks something I don't know about? > > On Sun, Jul 26, 2015 at 11:37 PM, Zoltán Gilián <zoltan.gil...@gmail.com> > wrote: >>> Why? Your module::argument::image_size and ::image_format cases don't >>> touch the explicit_arg iterator at all AFAICT, so it will be left >>> pointing one past the last general semantic argument >> >> Ok, my mistake, I didn't think this through. >> >>> Hmmm... So you only need it as padding? Wouldn't it be possible for >>> you to declare samplers to be 0 bytes? >> >> Maybe it can be done by changing the type of the sampler arg from i32 >> to an empty struct. I'll have to try this, I don't know if it will >> work. >> >> On Sun, Jul 26, 2015 at 2:40 PM, Francisco Jerez <curroje...@riseup.net> >> wrote: >>> Zoltán Gilián <zoltan.gil...@gmail.com> writes: >>> >>>>> auto img = dynamic_cast<image_argument &>(**(explicit_arg - 1)) >>>> >>>> Ok, so it should be (explicit_arg - 2) for the image format, I >>>> presume. >>> >>> Why? Your module::argument::image_size and ::image_format cases don't >>> touch the explicit_arg iterator at all AFAICT, so it will be left >>> pointing one past the last general semantic argument >>> >>>> This will be incorrect, however, if the targets that need implicit >>>> argument for format metadata are indeed a strict superset of the ones >>>> that need dimension, as you mentioned before. The targets that only >>>> need format will break this code. Should I swap the order of the >>>> format and dimension implicit args to make this approach work under >>>> the aforementioned assumption? >>>> >>> Why would it be incorrect? The kernel::_args vector explicit_arg >>> iterates in contains explicit arguments only so it shouldn't make a >>> difference in which order you emit them as long as you don't interleave >>> other explicit arguments in between. >>> >>>>> It also seems like you've got rid of the static casts you had >>>>> previously? >>>> >>>> That is a mistake, I'll fix it. >>>> >>>>> My expectation here was that the compiler would be able to hard-code >>>>> sampler indices in the shader without the API passing any actual data in >>>>> the input buffer. Doesn't that work for you? >>>> >>>> Yes, that's correct, but this is also the case with images. If image >>>> index is uploaded explicitly, I don't see why it can't be done with >>>> sampler indices. But probably it's a better idea to send the sampler >>>> value rather than the index, in case the kernel needs it (e.g. the >>>> normalized or non-normalized nature of texture coordinates may have to >>>> be specified in the fetch instruction itself, and not by hardware >>>> registers), so I'll definitely change this. >>> >>> Some hardware (e.g. Intel's) will need an index (which can probably be >>> hardcoded in the shader for now) into a table of sampler configurations >>> which can be set-up later on by the driver at state-emit time. In any >>> case we can always add a new argument semantic enum later on for the >>> target to select sampler config vs index if different drivers turn out >>> to need different things. >>> >>>> But the bigger problem is, that the byte offsets of the kernel >>>> arguments are computed considering the sampler argument too, so the >>>> binary expects it to be present in the input vector. Furthermore I >>>> can't erase the sampler argument from the IR, because it is needed to >>>> make it possible for get_kernel_args to detect the sampler. But if >>>> sampler_argument::bind doesn't append 4 bytes (clang compiles >>>> sampler_t to i32) to the input vector, the binary will try to load the >>>> following arguments from wrong locations. >>> >>> Hmmm... So you only need it as padding? Wouldn't it be possible for >>> you to declare samplers to be 0 bytes? >>> >>>> >>>>> I don't think it's a good idea to use such obvious names for the >>>>> implicit argument types. Couldn't they collide with some type declared >>>> by the user that happens to have the same name? How about >>>> "__clover_image_size" or something similar? >>>> >>>> Indeed, that's a good point. I'd go with something like >>>> "__opencl_image_*" or "__llvm_image_*", because these strings will be >>>> added to llvm, and non-clover code may depend on them in the future. >>>> >>> Sounds good to me. >>> >>>>> (Identifiers starting with >>>>> double underscore are reserved for the implementation at least on C and >>>>> GLSL, not sure about OpenCL-C) >>>> >>>> I believe this is true for OpenCL C too, since it is an extension to >>>> C99. Identifiers starting with double underscores are reserved in C99. >>>> >>> IIRC identifiers starting with double underscore and single underscore >>> followed by some upper-case letter were reserved even in the original >>> ANSI C spec. >>> >>>> On Sat, Jul 25, 2015 at 1:06 PM, Francisco Jerez <curroje...@riseup.net> >>>> wrote: >>>>> Zoltan Gilian <zoltan.gil...@gmail.com> writes: >>>>> >>>>>> Read-only and write-only image arguments are recognized and >>>>>> distinguished. >>>>>> Attributes of the image arguments are passed to the kernel as implicit >>>>>> arguments. >>>>>> --- >>>>>> src/gallium/state_trackers/clover/core/kernel.cpp | 46 ++++++- >>>>>> src/gallium/state_trackers/clover/core/kernel.hpp | 13 +- >>>>>> src/gallium/state_trackers/clover/core/memory.cpp | 2 +- >>>>>> src/gallium/state_trackers/clover/core/module.hpp | 4 +- >>>>>> .../state_trackers/clover/llvm/invocation.cpp | 147 >>>>>> ++++++++++++++++++++- >>>>>> 5 files changed, 198 insertions(+), 14 deletions(-) >>>>>> >>>>>> diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp >>>>>> b/src/gallium/state_trackers/clover/core/kernel.cpp >>>>>> index 0756f06..1a6c28f 100644 >>>>>> --- a/src/gallium/state_trackers/clover/core/kernel.cpp >>>>>> +++ b/src/gallium/state_trackers/clover/core/kernel.cpp >>>>>> @@ -158,13 +158,18 @@ >>>>>> kernel::exec_context::bind(intrusive_ptr<command_queue> _q, >>>>>> auto margs = find(name_equals(kern.name()), m.syms).args; >>>>>> auto msec = find(type_equals(module::section::text), m.secs); >>>>>> auto explicit_arg = kern._args.begin(); >>>>>> + image_argument *last_image_arg = nullptr; >>>>>> >>>>>> for (auto &marg : margs) { >>>>>> switch (marg.semantic) { >>>>>> - case module::argument::general: >>>>>> + case module::argument::general: { >>>>>> + auto image_arg = >>>>>> dynamic_cast<image_argument*>(explicit_arg->get()); >>>>>> + if (image_arg) { >>>>>> + last_image_arg = image_arg; >>>>>> + } >>>>>> (*(explicit_arg++))->bind(*this, marg); >>>>>> break; >>>>>> - >>>>>> + } >>>>>> case module::argument::grid_dimension: { >>>>>> const cl_uint dimension = grid_offset.size(); >>>>>> auto arg = argument::create(marg); >>>>>> @@ -182,6 +187,36 @@ >>>>>> kernel::exec_context::bind(intrusive_ptr<command_queue> _q, >>>>>> } >>>>>> break; >>>>>> } >>>>>> + case module::argument::image_size: { >>>>>> + assert(last_image_arg); >>>>>> + auto img = last_image_arg->get(); >>>>> >>>>> Instead of carrying around an extra variable during the loop, you could >>>>> achieve the same effect more locally by doing: >>>>> >>>>> | auto img = dynamic_cast<image_argument &>(**(explicit_arg - >>>>> 1)) >>>>> | .get(); >>>>> >>>>> The cast to reference would also make sure that the argument is of the >>>>> specified type or otherwise throw std::bad_cast which is as good as an >>>>> assertion failure. >>>>> >>>>>> + std::vector<cl_uint> image_size({ >>>>>> + img->width(), >>>>>> + img->height(), >>>>>> + img->depth()}); >>>>> >>>>> Parentheses are not necessary around the curly braces, you could just: >>>>> >>>>> | std::vector<cl_uint> image_size { >>>>> | img->width(), >>>>> | img->height(), >>>>> | img->depth() >>>>> | }; >>>>> >>>>> It also seems like you've got rid of the static casts you had >>>>> previously? These image methods return a size_t value which may be of >>>>> different size than cl_uint and cause a warning. >>>>> >>>>>> + for (auto x: image_size) { >>>>> >>>>> Leave a space around the colon for consistency, also in the cases below. >>>>> >>>>>> + auto arg = argument::create(marg); >>>>>> + >>>>>> + arg->set(sizeof(x), &x); >>>>>> + arg->bind(*this, marg); >>>>>> + } >>>>>> + break; >>>>>> + } >>>>>> + case module::argument::image_format: { >>>>>> + assert(last_image_arg); >>>>>> + auto img = last_image_arg->get(); >>>>>> + cl_image_format fmt = img->format(); >>>>>> + std::vector<cl_uint> image_format({ >>>>>> + fmt.image_channel_data_type, >>>>>> + fmt.image_channel_order}); >>>>> >>>>> Same comments apply here. >>>>> >>>>>> + for (auto x: image_format) { >>>>>> + auto arg = argument::create(marg); >>>>>> + >>>>>> + arg->set(sizeof(x), &x); >>>>>> + arg->bind(*this, marg); >>>>>> + } >>>>>> + break; >>>>>> + } >>>>>> } >>>>>> } >>>>>> >>>>>> @@ -532,6 +567,13 @@ kernel::sampler_argument::set(size_t size, const >>>>>> void *value) { >>>>>> void >>>>>> kernel::sampler_argument::bind(exec_context &ctx, >>>>>> const module::argument &marg) { >>>>>> + auto v = bytes(ctx.samplers.size()); >>>>>> + >>>>>> + extend(v, module::argument::zero_ext, marg.target_size); >>>>>> + byteswap(v, ctx.q->device().endianness()); >>>>>> + align(ctx.input, marg.target_align); >>>>>> + insert(ctx.input, v); >>>>>> + >>>>> >>>>> My expectation here was that the compiler would be able to hard-code >>>>> sampler indices in the shader without the API passing any actual data in >>>>> the input buffer. Doesn't that work for you? >>>>> >>>>>> st = s->bind(*ctx.q); >>>>>> ctx.samplers.push_back(st); >>>>>> } >>>>>> diff --git a/src/gallium/state_trackers/clover/core/kernel.hpp >>>>>> b/src/gallium/state_trackers/clover/core/kernel.hpp >>>>>> index d6432a4..007ffcc 100644 >>>>>> --- a/src/gallium/state_trackers/clover/core/kernel.hpp >>>>>> +++ b/src/gallium/state_trackers/clover/core/kernel.hpp >>>>>> @@ -190,7 +190,14 @@ namespace clover { >>>>>> pipe_surface *st; >>>>>> }; >>>>>> >>>>>> - class image_rd_argument : public argument { >>>>>> + class image_argument : public argument { >>>>>> + public: >>>>>> + const image *get() const { return img; } >>>>> >>>>> Please write the return statement on its own line for consistency. >>>>> >>>>>> + protected: >>>>>> + image *img; >>>>>> + }; >>>>>> + >>>>>> + class image_rd_argument : public image_argument { >>>>>> public: >>>>>> virtual void set(size_t size, const void *value); >>>>>> virtual void bind(exec_context &ctx, >>>>>> @@ -198,11 +205,10 @@ namespace clover { >>>>>> virtual void unbind(exec_context &ctx); >>>>>> >>>>>> private: >>>>>> - image *img; >>>>>> pipe_sampler_view *st; >>>>>> }; >>>>>> >>>>>> - class image_wr_argument : public argument { >>>>>> + class image_wr_argument : public image_argument { >>>>>> public: >>>>>> virtual void set(size_t size, const void *value); >>>>>> virtual void bind(exec_context &ctx, >>>>>> @@ -210,7 +216,6 @@ namespace clover { >>>>>> virtual void unbind(exec_context &ctx); >>>>>> >>>>>> private: >>>>>> - image *img; >>>>>> pipe_surface *st; >>>>>> }; >>>>>> >>>>>> diff --git a/src/gallium/state_trackers/clover/core/memory.cpp >>>>>> b/src/gallium/state_trackers/clover/core/memory.cpp >>>>>> index 055336a..b852e68 100644 >>>>>> --- a/src/gallium/state_trackers/clover/core/memory.cpp >>>>>> +++ b/src/gallium/state_trackers/clover/core/memory.cpp >>>>>> @@ -189,7 +189,7 @@ image2d::image2d(clover::context &ctx, cl_mem_flags >>>>>> flags, >>>>>> const cl_image_format *format, size_t width, >>>>>> size_t height, size_t row_pitch, >>>>>> void *host_ptr) : >>>>>> - image(ctx, flags, format, width, height, 0, >>>>>> + image(ctx, flags, format, width, height, 1, >>>>>> row_pitch, 0, height * row_pitch, host_ptr) { >>>>>> } >>>>>> >>>>>> diff --git a/src/gallium/state_trackers/clover/core/module.hpp >>>>>> b/src/gallium/state_trackers/clover/core/module.hpp >>>>>> index 9d65688..5db0548 100644 >>>>>> --- a/src/gallium/state_trackers/clover/core/module.hpp >>>>>> +++ b/src/gallium/state_trackers/clover/core/module.hpp >>>>>> @@ -72,7 +72,9 @@ namespace clover { >>>>>> enum semantic { >>>>>> general, >>>>>> grid_dimension, >>>>>> - grid_offset >>>>>> + grid_offset, >>>>>> + image_size, >>>>>> + image_format >>>>>> }; >>>>>> >>>>>> argument(enum type type, size_t size, >>>>>> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp >>>>>> b/src/gallium/state_trackers/clover/llvm/invocation.cpp >>>>>> index 967284d..e5cd59c 100644 >>>>>> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp >>>>>> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp >>>>>> @@ -340,18 +340,91 @@ namespace { >>>>>> PM.run(*mod); >>>>>> } >>>>>> >>>>>> + // Kernel metadata >>>>>> + >>>>>> + const llvm::MDNode * >>>>>> + get_kernel_metadata(const llvm::Function *kernel_func) { >>>>>> + auto mod = kernel_func->getParent(); >>>>>> + auto kernels_node = mod->getNamedMetadata("opencl.kernels"); >>>>>> + if (!kernels_node) { >>>>>> + return nullptr; >>>>>> + } >>>>>> + >>>>>> + const llvm::MDNode *kernel_node = nullptr; >>>>>> + for (unsigned i = 0; i < kernels_node->getNumOperands(); ++i) { >>>>>> +#if HAVE_LLVM >= 0x0306 >>>>>> + auto func = llvm::mdconst::dyn_extract<llvm::Function>( >>>>>> +#else >>>>>> + auto func = llvm::dyn_cast<llvm::Function>( >>>>>> +#endif >>>>>> + >>>>>> kernels_node->getOperand(i)->getOperand(0)); >>>>>> + if (func == kernel_func) { >>>>>> + kernel_node = kernels_node->getOperand(i); >>>>>> + break; >>>>>> + } >>>>>> + } >>>>>> + >>>>>> + return kernel_node; >>>>>> + } >>>>>> + >>>>>> + llvm::MDNode* >>>>>> + node_from_op_checked(const llvm::MDOperand &md_operand, >>>>>> + llvm::StringRef expect_name, >>>>>> + unsigned expect_num_args) >>>>>> + { >>>>>> + auto node = llvm::cast<llvm::MDNode>(md_operand); >>>>>> + assert(node->getNumOperands() == expect_num_args && >>>>>> + "Wrong number of operands."); >>>>>> + >>>>>> + auto str_node = llvm::cast<llvm::MDString>(node->getOperand(0)); >>>>>> + assert(str_node->getString() == expect_name && >>>>>> + "Wrong metadata node name."); >>>>>> + >>>>>> + return node; >>>>>> + } >>>>>> + >>>>>> + struct kernel_arg_md { >>>>>> + llvm::StringRef type_name; >>>>>> + llvm::StringRef access_qual; >>>>>> + kernel_arg_md(llvm::StringRef type_name_, llvm::StringRef >>>>>> access_qual_): >>>>>> + type_name(type_name_), access_qual(access_qual_) {} >>>>>> + }; >>>>>> + >>>>>> + std::vector<kernel_arg_md> >>>>>> + get_kernel_arg_md(const llvm::Function *kernel_func) { >>>>>> + auto num_args = kernel_func->getArgumentList().size(); >>>>>> + >>>>>> + auto kernel_node = get_kernel_metadata(kernel_func); >>>>>> + auto aq = node_from_op_checked(kernel_node->getOperand(2), >>>>>> + "kernel_arg_access_qual", num_args >>>>>> + 1); >>>>>> + auto ty = node_from_op_checked(kernel_node->getOperand(3), >>>>>> + "kernel_arg_type", num_args + 1); >>>>>> + >>>>>> + auto res = std::vector<kernel_arg_md>(); >>>>>> + res.reserve(num_args); >>>>> >>>>> std::vector<kernel_arg_md> res; >>>>> >>>>>> + for (unsigned i = 0; i < num_args; ++i) { >>>>>> + res.push_back(kernel_arg_md( >>>>>> + >>>>>> llvm::cast<llvm::MDString>(ty->getOperand(i+1))->getString(), >>>>>> + >>>>>> llvm::cast<llvm::MDString>(aq->getOperand(i+1))->getString())); >>>>>> + } >>>>>> + >>>>>> + return res; >>>>>> + } >>>>>> + >>>>>> std::vector<module::argument> >>>>>> get_kernel_args(const llvm::Module *mod, const std::string >>>>>> &kernel_name, >>>>>> const clang::LangAS::Map &address_spaces) { >>>>>> >>>>>> std::vector<module::argument> args; >>>>>> llvm::Function *kernel_func = mod->getFunction(kernel_name); >>>>>> + assert(kernel_func && "Kernel name not found in module."); >>>>>> + auto arg_md = get_kernel_arg_md(kernel_func); >>>>>> >>>>>> llvm::DataLayout TD(mod); >>>>>> + llvm::Type *size_type = >>>>>> + TD.getSmallestLegalIntType(mod->getContext(), sizeof(cl_uint) >>>>>> * 8); >>>>>> >>>>>> - for (llvm::Function::const_arg_iterator I = >>>>>> kernel_func->arg_begin(), >>>>>> - E = kernel_func->arg_end(); I != >>>>>> E; ++I) { >>>>>> - const llvm::Argument &arg = *I; >>>>>> + for (const auto &arg: kernel_func->args()) { >>>>>> >>>>>> llvm::Type *arg_type = arg.getType(); >>>>>> const unsigned arg_store_size = TD.getTypeStoreSize(arg_type); >>>>>> @@ -369,6 +442,68 @@ namespace { >>>>>> unsigned target_size = TD.getTypeStoreSize(target_type); >>>>>> unsigned target_align = TD.getABITypeAlignment(target_type); >>>>>> >>>>>> + llvm::StringRef type_name = arg_md[arg.getArgNo()].type_name; >>>>>> + llvm::StringRef access_qual = >>>>>> arg_md[arg.getArgNo()].access_qual; >>>>>> + >>>>>> + // Image >>>>>> + bool is_image2d = type_name == "image2d_t"; >>>>>> + bool is_image3d = type_name == "image3d_t"; >>>>> >>>>> These should be marked const. >>>>> >>>>>> + if (is_image2d || is_image3d) { >>>>>> + bool is_write_only = access_qual == "write_only"; >>>>>> + bool is_read_only = access_qual == "read_only"; >>>>>> + >>>>> >>>>> And these too. >>>>> >>>>>> + typename module::argument::type marg_type; >>>>>> + if (is_image2d && is_read_only) { >>>>>> + marg_type = module::argument::image2d_rd; >>>>>> + } else if (is_image2d && is_write_only) { >>>>>> + marg_type = module::argument::image2d_wr; >>>>>> + } else if (is_image3d && is_read_only) { >>>>>> + marg_type = module::argument::image3d_rd; >>>>>> + } else if (is_image3d && is_write_only) { >>>>>> + marg_type = module::argument::image3d_wr; >>>>>> + } else { >>>>>> + assert(0 && "Wrong image access qualifier"); >>>>>> + } >>>>>> + >>>>>> + args.push_back(module::argument(marg_type, >>>>>> + arg_store_size, target_size, >>>>>> + target_align, >>>>>> + >>>>>> module::argument::zero_ext)); >>>>>> + continue; >>>>>> + } >>>>>> + >>>>>> + // Sampler >>>>>> + if (type_name == "sampler_t") { >>>>>> + args.push_back(module::argument(module::argument::sampler, >>>>>> + arg_store_size, target_size, >>>>>> + target_align, >>>>>> + >>>>>> module::argument::zero_ext)); >>>>>> + continue; >>>>>> + } >>>>>> + >>>>>> + // Image size implicit argument >>>>>> + if (type_name == "image_size") { >>>>> >>>>> I don't think it's a good idea to use such obvious names for the >>>>> implicit argument types. Couldn't they collide with some type declared >>>>> by the user that happens to have the same name? How about >>>>> "__clover_image_size" or something similar? (Identifiers starting with >>>>> double underscore are reserved for the implementation at least on C and >>>>> GLSL, not sure about OpenCL-C) >>>>> >>>>>> + args.push_back(module::argument(module::argument::scalar, >>>>>> + sizeof(cl_uint), >>>>>> + >>>>>> TD.getTypeStoreSize(size_type), >>>>>> + >>>>>> TD.getABITypeAlignment(size_type), >>>>>> + module::argument::zero_ext, >>>>>> + >>>>>> module::argument::image_size)); >>>>>> + continue; >>>>>> + } >>>>>> + >>>>>> + // Image format implicit argument >>>>>> + if (type_name == "image_format") { >>>>>> + args.push_back(module::argument(module::argument::scalar, >>>>>> + sizeof(cl_uint), >>>>>> + >>>>>> TD.getTypeStoreSize(size_type), >>>>>> + >>>>>> TD.getABITypeAlignment(size_type), >>>>>> + module::argument::zero_ext, >>>>>> + >>>>>> module::argument::image_format)); >>>>>> + continue; >>>>>> + } >>>>>> + >>>>>> + // Other types >>>>>> if (llvm::isa<llvm::PointerType>(arg_type) && >>>>>> arg.hasByValAttr()) { >>>>>> arg_type = >>>>>> >>>>>> llvm::dyn_cast<llvm::PointerType>(arg_type)->getElementType(); >>>>>> @@ -413,9 +548,6 @@ namespace { >>>>>> // Append implicit arguments. XXX - The types, ordering and >>>>>> // vector size of the implicit arguments should depend on the >>>>>> // target according to the selected calling convention. >>>>>> - llvm::Type *size_type = >>>>>> - TD.getSmallestLegalIntType(mod->getContext(), sizeof(cl_uint) >>>>>> * 8); >>>>>> - >>>>>> args.push_back( >>>>>> module::argument(module::argument::scalar, sizeof(cl_uint), >>>>>> TD.getTypeStoreSize(size_type), >>>>>> @@ -744,6 +876,9 @@ clover::compile_program_llvm(const std::string >>>>>> &source, >>>>>> std::vector<char> code = compile_native(mod, triple, processor, >>>>>> get_debug_flags() & >>>>>> DBG_ASM, >>>>>> r_log); >>>>>> + // Target-specific passes may alter functions >>>>>> + kernels.clear(); >>>>>> + find_kernels(mod, kernels); >>>>> >>>>> Bah... To make this sort of mistake impossible I'd get rid of the >>>>> "kernels" argument of optimize() and build_module_native/llvm(), and >>>>> make them call find_kernels() directly. It would also be nice to have >>>>> find_kernels() return its result as return value instead of returning >>>>> void, so you could declare kernel arrays as const and initialize them at >>>>> the same point. >>>>> >>>>>> m = build_module_native(code, mod, kernels, address_spaces, >>>>>> r_log); >>>>>> break; >>>>>> } >>>>>> -- >>>>>> 2.4.6 _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev