On Tue, 2014-09-02 at 15:36 +0300, Francisco Jerez wrote: > Jan Vesely <jan.ves...@rutgers.edu> writes: > > > On Sat, 2014-08-16 at 13:13 +0300, Francisco Jerez wrote: > >> Jan Vesely <jan.ves...@rutgers.edu> writes: > >> > >> > On Thu, 2014-08-07 at 16:02 +0300, Francisco Jerez wrote: > >> >> Jan Vesely <jan.ves...@rutgers.edu> writes: > >> >> > >> >> > This respin includes Francisco's approach of providing implicit > >> >> > in the arg vector passed from clover, and Tom's idea of appending > >> >> > implicit args after the kernel args. > >> >> > > >> >> > >> >> Hmmm... Maybe it would make sense to add some sort of versioning > >> >> (e.g. as part of the target triple) to the binary interface between > >> >> clover and the kernel instead, so we can handle this sort of > >> >> non-backwards compatible changes and the compiler back-end and libclc > >> >> have some way to find out whether some specific feature is available and > >> >> e.g. some specific extension should be enabled. > >> >> > >> >> > I assumed it's not safe to modify exec.input, so the input vector is > >> >> > copied > >> >> > before appending work dim. > >> >> > > >> >> > >> >> Why wouldn't it be safe? You just need to make sure they're appended > >> >> before the compute state is created. > >> > > >> > I thought there might be a problem when called from multiple threads, > >> > but it looks like most of the vars are local to the current call anyway. > >> > > >> > I looked at the code a bit better, and need a bit of help with what the > >> > proffered approach would be. > >> > > >> > exec_context::bind() appends all kernel args to the input vector. If the > >> > implicit args are added before bind() it shifts all other args, which is > >> > not what we want. > >> > if the implicit args are appended after, they are not accounted for in > >> > shader->input_size (and not copied by the driver). > >> > > >> > my current code modifies exec_context::bind() to preserve the content of > >> > input before binding kernel args, and append the old content after the > >> > args are bound. > >> > I have also considered passing and implicit args vector to > >> > exec_context::bind to make the trick more visible. > >> > > >> > Turning workdim into a proper arg in _args does not work either, because > >> > it is not present in module args. > >> > > >> > any thoughts? > >> > > >> > >> I finally had a chance to take a closer look at your series. It looks > >> like you're right: In order to implement my proposal cleanly, implicit > >> arguments would have to be part of the _args array so the compiler would > >> have to include them in the module argument lists with memory layout > >> parameters (e.g. alignment, size) suitable for the hardware, so there's > >> probably little benefit compared to your original approach that includes > >> the number of dimensions as an additional launch_grid() parameter. > >> > >> So we don't have to change it again, can you add another array parameter > >> for the base grid offset? That's another thing we don't pass through > >> the pipe driver API currently and CL requires. The prototype of > >> launch_grid could look like: > >> > >> | void (*launch_grid)(struct pipe_context *context, uint dims, > >> | const uint *block_layout, const uint *grid_layout, > >> | const uint *grid_offset, uint32_t pc, > >> | const void *input); > >> > >> And don't forget to update the docs. :) > > > > Hi, > > > > I wanted to explore the original idea of appending implicit args, since > > launch_grid is driver specific and would need to reimplement the same > > functionality in every driver. > > > > I came up with a solution (see the attached patch). I don't like that > > the implicit arg needs to be set in api function. > > Right, if we do it this way it would probably be a better fit for the > clover::kernel code, the CL front-end doesn't really need to be aware of > implicit args. > > > I also don't like that this way there is no difference between > > explicit and implicit kernel arguments. On the other hand it's simple, > > and does not need additional per driver code. > > > Yeah... We definitely want to hide these from the user, as e.g. the > CL_KERNEL_NUM_ARGS param is required by the spec to return the number of > arguments provided by the user, and we don't want the user to set > implicit args, so it gets a bit messy. I think I like better your > original idea of passing them as launch_grid() arguments, even though > the grid offset and dimension parameters are somewhat artificial from a > the hardware's point of view.
sorry to bug you some more with this. I tried one more thing before going back to the launch_grid parameters. this time it implements a parallel infrastructure for implicit arguments by creating artificial module arguments for uint and size_t (I don't think we need more for implicit arguments). I only added the work dimension argument but adding more should be easy. If you think that the launch_grid way is better, I'll stop experimenting as I ran out of ideas I wanted to try. thanks, jan > > > thanks, > > jan > > > >> > >> Thank you. > >> > >> > thanks, > >> > jan > >> > > >> > > >> >> > >> >> > Passes get-work-dim piglit on turks without any regression, > >> >> > I have not tested SI as I don't have the hw. > >> >> > > >> >> > jan > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > Jan Vesely (3): > >> >> > gallium: Pass input data size to launch_grid > >> >> > clover: Add work dimension implicit param to input > >> >> > r600,radeonsi: Copy implicit args provided by clover > >> >> > > >> >> > src/gallium/drivers/ilo/ilo_gpgpu.c | 2 +- > >> >> > src/gallium/drivers/nouveau/nvc0/nvc0_compute.c | 2 +- > >> >> > src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 4 +- > >> >> > src/gallium/drivers/nouveau/nvc0/nve4_compute.c | 2 +- > >> >> > src/gallium/drivers/r600/evergreen_compute.c | 14 +- > >> >> > src/gallium/drivers/r600/evergreen_compute.h | 1 - > >> >> > src/gallium/drivers/radeonsi/si_compute.c | 6 +- > >> >> > src/gallium/include/pipe/p_context.h | 2 +- > >> >> > src/gallium/state_trackers/clover/core/kernel.cpp | 162 > >> >> > ++++++++++++---------- > >> >> > src/gallium/tests/trivial/compute.c | 40 +++--- > >> >> > 10 files changed, 122 insertions(+), 113 deletions(-) > >> >> > > >> >> > -- > >> >> > 1.9.3 > >> > > >> > -- > >> > Jan Vesely <jan.ves...@rutgers.edu> > > > > -- > > Jan Vesely <jan.ves...@rutgers.edu> > > From 7ad338ebd3a67b19d4ba492fb5a4cbda418fcdad Mon Sep 17 00:00:00 2001 > > From: Jan Vesely <jan.ves...@rutgers.edu> > > Date: Mon, 1 Sep 2014 19:18:12 -0400 > > Subject: [PATCH RFC 1/1] clover: Append implicit work dim arg > > > > Signed-off-by: Jan Vesely <jan.ves...@rutgers.edu> > > --- > > src/gallium/state_trackers/clover/api/kernel.cpp | 6 ++++++ > > .../state_trackers/clover/llvm/invocation.cpp | 20 > > ++++++++++++++------ > > 2 files changed, 20 insertions(+), 6 deletions(-) > > > > diff --git a/src/gallium/state_trackers/clover/api/kernel.cpp > > b/src/gallium/state_trackers/clover/api/kernel.cpp > > index 05cc392..a3b9735 100644 > > --- a/src/gallium/state_trackers/clover/api/kernel.cpp > > +++ b/src/gallium/state_trackers/clover/api/kernel.cpp > > @@ -276,6 +276,9 @@ clEnqueueNDRangeKernel(cl_command_queue d_q, cl_kernel > > d_kern, > > auto block_size = validate_block_size(q, kern, dims, > > d_grid_size, d_block_size); > > > > + cl_uint work_dim_val = block_size.size(); > > + kern.args().back().set(sizeof(work_dim_val), &work_dim_val); > > + > > validate_common(q, kern, deps); > > > > auto hev = create<hard_event>( > > @@ -299,6 +302,9 @@ clEnqueueTask(cl_command_queue d_q, cl_kernel d_kern, > > auto &kern = obj(d_kern); > > auto deps = objs<wait_list_tag>(d_deps, num_deps); > > > > + cl_uint work_dim_val = 1; > > + kern.args().back().set(sizeof(work_dim_val), &work_dim_val); > > + > > validate_common(q, kern, deps); > > > > auto hev = create<hard_event>( > > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp > > b/src/gallium/state_trackers/clover/llvm/invocation.cpp > > index 7bca0d6..a934384 100644 > > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp > > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp > > @@ -315,17 +315,17 @@ namespace { > > kernel_func = kernels[i]; > > kernel_name = kernel_func->getName(); > > > > - for (llvm::Function::arg_iterator I = kernel_func->arg_begin(), > > - E = kernel_func->arg_end(); I != E; > > ++I) { > > - llvm::Argument &arg = *I; > > #if HAVE_LLVM < 0x0302 > > - llvm::TargetData TD(kernel_func->getParent()); > > + llvm::TargetData TD(kernel_func->getParent()); > > #elif HAVE_LLVM < 0x0305 > > - llvm::DataLayout TD(kernel_func->getParent()->getDataLayout()); > > + llvm::DataLayout TD(kernel_func->getParent()->getDataLayout()); > > #else > > - llvm::DataLayout TD(mod); > > + llvm::DataLayout TD(mod); > > #endif > > > > + for (llvm::Function::arg_iterator I = kernel_func->arg_begin(), > > + E = kernel_func->arg_end(); I != E; > > ++I) { > > + llvm::Argument &arg = *I; > > llvm::Type *arg_type = arg.getType(); > > const unsigned arg_store_size = TD.getTypeStoreSize(arg_type); > > > > @@ -384,6 +384,14 @@ namespace { > > } > > } > > > > + // Implicit arguments > > + // Work dimensions (cl_uint), uint is 32 bit > > + llvm::Type *target_type = > > llvm::Type::getInt32Ty(mod->getContext()); > > + args.push_back(module::argument(module::argument::scalar, > > + sizeof(cl_uint), TD.getTypeStoreSize(target_type), > > + TD.getABITypeAlignment(target_type), > > + module::argument::zero_ext)); > > + > > m.syms.push_back(module::symbol(kernel_name, 0, i, args )); > > } > > > > -- > > 1.9.3 -- Jan Vesely <jan.ves...@rutgers.edu>
From 33d55ecbeae680c9a8291ca16fff49c94035c65a Mon Sep 17 00:00:00 2001 From: Jan Vesely <jan.ves...@rutgers.edu> Date: Mon, 1 Sep 2014 19:18:12 -0400 Subject: [PATCH 1/2] clover: save module argument types pass symbol reference instead of just arg list to kernel constructor Signed-off-by: Jan Vesely <jan.ves...@rutgers.edu> --- src/gallium/state_trackers/clover/api/kernel.cpp | 4 +-- src/gallium/state_trackers/clover/core/kernel.cpp | 5 ++-- src/gallium/state_trackers/clover/core/kernel.hpp | 2 +- src/gallium/state_trackers/clover/core/module.hpp | 9 ++++-- .../state_trackers/clover/llvm/invocation.cpp | 33 +++++++++++++++++----- .../state_trackers/clover/tgsi/compiler.cpp | 3 +- 6 files changed, 41 insertions(+), 15 deletions(-) diff --git a/src/gallium/state_trackers/clover/api/kernel.cpp b/src/gallium/state_trackers/clover/api/kernel.cpp index 05cc392..762c5d8 100644 --- a/src/gallium/state_trackers/clover/api/kernel.cpp +++ b/src/gallium/state_trackers/clover/api/kernel.cpp @@ -36,7 +36,7 @@ clCreateKernel(cl_program d_prog, const char *name, cl_int *r_errcode) try { auto &sym = find(name_equals(name), prog.symbols()); ret_error(r_errcode, CL_SUCCESS); - return new kernel(prog, name, range(sym.args)); + return new kernel(prog, name, sym); } catch (std::out_of_range &e) { ret_error(r_errcode, CL_INVALID_KERNEL_NAME); @@ -61,7 +61,7 @@ clCreateKernelsInProgram(cl_program d_prog, cl_uint count, return desc(new kernel(prog, std::string(sym.name.begin(), sym.name.end()), - range(sym.args))); + sym)); }, syms), rd_kerns); diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp b/src/gallium/state_trackers/clover/core/kernel.cpp index e4b2152..769e161 100644 --- a/src/gallium/state_trackers/clover/core/kernel.cpp +++ b/src/gallium/state_trackers/clover/core/kernel.cpp @@ -29,10 +29,10 @@ using namespace clover; kernel::kernel(clover::program &prog, const std::string &name, - const std::vector<module::argument> &margs) : + const clover::module::symbol &symbol): program(prog), _name(name), exec(*this), program_ref(prog._kernel_ref_counter) { - for (auto &marg : margs) { + for (const auto &marg : symbol.args) { if (marg.type == module::argument::scalar) _args.emplace_back(new scalar_argument(marg.size)); else if (marg.type == module::argument::global) @@ -70,6 +70,7 @@ kernel::launch(command_queue &q, const auto m = program().binary(q.device()); const auto reduced_grid_size = map(divides(), grid_size, block_size); + void *st = exec.bind(&q); // The handles are created during exec_context::bind(), so we need make diff --git a/src/gallium/state_trackers/clover/core/kernel.hpp b/src/gallium/state_trackers/clover/core/kernel.hpp index f9e2765..dbd41b9 100644 --- a/src/gallium/state_trackers/clover/core/kernel.hpp +++ b/src/gallium/state_trackers/clover/core/kernel.hpp @@ -106,7 +106,7 @@ namespace clover { public: kernel(clover::program &prog, const std::string &name, - const std::vector<clover::module::argument> &margs); + const clover::module::symbol &symbol); kernel(const kernel &kern) = delete; kernel & diff --git a/src/gallium/state_trackers/clover/core/module.hpp b/src/gallium/state_trackers/clover/core/module.hpp index 18a5bfb..6ea805d 100644 --- a/src/gallium/state_trackers/clover/core/module.hpp +++ b/src/gallium/state_trackers/clover/core/module.hpp @@ -93,14 +93,19 @@ namespace clover { struct symbol { symbol(const compat::vector<char> &name, resource_id section, - size_t offset, const compat::vector<argument> &args) : - name(name), section(section), offset(offset), args(args) { } + size_t offset, const compat::vector<argument> &args, + argument uint_arg, argument size_arg) : + name(name), section(section), offset(offset), args(args), + uint_arg(uint_arg), size_arg(size_arg) { } symbol() : name(), section(0), offset(0), args() { } compat::vector<char> name; resource_id section; size_t offset; compat::vector<argument> args; + + argument uint_arg; + argument size_arg; }; void serialize(compat::ostream &os) const; diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp index 7bca0d6..9d43d57 100644 --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp @@ -315,17 +315,17 @@ namespace { kernel_func = kernels[i]; kernel_name = kernel_func->getName(); - for (llvm::Function::arg_iterator I = kernel_func->arg_begin(), - E = kernel_func->arg_end(); I != E; ++I) { - llvm::Argument &arg = *I; #if HAVE_LLVM < 0x0302 - llvm::TargetData TD(kernel_func->getParent()); + llvm::TargetData TD(kernel_func->getParent()); #elif HAVE_LLVM < 0x0305 - llvm::DataLayout TD(kernel_func->getParent()->getDataLayout()); + llvm::DataLayout TD(kernel_func->getParent()->getDataLayout()); #else - llvm::DataLayout TD(mod); + llvm::DataLayout TD(mod); #endif + for (llvm::Function::arg_iterator I = kernel_func->arg_begin(), + E = kernel_func->arg_end(); I != E; ++I) { + llvm::Argument &arg = *I; llvm::Type *arg_type = arg.getType(); const unsigned arg_store_size = TD.getTypeStoreSize(arg_type); @@ -384,7 +384,26 @@ namespace { } } - m.syms.push_back(module::symbol(kernel_name, 0, i, args )); + + // Implicit arguments + llvm::Type *int32_type = llvm::Type::getInt32Ty(mod->getContext()); + // int, uint 32 bit + module::argument uint_arg = + module::argument(module::argument::scalar, sizeof(cl_uint), + TD.getTypeStoreSize(int32_type), + TD.getABITypeAlignment(int32_type), + module::argument::zero_ext); + //size_t is based on address bits, + //for now 32 bit is reported for every device + //FIXME: update this when we support device with 64 address bits + module::argument size_arg = + module::argument(module::argument::scalar, sizeof(size_t), + TD.getTypeStoreSize(int32_type), + TD.getABITypeAlignment(int32_type), + module::argument::zero_ext); + + m.syms.push_back(module::symbol(kernel_name, 0, i, args, uint_arg, + size_arg)); } header.num_bytes = llvm_bitcode.size(); diff --git a/src/gallium/state_trackers/clover/tgsi/compiler.cpp b/src/gallium/state_trackers/clover/tgsi/compiler.cpp index 93dfeb5..c162026 100644 --- a/src/gallium/state_trackers/clover/tgsi/compiler.cpp +++ b/src/gallium/state_trackers/clover/tgsi/compiler.cpp @@ -71,7 +71,8 @@ namespace { throw build_error("invalid kernel argument"); } - m.syms.push_back({ name, 0, offset, args }); + m.syms.push_back({ name, 0, offset, args, module::argument(), + module::argument() }); } } -- 1.9.3
From 34b38da76abcaf39e13646a67522f1b835a5cb24 Mon Sep 17 00:00:00 2001 From: Jan Vesely <jan.ves...@rutgers.edu> Date: Tue, 2 Sep 2014 14:42:27 -0400 Subject: [PATCH 2/2] clover: Add implicit arguments to kernel Signed-off-by: Jan Vesely <jan.ves...@rutgers.edu> --- src/gallium/state_trackers/clover/core/kernel.cpp | 30 +++++++++++++++++++++-- src/gallium/state_trackers/clover/core/kernel.hpp | 5 ++++ 2 files changed, 33 insertions(+), 2 deletions(-) diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp b/src/gallium/state_trackers/clover/core/kernel.cpp index 769e161..c141b4b 100644 --- a/src/gallium/state_trackers/clover/core/kernel.cpp +++ b/src/gallium/state_trackers/clover/core/kernel.cpp @@ -52,6 +52,11 @@ kernel::kernel(clover::program &prog, const std::string &name, else throw error(CL_INVALID_KERNEL_DEFINITION); } + + //Implicit arguments + // work dim + _implicit_args.emplace_back(new scalar_argument(symbol.uint_arg.size)); + _implicit_margs.emplace_back(&symbol.uint_arg); } template<typename V> @@ -71,6 +76,11 @@ kernel::launch(command_queue &q, const auto reduced_grid_size = map(divides(), grid_size, block_size); + //implicit arguments + cl_uint work_dim = block_size.size(); + _implicit_args[0]->set(sizeof(work_dim), &work_dim); + + void *st = exec.bind(&q); // The handles are created during exec_context::bind(), so we need make @@ -151,6 +161,16 @@ kernel::args() const { return map(derefs(), _args); } +kernel::argument_range +kernel::implicit_args() { + return map(derefs(), _implicit_args); +} + +kernel::const_argument_range +kernel::implicit_args() const { + return map(derefs(), _implicit_args); +} + const module & kernel::module(const command_queue &q) const { return program().binary(q.device()); @@ -171,13 +191,19 @@ kernel::exec_context::bind(intrusive_ptr<command_queue> _q) { // Bind kernel arguments. auto &m = kern.program().binary(q->device()); - auto margs = find(name_equals(kern.name()), m.syms).args; - auto msec = find(type_equals(module::section::text), m.secs); + const auto &margs = find(name_equals(kern.name()), m.syms).args; + const auto &msec = find(type_equals(module::section::text), m.secs); + //Explicit arguments for_each([=](kernel::argument &karg, const module::argument &marg) { karg.bind(*this, marg); }, kern.args(), margs); + //Implicit arguments + for_each([=](kernel::argument &karg, const module::argument *marg) { + karg.bind(*this, *marg); + }, kern.implicit_args(), kern._implicit_margs); + // Create a new compute state if anything changed. if (!st || q != _q || cs.req_local_mem != mem_local || diff --git a/src/gallium/state_trackers/clover/core/kernel.hpp b/src/gallium/state_trackers/clover/core/kernel.hpp index dbd41b9..b34ec40 100644 --- a/src/gallium/state_trackers/clover/core/kernel.hpp +++ b/src/gallium/state_trackers/clover/core/kernel.hpp @@ -131,6 +131,9 @@ namespace clover { argument_range args(); const_argument_range args() const; + argument_range implicit_args(); + const_argument_range implicit_args() const; + const intrusive_ref<clover::program> program; private: @@ -223,6 +226,8 @@ namespace clover { }; std::vector<std::unique_ptr<argument>> _args; + std::vector<std::unique_ptr<argument>> _implicit_args; + std::vector<const module::argument*> _implicit_margs; std::string _name; exec_context exec; const ref_holder program_ref; -- 1.9.3
signature.asc
Description: This is a digitally signed message part
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev