On Tue, 2014-09-02 at 15:36 +0300, Francisco Jerez wrote:
> Jan Vesely <jan.ves...@rutgers.edu> writes:
> 
> > On Sat, 2014-08-16 at 13:13 +0300, Francisco Jerez wrote:
> >> Jan Vesely <jan.ves...@rutgers.edu> writes:
> >> 
> >> > On Thu, 2014-08-07 at 16:02 +0300, Francisco Jerez wrote:
> >> >> Jan Vesely <jan.ves...@rutgers.edu> writes:
> >> >> 
> >> >> > This respin includes Francisco's approach of providing implicit
> >> >> > in the arg vector passed from clover, and Tom's idea of appending
> >> >> > implicit args after the kernel args.
> >> >> >
> >> >> 
> >> >> Hmmm...  Maybe it would make sense to add some sort of versioning
> >> >> (e.g. as part of the target triple) to the binary interface between
> >> >> clover and the kernel instead, so we can handle this sort of
> >> >> non-backwards compatible changes and the compiler back-end and libclc
> >> >> have some way to find out whether some specific feature is available and
> >> >> e.g. some specific extension should be enabled.
> >> >> 
> >> >> > I assumed it's not safe to modify exec.input, so the input vector is 
> >> >> > copied
> >> >> > before appending work dim.
> >> >> >
> >> >> 
> >> >> Why wouldn't it be safe?  You just need to make sure they're appended
> >> >> before the compute state is created.
> >> >
> >> > I thought there might be a problem when called from multiple threads,
> >> > but it looks like most of the vars are local to the current call anyway.
> >> >
> >> > I looked at the code a bit better, and need a bit of help with what the
> >> > proffered approach would be.
> >> >
> >> > exec_context::bind() appends all kernel args to the input vector. If the
> >> > implicit args are added before bind() it shifts all other args, which is
> >> > not what we want.
> >> > if the implicit args are appended after, they are not accounted for in
> >> > shader->input_size (and not copied by the driver).
> >> >
> >> > my current code modifies exec_context::bind() to preserve the content of
> >> > input before binding kernel args, and append the old content after the
> >> > args are bound.
> >> > I have also considered passing and implicit args vector to
> >> > exec_context::bind to make the trick more visible.
> >> >
> >> > Turning workdim into a proper arg in _args does not work either, because
> >> > it is not present in module args.
> >> >
> >> > any thoughts?
> >> >
> >> 
> >> I finally had a chance to take a closer look at your series.  It looks
> >> like you're right: In order to implement my proposal cleanly, implicit
> >> arguments would have to be part of the _args array so the compiler would
> >> have to include them in the module argument lists with memory layout
> >> parameters (e.g. alignment, size) suitable for the hardware, so there's
> >> probably little benefit compared to your original approach that includes
> >> the number of dimensions as an additional launch_grid() parameter.
> >> 
> >> So we don't have to change it again, can you add another array parameter
> >> for the base grid offset?  That's another thing we don't pass through
> >> the pipe driver API currently and CL requires. The prototype of
> >> launch_grid could look like:
> >> 
> >> | void (*launch_grid)(struct pipe_context *context, uint dims,
> >> |                     const uint *block_layout, const uint *grid_layout,
> >> |                     const uint *grid_offset, uint32_t pc,
> >> |                     const void *input);
> >> 
> >> And don't forget to update the docs. :)
> >
> > Hi,
> >
> > I wanted to explore the original idea of appending implicit args, since
> > launch_grid is driver specific and would need to reimplement the same
> > functionality in every driver.
> >
> > I came up with a solution (see the attached patch). I don't like that
> > the implicit arg needs to be set in api function.
> 
> Right, if we do it this way it would probably be a better fit for the
> clover::kernel code, the CL front-end doesn't really need to be aware of
> implicit args.
> 
> > I also don't like that this way there is no difference between
> > explicit and implicit kernel arguments. On the other hand it's simple,
> > and does not need additional per driver code.
> >
> Yeah...  We definitely want to hide these from the user, as e.g. the
> CL_KERNEL_NUM_ARGS param is required by the spec to return the number of
> arguments provided by the user, and we don't want the user to set
> implicit args, so it gets a bit messy.  I think I like better your
> original idea of passing them as launch_grid() arguments, even though
> the grid offset and dimension parameters are somewhat artificial from a
> the hardware's point of view.

sorry to bug you some more with this. I tried one more thing before
going back to the launch_grid parameters. this time it implements a
parallel infrastructure for implicit arguments by creating artificial
module arguments for uint and size_t (I don't think we need more for
implicit arguments).

I only added the work dimension argument but adding more should be easy.
If you think that the launch_grid way is better, I'll stop experimenting
as I ran out of ideas I wanted to try.

thanks,
jan

> 
> > thanks,
> > jan
> >
> >> 
> >> Thank you.
> >> 
> >> > thanks,
> >> > jan
> >> >
> >> >
> >> >> 
> >> >> > Passes get-work-dim piglit on turks without any regression,
> >> >> > I have not tested SI as I don't have the hw.
> >> >> >
> >> >> > jan
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > Jan Vesely (3):
> >> >> >   gallium: Pass input data size to launch_grid
> >> >> >   clover: Add work dimension implicit param to input
> >> >> >   r600,radeonsi: Copy implicit args provided by clover
> >> >> >
> >> >> >  src/gallium/drivers/ilo/ilo_gpgpu.c               |   2 +-
> >> >> >  src/gallium/drivers/nouveau/nvc0/nvc0_compute.c   |   2 +-
> >> >> >  src/gallium/drivers/nouveau/nvc0/nvc0_context.h   |   4 +-
> >> >> >  src/gallium/drivers/nouveau/nvc0/nve4_compute.c   |   2 +-
> >> >> >  src/gallium/drivers/r600/evergreen_compute.c      |  14 +-
> >> >> >  src/gallium/drivers/r600/evergreen_compute.h      |   1 -
> >> >> >  src/gallium/drivers/radeonsi/si_compute.c         |   6 +-
> >> >> >  src/gallium/include/pipe/p_context.h              |   2 +-
> >> >> >  src/gallium/state_trackers/clover/core/kernel.cpp | 162 
> >> >> > ++++++++++++----------
> >> >> >  src/gallium/tests/trivial/compute.c               |  40 +++---
> >> >> >  10 files changed, 122 insertions(+), 113 deletions(-)
> >> >> >
> >> >> > -- 
> >> >> > 1.9.3
> >> >
> >> > -- 
> >> > Jan Vesely <jan.ves...@rutgers.edu>
> >
> > -- 
> > Jan Vesely <jan.ves...@rutgers.edu>
> > From 7ad338ebd3a67b19d4ba492fb5a4cbda418fcdad Mon Sep 17 00:00:00 2001
> > From: Jan Vesely <jan.ves...@rutgers.edu>
> > Date: Mon, 1 Sep 2014 19:18:12 -0400
> > Subject: [PATCH RFC 1/1] clover: Append implicit work dim arg
> >
> > Signed-off-by: Jan Vesely <jan.ves...@rutgers.edu>
> > ---
> >  src/gallium/state_trackers/clover/api/kernel.cpp     |  6 ++++++
> >  .../state_trackers/clover/llvm/invocation.cpp        | 20 
> > ++++++++++++++------
> >  2 files changed, 20 insertions(+), 6 deletions(-)
> >
> > diff --git a/src/gallium/state_trackers/clover/api/kernel.cpp 
> > b/src/gallium/state_trackers/clover/api/kernel.cpp
> > index 05cc392..a3b9735 100644
> > --- a/src/gallium/state_trackers/clover/api/kernel.cpp
> > +++ b/src/gallium/state_trackers/clover/api/kernel.cpp
> > @@ -276,6 +276,9 @@ clEnqueueNDRangeKernel(cl_command_queue d_q, cl_kernel 
> > d_kern,
> >     auto block_size = validate_block_size(q, kern, dims,
> >                                           d_grid_size, d_block_size);
> >  
> > +   cl_uint work_dim_val = block_size.size();
> > +   kern.args().back().set(sizeof(work_dim_val), &work_dim_val);
> > +
> >     validate_common(q, kern, deps);
> >  
> >     auto hev = create<hard_event>(
> > @@ -299,6 +302,9 @@ clEnqueueTask(cl_command_queue d_q, cl_kernel d_kern,
> >     auto &kern = obj(d_kern);
> >     auto deps = objs<wait_list_tag>(d_deps, num_deps);
> >  
> > +   cl_uint work_dim_val = 1;
> > +   kern.args().back().set(sizeof(work_dim_val), &work_dim_val);
> > +
> >     validate_common(q, kern, deps);
> >  
> >     auto hev = create<hard_event>(
> > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> > b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > index 7bca0d6..a934384 100644
> > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > @@ -315,17 +315,17 @@ namespace {
> >           kernel_func = kernels[i];
> >           kernel_name = kernel_func->getName();
> >  
> > -         for (llvm::Function::arg_iterator I = kernel_func->arg_begin(),
> > -                                      E = kernel_func->arg_end(); I != E; 
> > ++I) {
> > -            llvm::Argument &arg = *I;
> >  #if HAVE_LLVM < 0x0302
> > -            llvm::TargetData TD(kernel_func->getParent());
> > +         llvm::TargetData TD(kernel_func->getParent());
> >  #elif HAVE_LLVM < 0x0305
> > -            llvm::DataLayout TD(kernel_func->getParent()->getDataLayout());
> > +         llvm::DataLayout TD(kernel_func->getParent()->getDataLayout());
> >  #else
> > -            llvm::DataLayout TD(mod);
> > +         llvm::DataLayout TD(mod);
> >  #endif
> >  
> > +         for (llvm::Function::arg_iterator I = kernel_func->arg_begin(),
> > +                                      E = kernel_func->arg_end(); I != E; 
> > ++I) {
> > +            llvm::Argument &arg = *I;
> >              llvm::Type *arg_type = arg.getType();
> >              const unsigned arg_store_size = TD.getTypeStoreSize(arg_type);
> >  
> > @@ -384,6 +384,14 @@ namespace {
> >              }
> >           }
> >  
> > +         // Implicit arguments
> > +         // Work dimensions (cl_uint), uint is 32 bit
> > +         llvm::Type *target_type = 
> > llvm::Type::getInt32Ty(mod->getContext());
> > +         args.push_back(module::argument(module::argument::scalar,
> > +                        sizeof(cl_uint), TD.getTypeStoreSize(target_type),
> > +                        TD.getABITypeAlignment(target_type),
> > +                        module::argument::zero_ext));
> > +
> >           m.syms.push_back(module::symbol(kernel_name, 0, i, args ));
> >        }
> >  
> > -- 
> > 1.9.3

-- 
Jan Vesely <jan.ves...@rutgers.edu>
From 33d55ecbeae680c9a8291ca16fff49c94035c65a Mon Sep 17 00:00:00 2001
From: Jan Vesely <jan.ves...@rutgers.edu>
Date: Mon, 1 Sep 2014 19:18:12 -0400
Subject: [PATCH 1/2] clover: save module argument types

pass symbol reference instead of just arg list to kernel constructor

Signed-off-by: Jan Vesely <jan.ves...@rutgers.edu>
---
 src/gallium/state_trackers/clover/api/kernel.cpp   |  4 +--
 src/gallium/state_trackers/clover/core/kernel.cpp  |  5 ++--
 src/gallium/state_trackers/clover/core/kernel.hpp  |  2 +-
 src/gallium/state_trackers/clover/core/module.hpp  |  9 ++++--
 .../state_trackers/clover/llvm/invocation.cpp      | 33 +++++++++++++++++-----
 .../state_trackers/clover/tgsi/compiler.cpp        |  3 +-
 6 files changed, 41 insertions(+), 15 deletions(-)

diff --git a/src/gallium/state_trackers/clover/api/kernel.cpp b/src/gallium/state_trackers/clover/api/kernel.cpp
index 05cc392..762c5d8 100644
--- a/src/gallium/state_trackers/clover/api/kernel.cpp
+++ b/src/gallium/state_trackers/clover/api/kernel.cpp
@@ -36,7 +36,7 @@ clCreateKernel(cl_program d_prog, const char *name, cl_int *r_errcode) try {
    auto &sym = find(name_equals(name), prog.symbols());
 
    ret_error(r_errcode, CL_SUCCESS);
-   return new kernel(prog, name, range(sym.args));
+   return new kernel(prog, name, sym);
 
 } catch (std::out_of_range &e) {
    ret_error(r_errcode, CL_INVALID_KERNEL_NAME);
@@ -61,7 +61,7 @@ clCreateKernelsInProgram(cl_program d_prog, cl_uint count,
                return desc(new kernel(prog,
                                       std::string(sym.name.begin(),
                                                   sym.name.end()),
-                                      range(sym.args)));
+                                      sym));
             }, syms),
          rd_kerns);
 
diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp b/src/gallium/state_trackers/clover/core/kernel.cpp
index e4b2152..769e161 100644
--- a/src/gallium/state_trackers/clover/core/kernel.cpp
+++ b/src/gallium/state_trackers/clover/core/kernel.cpp
@@ -29,10 +29,10 @@
 using namespace clover;
 
 kernel::kernel(clover::program &prog, const std::string &name,
-               const std::vector<module::argument> &margs) :
+               const clover::module::symbol &symbol):
    program(prog), _name(name), exec(*this),
    program_ref(prog._kernel_ref_counter) {
-   for (auto &marg : margs) {
+   for (const auto &marg : symbol.args) {
       if (marg.type == module::argument::scalar)
          _args.emplace_back(new scalar_argument(marg.size));
       else if (marg.type == module::argument::global)
@@ -70,6 +70,7 @@ kernel::launch(command_queue &q,
    const auto m = program().binary(q.device());
    const auto reduced_grid_size =
       map(divides(), grid_size, block_size);
+
    void *st = exec.bind(&q);
 
    // The handles are created during exec_context::bind(), so we need make
diff --git a/src/gallium/state_trackers/clover/core/kernel.hpp b/src/gallium/state_trackers/clover/core/kernel.hpp
index f9e2765..dbd41b9 100644
--- a/src/gallium/state_trackers/clover/core/kernel.hpp
+++ b/src/gallium/state_trackers/clover/core/kernel.hpp
@@ -106,7 +106,7 @@ namespace clover {
 
    public:
       kernel(clover::program &prog, const std::string &name,
-             const std::vector<clover::module::argument> &margs);
+             const clover::module::symbol &symbol);
 
       kernel(const kernel &kern) = delete;
       kernel &
diff --git a/src/gallium/state_trackers/clover/core/module.hpp b/src/gallium/state_trackers/clover/core/module.hpp
index 18a5bfb..6ea805d 100644
--- a/src/gallium/state_trackers/clover/core/module.hpp
+++ b/src/gallium/state_trackers/clover/core/module.hpp
@@ -93,14 +93,19 @@ namespace clover {
 
       struct symbol {
          symbol(const compat::vector<char> &name, resource_id section,
-                size_t offset, const compat::vector<argument> &args) :
-                name(name), section(section), offset(offset), args(args) { }
+                size_t offset, const compat::vector<argument> &args,
+                argument uint_arg, argument size_arg) :
+                name(name), section(section), offset(offset), args(args),
+                uint_arg(uint_arg), size_arg(size_arg) { }
          symbol() : name(), section(0), offset(0), args() { }
 
          compat::vector<char> name;
          resource_id section;
          size_t offset;
          compat::vector<argument> args;
+
+         argument uint_arg;
+         argument size_arg;
       };
 
       void serialize(compat::ostream &os) const;
diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index 7bca0d6..9d43d57 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -315,17 +315,17 @@ namespace {
          kernel_func = kernels[i];
          kernel_name = kernel_func->getName();
 
-         for (llvm::Function::arg_iterator I = kernel_func->arg_begin(),
-                                      E = kernel_func->arg_end(); I != E; ++I) {
-            llvm::Argument &arg = *I;
 #if HAVE_LLVM < 0x0302
-            llvm::TargetData TD(kernel_func->getParent());
+         llvm::TargetData TD(kernel_func->getParent());
 #elif HAVE_LLVM < 0x0305
-            llvm::DataLayout TD(kernel_func->getParent()->getDataLayout());
+         llvm::DataLayout TD(kernel_func->getParent()->getDataLayout());
 #else
-            llvm::DataLayout TD(mod);
+         llvm::DataLayout TD(mod);
 #endif
 
+         for (llvm::Function::arg_iterator I = kernel_func->arg_begin(),
+                                      E = kernel_func->arg_end(); I != E; ++I) {
+            llvm::Argument &arg = *I;
             llvm::Type *arg_type = arg.getType();
             const unsigned arg_store_size = TD.getTypeStoreSize(arg_type);
 
@@ -384,7 +384,26 @@ namespace {
             }
          }
 
-         m.syms.push_back(module::symbol(kernel_name, 0, i, args ));
+
+         // Implicit arguments
+         llvm::Type *int32_type = llvm::Type::getInt32Ty(mod->getContext());
+         // int, uint 32 bit
+         module::argument uint_arg =
+            module::argument(module::argument::scalar, sizeof(cl_uint),
+                             TD.getTypeStoreSize(int32_type),
+                             TD.getABITypeAlignment(int32_type),
+                             module::argument::zero_ext);
+         //size_t is based on address bits,
+         //for now 32 bit is reported for every device
+         //FIXME: update this when we support device with 64 address bits
+         module::argument size_arg =
+            module::argument(module::argument::scalar, sizeof(size_t),
+                             TD.getTypeStoreSize(int32_type),
+                             TD.getABITypeAlignment(int32_type),
+                             module::argument::zero_ext);
+
+         m.syms.push_back(module::symbol(kernel_name, 0, i, args, uint_arg,
+                                         size_arg));
       }
 
       header.num_bytes = llvm_bitcode.size();
diff --git a/src/gallium/state_trackers/clover/tgsi/compiler.cpp b/src/gallium/state_trackers/clover/tgsi/compiler.cpp
index 93dfeb5..c162026 100644
--- a/src/gallium/state_trackers/clover/tgsi/compiler.cpp
+++ b/src/gallium/state_trackers/clover/tgsi/compiler.cpp
@@ -71,7 +71,8 @@ namespace {
                throw build_error("invalid kernel argument");
          }
 
-         m.syms.push_back({ name, 0, offset, args });
+         m.syms.push_back({ name, 0, offset, args, module::argument(),
+                            module::argument() });
       }
    }
 
-- 
1.9.3

From 34b38da76abcaf39e13646a67522f1b835a5cb24 Mon Sep 17 00:00:00 2001
From: Jan Vesely <jan.ves...@rutgers.edu>
Date: Tue, 2 Sep 2014 14:42:27 -0400
Subject: [PATCH 2/2] clover: Add implicit arguments to kernel

Signed-off-by: Jan Vesely <jan.ves...@rutgers.edu>
---
 src/gallium/state_trackers/clover/core/kernel.cpp | 30 +++++++++++++++++++++--
 src/gallium/state_trackers/clover/core/kernel.hpp |  5 ++++
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp b/src/gallium/state_trackers/clover/core/kernel.cpp
index 769e161..c141b4b 100644
--- a/src/gallium/state_trackers/clover/core/kernel.cpp
+++ b/src/gallium/state_trackers/clover/core/kernel.cpp
@@ -52,6 +52,11 @@ kernel::kernel(clover::program &prog, const std::string &name,
       else
          throw error(CL_INVALID_KERNEL_DEFINITION);
    }
+
+   //Implicit arguments
+   // work dim
+   _implicit_args.emplace_back(new scalar_argument(symbol.uint_arg.size));
+   _implicit_margs.emplace_back(&symbol.uint_arg);
 }
 
 template<typename V>
@@ -71,6 +76,11 @@ kernel::launch(command_queue &q,
    const auto reduced_grid_size =
       map(divides(), grid_size, block_size);
 
+   //implicit arguments
+   cl_uint work_dim = block_size.size();
+   _implicit_args[0]->set(sizeof(work_dim), &work_dim);
+
+
    void *st = exec.bind(&q);
 
    // The handles are created during exec_context::bind(), so we need make
@@ -151,6 +161,16 @@ kernel::args() const {
    return map(derefs(), _args);
 }
 
+kernel::argument_range
+kernel::implicit_args() {
+   return map(derefs(), _implicit_args);
+}
+
+kernel::const_argument_range
+kernel::implicit_args() const {
+   return map(derefs(), _implicit_args);
+}
+
 const module &
 kernel::module(const command_queue &q) const {
    return program().binary(q.device());
@@ -171,13 +191,19 @@ kernel::exec_context::bind(intrusive_ptr<command_queue> _q) {
 
    // Bind kernel arguments.
    auto &m = kern.program().binary(q->device());
-   auto margs = find(name_equals(kern.name()), m.syms).args;
-   auto msec = find(type_equals(module::section::text), m.secs);
+   const auto &margs = find(name_equals(kern.name()), m.syms).args;
+   const auto &msec = find(type_equals(module::section::text), m.secs);
 
+   //Explicit arguments
    for_each([=](kernel::argument &karg, const module::argument &marg) {
                karg.bind(*this, marg);
             }, kern.args(), margs);
 
+   //Implicit arguments
+   for_each([=](kernel::argument &karg, const module::argument *marg) {
+               karg.bind(*this, *marg);
+            }, kern.implicit_args(), kern._implicit_margs);
+
    // Create a new compute state if anything changed.
    if (!st || q != _q ||
        cs.req_local_mem != mem_local ||
diff --git a/src/gallium/state_trackers/clover/core/kernel.hpp b/src/gallium/state_trackers/clover/core/kernel.hpp
index dbd41b9..b34ec40 100644
--- a/src/gallium/state_trackers/clover/core/kernel.hpp
+++ b/src/gallium/state_trackers/clover/core/kernel.hpp
@@ -131,6 +131,9 @@ namespace clover {
       argument_range args();
       const_argument_range args() const;
 
+      argument_range implicit_args();
+      const_argument_range implicit_args() const;
+
       const intrusive_ref<clover::program> program;
 
    private:
@@ -223,6 +226,8 @@ namespace clover {
       };
 
       std::vector<std::unique_ptr<argument>> _args;
+      std::vector<std::unique_ptr<argument>> _implicit_args;
+      std::vector<const module::argument*> _implicit_margs;
       std::string _name;
       exec_context exec;
       const ref_holder program_ref;
-- 
1.9.3

Attachment: signature.asc
Description: This is a digitally signed message part

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to