Re: [Mesa-dev] [PATCH 0/3] cl workdim v2

Jan Vesely Mon, 01 Sep 2014 16:34:38 -0700

On Sat, 2014-08-16 at 13:13 +0300, Francisco Jerez wrote:
> Jan Vesely <jan.ves...@rutgers.edu> writes:
> 
> > On Thu, 2014-08-07 at 16:02 +0300, Francisco Jerez wrote:
> >> Jan Vesely <jan.ves...@rutgers.edu> writes:
> >> 
> >> > This respin includes Francisco's approach of providing implicit
> >> > in the arg vector passed from clover, and Tom's idea of appending
> >> > implicit args after the kernel args.
> >> >
> >> 
> >> Hmmm...  Maybe it would make sense to add some sort of versioning
> >> (e.g. as part of the target triple) to the binary interface between
> >> clover and the kernel instead, so we can handle this sort of
> >> non-backwards compatible changes and the compiler back-end and libclc
> >> have some way to find out whether some specific feature is available and
> >> e.g. some specific extension should be enabled.
> >> 
> >> > I assumed it's not safe to modify exec.input, so the input vector is 
> >> > copied
> >> > before appending work dim.
> >> >
> >> 
> >> Why wouldn't it be safe?  You just need to make sure they're appended
> >> before the compute state is created.
> >
> > I thought there might be a problem when called from multiple threads,
> > but it looks like most of the vars are local to the current call anyway.
> >
> > I looked at the code a bit better, and need a bit of help with what the
> > proffered approach would be.
> >
> > exec_context::bind() appends all kernel args to the input vector. If the
> > implicit args are added before bind() it shifts all other args, which is
> > not what we want.
> > if the implicit args are appended after, they are not accounted for in
> > shader->input_size (and not copied by the driver).
> >
> > my current code modifies exec_context::bind() to preserve the content of
> > input before binding kernel args, and append the old content after the
> > args are bound.
> > I have also considered passing and implicit args vector to
> > exec_context::bind to make the trick more visible.
> >
> > Turning workdim into a proper arg in _args does not work either, because
> > it is not present in module args.
> >
> > any thoughts?
> >
> 
> I finally had a chance to take a closer look at your series.  It looks
> like you're right: In order to implement my proposal cleanly, implicit
> arguments would have to be part of the _args array so the compiler would
> have to include them in the module argument lists with memory layout
> parameters (e.g. alignment, size) suitable for the hardware, so there's
> probably little benefit compared to your original approach that includes
> the number of dimensions as an additional launch_grid() parameter.
> 
> So we don't have to change it again, can you add another array parameter
> for the base grid offset?  That's another thing we don't pass through
> the pipe driver API currently and CL requires. The prototype of
> launch_grid could look like:
> 
> | void (*launch_grid)(struct pipe_context *context, uint dims,
> |                     const uint *block_layout, const uint *grid_layout,
> |                     const uint *grid_offset, uint32_t pc,
> |                     const void *input);
> 
> And don't forget to update the docs. :)


Hi,

I wanted to explore the original idea of appending implicit args, since
launch_grid is driver specific and would need to reimplement the same
functionality in every driver.

I came up with a solution (see the attached patch). I don't like that
the implicit arg needs to be set in api function. I also don't like that
this way there is no difference between explicit and implicit kernel
arguments. On the other hand it's simple, and does not need additional
per driver code.

thanks,
jan

> 
> Thank you.
> 
> > thanks,
> > jan
> >
> >
> >> 
> >> > Passes get-work-dim piglit on turks without any regression,
> >> > I have not tested SI as I don't have the hw.
> >> >
> >> > jan
> >> >
> >> >
> >> >
> >> >
> >> > Jan Vesely (3):
> >> >   gallium: Pass input data size to launch_grid
> >> >   clover: Add work dimension implicit param to input
> >> >   r600,radeonsi: Copy implicit args provided by clover
> >> >
> >> >  src/gallium/drivers/ilo/ilo_gpgpu.c               |   2 +-
> >> >  src/gallium/drivers/nouveau/nvc0/nvc0_compute.c   |   2 +-
> >> >  src/gallium/drivers/nouveau/nvc0/nvc0_context.h   |   4 +-
> >> >  src/gallium/drivers/nouveau/nvc0/nve4_compute.c   |   2 +-
> >> >  src/gallium/drivers/r600/evergreen_compute.c      |  14 +-
> >> >  src/gallium/drivers/r600/evergreen_compute.h      |   1 -
> >> >  src/gallium/drivers/radeonsi/si_compute.c         |   6 +-
> >> >  src/gallium/include/pipe/p_context.h              |   2 +-
> >> >  src/gallium/state_trackers/clover/core/kernel.cpp | 162 
> >> > ++++++++++++----------
> >> >  src/gallium/tests/trivial/compute.c               |  40 +++---
> >> >  10 files changed, 122 insertions(+), 113 deletions(-)
> >> >
> >> > -- 
> >> > 1.9.3
> >
> > -- 
> > Jan Vesely <jan.ves...@rutgers.edu>

-- 
Jan Vesely <jan.ves...@rutgers.edu>

From 7ad338ebd3a67b19d4ba492fb5a4cbda418fcdad Mon Sep 17 00:00:00 2001
From: Jan Vesely <jan.ves...@rutgers.edu>
Date: Mon, 1 Sep 2014 19:18:12 -0400
Subject: [PATCH RFC 1/1] clover: Append implicit work dim arg

Signed-off-by: Jan Vesely <jan.ves...@rutgers.edu>
---
 src/gallium/state_trackers/clover/api/kernel.cpp     |  6 ++++++
 .../state_trackers/clover/llvm/invocation.cpp        | 20 ++++++++++++++------
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/clover/api/kernel.cpp b/src/gallium/state_trackers/clover/api/kernel.cpp
index 05cc392..a3b9735 100644
--- a/src/gallium/state_trackers/clover/api/kernel.cpp
+++ b/src/gallium/state_trackers/clover/api/kernel.cpp
@@ -276,6 +276,9 @@ clEnqueueNDRangeKernel(cl_command_queue d_q, cl_kernel d_kern,
    auto block_size = validate_block_size(q, kern, dims,
                                          d_grid_size, d_block_size);
 
+   cl_uint work_dim_val = block_size.size();
+   kern.args().back().set(sizeof(work_dim_val), &work_dim_val);
+
    validate_common(q, kern, deps);
 
    auto hev = create<hard_event>(
@@ -299,6 +302,9 @@ clEnqueueTask(cl_command_queue d_q, cl_kernel d_kern,
    auto &kern = obj(d_kern);
    auto deps = objs<wait_list_tag>(d_deps, num_deps);
 
+   cl_uint work_dim_val = 1;
+   kern.args().back().set(sizeof(work_dim_val), &work_dim_val);
+
    validate_common(q, kern, deps);
 
    auto hev = create<hard_event>(
diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index 7bca0d6..a934384 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -315,17 +315,17 @@ namespace {
          kernel_func = kernels[i];
          kernel_name = kernel_func->getName();
 
-         for (llvm::Function::arg_iterator I = kernel_func->arg_begin(),
-                                      E = kernel_func->arg_end(); I != E; ++I) {
-            llvm::Argument &arg = *I;
 #if HAVE_LLVM < 0x0302
-            llvm::TargetData TD(kernel_func->getParent());
+         llvm::TargetData TD(kernel_func->getParent());
 #elif HAVE_LLVM < 0x0305
-            llvm::DataLayout TD(kernel_func->getParent()->getDataLayout());
+         llvm::DataLayout TD(kernel_func->getParent()->getDataLayout());
 #else
-            llvm::DataLayout TD(mod);
+         llvm::DataLayout TD(mod);
 #endif
 
+         for (llvm::Function::arg_iterator I = kernel_func->arg_begin(),
+                                      E = kernel_func->arg_end(); I != E; ++I) {
+            llvm::Argument &arg = *I;
             llvm::Type *arg_type = arg.getType();
             const unsigned arg_store_size = TD.getTypeStoreSize(arg_type);
 
@@ -384,6 +384,14 @@ namespace {
             }
          }
 
+         // Implicit arguments
+         // Work dimensions (cl_uint), uint is 32 bit
+         llvm::Type *target_type = llvm::Type::getInt32Ty(mod->getContext());
+         args.push_back(module::argument(module::argument::scalar,
+                        sizeof(cl_uint), TD.getTypeStoreSize(target_type),
+                        TD.getABITypeAlignment(target_type),
+                        module::argument::zero_ext));
+
          m.syms.push_back(module::symbol(kernel_name, 0, i, args ));
       }
 
-- 
1.9.3

signature.asc
Description: This is a digitally signed message part

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/3] cl workdim v2

Reply via email to