On Tue, 2014-09-02 at 15:36 +0300, Francisco Jerez wrote: > Jan Vesely <jan.ves...@rutgers.edu> writes: > > > On Sat, 2014-08-16 at 13:13 +0300, Francisco Jerez wrote: > >> Jan Vesely <jan.ves...@rutgers.edu> writes: > >> > >> > On Thu, 2014-08-07 at 16:02 +0300, Francisco Jerez wrote: > >> >> Jan Vesely <jan.ves...@rutgers.edu> writes: > >> >> > >> >> > This respin includes Francisco's approach of providing implicit > >> >> > in the arg vector passed from clover, and Tom's idea of appending > >> >> > implicit args after the kernel args. > >> >> > > >> >> > >> >> Hmmm... Maybe it would make sense to add some sort of versioning > >> >> (e.g. as part of the target triple) to the binary interface between > >> >> clover and the kernel instead, so we can handle this sort of > >> >> non-backwards compatible changes and the compiler back-end and libclc > >> >> have some way to find out whether some specific feature is available and > >> >> e.g. some specific extension should be enabled. > >> >> > >> >> > I assumed it's not safe to modify exec.input, so the input vector is > >> >> > copied > >> >> > before appending work dim. > >> >> > > >> >> > >> >> Why wouldn't it be safe? You just need to make sure they're appended > >> >> before the compute state is created. > >> > > >> > I thought there might be a problem when called from multiple threads, > >> > but it looks like most of the vars are local to the current call anyway. > >> > > >> > I looked at the code a bit better, and need a bit of help with what the > >> > proffered approach would be. > >> > > >> > exec_context::bind() appends all kernel args to the input vector. If the > >> > implicit args are added before bind() it shifts all other args, which is > >> > not what we want. > >> > if the implicit args are appended after, they are not accounted for in > >> > shader->input_size (and not copied by the driver). > >> > > >> > my current code modifies exec_context::bind() to preserve the content of > >> > input before binding kernel args, and append the old content after the > >> > args are bound. > >> > I have also considered passing and implicit args vector to > >> > exec_context::bind to make the trick more visible. > >> > > >> > Turning workdim into a proper arg in _args does not work either, because > >> > it is not present in module args. > >> > > >> > any thoughts? > >> > > >> > >> I finally had a chance to take a closer look at your series. It looks > >> like you're right: In order to implement my proposal cleanly, implicit > >> arguments would have to be part of the _args array so the compiler would > >> have to include them in the module argument lists with memory layout > >> parameters (e.g. alignment, size) suitable for the hardware, so there's > >> probably little benefit compared to your original approach that includes > >> the number of dimensions as an additional launch_grid() parameter. > >> > >> So we don't have to change it again, can you add another array parameter > >> for the base grid offset? That's another thing we don't pass through > >> the pipe driver API currently and CL requires. The prototype of > >> launch_grid could look like: > >> > >> | void (*launch_grid)(struct pipe_context *context, uint dims, > >> | const uint *block_layout, const uint *grid_layout, > >> | const uint *grid_offset, uint32_t pc, > >> | const void *input); > >> > >> And don't forget to update the docs. :) > > > > Hi, > > > > I wanted to explore the original idea of appending implicit args, since > > launch_grid is driver specific and would need to reimplement the same > > functionality in every driver. > > > > I came up with a solution (see the attached patch). I don't like that > > the implicit arg needs to be set in api function. > > Right, if we do it this way it would probably be a better fit for the > clover::kernel code, the CL front-end doesn't really need to be aware of > implicit args. > > > I also don't like that this way there is no difference between > > explicit and implicit kernel arguments. On the other hand it's simple, > > and does not need additional per driver code. > > > Yeah... We definitely want to hide these from the user, as e.g. the > CL_KERNEL_NUM_ARGS param is required by the spec to return the number of > arguments provided by the user, and we don't want the user to set > implicit args, so it gets a bit messy. I think I like better your > original idea of passing them as launch_grid() arguments, even though > the grid offset and dimension parameters are somewhat artificial from a > the hardware's point of view.
sorry to bug you some more with this. I tried one more thing before
going back to the launch_grid parameters. this time it implements a
parallel infrastructure for implicit arguments by creating artificial
module arguments for uint and size_t (I don't think we need more for
implicit arguments).
I only added the work dimension argument but adding more should be easy.
If you think that the launch_grid way is better, I'll stop experimenting
as I ran out of ideas I wanted to try.
thanks,
jan
>
> > thanks,
> > jan
> >
> >>
> >> Thank you.
> >>
> >> > thanks,
> >> > jan
> >> >
> >> >
> >> >>
> >> >> > Passes get-work-dim piglit on turks without any regression,
> >> >> > I have not tested SI as I don't have the hw.
> >> >> >
> >> >> > jan
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > Jan Vesely (3):
> >> >> > gallium: Pass input data size to launch_grid
> >> >> > clover: Add work dimension implicit param to input
> >> >> > r600,radeonsi: Copy implicit args provided by clover
> >> >> >
> >> >> > src/gallium/drivers/ilo/ilo_gpgpu.c | 2 +-
> >> >> > src/gallium/drivers/nouveau/nvc0/nvc0_compute.c | 2 +-
> >> >> > src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 4 +-
> >> >> > src/gallium/drivers/nouveau/nvc0/nve4_compute.c | 2 +-
> >> >> > src/gallium/drivers/r600/evergreen_compute.c | 14 +-
> >> >> > src/gallium/drivers/r600/evergreen_compute.h | 1 -
> >> >> > src/gallium/drivers/radeonsi/si_compute.c | 6 +-
> >> >> > src/gallium/include/pipe/p_context.h | 2 +-
> >> >> > src/gallium/state_trackers/clover/core/kernel.cpp | 162
> >> >> > ++++++++++++----------
> >> >> > src/gallium/tests/trivial/compute.c | 40 +++---
> >> >> > 10 files changed, 122 insertions(+), 113 deletions(-)
> >> >> >
> >> >> > --
> >> >> > 1.9.3
> >> >
> >> > --
> >> > Jan Vesely <jan.ves...@rutgers.edu>
> >
> > --
> > Jan Vesely <jan.ves...@rutgers.edu>
> > From 7ad338ebd3a67b19d4ba492fb5a4cbda418fcdad Mon Sep 17 00:00:00 2001
> > From: Jan Vesely <jan.ves...@rutgers.edu>
> > Date: Mon, 1 Sep 2014 19:18:12 -0400
> > Subject: [PATCH RFC 1/1] clover: Append implicit work dim arg
> >
> > Signed-off-by: Jan Vesely <jan.ves...@rutgers.edu>
> > ---
> > src/gallium/state_trackers/clover/api/kernel.cpp | 6 ++++++
> > .../state_trackers/clover/llvm/invocation.cpp | 20
> > ++++++++++++++------
> > 2 files changed, 20 insertions(+), 6 deletions(-)
> >
> > diff --git a/src/gallium/state_trackers/clover/api/kernel.cpp
> > b/src/gallium/state_trackers/clover/api/kernel.cpp
> > index 05cc392..a3b9735 100644
> > --- a/src/gallium/state_trackers/clover/api/kernel.cpp
> > +++ b/src/gallium/state_trackers/clover/api/kernel.cpp
> > @@ -276,6 +276,9 @@ clEnqueueNDRangeKernel(cl_command_queue d_q, cl_kernel
> > d_kern,
> > auto block_size = validate_block_size(q, kern, dims,
> > d_grid_size, d_block_size);
> >
> > + cl_uint work_dim_val = block_size.size();
> > + kern.args().back().set(sizeof(work_dim_val), &work_dim_val);
> > +
> > validate_common(q, kern, deps);
> >
> > auto hev = create<hard_event>(
> > @@ -299,6 +302,9 @@ clEnqueueTask(cl_command_queue d_q, cl_kernel d_kern,
> > auto &kern = obj(d_kern);
> > auto deps = objs<wait_list_tag>(d_deps, num_deps);
> >
> > + cl_uint work_dim_val = 1;
> > + kern.args().back().set(sizeof(work_dim_val), &work_dim_val);
> > +
> > validate_common(q, kern, deps);
> >
> > auto hev = create<hard_event>(
> > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > index 7bca0d6..a934384 100644
> > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > @@ -315,17 +315,17 @@ namespace {
> > kernel_func = kernels[i];
> > kernel_name = kernel_func->getName();
> >
> > - for (llvm::Function::arg_iterator I = kernel_func->arg_begin(),
> > - E = kernel_func->arg_end(); I != E;
> > ++I) {
> > - llvm::Argument &arg = *I;
> > #if HAVE_LLVM < 0x0302
> > - llvm::TargetData TD(kernel_func->getParent());
> > + llvm::TargetData TD(kernel_func->getParent());
> > #elif HAVE_LLVM < 0x0305
> > - llvm::DataLayout TD(kernel_func->getParent()->getDataLayout());
> > + llvm::DataLayout TD(kernel_func->getParent()->getDataLayout());
> > #else
> > - llvm::DataLayout TD(mod);
> > + llvm::DataLayout TD(mod);
> > #endif
> >
> > + for (llvm::Function::arg_iterator I = kernel_func->arg_begin(),
> > + E = kernel_func->arg_end(); I != E;
> > ++I) {
> > + llvm::Argument &arg = *I;
> > llvm::Type *arg_type = arg.getType();
> > const unsigned arg_store_size = TD.getTypeStoreSize(arg_type);
> >
> > @@ -384,6 +384,14 @@ namespace {
> > }
> > }
> >
> > + // Implicit arguments
> > + // Work dimensions (cl_uint), uint is 32 bit
> > + llvm::Type *target_type =
> > llvm::Type::getInt32Ty(mod->getContext());
> > + args.push_back(module::argument(module::argument::scalar,
> > + sizeof(cl_uint), TD.getTypeStoreSize(target_type),
> > + TD.getABITypeAlignment(target_type),
> > + module::argument::zero_ext));
> > +
> > m.syms.push_back(module::symbol(kernel_name, 0, i, args ));
> > }
> >
> > --
> > 1.9.3
--
Jan Vesely <jan.ves...@rutgers.edu>
From 33d55ecbeae680c9a8291ca16fff49c94035c65a Mon Sep 17 00:00:00 2001
From: Jan Vesely <jan.ves...@rutgers.edu>
Date: Mon, 1 Sep 2014 19:18:12 -0400
Subject: [PATCH 1/2] clover: save module argument types
pass symbol reference instead of just arg list to kernel constructor
Signed-off-by: Jan Vesely <jan.ves...@rutgers.edu>
---
src/gallium/state_trackers/clover/api/kernel.cpp | 4 +--
src/gallium/state_trackers/clover/core/kernel.cpp | 5 ++--
src/gallium/state_trackers/clover/core/kernel.hpp | 2 +-
src/gallium/state_trackers/clover/core/module.hpp | 9 ++++--
.../state_trackers/clover/llvm/invocation.cpp | 33 +++++++++++++++++-----
.../state_trackers/clover/tgsi/compiler.cpp | 3 +-
6 files changed, 41 insertions(+), 15 deletions(-)
diff --git a/src/gallium/state_trackers/clover/api/kernel.cpp b/src/gallium/state_trackers/clover/api/kernel.cpp
index 05cc392..762c5d8 100644
--- a/src/gallium/state_trackers/clover/api/kernel.cpp
+++ b/src/gallium/state_trackers/clover/api/kernel.cpp
@@ -36,7 +36,7 @@ clCreateKernel(cl_program d_prog, const char *name, cl_int *r_errcode) try {
auto &sym = find(name_equals(name), prog.symbols());
ret_error(r_errcode, CL_SUCCESS);
- return new kernel(prog, name, range(sym.args));
+ return new kernel(prog, name, sym);
} catch (std::out_of_range &e) {
ret_error(r_errcode, CL_INVALID_KERNEL_NAME);
@@ -61,7 +61,7 @@ clCreateKernelsInProgram(cl_program d_prog, cl_uint count,
return desc(new kernel(prog,
std::string(sym.name.begin(),
sym.name.end()),
- range(sym.args)));
+ sym));
}, syms),
rd_kerns);
diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp b/src/gallium/state_trackers/clover/core/kernel.cpp
index e4b2152..769e161 100644
--- a/src/gallium/state_trackers/clover/core/kernel.cpp
+++ b/src/gallium/state_trackers/clover/core/kernel.cpp
@@ -29,10 +29,10 @@
using namespace clover;
kernel::kernel(clover::program &prog, const std::string &name,
- const std::vector<module::argument> &margs) :
+ const clover::module::symbol &symbol):
program(prog), _name(name), exec(*this),
program_ref(prog._kernel_ref_counter) {
- for (auto &marg : margs) {
+ for (const auto &marg : symbol.args) {
if (marg.type == module::argument::scalar)
_args.emplace_back(new scalar_argument(marg.size));
else if (marg.type == module::argument::global)
@@ -70,6 +70,7 @@ kernel::launch(command_queue &q,
const auto m = program().binary(q.device());
const auto reduced_grid_size =
map(divides(), grid_size, block_size);
+
void *st = exec.bind(&q);
// The handles are created during exec_context::bind(), so we need make
diff --git a/src/gallium/state_trackers/clover/core/kernel.hpp b/src/gallium/state_trackers/clover/core/kernel.hpp
index f9e2765..dbd41b9 100644
--- a/src/gallium/state_trackers/clover/core/kernel.hpp
+++ b/src/gallium/state_trackers/clover/core/kernel.hpp
@@ -106,7 +106,7 @@ namespace clover {
public:
kernel(clover::program &prog, const std::string &name,
- const std::vector<clover::module::argument> &margs);
+ const clover::module::symbol &symbol);
kernel(const kernel &kern) = delete;
kernel &
diff --git a/src/gallium/state_trackers/clover/core/module.hpp b/src/gallium/state_trackers/clover/core/module.hpp
index 18a5bfb..6ea805d 100644
--- a/src/gallium/state_trackers/clover/core/module.hpp
+++ b/src/gallium/state_trackers/clover/core/module.hpp
@@ -93,14 +93,19 @@ namespace clover {
struct symbol {
symbol(const compat::vector<char> &name, resource_id section,
- size_t offset, const compat::vector<argument> &args) :
- name(name), section(section), offset(offset), args(args) { }
+ size_t offset, const compat::vector<argument> &args,
+ argument uint_arg, argument size_arg) :
+ name(name), section(section), offset(offset), args(args),
+ uint_arg(uint_arg), size_arg(size_arg) { }
symbol() : name(), section(0), offset(0), args() { }
compat::vector<char> name;
resource_id section;
size_t offset;
compat::vector<argument> args;
+
+ argument uint_arg;
+ argument size_arg;
};
void serialize(compat::ostream &os) const;
diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index 7bca0d6..9d43d57 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -315,17 +315,17 @@ namespace {
kernel_func = kernels[i];
kernel_name = kernel_func->getName();
- for (llvm::Function::arg_iterator I = kernel_func->arg_begin(),
- E = kernel_func->arg_end(); I != E; ++I) {
- llvm::Argument &arg = *I;
#if HAVE_LLVM < 0x0302
- llvm::TargetData TD(kernel_func->getParent());
+ llvm::TargetData TD(kernel_func->getParent());
#elif HAVE_LLVM < 0x0305
- llvm::DataLayout TD(kernel_func->getParent()->getDataLayout());
+ llvm::DataLayout TD(kernel_func->getParent()->getDataLayout());
#else
- llvm::DataLayout TD(mod);
+ llvm::DataLayout TD(mod);
#endif
+ for (llvm::Function::arg_iterator I = kernel_func->arg_begin(),
+ E = kernel_func->arg_end(); I != E; ++I) {
+ llvm::Argument &arg = *I;
llvm::Type *arg_type = arg.getType();
const unsigned arg_store_size = TD.getTypeStoreSize(arg_type);
@@ -384,7 +384,26 @@ namespace {
}
}
- m.syms.push_back(module::symbol(kernel_name, 0, i, args ));
+
+ // Implicit arguments
+ llvm::Type *int32_type = llvm::Type::getInt32Ty(mod->getContext());
+ // int, uint 32 bit
+ module::argument uint_arg =
+ module::argument(module::argument::scalar, sizeof(cl_uint),
+ TD.getTypeStoreSize(int32_type),
+ TD.getABITypeAlignment(int32_type),
+ module::argument::zero_ext);
+ //size_t is based on address bits,
+ //for now 32 bit is reported for every device
+ //FIXME: update this when we support device with 64 address bits
+ module::argument size_arg =
+ module::argument(module::argument::scalar, sizeof(size_t),
+ TD.getTypeStoreSize(int32_type),
+ TD.getABITypeAlignment(int32_type),
+ module::argument::zero_ext);
+
+ m.syms.push_back(module::symbol(kernel_name, 0, i, args, uint_arg,
+ size_arg));
}
header.num_bytes = llvm_bitcode.size();
diff --git a/src/gallium/state_trackers/clover/tgsi/compiler.cpp b/src/gallium/state_trackers/clover/tgsi/compiler.cpp
index 93dfeb5..c162026 100644
--- a/src/gallium/state_trackers/clover/tgsi/compiler.cpp
+++ b/src/gallium/state_trackers/clover/tgsi/compiler.cpp
@@ -71,7 +71,8 @@ namespace {
throw build_error("invalid kernel argument");
}
- m.syms.push_back({ name, 0, offset, args });
+ m.syms.push_back({ name, 0, offset, args, module::argument(),
+ module::argument() });
}
}
--
1.9.3
From 34b38da76abcaf39e13646a67522f1b835a5cb24 Mon Sep 17 00:00:00 2001
From: Jan Vesely <jan.ves...@rutgers.edu>
Date: Tue, 2 Sep 2014 14:42:27 -0400
Subject: [PATCH 2/2] clover: Add implicit arguments to kernel
Signed-off-by: Jan Vesely <jan.ves...@rutgers.edu>
---
src/gallium/state_trackers/clover/core/kernel.cpp | 30 +++++++++++++++++++++--
src/gallium/state_trackers/clover/core/kernel.hpp | 5 ++++
2 files changed, 33 insertions(+), 2 deletions(-)
diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp b/src/gallium/state_trackers/clover/core/kernel.cpp
index 769e161..c141b4b 100644
--- a/src/gallium/state_trackers/clover/core/kernel.cpp
+++ b/src/gallium/state_trackers/clover/core/kernel.cpp
@@ -52,6 +52,11 @@ kernel::kernel(clover::program &prog, const std::string &name,
else
throw error(CL_INVALID_KERNEL_DEFINITION);
}
+
+ //Implicit arguments
+ // work dim
+ _implicit_args.emplace_back(new scalar_argument(symbol.uint_arg.size));
+ _implicit_margs.emplace_back(&symbol.uint_arg);
}
template<typename V>
@@ -71,6 +76,11 @@ kernel::launch(command_queue &q,
const auto reduced_grid_size =
map(divides(), grid_size, block_size);
+ //implicit arguments
+ cl_uint work_dim = block_size.size();
+ _implicit_args[0]->set(sizeof(work_dim), &work_dim);
+
+
void *st = exec.bind(&q);
// The handles are created during exec_context::bind(), so we need make
@@ -151,6 +161,16 @@ kernel::args() const {
return map(derefs(), _args);
}
+kernel::argument_range
+kernel::implicit_args() {
+ return map(derefs(), _implicit_args);
+}
+
+kernel::const_argument_range
+kernel::implicit_args() const {
+ return map(derefs(), _implicit_args);
+}
+
const module &
kernel::module(const command_queue &q) const {
return program().binary(q.device());
@@ -171,13 +191,19 @@ kernel::exec_context::bind(intrusive_ptr<command_queue> _q) {
// Bind kernel arguments.
auto &m = kern.program().binary(q->device());
- auto margs = find(name_equals(kern.name()), m.syms).args;
- auto msec = find(type_equals(module::section::text), m.secs);
+ const auto &margs = find(name_equals(kern.name()), m.syms).args;
+ const auto &msec = find(type_equals(module::section::text), m.secs);
+ //Explicit arguments
for_each([=](kernel::argument &karg, const module::argument &marg) {
karg.bind(*this, marg);
}, kern.args(), margs);
+ //Implicit arguments
+ for_each([=](kernel::argument &karg, const module::argument *marg) {
+ karg.bind(*this, *marg);
+ }, kern.implicit_args(), kern._implicit_margs);
+
// Create a new compute state if anything changed.
if (!st || q != _q ||
cs.req_local_mem != mem_local ||
diff --git a/src/gallium/state_trackers/clover/core/kernel.hpp b/src/gallium/state_trackers/clover/core/kernel.hpp
index dbd41b9..b34ec40 100644
--- a/src/gallium/state_trackers/clover/core/kernel.hpp
+++ b/src/gallium/state_trackers/clover/core/kernel.hpp
@@ -131,6 +131,9 @@ namespace clover {
argument_range args();
const_argument_range args() const;
+ argument_range implicit_args();
+ const_argument_range implicit_args() const;
+
const intrusive_ref<clover::program> program;
private:
@@ -223,6 +226,8 @@ namespace clover {
};
std::vector<std::unique_ptr<argument>> _args;
+ std::vector<std::unique_ptr<argument>> _implicit_args;
+ std::vector<const module::argument*> _implicit_margs;
std::string _name;
exec_context exec;
const ref_holder program_ref;
--
1.9.3
signature.asc
Description: This is a digitally signed message part
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
