Hi Grigori,

Grigori Goronzy <g...@chown.ath.cx> writes:

> Wrap MapBuffer and MapImage as hard_event actions, like other
> operations. This enables correct profiling. Also make sure to wait
> for events to finish when blocking is requested by the caller.
> ---
>  src/gallium/state_trackers/clover/api/transfer.cpp | 50 
> ++++++++++++++++++++--
>  1 file changed, 46 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/state_trackers/clover/api/transfer.cpp 
> b/src/gallium/state_trackers/clover/api/transfer.cpp
> index fdb9405..4986f53 100644
> --- a/src/gallium/state_trackers/clover/api/transfer.cpp
> +++ b/src/gallium/state_trackers/clover/api/transfer.cpp
> @@ -270,6 +270,18 @@ namespace {
>                                     src_obj->resource(q), src_orig);
>        };
>     }
> +
> +   ///
> +   /// Resource mapping
> +   ///
> +   template<typename T>
> +   std::function<void (event &)>
> +   map_resource_op(command_queue &q, T obj, cl_map_flags flags, bool 
> blocking,
> +                   const vector_t &origin, const vector_t &region, void 
> **map) {
> +      return [=, &q](event &) {
> +         *map = obj->resource(q).add_map(q, flags, blocking, origin, region);
> +      };
> +   }
>  }
>  
>  CLOVER_API cl_int
> @@ -363,6 +375,10 @@ clEnqueueReadBufferRect(cl_command_queue d_q, cl_mem 
> d_mem, cl_bool blocking,
>                     region));
>  
>     ret_object(rd_ev, hev);
> +
> +   if (blocking)
> +      hev().wait();
> +

hard_event::wait() may fail, so this should probably be done before the
ret_object() call to avoid leaks.  Is there any reason you didn't make
the same change in clEnqueueReadBuffer() and clEnqueueWriteBuffer()?

>     return CL_SUCCESS;
>  
>  } catch (error &e) {
> @@ -400,6 +416,10 @@ clEnqueueWriteBufferRect(cl_command_queue d_q, cl_mem 
> d_mem, cl_bool blocking,
>                     region));
>  
>     ret_object(rd_ev, hev);
> +
> +   if (blocking)
> +      hev().wait();
> +

Same comment as above.  Also note that this is being more strict than
the spec requires (which I believe is what Tom was referring to).  From
the CL 1.2 spec:

| If blocking_write is CL_TRUE, the OpenCL implementation copies the data
| referred to by ptr and enqueues the write operation in the
| command-queue. The memory pointed to by ptr can be reused by the
| application after the clEnqueueWriteBufferRect call returns.

The spec is giving you no guarantee that the write to the actual memory
object will be complete by the time the clEnqueueWriteBufferRect call
returns -- Only that your data will have been buffered somewhere and the
memory pointed to by the argument can be reused immediately by the
application.  The reason why I was reluctant to make this change last
time it came up was that it's likely to hurt performance unnecessarily
because the wait() call blocks until *all* previous commands in the same
queue have completed execution, even though in the most common case the
copy is performed synchronously using soft_copy_op(), so the wait() call
is redundant even for blocking copies.

The case with blocking reads is similar, the copy is handled
synchronously using soft_copy_op() when no user events are present in
the list of dependencies, so calling wait() on the event is unnecessary
to guarantee that the execution of the read has completed, and will
cause a pipe_context flush and wait until the most recent fence is
signalled.

Ideally we would have a weaker variant of event::wait()
(e.g. wait_signalled()) that doesn't flush and just waits for the
associated action call-back to have been executed without giving any
guarantees about the corresponding GPU command.  The event interface
doesn't expose such a functionality right now, I'm attaching two
(completely untested) patches implementing it, you should be able to use
them as starting point to fix blocking transfers.

>     return CL_SUCCESS;
>  
>  } catch (error &e) {
> @@ -505,6 +525,10 @@ clEnqueueReadImage(cl_command_queue d_q, cl_mem d_mem, 
> cl_bool blocking,
>                     region));
>  
>     ret_object(rd_ev, hev);
> +
> +   if (blocking)
> +      hev().wait();
> +
>     return CL_SUCCESS;
>  
>  } catch (error &e) {
> @@ -539,6 +563,10 @@ clEnqueueWriteImage(cl_command_queue d_q, cl_mem d_mem, 
> cl_bool blocking,
>                     region));
>  
>     ret_object(rd_ev, hev);
> +
> +   if (blocking)
> +      hev().wait();
> +

Same comment as above for image reads and writes.

>     return CL_SUCCESS;
>  
>  } catch (error &e) {
> @@ -665,10 +693,17 @@ clEnqueueMapBuffer(cl_command_queue d_q, cl_mem d_mem, 
> cl_bool blocking,
>     validate_object(q, mem, obj_origin, obj_pitch, region);
>     validate_map_flags(mem, flags);
>  
> -   void *map = mem.resource(q).add_map(q, flags, blocking, obj_origin, 
> region);
> +   void *map = nullptr;
> +   auto hev = create<hard_event>(
> +      q, CL_COMMAND_MAP_BUFFER, deps,
> +      map_resource_op(q, &mem, flags, blocking, obj_origin, region, &map));
>  
> -   ret_object(rd_ev, create<hard_event>(q, CL_COMMAND_MAP_BUFFER, deps));
> +   ret_object(rd_ev, hev);
>     ret_error(r_errcode, CL_SUCCESS);
> +
> +   if (blocking)
> +      hev().wait();
> +
This doesn't look right for non-blocking maps, if you wrap it in an
event you need to make sure that the action has been executed (e.g. by
using event::wait_signalled()) before you return, even if the map is
non-blocking, otherwise the "map" variable won't be initialized in time
and you'll just return nullptr.

>     return map;
>  
>  } catch (error &e) {
> @@ -693,10 +728,17 @@ clEnqueueMapImage(cl_command_queue d_q, cl_mem d_mem, 
> cl_bool blocking,
>     validate_object(q, img, origin, region);
>     validate_map_flags(img, flags);
>  
> -   void *map = img.resource(q).add_map(q, flags, blocking, origin, region);
> +   void *map = nullptr;
> +   auto hev = create<hard_event>(
> +      q, CL_COMMAND_MAP_IMAGE, deps,
> +      map_resource_op(q, &img, flags, blocking, origin, region, &map));
>  
> -   ret_object(rd_ev, create<hard_event>(q, CL_COMMAND_MAP_IMAGE, deps));
> +   ret_object(rd_ev, hev);
>     ret_error(r_errcode, CL_SUCCESS);
> +
> +   if (blocking)
> +      hev().wait();
> +

Same here.

>     return map;
>  
>  } catch (error &e) {
> -- 
> 1.9.1
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

From 162047763a83a57764c5784c95c3f5474da897b1 Mon Sep 17 00:00:00 2001
From: Francisco Jerez <curroje...@riseup.net>
Date: Tue, 9 Jun 2015 22:52:25 +0300
Subject: [PATCH 1/2] clover: Wrap event::wait_count in a method taking care of
 the required locking.

---
 src/gallium/state_trackers/clover/core/event.cpp | 19 ++++++++++++-------
 src/gallium/state_trackers/clover/core/event.hpp |  3 ++-
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/src/gallium/state_trackers/clover/core/event.cpp b/src/gallium/state_trackers/clover/core/event.cpp
index e1f9de0..fdd9209 100644
--- a/src/gallium/state_trackers/clover/core/event.cpp
+++ b/src/gallium/state_trackers/clover/core/event.cpp
@@ -27,7 +27,7 @@ using namespace clover;
 
 event::event(clover::context &ctx, const ref_vector<event> &deps,
              action action_ok, action action_fail) :
-   context(ctx), wait_count(1), _status(0),
+   context(ctx), _wait_count(1), _status(0),
    action_ok(action_ok), action_fail(action_fail) {
    for (auto &ev : deps)
       ev.chain(*this);
@@ -41,7 +41,7 @@ event::trigger_self() {
    std::lock_guard<std::mutex> lock(mutex);
    std::vector<intrusive_ref<event>> evs;
 
-   if (!--wait_count)
+   if (!--_wait_count)
       std::swap(_chain, evs);
 
    return evs;
@@ -81,10 +81,15 @@ event::abort(cl_int status) {
       ev.abort(status);
 }
 
+unsigned
+event::wait_count() const {
+   std::lock_guard<std::mutex> lock(mutex);
+   return _wait_count;
+}
+
 bool
 event::signalled() const {
-   std::lock_guard<std::mutex> lock(mutex);
-   return !wait_count;
+   return !wait_count();
 }
 
 cl_int
@@ -99,8 +104,8 @@ event::chain(event &ev) {
    std::unique_lock<std::mutex> lock_ev(ev.mutex, std::defer_lock);
    std::lock(lock, lock_ev);
 
-   if (wait_count) {
-      ev.wait_count++;
+   if (_wait_count) {
+      ev._wait_count++;
       _chain.push_back(ev);
    }
    ev.deps.push_back(*this);
@@ -112,7 +117,7 @@ event::wait() const {
       ev.wait();
 
    std::unique_lock<std::mutex> lock(mutex);
-   cv.wait(lock, [=]{ return !wait_count; });
+   cv.wait(lock, [=]{ return !_wait_count; });
 }
 
 hard_event::hard_event(command_queue &q, cl_command_type command,
diff --git a/src/gallium/state_trackers/clover/core/event.hpp b/src/gallium/state_trackers/clover/core/event.hpp
index 6469e48..53dac68 100644
--- a/src/gallium/state_trackers/clover/core/event.hpp
+++ b/src/gallium/state_trackers/clover/core/event.hpp
@@ -85,8 +85,9 @@ namespace clover {
    private:
       std::vector<intrusive_ref<event>> trigger_self();
       std::vector<intrusive_ref<event>> abort_self(cl_int status);
+      unsigned wait_count() const;
 
-      unsigned wait_count;
+      unsigned _wait_count;
       cl_int _status;
       action action_ok;
       action action_fail;
-- 
2.3.5

From 9789fa01f013b3f451cc8f8bcfa315edad1865a0 Mon Sep 17 00:00:00 2001
From: Francisco Jerez <curroje...@riseup.net>
Date: Tue, 9 Jun 2015 22:59:43 +0300
Subject: [PATCH 2/2] clover: Run the associated action before an event is
 signalled.

And define a method for other threads to wait until the action
function associated with an event has been executed to completion.

For hard events, this will mean waiting until the corresponding
command has been submitted to the pipe driver, without necessarily
flushing the pipe_context and waiting for the actual command to be
processed by the GPU (which is what hard_event::wait() already does).

This weaker kind of event wait will allow implementing blocking memory
transfers efficiently.
---
 src/gallium/state_trackers/clover/core/event.cpp | 22 +++++++++++-----------
 src/gallium/state_trackers/clover/core/event.hpp |  1 +
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/src/gallium/state_trackers/clover/core/event.cpp b/src/gallium/state_trackers/clover/core/event.cpp
index fdd9209..1f0db1b 100644
--- a/src/gallium/state_trackers/clover/core/event.cpp
+++ b/src/gallium/state_trackers/clover/core/event.cpp
@@ -44,19 +44,16 @@ event::trigger_self() {
    if (!--_wait_count)
       std::swap(_chain, evs);
 
+   cv.notify_all();
    return evs;
 }
 
 void
 event::trigger() {
-   auto evs = trigger_self();
-
-   if (signalled()) {
+   if (wait_count() == 1)
       action_ok(*this);
-      cv.notify_all();
-   }
 
-   for (event &ev : evs)
+   for (event &ev : trigger_self())
       ev.trigger();
 }
 
@@ -73,11 +70,9 @@ event::abort_self(cl_int status) {
 
 void
 event::abort(cl_int status) {
-   auto evs = abort_self(status);
-
    action_fail(*this);
 
-   for (event &ev : evs)
+   for (event &ev : abort_self(status))
       ev.abort(status);
 }
 
@@ -112,12 +107,17 @@ event::chain(event &ev) {
 }
 
 void
+event::wait_signalled() const {
+   std::unique_lock<std::mutex> lock(mutex);
+   cv.wait(lock, [=]{ return !_wait_count; });
+}
+
+void
 event::wait() const {
    for (event &ev : deps)
       ev.wait();
 
-   std::unique_lock<std::mutex> lock(mutex);
-   cv.wait(lock, [=]{ return !_wait_count; });
+   wait_signalled();
 }
 
 hard_event::hard_event(command_queue &q, cl_command_type command,
diff --git a/src/gallium/state_trackers/clover/core/event.hpp b/src/gallium/state_trackers/clover/core/event.hpp
index 53dac68..03c97bc 100644
--- a/src/gallium/state_trackers/clover/core/event.hpp
+++ b/src/gallium/state_trackers/clover/core/event.hpp
@@ -69,6 +69,7 @@ namespace clover {
       virtual cl_int status() const;
       virtual command_queue *queue() const = 0;
       virtual cl_command_type command() const = 0;
+      void wait_signalled() const;
       virtual void wait() const;
 
       virtual struct pipe_fence_handle *fence() const {
-- 
2.3.5

Attachment: signature.asc
Description: PGP signature

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to