> -----Original Message-----
> From: Francisco Jerez [mailto:curroje...@riseup.net]
>
> "Dorrington, Albert" <albert.dorring...@lmco.com> writes:
> >
> > From reading the OpenCL spec (and perhaps I'm misinterpreting something
> > again), section 5.10 Flush and Finish says:
> >
> > Any blocking commands queued in a command-queue such as
> > clEnqueueRead{Image|Buffer} with blocking_read set to CL_TRUE,
> > clEnqueueWrite{Image|Buffer} with blocking_write set to CL_TRUE,
> > clEnqueueMap{Buffer|Image} with blocking_map set to CL_TRUE or
> > clWaitForEvents perform an implicit flush of the command-queue.
> >
> > From this statement, I would expect that the command-queue would be
> > flushed when the blocking flag is set.
>
> clEnqueueRead*, clEnqueueMap* and clWaitForEvents already flush the
> command queue (the first two are flushing indirectly as we try to map a
> buffer referenced by the GPU). clEnqueueWrite* doesn't flush, but it's not
> clear to me that not doing it can be considered a violation of the spec. The
> guarantees given by clFlush() are rather vague (to some extent an empty
> function could be a valid implementation) and it seems to me that a
> compliant implementation might, for instance, choose to batch up
> commands across flushes if that's the most efficient thing to do, as long as
> the user has no way to tell the difference.
>
> I'd like to see some real-world example where clover's behavior represents a
> problem before we change it to flush more frequently, because I'm worried
> that changing this will actually worsen performance rather than improving it.
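
To make the two readings of the spec text quoted above concrete, here is a
toy model, not clover code and not using the OpenCL API. It only counts how
many events sit in a queue's backlog when blocking writes either do or do
not trigger an implicit flush. Every name in it (ModelQueue, enqueue_write,
run, and so on) is hypothetical and chosen for illustration only.

```python
class ModelQueue:
    """Toy model of a command queue that tracks unsubmitted events.

    Hypothetical names throughout; this is not clover's command_queue.
    """

    def __init__(self, flush_on_blocking_write):
        self.flush_on_blocking_write = flush_on_blocking_write
        self.queued_events = []  # events not yet handed to the device
        self.peak = 0            # largest backlog seen so far

    def flush(self):
        # Submit everything queued so far and drop it from the backlog.
        submitted = len(self.queued_events)
        self.queued_events.clear()
        return submitted

    def enqueue_write(self, name, blocking):
        self.queued_events.append(name)
        self.peak = max(self.peak, len(self.queued_events))
        # Assumption under test: a blocking write implies a flush.
        if blocking and self.flush_on_blocking_write:
            self.flush()

    def finish(self):
        # Returns how many events were still pending at finish time.
        return self.flush()


def run(n_commands, flush_on_blocking_write):
    """Enqueue n blocking writes, then finish; return (peak, pending)."""
    q = ModelQueue(flush_on_blocking_write)
    for i in range(n_commands):
        q.enqueue_write("write-%d" % i, blocking=True)
    pending = q.finish()
    return q.peak, pending
```

In this model, run(n, False) leaves all n events pending until finish(),
while run(n, True) never holds more than one. Whether the batching variant
is actually a problem in a real implementation is exactly the open question
in this thread; the model says nothing about performance.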

I have been working with a modified version of the Mesa code that accepts
kernels compiled with AMD's compiler. (Our project's goal is to host Mesa
in an environment that does not currently support LLVM/Clang or C++11.)

While testing 2D image reads, I have been hitting an issue where the
command queue's 'queued_events' keeps being populated, with none of the
events being removed until the clFinish call. At that point there are
23,328 events in the queue, and I get a segmentation fault during the
command_queue flush.

After seeing the statement in the OpenCL spec about the implicit flush
during blocking clEnqueue calls, I added the previously mentioned
conditional hev().wait() calls to initiate a flush. This seems to have
resolved the segmentation faults during the clFinish call, although I'll
admit it likely isn't the most efficient method.

I have not benchmarked the run times precisely, but they did not seem to
be significantly affected: the test used to run for ~20 minutes before
crashing, and now runs for ~20 minutes before completing successfully.

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev