On Sat, Feb 15, 2014 at 12:22:45PM +0100, Francisco Jerez wrote: > Tom Stellard <t...@stellard.net> writes: > > > From: Tom Stellard <thomas.stell...@amd.com> > > > > The offsets will be stored in the handles parameter. This makes > > it possible to use sub-buffers. > > --- > > src/gallium/drivers/r600/evergreen_compute.c | 2 +- > > src/gallium/drivers/radeonsi/si_compute.c | 1 + > > src/gallium/include/pipe/p_context.h | 9 ++++++--- > > src/gallium/state_trackers/clover/core/kernel.cpp | 11 +++++++++-- > > 4 files changed, 17 insertions(+), 6 deletions(-) > > > > diff --git a/src/gallium/drivers/r600/evergreen_compute.c > > b/src/gallium/drivers/r600/evergreen_compute.c > > index 70efe5c..ee00a24 100644 > > --- a/src/gallium/drivers/r600/evergreen_compute.c > > +++ b/src/gallium/drivers/r600/evergreen_compute.c > > @@ -665,7 +665,7 @@ static void evergreen_set_global_binding( > > assert(resources[i]->target == PIPE_BUFFER); > > assert(resources[i]->bind & PIPE_BIND_GLOBAL); > > > > - *(handles[i]) = buffers[i]->chunk->start_in_dw * 4; > > + *(handles[i]) += buffers[i]->chunk->start_in_dw * 4; > > } > > > > evergreen_set_rat(ctx->cs_shader_state.shader, 0, pool->bo, 0, > > pool->size_in_dw * 4); > > diff --git a/src/gallium/drivers/radeonsi/si_compute.c > > b/src/gallium/drivers/radeonsi/si_compute.c > > index a7f49e7..e0e6bb4 100644 > > --- a/src/gallium/drivers/radeonsi/si_compute.c > > +++ b/src/gallium/drivers/radeonsi/si_compute.c > > @@ -109,6 +109,7 @@ static void si_set_global_binding( > > uint64_t va; > > program->global_buffers[i] = resources[i]; > > va = r600_resource_va(ctx->screen, resources[i]); > > + va += *handles[i]; > > memcpy(handles[i], &va, sizeof(va)); > > } > > } > > diff --git a/src/gallium/include/pipe/p_context.h > > b/src/gallium/include/pipe/p_context.h > > index 8ef6e27..eef0a63 100644 > > --- a/src/gallium/include/pipe/p_context.h > > +++ b/src/gallium/include/pipe/p_context.h > > @@ -461,9 +461,12 @@ struct pipe_context { > > * resources will be bound. > > * \param handles array of pointers to the memory locations that > > * will be filled with the respective base > > - * addresses each buffer will be mapped to. It > > - * should contain at least \a count elements, > > - * unless \a resources is NULL in which case \a > > + * addresses each buffer will be mapped to. When > > + * this function is called, the value at these memory > > + * locations will be the offset in bytes from > > + * the start of the buffer that should be used to > > + * compute the handle. It should contain at least \a > > count > > + * elements, unless \a resources is NULL in which > > case \a > > * handles should be NULL as well. > > * > > Maybe something like this would be easier to understand? > > | * \param handles array of pointers to the memory locations that > | * will be updated with the address each buffer > | * will be mapped to. The base memory address of > | * each of the buffers will be added to the value > | * pointed to by its corresponding handle to form > | * the final address argument. It should contain > | * at least \a count elements, unless \a > | * resources is NULL in which case \a handles > | * should be NULL as well. > > > * Note that the driver isn't required to make any guarantees about > > diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp > > b/src/gallium/state_trackers/clover/core/kernel.cpp > > index 6d894cd..06eff85 100644 > > --- a/src/gallium/state_trackers/clover/core/kernel.cpp > > +++ b/src/gallium/state_trackers/clover/core/kernel.cpp > > @@ -337,8 +337,15 @@ kernel::global_argument::bind(exec_context &ctx, > > align(ctx.input, marg.target_align); > > > > if (buf) { > > - ctx.g_handles.push_back(allocate(ctx.input, marg.target_size)); > > - ctx.g_buffers.push_back(buf->resource(*ctx.q).pipe); > > + resource &r = buf->resource(*ctx.q); > > This could be 'const resource &'. > > > + ctx.g_handles.push_back(ctx.input.size()); > > + ctx.g_buffers.push_back(r.pipe); > > + > > + // XXX: How to handle multi-demensional offsets? > > They're not an issue, you can safely assume that the offset is > one-dimensional because CL doesn't support sub-images, only sub-buffers. > > > + assert(!r.offset[1] && !r.offset[2]); > > + std::vector<uint8_t> offset = bytes(r.offset[0]); > > 'auto v' for consistency? > > > + extend(offset, marg.ext_type, marg.target_size); > > We should call byteswap() here.
I don't think we should call byteswap here. This value needs to be in host-endian order, because the driver will use it to compute the handle. -Tom > > > + insert(ctx.input, offset); > > } else { > > // Null pointer. > > allocate(ctx.input, marg.target_size); > > Would you mind fixing constant sub-buffers too? My plan was to pass the > offset in the low bits of the input argument, like: > 'bytes(ctx.resources.size() << 24 | r.offset[0])'. They should be even > easier than global sub-buffers. > > Thanks. > > > -- > > 1.8.1.5 _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev