The kernel I'm working with is rather simple:

const sampler_t s_nearest = CLK_FILTER_NEAREST | CLK_NORMALIZED_COORDS_FALSE | 
CLK_ADDRESS_NONE;

__kernel void
image_test (__read_only image2d_t im, __global float4 *out)
{
  out[ 0] = read_imagef (im, s_nearest, (int2) (0, 0));
}


Using the Catalyst compiler to produce the ISA (with a -O0 compile flag) I get 
the following, which includes my notes/comments on the right side:

; --------  Disassembly --------------------
00 ALU: ADDR(32) CNT(18) KCACHE0(CB1:0-15) KCACHE1(CB2:0-15)
      0  z: BFE_UINT    T0.z,  KC1[0].x,  0x00000005,  1        ; Linear Filter 
Flag
         w: AND_INT     T0.w,  KC1[0].x,  1                     ; Normalized 
Flag
         t: I_TO_F      ____,  KC0[0].x                 ; convert Int to Float 
(Width?)
      1  x: LSHR        R1.x,  KC0[2].x,  2                     ; KC0[2].x / 4 
-> R1.x
         z: CNDE_INT    ____,  PV0.w,  1.0f,  ImgWidth  ; if Not Normalized 
then 1.0f else ImgWidth
         t: I_TO_F      ____,  KC0[0].y                 ; convert int to float 
(Height?)
      2  x: MUL_e       T0.x,  PV1.z,  xCoord                   ; Scale xCoord 
-> T0.x 
         y: CNDE_INT    ____,  T0.w,  1.0f,  ImgHeight  ; if Not Normalized 
then 1.0f else ImgHeight
      3  y: FLOOR       ____,  PV2.x                            ; floor(xCoord)
         w: MUL_e       T0.w,  PV2.y,  yCoord           ; Scale yCoord -> T0.w
      4  x: FLOOR       ____,  PV3.w                            ; Floor(yCoord)
         w: CNDE_INT    ____,  T0.z,  PV3.y,  T0.x              ; If Not Linear 
then floor(xCoord) else Scaled(xCoord)
      5  x: MUL_e       R0.x,  KC0[1].x,  PV4.w         ; R0.x = KC0[1].x * 
xCoord (floored/scaled)
         z: CNDE_INT    ____,  T0.z,  PV4.x,  T0.w              ; If Not Linear 
then floor(xCoord) else Scaled(yCoord)
      6  y: MUL_e       R0.y,  KC0[1].y,  PV5.z         ; R0.y = KC0[1].y * 
yCoord (floored/scaled)
01 TEX: ADDR(64) CNT(1)
      7  SAMPLE R0, R0.xy0x, t0, s0
02 MEM_RAT_CACHELESS_STORE_RAW: RAT(11)[R1], R0, ARRAY_SIZE(4)  VPM
END_OF_PROGRAM

I am fairly certain that KC1[0] is the Sampler value.
And that KC0[0] and KC0[1] are image parameters while KC0[2] is the output 
pointer parameter

The fields I'm unsure of are KC0[1].x and KC0[1].y. I'm fairly certain that 
they are pitch or stride values, but since I'm not sure if the texture memory 
is linear or tiled, I'm not sure.

I was trying to use the sampler set as CLK_ADDRESS_NONE, in an attempt to look 
'outside' of the image dimensions, to see what else might be in the memory 
buffer - but I'm guessing that doesn't work as I suspected, because I keep 
seeing clamped values anyway.

I've started reviewing the changes you made. I'm happy to see that I made a lot 
of the same changes as you implemented (makes me think I actually understand 
some of this!)
Although I'm not sure I quite yet follow what you did in 
evergreen_set_compute_resources(), where you removed the vertex buffer setup.

-Al

> -----Original Message-----
> From: Tom Stellard [mailto:t...@stellard.net]
> Sent: Monday, March 24, 2014 4:16 PM
> To: Dorrington, Albert
> Cc: mesa-dev@lists.freedesktop.org
> Subject: EXTERNAL: Re: [Mesa-dev] OpenCL/clover buffers vs images
> 
> On Mon, Mar 24, 2014 at 02:35:04PM +0000, Dorrington, Albert wrote:
> > I have been experimenting with adding image support to the clover
> implementation, and have been trying to understand the differences
> between the existing buffer support and what would be required to support
> images.
> >
> > From what I'm gathering, buffers are laid out in VRAM in a linear format,
> while images would be laid out in some sort of tiled format.
> >
> > I have been trying to do some research on tiled memory layout, and have
> not yet been able to find anything which describes the tiled format that is in
> use on R600 and Evergreen GPUs.
> >
> > I have also tried going through the OpenGL code to understand how image
> textures are transferred to the R600/Evergreen GPUs, since I am making the
> assumption that OpenGL would be transferring the images to GPU RAM in
> the same tiled format that an OpenCL texture would use.
> >
> > I have been trying to do some comparisons with the Catalyst driver's
> implementation, but I have not determined a way to view the internals of
> the registers and CB areas within the catalyst environment.
> >
> > For example, looking at the IL and ISA generated using the Catalyst SDK, I
> can see that there are 8 32-bit fields being read from CB1 for an read_only
> image kernel parameter.
> > I have been able to determine that the first three are integer width, 
> > height,
> depth. The fourth is the image channel data type, the 8th is the image
> channel order.
> > The 5th and 6th are involved in offset calculations for sampler
> > coordinates (not sure if they are row and slice pitches of some sort)
> > while the 7th seems unused (I'm assuming it must have something to do
> > with 3D images)
> >
> 
> If you send me your example code, I can look at the kernel analyzer and try
> to figure out what is going on.
> 
> > I have been thinking that it should be possible to use Mesa's OpenGL
> > texture transfer routines within the Clover transfer routines (rather
> > than the current path through soft_copy_op, which uses direct memcpy
> > instructions)
> >
> > Unfortunately, so far I've only been able to look at a 4x4 image, anything
> beyond that causes the GPU CP to stall on me.
> >
> > If anyone can shed some light on these parameters that the Catalyst driver
> uses, or provide some information on how the Mesa OpenGL
> implementation transfers texture data to the radeon GPUs, I'd appreciate it.
> >
> > My online research hasn't been very productive, I think because I don't
> fully understand the terminology being used in this area.
> >
> 
> I think you should be able to re-use most of the texturing code in r600g for
> OpenCL.  However, I have very limited knowledge of this code, so I may be
> wrong.
> 
> I actually had basic image support working about 6 months ago.  I had to hard
> code a bunch of values into the compiler and also libclc, but I was able to 
> pass
> a simple test.  Below you can find some links to the code.
> You might get lucky and it will still work after you rebase it, but I doubt 
> it.
> However, it may help you get an idea of what to do by looking through the
> code:
> 
> http://cgit.freedesktop.org/~tstellar/mesa/log/?h=r600g-image-support
> http://cgit.freedesktop.org/~tstellar/libclc/log/?h=image
> http://cgit.freedesktop.org/~tstellar/llvm/log/?h=image-support
> 
> 
> > Thanks!
> >
> > Al Dorrington
> > Software Engineer Sr
> > Lockheed Martin, Mission Systems and Training
> >
> 
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to