I've implemented the OpenCL vload/vstore builtin functions in two parts:
1) A pure CL C implementation, with no assembly.
2) Assembly optimizations for 32-bit int/uint loads/stores of vectors with
   4 or more components.
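
For readers unfamiliar with these builtins, here is a minimal host-side sketch
in plain C of what vload4/vstore4 do for 32-bit ints. It is not the code from
the patches: int4_model, vload4_model and vstore4_model are hypothetical names
made up for illustration. The only thing taken from the OpenCL definition is
the addressing rule, where element i is read from or written to
p[4 * offset + i].

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for OpenCL's int4; not the libclc type. */
typedef struct { int32_t s[4]; } int4_model;

/* Model of vload4(offset, p): reads the 4 elements starting at p + 4*offset. */
static int4_model vload4_model(size_t offset, const int32_t *p)
{
    int4_model v;
    for (size_t i = 0; i < 4; ++i)
        v.s[i] = p[4 * offset + i];
    return v;
}

/* Model of vstore4(v, offset, p): writes the 4 elements starting at p + 4*offset. */
static void vstore4_model(int4_model v, size_t offset, int32_t *p)
{
    for (size_t i = 0; i < 4; ++i)
        p[4 * offset + i] = v.s[i];
}
```

A scalar element-by-element version like this is the kind of thing a pure CL C
implementation can express. The assembly path in part 2 would presumably
replace the four scalar accesses with a single wide load or store.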

Note: The vstore implementation assumes that the hardware back end supports
byte-addressable stores.  That may not be optimal for every target.
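
To illustrate why this assumption matters, here is a rough sketch in plain C of
what a single byte store can turn into on a target that can only write whole
32-bit words. This is not part of the patches. store_byte_rmw is a
hypothetical helper, and it assumes byte 0 lives in the low 8 bits of each
word.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: emulate a one-byte store on memory that can only be
 * written in 32-bit words, using a read-modify-write of the containing word.
 * Assumes byte 0 of a word occupies its least significant 8 bits.
 */
static void store_byte_rmw(uint32_t *words, size_t byte_index, uint8_t value)
{
    size_t word = byte_index / 4;                     /* containing word   */
    unsigned shift = (unsigned)(byte_index % 4) * 8;  /* bit offset in it  */
    uint32_t mask = (uint32_t)0xFF << shift;          /* bits to replace   */

    words[word] = (words[word] & ~mask) | ((uint32_t)value << shift);
}
```

Each byte written this way costs a load, a mask and a store, which is why a
vstore built on byte stores may be a poor fit for such hardware.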

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
