So the design is mostly the same as before, with a few changes. First, I use only one thread to offload all cs submission, whether gfx or dma. The reason is that using one thread for gfx and one for dma leads to more complex synchronization with no gain: when submitting gfx you would need to make sure previous dma submissions are done, and vice versa. So in the end it's just not a good idea. Moreover, dma submission is a lot faster than gfx submission, as the dma cs are smaller and simpler for the kernel to parse.
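To illustrate the single-thread idea, here is a minimal sketch in C. It is not the code from the patches: every name in it (submit_queue, cs_job, ring_type and the functions) is hypothetical, and the kernel submission is replaced by appending to a log. It only shows that with one worker and one FIFO, gfx and dma submissions reach the kernel in the order they were queued, with no cross-thread synchronization between the two rings.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; none of these are real r600g symbols. */
enum ring_type { RING_GFX, RING_DMA };
struct cs_job { enum ring_type ring; unsigned id; };

#define QUEUE_SIZE 16

struct submit_queue {
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
    struct cs_job jobs[QUEUE_SIZE];
    size_t head, count;
    bool stop;
    pthread_t thread;
    struct cs_job submitted[QUEUE_SIZE]; /* order jobs reached the "kernel" */
    size_t nsubmitted;
};

/* The single worker: pops jobs in FIFO order, whatever ring they target. */
static void *submit_worker(void *arg)
{
    struct submit_queue *q = arg;

    pthread_mutex_lock(&q->lock);
    for (;;) {
        while (q->count == 0 && !q->stop)
            pthread_cond_wait(&q->not_empty, &q->lock);
        if (q->count == 0)
            break; /* stop requested and queue drained */
        struct cs_job job = q->jobs[q->head];
        q->head = (q->head + 1) % QUEUE_SIZE;
        q->count--;
        pthread_cond_signal(&q->not_full);
        /* Stand-in for the kernel submission. A real driver would drop
         * the lock around it; it stays held here to keep the sketch short. */
        if (q->nsubmitted < QUEUE_SIZE)
            q->submitted[q->nsubmitted++] = job;
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

static int submit_queue_init(struct submit_queue *q)
{
    q->head = q->count = q->nsubmitted = 0;
    q->stop = false;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
    return pthread_create(&q->thread, NULL, submit_worker, q);
}

/* Called by the driver thread; blocks only if the queue is full. */
static void submit_queue_push(struct submit_queue *q, enum ring_type ring,
                              unsigned id)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_SIZE)
        pthread_cond_wait(&q->not_full, &q->lock);
    assert(q->count < QUEUE_SIZE);
    size_t tail = (q->head + q->count) % QUEUE_SIZE;
    q->jobs[tail].ring = ring;
    q->jobs[tail].id = id;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Lets the worker drain the queue, then joins it. */
static void submit_queue_finish(struct submit_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->stop = true;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
    pthread_join(q->thread, NULL);
}

/* Returns 1 if jobs were submitted in exactly the order they were queued. */
static int demo_submission_order(void)
{
    struct submit_queue q;
    const enum ring_type rings[] = { RING_GFX, RING_DMA, RING_DMA, RING_GFX };
    const size_t n = sizeof(rings) / sizeof(rings[0]);

    if (submit_queue_init(&q) != 0)
        return 0;
    for (size_t i = 0; i < n; i++)
        submit_queue_push(&q, rings[i], (unsigned)i);
    submit_queue_finish(&q);
    if (q.nsubmitted != n)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (q.submitted[i].id != i || q.submitted[i].ring != rings[i])
            return 0;
    return 1;
}
```

With two workers, one per ring, the same ordering would need an extra wait on the other ring's last submission before each submit, which is the synchronization cost described above.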
Second, I don't use a stack in r600g to keep track of cs submission ordering. Instead, any time r600g switches command stream, i.e. starts writing dma commands after writing gfx ones, we first asynchronously flush the gfx commands. This ensures that at any point in time the driver is only building commands for either the gfx or the dma ring, and everything is serialized from the driver's point of view. It simplifies the implementation, as there is no need to special-case corner cases such as query/event or streamout buffers.

The last patch is a small optimization that decreases CPU overhead by not submitting gfx commands that do not do anything.

Everything has been tested on r7xx and evergreen and I witnessed no regressions. Evergreen can be improved by adding support for partial blit, but I am not sure it's worth it.

Cheers,
Jerome