On Sun, Jan 11, 2009 at 5:09 PM, Jonathan Adamczewski <jadam...@utas.edu.au> wrote:
> Seeking comments on the following...
>
> Working through some ideas, I ended up at the point of command/data delivery
> to the SPU to find that opcodes and structs arrive with (only) 8-byte
> alignment. This causes problems in that the compiler must prefix every
> access to the opcode/struct with a variable rotate based on the local store
> address, which is undesirable and avoidable.
>
> I've taken the approach of treating the batch buffer as a series of 16 byte
> slots, with opcode stored in a 16 byte qword and all subsequent data aligned
> (and packed) to 16 byte boundaries.
>
> As such, I've replaced cell_batch_{append,alloc}*() with
> cell_batch_alloc16(), which will allocate only a 16 byte aligned multiple of
> 16 bytes. (cell_batch_append() appeared to be mostly duplication of
> cell_batch_alloc_aligned(), and not essential for its one caller).
>
> Structures in common.h have been packed to 16 byte multiples, and I'm
> STATIC_ASSERT()ing that their size is correct (failing at compile time is
> much nicer).
>
> Where the struct is one not defined in common.h, I'm just using ROUNDUP16()
> to ensure 'enough' space is reserved for the struct.
>
> The result of this change is smaller SPU code - ~100 bytes for spu_render.o
> (-2%) and ~600 bytes for spu_command.o (-5%), as well as reducing the size of
> the PPU code by ~1.3kb.
>
> The main benefit, as I see it, is that it provides some assurances about data
> layout in LS memory and can avoid some subsequent rotates and copies in some
> further changes I am considering.
>
> It does mean that the batch buffer is a little sparser - at most these
> changes result in an extra 16 bytes per batched command. I haven't tried to
> measure the added overhead, but I consider it unlikely that this results in a
> measurable addition to DMA times.
>
>
> There are a couple of surplus changes in the attached patch, but overall it
> should give the gist of what I'm doing. Please let me know if this kind of
> modification is of interest and I'll clean it up.
It'd be interesting to know how big the batch buffers are with and without this change but overall this sounds good to me.

Are you interested in getting git-write access?

-Brian

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
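
For readers following the thread without the patch at hand, the 16-byte slot scheme described in the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch: `struct batch`, `BATCH_BYTES`, `struct example_cmd` and `batch_alloc16()` are hypothetical stand-ins, and the `ROUNDUP16()` shown here is one plausible definition rather than the driver's own. C11 `_Static_assert` stands in for the `STATIC_ASSERT()` macro mentioned in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One plausible definition: round a byte count up to a multiple of 16. */
#define ROUNDUP16(n) (((size_t)(n) + 15u) & ~(size_t)15u)

/* Hypothetical buffer size, chosen only for this sketch. */
#define BATCH_BYTES 1024u

/* Hypothetical batch buffer whose storage starts on a 16-byte boundary. */
struct batch {
    _Alignas(16) uint8_t data[BATCH_BYTES];
    size_t used; /* bytes handed out so far; always a multiple of 16 */
};

/* Hypothetical command: the opcode occupies a full 16-byte slot and the
 * payload that follows is packed to a 16-byte multiple. */
struct example_cmd {
    uint32_t opcode;
    uint32_t pad[3];
    float color[4];
};

/* Fail at compile time if the struct is not a whole number of slots. */
_Static_assert(sizeof(struct example_cmd) % 16 == 0,
               "example_cmd must be a multiple of 16 bytes");

/* Hand out `bytes` from the buffer. The request must already be a
 * multiple of 16, so every returned pointer stays 16-byte aligned.
 * Returns NULL when the buffer cannot hold the request. */
static inline void *batch_alloc16(struct batch *b, size_t bytes)
{
    assert(bytes % 16 == 0);
    assert(b->used % 16 == 0);
    if (bytes > BATCH_BYTES - b->used)
        return NULL;
    void *p = b->data + b->used;
    b->used += bytes;
    return p;
}
```

A caller with a struct that is not already padded would pass `ROUNDUP16(sizeof(that_struct))`, which is where the "at most an extra 16 bytes per batched command" figure in the message would come from in a scheme like this one.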