On 11/22/2013 10:30 AM, Eric Anholt wrote:
Kenneth Graunke <kenn...@whitecape.org> writes:
On 11/22/2013 12:21 AM, Eric Anholt wrote:
The canary is basically just to give a better debugging message when you
ralloc_free() something that wasn't rallocated. Reduces maximum memory
usage of apitrace replay of the dota2 demo by 60MB on my 64-bit system (so
half that on a real 32-bit dota2 environment).
Really, half? It's an unsigned...that's 4 bytes regardless of 64-bit
vs. 32-bit. I think this should be 60MB of savings, end of story.
Scalar types get aligned to their size, so since it's followed by a
pointer, there's 4 bytes of pad in between.
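To make the padding argument concrete, here is a minimal sketch. The struct and function names are hypothetical stand-ins, not ralloc's actual header; the only thing taken from the discussion above is "a 4-byte unsigned followed by a pointer". It assumes a common ILP32 or LP64 ABI where pointers are aligned to their own size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header: a 4-byte canary followed by a pointer. */
struct hdr_with_canary {
    unsigned canary;
    struct hdr_with_canary *parent;
};

/* The same hypothetical header with the canary removed. */
struct hdr_without_canary {
    struct hdr_without_canary *parent;
};

/* Pad bytes the compiler inserts between the canary and the pointer. */
size_t canary_pad(void)
{
    return offsetof(struct hdr_with_canary, parent) - sizeof(unsigned);
}

/* Bytes saved per allocation by dropping the canary, padding included. */
size_t canary_cost(void)
{
    return sizeof(struct hdr_with_canary) - sizeof(struct hdr_without_canary);
}

/* Assumption: on common ABIs the canary plus its pad costs exactly one
 * pointer's worth of bytes, so twice as much on 64-bit as on 32-bit. */
void check_canary_cost(void)
{
    assert(canary_cost() == sizeof(void *));
}
```

Under those assumptions the per-allocation cost scales with pointer size, which is where the "half that on 32-bit" estimate comes from.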
For anyone that hasn't seen this tool before, check out pahole from the
dwarves package. Run it on a .o file you think might be sucking up a
bunch of memory, and see your structs like:
class fs_inst : public backend_instruction {
public:
    /* class backend_instruction <ancestor>; */  /*     0    32 */
    /* XXX last struct has 7 bytes of padding */
    class fs_reg  dst;                           /*    32    48 */
    /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
    class fs_reg  src[3];                        /*    80   144 */
    /* --- cacheline 3 boundary (192 bytes) was 32 bytes ago --- */
    bool          saturate;                      /*   224     1 */
    /* XXX 3 bytes hole, try to pack */
    int           conditional_mod;               /*   228     4 */
    uint8_t       flag_subreg;                   /*   232     1 */
    /* XXX 3 bytes hole, try to pack */
    int           mlen;                          /*   236     4 */
    int           regs_written;                  /*   240     4 */
    int           base_mrf;                      /*   244     4 */
    uint32_t      texture_offset;                /*   248     4 */
    int           sampler;                       /*   252     4 */
    /* --- cacheline 4 boundary (256 bytes) --- */
    int           target;                        /*   256     4 */
    bool          eot;                           /*   260     1 */
    bool          header_present;                /*   261     1 */
    bool          shadow_compare;                /*   262     1 */
    bool          force_uncompressed;            /*   263     1 */
    bool          force_sechalf;                 /*   264     1 */
    bool          force_writemask_all;           /*   265     1 */
    ...
    /* size: 288, cachelines: 5, members: 21 */
    /* sum members: 280, holes: 3, sum holes: 8 */
    /* paddings: 1, sum paddings: 7 */
    /* last cacheline: 32 bytes */
};
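As a rough illustration of what "try to pack" means, here is a sketch that copies four members from the dump above into two hypothetical structs (the struct names are made up, and this is not a proposed change to fs_inst). It assumes a common ABI with 1-byte bool and 4-byte, 4-byte-aligned int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Declaration order as in the dump: each 1-byte member is followed by
 * an int, so the compiler leaves a 3-byte hole after it. */
struct inst_tail_loose {
    bool    saturate;
    int     conditional_mod;
    uint8_t flag_subreg;
    int     mlen;
};

/* Same members, widest first: the two 1-byte members now share one
 * 4-byte slot and only tail padding remains. */
struct inst_tail_packed {
    int     conditional_mod;
    int     mlen;
    bool    saturate;
    uint8_t flag_subreg;
};

/* Bytes recovered purely by reordering, no types changed. */
size_t bytes_saved_by_reordering(void)
{
    return sizeof(struct inst_tail_loose) - sizeof(struct inst_tail_packed);
}

/* Reordering should never make the struct larger. */
void check_reordering(void)
{
    assert(sizeof(struct inst_tail_packed) <= sizeof(struct inst_tail_loose));
}
```

Running pahole on an object file built from this should report the two 3-byte holes in the first struct and none in the second.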
Getting a bit OT, but I'm sure some mesa structs could be compacted
quite a bit. In gl_texture_image, for example, a number of the fields
could be reduced to GLubyte (like Face, Level, Border, NumSamples, etc)
and rearranged to reduce the memory used for such objects.
We could potentially reduce gl_texture_image from 80 bytes to 44 bytes,
which would save 324 bytes (36 bytes x 9 mipmap levels) for a 256x256
mipmapped texture. It would start to add up with a thousand textures or so.
There might be some debate about how worthwhile that is. I'm not too
concerned right now.
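For reference, the 324-byte figure can be reproduced with a few lines of arithmetic. This is only a sketch: the helper names are invented, and it assumes one image struct per mipmap level of a single-face square texture.

```c
#include <assert.h>
#include <stddef.h>

/* Number of mipmap levels for a square texture, from the base level
 * down to 1x1. */
unsigned mip_levels(unsigned size)
{
    unsigned levels = 1;

    assert(size > 0);
    while (size > 1) {
        size >>= 1;
        levels++;
    }
    return levels;
}

/* Bytes saved over the whole mip chain when the per-level struct
 * shrinks from old_bytes to new_bytes. */
size_t bytes_saved(unsigned size, size_t old_bytes, size_t new_bytes)
{
    return (size_t)mip_levels(size) * (old_bytes - new_bytes);
}
```

With the sizes quoted above, bytes_saved(256, 80, 44) covers 9 levels at 36 bytes each.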
However, pahole says gl_debug_state is fairly huge: 292712 bytes!
sizeof(gl_context) = 384208 so that's a big piece. At the very least,
maybe gl_debug_state could be pulled out and allocated on first use...
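A minimal sketch of the "allocate on first use" idea follows. The structs and functions here are hypothetical stand-ins, not Mesa's actual gl_context or gl_debug_state, and the buffer size is arbitrary.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical large, rarely used state block. */
struct debug_state {
    char log[4096];
};

/* Hypothetical context: holds a pointer instead of embedding the state,
 * so a context that never uses debugging pays one pointer. */
struct context {
    struct debug_state *debug;   /* NULL until first use */
};

/* Return the debug state, allocating and zeroing it on first use.
 * Returns NULL if the allocation fails. */
struct debug_state *get_debug_state(struct context *ctx)
{
    assert(ctx != NULL);
    if (!ctx->debug)
        ctx->debug = calloc(1, sizeof(*ctx->debug));
    return ctx->debug;
}

/* Release the debug state, if any, when the context is destroyed. */
void free_debug_state(struct context *ctx)
{
    assert(ctx != NULL);
    free(ctx->debug);
    ctx->debug = NULL;
}
```

The cost is a NULL check on every access and an allocation that can fail at first use, so callers would have to handle a NULL return.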
-Brian
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev