Hi,

Ok, I suppose this is not the most pressing issue in mesa, but I was toying with an idea of how to reduce get.c size and integrate get_es1.c and get_es2.c and I had to try it out. Of course it ended up being a bigger project and took a couple of days, but in the end I think it turned out to be a worthwhile effort. The result is the two patches on the get-optimagix branch in my personal mesa repo:

  http://cgit.freedesktop.org/~krh/mesa/log/?h=get-optimagix

The basic idea is that most getters just look up an int somewhere in GLcontext and then convert it to a bool or float according to which of glGetIntegerv(), glGetBooleanv() etc. is being called. Instead of generating code to do this, we can just record the enum value and the offset into GLcontext in an array of structs. Then in glGet*(), we look up the struct for the enum in question and use the offset to get the int we need.

Of course, sometimes we need to look up a float, a boolean, a bit in a bitfield, a matrix or other types, so we need to track the type of the value in GLcontext. And sometimes the value isn't in GLcontext but in the drawbuffer, the array object, the current texture unit, or maybe it's a computed value. So we also need to track where or how to find the value. Finally, we sometimes need to check that one of a number of extensions is enabled, check the GL version, flush, or call _mesa_update_state(). This is done by attaching optional extra information to the value description struct; it's sort of like an array of opcodes that describe extra checks or actions.

Putting all this together we end up with struct value_desc in the patch, and with a couple of macros to help, the table of struct value_desc is about as concise as the specification in the python code.

All we need now is a way to look up the value struct from the enum. The code generated by gcc for the current generated big switch statement is a big, balanced, open coded if/else tree (I'm giving gcc the benefit of the doubt here, I didn't validate that the tree was balanced). It would be natural to sort the new enum table and use bsearch(), but I decided to use a read-only hash table instead. bsearch() has a nice guaranteed worst case performance, but we're also guaranteed to hit that worst case (log2(n) iterations) for about half the enums.
Instead, using a simple, direct hashing hash table, we can find the enum on the first try for 80% of the enums, with 1 collision for 10% and never more than 5 collisions for any enum (typical numbers). And the code is very simple, even though it feels a little magic.

Benefits:

1) Smaller. Much smaller. Generated code is much bigger than the corresponding data tables. Looking at an i965 DRI driver with GLES1 and GLES2 APIs enabled we get:

  [...@hinata mesa]$ size lib/i965_dri*.so
     text    data     bss     dec     hex filename
  2658275   29132   61664 2749071  29f28f lib/i965_dri_old.so
  2505275   36980   63712 2605967  27c38f lib/i965_dri.so

That is, a 140kb difference, or a 5% size reduction. And since the reduction is in libmesa.a, it applies to all DRI drivers, which adds up to a nice space savings if you're trying to squeeze 14 DRI drivers onto a live CD (looking at Fedora's mesa-dri-drivers RPM).

2) Faster; the hash table will find the enum in zero to one iterations most of the time and never more than five. Of course, this is all academic, since glGet*() aren't typically in any kind of hotpath, but it's nice to just verify that we're not replacing get.c with something slower.

3) No code generation; the C file *is* the spec and is about as concise as the python script was.

4) A non-hacky glGetDoublev(). The current implementation calls glGetFloatv() with a local variable array, which it fills with the magic value -1234.5 to be able to determine how many values were returned from glGetFloatv(). So if your matrix has an entry with the value -1234.5 you're out of luck.

5) A clean way to integrate get.c, get_es1.c and get_es2.c. We can initialize the hash table with the values that are valid for the API we're initializing and use the same _mesa_Get*() entry points to implement the glGet* functions for the different APIs.

Drawbacks:

1) Uhm, regressions? I went back and double checked the new get.c against the enum list in get_gen.py after finishing the patch.
While I didn't find any inconsistencies, it's a long list and I may have overlooked something. I'm running piglit on it now, but I suspect I'll have to add a few testcases to hit the different code paths in the new glGet*() implementation.

2) More complex code (though if you consider the get_gen.py script, it's probably about the same total complexity as the current solution).

Let me know what you think about this - I'd like to merge it once I've tested it a bit.

Kristian
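
For readers who want the shape of the idea without reading the patch, here is a minimal standalone sketch of a descriptor table with a read-only hash lookup. It is not the code from the branch: struct fake_context, the CONTEXT_FIELD macro, get_float(), the table size and the trivial masking hash are all made up for illustration. The extra-checks array, values that live outside the context, and per-API table initialization are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for GLcontext; the real struct is far larger. */
struct fake_context {
    int max_texture_size;
    float line_width;
    unsigned char depth_test;   /* boolean */
};

enum value_type { TYPE_INT, TYPE_FLOAT, TYPE_BOOLEAN };

/* One entry per queryable enum: its type and where it lives in the context. */
struct value_desc {
    unsigned pname;
    enum value_type type;
    size_t offset;
};

#define CONTEXT_FIELD(pname, type, field) \
    { pname, type, offsetof(struct fake_context, field) }
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const struct value_desc values[] = {
    CONTEXT_FIELD(0x0D33, TYPE_INT, max_texture_size),  /* GL_MAX_TEXTURE_SIZE */
    CONTEXT_FIELD(0x0B21, TYPE_FLOAT, line_width),      /* GL_LINE_WIDTH */
    CONTEXT_FIELD(0x0B71, TYPE_BOOLEAN, depth_test),    /* GL_DEPTH_TEST */
};

/* Assumed size: a power of two, larger than the number of values. */
#define TABLE_SIZE 8
#define TABLE_MASK (TABLE_SIZE - 1)

/* Each slot holds an index into values[] plus one; 0 marks an empty slot. */
static unsigned short table[TABLE_SIZE];

/* Arbitrary illustrative hash: just the low bits of the enum. */
static unsigned hash_pname(unsigned pname)
{
    return pname & TABLE_MASK;
}

/* Fill the table once; collisions move on to the next slot. */
static void init_table(void)
{
    size_t i;

    /* At least one slot must stay empty so that lookups terminate. */
    assert(ARRAY_SIZE(values) < TABLE_SIZE);
    for (i = 0; i < ARRAY_SIZE(values); i++) {
        unsigned slot = hash_pname(values[i].pname);
        while (table[slot] != 0)
            slot = (slot + 1) & TABLE_MASK;
        table[slot] = (unsigned short) (i + 1);
    }
}

/* Probe from the hashed slot until the enum or an empty slot is found. */
static const struct value_desc *find_value(unsigned pname)
{
    unsigned slot = hash_pname(pname);

    while (table[slot] != 0) {
        const struct value_desc *d = &values[table[slot] - 1];
        if (d->pname == pname)
            return d;
        slot = (slot + 1) & TABLE_MASK;
    }
    return NULL;
}

/* Float getter: read the field at the recorded offset and convert it.
 * Returns 0 for an unknown enum, where a real glGet*() entry point
 * would raise GL_INVALID_ENUM. */
static int get_float(const struct fake_context *ctx, unsigned pname, float *out)
{
    const struct value_desc *d = find_value(pname);
    const char *p;

    if (d == NULL)
        return 0;
    p = (const char *) ctx + d->offset;
    switch (d->type) {
    case TYPE_INT:
        *out = (float) *(const int *) p;
        break;
    case TYPE_FLOAT:
        *out = *(const float *) p;
        break;
    case TYPE_BOOLEAN:
        *out = *(const unsigned char *) p ? 1.0f : 0.0f;
        break;
    }
    return 1;
}
```

After one call to init_table(), get_float(&ctx, 0x0B21, &f) reads line_width through the recorded offset. An integer or boolean getter would be the same switch with different conversions, which is why one table can serve every glGet*() variant.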