On 05/24/2014 09:22 PM, Jeremy Huddleston Sequoia wrote:
> On May 24, 2014, at 19:55, Emil Velikov <emil.l.veli...@gmail.com> wrote:
>
>> Hi Jeremy,
>>
>> IIRC there was another location where the above typedef gave us the finger.
>> Not entirely sure what the conclusion on the topic was, and I believe that
>> some of the patches did not get accepted as they would break our current
>> libGL <> DRI ABI. The discussion (starting with a few patches) is available
>> in the ML archives [1].
>>
>> -Emil
>>
>> [1] http://lists.freedesktop.org/archives/mesa-dev/2014-March/055617.html
>
> Thanks for the pointer. +Brian and Ian from the March thread.
>
> As I understand it, the only platforms where fixing this could break
> the DRI ABI are the ones where GLhandleARB and GLuint do not have the
> same underlying type. The only platform where that is the case is
> darwin, which doesn't use that code (hence why I mentioned above that I
> wasn't concerned about fixing this breaking binary compatibility on
> darwin). Can someone explain how changing some GLuint types to
> GLhandleARB (or vice versa) could break ABI on other systems? I just
> don't see why that would be the case.
>
> Ian said:
>> The problem is that drivers are built expecting that glCompileShader and
>> glCompileShaderARB are the same function. As a result, the driver only
>> asks libGL the offset of one of those functions in the dispatch table,
>> and it only sets one pointer in the dispatch table. Then an application
>> tries to call the "other" function, gets a NULL dispatch pointer, and
>> explodes.
>
> That doesn't seem right to me. Why would the driver only set one
> entry? As it knows (or at least assumes) that both are the same, it
> seems understandable that it would just ask libGL for one of the
> functions, but it should set both entries in its dispatch table to
> that value. Having a NULL entry for one of those functions seems
> like an obvious bug at the driver level. Is the application layer
> really responsible for knowing about what aliasing is being done at
> the driver level? That's a rather big violation of the abstraction
> that I'd expect to be present.

Re-re-re-re-hashing the old discussion...

Every libGL that has ever shipped on Linux from Mesa has only one
dispatch table entry for both the GLuint and GLhandleARB versions of the
functions. There's only one place for the driver to store a pointer,
and shipping drivers only know that they need to store one pointer.
Changing either libGL or the driver will catastrophically break ABI
with the other.

There is a temptation to say that we should never have had any
functions alias each other in the dispatch table. I can see some
strong arguments for that, especially in light of this issue and a
previous bug with ARB_framebuffer_object vs EXT_framebuffer_object
functions incorrectly aliasing.

This was an intentional design choice that was made for GLX in the
Xserver. The original design was for functions that had the same GLX
opcode to share the same dispatch. In the server, it is impossible to
tell the difference between glTexImage3D and glTexImage3DEXT. They
both just come in as opcode 4114. Having multiple entrypoints only
made more work for everyone.

This was also at a time when it was common to have as many as four
different names (vendor, EXT, ARB, and "core") for the same function.
None of the code was generated by scripts, and api_exec.c wasn't
generated until about a year ago. Each time a new spelling of the name
was added, someone had to remember to manually update some code.

Had GLX protocol been defined for the ARB functions in 2003, it would
have either:

  a. Had 64 bits for the handles, and Mesa would have had multiple
     dispatch entries.

  b. Had 32 bits for the handles (as the GLX protocol added in 2009
     does!), and Mesa would have still only had aliased entries.

> Also, in the earlier thread, Ian said, "I can't understand why we'd
> break our own ABI because of something silly that Apple did. This
> feels like madness." ... if I recall, the issue wasn't that Apple did
> "something silly," the issue was that GLhandleARB was underspecified
> and different vendors implemented it differently. Apple is no more
> "at fault" for making it sized to a pointer (which is actually much
> more "safe" given ambiguity) than Mesa is "at fault" for fixing it at
> a 32bit unsigned integer. The real issue here is that mesa is mixing
> GLhandleARB and GLuint when it shouldn't be and has made other design
> decisions which make fixing bugs like this difficult.

The reason I believe Apple did something silly is that OpenGL 2.0,
which uses GLuint, shipped in October 2004... and in April 2005 Apple
shipped something that used void* sized GLhandleARB. At least one
person from Apple was on the conference call when the decision was
made to change GLhandleARB to GLuint in the API, so it should not have
been a surprise that GLhandleARB was a dead end.

The Mesa implementation came even after that, and the implementer
decided to not have two separate entrypoints for no clear benefit.
That wasn't me, and I don't recall who or when it was.

*BUT* I don't think any of that matters. I think this can all be
resolved without having to break any ABI. It will take a bit of
fairly unpleasant work, but I think it's doable.

1. Rename the existing Mesa functions with the OpenGL 2.0 names and
   function signatures. This will require changes to the XML so that
   the api_exec.c generator script does the right thing.

2. Add new functions with the old names and the ARB function
   signatures. These functions ought to be completely trivial: they'll
   just do some pointer casting and call the "other" functions. They
   should probably be wrapped in #ifdef APPLE blocks.

3. Introduce new markup to the XML. Maybe 'offset="assign_apple"' or
   similar. Modify the scripts that matter so that they will treat
   functions marked "assign_apple" as "assign" when building on Apple
   platforms. On non-Apple platforms it is ignored. This means the XML
   processing code will have to correctly handle function entries that
   have both an alias="..." and an offset="assign_apple".

4. Pizza party.

There is an additional step that could perhaps split the dispatch
entries even on Linux, but it seems a little sketchy to me. I haven't
fully thought it through, so it may not even work at all. I think we
may want to have a "flag day" for libGL / driver ABI in the not too
distant future, so I think I'd rather put this on the list of things
to change at that time.

> --Jeremy
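
For illustration, steps 1 and 2 of the plan above might look roughly
like the following. This is only a sketch under assumptions: every
sketch_ name and the last_compiled variable are made up for the example
and are not Mesa's real entrypoints or types. The wrapper is also
compiled unconditionally here so that it can be exercised on any
platform, whereas the plan above would confine it to Apple builds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GL types; not taken from Mesa's headers. */
typedef uint32_t sketch_GLuint;
#ifdef __APPLE__
typedef void *sketch_GLhandleARB;        /* pointer-sized handle */
#else
typedef unsigned int sketch_GLhandleARB; /* same width as GLuint */
#endif

/* Hypothetical: records the last shader name that reached the core
 * function, so the call path can be observed. */
sketch_GLuint last_compiled;

/* Step 1: the implementation carries the OpenGL 2.0 name and signature. */
void sketch_CompileShader(sketch_GLuint shader)
{
    last_compiled = shader;
}

/* Step 2: a function with the ARB signature that only casts the handle
 * and forwards to the function above. */
void sketch_CompileShaderARB(sketch_GLhandleARB obj)
{
    uintptr_t raw = (uintptr_t)obj;

    /* The handle must fit in 32 bits for the cast below to be lossless. */
    assert(raw <= UINT32_MAX);
    sketch_CompileShader((sketch_GLuint)raw);
}
```

With something like this, a call through either name ends up in the
same implementation, and only the ARB-named wrapper ever sees the
pointer-sized type.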