On Sat, Apr 20, 2013 at 02:20:23PM +0200, Christian König wrote:
> Am 20.04.2013 09:27, schrieb Mathias Fröhlich:
> > Hi Tom,
> >
> > Maybe I need to tell where the problem really appears in real life.
> > OpenSceneGraph has some nifty features regarding multi-channel
> > rendering. Assume a setup of multiple full screen views running on
> > different graphics boards in a single machine, composing a view into a
> > single scene. The recommended way to do this with osg is to set up an
> > X screen per graphics board, even if it spans multiple
> > monitors/projectors, set up a GL graphics context per graphics board,
> > and set up a viewport per projector in the graphics contexts.
> > Rendering then happens in parallel for each graphics context. I drive
> > such a setup here with two radeons and three monitors for testing, and
> > here the problem appears.
> >
> > When I start the favourite flight simulation application of my choice
> > with this setup, it crashes almost immediately without
> > llvm_start_multithreaded being called, whereas it runs stably if we
> > ensure llvm is multithreaded.
> >
> > So I tried to distill a piglit testcase out of this somewhat bigger
> > setup with flightgear, OpenSceneGraph, multiple gpus and what not.
> >
> > On Friday, April 19, 2013 20:08:54 Tom Stellard wrote:
> >> On Wed, Apr 17, 2013 at 07:54:32AM +0200, Mathias Fröhlich wrote:
> >>> Tom,
> >>>
> >>>> -class LLVMEnsureMultithreaded {
> >>>> -public:
> >>>> -   LLVMEnsureMultithreaded()
> >>>> -   {
> >>>> -      llvm_start_multithreaded();
> >>>> -   }
> >>>> -};
> >>>> -
> >>>> -static LLVMEnsureMultithreaded lLVMEnsureMultithreaded;
> >>> Removing this leads to crashes in llvm with applications that
> >>> concurrently work on different gl contexts.
> >> The test you wrote still passes with this patch. Do you see that
> >> we are now calling the C API version of llvm_start_multithreaded(),
> >> LLVMStartMultithreaded(), from inside radeon_llvm_compile(),
> >> protected by a static variable?
> > Oh no, I did not see this. I did not realize that the
> > llvm_start_multithreaded call is not just plain C, so I thought
> > grepping for the call I used was sufficient.
> >
> > But negative: if I really apply your patch and try to run this with
> > the above setup, I get the crashes. The same with the piglit test
> > here.
> >
> > Too bad that reproducing races is racy in itself. With the piglit
> > test, about 2/3 of the runs end in either glibc memory corruption
> > aborts or one of the asserts below from llvm:
> >
> > bool llvm::llvm_start_multithreaded(): Assertion `!multithreaded_mode
> > && "Already multithreaded!"' failed.
> >
> > void llvm::PassRegistry::removeRegistrationListener(
> > llvm::PassRegistrationListener*): Assertion `I != Impl->Listeners.end()
> > && "PassRegistrationListener not registered!"' failed.
> >
> > bool llvm::sys::SmartMutex<mt_only>::release() [with bool mt_only =
> > true]: Assertion `((recursive && acquired) || (acquired == 1)) &&
> > "Lock not acquired before release!"' failed.
> >
> > So the biggest problem IIRC was the use of
> > llvm::sys::SmartMutex<mt_only>, which is spread around here and there
> > in llvm; the pass registry was (is?) one of its users. If you did not
> > tell llvm to run multithreaded, these locks become no-ops and you end
> > up accessing shared containers concurrently, and so on ...
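For reference, the guard in my patch boils down to something like the
sketch below (simplified, with made-up names, not the literal mesa
code). I think it makes the race behind that first assert visible: two
contexts compiling at once can both pass the flag check before either
of them has flipped it.

   #include <llvm-c/Core.h>

   /* Hypothetical, simplified guard: a plain static flag with no
    * atomics and no lock around it. */
   static bool initialized = false;

   void compile_entry_point(void)
   {
      if (!initialized) {
         /* Threads A and B can both get here. The loser of the race
          * trips the "Already multithreaded!" assertion, and until the
          * call completes, llvm's SmartMutex locks are still no-ops. */
         LLVMStartMultithreaded();
         initialized = true;
      }
      /* ... the actual compilation ... */
   }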
> > Looking at the first assert, the llvm guys have made this problem
> > even worse IMO since I last looked at it: we now need to check whether
> > multithreading is already enabled before trying to set it. Both steps
> > are racy in themselves done this way, and none of this is safe against
> > llvm access already happening from another thread or from other
> > foreign users.
> >
> >> Sorry about that. I didn't have piglit commit access at the time, and
> >> I forgot about the patch. I fixed a few things and sent v3 to the
> >> list.
> > The same here. Thanks for this.
> >
> >>> Regarding the point where this function is called, I had chosen
> >>> static initialization time since llvm requires this function to be
> >>> called single threaded, which we cannot guarantee in any case. Keep
> >>> in mind that you need to ensure this function is called
> >>> non-concurrently even against applications that themselves already
> >>> use the llvm libs in some way while the driver is loaded. The best
> >>> bet is to do this in the dynamic loader, which is itself serialized;
> >>> that way I could avoid calling this function concurrently on
> >>> initialization of different contexts. It should at least shield
> >>> against applications that do the same trick themselves by calling
> >>> this function in the dlopen phase in some static initializer ...
> >>> We may get around part of this problem by dlopening the driver with
> >>> better isolation, but as of today the problem can get that far.
> >> This is a tricky problem, and I'm not sure that radeon_llvm_compile()
> >> is the best place to call llvm_start_multithreaded(). Maybe it would
> >> be better to move this into gallivm, because this problem could
> >> affect any driver that uses the gallivm code, which includes:
> >> llvmpipe, r300g, r600g, radeonsi, and i915g. What do you think?
> > Yep, another place would be better.
> >
> > I do not know the llvm tools well enough, but if I move the current
> > c++ code into src/gallium/auxiliary/gallivm/lp_bld_misc.cpp it works
> > for me (TM). Seriously, I know of one guy who wants to use llvmpipe on
> > windows, and he would benefit from the c++ solution, since the gcc
> > extension solution is clearly not available with the cl windows build.
> > And lacking a windows test system, I am not going to start
> > implementing DllMain either. That could be a short term solution ...
> >
> > Apart from that, I mentioned better symbol isolation somehow.
> >
> > How would this improve the situation with the initialization?
> > If we knew that the current driver module is the only user of this
> > particular llvm instance, then we could place a mutex somewhere and
> > serialize the call to llvm::llvm_start_multithreaded() ourselves,
> > since we then know that we are the only user. This would happen before
> > the first use of llvm by the driver module in question, with no
> > dlopen-time static initializers that sooner or later lead to ordering
> > problems ...
> >
> > So, I have a patch set floating around here for about half a year that
> > changes the RTLD_GLOBAL arguments to dlopen to
> > RTLD_LOCAL|RTLD_DEEPBIND, which should (? I still have to definitely
> > investigate this) provide us a private copy of the loaded
> > libraries/modules. Up to now I did not take the time to analyze
> > whether we rely on some globals really being shared between driver
> > modules, or between different types of drivers, that we would then
> > have multiple times in memory. Note that this does *not* affect the
> > shared glapi topics, as they all live above the loaded driver modules
> > at the level of libGL.so.1.
> > For example, I can think of some need for a single libdrm_radeon to
> > make cl-gl buffer sharing work? Do we rely on that?
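Just to make sure I understand the dlopen change you mean: it would be
something along these lines (a sketch, untested; the wrapper name is
made up, the real change would be in the loader code)?

   /* RTLD_DEEPBIND is a glibc extension. */
   #ifndef _GNU_SOURCE
   #define _GNU_SOURCE
   #endif
   #include <dlfcn.h>

   static void *open_driver_module(const char *path)
   {
      /* RTLD_LOCAL keeps the module's symbols out of the global
       * namespace; RTLD_DEEPBIND makes the module resolve its own
       * symbols (including its private llvm copy) before anything the
       * application already brought in. */
      return dlopen(path, RTLD_NOW | RTLD_LOCAL | RTLD_DEEPBIND);
   }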
> > Alternatively, you might be able to pack every use of llvm, at least
> > in radeon, statically into libllvmradeon and again hide everything
> > that is not driver api from the outside. Then you would also have your
> > private llvm instance that you can initialize with your own lock to
> > serialize. That is more of a radeon-only solution, as the other
> > drivers would then need to do something similar themselves. And I
> > realize that your recent code move work could point in this direction,
> > at least for the radeon type drivers.
> >
> > Both symbol isolation methods will also make the llvm version the
> > driver uses independent of any version the application itself may
> > link, which can otherwise lead to incompatible symbol clashes. So you
> > gain even more with that.
> >
> > Opinions?
>
> Hi Tom & Mathias,
>
> completely agree with Mathias here. I also suggested on IRC a couple of
> weeks ago that libllvmradeon should definitely be static and hide all
> internal symbols and only export those needed by the drivers. That
> should isolate us mostly from the mess that comes with having multiple
> LLVM instances (and probably also different versions) around at the
> same time.
>

How would you recommend doing this? Version scripts?
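Something like the following GNU ld version script is what I would
imagine (a rough sketch; the exported name is just a placeholder for
whatever entry points the loader really needs), passed to the linker
via -Wl,--version-script:

   {
      global:
         radeon_winsys_create;   /* placeholder driver entry point */
      local:
         *;                      /* hide llvm and everything else */
   };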
> That would probably also remove the need to call
> llvm_start_multithreaded from any library load routine, since then we
> can control the order of initialization.
>
> I would take a look into it myself, but I'm a bit busy with UVD right
> now.

I should be able to spend some time on this over the next few days.

-Tom

> Christian.
>
> > Mathias

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev