On Wed, Nov 9, 2016, at 06:26, Lucas Nussbaum wrote:
> On 08/11/16 at 16:01 -0200, Henrique de Moraes Holschuh wrote:
> > I fear it might be bad, but
> > I would love to be pleasantly surprised that people did get libpthreads
> > locking right most of the time...
>
> I wonder if it has been considered to "fix" glibc so that the misuses
> that are tolerated without TSX are also tolerated with TSX? Or is that
> impossible?

AFAIK, the hardware cannot be programmed to tolerate this kind of
programming error. And I don't think that's a bad thing.

Locking bugs are already subtle enough when the whole deal is fully
visible to software and depends only on trivial atomic operations on
machine word sizes (32-bit on ia32/amd64). Hidden by hardware
transactional memory, they would go from subtle and difficult to debug
straight into utterly nasty hellbug land if the hardware were too
permissive about misuse.

One can handle the SIGSEGV and attempt to recover, I suppose -- which is
painful enough to get right, and that assumes such a thing is possible
at all in the first place: we are talking about a threaded application
here -- but that is so very slow that it is simply not worth it as far
as I am concerned. Not that I think it would be desirable to do so in
the first place: locking bugs are best fixed, not papered over.

This is an area where KISS is absolutely required, too. Handling that
SIGSEGV to trigger a safe whole-application exit while saving user data
is one thing; attempting to resume execution from a signal raised while
inside a transactional state that has been aborted(!) is quite another.
This is NOT the kind of thing I would ever trust current and future
processors to always get right. It reeks of an errata minefield one
should never enter willingly.

The deal with *current* Debian stable is that, if the breakage is too
widespread, we simply might not be able to do the right thing (fix the
real bugs). IMHO, this is not a valid excuse to paper over the breakage
for unstable (or even the next stable, as far as I am concerned. I'd
rather delay the release, although it is _not_ clear at this time that
such a thing would be needed).

It is not really about Intel TSX, it is about broken locking that was
*already* causing hard-to-debug issues in many cases (I believe Ian said
ghostscript was already showing hard-to-debug hangs in this thread), and
that Intel TSX happened to expose.
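
To make "fix the real bugs" a bit more concrete, here is a minimal
sketch in C. It assumes the misuse in question is unlocking a mutex that
is not locked, which this message does not actually spell out. The
function name unlock_without_lock is made up for the example. With a
PTHREAD_MUTEX_ERRORCHECK mutex, POSIX requires that such an unlock
return an error (EPERM) instead of being silently tolerated, so this
class of bug can be found without involving TSX at all:

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <pthread.h>

/*
 * Hypothetical helper, for illustration only: unlock an error-checking
 * mutex that was never locked and return what pthread_mutex_unlock()
 * reports. Returns -1 if the mutex could not be set up.
 */
int unlock_without_lock(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int rc;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;

    /* Ask for an error-checking mutex instead of the default type. */
    if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK) != 0 ||
        pthread_mutex_init(&m, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    /* The deliberate bug: unlock without a matching lock. */
    rc = pthread_mutex_unlock(&m);

    pthread_mutex_destroy(&m);
    return rc;
}
```

Calling unlock_without_lock() should return EPERM. Error-checking
mutexes cost a little on every operation, so this is probably better
suited to debug builds and test suites than to production code.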

-- 
Henrique de Moraes Holschuh <h...@debian.org>