On Sat, 4 Nov 2017 01:50 am, Chris Angelico wrote:

> On Fri, Nov 3, 2017 at 10:26 PM, Rhodri James <rho...@kynesim.co.uk> wrote:
>> I'm with Steven. To be fair, the danger with threads is that most people
>> don't understand thread-safety, and in particular don't understand either
>> that they have a responsibility to ensure that shared data access is done
>> properly or what the cost of that is. I've seen far too much thread-based
>> code over the years that would have been markedly less buggy and not much
>> slower if it had been written sequentially.
>
> Yes, but what you're seeing is that *concurrent* code is more
> complicated than *sequential* code. Would the code in question have
> been less buggy if it had used multiprocessing instead of
> multithreading?

Maybe.

There's no way to be sure unless you actually compare a threading
implementation with a processing implementation -- and they have to be
"equally good, for the style" implementations. No fair comparing the
multiprocessing equivalent of "Stooge Sort" with the threading equivalent
of "Quick Sort", and concluding that threading is better.

However, we can predict which is likely to be less buggy by reasoning
from general principles. And the general principle is that shared data
tends, all else being equal, to lead to more bugs than no shared data.
The more data is shared, the more bugs, more or less.

I don't know if there are any hard scientific studies on this, but
experience and anecdote strongly suggest it is true. Programming is not
yet fully evidence-based.

For example, most of us accept "global variables considered harmful".
With few exceptions, the use of application-wide global variables to
communicate between functions is harmful and leads to problems. This
isn't because of any sort of mystical or magical malignity from global
variables. It is because the use of global variables adds coupling
between otherwise distant parts of the code, and that adds complexity,
and the more complex code is, the more likely we mere humans are to
screw it up.

So, all else being equal, which is likely to have more bugs?

1. Multiprocessing code with very little coupling between processes; or

2. Threaded code with shared data and hence higher coupling between
   threads?

Obviously the *best* threaded code will have fewer bugs than the *worst*
multiprocessing code, but my heuristic is that, in general, the average
application using threading is likely to be more highly coupled, hence
more complicated, than the equivalent using multiprocessing.

(Async is too new, and to me, too confusing, for me to have an opinion
on yet, but I lean slightly towards the position that deterministic
task-switching is probably better than non-deterministic.)


-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.