On Wed, Feb 12, 2014 at 5:55 PM, Julian Taylor < jtaylor.deb...@googlemail.com> wrote:
> sorry for duplicate, webmail fail.
>
> On Wed, Feb 12, 2014 at 5:41 PM, Sébastien Villemot
> <sebast...@debian.org> wrote:
>
>> On Tuesday, 4 February 2014 at 23:17 +0100, Julian Taylor wrote:
>> > There are well known issues with the gnu openmp variant of openblas in
>> > respect to forks of applications.
>>
>> Please find attached an example of such a breakage, using
>> parallelisation in R. The example is run simply with:
>>
>> R --vanilla < foo.R
>>
>> On sid, with openblas 0.2.8-3 (OpenMP), the program runs and reports 8
>> condition numbers.
>>
>> On sid but with openblas 0.2.8-2 (pthreads), the program hangs.
>>
>
> thanks, it hangs because R forks:
> strace -e clone ...
>
> > mclapply(z, f, mc.cores = num.cores)
> clone(child_stack=0,
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> child_tidptr=0x7f6df0cd9a50) = 28886
> clone(child_stack=0,
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> child_tidptr=0x7f6df0cd9a50) = 28887
> clone(child_stack=0,
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> child_tidptr=0x7f6df0cd9a50) = 28888
> clone(child_stack=0,
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> child_tidptr=0x7f6df0cd9a50) = 28889
>
> this is the same issue that plagues python and that can be fixed with
> pthreads but not gnu openmp.
>
> As I'm not familiar with R, can you construct a testcase where openblas is
> used in both the parent and the child of the fork?
> It should also hang with openmp.
>

I added the function body above the mclapply call and also got the openmp
openblas to hang, as expected.

The difference between the two variants is that the pthread variant of
openblas initializes its threads at application start, so it always hangs
after a fork, while the openmp variant initializes lazily and only hangs
when it's used on both sides of the fork.

This can be fixed in the pthread variant of openblas by adding a
pthread_atfork handler which unlocks and reinitializes the locks after the
fork in the parent and the child.
The handler for that is available in the GitHub bug. While not perfect (it can't be perfect under POSIX semantics), it prevents these types of hangs while being perfectly safe: everything that works now keeps working, since in those programs the fork handler is never called. Fixing openmp, on the other hand, is probably possible (a patch has recently been posted) but will take a lot longer, likely not before gcc 4.10, due in 2015.
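
To illustrate the general shape of such a handler, here is a minimal sketch in C. It is not the patch from the GitHub bug: pool_lock and all function names are made up for the example and stand in for whatever locks the library really holds, and the child handler here simply releases the lock instead of rebuilding any thread pool.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a lock owned by the library's thread pool. */
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread just before fork(). Taking the lock here
 * blocks until no other thread holds it, so the copy of the lock that
 * the child inherits is owned by the forking thread itself. */
static void before_fork(void)
{
    pthread_mutex_lock(&pool_lock);
}

/* Runs in the parent after fork(): release the lock taken above. */
static void after_fork_parent(void)
{
    pthread_mutex_unlock(&pool_lock);
}

/* Runs in the child after fork(): the child's only thread is the copy
 * of the forking thread, which owns the lock, so it may release it. */
static void after_fork_child(void)
{
    pthread_mutex_unlock(&pool_lock);
}

/* Registers the three handlers; returns 0 on success (assumed to be
 * called once, e.g. from the library's initialization code). */
int install_fork_handlers(void)
{
    return pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}

/* Forks once and lets the child take and release the lock.
 * Returns the child's exit status, or -1 on error. */
int fork_and_use_lock(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: this would block forever if the lock had been copied
         * while held by a thread that does not exist in the child. */
        pthread_mutex_lock(&pool_lock);
        pthread_mutex_unlock(&pool_lock);
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A program would call install_fork_handlers() once and can then fork as usual; fork_and_use_lock() is only there to exercise the handlers. Programs that never fork never run any of the three handlers, which is why adding them does not change existing behaviour.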
-- 
debian-science-maintainers mailing list
debian-science-maintainers@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/debian-science-maintainers