On Sat, 9 Oct 2021, Ivan Krylov wrote:
On Thu, 7 Oct 2021 21:58:08 -0400 (EDT)
Vladimir Dergachev <volo...@mindspring.com> wrote:
* My understanding from reading documentation and source code is
that there is no dedicated support in R yet, but there are packages
that use multithreading. Are there any plans for multithreading
support in future R versions ?
Shared memory multithreading is hard to get right in a memory-safe
language (e.g. R), but there's the parallel package, part of base R,
which offers process-based parallelism and may run your code on
multiple machines at the same time. There's no communication _between_
these machines, though. (But I think there's an MPI package on CRAN.)
Well, the way I planned to use multithreading is to speed up processing of
very large vectors, so one does not have to wait seconds for the command
to return. The same could be done for many built-in R primitives.
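
To make the idea concrete, here is a minimal sketch in C of what threading a
pass over a very large vector could look like with OpenMP. The function names
vec_sum and vec_affine are hypothetical, made up for illustration; they are not
part of R or of any package mentioned in this thread. The sketch assumes the
data is already available as a plain C array of doubles and that no R API calls
are made inside the parallel loops.

```c
#include <stddef.h>

/* Hypothetical helper (illustration only): sum a large vector.
 * If the compiler is run without OpenMP support the pragma is ignored
 * and the loop runs serially, so the code stays portable. */
double vec_sum(const double *x, size_t n)
{
    double total = 0.0;
    ptrdiff_t i; /* signed index, accepted by older OpenMP implementations */

    /* Each thread keeps a private partial sum; OpenMP combines them. */
    #pragma omp parallel for reduction(+:total)
    for (i = 0; i < (ptrdiff_t)n; i++) {
        total += x[i];
    }
    return total;
}

/* Hypothetical helper (illustration only): out[i] = a * x[i] + b.
 * Iterations are independent, so they can be split across threads. */
void vec_affine(const double *x, double *out, size_t n, double a, double b)
{
    ptrdiff_t i;

    #pragma omp parallel for
    for (i = 0; i < (ptrdiff_t)n; i++) {
        out[i] = a * x[i] + b;
    }
}
```

With threads enabled, the floating-point sum may differ from a serial sum in
the last bits, because the partial sums are combined in a different order.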
* pthread or openmp ? I am particularly concerned about
interaction with other packages. I have seen that using pthread and
openmp libraries simultaneously can result in incorrectly pinned
threads.
pthreads-based code could be harder to run on Windows (which is a
first-class platform for R, expected to be supported by most packages).
Gábor Csárdi pointed out that R is compiled with mingw on Windows and
has pthread support - something I did not know either.
OpenMP should be cross-platform, but Apple compilers are sometimes
lacking; the latest Apple issue has likely been solved since I heard
about it. If your problem can be made embarrassingly parallel, you're
welcome to use the parallel package.
I have used parallel before; it is very nice, but R-level only. I am looking
for something to speed up the response of individual package functions so they
themselves can be used as part of more complicated code.
* control of maximum number of threads. One can default to the OpenMP
environment variable, but these might vary between OpenMP
implementations.
Moreover, CRAN-facing tests aren't allowed to consume more than 200%
CPU, so it's a good idea to leave the number of workers in control of
the user. According to a reference guide I got from openmp.org, OpenMP
implementations are expected to understand omp_set_num_threads() and
the OMP_NUM_THREADS environment variable.
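
One way to leave the number of workers under the user's control can be
sketched in C as follows. The helpers resolve_num_threads and apply_num_threads
are hypothetical names used only for illustration. The policy shown (an
explicit request wins, then OMP_NUM_THREADS, then a fallback of a single
thread) is an assumption, not something prescribed by R or by OpenMP.

```c
#include <limits.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper (illustration only): decide how many threads to use.
 * Assumed policy: a positive 'requested' value wins; otherwise the leading
 * number in OMP_NUM_THREADS is used; otherwise fall back to 1 thread. */
int resolve_num_threads(int requested)
{
    const char *env;

    if (requested > 0) {
        return requested;
    }
    env = getenv("OMP_NUM_THREADS");
    if (env != NULL) {
        char *end = NULL;
        /* OMP_NUM_THREADS may be a comma-separated list; strtol reads
         * only the first entry, which is enough for this sketch. */
        long v = strtol(env, &end, 10);
        if (end != env && v > 0 && v <= INT_MAX) {
            return (int)v;
        }
    }
    return 1; /* conservative default chosen for this sketch */
}

/* Hypothetical helper (illustration only): pass the resolved count to
 * OpenMP. Does nothing when compiled without OpenMP support. */
void apply_num_threads(int requested)
{
#ifdef _OPENMP
    omp_set_num_threads(resolve_num_threads(requested));
#else
    (void)requested;
#endif
}
```

A package function could take the requested count as an argument, call
apply_num_threads() once before its parallel loops, and treat zero or a
negative value as "use the environment".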
Oh, this would never be run through CRAN tests; it is meant for data that
is too big for CRAN.
I seem to remember that the Intel compiler used a different environment
variable, but this may have been fixed since the last time I used it.
best
Vladimir Dergachev
--
Best regards,
Ivan
______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel