Maybe you have run out of file handles.

William

On Mon, 29 Mar 2021, 17:36 Patrick Goetz, <pgo...@math.utexas.edu> wrote:

> Could this be a function of the R script you're trying to run, or are
> you saying you get this error running the same script which works at
> other times?
>
> On 3/29/21 7:47 AM, Simon Andrews wrote:
> > I've got a weird problem on our slurm cluster. If I submit lots of R
> > jobs to the queue then as soon as I've got more than about 7 of them
> > running at the same time I start to get failures, saying:
> >
> > /bi/apps/R/4.0.4/lib64/R/bin/exec/R: error while loading shared
> > libraries: libpcre2-8.so.0: cannot open shared object file: No such file
> > or directory
> >
> > ..which makes no sense because that library is definitely there, and
> > other jobs on the same nodes worked both before and after the failed
> > jobs. I recently ran 500 identical jobs and 152 of them failed in this
> > way.
> >
> > There are no errors in the log files on the compute nodes where this
> > failed and it happens across multiple nodes so it's not a single one
> > being strange. The R binary is on an isilon network share, but the
> > libpcre2 library is on the local disk for the node.
> >
> > Anyone come across anything like this before? Any suggestions for fixes?
> >
> > Thanks
> >
> > Simon.
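
One way to test the file-handle suggestion at the top of this thread is to log descriptor usage from inside a job on an affected node. The following is only a rough sketch, assuming a Linux compute node with /proc mounted and Python 3 available; the function names fd_usage and system_file_handles are made up for illustration and are not part of Slurm or R. It reports the per-process open-file limit and the system-wide handle counts, nothing more, so it cannot by itself confirm or rule out the cause of the libpcre2 error.

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit, hard_limit) for the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # /proc/self/fd is Linux-specific; each entry is one open descriptor.
    open_fds = len(os.listdir("/proc/self/fd"))
    return open_fds, soft, hard


def system_file_handles(path="/proc/sys/fs/file-nr"):
    """Return (allocated, unused, maximum) system-wide file handles."""
    # The file holds three whitespace-separated integers on one line.
    with open(path) as fh:
        allocated, unused, maximum = (int(x) for x in fh.read().split())
    return allocated, unused, maximum


if __name__ == "__main__":
    open_fds, soft, hard = fd_usage()
    print(f"process: {open_fds} open, soft limit {soft}, hard limit {hard}")
    allocated, unused, maximum = system_file_handles()
    print(f"system: {allocated} allocated of {maximum} maximum")
```

Running this at the start of the same batch script that launches R, on a node where jobs have been failing, would show whether the limits the job inherits are unusually low or whether the node as a whole is close to its maximum.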