Dear Artur, You've got a well-written package. There are some parts I wasn't able to understand (e.g. changing the class of a variable by reference using SET_CLASS and later changing it back), but there are no obvious places where a mistake could be hiding.
On Fri, 1 Dec 2023 23:43:32 +0000 Artur Araujo <artur.s...@gmail.com> wrote: > Examples with CPU time > 2.5 times elapsed time > user system elapsed ratio > contour.TPCmsm 2.605 0.132 0.108 25.343 I managed to reproduce this, but not before compiling R-devel r85646 on a Debian Testing virtual machine using clang-17 and flang-17 with libomp-17-dev installed. In particular, this doesn't seem to happen with latest R-devel built using (much older versions of) the GNU toolchain. There was a false start where I saw pthread_create breakpoints hit inside libpango while contour() was running, but that couldn't be your problem because examples run with the pdf() device active. Instead, when source()ing the TPmsm-Ex.R file, the breakpoint fired elsewhere: ### Name: as.data.frame.survTP ### Title: as.data.frame method for a survTP object ### Aliases: as.data.frame.survTP ### Keywords: manip metho .... [TRUNCATED] Breakpoint 7, __pthread_create_2_1 (newthread=0x7fffffff5a70, attr=0x7fffffff5a78, start_routine=0x7ffff75882a0 <__kmp_launch_worker()>, arg=0x5555572d9e80) at ./nptl/pthread_create.c:623 (gdb) bt #0 __pthread_create_2_1 (newthread=0x7fffffff5a70, attr=0x7fffffff5a78, start_routine=0x7ffff75882a0 <__kmp_launch_worker()>, arg=0x5555572d9e80) at ./nptl/pthread_create.c:623 #1 0x00007ffff75879b4 in __kmp_create_worker () at build-llvm/tools/clang/stage2-bins/openmp/runtime/src/z_Linux_util.cpp:795 #2 0x00007ffff7518899 in __kmp_allocate_thread () at build-llvm/tools/clang/stage2-bins/openmp/runtime/src/kmp_runtime.cpp:4677 #3 0x00007ffff751078a in __kmp_allocate_team () at build-llvm/tools/clang/stage2-bins/openmp/runtime/src/kmp_runtime.cpp:5384 #4 0x00007ffff7511813 in __kmp_fork_call () at build-llvm/tools/clang/stage2-bins/openmp/runtime/src/kmp_runtime.cpp:2122 #5 0x00007ffff74ffc81 in __kmpc_fork_call () at build-llvm/tools/clang/stage2-bins/openmp/runtime/src/kmp_csupport.cpp:300 #6 0x00007ffff465100e in cens2 (pT1=<optimized out>, pE1=<optimized out>, pS=<optimized out>, pE=<optimized out>, pn=<optimized out>, tfunc=<optimized out>, pcorr=<optimized out>, pdistpar=<optimized out>, cfunc=<optimized out>, pcenspar=<optimized out>, pstate2prob=<optimized out>) at dgpTP.c:83 #7 dgpTP (n=<optimized out>, corr=<optimized out>, dist=<optimized out>, distpar=<optimized out>, modelcens=<optimized out>, censpar=<optimized out>, state2prob=<optimized out>) at dgpTP.c:178 #8 0x0000555555645cba in R_doDotCall (fun=0x0, nargs=nargs@entry=7, cargs=cargs@entry=0x7fffffff63b0, call=call@entry=0x5555573b62b8) at ../../../R-svn/src/main/dotcode.c:1498 (gdb) frame 6 #6 0x00007ffff465100e in cens2 (pT1=<optimized out>, pE1=<optimized out>, pS=<optimized out>, pE=<optimized out>, pn=<optimized out>, tfunc=<optimized out>, pcorr=<optimized out>, pdistpar=<optimized out>, cfunc=<optimized out>, pcenspar=<optimized out>, pstate2prob=<optimized out>) at dgpTP.c:83 83 #pragma omp parallel num_threads(global_num_threads) (gdb) p global_num_threads $4 = 8 Not sure why the CPU time gets counted for the contour() example instead of as.data.frame.survTP - do libomp threads not get counted until they terminate or something? - but since the contour.TPCmsm example directly follows as.data.frame.survTP, the call to dgpTP here must be the culprit. I would like to suggest a new diagnostic for the CPU time / elapsed time tests. If any examples have been found to exceed the time ratio limit and R is running on Linux with a special environment variable set, the examples should be re-run under a debugger. The debugger should set a breakpoint on the clone _system call_ (ideally, all system calls that can create threads or child processes) and print a C-level backtrace when it fires. Ideally, it should call something like R_ConciseTraceback to obtain an R-level traceback too, but it's not an API (at least yet) and I'm not sure it's generally safe to do this in the middle of an R call. If there is interest, I can try to see if gdb can be scripted to make this happen. I think it must be a ptrace debugger and not an LD_PRELOAD wrapper because there are too many ways to start threads and processes. For example, with libomp, pthread_create() breakpoints fire while clone() breakpoints do not, despite something later does make the clone() system call. This won't help if the threads get started separately and only later get utilised in a different example, but I wasn't able to put the output of `perf report` for good use when debugging similar problems in a different package. Other ideas for tracing the CPU usage culprits are also welcome. -- Best regards, Ivan ______________________________________________ R-package-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-package-devel