Wow. On Mon, Dec 4, 2023, 5:43 PM Ivan Krylov <krylov.r...@gmail.com> wrote:
> Dear Artur, > > You've got a well-written package. There are some parts I wasn't able > to understand (e.g. changing the class of a variable by reference using > SET_CLASS and later changing it back), but there are no obvious places > where a mistake could be hiding. > > On Fri, 1 Dec 2023 23:43:32 +0000 > Artur Araujo <artur.s...@gmail.com> wrote: > > > Examples with CPU time > 2.5 times elapsed time > > user system elapsed ratio > > contour.TPCmsm 2.605 0.132 0.108 25.343 > > I managed to reproduce this, but not before compiling R-devel r85646 > on a Debian Testing virtual machine using clang-17 and flang-17 with > libomp-17-dev installed. In particular, this doesn't seem to happen > with latest R-devel built using (much older versions of) the GNU > toolchain. > > There was a false start where I saw pthread_create breakpoints hit > inside libpango while contour() was running, but that couldn't be your > problem because examples run with the pdf() device active. Instead, > when source()ing the TPmsm-Ex.R file, the breakpoint fired elsewhere: > > ### Name: as.data.frame.survTP > ### Title: as.data.frame method for a survTP object > ### Aliases: as.data.frame.survTP > ### Keywords: manip metho .... [TRUNCATED] > > Breakpoint 7, __pthread_create_2_1 (newthread=0x7fffffff5a70, > attr=0x7fffffff5a78, start_routine=0x7ffff75882a0 <__kmp_launch_worker()>, > arg=0x5555572d9e80) > at ./nptl/pthread_create.c:623 > (gdb) bt > #0 __pthread_create_2_1 (newthread=0x7fffffff5a70, attr=0x7fffffff5a78, > start_routine=0x7ffff75882a0 <__kmp_launch_worker()>, arg=0x5555572d9e80) > at ./nptl/pthread_create.c:623 > #1 0x00007ffff75879b4 in __kmp_create_worker () at > build-llvm/tools/clang/stage2-bins/openmp/runtime/src/z_Linux_util.cpp:795 > #2 0x00007ffff7518899 in __kmp_allocate_thread () at > build-llvm/tools/clang/stage2-bins/openmp/runtime/src/kmp_runtime.cpp:4677 > #3 0x00007ffff751078a in __kmp_allocate_team () at > build-llvm/tools/clang/stage2-bins/openmp/runtime/src/kmp_runtime.cpp:5384 > #4 0x00007ffff7511813 in __kmp_fork_call () at > build-llvm/tools/clang/stage2-bins/openmp/runtime/src/kmp_runtime.cpp:2122 > #5 0x00007ffff74ffc81 in __kmpc_fork_call () at > build-llvm/tools/clang/stage2-bins/openmp/runtime/src/kmp_csupport.cpp:300 > #6 0x00007ffff465100e in cens2 (pT1=<optimized out>, pE1=<optimized out>, > pS=<optimized out>, pE=<optimized out>, pn=<optimized out>, > tfunc=<optimized out>, > pcorr=<optimized out>, pdistpar=<optimized out>, cfunc=<optimized > out>, pcenspar=<optimized out>, pstate2prob=<optimized out>) at dgpTP.c:83 > #7 dgpTP (n=<optimized out>, corr=<optimized out>, dist=<optimized out>, > distpar=<optimized out>, modelcens=<optimized out>, censpar=<optimized > out>, > state2prob=<optimized out>) at dgpTP.c:178 > #8 0x0000555555645cba in R_doDotCall (fun=0x0, nargs=nargs@entry=7, > cargs=cargs@entry=0x7fffffff63b0, call=call@entry=0x5555573b62b8) > at ../../../R-svn/src/main/dotcode.c:1498 > (gdb) frame 6 > #6 0x00007ffff465100e in cens2 (pT1=<optimized out>, pE1=<optimized out>, > pS=<optimized out>, pE=<optimized out>, pn=<optimized out>, > tfunc=<optimized out>, > pcorr=<optimized out>, pdistpar=<optimized out>, cfunc=<optimized > out>, pcenspar=<optimized out>, pstate2prob=<optimized out>) at dgpTP.c:83 > 83 #pragma omp parallel num_threads(global_num_threads) > (gdb) p global_num_threads > $4 = 8 > > Not sure why the CPU time gets counted for the contour() example > instead of as.data.frame.survTP - do libomp threads not get counted > until they terminate or something? - but since the contour.TPCmsm > example directly follows as.data.frame.survTP, the call to dgpTP here > must be the culprit. > > I would like to suggest a new diagnostic for the CPU time / elapsed > time tests. If any examples have been found to exceed the time ratio > limit and R is running on Linux with a special environment variable > set, the examples should be re-run under a debugger. The debugger > should set a breakpoint on the clone _system call_ (ideally, all system > calls that can create threads or child processes) and print a C-level > backtrace when it fires. Ideally, it should call something like > R_ConciseTraceback to obtain an R-level traceback too, but it's not an > API (at least yet) and I'm not sure it's generally safe to do this in > the middle of an R call. If there is interest, I can try to see if gdb > can be scripted to make this happen. I think it must be a ptrace > debugger and not an LD_PRELOAD wrapper because there are too many ways > to start threads and processes. For example, with libomp, > pthread_create() breakpoints fire while clone() breakpoints do not, > despite something later does make the clone() system call. > > This won't help if the threads get started separately and only later > get utilised in a different example, but I wasn't able to put the > output of `perf report` for good use when debugging similar problems in > a different package. Other ideas for tracing the CPU usage culprits are > also welcome. > > -- > Best regards, > Ivan > > ______________________________________________ > R-package-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-package-devel > [[alternative HTML version deleted]] ______________________________________________ R-package-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-package-devel