> On Nov 9, 2022, at 10:03 AM, Tomas Kalibera <tomas.kalib...@gmail.com> wrote:
>
>
> On 11/7/22 01:58, luke-tier...@uiowa.edu wrote:
>> On Sun, 6 Nov 2022, Simon Urbanek wrote:
>>
>>> Carl,
>>>
>>> first, setting such a low interval won't work anyway - the overhead is bigger
>>> than the sampled time, so we should really not allow it to begin with (on
>>> my machine the timer signals arrive before anything can be done so you have
>>> to kill R and you get no output).
>>>
>>> That said, it crashes in doprof() which is called on all threads - the main
>>> R one is ok, but one of the other threads crashes in pthread_self(). At
>>> that time R is trying to propagate the signal from all threads to the main
>>> thread, which seems odd to me (since the main thread already got the
>>> signal). I'm CCing Luke in the hope that he has some ideas. This may fall in
>>> the category of "don't do this" and the fix may be to set a lower bound on
>>> the interval.
>>
>> I can't reproduce this on Linux or macOS.
>>
>> On Linux only one thread receives a signal sent to a process, but the
>> kernel picks which one if multiple threads have the signal unblocked,
>> so we make sure the signal gets relayed to the main thread. If macOS
>> behaves differently then someone who knows how signals and threads
>> interact there would have to adjust this code.
>
> From my reading this is the same on macOS. The profiling signal is
> asynchronous and sent to the process; it will be served by one thread, which
> is picked by the OS. POSIX doesn't say which thread is preferred.

Yes, I saw the same, with the extra detail that thread signal blocking doesn't
necessarily seem to work on macOS.

> While some OSes prefer the main thread (I read macOS and Linux do, but from
> non-authoritative sources), R may also be embedded and not run on the main
> thread.
>
> We have to do something to ensure the R thread is not running while we sample
> its R stack, anyway. On Windows we suspend the R thread for that. On Unix we
> do the relaying. We could in principle suspend the R thread on macOS as
> well, but would have to use Mach calls directly.
>
>> Disallowing such a low interval is reasonable, but if there is a real
>> issue on macOS then it would only mask the problem.
>
> Yes. The key question is why pthread_self() crashed.

Yes, that is the main mystery. Looking at the xnu kernel sources, it is
equivalent to pthread_getspecific(0) [since it's just the first slot in TSD]
plus a check of a magic value in there. I suspect it's that check which
segfaults for whatever reason. I wanted to see if just comparing the pointer
from pthread_getspecific(0) instead of pthread_self() would work, since we don't
care if the pthread_t is valid as we only compare it to the main thread value.
(Not that I would propose that as a fix, since it's very
implementation-specific; I was just curious.) But I didn't get that far: I
cannot really reproduce it, and the closest I get is a Mach exception under
lldb.

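For reference, a minimal sketch of the relay pattern discussed above: the
handler compares the current thread with a recorded main thread and forwards
the signal if they differ. This is not R's actual code. The names main_thread,
sample_count, remember_main_thread, prof_handler and samples_taken are
hypothetical, and the counter merely stands in for real stack sampling. It uses
the portable pthread_self()/pthread_equal() comparison, not the
pthread_getspecific(0) experiment.

```c
#include <pthread.h>
#include <signal.h>

/* Hypothetical names for illustration; this is not R's profiling code. */
static pthread_t main_thread;               /* recorded once at startup */
static volatile sig_atomic_t sample_count;  /* stands in for real sampling */

/* Call once from the thread that should take the samples. */
void remember_main_thread(void)
{
    main_thread = pthread_self();
}

/* SIGPROF handler: forward the signal unless we already are the main thread. */
void prof_handler(int sig)
{
    if (!pthread_equal(pthread_self(), main_thread)) {
        pthread_kill(main_thread, sig);  /* relay and let the main thread sample */
        return;
    }
    sample_count++;                      /* placeholder for sampling the R stack */
}

/* Number of samples taken so far on the main thread. */
int samples_taken(void)
{
    return (int) sample_count;
}
```

The handler would be installed with sigaction() for SIGPROF after calling
remember_main_thread(). Note that the very first thing a relaying thread does
here is call pthread_self(), which is where the reported crash occurs.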
> Otherwise, from the stack trace, the behavior looks ok. The main thread (also
> the R thread) is serving the signal, hence the signal is blocked, but it is
> received again, so another thread is picked to serve it, and it is relaying
> it to the main thread. One more thread is picked to serve it, and it crashes
> while calling pthread_self(). There is also one more thread not involved in
> the signal handling.
>
> POSIX states that pthread_self() is async-signal-safe. The macOS 12.6 manual
> page for sigaction, however, doesn't include any pthread function in its list
> of async-signal-safe functions.
>
> We could do some work-around (hiding the problem a bit more), such as exiting
> from the handler if the signal is already being served by another thread. We
> could also report such a situation to indicate that the interval is
> unreasonable. But it would be good first to know for sure what caused the
> problem.
>
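A minimal sketch of the work-around described above, assuming a C11 compiler:
a handler that returns early when another thread is already serving the signal
and counts how often that happens, so the situation can be reported afterwards.
The names in_handler, skipped_signals, samples, guarded_sample, guarded_handler
and skipped_count are hypothetical, and the sample counter is only a
placeholder. Whether this would avoid the crash is not known; as noted, it
would only hide the underlying problem.

```c
#include <stdatomic.h>

/* Hypothetical names for illustration; this is not R's profiling code. */
static atomic_flag in_handler = ATOMIC_FLAG_INIT;  /* set while a signal is served */
static atomic_int skipped_signals;                 /* signals dropped by the guard */
static atomic_int samples;                         /* stands in for real sampling */

/* Returns 1 if a sample was taken, 0 if the signal was skipped. */
int guarded_sample(void)
{
    /* atomic_flag is guaranteed lock-free; atomic_int is assumed to be
       lock-free on the target platform. */
    if (atomic_flag_test_and_set(&in_handler)) {
        atomic_fetch_add(&skipped_signals, 1);  /* someone else is serving it */
        return 0;
    }
    atomic_fetch_add(&samples, 1);              /* placeholder for stack sampling */
    atomic_flag_clear(&in_handler);
    return 1;
}

/* Signal handler wrapper around the guarded sampling step. */
void guarded_handler(int sig)
{
    (void) sig;
    guarded_sample();
}

/* A non-zero value after profiling suggests the interval was too small. */
int skipped_count(void)
{
    return atomic_load(&skipped_signals);
}
```

The guard itself uses only C11 atomics and makes no pthread calls. A non-zero
skipped_count() after a profiling run could be turned into a warning that the
interval is unreasonable.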
How can you check anything if pthread functions fail? If a simple pthread_self()
crashes then I don't see how you can do anything, since we don't even know what
thread we are, cannot use mutexes, etc.

Cheers,
Simon

_______________________________________________
R-SIG-Mac mailing list
R-SIG-Mac@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-mac