We're using the CpuProfiler in an Isolate that can have multiple threads running and and taking turns in the Isolate via Unlocker. Frequently, we want to profile what's happening on a thread so we do a StartProfiling for a thread, let it run for a while, and then collect the results via StopProfiling. Works great.
However, I've noticed a difference between Windows and Linux in the profiles, specifically, Windows collects samples for a thread when it's inside an Unlocker bracket whereas Linux does not. So if we have a native function sleep(), that unlocks the Isolate, in Windows I see a high hit count for stacks that have functionName "sleep" and sourceType callback in the bottom stack entry. In Linux, there are no hits where I'm inside the sleep native function. For our purposes, the Windows behavior is much nicer as we can easily see how much of the thread's time is being spent in sleep. In Linux, there is absolutely no way of knowing because the information is not captured. Unfortunately, Windows nicer (IMO) behavior comes at a cost. Specifically, if indeed other threads do run in the Isolate while the profiled thread is sampled when unlocked, there is a tendency for core dumps while getting the stack trace. This is unsurprising because the stack trace needs to look in the Isolate's heap to get things like function names and that's not really a good idea when a thread doesn't have the Isolate lock. The following code samples are all from src/libsampler/sampler.cc. Under Linux the profiler protects itself in SamplerManager::DoSample (called by the SIGPROF handler) with: for (Sampler* sampler : samplers) { ... if (v8::Locker::IsActive() && !Locker::IsLocked(isolate)) continue; sampler->SampleStack(state); } That Locker::isLocked is a bit of a misnomer and really means Locker::isLockedByCurrentThread and, because our SIGPROF handler runs on the profiled thread, if that tread is in an Unlocker bracket in a native function, isLocked returns false and we don't collect a sample. In Windows, the sampler does not run on the profiled thread and, instead, does a SuspendThread of the profiled thread from another thread and then uses GetThreadContext: void Sampler::DoSample() { HANDLE profiled_thread = platform_data()->profiled_thread(); if (profiled_thread == nullptr) return; const DWORD kSuspendFailed = static_cast<DWORD>(-1); if (SuspendThread(profiled_thread) == kSuspendFailed) return; // Context used for sampling the register state of the profiled thread. CONTEXT context; memset(&context, 0, sizeof(context)); context.ContextFlags = CONTEXT_FULL; if (GetThreadContext(profiled_thread, &context) != 0) { ... SampleStack(state); } There is no check if the Isolate is actually locked by the profiled thread so we can get a sample for it but also, unfortunately, seg faults and the like if another thread is running. So it seems that there really should be a check in Sampler::DoSample for Windows to check if the Isolate is locked by the profiled thread. This would require maybe adding another method to Locker like isLockedByThread(ThreadId id) or something like that and then checking that in Sampler::DoSample. When PlatformData is instantiated for Windows it could stash the ThreadId for the current thread so it could be retrieved in Sampler::DoSample. Doesn't seem all that daunting and am willing to put together a PR for this if this seems reasonable and unless someone more conversant with V8 internals doesn't pick it up. However, while this fix would eliminate Windows seg faults when doing profiling in a multi-threaded environment, it would also make me sad because now we would have no way of getting time spent unlocked in my native function. To provide this functionality, it seems like it would be nice to be able to create a provisional TickSample before the Isolate is unlocked in a native function and use that in DoSample if the Isolate is not locked by the profiled thread. Obviously, this would/should only be done if the thread is being profiled. There are a lot of ways this could work but a straw man is that there could be CpuProfiler methods called SaveProvisionalThreadSample and ClearProvisionalThreadSample. The former, at least, would have to be called while one holds the Isolate lock so one would presumably do such a call right before Unlocker instantiation and then do a ClearProvisionalThreadSample call right after the end of the Unlocker context. Then, if DoSample decides that Isolate is not locked by the sampled thread, it could look for a saved provisional sample for the thread and, if available, use that for its sample. Note that a provisional TickSample could be used many times as a thread might be unlocked over many samples. Again, assuming no one else picks this up, happy to take a swing at a PR, myself. While more complicated than the Windows fix, it doesn't seem that daunting. Opinions? Sorry if this would have been more appropriately posted on v8-dev. Thanks -- -- v8-users mailing list v8-users@googlegroups.com http://groups.google.com/group/v8-users --- You received this message because you are subscribed to the Google Groups "v8-users" group. To unsubscribe from this group and stop receiving emails from it, send an email to v8-users+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/v8-users/8abd14c1-1cc8-4aaf-a7c9-e78b03665f67o%40googlegroups.com.