On 2015-12-15 08:33, Martin Gräßlin wrote:
On 2015-12-15 03:20, Michael Pyne wrote:
On Mon, December 14, 2015 16:07:38 Martin Graesslin wrote:
On Friday, November 27, 2015 1:05:26 PM CET Michael Pyne wrote:
> On Thu, November 26, 2015 13:16:04 Martin Graesslin wrote:
> > we are facing a problem during the startup of Plasma on Wayland. If OOM
> > protection is enabled for kdeinit and we already have a running X server,
> > kdeinit freezes dead.
> >
> > I'm sorry for having ignored the issue for too long. I had just disabled
> > OOM protection on my system, so I never hit it. Now I have enabled it
> > again to reproduce the problem. On my system I now have two frozen
> > kdeinit processes:
> >
> > martin 1960 1956 0 77832 26448 1 13:05 ? 00:00:00 /opt/kf5/bin/kdeinit5 --oom-pipe 4 --kded +kcminit_startup
> > martin 1961 1960 0 77832 2816 3 13:05 ? 00:00:00 /opt/kf5/bin/kdeinit5 --oom-pipe 4 --kded +kcminit_startup
> >
> > One has the following stacktrace:
> > It's frozen in this line of code:
> > sigsuspend(&oldsigs); // wait for the signal to come
> >
> > The other one has the following stacktrace:
> > which is:
> > d.n = read(d.fd[0], &d.result, 1);
> >
> > Given that, it looks to me like these two processes deadlock. I do not
> > understand why it happens, why it only happens on Wayland, why it matters
> > that an X server is already running, or what the OOM protection has to do
> > with it.
>
> I don't have the answer, but I think I can help explain the deadlock
> better.
Thanks for your input. It helped me understand quite a bit.
Some more testing results:
Weston+Xwayland: doesn't show the problem
Weston without Xwayland (and DISPLAY=$WAYLAND_DISPLAY): doesn't show the
problem
What I absolutely do not understand is how KWin could influence it. In all
the backtraces I see, it always freezes before interacting with the
windowing system.
Any more ideas on what to test and investigate are highly appreciated. I
have received a rather high number of complaints about this problem; it is
a showstopper and I'm lost with it.
Did you add an error check around the set_protection call in
start_kdeinit.c and see if that call is failing? (i.e. does
"kill(pid, SIGUSR1)" ever execute?).
Yep, I added it, but I'm not sure whether it changed anything. When I
attached gdb to the process, it was hanging in the read in the for loop. So
it might or might not have proceeded to the set_protection call.
If the kill() call *is* reached then perhaps SIGUSR1 is unintentionally
masked in the 'grandchild' process (the child of kdeinit about to be
exec()'d).
Perhaps something in the wayland/kwin/weston/x11 library interaction blocks
SIGUSR1 from being received in that case?
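
To illustrate the hypothesis above, here is a minimal sketch (not the actual
kdeinit code) of a handshake in which a child announces itself over a pipe
and then waits in sigsuspend() for SIGUSR1. The names handshake_completes
and on_usr1 are made up for this example, and the one-second alarm() is only
a watchdog so that the demonstration terminates instead of hanging. If the
process that forks already has SIGUSR1 blocked, the mask saved in oldsigs
still blocks it, so sigsuspend(&oldsigs) can never be woken by that signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Empty handler: its only job is to make sigsuspend() return. */
static void on_usr1(int signo) { (void)signo; }

/* Hypothetical demo. Returns 1 if the child was woken by SIGUSR1, 0 if it
 * stayed stuck until the watchdog killed it, -1 on a setup error. */
int handshake_completes(int parent_blocks_usr1)
{
    sigset_t usr1, parent_old;
    int ready[2], status = 0, result = -1;
    char byte = 'r';

    sigemptyset(&usr1);
    sigaddset(&usr1, SIGUSR1);
    /* Simulate a launcher that has (or has not) blocked SIGUSR1. */
    if (sigprocmask(parent_blocks_usr1 ? SIG_BLOCK : SIG_UNBLOCK,
                    &usr1, &parent_old) != 0)
        return -1;
    if (pipe(ready) != 0) {
        sigprocmask(SIG_SETMASK, &parent_old, NULL);
        return -1;
    }

    pid_t pid = fork();
    if (pid == 0) {
        struct sigaction act;
        sigset_t oldsigs;

        close(ready[0]);
        act.sa_handler = on_usr1;
        sigemptyset(&act.sa_mask);
        act.sa_flags = 0;
        sigaction(SIGUSR1, &act, NULL);

        /* oldsigs receives the mask inherited from the parent. */
        sigprocmask(SIG_BLOCK, &usr1, &oldsigs);

        /* Watchdog for this demo only; make sure it is able to fire. */
        signal(SIGALRM, SIG_DFL);
        sigdelset(&oldsigs, SIGALRM);
        alarm(1);

        if (write(ready[1], &byte, 1) == 1)
            sigsuspend(&oldsigs); /* wait for the signal to come */
        _exit(0);
    }

    close(ready[1]);
    if (pid > 0) {
        /* Wait until the child is ready, then send the wake-up signal. */
        if (read(ready[0], &byte, 1) == 1)
            kill(pid, SIGUSR1);
        if (waitpid(pid, &status, 0) == pid)
            result = (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 : 0;
    }
    close(ready[0]);
    sigprocmask(SIG_SETMASK, &parent_old, NULL);
    return result;
}
```

Calling handshake_completes(0) should report a completed handshake, while
handshake_completes(1) should report the stuck case after the watchdog
fires. Without the watchdog, the second case would simply hang in
sigsuspend(), which matches the first backtrace in the original report.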
Good news: I found the cause. KWin was blocking SIGUSR through
pthread_sigmask, and the blocked signal mask was inherited by the child
processes created through QProcess. By reimplementing setupChildProcess I
was able to fix the problem.
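
For readers following along, the idea behind such a fix can be sketched in
plain C. This is not the KWin patch itself: unblock_usr1 and
child_sees_usr1_blocked are hypothetical names, and sigprocmask stands in
for pthread_sigmask because the example is single-threaded. The point is
that a signal mask survives fork() and exec(), so the child has to unblock
the signal itself after fork() and before exec(), which is the stage at
which a reimplemented setupChildProcess runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: what a pre-exec hook in the child would do. */
static int unblock_usr1(void)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    return sigprocmask(SIG_UNBLOCK, &set, NULL);
}

/* Hypothetical demo. The parent blocks SIGUSR1, forks, and the child
 * reports whether SIGUSR1 is still blocked in its own mask.
 * Returns 1 if blocked, 0 if not blocked, -1 on error. */
int child_sees_usr1_blocked(int apply_fix)
{
    sigset_t set, saved, cur;
    int status = 0, result = -1;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    /* Simulate a parent that blocks SIGUSR1 for its own purposes. */
    if (sigprocmask(SIG_BLOCK, &set, &saved) != 0)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {
        if (apply_fix && unblock_usr1() != 0)
            _exit(2);
        /* A real launcher would call exec*() here; the mask carries over. */
        if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
            _exit(2);
        _exit(sigismember(&cur, SIGUSR1) == 1 ? 1 : 0);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) < 2)
        result = WEXITSTATUS(status);

    /* Restore the caller's original mask. */
    sigprocmask(SIG_SETMASK, &saved, NULL);
    return result;
}
```

With apply_fix set to 0 the child still has SIGUSR1 blocked; with apply_fix
set to 1 the child starts with the signal deliverable again, so a later
sigsuspend() on its saved mask can be woken by kill(pid, SIGUSR1).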
Thanks a lot for pointing me in the right direction!
And yes, I'll still look into changing to the wait variant.
Cheers
Martin
_______________________________________________
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel