On Thu, Mar 24, 2011 at 7:06 PM, Ulf Magnusson <ulfali...@gmail.com> wrote:
> Hi,
>
> My DirectFB application kept hanging on shutdown whenever I used the
> Linux Input driver, so I did some investigation to figure out why.
> Here's what happens:
>
> 1. On shutdown, the threads processing /dev/event/X are all 
> pthread_cancel()'ed.
> 2. For some reason[1] one or more of the threads reach the D_PERROR
> ("linux_input thread died\n") at the end of linux_input_EventThread().
> 3. One of the threads acquires the log->lock mutex in
> direct_log_printf(), writes to log->fd, and then dies before releasing
> the lock as write() is a cancellation point.
> 4. The next thread that tries to write to the log gets stuck on log->lock.
>
> Here's my proposed fix, which temporarily changes the cancellation
> state of the thread inside direct_log_printf() to prevent it from
> being canceled while holding the lock (I'm by no means a Pthreads guru,
> so there might very well be a better solution):
>
> --- a/lib/direct/log.c  2010-11-15 22:12:08.000000000 +0100
> +++ b/lib/direct/log.c  2011-03-24 17:58:38.259808355 +0100
> @@ -167,14 +167,19 @@
>      else {
>           int  len;
>           char buf[512];
> +          int old_cancellation_state;
>
>           len = vsnprintf( buf, sizeof(buf), format, args );
>
> -          pthread_mutex_lock( &log->lock );
> +          /* Ensure the thread does not get canceled at the write(), which
> +           * would prevent the log lock from being released. */
> +          pthread_setcancelstate( PTHREAD_CANCEL_DISABLE, &old_cancellation_state );
>
> +          pthread_mutex_lock( &log->lock );
>           write( log->fd, buf, len );
> -
>           pthread_mutex_unlock( &log->lock );
> +
> +          pthread_setcancelstate( old_cancellation_state, NULL );
>      }
>
>      va_end( args );
>
> With the above patch my application no longer hangs on shutdown.
>
> You probably also ought to make sure a thread can never die inside a
> direct_log_lock()/direct_log_unlock() pair.
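
For the direct_log_lock()/direct_log_unlock() case, one option might be
a cancellation cleanup handler rather than disabling cancellation. A
minimal sketch, assuming a stand-in mutex instead of DirectFB's real log
structure: log_lock, unlock_on_cancel, locked_worker and
cancel_and_check_lock are hypothetical names, and sleep() stands in for
whatever cancellation point the thread hits while holding the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical stand-in for the log lock, not DirectFB's real structure. */
static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;

/* Cleanup handler: runs if the thread is canceled between push and pop,
 * or when pthread_cleanup_pop() is called with a nonzero argument. */
void unlock_on_cancel( void *arg )
{
     pthread_mutex_unlock( (pthread_mutex_t *) arg );
}

/* Hypothetical worker that holds the lock across a cancellation point. */
void *locked_worker( void *arg )
{
     (void) arg;

     pthread_mutex_lock( &log_lock );
     pthread_cleanup_push( unlock_on_cancel, &log_lock );

     sleep( 30 );               /* cancellation point while holding the lock */

     pthread_cleanup_pop( 1 );  /* normal path: pop the handler and unlock */
     return NULL;
}

/* Cancels the worker, then returns 0 if the lock is free afterwards
 * (an error number such as EBUSY otherwise). */
int cancel_and_check_lock( void )
{
     pthread_t  tid;
     void      *result;
     int        rc;

     rc = pthread_create( &tid, NULL, locked_worker, NULL );
     assert( rc == 0 );

     pthread_cancel( tid );
     pthread_join( tid, &result );
     assert( result == PTHREAD_CANCELED );

     rc = pthread_mutex_trylock( &log_lock );
     if (rc == 0)
          pthread_mutex_unlock( &log_lock );

     return rc;
}
```

Note that pthread_cleanup_push() and pthread_cleanup_pop() have to appear
as a pair in the same lexical scope, so this only fits callers that lock
and unlock within one function.
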
>
> [1] This seems to be due to a uClibc bug that causes the select() on
> linux_input.c:902 (DirectFB 1.4.11) to return -1 while errno remains
> 0. Seems to depend on subtle details in how the application was
> compiled.

To clarify: in this case the thread only reached the log call because of
the uClibc bug, but the same hang would occur any time a thread that has
been pthread_cancel()'ed goes on to write to the log, which looks like a
bug.
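
The general case can be reproduced outside DirectFB. The following is
only a sketch under my own assumptions: demo_lock, demo_log,
canceled_writer and cancel_then_log are hypothetical names, stderr stands
in for log->fd, and the worker spins until the cancel request is pending
so that the write() is guaranteed to happen after pthread_cancel().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <string.h>
#include <unistd.h>

static pthread_mutex_t demo_lock   = PTHREAD_MUTEX_INITIALIZER;
static atomic_int      cancel_sent = 0;

/* Hypothetical logger that follows the approach of the patch: the thread
 * cannot be canceled while it holds the lock. */
void demo_log( const char *msg )
{
     int old_state;

     pthread_setcancelstate( PTHREAD_CANCEL_DISABLE, &old_state );

     pthread_mutex_lock( &demo_lock );
     if (write( STDERR_FILENO, msg, strlen( msg ) ) < 0) {
          /* Nothing sensible to do if logging itself fails. */
     }
     pthread_mutex_unlock( &demo_lock );

     pthread_setcancelstate( old_state, NULL );
}

void *canceled_writer( void *arg )
{
     (void) arg;

     /* Spin (no cancellation point) until the cancel request is pending. */
     while (!atomic_load( &cancel_sent ))
          ;

     demo_log( "linux_input thread died\n" );

     pthread_testcancel();      /* the pending cancel takes effect here */
     return NULL;
}

/* Returns 0 if the lock is free after the canceled thread has logged. */
int cancel_then_log( void )
{
     pthread_t  tid;
     void      *result;
     int        rc;

     rc = pthread_create( &tid, NULL, canceled_writer, NULL );
     assert( rc == 0 );

     pthread_cancel( tid );
     atomic_store( &cancel_sent, 1 );
     pthread_join( tid, &result );
     assert( result == PTHREAD_CANCELED );

     rc = pthread_mutex_trylock( &demo_lock );
     if (rc == 0)
          pthread_mutex_unlock( &demo_lock );

     return rc;
}
```

Without the two pthread_setcancelstate() calls in demo_log(), the thread
could be canceled inside write() with demo_lock still held, which is the
hang described above.
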

/Ulf
_______________________________________________
directfb-dev mailing list
directfb-dev@directfb.org
http://mail.directfb.org/cgi-bin/mailman/listinfo/directfb-dev
