On Mon, 2002-10-21 at 21:44, Peter Pentchev wrote:
> On Mon, Oct 21, 2002 at 06:33:46PM +0200, Linus Kendall wrote:
> > Answer inline below.
> > 
> > On Mon, 2002-10-21 at 15:50, Peter Pentchev wrote:
> > > On Mon, Oct 21, 2002 at 04:48:34PM +0300, Peter Pentchev wrote:
> > > > On Mon, Oct 21, 2002 at 03:24:08PM +0200, Linus Kendall wrote:
> > > > > On Mon, 2002-10-21 at 14:45, Peter Pentchev wrote:
> > > > > > On Mon, Oct 21, 2002 at 01:35:59PM +0200, Linus Kendall wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > I'm trying to port a heavily threaded application from Linux (Debian
> > > > > > > 3.0, 2.4.19) to FreeBSD (4.6-RELEASE). The program compiles
> > > > > > > successfully using gcc with -pthread. But when I try to run the
> > > > > > > application, I get the following error after a while (after spawning
> > > > > > > 11 threads):
> > > > > > > 
> > > > > > > Fatal error 'siglongjmp()ing between thread contexts is undefined by
> > > > > > > POSIX 1003.1' at line ? in file
> > > > > > > /usr/src/lib/libc_r/uthread/uthread_jmp.c (errno = ?)
> > > > > > > Abort trap - core dumped
> > > > > > > 
> [snip]
> > > > This is interesting; can you produce a simple testcase?  If not, I will
> > > > be able to take a look at it some time later today or tomorrow, but not
> > > > right now :(
> > 
> > I'm not sure I really have time to produce a testcase. As I understand
> > it, the main cause of the crash is that on *BSD signals are sent to
> > each thread, whereas on Linux they are sent to the process.
> 
> Okay, I can see what the problem is; however, I have absolutely no idea
> how it is to be solved :(
> 
> The DNS resolution routines of libcurl use alarm() as a timeout
> mechanism for the system DNS resolving functions.  To enforce the
> timeout even when the resolver functions are automatically restarted
> after the SIGALRM signal, libcurl attempts to set a jump buffer in the
> thread doing the DNS lookup, and to siglongjmp() to it from the SIGALRM
> handler.
> 
> This works just fine on Linux, where each thread executes as a separate
> process; the signal is correctly delivered to the thread which invoked
> alarm(), and, consequently, exactly the one that set the jump buffer in
> the first place.
> 
> On FreeBSD, however, the signal is delivered merely to the currently
> executing thread; if the resolver routines are currently in the process
> of sending or receiving data on a network socket, the currently
> executing thread may very well not be the one that has requested the
> resolving, and so siglongjmp() may be called from a thread which is NOT
> the one the jump buffer has been set in.  As the abort error message
> states, this is behavior not covered by any standards, and, I dare say,
> not very easy to implement at all, so it is currently unimplemented in
> FreeBSD.  For a standards reference, the SUSv2 siglongjmp() manpage at
> http://www.opengroup.org/onlinepubs/007908799/xsh/siglongjmp.html
> explicitly states at the end of the DESCRIPTION section:
> 
>   The effect of a call to siglongjmp() where initialisation of the jmp_buf
>   structure was not performed in the calling thread is undefined.
> 
> > Blocking all signals resulted in an application which executed, but
> > I still had problems with slow responses from libcurl.
> 
> As I understand it, the only reason for SIGALRM to make a difference
> would be a situation where a DNS query times out, at least by libcurl's
> standards.  Is your application trying to do such lookups?
> 
> If anybody is interested, I am attaching a short proof-of-concept
> program which starts up two threads, then waits for a signal handler to
> hit.  If the siglongjmp() call is commented out, it displays the thread
> ID of the thread which received the signal - almost always the main
> thread, the one listed as 'me' in the list output at the program start,
> and most definitely not the last thread to call sigsetjmp(), as that
> would be 't2'.  If the siglongjmp() call is uncommented, the signal
> handler executing in the 'me' thread will siglongjmp() to a buffer
> initialized in the 't2' thread, and the program will abort with your
> error message with a 100% failure (or would that be success in proving
> the concept?) rate.
> 
> People knowledgeable about threads: would there be a way to fix that
> problem?  I don't know.. something like examining the jump buffer, then
> activating the thread that is stored there, and resuming the currently
> executing thread at the point where it was interrupted by the signal?
> Without looking at the code, I can guess that most probably the answer
> would be a short burst of hysterical laughter :)  Still.. one may hope..
> :)

That was very thorough, thanks! Now I at least have a notion of what
is going on. Since this is slightly urgent, I guess hacking the libcurl
source code to remove the SIGALRM handling would do the trick (in my
case). In the general case there seems to be a rather big problem here,
as libcurl's alarm()/siglongjmp() timeout cannot really work together
with the FreeBSD implementation of threads.

/Linus.

> G'luck,
> Peter
> 
> -- 
> Peter Pentchev        [EMAIL PROTECTED]        [EMAIL PROTECTED]
> PGP key:      http://people.FreeBSD.org/~roam/roam.key.asc
> Key fingerprint       FDBA FD79 C26F 3C51 C95E  DF9E ED18 B68D 1619 4553
> Hey, out there - is it *you* reading me, or is it someone else?
> 
> #include <sys/types.h>
> 
> #include <pthread.h>
> #include <setjmp.h>
> #include <signal.h>
> #include <stdio.h>
> #include <string.h>
> #include <unistd.h>
> 
> pthread_mutex_t        mtxQ;
> int            q[16];
> pthread_t      tq[16];
> size_t                 qcnt;
> sigjmp_buf     jmpbuf;
> 
> static void
> sigalarm(int f)
> {
> 
>       pthread_mutex_lock(&mtxQ);
>       q[qcnt] = f;
>       tq[qcnt] = pthread_self();
>       qcnt++;
>       pthread_mutex_unlock(&mtxQ);
> //    siglongjmp(jmpbuf, 5);
> }
> 
> static void *
> thr(void *arg)
> {
> 
>       sigsetjmp(jmpbuf, 0);
>       sleep((unsigned int)(long)arg);
>       return (NULL);
> }
> 
> int
> main(void)
> {
>       pthread_t t1, t2;
>       size_t i;
>       struct sigaction sa;
> 
>       sigsetjmp(jmpbuf, 0);
>       pthread_mutex_init(&mtxQ, NULL);
>       printf("me = %ld\n", (long)pthread_self());
>       pthread_create(&t1, NULL, thr, (void *)4);
>       printf("t1 = %ld\n", (long)t1);
>       pthread_create(&t2, NULL, thr, (void *)5);
>       printf("t2 = %ld\n", (long)t2);
>       memset(&sa, 0, sizeof(sa));
>       sa.sa_handler = sigalarm;
>       sigemptyset(&sa.sa_mask);
>       sigaddset(&sa.sa_mask, SIGALRM);
>       sigaction(SIGALRM, &sa, NULL);
>       alarm(1);
>       printf("qcnt = %lu\n", (unsigned long)qcnt);
>       sleep(3);
>       printf("qcnt = %lu\n", (unsigned long)qcnt);
>       sleep(3);
>       printf("qcnt = %lu\n", (unsigned long)qcnt);
>       sleep(3);
>       printf("qcnt = %lu\n", (unsigned long)qcnt);
>       for (i = 0; i < qcnt; i++)
>               printf("%2lu\t%d\t%ld\n", (unsigned long)i, q[i], (long)tq[i]);
>       return (0);
> }

