On Mon, 2002-10-21 at 21:44, Peter Pentchev wrote:
> On Mon, Oct 21, 2002 at 06:33:46PM +0200, Linus Kendall wrote:
> > Answer inline below.
> >
> > On Mon, 2002-10-21 at 15:50, Peter Pentchev wrote:
> > > On Mon, Oct 21, 2002 at 04:48:34PM +0300, Peter Pentchev wrote:
> > > > On Mon, Oct 21, 2002 at 03:24:08PM +0200, Linus Kendall wrote:
> > > > > On Mon, 2002-10-21 at 14:45, Peter Pentchev wrote:
> > > > > > On Mon, Oct 21, 2002 at 01:35:59PM +0200, Linus Kendall wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > I'm trying to port a heavily threaded application from Linux
> > > > > > > (Debian 3.0, 2.4.19) to FreeBSD (4.6-RELEASE). The program
> > > > > > > compiles successfully using gcc with -pthreads. But, when I
> > > > > > > try to run the application I get the following error after a
> > > > > > > while (after spawning 11 threads):
> > > > > > >
> > > > > > > Fatal error 'siglongjmp()ing between thread contexts is undefined by
> > > > > > > POSIX 1003.1' at line ? in file
> > > > > > > /usr/src/lib/libc_r/uthread/uthread_jmp.c (errno = ?)
> > > > > > > Abort trap - core dumped
> > > > > >
> > [snip]
> > > > This is interesting; can you produce a simple testcase? If not, I will
> > > > be able to take a look at it some time later today or tomorrow, but not
> > > > right now :(
> >
> > I'm not sure if I've really got time to produce a testcase. As I've
> > understood the main cause of the crash was that in *BSD the signals
> > are sent to each thread but in Linux they're sent to the process.
>
> Okay, I can see what the problem is; however, I have absolutely no idea
> how it is to be solved :(
>
> The DNS resolution routines of libcurl use alarm() as a timeout
> mechanism for the system DNS resolving functions. To enforce the
> timeout even when the resolver functions are automatically restarted
> after the SIGALRM signal, libcurl attempts to set a jump buffer in the
> thread doing the DNS lookup, and to siglongjmp() to it from the SIGALRM
> handler.
>
> This works just fine on Linux, where each thread executes as a separate
> process; the signal is correctly delivered to the thread which invoked
> alarm(), and, consequently, exactly the one that set the jump buffer in
> the first place.
>
> On FreeBSD, however, the signal is delivered merely to the currently
> executing thread; if the resolver routines are currently in the process
> of sending or receiving data on a network socket, the currently
> executing thread may very well not be the one that has requested the
> resolving, and so siglongjmp() may be called from a thread which is NOT
> the one the jump buffer has been set in. As the abort error message
> states, this is behavior not covered by any standards, and, I dare say,
> not very easy to implement at all, so it is currently unimplemented in
> FreeBSD. For a standards reference, the SUSv2 siglongjmp() manpage at
> http://www.opengroup.org/onlinepubs/007908799/xsh/siglongjmp.html
> explicitly states at the end of the DESCRIPTION section:
>
> The effect of a call to siglongjmp() where initialisation of the jmp_buf
> structure was not performed in the calling thread is undefined.
>
> > Blocking all signals resulted in an application which executed but
> > still I got problems with slow responses from libcurl
>
> As I understand it, the only reason for SIGALRM to make a difference
> would be a situation where a DNS query times out, at least by libcurl's
> standards. Is your application trying to do such lookups?
>
> If anybody is interested, I am attaching a short proof-of-concept
> program which starts up two threads, then waits for a signal handler to
> hit. If the longjmp() call is commented out, it displays the thread ID
> of the thread which received the signal - almost always the main thread,
> the one listed as 'me' in the list output at the program start, and most
> definitely not the last thread to call setjmp(), as that would be 't2'.
> If the longjmp() call is uncommented, the signal handler executing in
> the 'me' thread will longjmp() to a buffer initialized in the 't2'
> thread, and the program will abort with your error message with a 100%
> failure (or would that be success in proving the concept?) rate.
>
> People knowledgeable about threads: would there be a way to fix that
> problem? I don't know.. something like examining the jump buffer, then
> activating the thread that is stored there, and resuming the currently
> executing thread at the point where it was interrupted by the signal?
> Without looking at the code, I can guess that most probably the answer
> would be a short burst of hysterical laughter :) Still.. one may hope..
> :)

That was very thorough, thanks! Now I at least have a notion of what is
going on. Since this is slightly urgent, I guess a hack into the libcurl
source code to remove the SIGALRM handling would do the trick (in my
case). In the general case there seems to be a rather big problem here,
as libcurl's behavior cannot really work together with the FreeBSD
implementation of threads.

/Linus.

> G'luck,
> Peter
>
> --
> Peter Pentchev [EMAIL PROTECTED] [EMAIL PROTECTED]
> PGP key: http://people.FreeBSD.org/~roam/roam.key.asc
> Key fingerprint FDBA FD79 C26F 3C51 C95E DF9E ED18 B68D 1619 4553
> Hey, out there - is it *you* reading me, or is it someone else?
>
> #include <sys/types.h>
>
> #include <pthread.h>
> #include <setjmp.h>
> #include <signal.h>
> #include <stdint.h>
> #include <stdio.h>
> #include <string.h>	/* memset() */
> #include <unistd.h>
>
> pthread_mutex_t mtxQ;
> int q[16];		/* signal numbers seen by the handler */
> pthread_t tq[16];	/* thread that ran the handler each time */
> size_t qcnt;
> sigjmp_buf jmpbuf;	/* last written by whichever thread called sigsetjmp() */
>
> /*
>  * Record which thread received the signal. Note: pthread_mutex_lock()
>  * is not async-signal-safe; it is tolerated here only because this is
>  * a proof of concept.
>  */
> static void
> sigalarm(int f)
> {
>
> 	pthread_mutex_lock(&mtxQ);
> 	q[qcnt] = f;
> 	tq[qcnt] = pthread_self();
> 	qcnt++;
> 	pthread_mutex_unlock(&mtxQ);
> 	/* Uncomment to jump into a buffer set by another thread: */
> 	/* siglongjmp(jmpbuf, 5); */
> }
>
> /* Thread body: overwrite the shared jump buffer, then sleep. */
> static void *
> thr(void *arg)
> {
>
> 	sigsetjmp(jmpbuf, 0);
> 	sleep((unsigned int)(intptr_t)arg);
> 	return (NULL);
> }
>
> int
> main(void)
> {
> 	pthread_t t1, t2;
> 	size_t i;
> 	struct sigaction sa;
>
> 	sigsetjmp(jmpbuf, 0);
> 	pthread_mutex_init(&mtxQ, NULL);
> 	printf("me = %ld\n", (long)pthread_self());
> 	pthread_create(&t1, NULL, thr, (void *)(intptr_t)4);
> 	printf("t1 = %ld\n", (long)t1);
> 	pthread_create(&t2, NULL, thr, (void *)(intptr_t)5);
> 	printf("t2 = %ld\n", (long)t2);
> 	memset(&sa, 0, sizeof(sa));
> 	sa.sa_handler = sigalarm;
> 	sigemptyset(&sa.sa_mask);
> 	sigaddset(&sa.sa_mask, SIGALRM);
> 	sigaction(SIGALRM, &sa, NULL);
> 	alarm(1);
> 	printf("qcnt = %zu\n", qcnt);
> 	sleep(3);
> 	printf("qcnt = %zu\n", qcnt);
> 	sleep(3);
> 	printf("qcnt = %zu\n", qcnt);
> 	sleep(3);
> 	printf("qcnt = %zu\n", qcnt);
> 	for (i = 0; i < qcnt; i++)
> 		printf("%2zu\t%d\t%ld\n", i, q[i], (long)tq[i]);
> 	return (0);
> }