Thanks so much, Jonathan! The stack trace does show that the main thread
was waiting on a condition variable to be signaled.

-----------------  lwp# 1 / thread# 1  --------------------
 ffffffff7e8d270c lwp_park (0, 0, 0)
 ffffffff7e8cbee8 cond_wait_queue (ffffffff7ae95518, ffffffff7ae95500, 0, 0, 0, ffffffff7e9f9040) + 28
 ffffffff7e8cc498 cond_wait (ffffffff7ae95518, ffffffff7ae95500, 0, ffffffff7e9f8f84, 0, 0) + 10
 ffffffff7e8cc4d4 pthread_cond_wait (ffffffff7ae95518, ffffffff7ae95500, 0, 1000, ffffffff7e900200, 0) + 8
 ffffffff7a82c9a0 ttc_recMutexLock (ffffffff7ae95500, 7, 1, 1, 1002b5bf0, 0) + 38
 ffffffff7ad0fb78 SQLDisconnect (1002b5bf0, 12400, 173598, ffffffff7ae830f0, 1, ffffffff7ad6cef8) + 40
 ffffffff7c5e8050 x10lofLogoffInternal (ffffffff7e9f7340, 1002909e8, 10028f9f8, 8, 1002b5bf0, 0) + 1f0
 ffffffff7c5f1488 x10allReExecute (1002909e8, 2f79, 1005618b0, 1002a6820, 1002ada28, 10028f9f8) + 5c8
 ffffffff7c5e5e34 x10odr (1002909e8, 4, 100292ff0, 0, 100293168, ffffffff7e9f7340) + 494
 ffffffff7c31e0ec upirtrc (1002909e8, 4, 1001e6740, 6, 0, 1002564e8) + 90c
 ffffffff7c515910 kpurcsc (1002901e0, 4, ffffffff7fffd696, 100294988, 0, 0) + 70
 ffffffff7c481d50 kpuexec (3000, 100292ff0, 1002af580, 100294988, 100290978, 10028faa0) + 2750
 ffffffff7c329ad0 OCIStmtExecute (1002901e0, 1002af500, 1002902b8, 1, 0, 0) + 30
 0000000100069ebc _ZN13COCIStatement13executeCommitEv (1002b9870, 1000b4138, 0, 36, 0, 1001e5d3a) + 78
 000000010003df0c _ZN17DBDynFilterInsert4initEv (1001e5ca0, 2400, ffffffff7e9f72e8, ffffffff7e9f72e4, ffffffff7e9ec000, 26) + d0
 000000010003e780 _ZN25DBConnPoolDynFilterInsert7executeEv (1001e2fd0, ffffffff7fffed00, 10003aa1c, 9, 0, 0) + 78
 000000010003af30 main (3, ffffffff7ffff0e8, ffffffff7ffff108, 1001e2c60, 100000000, ffffffff7df00180) + 4d4
 000000010003a6d4 _start (0, 0, 0, 0, 0, 0) + 7c
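
In the trace, pthread_cond_wait is reached from a frame named
ttc_recMutexLock, under SQLDisconnect. One common way to build a recursive
lock is a plain mutex plus a condition variable, and a lock built that way
would park a contending caller in pthread_cond_wait. The sketch below
illustrates that general technique only. It is an assumption, not the
library's actual code, and the rec_mutex_* names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical recursive lock: a plain mutex guards the state, and
 * waiters sleep on a condition variable until the owner lets go. */
typedef struct {
    pthread_mutex_t guard;
    pthread_cond_t  released;
    pthread_t       owner;   /* only meaningful while depth > 0 */
    int             depth;   /* 0 means unowned */
} rec_mutex_t;

void rec_mutex_init(rec_mutex_t *m) {
    pthread_mutex_init(&m->guard, NULL);
    pthread_cond_init(&m->released, NULL);
    m->depth = 0;
}

void rec_mutex_lock(rec_mutex_t *m) {
    pthread_t self = pthread_self();

    pthread_mutex_lock(&m->guard);
    if (m->depth > 0 && pthread_equal(m->owner, self)) {
        m->depth++;                       /* re-entry by the owner */
    } else {
        while (m->depth > 0)              /* held by another thread */
            pthread_cond_wait(&m->released, &m->guard);  /* parks here */
        m->owner = self;
        m->depth = 1;
    }
    pthread_mutex_unlock(&m->guard);
}

void rec_mutex_unlock(rec_mutex_t *m) {
    pthread_mutex_lock(&m->guard);
    assert(m->depth > 0);                 /* caller must hold the lock */
    if (--m->depth == 0)
        pthread_cond_signal(&m->released);  /* wake one parked waiter */
    pthread_mutex_unlock(&m->guard);
}
```

With a lock like this, a thread that holds it for a long time (for example,
across a slow reconnect attempt) would keep every other caller parked in
pthread_cond_wait until the final unlock, even though truss shows no
contended mutex in the application's own code.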

-----Original Message-----
From: Jonathan Adams [mailto:jonathan.ad...@oracle.com] 
Sent: Tuesday, December 14, 2010 7:30 PM
To: Cathy Guo
Cc: opensolaris-code@opensolaris.org
Subject: Re: [osol-code] lwp_park and lwp_unpark

On Tue, Dec 14, 2010 at 02:16:34PM -0500, Cathy Guo wrote:
> Our application has 9 threads: 1 main thread and 8 monitoring threads.
> The main thread does network transactions and the monitoring threads
> re-create failed network connections periodically. We ran a test with
> broken network connections to see the monitoring threads try to
> re-create them. In this test, we experienced delays in the main
> thread. Truss shows the main thread called lwp_park and slept for
> quite some time before a monitoring thread called lwp_unpark to wake it
> up. The delay in the main thread was sometimes more than 1 minute.
> There's no mutex contention between these threads.

What is the stack trace for the main thread while it is parked?

> The machine I'm using has 4 CPUs with 4 cores each. The application
> only used less than 0.1% CPU.
>
> It seems to me the machine has enough resources for all 9 threads to
> run. I'm puzzled by why the main thread gets parked for so long.

Threads call lwp_park() when they are waiting for a mutex to be dropped
or a condition variable to be fired.  The stack trace while it is
waiting is going to be the best clue as to what is going on.
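
As a rough illustration of the condition-variable case, here is a minimal
sketch in which one thread blocks in pthread_cond_wait until a second
thread signals it, and reports how long it was blocked. The names
(parked_seconds, monitor, park_lock, park_cv) are hypothetical, and the
delay stands in for whatever the signalling thread is busy doing.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

static pthread_mutex_t park_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  park_cv   = PTHREAD_COND_INITIALIZER;
static int             park_ready = 0;

/* Stand-in for a monitoring thread: busy elsewhere for a while,
 * then sets the flag and signals the waiter. */
static void *monitor(void *arg) {
    long ms = *(long *)arg;
    struct timespec delay = { ms / 1000, (ms % 1000) * 1000000L };

    nanosleep(&delay, NULL);

    pthread_mutex_lock(&park_lock);
    park_ready = 1;
    pthread_cond_signal(&park_cv);
    pthread_mutex_unlock(&park_lock);
    return NULL;
}

/* Waits on the condition variable and returns the seconds elapsed
 * between starting the monitor thread and being woken. */
double parked_seconds(long delay_ms) {
    struct timespec t0, t1;
    pthread_t tid;

    pthread_mutex_lock(&park_lock);
    park_ready = 0;
    pthread_mutex_unlock(&park_lock);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    pthread_create(&tid, NULL, monitor, &delay_ms);

    pthread_mutex_lock(&park_lock);
    while (!park_ready)                   /* guard against spurious wakeups */
        pthread_cond_wait(&park_cv, &park_lock);
    pthread_mutex_unlock(&park_lock);

    clock_gettime(CLOCK_MONOTONIC, &t1);
    pthread_join(tid, NULL);

    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

The waiter uses no CPU while blocked, so a long wait of this kind is
consistent with very low CPU usage: the wait lasts as long as the
signalling thread takes, regardless of how many cores are idle.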

Cheers,
- jonathan
