resolver.c:4858: fatal error

2013-06-04 Thread Stas Pirogov
Hello,

Since upgrading our BIND servers to 9.9.3 (from 9.9.2-P2), I've hit the
following crash a couple of times in the last 3 days:

04-Jun-2013 08:33:09.531 general: critical: resolver.c:4858: fatal error:
04-Jun-2013 08:33:09.531 general: critical: RUNTIME_CHECK(tresult == 0) failed
04-Jun-2013 08:33:09.531 general: critical: exiting (due to fatal error in library)

I have a core dump, and the stack trace is as follows:

#0  0x00363f830215 in raise () from /lib64/libc.so.6
#1  0x00363f831cc0 in abort () from /lib64/libc.so.6
#2  0x0041398a in library_fatal_error (file=0x62ccdc "resolver.c", line=4858,
    format=0x63fa0a "RUNTIME_CHECK(%s) %s", args=0x44d8e0b0) at ./main.c:259
#3  0x005b0db2 in isc_error_fatal (file=0x29e9 , line=10736, format=0x6 )
    at error.c:74
#4  0x005b0e0f in isc_error_runtimecheck (file=0x62ccdc "resolver.c",
    line=4858, expression=0x61e94a "tresult == 0") at error.c:81
#5  0x005338c2 in resquery_response (task=0x21b7d6f0, event=)
    at resolver.c:4858
#6  0x005cdf26 in run (uap=0x2b44bf0a3010) at task.c:1116
#7  0x003640406367 in start_thread () from /lib64/libpthread.so.0
#8  0x00363f8d2f7d in clone () from /lib64/libc.so.6

We're running various versions of CentOS; this happened on both 5.3 and 5.5.

Please advise.

Stas Pirogov


___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe 
from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users


memory management troubles/rndc flush hangs bind

2010-06-08 Thread Stas Pirogov
  [ anon ]
2aab28000000   65372   65372   65372 rw---    [ anon ]
2aab2bfd7000     164       0       0 -----    [ anon ]
2aab2c000000   61408   61152   61152 rw---    [ anon ]
2aab30000000   63468   63468   63468 rw---    [ anon ]
2aab33dfb000    2068       0       0 -----    [ anon ]
2aab34000000   47816   47540   47540 rw---    [ anon ]
2aab38000000   33204   14516   14516 rw---    [ anon ]
2aab3a06d000   32332       0       0 -----    [ anon ]
2ae3ff02d000       4       4       4 rw---    [ anon ]
2ae3ff03e000     276     276     276 rw---    [ anon ]
7fff8fa4a000      84      20      20 rw---    [ stack ]
ffffffffff600000    8192       0       0 -----    [ anon ]
------------  ------  ------  ------
total kB     3161504 3029568 3027536
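
To see where the memory sits, the per-mapping columns of a dump like the one above can be totalled. The following is only a minimal Python sketch, assuming `pmap -x`-style rows (address, Kbytes, RSS, Dirty, mode, mapping name). The name `summarize` and the regular expression are illustrative, not part of any tool, and the sample rows are copied from the dump with whitespace normalised.

```python
import re
from collections import defaultdict

# One pmap -x style row: address, Kbytes, RSS, Dirty, mode, mapping name.
# The layout is an assumption based on the dump in this post.
ROW = re.compile(
    r"^(?P<addr>[0-9a-f]+)\s+(?P<kb>\d+)\s+(?P<rss>\d+)\s+(?P<dirty>\d+)"
    r"\s+(?P<mode>[rwxsp-]+)\s*(?P<name>.+)$"
)


def summarize(lines):
    """Return {mapping name: [kbytes, rss, dirty]} summed over all rows."""
    totals = defaultdict(lambda: [0, 0, 0])
    for line in lines:
        m = ROW.match(line.strip())
        if not m:
            continue  # headers, separators and the "total kB" line are skipped
        name = " ".join(m.group("name").split())
        for i, key in enumerate(("kb", "rss", "dirty")):
            totals[name][i] += int(m.group(key))
    return dict(totals)


# A few rows from the dump above (whitespace normalised).
SAMPLE = [
    "2ae3ff02d000       4       4       4 rw---    [ anon ]",
    "2ae3ff03e000     276     276     276 rw---    [ anon ]",
    "7fff8fa4a000      84      20      20 rw---    [ stack ]",
    "total kB     3161504 3029568 3027536",
]

if __name__ == "__main__":
    for name, (kb, rss, dirty) in summarize(SAMPLE).items():
        print(f"{name}: {kb} kB mapped, {rss} kB resident, {dirty} kB dirty")
```

Fed the complete pmap output instead of `SAMPLE`, this would show how much of the total is anonymous (heap-like) memory rather than file-backed mappings.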

strace -fp:

Process 2248 attached with 5 threads - interrupt to quit
[pid  2252] epoll_wait(7,  <unfinished ...>
[pid  2251] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  2250] futex(0x2aaab104c088, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  2249] futex(0x2ae3ff047084, FUTEX_WAIT_PRIVATE, 4239917443, NULL <unfinished ...>
[pid  2248] rt_sigsuspend([] <unfinished ...>
[pid  2251] <... clock_gettime resumed> {1275976570, 97551000}) = 0
[pid  2251] futex(0x2ae3ff048074, FUTEX_WAIT_PRIVATE, 93569205, {0, 301251000}) = -1 ETIMEDOUT (Connection timed out)
[pid  2251] futex(0x2ae3ff048020, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  2251] clock_gettime(CLOCK_REALTIME, {1275976570, 400051000}) = 0
[pid  2251] futex(0x2ae3ff048074, FUTEX_WAIT_PRIVATE, 93569207, {0, 252521000}) = -1 ETIMEDOUT (Connection timed out)
[pid  2251] futex(0x2ae3ff048020, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  2251] clock_gettime(CLOCK_REALTIME, {1275976570, 654023000}) = 0
[pid  2251] futex(0x2ae3ff048074, FUTEX_WAIT_PRIVATE, 93569209, {0, 75751000}) = -1 ETIMEDOUT (Connection timed out)
[pid  2251] futex(0x2ae3ff048020, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  2251] clock_gettime(CLOCK_REALTIME, {1275976570, 731031000}) = 0
[pid  2251] futex(0x2ae3ff048074, FUTEX_WAIT_PRIVATE, 93569211, {0, 155742000}) = -1 ETIMEDOUT (Connection timed out)
[pid  2251] futex(0x2ae3ff048020, FUTEX_WAKE_PRIVATE, 1) = 0

From what I can understand, the threads are hanging, waiting for a lock, and
nothing happens afterwards.
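
To check which threads are actually stuck, an strace capture like the one above can be scanned for calls that start but never return. The following is only a rough Python sketch: the name `pending_calls` is illustrative, and it assumes strace's usual `[pid N] call(...) = result` layout with each call on a single line. A call counts as returned when its line contains ` = ` or when a later `<... call resumed>` line appears for the same pid.

```python
import re

# "[pid  2251] futex(..." and "[pid  2251] <... clock_gettime resumed> ..."
CALL = re.compile(r"^\[pid\s+(\d+)\]\s+(\w+)\(")
RESUMED = re.compile(r"^\[pid\s+(\d+)\]\s+<\.\.\.\s+(\w+)\s+resumed>")


def pending_calls(lines):
    """Map pid -> syscall name for calls that started but never returned."""
    pending = {}
    for raw in lines:
        line = raw.strip()
        m = RESUMED.match(line)
        if m:
            pending.pop(int(m.group(1)), None)  # the earlier call finished
            continue
        m = CALL.match(line)
        if not m:
            continue
        pid, name = int(m.group(1)), m.group(2)
        if " = " in line:
            pending.pop(pid, None)  # call returned on the same line
        else:
            pending[pid] = name  # still waiting inside this call
    return pending


# Lines taken from the capture above (wrapped lines joined).
SAMPLE = [
    "[pid  2252] epoll_wait(7,  <unfinished ...>",
    "[pid  2251] clock_gettime(CLOCK_REALTIME,  <unfinished ...>",
    "[pid  2250] futex(0x2aaab104c088, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>",
    "[pid  2249] futex(0x2ae3ff047084, FUTEX_WAIT_PRIVATE, 4239917443, NULL <unfinished ...>",
    "[pid  2248] rt_sigsuspend([] <unfinished ...>",
    "[pid  2251] <... clock_gettime resumed> {1275976570, 97551000}) = 0",
    "[pid  2251] futex(0x2ae3ff048020, FUTEX_WAKE_PRIVATE, 1) = 0",
]

if __name__ == "__main__":
    for pid, name in sorted(pending_calls(SAMPLE).items()):
        print(f"pid {pid} is still inside {name}()")
```

A pending call in a short capture is not proof of a deadlock by itself: `epoll_wait` and `rt_sigsuspend` normally block while idle, so the two `futex` waits without a timeout are the ones worth a closer look.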

Without running 'rndc flush', named eventually grows to 4G and crashes
with a different error, which I don't have on hand at the moment.

So far we have tried different max-cache-size settings and both threaded and
non-threaded builds, without much difference.

In all cases named is a 64-bit executable.

The problem never happens with BIND 9.4.3-P5, which we run (nor with older
9.4 versions), so it seems that memory management changed in 9.6 (maybe even
9.5). I also tested 9.7.0-P1/P2, with the same outcome.

Any help on the issue will be greatly appreciated. I'm open to any suggestions.

Thanks in advance.

Stas Pirogov
013 Netvision