Hi all, I know this is an old thread, but I would like to resurrect it in the hope of finding answers.

BIND 9.5 with threads on FreeBSD 7 performs better, but it has this memory consumption problem. 9.4 with threads on FreeBSD 7 delivers less than half the throughput (34,000 vs. 82,000 QPS), but has no such issue. 9.5 without threads doesn't have the issue either, but performs the same as 9.4. More data below. It's basically the same as Vinny's report, but I want to stress that 9.5 with threads performs well. I'm hoping someone can shed some light on where to get a patch for this issue.

Thanks!
- Ivan

system:

FreeBSD 7.0 RELEASE AMD64
Server is a Dell SC1435 with 4 CPU's, no Hyperthreading, 2GB of RAM and a 150GB RAID1
Dnsperf run from a different server on the same network segment over Gig-E

1. FreeBSD 7-RELEASE+BIND 9.4.2-P2 = 34,000 QPS, 94MB mem
2. FreeBSD 7-RELEASE+BIND 9.5.0-P2 threaded = 82,000 QPS, 1.5GIG mem! (and it won't stop until the test script ends, and does not go back to its original state)
3. FreeBSD 7-RELEASE+BIND 9.5.0-P2 non-threaded = 34,000 QPS, 95MB mem

FIRST TEST

# pkg_info | grep bind
bind94-base-9.4.2.2 The BIND DNS suite with updated DNSSEC and threads
# named -v
BIND 9.4.2-P2
# ldd /usr/sbin/named
/usr/sbin/named:
        libcrypto.so.5 => /lib/libcrypto.so.5 (0x8007a9000)
        libthr.so.3 => /lib/libthr.so.3 (0x800a3b000)
        libc.so.7 => /lib/libc.so.7 (0x800b51000)

  PID USERNAME THR PRI NICE   SIZE    RES STATE  C  TIME    WCPU COMMAND
13677 bind       7 100    0 93704K 77912K select 1  6:13 194.43% named

Notes:
1. Regardless of how many times the script was run, memory consumption remained the same.
2. A few seconds after the script was terminated, CPU usage returned to normal:

  PID USERNAME THR PRI NICE   SIZE    RES STATE  C  TIME    WCPU COMMAND
13677 bind       7  98    0 93704K 77912K select 3  7:57   0.00% named

SECOND TEST

# pkg_info | grep bind
bind95-base-9.5.0.2 The BIND DNS suite with updated DNSSEC and threads
# named -v
BIND 9.5.0-P2
# ldd /usr/sbin/named
/usr/sbin/named:
        libcrypto.so.5 => /lib/libcrypto.so.5 (0x8007bf000)
        libxml2.so.5 => /usr/local/lib/libxml2.so.5 (0x800a51000)
        libz.so.4 => /lib/libz.so.4 (0x800c95000)
        libiconv.so.3 => /usr/local/lib/libiconv.so.3 (0x800da9000)
        libm.so.5 => /lib/libm.so.5 (0x800fa2000)
        libthr.so.3 => /lib/libthr.so.3 (0x8010bc000)
        libc.so.7 => /lib/libc.so.7 (0x8011d2000)

  PID USERNAME THR PRI NICE   SIZE    RES STATE  C  TIME    WCPU COMMAND
67304 bind       7  99    0  1524M  1509M select 1  2:10 200.54% named

Notes:
1. Memory consumption reached 1.5G after running the script only 26 times. That's 1.3 million authoritative queries.
2. The script was terminated and the memory consumption stayed the same.

THIRD TEST (very similar to first test)

Hardware Details

CPU: Quad-Core AMD Opteron(tm) Processor 2350 (1995.01-MHz K8-class CPU)
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #1 Launched!
usable memory = 2133491712 (2034 MB)
avail memory = 2058821632 (1963 MB)

# uname -a
FreeBSD jaljeb.infoweapons.com 7.0-RELEASE FreeBSD 7.0-RELEASE #0: Sun Feb 24 10:35:36 UTC 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC

FreeBSD 7.0 RELEASE AMD64
Server is a Dell SC1435 with 4 CPU's, Hyperthreading disabled, 2GB of RAM and a 150GB RAID1
Dnsperf run from a different server on the same network segment over Gig-E

--- On Fri, 8/8/08, Vinny Abello <[EMAIL PROTECTED]> wrote:

> From: Vinny Abello <[EMAIL PROTECTED]>
> Subject: RE: dnsperf and BIND memory consumption
> To: "JINMEI Tatuya / 神明達哉" <[EMAIL PROTECTED]>
> Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Date: Friday, August 8, 2008, 2:33 AM
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
> > Behalf Of JINMEI Tatuya / 神明達哉
> > Sent: Thursday, August 07, 2008 3:56 AM
> > To: Vinny Abello
> > Cc: [EMAIL PROTECTED]
> > Subject: Re: dnsperf and BIND memory consumption
> >
> > At Thu, 7 Aug 2008 00:58:23 -0400,
> > Vinny Abello <[EMAIL PROTECTED]> wrote:
> >
> > > OK. I've recompiled BIND 9.5.0-P2 (from ports) without threads
> > > enabled. I no longer see the memory leak at all. I'm running dnsperf
> > > and I see a constant of 18MB which is much more reasonable for what
> > > I am doing. For me it's easy to reproduce. Some more information
> > > that may help reproduce it:
> > >
> > > FreeBSD 7.0 STABLE AMD64 (cvsup'ed within the past week)
> > > BIND 9.5.0-P2 installed via ports with threads enabled
> > > Server is a Dell PowerEdge 2850 with 2 CPU's, Hyperthreading
> > > disabled, 4GB of RAM and a 36GB RAID1 array on a Perc4 controller
> > > (LSI MegaRAID chipset)
> > > Dnsperf run from a different server on the same network segment
> > > over Gig-E
> >
> > This looks quite similar to the one we heard before.
> > I suspect this is due to some bad interaction between BIND9 and the
> > FreeBSD's thread library or its kernel, rather than application memory
> > leak (in which case you can confirm it by stopping named while its
> > memory is growing and seeing it crash). Here is what I suggested at
> > that time to identify the memory eater (but unfortunately we couldn't
> > get any feedback on it at that time), could you try it?
>
> Sure, I can give it a shot.
>
> > =======================================================================
> > - create a symbolic link from "/etc/malloc.conf" to "X":
> >   # ln -s X /etc/malloc.conf
>
> What exactly is this trying to accomplish here? JFYI, I don't have a
> file /etc/malloc.conf on my server. Did you mean /etc/make.conf? Where
> is X being referenced?
>
> > - start named with a moderate limitation of virtual memory size, e.g.
> >   # /usr/bin/limits -v 384m $path_to_named/named <command line options>
> >
> > Then the named process will eventually abort itself with a core dump
> > due to malloc failure. Please show us the stack trace at that point.
> > Hopefully it will reveal the malloc call that keeps consuming memory.
>
> How would I show the trace that you require once this happens?
>
> >
> > Notes:
> > - of course, this is a very radical way of diagnosing; you need to
> >   keep watching the process because it's "guaranteed" to be aborted.
> > - the VM size must be carefully chosen so that malloc failure won't
> >   happen due to normal named processing. I think 384MB is reasonable
> >   enough according to the statistics you provided so far, but I'm not
> >   100% sure about that.
> > - it's better to keep my latest patch to adb.c and to run named with
> >   '-n 1' so that the mutex_init in adb.c won't trigger the malloc
> >   failure.
> > - the global symbolic link from /etc/make.conf affects other
> >   processes. So, if you're running a different process than named
> >   that can consume a lot of memory or can cause malloc failure, we
> >   should find an alternative approach (there are some, but they are
> >   more complicated so let's discuss those only when they are really
> >   necessary).
>
> Shouldn't be a problem here. Again, it's just being tested and this is
> the only thing the server is doing.
>
> > =======================================================================
> >
> > BTW, you should be able to find the previous discussion on this matter
> > by searching the [EMAIL PROTECTED] list with the subject of
> > "max-cache-size doesn't work with 9.5.0b1".
>
> I may have to go back and find this thread.
>
> > ---
> > JINMEI, Tatuya
> > Internet Systems Consortium, Inc.
> >
> > p.s. I'm pretty sure it's different from the 'memory leak' issue of
> > BIND9/Windows. Let's forget it in this context.
>
> Fair enough. I'll trust you on that.
_______________________________________________
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
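
For anyone repeating the procedure quoted above, here is a rough sketch of the steps as a dry-run shell script. It only prints the commands in order, so nothing changes until you run them yourself as root. The malloc.conf link, the 384m limit and '-n 1' come from the quoted message, and the named path is the one in the ldd output earlier in this post. The gdb step, the core file name named.core and the cleanup step are assumptions, not something stated in the thread. As far as I understand FreeBSD's malloc, the "X" option makes a failed allocation abort the process with a core dump instead of returning an error, which would be the purpose of the symbolic link.

```sh
#!/bin/sh
# Dry-run sketch: prints the diagnostic commands in order, executes nothing.
# Assumed (not from the thread): the gdb step, the core file name, the cleanup.

NAMED=/usr/sbin/named   # path shown in the ldd output earlier in this post
VM_LIMIT=384m           # virtual memory cap suggested in the quoted message
CORE=named.core         # assumed default core name, in named's working directory

# Step 1: point /etc/malloc.conf at "X" so a failed malloc aborts with a core dump.
link_cmd()    { echo "ln -s X /etc/malloc.conf"; }

# Step 2: start named under the memory cap with one worker thread ('-n 1');
# append your usual named command line options.
start_cmd()   { echo "/usr/bin/limits -v $VM_LIMIT $NAMED -n 1"; }

# Step 3 (assumed): open the core in gdb, then type "bt" at the (gdb) prompt.
trace_cmd()   { echo "gdb $NAMED $CORE"; }

# Step 4 (assumed): remove the link so other processes are no longer affected.
cleanup_cmd() { echo "rm /etc/malloc.conf"; }

link_cmd
start_cmd
trace_cmd
cleanup_cmd
```

Step 3 would be run from the directory where the core file was written. Since the build in question is threaded, "thread apply all bt" at the gdb prompt may be more useful than a plain "bt", because it prints the stack of every thread.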