Hello everyone. I have a VPS running a fresh install of wheezy, installed by me from scratch (including kernel). Everything seems to be running fine, except for bind9 and openswan which literally crash the vps as explained below.
I'll start with bind9, since I have more info there. It's set up as a name server authoritative for two zones. Querying both zones works fine from localhost and from the internet, over both ipv4 and ipv6. The problem comes up when I try to use bind9 to resolve other domains from localhost. When resolving certain domains, the vps literally crashes. I have to send it a boot request, and it boots up again, starting with grub, to the login prompt.

It doesn't matter if I use dig to query localhost by hand, or if I have nameserver ::1 or nameserver 127.0.0.1 in resolv.conf. It doesn't matter if I query A records or AAAA records (if those exist). The results are the same: bind9 resolves some domains, and crashes on others. There are no errors in the logs. If I use dig by hand, type in:

dig @localhost www.debian.org.

and press enter, the crash happens right there and then, and I have to send the vps a boot request at that point.

Here's a list of domains that work fine, and those which crash the machine.

crashes:
www.ietf.org.
www.linux-speakup.org.
ftp.us.debian.org.
www.debian.org.

works fine:
www.yahoo.com.
www.google.com.
www.fsf.org.

There are probably many more in both categories. In the case of a query that works, I can get a cname record, and query that until I get answers for a and aaaa records without problems. It doesn't matter whether or not I use forwarders. If I put my vps provider's name servers in resolv.conf, I can query everything just fine.

When using the stock wheezy kernel, the machine would sometimes crash during boot right after printing "starting bind9," before the ok that comes after. This was true especially when starting named without the -4 flag to disable ipv6. There were also random crashes every couple of days or so when I wasn't logged into the machine watching for them. All this seems to have gone away after I upgraded to linux 3.9 from wheezy-backports, and just the query crashes remain.
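For anyone who wants to repeat the dig test, here is a minimal sketch of how it could be scripted so that the last name tried survives a crash. The function name probe, the log path, the RESOLVER variable and the +time/+tries values are assumptions made for illustration; none of them come from the setup described above.

```shell
#!/bin/sh
# Sketch only: query each name through the local resolver and record
# progress on disk *before* the query, so the log still names the
# culprit if the machine goes down mid-query.
# LOG and RESOLVER are hypothetical defaults; override as needed.
LOG=${LOG:-/var/tmp/dig-probe.log}
RESOLVER=${RESOLVER:-127.0.0.1}

probe() {
    for name in "$@"; do
        printf 'trying %s\n' "$name" >> "$LOG"
        sync    # flush the "trying" line to disk before the risky query
        # One attempt with a short timeout; a non-zero exit from dig
        # usually means no reply was received or the call was malformed.
        if dig "@$RESOLVER" "$name" A +time=5 +tries=1 >/dev/null 2>&1; then
            printf 'ok %s\n' "$name" >> "$LOG"
        else
            printf 'failed %s\n' "$name" >> "$LOG"
        fi
        sync
    done
}
```

It could be called as: probe www.yahoo.com. www.fsf.org. www.debian.org. After a reboot, a final "trying" line with no matching "ok" or "failed" line points at the query that was in flight.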
I know someone who is with the same VPS provider and runs fedora 16 in his VPS. I have a shell account on his system, and have been able to verify for myself by using dig that it's possible to query all the domains I listed above using the local bind9 on his machine with no crashes. As far as I can tell (lspci, /proc/cpuinfo), his vps is configured exactly like mine as far as hardware goes, except for RAM and HD capacity. That's all the info I have on the bind9 problem.

As for openswan, it's set up with one connection, configured as responder using the native netkey stack. When openswan starts, I get this in /var/log/syslog:

Aug 9 23:07:16 vserver kernel: [ 504.009595] NET: Registered protocol family 15
Aug 9 23:07:16 vserver ipsec_setup: Starting Openswan IPsec U2.6.37-g955aaafb-dirty/K3.9-0.bpo.1-amd64...
Aug 9 23:07:16 vserver ipsec_setup: Using NETKEY(XFRM) stack
Aug 9 23:07:16 vserver kernel: [ 504.132588] Initializing XFRM netlink socket
Aug 9 23:07:16 vserver kernel: [ 504.194202] AVX instructions are not detected.
Aug 9 23:07:16 vserver kernel: [ 504.202914] AVX instructions are not detected.
Aug 9 23:07:16 vserver ipsec_setup: ...Openswan IPsec started
Aug 9 23:07:16 vserver ipsec__plutorun: adjusting ipsec.d to /etc/ipsec.d
Aug 9 23:07:16 vserver pluto: adjusting ipsec.d to /etc/ipsec.d
Aug 9 23:07:16 vserver ipsec__plutorun: 002 loading certificate from /etc/ipsec.d/certs/servercert.pem
Aug 9 23:07:16 vserver ipsec__plutorun: 002 loaded host cert file '/etc/ipsec.d/certs/servercert.pem' (1505 bytes)
Aug 9 23:07:16 vserver ipsec__plutorun: 002 no subjectAltName matches ID '%fromcert', replaced by subject DN
Aug 9 23:07:16 vserver ipsec__plutorun: 002 added connection description "l2tp"

The machine crashes when I try to initiate a connection from a win7 client.
Nothing gets written to the logs here, so the output below is the last screen full I get when logged into the vps via the serial console using out of band access, with the vps running in run level 1, and invoke-rc.d ipsec start done by hand:

pluto[2266]: packet from 10.0.0.1:500: received Vendor ID payload [draft-ietf-ipsec-nat-t-ike-02_n] meth=106, but already using method 109
pluto[2266]: packet from 10.0.0.1:500: ignoring Vendor ID payload [FRAGMENTATION]
pluto[2266]: packet from 10.0.0.1:500: ignoring Vendor ID payload [MS-Negotiation Discovery Capable]
pluto[2266]: packet from 10.0.0.1:500: ignoring Vendor ID payload [Vid-Initial-Contact]
pluto[2266]: packet from 10.0.0.1:500: ignoring Vendor ID payload [IKE CGA version 1]
pluto[2266]: "l2tp"[1] 10.0.0.1 #1: responding to Main Mode from unknown peer 10.0.0.1
pluto[2266]: "l2tp"[1] 10.0.0.1 #1: OAKLEY_GROUP 20 not supported. Attribute OAKLEY_GROUP_DESCRIPTION
pluto[2266]: "l2tp"[1] 10.0.0.1 #1: OAKLEY_GROUP 19 not supported. Attribute OAKLEY_GROUP_DESCRIPTION
pluto[2266]: "l2tp"[1] 10.0.0.1 #1: transition from state STATE_MAIN_R0 to state STATE_MAIN_R1
pluto[2266]: "l2tp"[1] 10.0.0.1 #1: STATE_MAIN_R1: sent MR1, expecting MI2
pluto[2266]: "l2tp"[1] 10.0.0.1 #1: NAT-Traversal: Result using RFC 3947 (NAT-Traversal): peer is NATed
pluto[2266]: "l2tp"[1] 10.0.0.1 #1: transition from state STATE_MAIN_R1 to state STATE_MAIN_R2
pluto[2266]: "l2tp"[1] 10.0.0.1 #1: STATE_MAIN_R2: sent MR2, expecting MI3

That's all the info I have on the openswan issue.

This vps is of course running lots more than just bind9 and openswan. Apache, proftpd, postfix, spamassassin, clamav, opendkim, just to name a few. All of those appear to be running without problems.

As far as the vps itself, it is based on KVM/QEMU with one cpu, and one gig of RAM. The network card uses the virtio_net module, and the HD shows up as /dev/vda (I assume using the virtio_blk module, which is also automatically loaded).
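The hardware comparison between the two guests (lspci, /proc/cpuinfo, loaded virtio modules) could be captured in a form that is easy to diff. What follows is only a rough sketch: the function names cpu_model and hw_snapshot and the CPUINFO override are made up for illustration.

```shell
#!/bin/sh
# Sketch only: print a short summary of the virtual hardware so the
# output from two guests can be compared with diff.
# CPUINFO is a hypothetical override, handy for trying a saved copy.
CPUINFO=${CPUINFO:-/proc/cpuinfo}

cpu_model() {
    # Print the value of the first "model name" line in a
    # cpuinfo-style file.
    sed -n 's/^model name[[:space:]]*:[[:space:]]*//p' "$CPUINFO" | head -n 1
}

hw_snapshot() {
    echo "kernel: $(uname -r)"
    echo "cpu: $(cpu_model)"
    # lspci may not be installed; skip it quietly if it is missing.
    if command -v lspci >/dev/null 2>&1; then
        lspci
    fi
    # List the names of loaded modules starting with "virtio".
    if [ -r /proc/modules ]; then
        grep '^virtio' /proc/modules | cut -d' ' -f1
    fi
}
```

Running hw_snapshot > snapshot.txt on each machine and diffing the two files would show any difference beyond RAM and HD capacity.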
Based on the login banner I get when using out of band access, the host seems to be running openbsd. I'm not sure if the machine providing the out of band account and the host my vps is running on are actually one and the same though. According to /proc/cpuinfo, the KVM/QEMU version seems to be 0.9.1.

Any help in at least figuring out what is causing this, if not in actually getting a fully functional bind9 and openswan, is much appreciated. If more info is necessary, I'll see what I can do.

Greg

--
web site: http://www.gregn.net
gpg public key: http://www.gregn.net/pubkey.asc
skype: gregn1 (authorization required, add me to your contacts list first)

--
Free domains: http://www.eu.org/ or mail dns-mana...@eu.org

--
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20130811053602.ga5...@gregn.net