On Tue, Jul 8, 2008 at 5:16 PM, David S. Ahern <[EMAIL PROTECTED]> wrote:
> There's a bug opened for the network lockups -- see
> http://sourceforge.net/tracker/index.php?func=detail&aid=1802082&group_id=180599&atid=893831
>
> Based on my testing I've found that the e1000 has the lowest overhead
> (e.g., lowest irq and softirq times in the guest). I have not seen any
> lockups with the network using the e1000 nic, and a couple of months ago
> I was able to run a reasonably intensive network load continuously for
> several days.
>
> However, the duration tests I've run were with a modified BIOS. Months
> ago when I was digging into the network lockups I was comparing
> interrupt allocations to a DL320G3 running a RHEL3/4 load natively. I
> noticed no interrupts were shared on bare hardware, while in my RHEL3/4
> based kvm guests I was seeing interrupt sharing. So, I patched the bios
> (see attached) to get a different usage.
>
> I have not had time to do the due diligence to see if the stability was
> due to kvm updates or my bios change. If you have the time I'd be
> interested in knowing how the bios change works for you -- if you still
> see lockups.

This bug report is similar to the issue I'm seeing.  In our case, I'm
booting off a 32-bit Knoppix 5.3 DVD ISO, mounting the virtual
partitions, and running rsync from another server on the network.
Everything is connected via gigabit NICs and switch ports.

Host has a kvmbr0 using bond0 as the physical interface.  bond0
combines the 4 ports on an Intel PRO/1000MT PCIe NIC, using
mode=balance-tlb.

Host is running 64-bit Debian Lenny, with the kvm-70 packages and a
2.6.24 kernel, using the kvm/kvm-amd modules that ship with the kernel.

Hardware:
  Tyan h2000M motherboard
  2x dual-core Opteron 2220 CPUs at 2.8 GHz
  8 GB ECC DDR2-667 SDRAM (4 GB per socket)
  12x 500 GB SATA-II HDs in RAID6
  3Ware 9650-ML16 PCIe RAID controller

The guests are using -net tap.

Using rtl8139, I can run rsync until the cows come home (it runs
through cron twice a day, but I've done manual runs 6 times
back-to-back, to sync 400 GB of data).

Using e1000, the guest networking dies within minutes of starting
rsync, every time.  It never lasts more than 15 minutes.  Running
ifdown/ifup on eth0 brings the link back to life, but the rsync
process has to be restarted.
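
As a stopgap only, the manual ifdown/ifup could be scripted inside the
guest.  The following is a rough sketch, not something that has been
run against this setup: the PEER address, the three-ping check, and
the function names are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical guest-side watchdog: bounce the NIC when a known peer
# stops answering. IFACE, PEER and the ping counts are assumptions.
IFACE="${IFACE:-eth0}"
PEER="${PEER:-192.168.0.1}"   # placeholder address, replace with a real host

# Succeeds if the peer answers at least one of three pings.
link_alive() {
    ping -c 3 -W 2 "$PEER" >/dev/null 2>&1
}

# Same recovery step that is done by hand: ifdown, then ifup.
bounce_iface() {
    ifdown "$IFACE" && ifup "$IFACE"
}

# One check: report the state and bounce the interface if it looks dead.
check_once() {
    if link_alive; then
        echo "link ok"
    else
        echo "link dead, bouncing $IFACE"
        bounce_iface
    fi
}

# Only act when invoked as "script run", so the file can also be sourced.
if [ "${1:-}" = "run" ]; then
    check_once
fi
```

Run from cron every minute or so, this would only restore the link;
an rsync that was in flight when the link died would still have to be
restarted.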

Using virtio-net (booting the guest OS using kernel 2.6.24, not
Knoppix), the guest networking dies within minutes as well, but it
lasts a little longer than e1000, and is considerably faster.

Guests are started with:
/usr/bin/kvm -name webmail -smp 1 -m 3072 -vnc :05 -daemonize \
  -localtime -usb -usbdevice tablet \
  -net nic,macaddr=00:16:3e:00:00:05,model=rtl8139 \
  -net tap,ifname=tap05 \
  -pidfile /var/run/kvm/webmail.pid -boot d -no-reboot \
  -drive index=0,media=disk,if=ide,file=/dev/mapper/vol0-webmail--boot \
  -drive index=1,media=disk,if=ide,file=/dev/mapper/vol0-webmail--storage \
  -drive index=2,media=cdrom,if=ide,file=/home/iso/KNOPPIX_V5.3.1DVD-2008-03-26-EN.iso
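
Between test runs the only part of that command that changes is the
model= value on the -net nic option.  A small helper along these lines
keeps the rest identical.  net_opts is a made-up name for illustration,
and the model list is just the three NICs compared in this mail.

```shell
#!/bin/sh
# Hypothetical helper: print the two -net options for one guest.
# Arguments: NIC model, MAC address, tap interface name.
net_opts() {
    model="$1"
    mac="$2"
    tap="$3"
    # Accept only the three models compared in this thread.
    case "$model" in
        rtl8139|e1000|virtio) ;;
        *) echo "unknown NIC model: $model" >&2; return 1 ;;
    esac
    echo "-net nic,macaddr=$mac,model=$model -net tap,ifname=$tap"
}
```

For the e1000 run, the output of
net_opts e1000 00:16:3e:00:00:05 tap05
would take the place of the two -net options in the command above.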

The number of guests running doesn't make a difference: it happens
with just one guest or with all 6 running.  However, only one guest's
network dies at a time.
-- 
Freddie Cash
[EMAIL PROTECTED]