Re: [ATA] and re(4) stability issues
Victor Balada Diaz wrote:
> Digging at linux source code i've found that they do some special things
> for this chipset that i've been unable to find on our code. This is
> linux code for my chipset:
>
>     AHCI_HFLAGS (AHCI_HFLAG_IGN_SERR_INTERNAL |
>                  AHCI_HFLAG_32BIT_ONLY | AHCI_HFLAG_NO_MSI |
>                  AHCI_HFLAG_SECT255),
>
> File and the rest of the code in here[3].
>
> As i saw AHCI_HFLAG_NO_MSI i tried doing the easiest thing i could
> think of, switching MSI and MSI-x off for the whole system, so
> i added to /boot/loader.conf this tunables:

FreeBSD's ata(4) driver doesn't support MSI. In Linux's libata this flag is used in:

    if ((hpriv->flags & AHCI_HFLAG_NO_MSI) || pci_enable_msi(pdev))
            pci_intx(pdev, 1);

In FreeBSD's code we have the same:

    /* enable PCI interrupt */
    pci_write_config(dev, PCIR_COMMAND,
        pci_read_config(dev, PCIR_COMMAND, 2) & ~0x0400, 2);

The AHCI_HFLAG_IGN_SERR_INTERNAL flag is meant to ignore SERR_INTERNAL errors. FreeBSD's ata(4) driver ignores them too.

The AHCI_HFLAG_32BIT_ONLY flag limits the controller to 32-bit DMA only. In FreeBSD, if the AHCI CAP register reports that the controller supports 64-bit DMA, the driver will use 64-bit. So i think one quirk could be added for you, but i'm not sure that the problem is here.

The AHCI_HFLAG_SECT255 flag limits I/O operations to 255 sectors; FreeBSD uses a limit of 128 by default.

-- 
WBR, Andrey V. Elsukov
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
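For readers following the two snippets in the message above, here is a minimal user-space sketch of the same logic, not the actual Linux or FreeBSD code. The names `HFLAG_NO_MSI`, `PCI_CMD_INTX_DISABLE`, `choose_irq_mode` and `enable_intx` are hypothetical stand-ins; the only fact taken from the thread is that the mask `0x0400` is cleared in the PCI command register to enable the legacy interrupt.

```c
#include <stdint.h>

/* Hypothetical stand-ins; these are not the real Linux or FreeBSD definitions. */
#define HFLAG_NO_MSI         (1u << 0)  /* chipset quirk: never try MSI */
#define PCI_CMD_INTX_DISABLE 0x0400u    /* bit 10 of the PCI command register */

enum irq_mode { IRQ_MODE_MSI, IRQ_MODE_INTX };

/*
 * Mirror the quoted libata condition: fall back to legacy INTx when the
 * quirk flag is set or when enabling MSI failed (msi_enable_rc != 0).
 * Unlike the original, the MSI attempt is passed in as a result code
 * rather than being skipped by short-circuit evaluation.
 */
enum irq_mode choose_irq_mode(uint32_t hflags, int msi_enable_rc)
{
    if ((hflags & HFLAG_NO_MSI) || msi_enable_rc != 0)
        return IRQ_MODE_INTX;
    return IRQ_MODE_MSI;
}

/* Clear the INTx-disable bit so the device may assert its legacy interrupt. */
uint16_t enable_intx(uint16_t command)
{
    return (uint16_t)(command & ~PCI_CMD_INTX_DISABLE);
}
```

With the quirk flag set, `choose_irq_mode` always picks INTx, which is what both drivers end up doing for this chipset according to the message above.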
Re: [ATA] and re(4) stability issues
On Wed, Dec 10, 2008 at 03:12:26PM +0900, Pyun YongHyeon wrote: > On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote: > > Hello, > > > > I got various machines[1] at hetzner.de and I've been having problems > > with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've > > been trying to narrow the problem so someone more knowledgeable than me > > is able to fix it. This mail is an other attempt to ask a question > > with regards ATA code to see if this time i got something. > > > > For the ones that don't actually know what happened: > > > > With FreeBSD 7.0 -RELEASE for amd64 and default kernel > > the system shared re0 interrupt with OHCI and this caused > > re(4) to corrupt packets and create interrupt storms. Tried > > re(4) in 7.0-RELEASE had bus_dma(9) bug which could be easily > triggered on systems with > 4GB memory. But I dont' know whether > this is related with interrupt storms. > > > updating to 7.1 -BETA2 and still had some problems with it. > > > > I've opened the PR kern/128287[2] and Remko quickly answered > > with a workaround: that workaround was removing USB support from > > my kernel. I did it and re(4) wasn't sharing interrupts anylonger, > > and the interrupt storms were gone. Now sometime later the interface > > goes up and down from time to time, but less often. Also sometimes > > the machine losts the network interface but continues to work. > > > > It seems that your controller supports MSI so you can set a tunable > hw.re.msi_disable to 0 to enable MSI. With MSI you can remove > interrupt sharing(e.g. add hw.re.msi_disable="0" to > /boot/loader.conf file.) However there were several issues on re(4) > w.r.t MSI so it was off by default. This is undocumented and with sysctl -a i can't find the tunable. Is this a HEAD feature or it's also in 7.1 -BETA2? Should i add hw.re_msi_disable="0" to /boot/loader.conf? 
This was sharing the interrupt with USB. Does USB need any special MSI handling, or is re using MSI enough to stop sharing the interrupt?

> > I know it continues to work because some days later i can see that
> > it tried to deliver the status reports but was unable to resolve the
> > aliases hostnames. I can't ping the machine and i know the network
> > is OK. If i reboot the machine everything is working again.
> >
>
> Recently I've made small changes to re(4) which may help to detect
> link state change event. Would you try re(4) in HEAD?

Can i just drop HEAD's /stable/7/sys/dev/re/ into -STABLE and test that, or do i need to test the whole HEAD kernel?

Regards.
-- 
The most reliable proof that intelligent life exists on other planets is that they have not tried to contact us.
Re: [ATA] and re(4) stability issues
On Wed, Dec 10, 2008 at 11:58:12AM +0300, Andrey V. Elsukov wrote:
> Victor Balada Diaz wrote:
> >Digging at linux source code i've found that they do some special things
> >for this chipset that i've been unable to find on our code. This is
> >linux code for my chipset:
> >
> >    AHCI_HFLAGS (AHCI_HFLAG_IGN_SERR_INTERNAL |
> >                 AHCI_HFLAG_32BIT_ONLY | AHCI_HFLAG_NO_MSI |
> >                 AHCI_HFLAG_SECT255),
> >
> >File and the rest of the code in here[3].
> >
> >As i saw AHCI_HFLAG_NO_MSI i tried doing the easiest thing i could
> >think of, switching MSI and MSI-x off for the whole system, so
> >i added to /boot/loader.conf this tunables:
>
> FreeBSD's ata(4) driver doesn't support MSI. This flag in linux's libata
> used in
>
>     if ((hpriv->flags & AHCI_HFLAG_NO_MSI) || pci_enable_msi(pdev))
>             pci_intx(pdev, 1);
>
> In FreeBSD's code we have the same:
>
>     /* enable PCI interrupt */
>     pci_write_config(dev, PCIR_COMMAND,
>         pci_read_config(dev, PCIR_COMMAND, 2) & ~0x0400, 2);
>
> AHCI_HFLAG_IGN_SERR_INTERNAL flag targeted to ignore SERR_INTERNAL errors.
> FreeBSD's ata(4) driver ignores they too.
>
> AHCI_HFLAG_32BIT_ONLY flag limits to use 32-bit DMA only.
> If AHCI CAP register reports that controller supports 64-bit DMA driver
> will use 64-bit.
> So i think there can be added one quirk for you, but i'm not sure that
> problem is here..
>
> AHCI_HFLAG_SECT255 flag limits I/O operation to 255 sectors, FreeBSD uses
> 128-limit by default.

Thanks for explaining to me what the flags do. I'm not skilled enough to create the DMA quirks, but if you could give me some patches i'll test them. Also, if you have any other idea on what i could test or how i can debug this, it would be more than welcome.

Thanks.
Regards.
-- 
The most reliable proof that intelligent life exists on other planets is that they have not tried to contact us.
Re: [ATA] and re(4) stability issues
On 10Dec, 2008, at 10:11 , Victor Balada Diaz wrote:
> Thanks for explaining me what the flags do. I'm not skilled enough
> to create the DMA quirks but if you could give me some patches i'll
> test them. Also if you have any other idea on what could i test or
> how can i debug this it would be more than welcome.

Comment out the following two lines in ata_ahci_dmainit():

    if (ATA_INL(ctlr->r_res2, ATA_AHCI_CAP) & ATA_AHCI_CAP_64BIT)
        ch->dma->max_address = BUS_SPACE_MAXADDR;

And you will not use 64bit DMA even if the chipset supports it. However I have not seen any chipsets supporting this fail, YMMV as usual :)

-Søren
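The effect of commenting out those two lines can be sketched in plain C as below. This is only an illustration under stated assumptions: `CAP_64BIT`, `MAXADDR_32BIT`, `MAXADDR_64BIT` and `dma_max_address` are hypothetical names, not the ata(4) driver's own, and the sketch assumes the driver's default limit is the 32-bit one when the capability branch is not taken.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the driver's constants; values are illustrative. */
#define CAP_64BIT      (1u << 31)            /* AHCI CAP bit for 64-bit addressing */
#define MAXADDR_32BIT  UINT64_C(0xFFFFFFFF)
#define MAXADDR_64BIT  UINT64_MAX

/*
 * Pick the highest bus address the channel may DMA to. With force_32bit set
 * (the equivalent of commenting out the two lines above, or of a
 * 32BIT_ONLY-style quirk) the capability bit is ignored.
 */
uint64_t dma_max_address(uint32_t cap, int force_32bit)
{
    if (!force_32bit && (cap & CAP_64BIT))
        return MAXADDR_64BIT;
    return MAXADDR_32BIT;
}
```

A per-chipset quirk, as suggested earlier in the thread, would amount to setting `force_32bit` only for the affected controller instead of for every AHCI device.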
Re: [ATA] and re(4) stability issues
On Wed, Dec 10, 2008 at 07:28:00PM +0900, Pyun YongHyeon wrote: > On Wed, Dec 10, 2008 at 09:59:35AM +0100, Victor Balada Diaz wrote: > > On Wed, Dec 10, 2008 at 03:12:26PM +0900, Pyun YongHyeon wrote: > > > On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote: > > > > Hello, > > > > > > > > I got various machines[1] at hetzner.de and I've been having problems > > > > with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. > I've > > > > been trying to narrow the problem so someone more knowledgeable than > me > > > > is able to fix it. This mail is an other attempt to ask a question > > > > with regards ATA code to see if this time i got something. > > > > > > > > For the ones that don't actually know what happened: > > > > > > > > With FreeBSD 7.0 -RELEASE for amd64 and default kernel > > > > the system shared re0 interrupt with OHCI and this caused > > > > re(4) to corrupt packets and create interrupt storms. Tried > > > > > > re(4) in 7.0-RELEASE had bus_dma(9) bug which could be easily > > > triggered on systems with > 4GB memory. But I dont' know whether > > > this is related with interrupt storms. > > > > > > > updating to 7.1 -BETA2 and still had some problems with it. > > > > > > > > I've opened the PR kern/128287[2] and Remko quickly answered > > > > with a workaround: that workaround was removing USB support from > > > > my kernel. I did it and re(4) wasn't sharing interrupts anylonger, > > > > and the interrupt storms were gone. Now sometime later the interface > > > > goes up and down from time to time, but less often. Also sometimes > > > > the machine losts the network interface but continues to work. > > > > > > > > > > It seems that your controller supports MSI so you can set a tunable > > > hw.re.msi_disable to 0 to enable MSI. With MSI you can remove > > > interrupt sharing(e.g. add hw.re.msi_disable="0" to > > > /boot/loader.conf file.) However there were several issues on re(4) > > > w.r.t MSI so it was off by default. 
> > >
> > This is undocumented and with sysctl -a i can't find the tunable. Is this
> > a HEAD feature or it's also in 7.1 -BETA2? Should i add
>
> Yeah it's an undocumented feature. But most drivers written by me
> have similar knobs. Both HEAD and stable/7 including 7.1 BETA2 have
> the tunable.

I think it would be great if you could document it, or at least show it by default in sysctl -ad with a small description.

> > hw.re_msi_disable="0" to /boot/loader.conf?
>        ^
>        Should be hw.re.msi_disable="0"
>
> Yes, just add it to /boot/loader.conf. Note, you should not disable
> system-wide MSI control (i.e. keep hw.pci.enable_msi == 1).
>
> > This was sharing interrupt with USB, does USB need any special MSI handling
> > or with re using MSI is enough to not share the interrupt?
>
> If re(4) can use MSI, you don't need to worry about interrupt
> sharing with USB. Check the output of "vmstat -i". You normally get
> an irq256 or higher for MSI enabled driver.
>
> > > > I know it continues to work because some days later i can see that
> > > > it tried to deliver the status reports but was unable to resolve the
> > > > aliases hostnames. I can't ping the machine and i know the network
> > > > is OK. If i reboot the machine everything is working again.
> > >
> > > Recently I've made small changes to re(4) which may help to detect
> > > link state change event. Would you try re(4) in HEAD?
> >
> > Can i just drop HEAD's /stable/7/sys/dev/re/ in -STABLE and test that
>
> Yes, you can. It should build without problems. Just replace re(4) on
> stable/7 with HEAD version.
>
> > or do i need to test the whole HEAD kernel?
>
> No, you don't have to do that.

Backporting the changes i've found that it didn't compile, so in the end i got the following files from HEAD:

    base/head/sys/dev/re/if_re.c
    base/head/sys/pci/if_rl.c
    base/head/sys/pci/if_rlreg.h

After that i've recompiled 7.1 -BETA2 GENERIC kernel and enabled the knob you suggested in /boot/loader.conf.
With the new kernel and MSI the interrupts are like this:

# vmstat -i
interrupt                          total       rate
irq9: acpi0                            1          0
irq16: ohci0                           1          0
irq17: ohci1 ohci3                     1          0
irq18: ohci2 ohci4                     1          0
irq22: atapci0                     19215         15
cpu0: timer                      2502718       1998
irq256: re0                      4967726       3967
cpu1: timer                      2502525       1998
Total                            9992188       7980

The high interrupt numbers are because i've been running iperf to check everything it's fine, not because of interrupt storms.

So far i didn't find any interrupt storms related to USB or re(4) driver but while doing the tests i've found this error:

re0: watchdo
Re: [ATA] and re(4) stability issues
On Wed, Dec 10, 2008 at 12:32:25PM +0100, Victor Balada Diaz wrote: > On Wed, Dec 10, 2008 at 07:28:00PM +0900, Pyun YongHyeon wrote: > > On Wed, Dec 10, 2008 at 09:59:35AM +0100, Victor Balada Diaz wrote: > > > On Wed, Dec 10, 2008 at 03:12:26PM +0900, Pyun YongHyeon wrote: > > > > On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote: > > > > > Hello, > > > > > > > > > > I got various machines[1] at hetzner.de and I've been having > > problems > > > > > with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in > > amd64. I've > > > > > been trying to narrow the problem so someone more knowledgeable > > than me > > > > > is able to fix it. This mail is an other attempt to ask a question > > > > > with regards ATA code to see if this time i got something. > > > > > > > > > > For the ones that don't actually know what happened: > > > > > > > > > > With FreeBSD 7.0 -RELEASE for amd64 and default kernel > > > > > the system shared re0 interrupt with OHCI and this caused > > > > > re(4) to corrupt packets and create interrupt storms. Tried > > > > > > > > re(4) in 7.0-RELEASE had bus_dma(9) bug which could be easily > > > > triggered on systems with > 4GB memory. But I dont' know whether > > > > this is related with interrupt storms. > > > > > > > > > updating to 7.1 -BETA2 and still had some problems with it. > > > > > > > > > > I've opened the PR kern/128287[2] and Remko quickly answered > > > > > with a workaround: that workaround was removing USB support from > > > > > my kernel. I did it and re(4) wasn't sharing interrupts anylonger, > > > > > and the interrupt storms were gone. Now sometime later the > > interface > > > > > goes up and down from time to time, but less often. Also sometimes > > > > > the machine losts the network interface but continues to work. > > > > > > > > > > > > > It seems that your controller supports MSI so you can set a tunable > > > > hw.re.msi_disable to 0 to enable MSI. 
With MSI you can remove
> > > > interrupt sharing (e.g. add hw.re.msi_disable="0" to
> > > > /boot/loader.conf file.) However there were several issues on re(4)
> > > > w.r.t MSI so it was off by default.
> > >
> > > This is undocumented and with sysctl -a i can't find the tunable. Is this
> > > a HEAD feature or it's also in 7.1 -BETA2? Should i add
> >
> > Yeah it's an undocumented feature. But most drivers written by me
> > have similar knobs. Both HEAD and stable/7 including 7.1 BETA2 have
> > the tunable.
>
> I think it could be great if you could document it or at least
> show it by default when you do sysctl -ad with a small description.
>

If MSI worked as expected I would have documented it as I did in msk(4)/nfe(4)/ale(4)/age(4)/jme(4) etc. Using MSI on RealTek does not seem to be stable. I tried hard to fix that but some users still reported watchdog timeouts. Working without documentation and hardware also made it hard to complete the work. This was the main reason why MSI was disabled on re(4).

> > > hw.re_msi_disable="0" to /boot/loader.conf?
> >        ^
> >        Should be hw.re.msi_disable="0"
> >
> > Yes, just add it to /boot/loader.conf. Note, you should not disable
> > system-wide MSI control (i.e. keep hw.pci.enable_msi == 1).
> >
> > > This was sharing interrupt with USB, does USB need any special MSI handling
> > > or with re using MSI is enough to not share the interrupt?
> >
> > If re(4) can use MSI, you don't need to worry about interrupt
> > sharing with USB. Check the output of "vmstat -i". You normally get
> > an irq256 or higher for MSI enabled driver.
> >
> > > > > I know it continues to work because some days later i can see that
> > > > > it tried to deliver the status reports but was unable to resolve the
> > > > > aliases hostnames. I can't ping the machine and i know the network
> > > > > is OK. If i reboot the machine everything is working again.
> > > >
> > > > Recently I've made small changes to re(4) which may help to detect
> > > > link state change event. Would you try re(4) in HEAD?
> > >
> > > Can i just drop HEAD's /stable/7/sys/dev/re/ in -STABLE and test that
> >
> > Yes, you can. It should build without problems. Just replace re(4) on
> > stable/7 with HEAD version.
> >
> > > or do i need to test the whole HEAD kernel?
> >
> > No, you don't have to do that.
>
> Backporting the changes i've found that it didn't compile so in
> the end i got from HEAD the following files:
>
> base/head/sys/dev/re/if_re.c
> base/head/sys/pci/if_rl.c
> base/head/sys/pci/if_rlreg.h
>

Ah, sorry about that. There were some changes recently and I forgot about them.

> After that i've recompiled 7.1 -BETA2 GENERIC kernel and enabled
> the knob you suggested in /boot/loader.conf.
>
> With the new kernel
Re: visibility of release process
On Tue, 2008-12-09 at 15:19 -0600, Kevin Day wrote:

[ Other points were not ignored, just nothing to really say about them other than "Yes" and/or "Will try", etc. ]

> * More notice to hubs@ before the release notes are generated. The
> releases always come with a "At the time of this writing, these
> mirrors have the full distribution" list. If it was announced to us
> mirror operators before that list is made, we could make sure we were
> synced in time to be included. Maybe even a semi-shaming of "These
> mirrors do not appear to have the required bits:". The difference in
> bandwidth we see on our public mirror (ftp3.us) is pretty extreme if
> we're listed there or not, which seems to be a 50/50 coin-toss on the
> last few releases. I'm honestly not sure why, since we can easily pull
> >50mbps from ftp-master.

I have absolutely no clue how to fairly handle what needs to be done with this so, as you noted, it's something of a coin toss at the moment.

The issue is that the Release Announcement needs to include a list of FTP sites, but the list can't be "too long" (as in, it can't be every mirror site we've got). The Release Announcement should be relatively short and to the point. An exhaustive list of every mirror is a bit too much in that regard.

Ideally we'd just say "It's available on a mirror site, get it from there." But people want easy, so we need to include something to click on. With the 7.0 release I tried giving just the URL of the primary site (ftp.freebsd.org) but that proved people don't just want easy - they're lazy. For the most part they just clicked on that and didn't look around for a mirror. Hence your observation about the difference in bandwidth when you're listed versus when you're not listed.

Since we don't have any sort of "click here and automagically land on a nice fast mirror real close to you", I basically make a quick survey of some FTP sites, shooting for several of the primary mirror sites (ftpX.freebsd.org) and a sampling of geographically diverse country mirrors (ftpX.au.freebsd.org, ftpX.ru.freebsd.org, etc.). If you're one of the ones I check and you've got the right sparc64 checksum file (I'm looking for sites that carry everything, and since sparc64 is usually the last to get loaded on ftp-master ...) you make the list.

Sorry, I know it sucks. Until we've got something automagic I'm not quite sure how to fairly handle having a list that's not "too long" for a release announcement but still provides a reasonable starting point for people who want something to click on.

-- 
Ken Smith
[EMAIL PROTECTED]
- From there to here, from here to there, funny things are everywhere. - Theodore Geisel
Re: [ATA] and re(4) stability issues
Victor Balada Diaz wrote:
> Hello,
>
> I got various machines[1] at hetzner.de and I've been having problems
> with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've
> been trying to narrow the problem so someone more knowledgeable than me
> is able to fix it. This mail is an other attempt to ask a question
> with regards ATA code to see if this time i got something.
>
> For the ones that don't actually know what happened:
>
> With FreeBSD 7.0 -RELEASE for amd64 and default kernel
> the system shared re0 interrupt with OHCI and this caused
> re(4) to corrupt packets and create interrupt storms. Tried
> updating to 7.1 -BETA2 and still had some problems with it.
>
> I've opened the PR kern/128287[2] and Remko quickly answered
> with a workaround: that workaround was removing USB support from
> my kernel. I did it and re(4) wasn't sharing interrupts any longer,
> and the interrupt storms were gone. Now sometime later the interface
> goes up and down from time to time, but less often. Also sometimes
> the machine loses the network interface but continues to work.
>
> I know it continues to work because some days later i can see that
> it tried to deliver the status reports but was unable to resolve the
> aliases hostnames. I can't ping the machine and i know the network
> is OK. If i reboot the machine everything is working again.
>
> When switched from 7.0 to 7.1 BETA2 i also found that under load
> after some hours the machine created interrupt storms on ATA disks.
> Digging at linux source code i've found that they do some special things
> for this chipset that i've been unable to find on our code. This is
> linux code for my chipset:
>
>     AHCI_HFLAGS (AHCI_HFLAG_IGN_SERR_INTERNAL |
>                  AHCI_HFLAG_32BIT_ONLY | AHCI_HFLAG_NO_MSI |
>                  AHCI_HFLAG_SECT255),
>
> File and the rest of the code in here[3].
>
> As i saw AHCI_HFLAG_NO_MSI i tried doing the easiest thing i could
> think of, switching MSI and MSI-x off for the whole system, so
> i added to /boot/loader.conf this tunables:
>
> hw.pci.enable_msix="0"
> hw.pci.enable_msi="0"
>
> And then rebooted the machine. After various hours of doing almost
> nothing i've found that the machine answered ping but was unable
> to answer any request (eg, ssh, nagios nrpe, etc). The machine recovered
> itself after some minutes and when i was able to ssh into i saw the
> following in dmesg:
>
> ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
> ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
> ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
> ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=1463123158
>
> and a lot more errors like that. I didn't get this errors with MSI
> enabled. I see WRITE_DMA48 and in linux code i saw AHCI_HFLAG_32BIT_ONLY
> which is later used for DMA related things. Could someone who is
> more knowledgeable check if we're doing the right thing?
>
> I've attached verbose dmesg of a machine that's like this one with
> 7.1 -BETA2, MSI enabled and GENERIC kernel minus USB and firewire.
>
> Also, please, could someone give me a hand on how could i continue
> debugging this interrupt issues? I'm a bit lost and digging code
> and posting each time i think i've found something is not going to
> go anywhere.
>
> I would also like to say that i've seen reports of this kind of problems
> on amd64 machines in the lists since various years ago, so i don't think
> this is just a problem with this BIOS/motherboard (MSI K9AG Neo2 Digital)
> on the lists
>
> Thanks in advance for any help.
> Regards.
>
> [1]: http://www.hetzner.de/hosting/produkte_rootserver/ds7000/
> [2]: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/128287
> [3]: http://fxr.watson.org/fxr/source/drivers/ata/ahci.c?v=linux-2.6#L369

Sorry I didn't take the time to read the whole thread, but I got a similar problem with the same IXP600 chipset. Only it wasn't with a Realtek NIC (re) but with a Ralink wireless one.

The symptoms were similar: interrupt 22 was shared between the SATA controller and the wireless card, and I got interrupt storms at random times when using the wireless network. No problem since I removed the ral(4) NIC (got a real access point now).

You might not want to point the finger at the re(4) driver too fast.

Arnaud Houdelette
Re: [ATA] and re(4) stability issues
On Wed, Dec 10, 2008 at 09:59:35AM +0100, Victor Balada Diaz wrote: > On Wed, Dec 10, 2008 at 03:12:26PM +0900, Pyun YongHyeon wrote: > > On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote: > > > Hello, > > > > > > I got various machines[1] at hetzner.de and I've been having problems > > > with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've > > > been trying to narrow the problem so someone more knowledgeable than me > > > is able to fix it. This mail is an other attempt to ask a question > > > with regards ATA code to see if this time i got something. > > > > > > For the ones that don't actually know what happened: > > > > > > With FreeBSD 7.0 -RELEASE for amd64 and default kernel > > > the system shared re0 interrupt with OHCI and this caused > > > re(4) to corrupt packets and create interrupt storms. Tried > > > > re(4) in 7.0-RELEASE had bus_dma(9) bug which could be easily > > triggered on systems with > 4GB memory. But I dont' know whether > > this is related with interrupt storms. > > > > > updating to 7.1 -BETA2 and still had some problems with it. > > > > > > I've opened the PR kern/128287[2] and Remko quickly answered > > > with a workaround: that workaround was removing USB support from > > > my kernel. I did it and re(4) wasn't sharing interrupts anylonger, > > > and the interrupt storms were gone. Now sometime later the interface > > > goes up and down from time to time, but less often. Also sometimes > > > the machine losts the network interface but continues to work. > > > > > > > It seems that your controller supports MSI so you can set a tunable > > hw.re.msi_disable to 0 to enable MSI. With MSI you can remove > > interrupt sharing(e.g. add hw.re.msi_disable="0" to > > /boot/loader.conf file.) However there were several issues on re(4) > > w.r.t MSI so it was off by default. > > This is undocumented and with sysctl -a i can't find the tunable. Is this > a HEAD feature or it's also in 7.1 -BETA2? 
Should i add

Yeah it's an undocumented feature. But most drivers written by me have similar knobs. Both HEAD and stable/7 including 7.1 BETA2 have the tunable.

> hw.re_msi_disable="0" to /boot/loader.conf?
       ^
       Should be hw.re.msi_disable="0"

Yes, just add it to /boot/loader.conf. Note, you should not disable system-wide MSI control (i.e. keep hw.pci.enable_msi == 1).

> This was sharing interrupt with USB, does USB need any special MSI handling
> or with re using MSI is enough to not share the interrupt?

If re(4) can use MSI, you don't need to worry about interrupt sharing with USB. Check the output of "vmstat -i". You normally get an irq256 or higher for an MSI enabled driver.

> > > I know it continues to work because some days later i can see that
> > > it tried to deliver the status reports but was unable to resolve the
> > > aliases hostnames. I can't ping the machine and i know the network
> > > is OK. If i reboot the machine everything is working again.
> >
> > Recently I've made small changes to re(4) which may help to detect
> > link state change event. Would you try re(4) in HEAD?
>
> Can i just drop HEAD's /stable/7/sys/dev/re/ in -STABLE and test that

Yes, you can. It should build without problems. Just replace re(4) on stable/7 with the HEAD version.

> or do i need to test the whole HEAD kernel?

No, you don't have to do that.

-- 
Regards,
Pyun YongHyeon
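The "irq256 or higher" check on `vmstat -i` output can be automated with a small helper. The following is a rough sketch, assuming the line format shown in this thread (`irqN: device total rate`) and taking the threshold of 256 from the message above; `irq_line_is_msi` is a hypothetical name, not an existing tool or API.

```c
#include <stdio.h>

/*
 * Parse the leading "irqN:" of a `vmstat -i` line and report whether N falls
 * in the range the thread associates with MSI (irq256 or higher).
 * Returns 1 for an MSI-range vector, 0 for a legacy IRQ, and -1 for lines
 * that do not start with "irqN:" (for example "cpu0: timer" or "Total").
 */
int irq_line_is_msi(const char *line)
{
    int irq;
    char colon;

    if (sscanf(line, "irq%d%c", &irq, &colon) != 2 || colon != ':')
        return -1;
    return irq >= 256 ? 1 : 0;
}
```

Feeding it each line of `vmstat -i` and looking for the `re0` entry is enough to confirm whether the driver ended up on its own MSI vector or is still on a shared legacy IRQ.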
Re: [ATA] and re(4) stability issues
On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote:
> Hello,
>
> I got various machines[1] at hetzner.de and I've been having problems
> with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've
> been trying to narrow the problem so someone more knowledgeable than me
> is able to fix it. This mail is an other attempt to ask a question
> with regards ATA code to see if this time i got something.

Just want to add a quick note and say that I'm having the same problem with my 7.0-RELEASE-p6/amd64 hetzner machine:

http://lists.freebsd.org/pipermail/freebsd-acpi/2008-September/005095.html

I would be happy to test patches as well. Thanks.

-- 
Oliver PETER, email: [EMAIL PROTECTED], ICQ# 113969174
"If it feels good, you're doing something wrong."
-- Coach McTavish
Re: visibility of release process
Ken Smith wrote:
> With the 7.0 release I tried giving just the URL
> of the primary site (ftp.freebsd.org) but that proved people don't just
> want easy - they're lazy. For the most part they just clicked on that
> and didn't look around for a mirror. Hence your observation about the
> difference in bandwidth when you're listed versus when you're not
> listed.

Any idea if most of those ISO downloaders are really installing a fresh system or are just updating from a previous release by reinstalling? It seems to me many more people could be using freebsd-update(8), so the announcement really could focus on upgrades rather than fresh installs. I obviously like FreeBSD myself, but how many new users who need to download ISOs really come on board with each new release?

The freebsd-update(8) portion of "Updating existing systems" could be the main focus of the announcement, and the "Availability" and "updating existing systems from source" sections could just contain a link pointing to the web site since (I believe) the number of users needing those should be limited. No FTP listing in the announcement at all.

I guess freebsd-update(8) currently has some limitations that make it not so cut-and-dried. But I'm a little confused anyway at this point as to what the long-term plans are. There's a CVS repo, an SVN repo which appears to be the way things will be, a "projects" svn repo, a "projects" p4 repo, cvs(1) in base, csup(1) in base which is still being worked on even though there appears to be a slow migration to svn, svn(1) in ports, no SVN repo for the ports tree but one for src, and freebsd-update(8) for binary upgrades, which seems to be the way of the future for a huge majority of end-users. And yet the official mirrors are missing both the SVN src repo and the binary update files.

It seems to me the mirrors and release announcement are behind the times by pointing to source upgrades and ISO downloads, or maybe I'm just a little too early. I hope core has a plan for all of this. :)

-- 
Skip
Re: [ATA] and re(4) stability issues
On Wed, Dec 10, 2008 at 12:08:40PM +, Oliver Peter wrote: > On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote: > > Hello, > > > > I got various machines[1] at hetzner.de and I've been having problems > > with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've > > been trying to narrow the problem so someone more knowledgeable than me > > is able to fix it. This mail is an other attempt to ask a question > > with regards ATA code to see if this time i got something. > > Just want to add a quick note and say that I'm having the same problem > with my 7.0-RELEASE-p6/amd64 hetzner machine: > > > http://lists.freebsd.org/pipermail/freebsd-acpi/2008-September/005095.html > > I would be happy to test patches as well. Thanks. Hello Oliver, What i did so far and improved a lot the experience was: 1) Upgrade at least the if_re code to RELENG_7. This fixes issues of packet corruption on ssh sessions. 2) Delete from your kernel config USB and firewire. This prevents the realtek interrupt to be shared. After this, with 7.1 -BETA2 the systems are more or less stable, but after a while the ATA controller starts to create interrupt storms. I wasn't able to find why. With the help that i've received in this thread from Pyun YongHyeon (Thanks!!) i'm also trying this suggestions: 3) Backport this 3 files from current to 7.1 -BETA2: base/head/sys/dev/re/if_re.c base/head/sys/pci/if_rl.c base/head/sys/pci/if_rlreg.h You can fetch them from http://svn.freebsd.org/. With them and adding to /boot/loader.conf this tunable: hw.re.msi_disable="0" I can use GENERIC kernel again (ie, USB enabled) and so far i didn't find any problem yet. No more interface up/down problems and no more interrupt storms. I must say that i haven't tested this enough, because the interrupt storms in ATA code start to happen after a few days of uptime load, but at least the problems with the realtek seem to be gone. 
If you upgrade to 7.1 -BETA2 you'll also get SATA support for the IXP card. With 7.0 it will work as ATA 33 in compatibility mode.

Maybe someone with write access to the wiki could add this somewhere, so that other hetzner users who are having the same problems can use the same workarounds :)

I hope this helps you.

Regards.
-- 
The most convincing proof that intelligent life exists on other planets is that they have not tried to contact us.
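The MSI workaround discussed in this thread (hw.re.msi_disable="0" in /boot/loader.conf, then looking at "vmstat -i" for an irq256-or-higher vector) can be checked with a few lines of shell. This is only a sketch: the function name re_uses_msi is made up for illustration, and it assumes "vmstat -i" prints lines of the form "irqN: device [device ...] total rate".

```shell
#!/bin/sh
# re_uses_msi DEV
# Hypothetical helper: reads `vmstat -i` output on stdin.
# Exit status: 0 if DEV sits on irq256 or higher (MSI, per this thread),
#              1 if DEV sits on a lower (legacy, possibly shared) irq,
#              2 if DEV was not found at all.
re_uses_msi() {
    awk -v dev="$1" '
        $1 ~ /^irq[0-9]+:$/ {
            for (i = 2; i <= NF; i++) {
                if ($i == dev) {
                    n = $1
                    sub(/^irq/, "", n)   # strip the "irq" prefix
                    sub(/:$/, "", n)     # strip the trailing colon
                    found = 1
                    exit ((n + 0 >= 256) ? 0 : 1)
                }
            }
        }
        END { if (!found) exit 2 }
    '
}
```

On the affected machine this would be used as "vmstat -i | re_uses_msi re0 && echo 're0 is on MSI'". A status of 1 means re0 is still on a legacy irq line and may still be sharing it with USB.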
Re: [ATA] and re(4) stability issues
On Wed, 10 Dec 2008 21:07:19 +0900 Pyun YongHyeon <[EMAIL PROTECTED]> wrote:
> On Wed, Dec 10, 2008 at 12:32:25PM +0100, Victor Balada Diaz wrote:
> > As these seems to improve the current situation, is there any
> > chance of merging -current driver in 7.1 before release?
> >
>
> I think re(4) in HEAD needs more testing. As you might know RealTek
> produced too many chipsets. :-(
>

FYI I've now turned MSI on in HEAD and will see what happens. Before, my re0 was sharing interrupts with 3 USB controllers. Now it's all by itself on irq256.

I'm running amd64 with

re0: port 0xde00-0xdeff mem 0xfdaff000-0xfdaf, 0xfdae-0xfdae irq 18 at device 0.0 on pci2

---
Gary Jennejohn
Re: [ATA] and re(4) stability issues
On Wed, Dec 10, 2008 at 09:07:19PM +0900, Pyun YongHyeon wrote:
> On Wed, Dec 10, 2008 at 12:32:25PM +0100, Victor Balada Diaz wrote:
> > On Wed, Dec 10, 2008 at 07:28:00PM +0900, Pyun YongHyeon wrote:
> > > On Wed, Dec 10, 2008 at 09:59:35AM +0100, Victor Balada Diaz wrote:
> > > > On Wed, Dec 10, 2008 at 03:12:26PM +0900, Pyun YongHyeon wrote:
> > > > > On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote:
> > > > > > Hello,
> > > > > >
> > > > > > I got various machines[1] at hetzner.de and I've been having problems
> > > > > > with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've
> > > > > > been trying to narrow the problem so someone more knowledgeable than me
> > > > > > is able to fix it. This mail is an other attempt to ask a question
> > > > > > with regards ATA code to see if this time i got something.
> > > > > >
> > > > > > For the ones that don't actually know what happened:
> > > > > >
> > > > > > With FreeBSD 7.0 -RELEASE for amd64 and default kernel
> > > > > > the system shared re0 interrupt with OHCI and this caused
> > > > > > re(4) to corrupt packets and create interrupt storms. Tried
> > > > >
> > > > > re(4) in 7.0-RELEASE had bus_dma(9) bug which could be easily
> > > > > triggered on systems with > 4GB memory. But I dont' know whether
> > > > > this is related with interrupt storms.
> > > > >
> > > > > > updating to 7.1 -BETA2 and still had some problems with it.
> > > > > >
> > > > > > I've opened the PR kern/128287[2] and Remko quickly answered
> > > > > > with a workaround: that workaround was removing USB support from
> > > > > > my kernel. I did it and re(4) wasn't sharing interrupts anylonger,
> > > > > > and the interrupt storms were gone. Now sometime later the interface
> > > > > > goes up and down from time to time, but less often. Also sometimes
> > > > > > the machine losts the network interface but continues to work.
> > > > > >
> > > > >
> > > > > It seems that your controller supports MSI so you can set a tunable
> > > > > hw.re.msi_disable to 0 to enable MSI. With MSI you can remove
> > > > > interrupt sharing(e.g. add hw.re.msi_disable="0" to
> > > > > /boot/loader.conf file.) However there were several issues on re(4)
> > > > > w.r.t MSI so it was off by default.
> > > >
> > > > This is undocumented and with sysctl -a i can't find the tunable. Is this
> > > > a HEAD feature or it's also in 7.1 -BETA2? Should i add
> > >
> > > Yeah it's an undocmented feature. But most drivers written by me
> > > have similar kobs. Both HEAD and stable/7 including 7.1 BETA2 have
> > > the tunable.
> >
> > I think it could be great if you could document it or at least
> > show it by default when you do sysctl -ad with a small description.
> >
>
> If MSI worked as expected I would have documented it as I did
> in msk(4)/nfe(4)/ale(4)/age(4)/jme(4) etc.
> Using MSI on RealTek does not seem to stable. I tried hard to fix
> that but some users still reported watchdog timeouts. Working
> without documentation and hardware also made it hard to complete
> the work. This was the main reason why MSI was disabled on re(4).

What do you think about adding a note in the man page telling that it's experimental and in some cases it could improve the situation but in others it will give errors?

> > > > hw.re_msi_disable="0" to /boot/loader.conf?
> > >^
> > >Shoule be hw.re.msi_disable="0"
> > >
> > > Yes, just add it to /boot/loader.conf. Note, you should not disable
> > > system-wide MSI control(e.g. hw.pci.enable_msi == 1).
> > >
> > > > This was sharing interrupt with USB, does USB need any special MSI handling
> > > > or with re using MSI is enough to not share the interrupt?
> > >
> > > If re(4) can use MSI, you don't need to worry about interrupt
> > > sharing with USB. Check the output of "vmstat -i". You normally get
> > > an irq256 or higher for MSI enabled driver.
> > >
> > > > > > I know it continues to work because some days later i can see that
> > > > > > it tried to deliver the status reports but was unable to resolve the
> > > > > > aliases hostnames. I can't ping the machine and i know the network
> > > > > > is OK. If i reboot the machine everything is working again.
> > > > > >
> > > > >
> > > > > Recently I've made small changes to re(4) which may help to detect
> > > > > link state change event. Would you try re(4) in HEAD?
> > > >
> > > > Can i just drop HEAD's /stable/7/sys/dev/re/ in -STABLE and test that
> > >
> > > Yes, you can. It should build without problems. Just replace re(4) on
> > > stable/7 with HEAD version.
> > >
> > > > or do i need to test the whole HEAD kernel?
> > > >
> > >
> > > No you don't have to that.
> >
> > Backpor
iir(4) support under 7.1
Hi list,

I am planning to upgrade our main server from 6.1 to 7.1. The machine has an ICP Vortex GDT8524RZ RAID controller, which is handled by the iir(4) driver. Since I have seen various discussions in the past about Adaptec no longer supporting these controllers, the driver reaching EOE, data corruption in 64-bit configurations with > 4G RAM and so on, I just wanted to ask what the current state of the driver is.

Thanks for your help,

Heinrich Rebehn
University of Bremen
Physics / Electrical and Electronics Engineering
- Department of Telecommunications -
Phone : +49/421/218-4664
Fax :-3341
Re: [ATA] and re(4) stability issues
On Wed, Dec 10, 2008 at 03:01:30PM +0100, Victor Balada Diaz wrote:
> On Wed, Dec 10, 2008 at 12:08:40PM +, Oliver Peter wrote:
> > On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote:
...
> I can use GENERIC kernel again (ie, USB enabled) and so far
> i didn't find any problem yet. No more interface up/down problems
> and no more interrupt storms. I must say that i haven't tested
> this enough, because the interrupt storms in ATA code start to
> happen after a few days of uptime load, but at least the problems
> with the realtek seem to be gone.

I found out that I'm able to 'force' the interrupt storm by provoking higher disk I/O. Just let dd write to a file in a loop for some hours and watch vmstat:

while true; do dd if=/dev/zero of=BLA bs=1M count=1000; done

First you'll see the throughput decrease, and a few hours later you'll have /var/log/messages / dmesg full of interrupt storm messages.

> If you upgrade to 7.1 -BETA2 you'll also get SATA support for
> the IXP card. With 7.0 it will work as ATA 33 in compatibility mode.

Wow! That's good to hear as well. I'll definitely switch to -STABLE or 7.1-PRERELEASE sooner or later. I'll just give it a try on my other machines first.

> I hope this helps you.

Absolutely, cheers mate. I owe you one!

~ollie
-- 
Oliver PETER, email: [EMAIL PROTECTED], ICQ# 113969174
"If it feels good, you're doing something wrong." -- Coach McTavish
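The dd loop above runs forever and leaves the checking to the reader. A bounded variant might look like the sketch below. The function names write_load and count_storms are invented for this example, and the only thing assumed about the kernel message is that it contains the words "interrupt storm", as described in the mail.

```shell
#!/bin/sh
# write_load FILE ITERATIONS [MB]
# Hypothetical helper: overwrite FILE with MB megabytes of zeros
# (default 1000, as in the mail), ITERATIONS times in a row.
write_load() {
    file="$1"
    iterations="$2"
    mb="${3:-1000}"
    i=0
    while [ "$i" -lt "$iterations" ]; do
        dd if=/dev/zero of="$file" bs=1M count="$mb" 2>/dev/null || return 1
        i=$((i + 1))
    done
}

# count_storms
# Hypothetical helper: count lines on stdin that mention an interrupt storm.
count_storms() {
    grep -ci 'interrupt storm'
}
```

A possible session would be "write_load /var/tmp/BLA 500" followed by "dmesg | count_storms". The path and the iteration count are placeholders, and how long it takes before storms show up will differ per machine.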
Re: [ATA] and re(4) stability issues
On Wed, Dec 10, 2008 at 01:18:00PM +0100, Arnaud Houdelette wrote:
> Victor Balada Diaz wrote:
> >Hello,
> >
> >I got various machines[1] at hetzner.de and I've been having problems
> >with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've
> >been trying to narrow the problem so someone more knowledgeable than me
> >is able to fix it. This mail is an other attempt to ask a question
> >with regards ATA code to see if this time i got something.
> >
> >[...]
>
> Sorry I didn't take the time to read all the thread, but I got similar
> problem with the same IXP600 chipset.
> Only it was'nt with a Realtek NIC (re) but with a Ralink wireless one.
> The simptoms where similar : interrupt 22 was shared between the sata
> controler and the wireless card. And I got Interrupt Storms at random
> times when using the wireless network.
>
> No problem since I removed the ral(4) NIC (got a real access point now).
> You might not want to point the finger at the re(4) driver too fast.
>
> Arnaud Houdelette

Hello Arnaud,

I didn't say the problem was just because of re(4). Actually I think there were two problems, one with re(4) and another with ata(4). The reason I talked about both of them in the same mail is that, as two drivers were affected, I thought the problem might be in some other part of the operating system, and that this could help the developers debug it.

My re(4) card isn't sharing the interrupt with the IXP600; it's sharing the interrupt with the USB controller. In this case I think the problem is fixed by the advice from Pyun YongHyeon (backporting the driver from HEAD and using MSI for interrupts).

I think the problems with the ata(4) code will appear again after a few days of load, as they always do, so I'll keep trying to debug them.

Regards.
-- 
The most convincing proof that intelligent life exists on other planets is that they have not tried to contact us.
bce(4) and rx errors
Hello.

Sorry for crossposting, but I wasn't sure which mailing list was the most appropriate for this email.

I have an application pulling about 220Kpps from a bce(4) card (details below). At what seem to be random times, errors start showing up on that interface (I'm watching it with netstat -w1 -I), so that about 10% of the initial 220Kpps is reported as errors. Bringing the interface down and then back up clears the errors, but they reappear at a later time. Until they reappear, the system manages to pull the full 220Kpps as before.

This is a temporary setup, and we'll very soon use an Intel fiber card, but I thought this issue was worth mentioning, as I don't think it's a hardware problem (the switch also reports no errors).

The system is running a fresh (yesterday's) RELENG_7. The card is onboard, on an HP DL380 G5. Here's the pciconf output:

-- cut here --
[EMAIL PROTECTED]:2:0:0: class=0x060400 card=0x chip=0x01031166 rev=0xc3 hdr=0x01
    vendor = 'ServerWorks (Was: Reliance Computer Corp)'
    device = 'BCM5715 Broadcom dual gigabit, pci bridge'
    class = bridge
    subclass = PCI-PCI
[EMAIL PROTECTED]:3:0:0:class=0x02 card=0x7038103c chip=0x164c14e4 rev=0x12 hdr=0x00
    vendor = 'Broadcom Corporation'
    device = '5708C Broadcom NetXtreme II Gigabit Ethernet Adapter'
    class = network
    subclass = ethernet
-- and here --

Regards,
Vlad
-- 
~/.signature: no such file or directory
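The "about 10%" figure above can be read off the netstat output by hand, but a small filter makes it easier to watch over time. The sketch below is an illustration only: rx_error_pct is a made-up name, and it assumes that each data line printed by "netstat -w1 -I <interface>" starts with the input packet count followed by the input error count, and that the percentage wanted is simply errors relative to packets.

```shell
#!/bin/sh
# rx_error_pct
# Hypothetical filter: reads `netstat -w1 -I <interface>` style output on
# stdin and prints, for every data line, input errors as a percentage of
# input packets. Header lines (non-numeric first columns) are skipped.
rx_error_pct() {
    awk '
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
            pct = ($1 > 0) ? 100 * $2 / $1 : 0
            printf "%d packets, %d errs, %.1f%%\n", $1, $2, pct
        }
    '
}
```

It would be used as "netstat -w1 -I bce0 | rx_error_pct", where bce0 is a guess at the interface name, since the mail does not give it.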
7.0 unusual performance issue - vmdaemon hang?
Just had one of our webservers flag as down here, and on investigation the machine seems to be struggling due to a hung vmdaemon process. top is reporting vmdaemon as using a constant 55.57% CPU, yet its CPU time is not increasing:

last pid: 36492;  load averages: 0.04, 0.05, .11  up 89+19:53:21  14:36:08
223 processes: 9 running, 201 sleeping, 13 waiting
CPU states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Mem: 644M Active, 2780M Inact, 480M Wired, 249M Cache, 214M Buf, 3759M Free
Swap: 4096M Total, 537M Used, 3559M Free, 13% Inuse

  PID USERNAME THR PRI NICE  SIZE    RES STATE  C   TIME    WCPU COMMAND
   11 root       1 171 ki31    0K    16K CPU7   7 2116.4 100.00% idle: cpu7
   12 root       1 171 ki31    0K    16K CPU6   6 2059.5 100.00% idle: cpu6
   13 root       1 171 ki31    0K    16K CPU5   5 2029.3 100.00% idle: cpu5
   14 root       1 171 ki31    0K    16K CPU4   4 1977.8 100.00% idle: cpu4
   15 root       1 171 ki31    0K    16K CPU3   3 1912.0 100.00% idle: cpu3
   16 root       1 171 ki31    0K    16K CPU2   2 1835.2 100.00% idle: cpu2
   17 root       1 171 ki31    0K    16K CPU1   1 1763.1 100.00% idle: cpu1
   18 root       1 171 ki31    0K    16K RUN    0 1727.6 100.00% idle: cpu0
   37 root       1  20    -    0K    16K psleep 5   0:56  55.57% vmdaemon
60198 www        1   4    0   98M 13516K sbwait 2  35:21   1.46% httpd
60264 www        1   4    0  133M  9248K sbwait 0  21:21   0.39% httpd
   30 root       1 -68    -    0K    16K -      7  18.3H   0.00% em1 taskq
   29 root       1 -68    -    0K    16K -      6 330:21   0.00% em0 taskq
   41 root       1  20    -    0K    16K syncer 1 212:42   0.00% syncer
   21 root       1 -44    -    0K    16K WAIT   0 201:02   0.00% swi1: net
   19 root       1 -32    -    0K    16K WAIT   0 120:15   0.00% swi4: clock
   22 root       1  44    -    0K    16K -      5  73:00   0.00% yarrow

I've tried to ktrace the process and it produced nothing; I also tried gdb and it failed to attach.

Is there anything else I can try before we reboot the machine to help determine what the problem is?

Regards
Steve

This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed.
In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to [EMAIL PROTECTED] ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
zfs panics
hi,
from a solaris or linux client, doing an ls(1) of an nfs exported zfs file, for example:

ls /net/zfs-server/h/.zfs/snapshot

panics the server. The server is running the latest 7.1-prerelease. When the client is freebsd, it mostly works, but in a few cases the server just goes into a coma.

btw, the server is running vanilla zfs, no tuning, and the server is 64bit with 8gb of memory and quad core (dell-pe2950)

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x168
fault code = supervisor write data, page not present
instruction pointer = 0x8:0x804a9175
stack pointer = 0x10:0xb71fc550
frame pointer = 0x10:0xb71fc560
code segment = base 0x0, limit 0xf, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 802 (nfsd)
[thread pid 802 tid 100185 ]
Stopped at _mtx_lock_flags+0x15: lock cmpxchgq %rsi,0x50(%rdi)
db> tr
Tracing pid 802 tid 100185 td 0xff0004d576e0
_mtx_lock_flags() at _mtx_lock_flags+0x15
vput() at vput+0x45
nfsrv_readdirplus() at nfsrv_readdirplus+0x83e
nfssvc() at nfssvc+0x400
syscall() at syscall+0x1bb
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (155, FreeBSD ELF64, nfssvc), rip = 0x8006885cc, rsp = 0x7fffea2
Re: zfs panics
Hi, On 2008-12-10, Danny Braniss wrote: > from a solaris or linux client, doing a ls(1) of a nfs exported zfs > file, > for example: ls /net/zfs-server/h/.zfs/snapshot, > panics the server. The server is running latest 7.1-prerelease. This has been reported as PR kern/125149. I have described the problem in this message: http://lists.freebsd.org/pipermail/freebsd-fs/2008-October/005217.html See the PR for RELENG_7 patches. (http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/125149) -- Jaakko ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: bce(4) and rx errors
On Wed, Dec 10, 2008 at 04:59:26PM +0200, Vlad GALU wrote: > I have an application pulling about 220Kpps from a bce(4) card > (details below). At what seems to be random times, errors start > showing up on that interface (I'm watching it with netstat -w1 -I), so > about 10% of the initial 220Kpps is reported as errors. I'm also seeing a pretty steady stream of errors on both bce interfaces in a Dell PowerEdge 2950 III. In my case, the source is RELENG_7_1 from ~14:00 UTC yesterday (9 Dec). Throughput does not seem to be affected. "sysctl -a | egrep -i 'bce.*err'" yields all zeroes, for whatever that's worth. [EMAIL PROTECTED]:0:0:0:class=0x06 card=0x80868086 chip=0x25c08086 rev=0x12 hdr=0x00 vendor = 'Intel Corporation' device = '5000X Chipset Memory Controller Hub' class = bridge subclass = HOST-PCI [EMAIL PROTECTED]:0:2:0:class=0x060400 card=0x chip=0x25e28086 rev=0x12 hdr=0x01 vendor = 'Intel Corporation' device = '5000 Series Chipset PCIe x4 Port 2' class = bridge subclass = PCI-PCI [EMAIL PROTECTED]:0:3:0:class=0x060400 card=0x chip=0x25e38086 rev=0x12 hdr=0x01 vendor = 'Intel Corporation' device = '5000 Series Chipset PCIe x4 Port 3' class = bridge subclass = PCI-PCI [EMAIL PROTECTED]:0:4:0:class=0x060400 card=0x chip=0x25f88086 rev=0x12 hdr=0x01 vendor = 'Intel Corporation' device = '5000 Series Chipset PCIe x8 Port 4-5' class = bridge subclass = PCI-PCI [EMAIL PROTECTED]:0:5:0:class=0x060400 card=0x chip=0x25e58086 rev=0x12 hdr=0x01 vendor = 'Intel Corporation' device = '5000 Series Chipset PCIe x4 Port 5' class = bridge subclass = PCI-PCI [EMAIL PROTECTED]:0:6:0:class=0x060400 card=0x chip=0x25f98086 rev=0x12 hdr=0x01 vendor = 'Intel Corporation' device = '5000 Series Chipset PCIe x8 Port 6-7' class = bridge subclass = PCI-PCI [EMAIL PROTECTED]:0:7:0:class=0x060400 card=0x chip=0x25e78086 rev=0x12 hdr=0x01 vendor = 'Intel Corporation' device = '5000 Series Chipset PCIe x4 Port 7' class = bridge subclass = PCI-PCI [EMAIL PROTECTED]:0:16:0: class=0x06 
card=0x01b21028 chip=0x25f08086 rev=0x12 hdr=0x00 vendor = 'Intel Corporation' device = '5000 Series Chipset Error Reporting Registers' class = bridge subclass = HOST-PCI [EMAIL PROTECTED]:0:16:1: class=0x06 card=0x01b21028 chip=0x25f08086 rev=0x12 hdr=0x00 vendor = 'Intel Corporation' device = '5000 Series Chipset Error Reporting Registers' class = bridge subclass = HOST-PCI [EMAIL PROTECTED]:0:16:2: class=0x06 card=0x01b21028 chip=0x25f08086 rev=0x12 hdr=0x00 vendor = 'Intel Corporation' device = '5000 Series Chipset Error Reporting Registers' class = bridge subclass = HOST-PCI [EMAIL PROTECTED]:0:17:0: class=0x06 card=0x80868086 chip=0x25f18086 rev=0x12 hdr=0x00 vendor = 'Intel Corporation' device = '5000 Series Chipset Reserved Registers' class = bridge subclass = HOST-PCI [EMAIL PROTECTED]:0:19:0: class=0x06 card=0x80868086 chip=0x25f38086 rev=0x12 hdr=0x00 vendor = 'Intel Corporation' device = '5000 Series Chipset Reserved Registers' class = bridge subclass = HOST-PCI [EMAIL PROTECTED]:0:21:0: class=0x06 card=0x80868086 chip=0x25f58086 rev=0x12 hdr=0x00 vendor = 'Intel Corporation' device = '5000 Series Chipset FBD Registers' class = bridge subclass = HOST-PCI [EMAIL PROTECTED]:0:22:0: class=0x06 card=0x80868086 chip=0x25f68086 rev=0x12 hdr=0x00 vendor = 'Intel Corporation' device = '5000 Series Chipset FBD Registers' class = bridge subclass = HOST-PCI [EMAIL PROTECTED]:0:28:0: class=0x060400 card=0x01b21028 chip=0x26908086 rev=0x09 hdr=0x01 vendor = 'Intel Corporation' device = '631xESB/632xESB/3100 PCIe Root Port 1' class = bridge subclass = PCI-PCI [EMAIL PROTECTED]:0:29:0: class=0x0c0300 card=0x01b21028 chip=0x26888086 rev=0x09 hdr=0x00 vendor = 'Intel Corporation' device = '631xESB/632xESB/3100 Chipset USB Universal Host Controller' class = serial bus subclass = USB [EMAIL PROTECTED]:0:29:1: class=0x0c0300 card=0x01b21028 chip=0x26898086 rev=0x09 hdr=0x00 vendor = 'Intel Corporation' device = '631xESB/632xESB/3100 Chipset USB Universal Host Controller' 
class = serial bus subclass = USB [EMAIL PROTECTED]:0:29:2: class=0x0c0300 card=0x01b21028 chip=0x268a8086 rev=0x09 hdr=0x00 vendor = 'Intel Corporation' device = '631xESB/632xESB/3100 Chipset USB Universal Host Controller' class =
Re: RELENG_7_1: bce driver change generating too much interrupts ?
On Mon, December 8, 2008 5:22 pm, Mike Jakubik wrote:
> On Mon, December 8, 2008 5:12 pm, Xin LI wrote:
>
>> Which version are you currently using? My previous commit only fixes
>> the excessive interrupt issue, I think this could be a different
>> problem, I'm taking a look at the code to see if I can have something
>> for you.
>
> I was running on the version just prior to the latest interrupt commit. I
> have now updated to the one with the interrupt fix. Will let you know if
> things change.
>
> Thank You.

The interrupt rate has decreased significantly; however, I am still having problems with applications that hold stateful connections. The rx errors are also still showing, and I suspect they are related to the problem. How can I roll back this driver to the last known good version?

Thanks.
Re: bce(4) and rx errors
On Wed, December 10, 2008 11:03 am, Jeff Blank wrote:
> On Wed, Dec 10, 2008 at 04:59:26PM +0200, Vlad GALU wrote:
>> I have an application pulling about 220Kpps from a bce(4) card
>> (details below). At what seems to be random times, errors start
>> showing up on that interface (I'm watching it with netstat -w1 -I), so
>> about 10% of the initial 220Kpps is reported as errors.
>
> I'm also seeing a pretty steady stream of errors on both bce
> interfaces in a Dell PowerEdge 2950 III. In my case, the source
> is RELENG_7_1 from ~14:00 UTC yesterday (9 Dec). Throughput does not
> seem to be affected. "sysctl -a | egrep -i 'bce.*err'" yields all
> zeroes, for whatever that's worth.

See the "RELENG_7_1: bce driver change generating too much interrupts ?" thread. This problem has surfaced since the recent bce driver changes.
Re: RELENG_7_1: bce driver change generating too much interrupts ?
Mike Jakubik wrote:
> On Mon, December 8, 2008 5:22 pm, Mike Jakubik wrote:
>> On Mon, December 8, 2008 5:12 pm, Xin LI wrote:
>>
>>> Which version are you currently using? My previous commit only fixes
>>> the excessive interrupt issue, I think this could be a different
>>> problem, I'm taking a look at the code to see if I can have something
>>> for you.
>> I was running on the version just prior to the latest interrupt commit. I
>> have now updated to the one with the interrupt fix. Will let you know if
>> things change.
>>
>> Thank You.
>
> The interrupt rate has decreased significantly, however i am still having
> having problem with applications that hold stateful connections. The rx
> errors are also still showing, i suspect this is related to the problem.
> How can i roll back this driver to the last known good version?

If you are using CVS to track the -stable tree:

cd /usr/src/sys/dev/bce
cvs -q up -rRELENG_7 -D2008/11/01

If not, then the process would be a bit complicated. You need to checkout from anoncvs, e.g.:

cvs -q -d [EMAIL PROTECTED]:/home/ncvs login
cvs -q -d [EMAIL PROTECTED]:/home/ncvs co -rRELENG_7 -D2008/11/01 sys/dev/bce
cd sys/dev/bce
cp * /sys/dev/bce

Cheers,
-- 
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net/
FreeBSD - The Power to Serve!
Re: [ATA] and re(4) stability issues
On 2008-Dec-10 10:55:35 +0100, Søren Schmidt <[EMAIL PROTECTED]> wrote:
>And you will not use 64bit DMA even if the chipset supports it.
>However I have not seen any chipsets supporting this fail, YMMV as
>usual :)

There's a reference in wikipedia pointing to http://www.mail-archive.com/[EMAIL PROTECTED]/msg06694.html that claims the AMD/ATI SB600 lies about supporting 64-bit DMA in AHCI mode. I have a SB600 but it doesn't have >4GB to test on.

-- 
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement an MTA that is either RFC2821-compliant or matches their claimed behaviour.
Re: bce(4) and rx errors
On 12/10/08, Mike Jakubik <[EMAIL PROTECTED]> wrote: > On Wed, December 10, 2008 11:03 am, Jeff Blank wrote: > > On Wed, Dec 10, 2008 at 04:59:26PM +0200, Vlad GALU wrote: > >> I have an application pulling about 220Kpps from a bce(4) card > >> (details below). At what seems to be random times, errors start > >> showing up on that interface (I'm watching it with netstat -w1 -I), so > >> about 10% of the initial 220Kpps is reported as errors. > > > > I'm also seeing a pretty steady stream of errors on both bce > > interfaces in a Dell PowerEdge 2950 III. In my case, the source > > is RELENG_7_1 from ~14:00 UTC yesterday (9 Dec). Throughput does not > > seem to be affected. "sysctl -a | egrep -i 'bce.*err'" yields all > > zeroes, for whatever that's worth. > > > See the "RELENG_7_1: bce driver change generating too much interrupts ?" > thread. This problem as surfaced since the recent bce driver changes. Thanks Mike, I'll give it a shot. -- ~/.signature: no such file or directory ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: bce(4) and rx errors
On 12/10/08, Vlad GALU <[EMAIL PROTECTED]> wrote: > On 12/10/08, Mike Jakubik <[EMAIL PROTECTED]> wrote: > > On Wed, December 10, 2008 11:03 am, Jeff Blank wrote: > > > On Wed, Dec 10, 2008 at 04:59:26PM +0200, Vlad GALU wrote: > > >> I have an application pulling about 220Kpps from a bce(4) card > > >> (details below). At what seems to be random times, errors start > > >> showing up on that interface (I'm watching it with netstat -w1 -I), so > > >> about 10% of the initial 220Kpps is reported as errors. > > > > > > I'm also seeing a pretty steady stream of errors on both bce > > > interfaces in a Dell PowerEdge 2950 III. In my case, the source > > > is RELENG_7_1 from ~14:00 UTC yesterday (9 Dec). Throughput does not > > > seem to be affected. "sysctl -a | egrep -i 'bce.*err'" yields all > > > zeroes, for whatever that's worth. > > > > > > See the "RELENG_7_1: bce driver change generating too much interrupts ?" > > thread. This problem as surfaced since the recent bce driver changes. > > > Thanks Mike, I'll give it a shot. Indeed, the errors seem to have gone away after rolling back the driver, as Xin Li suggested in another thread. Sorry for the noise! -- ~/.signature: no such file or directory ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
why can't multiple programs listen to cuaXYZ anymore? (7-stable)
Before my last cvsup, I could have cutecom and a custom configuration app (e.g. gpsd) running at the same time on the same serial port. Both would echo any incoming data, and as long as only one of them was outputting data, that worked fine too.

Now it's 'broke'. I hear noise about TTY changes; I assume that changed the underlying devices as well?

This used to be a major perk over Windows for embedded systems guys like me: it makes debugging serial devices a snap.

Steve
Re: [ATA] and re(4) stability issues
On Wed, Dec 10, 2008 at 03:08:24PM +0100, Victor Balada Diaz wrote:
> On Wed, Dec 10, 2008 at 09:07:19PM +0900, Pyun YongHyeon wrote:

[...]

> > > > > > It seems that your controller supports MSI so you can set a tunable
> > > > > > hw.re.msi_disable to 0 to enable MSI. With MSI you can remove
> > > > > > interrupt sharing(e.g. add hw.re.msi_disable="0" to
> > > > > > /boot/loader.conf file.) However there were several issues on re(4)
> > > > > > w.r.t MSI so it was off by default.
> > > > >
> > > > > This is undocumented and with sysctl -a i can't find the tunable. Is this
> > > > > a HEAD feature or it's also in 7.1 -BETA2? Should i add
> > > >
> > > > Yeah it's an undocmented feature. But most drivers written by me
> > > > have similar kobs. Both HEAD and stable/7 including 7.1 BETA2 have
> > > > the tunable.
> > >
> > > I think it could be great if you could document it or at least
> > > show it by default when you do sysctl -ad with a small description.
> > >
> >
> > If MSI worked as expected I would have documented it as I did
> > in msk(4)/nfe(4)/ale(4)/age(4)/jme(4) etc.
> > Using MSI on RealTek does not seem to stable. I tried hard to fix
> > that but some users still reported watchdog timeouts. Working
> > without documentation and hardware also made it hard to complete
> > the work. This was the main reason why MSI was disabled on re(4).
>
> What do you think about adding a note in the man page telling that
> it's experimental and in some cases it could improve the situation
> but in others it will give errors?

Based on your testing I have an idea how to mitigate the missing Tx completion interrupt. If all goes well, re(4) could reliably take advantage of MSI on RealTek controllers. If that fails miserably, I will do as you suggested.

> >
> > I think re(4) in HEAD needs more testing. As you might know RealTek
> > produced too many chipsets. :-(
>
> Ok, i'll use the backported driver as it works better for me :-)
>
> If i can help you testing any patches i'm more than welcome to do it.
>
> Thanks a lot for your help Pyun YongHyeon.
>

You're welcome.

-- 
Regards,
Pyun YongHyeon
Re: visibility of release process
* Patrick Lamaizière ([EMAIL PROTECTED]) wrote:
> Ruben van Staveren <[EMAIL PROTECTED]> wrote:
> > Though experimental, I'm greatly enjoying
> > http://www.secnetix.de/olli/FreeBSD/svnews/?p=/stable/7
>
> Nice. There is also http://freshbsd.org/ (really cool IMHO).

Thanks; I write/run that. I'm currently developing version 2, which
should bring much better performance and more powerful filtering, e.g.
by filename, multiple committers, branches, etc., as well as history all
the way back to r1. Since it works off a local SVN mirror rather than
commit emails, it's also feasible to support things like local copies of
diffs.

Of course, now I need to generalize it back to the other BSDs.

--
Thomas 'Freaky' Hurst
http://hur.st/
Re: [ATA] and re(4) stability issues
On Wed, Dec 10, 2008 at 09:07:19PM +0900, Pyun YongHyeon wrote:
> On Wed, Dec 10, 2008 at 12:32:25PM +0100, Victor Balada Diaz wrote:
> > Also i didn't see any problem with interfaces going up and down,
> > but that usually happen after some hours of uptime, so i'll let
> > you know if the error happens again.
> >

After writing to the HD with dd for a few hours and using
stress -i 10 -d 10, the machine lost connectivity. I waited until today
to find out whether the machine had hung, panicked or just lost network
connectivity. I don't have local or serial access, so this is the only
way I could do it.

During the night the logs show several messages like these:

Dec 10 00:33:49 yac kernel: re0: watchdog timeout
Dec 10 00:33:49 yac kernel: re0: link state changed to DOWN
Dec 10 00:33:52 yac kernel: re0: link state changed to UP

The interface never recovered and I wasn't able to ping the machine
until I rebooted. Nagios was checking the whole time and saw no
recovery.

The netstat -i output in the daily scripts shows just one Oerrs. I'm
used to seeing a lot of them, but it seems that this time the card
didn't recover from the only one.

I also want to say that this is not a regression, as it happened before
with the 7.1 -BETA2 code.

Is there anything more I can try?

Regards.

--
The most convincing proof that intelligent life exists on other planets
is that they have not tried to contact us.
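To keep track of how often this happens, the kernel messages quoted above can be tallied with a short script. This is only a sketch: it assumes the syslog line format shown in this message, and the names summarize_events and LINE_RE are made up for illustration rather than taken from any existing tool.

```python
import re
from collections import Counter

# Matches kernel lines for a network interface, in the syslog format quoted
# above, e.g. "Dec 10 00:33:49 yac kernel: re0: watchdog timeout".
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<ifname>[a-z]+\d+):\s+(?P<event>.+)$"
)


def summarize_events(lines, ifname="re0"):
    """Count watchdog timeouts and link state changes for one interface."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m.group("ifname") != ifname:
            continue  # not a kernel message for the interface we care about
        event = m.group("event")
        if event == "watchdog timeout":
            counts["watchdog"] += 1
        elif event == "link state changed to DOWN":
            counts["down"] += 1
        elif event == "link state changed to UP":
            counts["up"] += 1
    return counts


# The three log lines from this message, used as sample input.
sample = [
    "Dec 10 00:33:49 yac kernel: re0: watchdog timeout",
    "Dec 10 00:33:49 yac kernel: re0: link state changed to DOWN",
    "Dec 10 00:33:52 yac kernel: re0: link state changed to UP",
]

summary = summarize_events(sample)
print(dict(summary))
```

On a real system the lines would come from the system log (for example /var/log/messages) instead of the sample list.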