Re: Curious failure of ZFS snapshots
On Sat, 29 Nov 2008 11:46:40 +0100 Pawel Jakub Dawidek <[EMAIL PROTECTED]> wrote about Re: Curious failure of ZFS snapshots:

PJD> > > GK> mclane# ll /tank/home/pt/.zfs/
PJD> > > GK> ls: snapshot: Bad file descriptor
PJD> > > GK> total 0

PJD> Is there a way for me to reproduce that?

None that I could tell you right now. This was on a machine which uses zfs send/receive to back up its zfs filesystem to a backup server. Only one out of 6 or 7 zfs filesystems showed this problem. After rebooting it went away and has not appeared again since.

cu
Gerrit
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Curious failure of ZFS snapshots
On Sun, 30 Nov 2008 01:05:48 + Pete French <[EMAIL PROTECTED]> wrote about Re: Curious failure of ZFS snapshots:

PF> Here is what I am doing - this script is run with an argument '7am' or
PF> '7pm' once per day. The mysql database is a slave replication from a
PF> master, so there is a continuous trickle of data into it. The symbolic
PF> links are there so you can connect to the mysql server and access
PF> 'xxx-7am' or 'xxx-7pm' to get a previous version of database 'xxx'.
PF> In case it's not obvious, the filesystem 'tank/zfs' is mounted on the
PF> directory '/var/db/mysql'. If you run this for a few cycles it should
PF> presumably break for you too.

If you think it will be useful I can also post my scripts. However, as I have not seen the problem again so far, it may be that I messed something up manually while developing the scripts one or two weeks ago. As mentioned, even the inaccessible zfs snapshots did send/receive fine, so internally zfs seems to be happy (only unmounting them was a bad idea :-).

cu
Gerrit
dhclient doing DISCOVER with bad IP checksum - bge
Sorry for the cross-post, but this could be either list's problem.

I have 2 boxes running 7-STABLE as of 20081130, both i386 SMP. One is running ISC DHCPD 3.0.x from recent ports, and the other dhclient from make world. The server is refusing to answer the DISCOVER request, as it thinks the IP checksum is wrong, which tcpdump also confirms. Other DHCP clients are working fine on this network, so I do not believe it to be the network, server or dhcpd.

Server is running a 2-port Intel card - em driver.
Client is a Dell PE1750 with 2 onboard NICs - bge driver.

I have tried turning off both RXCSUM and TXCSUM on both the client and server machines with no luck. I also tried the second NIC on the server with the same result.

This setup was working just a couple of weeks ago, and the only thing that has changed is updating the src for a make world. PXE booting this server does result in an IP being issued, so it is pointing towards something new/changed in 7-STABLE.

I have attached a 3-packet dump of the DISCOVER requests. Can anybody shed some light on this for me?

Thanks,
-Jon

[Attachment: dhclient_badcsum.cap (binary data)]
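For anyone who wants to check a captured DISCOVER by hand rather than trust tcpdump's verdict, here is a minimal sketch of the standard IPv4 header checksum (the one's-complement sum described in RFC 791). It is not taken from dhclient or dhcpd; the function names are made up for illustration, and the sample bytes in the usage note are a generic textbook UDP header, not the packets from the attached capture.

```python
import struct


def ipv4_header_checksum(header: bytes) -> int:
    """Compute the IPv4 header checksum over `header`.

    The stored checksum field (bytes 10-11) is treated as zero while
    summing, so the result can be compared with the stored value.
    """
    if len(header) < 20 or len(header) % 2:
        raise ValueError("IPv4 header must be >= 20 bytes and of even length")
    total = 0
    for offset in range(0, len(header), 2):
        if offset == 10:  # skip the checksum field itself
            continue
        (word,) = struct.unpack_from("!H", header, offset)
        total += word
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def header_checksum_ok(header: bytes) -> bool:
    """Return True if the stored checksum matches the computed one."""
    (stored,) = struct.unpack_from("!H", header, 10)
    return stored == ipv4_header_checksum(header)
```

Usage: pass the first IHL*4 bytes of the IP packet (20 bytes when there are no options), for example `header_checksum_ok(bytes.fromhex("45000073000040004011b861c0a80001c0a800c7"))`. If this check fails on a frame captured on the server side, the header really did leave the client with a bad checksum, as opposed to a transmit-offload artifact that only shows up in captures taken on the sender.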
make distribution halts during install (7.1Prerelease today)
The "make distribution" phase of a full build of 7.1 Prerelease, sourced today (Mon Dec 1 10:34:45 UTC 2008), unfortunately failed.

make distribution DESTDIR=/differentplace failed (see below); however, the following worked ok: buildkernel, installkernel, buildworld, installworld.

Two systems (a uniprocessor and a dual-processor) were used and both failed at the same point. Both building systems were successful in building/installing kernel and world for themselves and for a different DESTDIR (per the Handbook). Only the make distribution failed (a clue?). The repeated attempts to make distribution DESTDIR=X failed at the same location (see below). The error message suggests incorrect parameters to "install". Advice/guidance welcome.

# cd /usr/src && make DESTDIR=/usr/k_brfw-d distribution
cd /usr/src/etc; MAKEOBJDIRPREFIX=/usr/obj MACHINE_ARCH=i386 MACHINE=i386 CPUTYPE= GROFF_BIN_PATH=/usr/obj/usr/src/tmp/legacy/usr/bin GROFF_FONT_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/groff_font GROFF_TMAC_PATH=/usr/obj/usr/src/tmp/legacy/usr/share/tmac PATH=/usr/obj/usr/src/tmp/legacy/usr/sbin:/usr/obj/usr/src/tmp/legacy/usr/bin:/usr/obj/usr/src/tmp/legacy/usr/games:/usr/obj/usr/src/tmp/usr/sbin:/usr/obj/usr/src/tmp/usr/bin:/usr/obj/usr/src/tmp/usr/games:/sbin:/bin:/usr/sbin:/usr/bin make distribution
cd /usr/src/etc; install -o root -g wheel -m 644 amd.map apmd.conf auth.conf crontab csh.cshrc csh.login csh.logout devd.conf devfs.conf ddb.conf dhclient.conf disktab fbtab freebsd-update.conf ftpusers gettytab group hosts hosts.allow hosts.equiv hosts.lpd inetd.conf libalias.conf login.access login.conf mac.conf motd netconfig network.subr networks newsyslog.conf nsswitch.conf portsnap.conf pf.os phones profile protocols rc rc.bsdextended rc.firewall rc.firewall6 rc.initdiskless rc.sendmail rc.shutdown rc.subr remote rpc services shells snmpd.config sysctl.conf syslog.conf etc.i386/ttys /usr/src/etc/../gnu/usr.bin/man/manpath/manpath.config /usr/src/etc/../usr.bin/mail/misc/mail.rc 
/usr/src/etc/../usr.bin/locate/locate/locate.rc nscd.conf /usr/k_brfw-d/etc; cap_mkdb -l /usr/k_brfw-d/etc/login.conf; install -o root -g wheel -m 755 netstart pccard_ether rc.suspend rc.resume /usr/k_brfw-d/etc; install -o root -g wheel -m 600 master.passwd nsmb.conf opieaccess /usr/k_brfw-d/etc; pwd_mkdb -L -i -p -d /usr/k_brfw-d/etc /usr/k_brfw-d/etc/master.passwd
install: wrong number or types of arguments
usage: install [-bCcpSsv] [-B suffix] [-f flags] [-g group] [-m mode] [-o owner] file1 file2
       install [-bCcpSsv] [-B suffix] [-f flags] [-g group] [-m mode] [-o owner] file1 ... fileN directory
       install -d [-v] [-g group] [-m mode] [-o owner] directory ...
*** Error code 64

Stop in /usr/src/etc.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.

I also tried various CPUTYPEs to ensure that all parameters to make were populated.

Having spent most of the day building kernels/worlds, gstriping, gjournaling and building a lot of ports, the package is looking pretty good. I hope that my explanation is concise; it's been a long day and I'm stuck.

Regards,
Dewayne.
7.1-PRERELEASE: arcmsr write performance problem
Hi,

I am seeing extremely poor performance (~100kB/s) when untarring large tar files into fresh ufs filesystems. I see the problem both with softupdates and without softupdates but with an async mount. This is a Supermicro X7DB8 board, 4GB, 2 x Xeon 5140.

Sample gstat output:

dT: 1.033s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps     ms/w   %busy Name
  585     61      0      0    0.0     61    170  13812.0   100.1| da2

I see ms/w start at about 200ms with a ~3MB/s throughput, and then I see ms/w rise and kBps drop. ms/w goes as high as 16-20s, and then suddenly drops back down to about 200ms. Using iostat, while the performance is high(er), kb/t is 64kB; as the problem starts it drops towards 2kB.

Copying a single large file doesn't exhibit this problem, although throughput isn't great (~3-5MB/s). However, that's better than 100kB/s.

arcmsr0: mem 0xd890-0xd8900fff,0xd800-0xd83f irq 16 at device 14.0 on pci10
ARECA RAID ADAPTER0: Driver Version 1.20.00.15 2007-10-07
ARECA RAID ADAPTER0: FIRMWARE VERSION V1.46 2008-08-06
arcmsr0: [ITHREAD]

There are eight disks connected in a RAID-6 configuration. The controller's cache is write-through and the disks' write caches are disabled. NCQ is enabled on the drives.

The same hardware didn't have this problem when it ran 6.3-p1. However, the system BIOS was updated at the same time as the operating system (in an attempt to solve a recent em problem), so it is possible that it is a BIOS-related problem.

The same build on an entirely different machine with an aac controller and SAS disks also doesn't show this problem.

Running 'devinfo -r' doesn't list arcmsr as having an interrupt at all (see below). That strikes me as odd; checking another machine that is still running 6.2 with an arcmsr controller, I can see the interrupt just fine.

So:
- Does anyone have any suggestions?
- Is it normal for arcmsr to not show an interrupt in the output from devinfo in 7.1?

Full dmesg, devinfo below.

Thanks,
Jan Mikkelsen

Copyright (c) 1992-2008 The FreeBSD Project. 
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 7.1-PRERELEASE #0: Mon Dec 1 14:53:12 EST 2008 [EMAIL PROTECTED]:/home/janm/p4/freebsd-image-std-2008.2/work/base-freebsd/home/janm/p4/freebsd-image-std-2008.2/FreeBSD/src/sys/TW-SMP Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel(R) Xeon(R) CPU5140 @ 2.33GHz (2333.35-MHz K8-class CPU) Origin = "GenuineIntel" Id = 0x6f6 Stepping = 6 Features=0xbfebfbff Features2=0x4e3bd AMD Features=0x20100800 AMD Features2=0x1 Cores per package: 2 usable memory = 4280651776 (4082 MB) avail memory = 4117843968 (3927 MB) ACPI APIC Table: FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 cpu2 (AP): APIC ID: 6 cpu3 (AP): APIC ID: 7 ioapic0 irqs 0-23 on motherboard ioapic1 irqs 24-47 on motherboard kbd1 at kbdmux0 ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) acpi0: on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 pcib1: at device 2.0 on pci0 pci1: on pcib1 pcib2: irq 16 at device 0.0 on pci1 pci2: on pcib2 pcib3: irq 16 at device 0.0 on pci2 pci3: on pcib3 pcib4: at device 0.0 on pci3 pci4: on pcib4 ahd0: port 0x2400-0x24ff,0x2000-0x20ff mem 0xd850-0xd8501fff irq 16 at device 2.0 on pci4 ahd0: [ITHREAD] aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs ahd1: port 0x2c00-0x2cff,0x2800-0x28ff mem 0xd8502000-0xd8503fff irq 17 at device 2.1 on pci4 ahd1: [ITHREAD] aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs pcib5: at device 0.2 on pci3 pci5: on pcib5 bge0: mem 0xd860-0xd860 irq 16 at device 1.0 on pci5 miibus0: on bge0 brgphy0: PHY 1 on miibus0 brgphy0: 10baseT, 
10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto bge0: Ethernet address: 00:40:f4:66:b1:56 bge0: [ITHREAD] pcib6: irq 18 at device 2.0 on pci2 pci6: on pcib6 em0: port 0x3000-0x301f mem 0xd840-0xd841 irq 18 at device 0.0 on pci6 em0: Using MSI interrupt em0: [FILTER] em0: Ethernet address: 00:30:48:31:67:86 em1: port 0x3020-0x303f mem 0xd842-0xd843 irq 19 at device 0.1 on pci6 em1: Using MSI interrupt em1: [FILTER] em1: Ethernet address: 00:30:48:31:67:87 pcib7: at device 0.3 on pci1 pci7: on pcib7 pcib8: at device 4.0 on pci0 pci8: on pcib8 pcib9: at device 6.0 on pci0 pci9: on pcib9 pcib10: at device 0.0 on pci9 pci10: on pcib10 arcmsr0: mem 0xd890-0xd8900fff,0xd800-0xd83f irq 16 at device 14.0 on pci10 ARECA RAID ADAPTER0: Driver Version 1.20.00.1
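
As a rough cross-check of the gstat sample in the message above, the two small helpers below do the arithmetic. This is only a sketch: it assumes the flattened gstat line reads as L(q)=585, w/s=61 and write kBps=170, and it applies Little's law (mean wait = queue length / completion rate), which is a general queueing result and not anything specific to arcmsr or gstat. The function names are made up for this example.

```python
def avg_transfer_kb(kbps: float, ops_per_s: float) -> float:
    """Average size of one transfer in kB, given throughput and op rate."""
    return kbps / ops_per_s


def expected_wait_s(queue_len: float, ops_per_s: float) -> float:
    """Little's law: mean time spent queued = queue length / completion rate."""
    return queue_len / ops_per_s


if __name__ == "__main__":
    # Values as read from the sample (an assumption, see above).
    print(f"avg transfer: {avg_transfer_kb(170, 61):.2f} kB")
    print(f"expected wait: {expected_wait_s(585, 61):.1f} s")
```

Under that reading, the average write is just under 3kB, in line with the iostat observation that kb/t falls towards 2kB, and a queue of 585 requests draining at 61 per second implies waits of several seconds, the same order of magnitude as the reported ms/w. In other words the numbers look self-consistent: the device is completing a small number of tiny writes per second rather than stalling outright.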
Re: Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
On Nov 26, 2008, at 1:12 PM, Ken Smith wrote:
> Unfortunately no. As John indicated in the earlier thread BIOS
> issues tend to be extremely hard to diagnose and so far it seems
> like its specific to this one motherboard.
>
> Given this problem does cause issues with installs I'd be willing
> to provide ISOs built at the point we've done the Errata Notice that
> fixes the problem. But its too nebulous an issue to hold up the
> release itself for.

It does *not* cause an issue with installs. Installs work fine. It prevents booting an installed operating system. This appears to affect *ALL* of the Intel multi-cpu motherboards, including 3 generations of Rackable systems.

The only reason it is nebulous is because absolutely nobody bothered to investigate the issue. I've been asking for what information would help. I've offered to set up serial consoles, or even ship systems, to anyone who would work on this problem.

This is a very big problem that will affect thousands of freebsd servers.

Ken, the complete lack of action taken by FreeBSD to even CONSIDER investigating a significant bug reported during the testing process is shocking. And it truly puts a lie to those who continue to claim that we should be more active in the testing process. Every time I have done this, I've found significant issues that affect a significant portion of the user base and COMPLETELY prevent deployment of a given release, and absolutely nothing has been done to even investigate the reports, never mind address them.

Congratulations. Good Job. If you aren't going to accept bug reports, why exactly do you release testing candidates at all?

--
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other randomness
Re: usb keyboard dying at loader prompt
Just FYI we are seeing the exact same problem with PS/2 keyboards and the 6.4 loader, so this may not be a USB-only issue.

The complete lack of response to serious bug reports about 6.4-REL is fairly shocking.

On Nov 28, 2008, at 5:24 AM, Andriy Gapon wrote:
> I did more testing and it seems that our loader does have something to
> do with the problem.
>
> If I boot to memtest86 the keyboard keeps working.
> If I pause boot menu, wait for many minutes, the keyboard still works.
> If I escape to loader prompt, this is when the keyboard stops working
> after a few seconds.
>
> Not sure how to explain this.
> I think I've seen some changes to reduce memory usage of loader, I will
> try them to see if that would make any difference for my situation.
>
> --
> Andriy Gapon

--
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other randomness
Re: Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
On Mon, 2008-12-01 at 10:20 -0800, Jo Rhett wrote: > On Nov 26, 2008, at 1:12 PM, Ken Smith wrote: > > Unfortunately no. As John indicated in the earlier thread BIOS > > issues tend to be extremely hard to diagnose and so far it seems > > like its specific to this one motherboard. > > > > Given this problem does cause issues with installs I'd be willing > > to provide ISOs built at the point we've done the Errata Notice that > > fixes the problem. But its too nebulous an issue to hold up the > > release itself for. > > It does *not* cause an issue with installs. Installs work fine. It > prevents booting an installed operating system. This appears to > affect *ALL* of the Intel multi-cpu motherboards, including 3 > generations of Rackable systems. Understood, I guess I wasn't quite specific enough. The machine not being able to boot what got installed on its disk I consider an install problem. To date this is the first mention I've seen of it affecting more than one specific machine type. I might have missed it but I can't recall you mentioning this affected more than one particular machine. And it does not seem to affect *ALL* of the Intel multi-cpu motherboards. > The only reason it is nebulous is because absolutely nobody bothered > to investigate the issue. I've been asking for what information would > help. I've offered to setup serial consoles, or even ship systems, to > anyone who would work on this problem. Both John and Xin Li have chimed in on the two threads I've seen that are related to this specific topic. John diagnosed it as a issue with the BIOS. That's what makes it a nebulous problem. When working on those sorts of things most people liken it to "Whack-a-mole". > This is very big problem that will affect thousands of freebsd servers. Its still not clear it will affect thousands of servers. 
The same set of changes got made to stable/7 as were done to stable/6, and the test builds for the 7.1 release have been seeing much more testing than the test builds for the 6.4 release. If the problem was as wide-spread as you're suggesting we'd likely have seen a lot more reports and that factored into the decision about whether to go ahead or not. This all left me with a decision. My choices were to back out the BTX changes that were known to fix boot issues with certain motherboards and enabled booting from USB devices or leave things as they are. The motherboards that didn't boot with the older code had no work-around. The motherboards that did boot with the older code but not the newer code do have a work-around (use the old loader). Decisions like that suck, no matter which choice I make it's wrong. Holding the release until all bios issues get resolved isn't a viable option because of the "Whack-a-mole" thing mentioned above. Fix it for one and two break. It takes a lot of time/work to settle into what seems to work for the widest set of machines. > Ken, the complete lack of action taken by FreeBSD to even CONSIDER > investigating a significant bug reported during the testing process is > shocking. And it truly puts a lie to those who continue to claim that > we should be more active in the testing process. Every time I have > done this, I'd found significant issues that affect a significant > portion of the user base and COMPLETELY prevent deployment of a given > release, and absolutely nothing has been done to even investigate the > reports, nevermind address them. > > Congradulations. Good Job. If you aren't going to accept bug > reports, why exactly do you release testing candidates at all? So you're saying John and Xin Li's responses (Xin Li's questions still un-answered) to you show a complete lack to even consider investigating it? 
I know from past email threads your preference is for 6.X right now, but as a test point, if you aren't totally fried over this whole thing, it would still be useful to know for sure if the issue exists with 7.1 test builds. If yes, it eliminates a variety of possibilities and helps focus on the exact problem.

--
                                                Ken Smith
- From there to here, from here to      |       [EMAIL PROTECTED]
  there, funny things are everywhere.   |
                      - Theodore Geisel |
Re: Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
On Dec 1, 2008, at 11:30 AM, Ken Smith wrote: Both John and Xin Li have chimed in on the two threads I've seen that are related to this specific topic. John diagnosed it as a issue with the BIOS. That's what makes it a nebulous problem. When working on those sorts of things most people liken it to "Whack-a-mole". Diagnosed without testing. John never asked for any more information than the page fault description from me. When I asked what else to test and offered to supply systems for testing he stopped responding. Xin Li proposed a work-around that would have castrated the systems. It might work, but it wasn't a useful workaround so I deferred testing and focused on trying to get someone to address the real problem. This is very big problem that will affect thousands of freebsd servers. Its still not clear it will affect thousands of servers. Um... Rackable. Rackable ships cabinets full of systems to people that run FreeBSD. They don't sell to home or small corporate users, period. Any problem that affects a standard Rackable build will by definition affect thousands of systems. (much like any standard Dell or HP server build) This all left me with a decision. My choices were to back out the BTX changes that were known to fix boot issues with certain motherboards and enabled booting from USB devices or leave things as they are. Or do some more testing and determine the problem and fix it. I had a stack of systems demonstrating the problem. I could have shipped one to each freebsd developer you wanted to work on it. If you were willing to identify the affect source code and relevant gdb traps I would have happily worked on the source directly if that is what it took. I would test. I would supply console access and build systems. I would ship them to anyone who wanted one in their hot little hands. I would investigate the source code myself with a mere hour of "here's the relevant bits you need to consider" training. 
You could have done *anything* that suited your needs for testing. Instead you did nothing. The motherboards that didn't boot with the older code had no work-around. The motherboards that did boot with the older code but not the newer code do have a work-around (use the old loader). Not true. I tested this, installing the old loader and it did not change the problem. As reported. Decisions like that suck, no matter which choice I make it's wrong. Holding the release until all bios issues get resolved isn't a viable option because of the "Whack-a-mole" thing mentioned above. Fix it for one and two break. It takes a lot of time/work to settle into what seems to work for the widest set of machines. Break the boot loader for a very wide variety of systems rather than spend EVEN A SINGLE HOUR trying to diagnose the boot problem? Ken, your diagnosis here would make sense if ANY diagnosis had been attempted. This could be a trivial problem. It could be solved with 5 minutes of actually looking at it. What happened here is that you proceeded WITHOUT EVEN TRYING. So you're saying John and Xin Li's responses (Xin Li's questions still un-answered) to you show a complete lack to even consider investigating it? No actual diagnosis was done. I'm sorry, but if I pull my car up to my mechanic's garage and he makes a diagnosis of "no idea what's wrong" without even popping the hood, yeah that counts as "didn't even consider investigating" Worse yet, I would happily have done all of the grunt work for the investigation. But I'm not going to start by reading the source tree and making guesses where to look. If someone had given me some useful tests to do, I would have done them. I know from past email threads your preference is for 6.X right now Not my preference, my ability to justify the evaluation and testing costs based on the support available for a given release. 7.0 doesn't work on this hardware at all. 
No, I haven't tested 7.1 because 6.4 was the easier testing target and I had thought that the security team was working on fixing the support model. So now we have the brilliance strategy of a long-term support -REL that we will never be able to use. The same stupid stunt that gave us 6.1 which was unusable and 6.2 which worked great but expired at the same time as 6.1. Etc and such forth. 6.5 will likely be short term support again, but the first release we can consider for deployment. but as a test point if you aren't totally fried over this whole thing it would still be useful to know for sure if the issue exists with 7.1 test builds. If yes it eliminates a variety of possibilities and helps focus on the exact problem. I'm not burnt, but testing 7.1 has no meaningful relevance to my day job until we have a reasonable and working support mechanism. And given that I really pulled out the stops to make sure we had hardware f
Re: Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
Jo Rhett wrote:
> On Dec 1, 2008, at 11:30 AM, Ken Smith wrote:
>> Both John and Xin Li have chimed in on the two threads I've seen that
>> are related to this specific topic. John diagnosed it as a issue with
>> the BIOS. That's what makes it a nebulous problem. When working on
>> those sorts of things most people liken it to "Whack-a-mole".
>
> Diagnosed without testing. John never asked for any more information
> than the page fault description from me. When I asked what else to test
> and offered to supply systems for testing he stopped responding. Xin Li
> proposed a work-around that would have castrated the systems. It might
> work, but it wasn't a useful workaround so I deferred testing and
> focused on trying to get someone to address the real problem.

What I proposed was to *narrow down* the problem so we can diagnose it further. Since nobody has any idea at the moment what the problem is, we need either further information or a review of the whole 6.3->6.4 diff, and the latter is (in my opinion) not an optimal use of developers' time.

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net/
FreeBSD - The Power to Serve!
confirming bugs is bad behavior, etc.
On Dec 1, 2008, at 11:59 AM, George V. Neville-Neil wrote: I have mostly stayed away from these threads because they've often devolved into unproductive finger pointing. Please leave the hyperbole out of your posts, or at least attempt to cut it back. People on these lists are working quite hard to solve problems for the whole of the FreeBSD community and your posts, such as this one, are not helping us to move forward. My posts have always been directed at solving very real, operational problems with using FreeBSD on server platforms, which is exactly the stated goal for freebsd. I have always offered not only problems, but resources to help test or evaluate the issues, and serious considerations for ways to improve the process. Yes, you're right. Threads I start about real problems always devolve into unproductive finger pointing. That would be the freebsd developers attacking the reporter for identifying a real, operational problem. Take a look at the posts of the FreeBSD developers, and view for yourself the unprofessional attacks and personal insults hurled by them at people who are simply trying to get real problems resolved. And yet, instead of asking your developers to stop violating the posted rules of the mailing list, you are asking a bug reporter who simply informed another bug reporter that their problem was both widespread and not limited to USB devices to stop posting to the list. Because god knows that "yes we saw it too and it's widely reported" is bad behavior. Much worse that personal attacks which are strictly against the list rules. Yes, I'm sure that the personal attacks really do help drive freebsd development forward. Much more so than me bringing resources and actually testing things does. Now that Core has clearly spoken their mind on this issue, by refusing to ask freebsd developers to avoid violating the list charter and then publicly calling out someone for just saying "yeah, it's a widely reported problem" ... 
leaves any doubt that positive change is going to happen here. Your request is accepted. I'm unsubscribing now.

--
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other randomness
Re: Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
On Dec 1, 2008, at 12:25 PM, Xin LI wrote:
> What I proposed is, to *narrow down* the problem so we can diagnose
> further, since nobody has idea at the moment about how the problem
> was, we do need to have further information, or, to get the whole
> 6.3->6.4 diff reviewed, which is (in my opinion) not an optimal use
> of developers' time.

I got your request at the beginning of a vacation period where I was out of town. I had explicitly requested that 6.4 be blocked for this issue. I didn't think that "just my problem" would be enough to hold it up, but I apparently never even considered that -REL would happen without even responding to my request.

Since nobody had responded to my request, and several posts had gone out about more testing for 7.1 (which had the same loader and the same problems), I assumed that 6.4 was similarly delayed. Had anyone said you needed this information pronto I would have canceled my Thanksgiving plans and spent the day in the lab testing this for you.

For that matter, I had already pulled a diff of 6.3 to 6.4 and was working my way through it trying to find the relevant parts. If you had identified the relevant portions, I would have happily tried backing out some of the changes on a per-component basis to figure it out.

In short, tell me what you wanted/needed, and I would have done it ASAP.

It's apparently irrelevant now.

--
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other randomness
Re: usb keyboard dying at loader prompt
At Mon, 1 Dec 2008 10:22:31 -0800, Jo Rhett wrote:
>
> Just FYI we are seeing the exact same problem with PS/2 keyboards and
> the 6.4 loader, so this may not be a USB-only issue.
>
> The complete lack of response to serious bug reports about 6.4-REL is
> fairly shocking.
>

Jo,

I have mostly stayed away from these threads because they've often devolved into unproductive finger pointing. Please leave the hyperbole out of your posts, or at least attempt to cut it back. People on these lists are working quite hard to solve problems for the whole of the FreeBSD community and your posts, such as this one, are not helping us to move forward.

Thanks,
George Neville-Neil
Re: confirming bugs is bad behavior, etc.
At Mon, 1 Dec 2008 12:27:57 -0800, Jo Rhett wrote:
>
> Now that Core has clearly spoken their mind on this issue, by refusing
> to ask freebsd developers to avoid violating the list charter and then
> publicly calling out someone for just saying "yeah, it's a widely
> reported problem" ... leaves any doubt that positive change is going
> to happen here.
>

Note that my mail was not marked in any way "From core"; I wrote merely as a list participant. I've always been all for people finding and helping to work through bugs. What I object to is hyperbole and passive aggressiveness. For more on this see here:

http://video.google.com/videoplay?docid=-4216011961522818645

If we can identify the issue let's fix it, but let's do it without lots of emotional stuff.

Best,
George
RE: usb keyboard dying at loader prompt
Hi,

> Just FYI we are seeing the exact same problem with PS/2 keyboards and
> the 6.4 loader, so this may not be a USB-only issue.
>
> [ ... ]
>
> On Nov 28, 2008, at 5:24 AM, Andriy Gapon wrote:
> > I did more testing and it seems that our loader does have something to
> > do with the problem.
> >
> > If I boot to memtest86 the keyboard keeps working.
> > If I pause boot menu, wait for many minutes, the keyboard still works.
> > If I escape to loader prompt, this when the keyboard stops working
> > after a few seconds.
> >
> > Not sure how to explain this.
> > I think I've seen some changes to reduce memory usage of loader, I
> > will try them to see if that would make any difference for my situation.

I have seen a similar problem on a Sun X4240 with 7.1-PRE. Using the ILOM remote keyboard works at the loader prompt but fails at the root filesystem prompt. I could work around the problem by attaching a different keyboard to the front USB port.

Have you tried different keyboards?

Regards,
Jan.
Re: no priority on the console?
Hi,

> As per my previous message, I've spent about 3 months trying to debug
> a problem that was causing all disk I/O to go very slowly.

At first glance this sounds similar to the problem I am having with very slow I/O on the Areca controller (see: "7.1-PRERELEASE: arcmsr write performance problem"). What controller are you using? Is the write cache enabled?

> One of the things which made this nearly impossible to diagnose was
> the absolute lack of priority given to the console. Logging in on the
> console would take 12-15 minutes. Hitting enter on the console would
> usually take between 3 and 5 minutes.

Yes, I see this when I get the slow I/O problem.

I think this has been a problem for some time; I have also seen "console freezes" (ssh, console, etc.) on 6.0 and 6.1 systems under SATA load. That was a while ago now (2006?). I also recall others reporting having seen the same problem intermittently.

> This doesn't seem right to me. Can someone explain why the console
> isn't given a very high priority? Why not? What other mechanism does
> the sysadmin have for debugging, at a time when SSH logins either
> fail, or take up to an hour to complete?

In my case I could log into the system and start things like iostat and gstat, and they kept running while the problem occurred so that I could see some of what was going on. I could also have what seemed like a reasonable ssh session with a jail on the same machine.

This indicates to me that it is not the console that is the issue, but rather that the process of logging into the main machine touches some file that causes it to get caught up in the slow I/O quagmire.

If the problem I am seeing now is the same as the one I saw a few years ago then I think the nature might have changed. My recollection is that utilities like iostat would also freeze back then, but I can't be sure.

I'd like to resolve this problem too.

Regards,
Jan.
Re: make distribution halts during install (7.1Prerelease today)
My earlier post falls into the embarrassing, wish-that-I-hadn't category. To prevent anyone wasting effort, I'm replying.

make distribution DESTDIR=/newplace

requires a

make world DESTDIR=/newplace

as a prerequisite. What I described in the earlier post led me to believe that there was an error in /usr/bin/install when using:

make distribution DESTDIR=/differentplace

The granularity of my testing was inappropriate. Apologies for the distraction.

Dewayne
Re: dhclient doing DISCOVER with bad IP checksum - bge (7.1 show stopper??)
Can someone please confirm or rule out my issue with dhclient sending bad IP checksum packets? It would really suck if 7.1 was released with a broken DHCP client.

Jonathan Feally wrote:

Sorry for the cross-post, but this could be either list's problem.

I have 2 boxes running 7-STABLE as of 20081130, both i386 SMP. One is running ISC DHCPD 3.0.x from recent ports, and the other dhclient from make world. The server is refusing to answer the DISCOVER request, as it thinks the IP checksum is wrong, which tcpdump also confirms. Other DHCP clients are working fine on this network, so I do not believe it to be the network, server or dhcpd.

Server is running a 2 Port Intel card - em driver.
Client is a Dell PE1750 with 2 onboard NICs - bge driver.

I have tried turning off both RXCSUM and TXCSUM on both the client and server machines with no luck. I also tried the second NIC on the server with the same result.

This setup was working just a couple of weeks ago, and the only thing that has changed is updating the src for a make world. PXE booting this server does result in an IP being issued, so it is pointing towards something new/changed in 7-STABLE.

I have attached a 3 packet dump of the DISCOVER requests.

Can anybody shed some light on this for me?

Thanks,
-Jon
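For anyone who wants to check the captured DISCOVER packets by hand rather than trusting tcpdump, here is a minimal sketch of the standard IPv4 header checksum (16-bit one's-complement sum over the header). It is not taken from dhclient or the bge driver; the function names are made up for illustration, and the sample bytes are a generic 20-byte header, not one from the attached dump.

```python
import struct


def ipv4_header_checksum(header: bytes) -> int:
    """Return the 16-bit one's-complement checksum over an IPv4 header.

    To compute a fresh checksum, the caller passes the header with the
    checksum field (bytes 10-11) set to zero.
    """
    if len(header) % 2:
        header += b"\x00"  # pad to a whole number of 16-bit words
    words = struct.unpack("!%dH" % (len(header) // 2), header)
    total = sum(words)
    # Fold any carries above bit 15 back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def header_is_valid(header: bytes) -> bool:
    """A header whose checksum field is correct sums to zero."""
    return ipv4_header_checksum(header) == 0


if __name__ == "__main__":
    # Illustrative header only (hypothetical addresses), checksum field zeroed.
    zeroed = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
    print(hex(ipv4_header_checksum(zeroed)))  # 0xb861
```

Feeding the first 20 bytes of the IP header from a captured packet to header_is_valid() should say whether the packet is bad on the wire or only looks bad in the capture, for example because checksum offload is still filling the field in after the capture point.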
Re: 7.1-PRERELEASE: arcmsr write performance problem
Replying to my own post ...

I have done a test on the same machine comparing 6.3-p1 to 7.1-PRE. The performance is the expected ~6MB/s (because of the lack of cache) on 6.3-p1, so the BIOS change doesn't seem to be at fault. This seems to be a regression somewhere between 6.3 and 7.1. The Areca driver is the same in 6.3 and 7.1, so the problem seems to be elsewhere.

I think this is more than just a "performance" problem. The observations with gstat showing extremely high ms/w values (I have seen them as high as 22000) make it look like IO completion interrupts are being lost.

Any suggestions on where to look next? Are there obvious candidates?

Jan Mikkelsen wrote:

Hi,

I am seeing extremely poor performance (~100kB/s) when untarring large tar files into fresh ufs filesystems. I see the problem with softupdates and without softupdates but with an async mount.

This is a Supermicro X7DB8 board, 4GB, 2 x Xeon 5140.

Sample gstat output:

dT: 1.033s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps     ms/w   %busy Name
  585     61      0      0    0.0     61    170  13812.0   100.1| da2

I see ms/w start at about 200ms with a ~3MB/s throughput, and then I see ms/w rise and kBps drop. ms/w goes as high as 16-20s, and then suddenly drops back down to about 200ms.

Using iostat, while the performance is high(er), kb/t is 64kB; as the problem starts it drops towards 2kB.

Copying a single large file doesn't exhibit this problem, although throughput isn't great (~3-5MB/s). However, that's better than 100kB/s.

arcmsr0: mem 0xd890-0xd8900fff,0xd800-0xd83f irq 16 at device 14.0 on pci10
ARECA RAID ADAPTER0: Driver Version 1.20.00.15 2007-10-07
ARECA RAID ADAPTER0: FIRMWARE VERSION V1.46 2008-08-06
arcmsr0: [ITHREAD]

There are eight disks connected in a RAID-6 configuration. The controller's cache is write-through and the disks' write caches are disabled. NCQ is enabled on the drives.

The same hardware when it ran 6.3-p1 didn't have this problem.
However, the system BIOS was updated at the same time as the operating system (in an attempt to solve a recent em problem), so it is possible that it is a BIOS related problem.

The same build on an entirely different machine with an aac controller and SAS disks also doesn't show this problem.

Running 'devinfo -r' doesn't list arcmsr as having an interrupt at all (see below). That strikes me as odd; checking another machine that is still running 6.2 with an arcmsr controller, I can see the interrupt just fine.

So:
- Does anyone have any suggestions?
- Is it normal for arcmsr to not show an interrupt in the output from devinfo in 7.1?

Full dmesg, devinfo below.

Thanks,
Jan Mikkelsen

Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.1-PRERELEASE #0: Mon Dec 1 14:53:12 EST 2008
[EMAIL PROTECTED]:/home/janm/p4/freebsd-image-std-2008.2/work/base-freebsd/home/janm/p4/freebsd-image-std-2008.2/FreeBSD/src/sys/TW-SMP
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU5140 @ 2.33GHz (2333.35-MHz K8-class CPU)
Origin = "GenuineIntel" Id = 0x6f6 Stepping = 6
Features=0xbfebfbff
Features2=0x4e3bd
AMD Features=0x20100800
AMD Features2=0x1
Cores per package: 2
usable memory = 4280651776 (4082 MB)
avail memory = 4117843968 (3927 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 1
cpu2 (AP): APIC ID: 6
cpu3 (AP): APIC ID: 7
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 24-47 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
acpi0: on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pcib1: at device 2.0 on pci0
pci1: on pcib1
pcib2: irq 16 at device 0.0 on pci1
pci2: on pcib2
pcib3: irq 16 at device 0.0 on pci2
pci3: on pcib3
pcib4: at device 0.0 on pci3
pci4: on pcib4
ahd0: port 0x2400-0x24ff,0x2000-0x20ff mem 0xd850-0xd8501fff irq 16 at device 2.0 on pci4
ahd0: [ITHREAD]
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
ahd1: port 0x2c00-0x2cff,0x2800-0x28ff mem 0xd8502000-0xd8503fff irq 17 at device 2.1 on pci4
ahd1: [ITHREAD]
aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
pcib5: at device 0.2 on pci3
pci5: on pcib5
bge0: mem 0xd860-0xd860 irq 16 at device 1.0 on pci5
miibus0: on bge0
brgphy0: PHY 1 on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge0: Ethernet address: 00:40:f4:66:b1:56
bge0: [ITHREAD]
p
exim 4.69 freebsd 7.0 locking issues
Hi all,

Running Exim 4.69 on FreeBSD 7.0-RELEASE-p6. The box was recently upgraded from 6.3 (about 24 hours ago).

Currently Exim is sending the following lines to the log files:

2008-12-01 19:02:35 Failed to get write lock for /var/spool/exim/db/callout.lockfile: Invalid argument
2008-12-01 19:02:35 Failed to get write lock for /var/spool/exim/db/callout.lockfile: Invalid argument
2008-12-01 19:02:35 1L74Cp-000GRN-3R Cannot lock /var/spool/exim/input//1L74Cp-000GRN-3R-D (22): Invalid argument

The permissions are all correct for the spool directories and for Exim itself. It is creating stacks and stacks of 0 byte files in the message spool directory.

I have recompiled all the ports but to no avail. I've upgraded 2 other machines with 99.0% the same setup with no issues. The only differences are hostnames/IPs and that this machine is running mysql. Everything else on the machine (spam-assassin, clamav, mysql) is working fine.

Has anybody got any ideas, other than downgrading back to FreeBSD 6.3?

Cheers
cya
Andrew
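One way to narrow this down is to check whether a plain fcntl()-style write lock works at all in the spool directory, independent of Exim. The sketch below assumes that the failing locks are ordinary POSIX record locks, which is a guess based on the "Invalid argument" (errno 22) in the log and not something confirmed in this thread. The function name try_write_lock is made up for illustration.

```python
import errno
import fcntl
import os
import tempfile


def try_write_lock(directory):
    """Try an exclusive, non-blocking fcntl lock on a scratch file.

    Returns None if the lock was taken and released, otherwise the
    symbolic errno name (for example "EINVAL") reported by the kernel.
    """
    # Hypothetical probe: a throwaway file created in the directory under test.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # Directory taken from the log lines above; run as the Exim user.
    print(try_write_lock("/var/spool/exim/db"))
```

If this also reports EINVAL in /var/spool/exim/db but not in, say, /tmp, the problem would seem to lie with that filesystem or mount rather than with Exim or its permissions.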
Re: dhclient doing DISCOVER with bad IP checksum - bge (7.1 show stopper??)
On Tuesday 02 December 2008 15:57:15 Jonathan Feally wrote:
> Can someone please confirm or rule out my issue with dhclient sending
> bad IP checksum packets. It would really suck if 7.1 was released with a
> broken DHCP client.

I had 7.1-PRE (early October) send out DHCP requests without issue, although I don't have that system available now. It was using an em card.

I have a 7.0-STABLE system with an sk card from July that does DHCP requests just fine too.

I don't have any bge systems running 7 to test with though, sorry.

Does it always give dud packets or just DHCP? Can you try another card in the client?

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
-- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C