Re: Using iscsi with multiple targets
On Mon, 2008-07-14 at 11:29 +0300, Danny Braniss wrote:
> > FreeBSD 7.0
> >
> > I have 2 machines with identical configurations/hardware, let's call them
> > A (master) and B (slave). I have installed iscsi-target from ports and
> > have set up 3 targets representing the 3 drives I wish to be connected to
> > from A.
> >
> > The Targets file:
> >
> > # extents     file            start   length
> > extent0       /dev/da1        0       465GB
> > extent1       /dev/da2        0       465GB
> > extent2       /dev/da3        0       465GB
> >
> > # target      flags   storage         netmask
> > target0       rw      extent0         192.168.0.1/24
> > target1       rw      extent1         192.168.0.1/24
> > target2       rw      extent2         192.168.0.1/24
> >
> > I then start up iscsi_target and all is good.
> >
> > Now on A I have set up my /etc/iscsi.conf file as follows:
> >
> > # cat /etc/iscsi.conf
> > data1 {
> >     targetaddress=192.168.0.252
> >     targetname=iqn.1994-04.org.netbsd.iscsi-target:target0
> >     initiatorname=iqn.2005-01.il.ac.huji.cs::BSD-2-1.sven.local
> > }
> > data2 {
> >     targetaddress=192.168.0.252
> >     targetname=iqn.1994-04.org.netbsd.iscsi-target:target1
> >     initiatorname=iqn.2005-01.il.ac.huji.cs::BSD-2-1.sven.local
> > }
> > data3 {
> >     targetaddress=192.168.0.252
> >     targetname=iqn.1994-04.org.netbsd.iscsi-target:target2
> >     initiatorname=iqn.2005-01.il.ac.huji.cs::BSD-2-1.sven.local
> > }
> >
> > So far so good; now come the issues. First of all, it would appear that
> > with iscontrol one can only start one "named" session at a time; for
> > example:
> >
> > /sbin/iscontrol -n data1
> > /sbin/iscontrol -n data2
> > /sbin/iscontrol -n data3
> >
> > I guess that is ok, except that each invocation of iscontrol resets the
> > other sessions. Here is the camcontrol and dmesg output from running the
> > above 3 commands.
> >
> > # camcontrol devlist
> > at scbus0 target 0 lun 0 (pass0,da0)
> > at scbus0 target 1 lun 0 (pass1,da1)
> > at scbus0 target 2 lun 0 (pass2,da2)
> > at scbus0 target 3 lun 0 (pass3,da3)
> > at scbus1 target 0 lun 0 (da5,pass5)
> > at scbus1 target 1 lun 0 (da6,pass6)
> > at scbus1 target 2 lun 0 (da4,pass4)
> >
> > [ /sbin/iscontrol -n data1 ]
> > da4 at iscsi0 bus 0 target 0 lun 0
> > da4: Fixed Direct Access SCSI-3 device
> >
> > [ /sbin/iscontrol -n data2 ]
> > (da4:iscsi0:0:0:0): lost device
> > (da4:iscsi0:0:0:0): removing device entry
> > da4 at iscsi0 bus 0 target 0 lun 0
> > da4: Fixed Direct Access SCSI-3 device
> > da5 at iscsi0 bus 0 target 1 lun 0
> > da5: Fixed Direct Access SCSI-3 device
> >
> > [ /sbin/iscontrol -n data3 ]
> > (da4:iscsi0:0:0:0): lost device
> > (da4:iscsi0:0:0:0): removing device entry
> > (da5:iscsi0:0:1:0): lost device
> > (da5:iscsi0:0:1:0): removing device entry
> > da4 at iscsi0 bus 0 target 2 lun 0
> > da4: Fixed Direct Access SCSI-3 device
> > da5 at iscsi0 bus 0 target 0 lun 0
> > da5: Fixed Direct Access SCSI-3 device
> > da6 at iscsi0 bus 0 target 1 lun 0
> > da6: Fixed Direct Access SCSI-3 device
> >
> > It would appear that rather than appending the new device to the end of
> > the "da" devices, it reshuffles the device naming after the second
> > device. If I am to use these devices in any type of automated setup, how
> > can I make sure that, after these commands, "da6" will always be
> > target 1 (i.e. /dev/da2 on the slave machine)?
> >
> > Next, there is no "startup" script for iscontrol - would that simply
> > have to be added to the system, or is there a way with sysctl that it
> > could be done? The plan here is to use gmirror such that /dev/da1 on A
> > is mirrored with /dev/da1 on B using iscsi.
> Hi Sven,
> I just tried it here, and it seems that at the end all is ok :-)
> I think the lost/removing/found has something to do with iscontrol calling
> camcontrol rescan - I will check this later, but the end result is that
> you should have all /dev/da's.
> I don't see any reasonably safe way to tie down a scsi# (/dev/dan),
> except to label (see glabel) the disk.
> The startup script is, at the moment, not trivial, but I'm attaching
> it, so someone can suggest improvements :-)
>
> #!/bin/sh
>
> # PROVIDE: iscsi
> # REQUIRE: NETWORKING
> # BEFORE: DAEMON
> # KEYWORD: nojail shutdown
>
> #
> # Add the following lines to /etc/rc.conf to enable iscsi:
> #
> # iscsi_enable="YES"
> # iscsi_fstab="/etc/fstab.iscsi"
>
> . /etc/rc.subr
> . /cs/share/etc/rc.subr
>
> name=iscsi
> rcvar=`set_rcvar`
>
> command=/sbin/iscontrol
>
> iscsi_enable=${iscsi_enable:-"NO"}
> iscsi_fstab=${iscsi_fstab:-"/etc/fstab.iscsi"}
> iscsi_exports=${iscsi_exports:-"/etc/exports.iscsi"}
> iscsi_debug=${iscsi_debug:-0}
> start_cmd="iscsi_start"
> faststop_cmp
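The attachment is cut off mid-line in the archive. Purely as a sketch of where a script of this shape would typically be heading - the function bodies below are assumptions, not recovered text from the attachment:

#!/bin/sh
# Hypothetical continuation of the truncated rc.d script above.
stop_cmd="iscsi_stop"

iscsi_start()
{
	# One iscontrol session per stanza in /etc/iscsi.conf (the
	# session names here are examples), then mount the iSCSI-backed
	# filesystems listed in the private fstab.
	for session in data1 data2 data3; do
		${command} -n ${session}
	done
	[ -r "${iscsi_fstab}" ] && mount -a -F "${iscsi_fstab}"
}

iscsi_stop()
{
	# Unmount everything in the private fstab (deepest mountpoints
	# first) before the sessions disappear underneath it.
	if [ -r "${iscsi_fstab}" ]; then
		awk '$1 !~ /^#/ { print $2 }' "${iscsi_fstab}" | \
		    sort -r | while read mp; do
			umount "${mp}"
		done
	fi
}

load_rc_config $name
run_rc_command "$1"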
Multi-machine mirroring choices
With the introduction of zfs to FreeBSD 7.0, a door has opened for more mirroring options, so I would like to get some opinions on what direction I should take for the following scenario.

Basically I have 2 machines that are "clones" of each other (master and slave) wherein one will be serving up samba shares. Each server has one disk to hold the OS (not mirrored) and then 3 disks, each of which will be its own mountpoint and samba share. The idea is to create a mirror of each of these disks on the slave machine so that in the event the master goes down, the slave can pick up serving the samba shares (I am using CARP for the samba server IP address).

My initial thought was to have the slave set up as an iscsi target and then have the master connect to each drive, then create a gmirror or zpool mirror using local_data1:iscsi_data1, local_data2:iscsi_data2, and local_data3:iscsi_data3. After some feedback (P. French for example) it would appear as though iscsi may not be the way to go for this, as it locks up when the target goes down, and even though I may be able to remove the target from the mirror, that process may fail as the "disk" remains in "D" state. So that leaves me with the following options:

1) ggated/ggatec + gmirror
2) ggated/ggatec + zfs (zpool mirror)
3) zfs send/recv incremental snapshots (ssh)

1) I have been using ggated/ggatec on a set of 6.2-REL boxes and find that ggated tends to fail after some time, leaving me rebuilding the mirror periodically (and gmirror resilvering takes quite some time). Has ggated/ggatec performance and stability improved in 7.0? This combination does work, but it is high maintenance, and automating it is a bit painful (in terms of re-establishing the gmirror, rebuilding, and making sure the master machine is the one being read from).

2) Noting the issues with ggated/ggatec in (1), would a zpool be better at rebuilding the mirror? I understand that it can better determine which drive of the mirror is out of sync than gmirror can, so a lot of the "insert"/"rebuild" manipulations used with gmirror would not be needed here.

3) The send/recv feature of zfs was something I had not even considered until very recently. My understanding is that this would work by a) taking a snapshot of master_data1, b) zfs sending that snapshot to slave_data1, c) via ssh on a pipe, receiving that snapshot on slave_data1, and then d) doing incremental snapshots, sending and receiving as in (a)(b)(c); a command sketch follows this post. How time/cpu intensive is the snapshot generation, and just how granular could this be done? I would imagine for systems with little traffic/changes this could be practical, but what about systems that may see a lot of files added, modified, and deleted on the filesystem(s)?

I would be interested to hear anyone's experience with any (or all) of these methods and the caveats of each. I am leaning towards ggate(dc) + zpool at the moment, assuming that zfs can "smartly" rebuild the mirror after the slave's ggated processes bug out.

Sven
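For concreteness, steps (a) through (d) of option 3 map onto commands roughly like the following. Dataset, snapshot, and host names are placeholders, and the -F on the receive (which rolls the target back to its last received snapshot) is an assumption about how the slave copy is used:

# (a) snapshot the master dataset
zfs snapshot master_data1@t1

# (b)(c) send the full stream to the slave over ssh
zfs send master_data1@t1 | ssh slave zfs receive -F slave_data1

# (d) later rounds snapshot again and send only the delta since t1
zfs snapshot master_data1@t2
zfs send -i master_data1@t1 master_data1@t2 | ssh slave zfs receive -F slave_data1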
Re: Multi-machine mirroring choices
On Tue, 2008-07-15 at 07:54 -0700, Jeremy Chadwick wrote:
> On Tue, Jul 15, 2008 at 10:07:14AM -0400, Sven Willenberger wrote:
> > 3) The send/recv feature of zfs was something I had not even considered
> > until very recently. My understanding is that this would work by a)
> > taking a snapshot of master_data1, b) zfs sending that snapshot to
> > slave_data1, c) via ssh on a pipe, receiving that snapshot on
> > slave_data1, and then d) doing incremental snapshots, sending and
> > receiving as in (a)(b)(c). How time/cpu intensive is the snapshot
> > generation, and just how granular could this be done? I would imagine
> > for systems with little traffic/changes this could be practical, but
> > what about systems that may see a lot of files added, modified, and
> > deleted on the filesystem(s)?
>
> I can speak a bit on ZFS snapshots, because I've used them in the past
> with good results.
>
> Compared to UFS2 snapshots (e.g. dump -L or mksnap_ffs), ZFS snapshots
> are fantastic. The two main positives for me were:
>
> 1) ZFS snapshots take significantly less time to create; I'm talking
> seconds or minutes vs. 30-45 minutes. I also remember receiving mail
> from someone (on -hackers? I can't remember -- let me know and I can
> dig through my mail archives for the specific mail/details) stating
> something along the lines of "over time, yes, UFS2 snapshots take
> longer and longer, it's a known design problem".
>
> 2) ZFS snapshots, when created, do not cause the system to more or less
> deadlock until the snapshot is generated; you can continue to use the
> system during the time the snapshot is being generated. While with
> UFS2, dump -L and mksnap_ffs will surely disappoint you.
>
> We moved all of our production systems off of using dump/restore solely
> because of these aspects. We didn't move to ZFS though; we went with
> rsync, which is great, except for the fact that it modifies file atimes
> (hope you use Maildir and not classic mbox/mail spools...).
>
> ZFS's send/recv capability (over a network) is something I didn't have
> time to experiment with, but it looked *very* promising. The method is
> documented in the manpage as "Example 12", and is very simple -- as it
> should be. You don't have to use SSH either, by the way[1].

The examples do list ssh as the way of initiating the receiving end; I am curious as to what the alternative would be (short of installing openssh-portable and using cipher=no).

> One of the "annoyances" to ZFS snapshots, however, was that I had to
> write my own script to do snapshot rotations (think incremental dump(8)
> but using ZFS snapshots).

That is what I was afraid of. Using snapshots would seem to involve a bit of housekeeping. Furthermore, it sounds more suited to a system that needs periodic rather than constant backing up (syncing).

> > I would be interested to hear anyone's experience with any (or all) of
> > these methods and the caveats of each. I am leaning towards ggate(dc) +
> > zpool at the moment, assuming that zfs can "smartly" rebuild the mirror
> > after the slave's ggated processes bug out.
>
> I don't have any experience with GEOM gate, so I can't comment on it.
> But I would highly recommend you discuss the shortcomings with pjd@,
> because he definitely listens.
>
> However, I must ask you this: why are you doing things the way you are?
> Why are you using the equivalent of RAID 1 but for entire computers? Is
> there some reason you aren't using a filer (e.g. NetApp) for your data,
> thus keeping it centralised?
> There has been recent discussion of using FreeBSD with ZFS as such, over
> on freebsd-fs. If you want a link to the thread, I can point you to it.

Basically I am trying to eliminate the "single point of failure". The project prior to this had such a failure that even a raid5 setup could not get out of. It was determined at that point that a single-machine storage solution would no longer suffice. What I am trying to achieve is having a slave machine that could take over as the file server in the event the master machine goes down. This could be anything from the master's network connection going down (CARP to the rescue on the slave) to a complete failure of the master.

While zfs send/recv sounds like a good option for periodic backups, I don't think it will fit my purpose; zpool or gmirror will be a better fit. I see in posts following my initial post that there is reference to improvements in ggate[cd] and/or tcp since 6.2 (and I have moved to 7.0 now), so that bodes well. The question then becomes a matter of which system would be easier to manage in terms of a) the master rebuilding the mirr
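On the question of initiating the receive without ssh: nothing in this thread spells it out, but a commonly cited approach (an assumption here, and suitable only on a trusted network, since the stream is neither authenticated nor encrypted) is to pipe the stream through nc(1):

# on the receiving (slave) side, listen first
nc -l 3333 | zfs receive -F slave_data1

# on the sending (master) side
zfs snapshot master_data1@t1
zfs send master_data1@t1 | nc slave 3333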
Re: Multi-machine mirroring choices
On Tue, 2008-07-15 at 16:20 +0100, Pete French wrote:
> > However, I must ask you this: why are you doing things the way you are?
> > Why are you using the equivalent of RAID 1 but for entire computers? Is
> > there some reason you aren't using a filer (e.g. NetApp) for your data,
> > thus keeping it centralised?
>
> I am not the original poster, but I am doing something very similar and
> can answer that question for you. Some people get paranoid about the
> whole "single point of failure" thing. I originally suggested that we buy
> a filer and have identical servers so if one breaks we connect the other
> to the filer, but the response I got was "what if the filer breaks?". So
> in the end I had to show we have duplicate independent machines, with the
> data kept symmetrical on them at all times.
>
> It does actually work quite nicely - I have an "active" database
> machine and a "passive". The passive is only used if the active fails,
> and the drives are run as a gmirror pair with the remote one being mounted
> using ggated. It also means I can flip from active to passive when I want
> to do an OS upgrade on the active machine. Switching takes a few seconds,
> and this is fine for our setup.
>
> So the answer is that the decision was taken out of my hands - but this
> is not uncommon, and as a roll-your-own cluster it works very nicely.
>
> -pete.

I have for now gone with using ggate[cd] along with zpool, and so far it's not bad. I can fail the master and stop ggated on the slave, at which point geom reads the glabeled disks. From there I can zpool import to an alternate root. When the master comes back up I can zpool export and then, on the master, zpool import, at which point zfs handles the resilvering; the sequence is sketched below.

The *big* issue I have right now is dealing with the slave machine going down. Once the master no longer has a connection to the ggated devices, all processes trying to use the device hang in D status. I have tried pkill'ing ggatec to no avail, and ggatec destroy returns a message of gctl being busy. Trying ggatec destroy -f panics the machine. Does anyone know how to successfully time out a failed ggatec connection so that I can zpool detach or somehow have zfs remove the unavailable drive?

Sven
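For reference, the failover/failback dance described above, written out as commands; the device, label, and pool names are illustrative, not taken from the thread:

# master has failed -- on the slave:
killall ggated                  # stop exporting; geom tastes the glabel'd disks
zpool import -R /altroot data   # pool comes up on the local halves

# master is back -- on the slave:
zpool export data
ggated                          # resume exporting the disks

# ... and on the master:
ggatec create -u 0 slave /dev/label/data1   # re-attach each remote half
zpool import data               # zfs resilvers the stale devices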
CARP state changes and devd.conf
I see mention of CARP as a device-type in the devd.conf documentation, but for the life of me I cannot manage to get devd to recognize *any* changes in the CARP interface. I have set sysctl net.inet.carp.log=2 and I see messages in /var/log/messages when the interface goes INIT -> BACKUP and BACKUP -> MASTER, but I cannot get devd to "see" these changes. I have tried something even as simple as:

notify 100 {
	action "logger -p kern.notice '$device-name interface has changed'";
};

and then bringing the CARP interfaces up and down on either box to change INIT and BACKUP/MASTER states, but *nothing* is noted. Does CARP simply not work that way with devd (i.e. only the creation of the CARP device, not any subsequent state changes, is seen)?

Sven
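Assuming devd really never sees these transitions (which matches the behaviour described, with the state changes showing up only in syslog), a blunt workaround is to poll the interface and synthesize the notification yourself. A sketch, where the carp0 name and the 5-second interval are assumptions:

#!/bin/sh
# Poll a carp interface and log/act on state changes devd never reports.
prev=""
while :; do
	state=$(ifconfig carp0 | awk '/carp:/ { print $2 }')
	if [ "${state}" != "${prev}" ]; then
		logger -p kern.notice "carp0: ${prev:-unknown} -> ${state}"
		# hook failover actions in here
		prev="${state}"
	fi
	sleep 5
done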
Re: filesystem full error with inumber
Feargal Reilly presumably uttered the following on 07/24/06 11:48:
> On Mon, 24 Jul 2006 17:14:27 +0200 (CEST)
> Oliver Fromme <[EMAIL PROTECTED]> wrote:
>
>> Nobody else has answered so far, so I try to give it a shot ...
>>
>> The "filesystem full" error can happen in three cases:
>> 1. The file system is running out of data space.
>> 2. The file system is running out of inodes.
>> 3. The file system is running out of non-fragmented blocks.
>>
>> The third case can only happen on extremely fragmented
>> file systems, which happens very rarely, but maybe it's
>> a possible cause of your problem.
>
> I rebooted that server, and df then reported that disk at 108%,
> so it appears that df was reporting incorrect figures prior to
> the reboot. Having cleaned up, it appears by my best
> calculations to be showing correct figures now.
>
>> > kern.maxfiles: 2
>> > kern.openfiles: 3582
>>
>> Those have nothing to do with "filesystem full".
>
> Yeah, that's what I figured.
>
>> > Looking again at dumpfs, it appears to say that this is
>> > formatted with a block size of 8K, and a fragment size of
>> > 2K, but tuning(7) says: [...]
>> > Reading this makes me think that when this server was
>> > installed, the block size was dropped from the 16K default
>> > to 8K for performance reasons, but the fragment size was
>> > not modified accordingly.
>> >
>> > Would this be the root of my problem?
>>
>> I think a bsize/fsize ratio of 4/1 _should_ work, but it's
>> not widely used, so there might be bugs hidden somewhere.
>
> Such as df not reporting the actual data usage, which is now my
> best working theory. I don't know what df bases its figures on;
> perhaps it either slowly got out of sync, or more likely, got
> things wrong once the disk filled up.
>
> I'll monitor it to see if this happens again, but hopefully
> won't keep that configuration around for too much longer anyway.
>
> Thanks,
> -fr.

One of my machines that I recently upgraded to 6.1 (6.1-RELEASE-p3) is also exhibiting df reporting wrong data usage numbers. Notice the negative "Used" numbers below:

> df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/da0s1a    496M     63M    393M    14%    /
devfs          1.0K    1.0K      0B   100%    /dev
/dev/da0s1e    989M   -132M    1.0G   -14%    /tmp
/dev/da0s1f     15G    478M     14G     3%    /usr
/dev/da0s1d     15G   -1.0G     14G    -8%    /var
/dev/md0       496M    228K    456M     0%    /var/spool/MIMEDefang
devfs          1.0K    1.0K      0B   100%    /var/named/dev

Sven
Re: filesystem full error with inumber
Peter Jeremy presumably uttered the following on 07/26/06 15:00:
> On Wed, 2006-Jul-26 13:07:19 -0400, Sven Willenberger wrote:
>> One of my machines that I recently upgraded to 6.1 (6.1-RELEASE-p3) is
>> also exhibiting df reporting wrong data usage numbers.
>
> What did you upgrade from?
> Is this UFS1 or UFS2?
> Does a full fsck fix the problem?

This was an upgrade from a 5.x system (UFS2); a full fsck did in fact fix the problem (for now).

Thanks,

Sven
Megacli fails to find SAS adapter
FreeBSD 6.2-PRERELEASE #3: Tue Oct 10 13:58:29 EDT 2006
LSI 8480E SAS RAID card

mount:
linprocfs on /compat/linux/proc (linprocfs, local)
linsysfs on /compat/linux/sys (linsysfs, local)
/dev/mfid0s1d on /usr/local/pgsql (ufs, local, noatime)

dmesg:
mfi0: 2025 - PCI 0x041000 0x04411 0x041000 0x041002: Firmware initialization started (PCI ID 0411/1000/1002/1000)
mfi0: 2026 - Type 18: Firmware version 1.00.00-0074
mfi0: 2027 - Battery temperature is normal
mfi0: 2028 - Battery Present
mfi0: 2029 - PD 39(e1/s255) event: Enclosure (SES) discovered on PD 27(e1/s255)
mfi0: 2030 - PD 56(e2/s255) event: Enclosure (SES) discovered on PD 38(e2/s255)
mfi0: 2031 - PD 39(e1/s255) event: Inserted: PD 27(e1/s255)
mfi0: 2032 - Type 29: Inserted: PD 27(e1/s255) Info: enclPd=27, scsiType=d, portMap=10, sasAddr=50015b2180001839,
mfi0: 2033 - PD 56(e2/s255) event: Inserted: PD 38(e2/s255)

pkg_info:
linux_base-fc-4_9

I have downloaded MegaCli and, using rpm2cpio, extracted MegaCli-1.01.09-0.i386.rpm into my home directory:

~/usr/sbin/MegaCli
brandelf -t Linux usr/sbin/MegaCli
cd usr/sbin

# ./MegaCli -EncInfo -aALL
ERROR:Could not detect controller.
# ./MegaCli -CfgDsply -aALL
ERROR:Could not detect controller.

Do I actually need to set up the links in /compat/linux/sys for the SAS RAID card? Or should this rpm be installed into the /compat/linux directory? I need to upgrade the firmware on this card, as for some reason the webbios will not let me configure a RAID 10 array, and the only way I can see to upgrade the firmware is to use the MegaCli utility.

Thanks,

Sven
Re: Megacli fails to find SAS adapter
On Tue, 2006-10-10 at 22:11 -0700, Doug Ambrisko wrote:
> Sven Willenberger writes:
> | FreeBSD 6.2-PRERELEASE #3: Tue Oct 10 13:58:29 EDT 2006
> | LSI 8480E SAS RAID card
> | [configuration, dmesg, and MegaCli output quoted in full above]
> |
> | Do I actually need to set up the links in /compat/linux/sys for the SAS
> | raid card? or should this rpm be installed into the /compat/linux
> | directory? I need to upgrade the firmware on this card as for some
> | reason the webbios will not let me configure a Raid10 array and the only
> | way I can see to upgrade the fw is to use the megacli utility.
>
> Make sure you have the Linux ioctl module loaded before linsysfs so it
> can register the hooks. kldstat/kernel config will help. One sanity
> check is to do:
> dhcp194:ambrisko 11] cat /compat/linux/sys/class/scsi_host/host*/proc_name
> megaraid_sas
> (null)
> dhcp194:ambrisko 12]
>
> If you don't see megaraid_sas then it isn't going to work and is
> missing the linux mfi module. Also you need to set:
> sysctl compat.linux.osrelease=2.6.12
> or things won't work well. This will probably break your fc-4_9 Linux
> install until the updates to Linux emulation are merged (maybe it
> has but I don't think so). Since it is a static binary we don't have
> linux base installed.
>
> Doug A.

Adding mfi_linux_enable="YES" to /boot/loader.conf did do the trick of having the device added to the system:

# cat /compat/linux/sys/class/scsi_host/host*/proc_name
(null)
megaraid_sas
(null)

# sysctl compat.linux
compat.linux.oss_version: 198144
compat.linux.osrelease: 2.6.12
compat.linux.osname: Linux

Although the MegaCli utility no longer complains about not finding a controller, it sadly does nothing else either (except dump core on certain commands):

# ./MegaCli -AdpAllinfo -a0
# ./MegaCli -AdpGetProp SpinupDriveCount -a0
Segmentation fault (core dumped)
# ./MegaCli -LDGetNum -a0
Failed to get VD count on adapter -9993.
# ./MegaCli -CfgFreeSpaceinfo -a0
Failed to initialize RM

and so on ... I am guessing this is an issue with the MegaCli software now; needless to say I certainly doubt that this will allow me to flash the card BIOS (or even if it *could*, I would be leery of the process).
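Collected in one place, the pieces from this thread that got the controller detected (everything here is taken from the messages above):

# /boot/loader.conf -- register the linux ioctl hooks before linsysfs
mfi_linux_enable="YES"

# runtime sanity checks
sysctl compat.linux.osrelease=2.6.12
cat /compat/linux/sys/class/scsi_host/host*/proc_name   # expect megaraid_sas

# mark the linux binary and run it
brandelf -t Linux ./MegaCli
./MegaCli -EncInfo -aALL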
Re: [HEADS UP] perl symlinks in /usr/bin will be gone
Anton Berezin wrote:
> In order to keep pkg-install simple, no old symlink chasing and removal
> will be done, although the detailed instructions will be posted in
> ports/UPDATING and in pkg-message for the ports.

How about leaving it up to the installer? Much like the minicom port prompts the user if they would like to symlink a /dev/modem device, why not ask (post-install) "Would you like to make a symlink in /usr/bin to your new installation?" or, as someone else has suggested, add a make flag (make ADD_SYMLINK=yes). Those who wish to have an unpolluted /usr/bin can decline the symlink; those who want compatibility with the majority of scripts already written can have the link created.

Just a thought,

Sven
Re: need ISO-image for a new machine install
On Thu, 2005-03-17 at 20:13 -0500, Mikhail Teterin wrote:
> Hello!
>
> Is there a place from where I can download a reasonably fresh
> 5.4-PRERELEASE install (or mini-install) .iso image for amd64?
>
> Thanks!
>
> -mi

I saw a post here a little while ago that pointed to:

ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/Feb_2005/5.3-STABLE-SNAP001-amd64-miniinst.iso

I used this on a dual opteron system with 8GB of RAM with no problem (i.e. the >4G RAM issue was resolved in this snapshot). Upgrading to 5.4-PRE is straightforward from there.

Sven
Creating a striped set of mirrors using gvinum
I am hoping someone has found a way to create this type of raid set using [g]vinum. I see that it is a trivial matter to create a mirror of 2 striped sets, but I have not seen a way to create a striped set out of multiple mirrored sets (e.g. a stripe across 3 sets of mirrors). Has anyone managed to implement this and, if so, what does your configuration file look like? If not, could this be added as a feature request for gvinum?

Sven Willenberger
Re: kern/79035: gvinum unable to create a striped set of mirrored sets/plexes
On Sun, 2005-03-20 at 15:51 +1030, Greg 'groggy' Lehey wrote:
> On Saturday, 19 March 2005 at 23:43:00 -0500, Sven Willenberger wrote:
> > Greg 'groggy' Lehey presumably uttered the following on 03/19/05 22:11:
> >> On Sunday, 20 March 2005 at 2:04:34 +0000, Sven Willenberger wrote:
> >>
> >>> Under the current implementation of gvinum it is possible to create
> >>> a mirrored set of striped plexes but not a striped set of mirrored
> >>> plexes. For purposes of resiliency the latter configuration is
> >>> preferred, as illustrated by the following example:
> >>>
> >>> Use 6 disks to create one of 2 different scenarios.
> >>>
> >>> 1) Using the current abilities of gvinum, create 2 striped sets using
> >>> 3 disks each: A1 A2 A3 and B1 B2 B3, then create a mirror of those 2
> >>> sets such that A(123) mirrors B(123). In this situation if any drive
> >>> in set A fails, one still has a working set with set B. If any drive
> >>> now fails in set B, the system is shot.
> >>
> >> No, this is not correct. The plex ("set") only fails when all drives
> >> in it fail.
> >
> > I hope the following diagrams better illustrate what I was trying to
> > point out. Data is striped across all the A's and that is mirrored to
> > the B stripes:
> >
> > ...
> >
> > If A1 fails, then the A stripe set cannot function (much like in Raid 0,
> > one disk fails the set), meaning that B now is the array:
>
> No, this is not correct.
>
> >>> Thus the striping of mirrors (rather than a mirror of striped sets)
> >>> is a more resilient and fault-tolerant setup of a multi-disk array.
> >>
> >> No, you're misunderstanding the current implementation.
> >
> > Perhaps I am ... but unless gvinum somehow reconstructs a 3 disk stripe
> > into a 2 disk stripe in the event one disk fails, I am not sure how.
>
> Well, you have the source code. It's not quite the way you look at
> it. It doesn't have stripes: it has plexes. And they can be
> incomplete. If a read to a plex hits a "hole", it automatically
> retries via (possibly all) the other plexes. Only when all plexes
> have a hole in the same place does the transfer fail.
>
> You might like to (re)read http://www.vinumvm.org/vinum/intro.html.

I was really hoping that the "holes in the plex" behavior was going to work, but my tests have shown otherwise. I created a gvinum array consisting of (A striped B) mirror (C striped D), which is the only such mirror/stripe combination allowed by gvinum for four drives. We have:

 _________
|  A   B  |----+
|_________|    |
               |-- Mirror
 _________     |
|  C   D  |----+
|_________|

Based on what the "plex hole" theory states, drive A and drive D could both fail and the system would read through the holes and pick up data from B and C (or the converse if B and C failed), functionally equivalent to a stripe of mirrors. To fail a drive I rebooted single-user, dd'd /dev/zero to the beginning of the disk, and then ran fdisk.
The configuration:

drive d device /dev/da4s1h
drive c device /dev/da3s1h
drive b device /dev/da2s1h
drive a device /dev/da1s1h
volume home
plex name home.p1 org striped 960s vol home
plex name home.p0 org striped 960s vol home
sd name home.p1.s1 drive d len 71681280s driveoffset 265s plex home.p1 plexoffset 960s
sd name home.p1.s0 drive c len 71681280s driveoffset 265s plex home.p1 plexoffset 0s
sd name home.p0.s1 drive b len 71681280s driveoffset 265s plex home.p0 plexoffset 960s
sd name home.p0.s0 drive a len 71681280s driveoffset 265s plex home.p0 plexoffset 0s

In my case:

                         Fail B    Fail B and C
A = /dev/da1s1h          up        up
B = /dev/da2s1h          down      down
C = /dev/da3s1h          up        down
D = /dev/da4s1h          up        up

1 Volume:
V home                   up        down (!)

2 Plexes:
P home.p0 (A and B)      down      down
P home.p1 (C and D)      up        down

4 Subdisks:
S home.p0.s0 (A)         up        up
S home.p0.s1 (B)         down      down
S home.p1.s0 (C)         up        down
S home.p1.s1 (D)         up        up

Based on this, failing the one drive did in fact fail the plex (home.p0). Although at that point I realized that failing either drive on the other plex would also fail that plex and also the volume, I went ahead and failed drive C as well. The result was a failed volume.

With the failed B drive, once I bsdlabeled the disk to include the vinum slice, I got the message that the plex was now stale (instead of down). A simple gvinum start home changed the state to degraded and the system rebuilt the array. When both drives failed I had to work a bit of a kludge in. I gvinum setstate -f up home.p1.s0, the
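For reference, the single-failure recovery described above as a command sequence; the bsdlabel step is abbreviated, and the names come from the config above:

# replacement/zeroed disk is back: restore its vinum partition
bsdlabel -e da2s1    # recreate the 'h' partition with fstype vinum
gvinum list          # the subdisk now shows 'stale' rather than 'down'
gvinum start home    # plex goes degraded while the rebuild runs
gvinum list          # watch the resync complete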
Re: 5.3-S (Mar 6) softdep stack backtrace from getdirtybuf()... problem?
Brandon S. Allbery KF8NH presumably uttered the following on 04/10/05 15:16:
> I have twice so far had the kernel syslog a stack backtrace with no other
> information. Inspection of the kernel source, to the best of my limited
> understanding, suggests that getdirtybuf() was handed a buffer without an
> associated vnode. Kernel config file and make.conf attached. Should I be
> concerned? Note that this system is an older 600MHz Athlon with only
> 256MB RAM, and both times this triggered it was thrashing quite a bit
> (that's more or less its usual state...).
>
> KDB: stack backtrace:
> kdb_backtrace(c06fbf78,2,c63ca26c,0,22) at kdb_backtrace+0x2e
> getdirtybuf(d3196bac,0,1,c63ca26c,1) at getdirtybuf+0x2b
> flush_deplist(c1a8544c,1,d3196bd4,d3196bd8,0) at flush_deplist+0x49
> flush_inodedep_deps(c11eb800,5858f,c1ea723c,d3196c34,c052952f) at flush_inodedep_deps+0x9e
> softdep_sync_metadata(d3196ca4,c1ea7210,50,c06c9a19,0) at softdep_sync_metadata+0x9d
> ffs_fsync(d3196ca4,0,0,0,0) at ffs_fsync+0x487
> fsync(c1b367d0,d3196d14,4,c10f9700,0) at fsync+0x196
> syscall(2f,2f,2f,8327600,5e) at syscall+0x300
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (95, FreeBSD ELF32, fsync), eip = 0x29152d6f, esp = 0xbf5a8d5c, ebp = 0xbf5a8d78 ---
>
> FreeBSD rushlight.kf8nh.com 5.3-STABLE FreeBSD 5.3-STABLE #0: Sun Mar 6 02:56:16 EST 2005 [EMAIL PROTECTED]:/usr/src/sys/i386/compile/RUSHLIGHT i386

I used to see this on a regular basis on several machines I had running early 5 through 5.2 releases, and it seemed to have gone away (for me) with the 5.3 release(s). I never did hear of a definitive resolution for this issue; your backtrace is alarmingly similar to the one that I had seen:

http://lists.freebsd.org/pipermail/freebsd-current/2004-July/031576.html

Sven
Re: SuperMicro X5DP8-G2MB/(2)XEON 2.4/1GB RAM 5.4-S Freeze
Aaron Summers presumably uttered the following on 04/11/05 22:12:
> Greetings,
>
> We have a SuperMicro X5DP8-G2 motherboard, 2x XEON 2.4, 1GB RAM server
> running 5.4-STABLE that keeps freezing up. We have replaced RAM, HD, SCSI
> controller, etc., to no avail. We are running an SMP GENERIC kernel. I
> cannot get the system to panic, leave a core dump, etc. It just always
> freezes. The server functions as a web server in an HSphere cluster. I am
> about out of options besides loading 4.11 (since our 4-series servers
> never die). Any help, feedback, clues, similar experiences, etc. would be
> greatly appreciated.
>
> On SCSI: The onboard Adaptec 7902 gives a dump on bootup but appears to
> work. I read the archived post about this issue. The system still locked
> up with an Adaptec 7982B that did not give this message.
>
> DMESG:
> da2 at ahd0 bus 0 target 4 lun 0
> da2: Fixed Direct Access SCSI-3 device
> da2: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged Queueing Enabled
> da2: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
> da0 at ahd0 bus 0 target 0 lun 0
> da0: Fixed Direct Access SCSI-3 device
> da0: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged Queueing Enabled
> da0: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
> da1 at ahd0 bus 0 target 2 lun 0
> da1: Fixed Direct Access SCSI-3 device
> da1: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged Queueing Enabled
> da1: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
> Mounting root from ufs:/dev/da0s1a

We had many issues with Seagate drives and S/M boards with the onboard Adaptec scsi controllers. Seagate offered no help other than to suggest putting in a network card (in lieu of the onboard) and/or disabling SMP; neither solution was acceptable, so we switched to IBM/Hitachi drives and the problems disappeared. By the way, the problems manifested themselves in those servers where we had more than just one hard drive installed. This was even after updating to the latest firmware etc.; Seagate insists there is no problem with their drives, although other drives work perfectly well. YMMV.

I do see you say you tried other hard drives ... which ones did you use?

Sven
Re: scsi card recommendation
On Wed, 2005-04-13 at 09:58 +0700, Dikshie wrote:
> dear all,
> I would like to buy a SCSI card which must:
> - support Ultra 320
> - support RAID 0, 1, 5, and 1/0
> any recommendation for FreeBSD-5.x ?
>
> thanks !
>
> -dikshie-

We find the LSI MegaRAID 320-2X series works great (using it on a dual opteron system), especially with the battery-backed cache ... can be picked up for just under $1k.

Sven
Re: panic in nfsd on 6.2-RC1
On Tue, 2006-12-05 at 12:38 +0900, Hiroki Sato wrote:
> Kostik Belousov <[EMAIL PROTECTED]> wrote
> in <[EMAIL PROTECTED]>:
>
> ko> What version of sys/nfsserver/nfs_serv.c do you use ? If it is older than
> ko> 1.156.2.7, please, update the system.
>
> Thanks, I updated it just now and will see how it works.
>
> --
> | Hiroki SATO

I was/am having the same issue. Updating world (6.2-stable) to include the above update sadly did not fix the problem for me. This is an amd64 box with only one client connecting to it via nfs. Reading further, it may be an issue with rpc.statd and/or rpc.lockd. As I only have one client connecting, and it is being used as mail storage (i.e. the client pops/imaps the storage), would it be safe to not forward fcntl locks over the wire? Is this same issue present in 6.1-RELENG? I am really at my wits' end at this point and for the first time am actually considering moving to another OS (solaris more than likely), as I cannot have these types of issues interrupting services every couple of days.

What other information (specifically) can I provide to help the devs figure out what is going on? What can I do in the meantime to have some semblance of stability? I assume downgrading to 5.5-RELENG is out of the question, but perhaps disabling SMP?

Sven
Re: panic in nfsd on 6.2-RC1
On Fri, 2006-12-15 at 13:15 -0500, Kris Kennaway wrote:
> On Fri, Dec 15, 2006 at 10:01:19AM -0500, Sven Willenberger wrote:
> > [previous message quoted in full above]
>
> Just to confirm, can you please post the panic backtrace you are
> seeing? And can you explain what you mean by "may seem to be an issue
> with rpc.statd and/or rpc.lockd"?
>
> Sometimes people think they're seeing the same problem as someone else
> when really it's a completely different problem in the same subsystem,
> so I'd like to rule that out here.
>
> Kris

Well, I have now added kdb and invariants/witness support to the kernel, so I should be able to get some backtrace the next time it happens. Currently the system just locks, and no error is displayed on the console or in /var/log/messages; sorry I cannot be of immediate help there. Regarding the rpc issue, I just ran across mention of those in sshfs/nfs threads appearing here, and in particular a link referenced within one of them (http://docs.freebsd.org/cgi/getmsg.cgi?fetch=1362611+0+archive/2006/freebsd-stable/20060702.freebsd-stable) - it is more than likely not at all related, but I am grasping at straws here trying to solve this.

FWIW, I do see the following appearing in /var/log/messages:

ufs_rename: fvp == tvp (can't happen)

about once or twice a day, but cannot correlate those to the lockups. Now that I have enabled the options mentioned above in the kernel, I am seeing some LOR issues:

kernel: lock order reversal:
kernel: 1st 0xff00c3bab200 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1547
kernel: 2nd 0xff0005bb6078 struct mount mtx (struct mount mtx) @ /usr/src/sys/ufs/ufs/ufs_vnops.c:138
Re: Not panic in nfsd (Re: panic in nfsd on 6.2-RC1)
On Fri, 2006-12-15 at 23:20 +0200, Kostik Belousov wrote:
> On Fri, Dec 15, 2006 at 02:29:58PM -0500, Kris Kennaway wrote:

[...]

> > > FWIW, I do see the following appearing in the /var/log/messages:
> > > ufs_rename: fvp == tvp (can't happen)
> > > about once or twice a day, but cannot correlate those to lockup. Now
> > > that I have enabled the options mentioned above in the kernel, I am
> > > seeing some LOR issues:
> > >
> > > kernel: lock order reversal:
> > > kernel: 1st 0xff00c3bab200 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1547
> > > kernel: 2nd 0xff0005bb6078 struct mount mtx (struct mount mtx) @ /usr/src/sys/ufs/ufs/ufs_vnops.c:138
> >
> > OK, this is interesting, so let's proceed from here.
> >
> > Kris
>
> Try this.
>
> Index: ufs/ufs/ufs_vnops.c
> ===================================================================
> RCS file: /usr/local/arch/ncvs/src/sys/ufs/ufs/ufs_vnops.c,v
> retrieving revision 1.283
> diff -u -r1.283 ufs_vnops.c
> --- ufs/ufs/ufs_vnops.c	6 Nov 2006 13:42:09 -0000	1.283
> +++ ufs/ufs/ufs_vnops.c	15 Dec 2006 21:19:51 -0000
> @@ -133,19 +133,15 @@
>  {
>  	struct inode *ip;
>  	struct timespec ts;
> -	int mnt_locked;
>  
>  	ip = VTOI(vp);
> -	mnt_locked = 0;
> -	if ((vp->v_mount->mnt_flag & MNT_RDONLY) != 0) {
> -		VI_LOCK(vp);
> +	VI_LOCK(vp);
> +	if ((vp->v_mount->mnt_flag & MNT_RDONLY) != 0)
>  		goto out;
> +	if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0) {
> +		VI_UNLOCK(vp);
> +		return;
>  	}
> -	MNT_ILOCK(vp->v_mount);	/* For reading of mnt_kern_flags. */
> -	mnt_locked = 1;
> -	VI_LOCK(vp);
> -	if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0)
> -		goto out_unl;
>  
>  	if ((vp->v_type == VBLK || vp->v_type == VCHR) && !DOINGSOFTDEP(vp))
>  		ip->i_flag |= IN_LAZYMOD;
> @@ -172,10 +168,7 @@
>  
>   out:
>  	ip->i_flag &= ~(IN_ACCESS | IN_CHANGE | IN_UPDATE);
> - out_unl:
>  	VI_UNLOCK(vp);
> -	if (mnt_locked)
> -		MNT_IUNLOCK(vp->v_mount);
>  }
>
> /*

Patch applied cleanly (offset 6 lines); make buildworld, make kernel, reboot, make installworld, etc.

kernel: lock order reversal:
kernel: 1st 0xff00b9181800 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1547
kernel: 2nd 0xff00c16030d0 vnode interlock (vnode interlock) @ /usr/src/sys/ufs/ufs/ufs_vnops.c:132
Migrating vinum to gvinum
I have a 5.2.1 box that I want to upgrade to 5.5-RELENG, and in doing so need to upgrade/migrate the current vinum setup to gvinum. It is a simple vinum mirror (just 2 drives with one vinum slice each). Having done some googling on the matter, I really haven't found a definitive "best approach" to doing this. The choices would be:

1) make buildworld and make kernel; remove the vinum-specific entries in rc.conf and add geom_vinum_load="YES" to /boot/loader.conf; reboot and [optionally] run gvinum saveconfig (sketched below); or

2) clear the current vinum config (which would leave the data intact on each part of the mirror?); make buildworld, make kernel, and add the loader.conf line; then reboot, install world, and rebuild the gvinum device by creating a mirror with one disk and then adding the second disk.

Does anyone have experience with this migration process? Alternatively, has anyone converted a (g)vinum mirror into a gmirror setup?

Thanks,

Sven
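Option (1) condensed into commands; the start_vinum knob name is an assumption about what is in rc.conf, and the data should survive untouched since classic vinum and gvinum read the same on-disk configuration:

# after make buildworld / make kernel:
sed -i.bak '/start_vinum/d' /etc/rc.conf          # drop the old vinum startup
echo 'geom_vinum_load="YES"' >> /boot/loader.conf
shutdown -r now
# after the reboot (and installworld):
gvinum list          # the mirror should appear exactly as before
gvinum saveconfig    # optional; rewrites the on-disk config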
Re: Not panic in nfsd (Re: panic in nfsd on 6.2-RC1)
Sven Willenberger presumably uttered the following on 12/18/06 12:33:
> On Fri, 2006-12-15 at 23:20 +0200, Kostik Belousov wrote:
>> [patch quoted in full above]
>
> Patch applied cleanly (offset 6 lines); make buildworld, make kernel,
> reboot, make installworld, etc.
>
> kernel: lock order reversal:
> kernel: 1st 0xff00b9181800 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1547
> kernel: 2nd 0xff00c16030d0 vnode interlock (vnode interlock) @ /usr/src/sys/ufs/ufs/ufs_vnops.c:132

Having enabled witness and ddb, etc., I cannot get this LOR to trigger anymore, but the machine is still locking up. I finally managed to get a piece of what was appearing on the console, which is the following (copied by hand by an onsite tech, so there may be a typo here and there):

--cut--
bge_intr() at bge_intr+0x84a
ithread_loop() at ithread_loop+0x14c
fork_exit() at fork_exit+0xbb
fork_trampoline() at fork_trampoline+0xee
--- trap 0, rip=0, rsp=0xb371ad00, rbp=0 ---

Fatal trap 12: page fault while in kernel mode
cpuid=1, apic id=01
fault virtual address - 0x28
fault code            - supervisor write, page not present
instruction pointer   - 0x8:0x801dae1a
stack pointer         - 0x10:0xb371ab70
frame pointer         - 0x10:0xb371abd0
code segment          - base 0x0, limit 0xf, type 0x1b
                      - DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags      - interrupt enabled, resume, IOPL=0
current process       - 28 (irq 24: bge0)
trap number           - 12
panic: page fault
cpuid=1
Uptime - 4d10h52m36s
Dumping 4031MB (2 chunks)
chunk 0: 1MB (156 pages)... ok
chunk 1: 4031MB (1031920)
--cut--

For some reason, by the time it reboots, there is no dump file available (even though it is enabled in rc.conf and there is more than enough room in /var/crash to hold it).
Sven
Verifying serial setup for KDB_UNATTENDED
We have a RELENG_6 amd64 box that has been experiencing lockups/panics every 4 days or so. The box is at a remote location, and trying to get a full trace has been nigh impossible with the staffing constraints. As such, I would like to set up a serial console (using another FreeBSD box and minicom or cu). I would like to verify that the following will work to allow a) the other FreeBSD box to have a terminal session via COM1, b) have it work regardless of whether a keyboard and/or monitor is plugged into the target box, and c) still allow terminal redirection to the internal or serial console if a keyboard is attached:

/boot/loader.conf:
hint.sio.0.flags="0x30"
console="comconsole,vidconsole"
boot_multicons="YES"
boot_console="YES"
comconsole_speed="19200"

/etc/ttys:
ttyd0 "/usr/libexec/getty std.19200" vt100 on secure

As this is basically a 6.2 system, I assume I don't need to do anything re: boot blocks or /etc/make.conf or the like? The kernel has already been built with options DDB, KDB, KDB_UNATTENDED, and KDB_TRACE. Would the modifications to the 2 files listed above be sufficient to meet my wishes above and allow me to see the panic on the terminal when the system does panic (and even allow me to trace, etc. via the kdb debugger)?

Thanks,

Sven
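From the second FreeBSD box, attaching to that console would then be a one-liner; the cuad0 device name is an assumption about where the null-modem cable lands on a 6.x machine:

cu -l /dev/cuad0 -s 19200
# '~.' ends the cu session. Note: dropping the remote box into the
# debugger via a serial BREAK would also need "options BREAK_TO_DEBUGGER"
# in the target kernel, which is not in the list above.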
Panic in 6.2-PRERELEASE with bge on amd64
I am starting a new thread on this, as what I had assumed was a panic in nfsd turns out to be an issue with the bge driver. This is an amd64 box, dual processor (SMP kernel), that happens to be running nfsd. About every 3-5 days the kernel panics, and I have finally managed to get a core dump.

The system: FreeBSD 6.2-PRERELEASE #8: Tue Jan 2 10:57:39 EST 2007

The short and dirty of the dump:

# kgdb /usr/obj/usr/src/sys/MSPOOL/kernel.debug /var/crash/vmcore.0
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".

Unread portion of the kernel message buffer:
lock order reversal: (sleepable after non-sleepable)
 1st 0x8836b010 bge0 (network driver) @ /usr/src/sys/dev/bge/if_bge.c:2675
 2nd 0x805f26b0 user map (user map) @ /usr/src/sys/vm/vm_map.c:3074
KDB: stack backtrace:
witness_checkorder() at witness_checkorder+0x4da
_sx_xlock() at _sx_xlock+0x51
vm_map_lookup() at vm_map_lookup+0x44
vm_fault() at vm_fault+0xba
trap_pfault() at trap_pfault+0x13c
trap() at trap+0x1f9
calltrap() at calltrap+0x5
--- trap 0xc, rip = 0x801d5f17, rsp = 0xb371ab50, rbp = 0xb371aba0 ---
bge_rxeof() at bge_rxeof+0x3b7
bge_intr() at bge_intr+0x1c8
ithread_loop() at ithread_loop+0x14c
fork_exit() at fork_exit+0xbb
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xb371ad00, rbp = 0 ---

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x28
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0x801d5f17
stack pointer           = 0x10:0xb371ab50
frame pointer           = 0x10:0xb371aba0
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 28 (irq24: bge0)
trap number             = 12
panic: page fault
cpuid = 1
Uptime: 3d4h18m42s

#0 doadump () at pcpu.h:172
172     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:172
#1  0x802771b9 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2  0x80276c4b in panic (fmt=0x8044160c "%s") at /usr/src/sys/kern/kern_shutdown.c:565
#3  0x803ebba6 in trap_fatal (frame=0xc, eva=18446742978291675136) at /usr/src/sys/amd64/amd64/trap.c:660
#4  0x803ebee3 in trap_pfault (frame=0xb371aaa0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:573
#5  0x803ec0f9 in trap (frame=
    {tf_rdi = 0, tf_rsi = 0, tf_rdx = 1, tf_rcx = 499, tf_r8 = 2521427970, tf_r9 = -1099500152320, tf_rax = 0, tf_rbx = -1263948192, tf_rbp = -1284396128, tf_r10 = 0, tf_r11 = 0, tf_r12 = -2009681920, tf_r13 = 0, tf_r14 = 0, tf_r15 = -1099499984896, tf_trapno = 12, tf_addr = 40, tf_flags = -1263948192, tf_err = 2, tf_rip = -2145558761, tf_cs = 8, tf_rflags = 66071, tf_rsp = -1284396192, tf_ss = 16}) at /usr/src/sys/amd64/amd64/trap.c:352
#6  0x803d779b in calltrap () at /usr/src/sys/amd64/amd64/exception.S:168
#7  0x801d5f17 in bge_rxeof (sc=0x8836b000) at /usr/src/sys/dev/bge/if_bge.c:2528
#8  0x801db818 in bge_intr (xsc=0x0) at /usr/src/sys/dev/bge/if_bge.c:2707
#9  0x8025f2bc in ithread_loop (arg=0xffb1b320) at /usr/src/sys/kern/kern_intr.c:682
#10 0x8025e00b in fork_exit (callout=0x8025f170 <ithread_loop>, arg=0xffb1b320, frame=0xb371ac50) at /usr/src/sys/kern/kern_fork.c:821
#11 0x803d7afe in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:394

If more information is needed (disassemble, etc.), please let me know. In the interim I may switch to either using the 10/100 ethernet port (fxp) or turning off SMP.

Sven
Re: Panic in 6.2-PRERELEASE with bge on amd64
On Mon, 2007-01-08 at 16:06 +1100, Bruce Evans wrote:
> On Sun, 7 Jan 2007, Sven Willenberger wrote:
>
> > I am starting a new thread on this as what I had assumed was a panic in
> > nfsd turns out to be an issue with the bge driver. This is an amd64 box,
> > dual processor (SMP kernel) that happens to be running nfsd. About every
> > 3-5 days the kernel panics and I have finally managed to get a core dump.
> > The system: FreeBSD 6.2-PRERELEASE #8: Tue Jan 2 10:57:39 EST 2007
>
> Like most NIC drivers, bge unlocks and re-locks around its call to
> ether_input() in its interrupt handler. This isn't very safe, and it
> certainly causes panics for bge. I often see it panic when bringing
> the interface down and up while input is arriving, on a non-SMP non-amd64
> (actually i386) non-6.x (actually -current) system. Bringing the
> interface down is probably the worst case. It creates a null pointer
> for bge_intr() to follow.
>
> > The short and dirty of the dump:
> > ...
> > --- trap 0xc, rip = 0x801d5f17, rsp = 0xb371ab50, rbp = 0xb371aba0 ---
> > bge_rxeof() at bge_rxeof+0x3b7
>
> What is the instruction here?

I will do my best to ferret out the information you need. For the bge_rxeof() at bge_rxeof+0x3b7 line, the instruction is:

0x801d5f17:  mov %r15,0x28(%r14)

and for the bge_intr() at bge_intr+0x1c8 line, the instruction is:

0x801db818:  mov %rbx,%rdi

> > bge_intr() at bge_intr+0x1c8
> > ithread_loop() at ithread_loop+0x14c
> > fork_exit() at fork_exit+0xbb
> > fork_trampoline() at fork_trampoline+0xe
> > --- trap 0, rip = 0, rsp = 0xb371ad00, rbp = 0 ---
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 01
> > fault virtual address = 0x28
>
> Looks like a null pointer panic anyway. I guess the instruction is
> movl to/from 0x28(%reg) where %reg is a null pointer.

From the above lines, apparently %r14 is null then.

> > ...
> > #8 0x801db818 in bge_intr (xsc=0x0) at /usr/src/sys/dev/bge/if_bge.c:2707
>
> What is the statement here? It presumably follows a null pointer and only
> the expression for the pointer is interesting. xsc is already null but
> that is probably a bug in gdb, or the result of excessive optimization.
> Compiling kernels with -O2 has little effect except to break debugging.

The block of code from if_bge.c:

2705         if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
2706                 /* Check RX return ring producer/consumer. */
2707                 bge_rxeof(sc);
2708
2709                 /* Check TX ring producer/consumer. */
2710                 bge_txeof(sc);
2711         }

By default -O2 is passed to CC (I don't use any custom make flags; I only define CPUTYPE in my /etc/make.conf).

> I rarely use gdb on kernels and haven't looked closely enough using ddb
> to see where the null pointer for the panic on down/up came from.
>
> BTW, the sbdrop panic in -current isn't bge-only or SMP-only. I saw
> it once for sk on a non-SMP system. It rarely happens for non-SMP
> (much more rarely than the panic in bge_intr()). Under -current, on
> an SMP amd64 system with bge, it happens almost every time on close
> of the socket for a ttcp server if input is arriving at the time of
> the close. I haven't seen it for 6.x.
>
> Bruce

The short of it is that this interface sees pretty much non-stop traffic, as this is a mailserver (final destination): mail is constantly being delivered (direct disk access) and retrieved (remote machine(s) with nfs-mounted mail spools).
If a momentary down of the interface is enough to completely panic the driver and then the kernel, this hardly seems "robust" if, in fact, this is what is happening. So the question arises as to what would be causing the down/up of the interface; I could start looking at the cable, the switch it's connected to and ... any other ideas? (I don't have watchdog enabled or anything like that, for example).

Sven
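One way to pin down which mbuf field sits at the fault offset is gdb's classic offsetof trick against the debug kernel; whichever member lands at the fault virtual address (0x28 in the trap message above) is the one the null %r14 was being written through. A minimal sketch (the member names are from FreeBSD's struct mbuf, not from anything shown in this thread):

    (kgdb) p/x &((struct mbuf *)0)->m_pkthdr        # offset of the packet header
    (kgdb) p/x &((struct mbuf *)0)->m_pkthdr.len    # offset of m_pkthdr.len
    # compare the printed offsets against the fault address 0x28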
Re: Panic in 6.2-PRERELEASE with bge on amd64
On Tue, 2007-01-09 at 12:50 +1100, Bruce Evans wrote:
> On Mon, 8 Jan 2007, Sven Willenberger wrote:
>
> > On Mon, 2007-01-08 at 16:06 +1100, Bruce Evans wrote:
> >> On Sun, 7 Jan 2007, Sven Willenberger wrote:
>
> >>> The short and dirty of the dump:
> >>> ...
> >>> --- trap 0xc, rip = 0x801d5f17, rsp = 0xb371ab50, rbp = 0xb371aba0 ---
> >>> bge_rxeof() at bge_rxeof+0x3b7
> >>
> >> What is the instruction here?
> >
> > I will do my best to ferret out the information you need. For the bge_rxeof() at bge_rxeof+0x3b7 line, the instruction is:
> >
> > 0x801d5f17 : mov %r15,0x28(%r14)
> ...
> >> Looks like a null pointer panic anyway. I guess the instruction is movl to/from 0x28(%reg) where %reg is a null pointer.
> >
> > From the above lines, apparently %r14 is null then.
>
> Yes. It's a bit surprising that the access is a write.
>
> >>> ...
> >>> #8 0x801db818 in bge_intr (xsc=0x0) at /usr/src/sys/dev/bge/if_bge.c:2707
> >>
> >> What is the statement here? It presumably follows a null pointer and only the expression for the pointer is interesting. xsc is already null but that is probably a bug in gdb, or the result of excessive optimization. Compiling kernels with -O2 has little effect except to break debugging.
> >
> > The block of code from if_bge.c:
> >
> > 2705         if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
> > 2706                 /* Check RX return ring producer/consumer. */
> > 2707                 bge_rxeof(sc);
> > 2708
> > 2709                 /* Check TX ring producer/consumer. */
> > 2710                 bge_txeof(sc);
> > 2711         }
>
> Oops. I should have asked for the statement in bge_rxeof().

#7 0x801d5f17 in bge_rxeof (sc=0x8836b000) at /usr/src/sys/dev/bge/if_bge.c:2528
2528            m->m_pkthdr.len = m->m_len = cur_rx->bge_len - ETHER_CRC_LEN;

(where m is defined as:
2449            struct mbuf *m = NULL;
)

> > By default -O2 is passed to CC (I don't use any custom make flags other than CPUTYPE in my /etc/make.conf).
>
> -O2 is unfortunately the default for COPTFLAGS for most arches in sys/conf/kern.pre.mk. All of my machines and most FreeBSD cluster machines override this default in /etc/make.conf.
>
> With the override overridden for RELENG_6 amd64, gcc inlines bge_rxeof(), so your environment must be a little different to get even the above info. I think gdb can show the correct line numbers but not the call frames (since there is no call). ddb and the kernel stack trace can only show the call frames for actual calls.
>
> With -O1, I couldn't find any instruction similar to the mov to the null pointer + 28. 28 is a popular offset in mbufs.

If you have a suggestion for an /etc/make.conf line, I can recompile the kernel accordingly, assuming it still panics or locks up after the change of interface noted below.

> > The short of it is that this interface sees pretty much non-stop traffic as this is a mailserver (final destination) and is constantly being delivered to (direct disk access) and mail being retrieved (remote machine(s) with nfs mounted mail spools). If a momentary down of the interface is enough to completely panic the driver and then the kernel, this hardly seems "robust" if, in fact, this is what is happening. So the question arises as to what would be causing the down/up of the interface; I could start looking at the cable, the switch it's connected to and ... any other ideas? (I don't have watchdog enabled or anything like that, for example).
> I don't think down/up can occur in normal operation, since it takes ioctls or a watchdog timeout to do it. Maybe some ioctls other than a full down/up can cause problems... bge_init() is called for the following ioctls:
> - mtu changes
> - some near down/up (possibly only these)
> Suspend/resume and of course detach/attach do much the same things as down/up.
>
> BTW, I added some sysctls and found it annoying to have to do down/up to make the sysctls take effect. Sysctls in several other NIC drivers require the same, since doing a full reinitialization is easiest. Since I am tuning using sysctls, I got used to doing down/up too much.
>
> Similarly for the mtu ioctl. I think a full reinitialization is used for mtu
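For reference, Bruce's COPTFLAGS point amounts to overriding the kernel's default optimization level so the debug info stays usable. A minimal /etc/make.conf sketch (the -O choice is an assumption drawn from the discussion above, not a quoted recommendation; CPUTYPE stands in for whatever was already defined there):

    # /etc/make.conf
    CPUTYPE?=p4            # whatever was defined here already
    COPTFLAGS= -O -pipe    # override the -O2 default from sys/conf/kern.pre.mk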
Re: Panic in 6.2-PRERELEASE with bge on amd64
On Tue, 2007-01-09 at 11:50 -0500, John Baldwin wrote:
> On Tuesday 09 January 2007 09:37, Sven Willenberger wrote:
> > On Tue, 2007-01-09 at 12:50 +1100, Bruce Evans wrote:
> > > On Mon, 8 Jan 2007, Sven Willenberger wrote:
> > >
> > > > On Mon, 2007-01-08 at 16:06 +1100, Bruce Evans wrote:
> > > >> On Sun, 7 Jan 2007, Sven Willenberger wrote:
> > >
> > > >>> The short and dirty of the dump:
> > > >>> ...
> > > >>> --- trap 0xc, rip = 0x801d5f17, rsp = 0xb371ab50, rbp = 0xb371aba0 ---
> > > >>> bge_rxeof() at bge_rxeof+0x3b7
> > > >>
> > > >> What is the instruction here?
> > > >
> > > > I will do my best to ferret out the information you need. For the bge_rxeof() at bge_rxeof+0x3b7 line, the instruction is:
> > > >
> > > > 0x801d5f17 : mov %r15,0x28(%r14)
> > > ...
> > > >> Looks like a null pointer panic anyway. I guess the instruction is movl to/from 0x28(%reg) where %reg is a null pointer.
> > > >
> > > > From the above lines, apparently %r14 is null then.
> > >
> > > Yes. It's a bit surprising that the access is a write.
> > >
> > > >>> ...
> > > >>> #8 0x801db818 in bge_intr (xsc=0x0) at /usr/src/sys/dev/bge/if_bge.c:2707
> > > >>
> > > >> What is the statement here? It presumably follows a null pointer and only the expression for the pointer is interesting. xsc is already null but that is probably a bug in gdb, or the result of excessive optimization. Compiling kernels with -O2 has little effect except to break debugging.
> > > >
> > > > The block of code from if_bge.c:
> > > >
> > > > 2705         if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
> > > > 2706                 /* Check RX return ring producer/consumer. */
> > > > 2707                 bge_rxeof(sc);
> > > > 2708
> > > > 2709                 /* Check TX ring producer/consumer. */
> > > > 2710                 bge_txeof(sc);
> > > > 2711         }
> > >
> > > Oops. I should have asked for the statement in bge_rxeof().
> >
> > #7 0x801d5f17 in bge_rxeof (sc=0x8836b000) at /usr/src/sys/dev/bge/if_bge.c:2528
> > 2528            m->m_pkthdr.len = m->m_len = cur_rx->bge_len - ETHER_CRC_LEN;
> >
> > (where m is defined as:
> > 2449            struct mbuf *m = NULL;
> > )
>
> It's assigned earlier in between those two places. Can you 'p rxidx' as well as 'p sc->bge_cdata.bge_rx_std_chain[rxidx]' and 'p sc->bge_cdata.bge_rx_jumbo_chain[rxidx]'? Also, are you using jumbo frames at all?

(kgdb) p rxidx
$1 = 499
(kgdb) p sc->bge_cdata.bge_rx_std_chain[rxidx]
$2 = (struct mbuf *) 0xff0097a27900
(kgdb) p sc->bge_cdata.bge_rx_jumbo_chain[rxidx]
$3 = (struct mbuf *) 0x0

And no, I am not using jumbo frames:

bge0: flags=8843 mtu 1500
        options=1b

Sven
Re: Panic in 6.2-PRERELEASE with bge on amd64
On Tue, 2007-01-09 at 14:09 -0500, John Baldwin wrote:
> On Tuesday 09 January 2007 12:53, Sven Willenberger wrote:
> > On Tue, 2007-01-09 at 11:50 -0500, John Baldwin wrote:
> > > On Tuesday 09 January 2007 09:37, Sven Willenberger wrote:
> > > > On Tue, 2007-01-09 at 12:50 +1100, Bruce Evans wrote:
> > > > > On Mon, 8 Jan 2007, Sven Willenberger wrote:
> > > > >
> > > > > > On Mon, 2007-01-08 at 16:06 +1100, Bruce Evans wrote:
> > > > > >> On Sun, 7 Jan 2007, Sven Willenberger wrote:
> > > > >
> > > > > >>> The short and dirty of the dump:
> > > > > >>> ...
> > > > > >>> --- trap 0xc, rip = 0x801d5f17, rsp = 0xb371ab50, rbp = 0xb371aba0 ---
> > > > > >>> bge_rxeof() at bge_rxeof+0x3b7
> > > > > >>
> > > > > >> What is the instruction here?
> > > > > >
> > > > > > I will do my best to ferret out the information you need. For the bge_rxeof() at bge_rxeof+0x3b7 line, the instruction is:
> > > > > >
> > > > > > 0x801d5f17 : mov %r15,0x28(%r14)
> > > > > > ...
> > > > > >> Looks like a null pointer panic anyway. I guess the instruction is movl to/from 0x28(%reg) where %reg is a null pointer.
> > > > > >
> > > > > > From the above lines, apparently %r14 is null then.
> > > > >
> > > > > Yes. It's a bit surprising that the access is a write.
> > > > >
> > > > > >>> ...
> > > > > >>> #8 0x801db818 in bge_intr (xsc=0x0) at /usr/src/sys/dev/bge/if_bge.c:2707
> > > > > >>
> > > > > >> What is the statement here? It presumably follows a null pointer and only the expression for the pointer is interesting. xsc is already null but that is probably a bug in gdb, or the result of excessive optimization. Compiling kernels with -O2 has little effect except to break debugging.
> > > > > >
> > > > > > The block of code from if_bge.c:
> > > > > >
> > > > > > 2705         if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
> > > > > > 2706                 /* Check RX return ring producer/consumer. */
> > > > > > 2707                 bge_rxeof(sc);
> > > > > > 2708
> > > > > > 2709                 /* Check TX ring producer/consumer. */
> > > > > > 2710                 bge_txeof(sc);
> > > > > > 2711         }
> > > > >
> > > > > Oops. I should have asked for the statement in bge_rxeof().
> > > >
> > > > #7 0x801d5f17 in bge_rxeof (sc=0x8836b000) at /usr/src/sys/dev/bge/if_bge.c:2528
> > > > 2528            m->m_pkthdr.len = m->m_len = cur_rx->bge_len - ETHER_CRC_LEN;
> > > >
> > > > (where m is defined as:
> > > > 2449            struct mbuf *m = NULL;
> > > > )
> > >
> > > It's assigned earlier in between those two places. Can you 'p rxidx' as well as 'p sc->bge_cdata.bge_rx_std_chain[rxidx]' and 'p sc->bge_cdata.bge_rx_jumbo_chain[rxidx]'? Also, are you using jumbo frames at all?
> >
> > (kgdb) p rxidx
> > $1 = 499
> > (kgdb) p sc->bge_cdata.bge_rx_std_chain[rxidx]
> > $2 = (struct mbuf *) 0xff0097a27900
> > (kgdb) p sc->bge_cdata.bge_rx_jumbo_chain[rxidx]
> > $3 = (struct mbuf *) 0x0
> >
> > And no, I am not using jumbo frames:
> > bge0: flags=8843 mtu 1500
> >         options=1b
>
> Did you do a 'p m' to verify that m is NULL? If you can reproduce this, I'd add some KASSERT's where it fetches the mbuf out of the descriptor data to see if m is NULL.

At this spot, m is null:

(kgdb) p m
$3 = (struct mbuf *) 0x0

As far as adding some KASSERT's ... you have gone beyond my rudimentary knowledge here as far as application goes.
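For anyone who wants to try John's suggestion: a KASSERT panics with a readable message the moment the invariant is violated, instead of a later null-pointer trap. A sketch of the kind of check he means, placed where bge_rxeof() pulls the mbuf out of the ring (the exact fetch site in if_bge.c may differ from this; the kernel must be built with options INVARIANTS for KASSERT to be compiled in):

    /* sketch only -- inside the RX loop of bge_rxeof() in if_bge.c */
    m = sc->bge_cdata.bge_rx_std_chain[rxidx];
    KASSERT(m != NULL,
        ("bge_rxeof: NULL mbuf in bge_rx_std_chain[%d]", rxidx));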
Re: Panic in 6.2-PRERELEASE with bge on amd64
Bruce Evans presumably uttered the following on 01/09/07 21:42:
> On Tue, 9 Jan 2007, John Baldwin wrote:
>
>> On Tuesday 09 January 2007 09:37, Sven Willenberger wrote:
>>> On Tue, 2007-01-09 at 12:50 +1100, Bruce Evans wrote:
>>>> Oops. I should have asked for the statement in bge_rxeof().
>>>
>>> #7 0x801d5f17 in bge_rxeof (sc=0x8836b000) at /usr/src/sys/dev/bge/if_bge.c:2528
>>> 2528            m->m_pkthdr.len = m->m_len = cur_rx->bge_len - ETHER_CRC_LEN;
>>>
>>> (where m is defined as:
>>> 2449            struct mbuf *m = NULL;
>>> )
>>
>> It's assigned earlier in between those two places.
>
> Its initialization here is just a style bug.
>
>> Can you 'p rxidx' as well as 'p sc->bge_cdata.bge_rx_std_chain[rxidx]' and 'p sc->bge_cdata.bge_rx_jumbo_chain[rxidx]'? Also, are you using jumbo frames at all?
>
> Also look at nearby chain entries (especially at (rxidx - 1) mod 512). I think the previous 255 entries and the rxidx one should be non-NULL since we should have refilled them as we used them (so the one at rxidx is least interesting since we certainly just refilled it), and the next 256 entries should be NULL since we bogusly only use half of the entries. If the problem is uninitialization, then I expect all 512 entries except the one just refilled at rxidx to be NULL.
>
> Bruce

(kgdb) p sc->bge_cdata.bge_rx_std_chain[rxidx]
$1 = (struct mbuf *) 0xff0097a27900
(kgdb) p rxidx
$2 = 499

Since rxidx = 499, I assume you are most interested in 498:

(kgdb) p sc->bge_cdata.bge_rx_std_chain[498]
$3 = (struct mbuf *) 0xff00cf1b3100

For the sake of argument, 500 is null:

(kgdb) p sc->bge_cdata.bge_rx_std_chain[500]
$13 = (struct mbuf *) 0x0

The indexes with values basically are 243 through 499:

(kgdb) p sc->bge_cdata.bge_rx_std_chain[241]
$30 = (struct mbuf *) 0x0
(kgdb) p sc->bge_cdata.bge_rx_std_chain[242]
$31 = (struct mbuf *) 0x0
(kgdb) p sc->bge_cdata.bge_rx_std_chain[243]
$32 = (struct mbuf *) 0xff005d4ab700
(kgdb) p sc->bge_cdata.bge_rx_std_chain[244]
$33 = (struct mbuf *) 0xff004f644b00

So it does not seem to be a problem with "uninitialization".
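Rather than printing entries one at a time, kgdb can walk the whole ring from the dump. A minimal sketch, assuming frame 7 (bge_rxeof) is selected so sc is in scope, and using the 512-entry ring size from Bruce's description:

    (kgdb) frame 7
    (kgdb) set $i = 0
    (kgdb) while $i < 512
     >if sc->bge_cdata.bge_rx_std_chain[$i] == 0
      >printf "rx_std_chain[%d] is NULL\n", $i
      >end
     >set $i = $i + 1
     >end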
Re: Not panic in nfsd (Re: panic in nfsd on 6.2-RC1)
On Sat, 2007-01-13 at 15:11 -0500, Kris Kennaway wrote:
> On Sat, Dec 30, 2006 at 06:04:13PM -0500, Sven Willenberger wrote:
> >
> > Sven Willenberger presumably uttered the following on 12/18/06 12:33:
> > > On Fri, 2006-12-15 at 23:20 +0200, Kostik Belousov wrote:
> > >> On Fri, Dec 15, 2006 at 02:29:58PM -0500, Kris Kennaway wrote:
> > >
> > > <>
> > >
> > >>>
> > >>>> FWIW, I do see the following appearing in the /var/log/messages: ufs_rename: fvp == tvp (can't happen) about once or twice a day, but cannot correlate those to lockup. Now that I have enabled the options mentioned above in the kernel, I am seeing some LOR issues:
> > >>>>
> > >>>> kernel: lock order reversal:
> > >>>> kernel: 1st 0xff00c3bab200 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1547
> > >>>> kernel: 2nd 0xff0005bb6078 struct mount mtx (struct mount mtx) @ /usr/src/sys/ufs/ufs/ufs_vnops.c:138
> > >>>
> > >>> OK, this is interesting, so let's proceed from here.
> > >>>
> > >>> Kris
> > >>
> > >> Try this.
> > >>
> > >> Index: ufs/ufs/ufs_vnops.c
> > >> ===
> > >> RCS file: /usr/local/arch/ncvs/src/sys/ufs/ufs/ufs_vnops.c,v
> > >> retrieving revision 1.283
> > >> diff -u -r1.283 ufs_vnops.c
> > >> --- ufs/ufs/ufs_vnops.c 6 Nov 2006 13:42:09 - 1.283
> > >> +++ ufs/ufs/ufs_vnops.c 15 Dec 2006 21:19:51 -
> > >> @@ -133,19 +133,15 @@
> > >>  {
> > >>          struct inode *ip;
> > >>          struct timespec ts;
> > >> -        int mnt_locked;
> > >>
> > >>          ip = VTOI(vp);
> > >> -        mnt_locked = 0;
> > >> -        if ((vp->v_mount->mnt_flag & MNT_RDONLY) != 0) {
> > >> -                VI_LOCK(vp);
> > >> +        VI_LOCK(vp);
> > >> +        if ((vp->v_mount->mnt_flag & MNT_RDONLY) != 0)
> > >>                  goto out;
> > >> +        if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0) {
> > >> +                VI_UNLOCK(vp);
> > >> +                return;
> > >>          }
> > >> -        MNT_ILOCK(vp->v_mount); /* For reading of mnt_kern_flags. */
> > >> -        mnt_locked = 1;
> > >> -        VI_LOCK(vp);
> > >> -        if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0)
> > >> -                goto out_unl;
> > >>
> > >>          if ((vp->v_type == VBLK || vp->v_type == VCHR) && !DOINGSOFTDEP(vp))
> > >>                  ip->i_flag |= IN_LAZYMOD;
> > >> @@ -172,10 +168,7 @@
> > >>
> > >>  out:
> > >>          ip->i_flag &= ~(IN_ACCESS | IN_CHANGE | IN_UPDATE);
> > >> - out_unl:
> > >>          VI_UNLOCK(vp);
> > >> -        if (mnt_locked)
> > >> -                MNT_IUNLOCK(vp->v_mount);
> > >>  }
> > >>
> > >> /*
> > >
> > > Patch applied cleanly (offset 6 lines), make buildworld, make kernel, reboot, make installworld, etc.
> > >
> > > kernel: lock order reversal:
> > > kernel: 1st 0xff00b9181800 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1547
> > > kernel: 2nd 0xff00c16030d0 vnode interlock (vnode interlock) @ /usr/src/sys/ufs/ufs/ufs_vnops.c:132
> >
> > Having enabled witness and ddb, etc. I cannot get this LOR to trigger anymore, but the machine is still locking up. I finally managed to get a piece of what was appearing on the console, which is the following (copied by hand by an onsite tech so there may be a typo here and there):
> >
> > cut--
> >
> > bge_intr() at loge_intr+0x84a
> > ithread_loop() at ithread_loop+0x14c
> > fork_exit() at fork_exit+0xbb
> > fork_trampoline() at fork_trampoline+0xee
> > --- trap 0, rip-0, rsp-0xb371ad00, rbp-0 ---
> >
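Sven mentions having enabled witness and ddb. For anyone reproducing this, the usual debugging additions to a 6.x kernel config look roughly like the following (a sketch of the standard options, not the exact config used in this thread):

    # kernel config additions for lock/panic debugging
    options KDB               # kernel debugger framework
    options DDB               # interactive debugger backend
    options WITNESS           # lock order checking (logs LORs like the above)
    options WITNESS_SKIPSPIN  # skip spin locks to reduce overhead
    options INVARIANTS        # enables KASSERT and friends
    options INVARIANT_SUPPORT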
Re: bge panic (Re: Not panic in nfsd (Re: panic in nfsd on 6.2-RC1))
On Mon, 2007-01-15 at 13:22 -0500, Kris Kennaway wrote:
> On Mon, Jan 15, 2007 at 11:33:33AM -0500, Sven Willenberger wrote:
> >
> > > This is indicating a problem either with your bge hardware or the driver.
> > >
> > > Kris
> >
> > I suspect the driver: This same hardware setup was being used as a database server with FreeBSD 5.4. I had been using the bge driver but set at base100T without any issue at all. It was when I did a clean install of 6.2-Prerelease and set bge to use the full gigE speed (via autonegotiate) that these issues cropped up.
>
> Be careful before you start blaming FreeBSD - since you did not test the failing hardware configuration in the older version of FreeBSD you cannot yet determine that it is a driver regression.
>
> Kris

I will freely admit that this may be circumstantial, that the hardware failed at the same time I upgraded to the newer version of FreeBSD. It could also be that there is an issue with the bge driver being used at 1000 (gigE) speeds instead of at fastE speeds as I used it with the 5.4 release (same hardware). Unfortunately, now that the fxp connection seems stable (for the moment) I am going to take advantage of the uptime and will have to leave troubleshooting/debugging/etc. to what I have provided in the other responses I have sent.
Re: bge panic (Re: Not panic in nfsd (Re: panic in nfsd on 6.2-RC1))
On Mon, 2007-01-15 at 15:24 -0500, Kris Kennaway wrote:
> On Mon, Jan 15, 2007 at 02:03:28PM -0500, Sven Willenberger wrote:
> > On Mon, 2007-01-15 at 13:22 -0500, Kris Kennaway wrote:
> > > On Mon, Jan 15, 2007 at 11:33:33AM -0500, Sven Willenberger wrote:
> > > >
> > > > > This is indicating a problem either with your bge hardware or the driver.
> > > > >
> > > > > Kris
> > > >
> > > > I suspect the driver: This same hardware setup was being used as a database server with FreeBSD 5.4. I had been using the bge driver but set at base100T without any issue at all. It was when I did a clean install of 6.2-Prerelease and set bge to use the full gigE speed (via autonegotiate) that these issues cropped up.
> > >
> > > Be careful before you start blaming FreeBSD - since you did not test the failing hardware configuration in the older version of FreeBSD you cannot yet determine that it is a driver regression.
> > >
> > > Kris
> >
> > I will freely admit that this may be circumstantial, that the hardware failed at the same time I upgraded to the newer version of FreeBSD. It could also be that there is an issue with the bge driver being used at 1000 (gigE) speeds instead of at fastE speeds as I used it with the 5.4 release (same hardware).
>
> The latter is what I am referring to. Your hardware may never have worked in gige mode due to your hardware being broken (yes, this happens), or it could be a freebsd driver issue either introduced in 6.x or present in 5.4 too. You just haven't ruled these cases out.
>
> Anyway, since you're happy with your present workaround we'll have to just drop the issue for now.
>
> Kris

As the box in question is now fully production I can no longer "guinea pig" it. However, I will attempt to set up a test bed with similar hardware and try to push as much traffic through the bge interface at gigE speeds as I can in an effort to duplicate this issue. If it does crop up, this box should allow me to more effectively provide debugging information as it will not be a production unit. Although the current workaround is satisfactory for now (and to that extent I am "happy"), I would much rather have the available headroom of full gigE traffic to this server, so I would like to see if I can reproduce the error or at least find out if it is a hardware issue (if nothing else than for my own edification).

Sven
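For the test bed, something as simple as netperf (benchmarks/netperf in ports) can keep a gigE link saturated for long stretches; the address below is a placeholder for the test box. A minimal sketch:

    # on the test-bed receiver
    netserver
    # on the sender: a sustained TCP stream, ten minutes at a time
    netperf -H <testbox-ip> -t TCP_STREAM -l 600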
ggate + gmirror write performance woes
I am trying to set up a HA type system involving two identical boxes and have gone through the following to set up the systems:

Slave server:
ggated -R 196608 -S 196608
(exporting /dev/amrd1)
net.inet.tcp.sendspace: 65536
net.inet.tcp.recvspace: 131072

Master server:
ggatec create -u 0 -R 196608 -S 196608 -o rw [slaveip] /dev/amrd1
net.inet.tcp.sendspace: 131072
net.inet.tcp.recvspace: 65536

# gmirror status
       Name    Status  Components
 mirror/gm0  COMPLETE  amrd1s1
                       ggate0s1

The two servers are connected to each other via their secondary physical gigE interfaces using cat6 crossover cable (netperf shows 890 Mbps at 95% confidence). Softupdates are enabled on gm0 (though this does not affect the results).

The results:

/usr/bin/time -h cp testfile64M /data1
        28.62s real     0.00s user     0.16s sys

and this is very consistent ... about 3 MB/s over repeated runs.

dd if=/dev/zero of=/data1/testfile32M2 bs=32k count=1024
1024+0 records in
1024+0 records out
33554432 bytes transferred in 16.122641 secs (2081199 bytes/sec)

What else can I tune here to make this functional? If I increase recvspace and sendspace much beyond those numbers, ggated will not start, claiming to not have enough buffer space.

Sven
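Not shown above is the exports file ggated reads (/etc/gg.exports by default). Going by the debug output later in this thread, a matching two-device export list would look like this (the CIDR is the network the master connects from):

    # /etc/gg.exports -- <network> <RO|RW> <device>, one export per line
    10.10.0.0/24 RW /dev/amrd1
    10.10.0.0/24 RW /dev/amrd3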
Re: ggate + gmirror write performance woes
On Thu, 2007-04-05 at 17:38 +0100, Tom Judge wrote:
> Dmitriy Kirhlarov wrote:
> > On Thu, Apr 05, 2007 at 10:58:56AM -0400, Sven Willenberger wrote:
> >> I am trying to set up a HA type system involving two identical boxes and have gone through the following to set up the systems:
> >>
> >> Slave server:
> >> ggated -R 196608 -S 196608
> >> (exporting /dev/amrd1)
> >> net.inet.tcp.sendspace: 65536
> >> net.inet.tcp.recvspace: 131072
> >
> > Try
> > net.local.stream.recvspace=65535
> > net.local.stream.sendspace=65535
> >
> > Also, try increasing these sysctls with
> > net.inet.tcp.rfc1323=1
> >
> > I use it on FreeBSD 5.x with:
> > net.inet.tcp.sendspace=131072
> > net.inet.tcp.recvspace=131072
> > net.local.stream.recvspace=65535
> > net.local.stream.sendspace=65535
> >
> > ggated -R 1048576 -S 1048576
> > ggatec -R 1048576 -S 1048576
> >
> > WBR.
> > Dmitriy
>
> I have seen sustained writes of 30Mb/s using the following configuration:
>
> cat /boot/loader.conf
> kern.ipc.nmbclusters="32768"
>
> cat /etc/sysctl.conf
> net.inet.tcp.sendspace=1048576
> net.inet.tcp.recvspace=1048576
>
> Server:
> /sbin/ggated -S 1310720 -R 1310720 -a 172.31.0.18 /etc/gg.exports
>
> Client:
> /sbin/ggatec create -q 2048 -t 5 -S 1310720 -R 1310720 172.31.0.18 /dev/amrd0s2
>
> The raid array is a RAID 1 volume on a dell PERC4 (Dell PE1850) with adaptive read ahead and write back caching.
>
> Tom

I have tried both of the settings ideas suggested above but I cannot even get out of the gate with those. Setting net.inet.tcp.{send,recv}space to anything higher than 131072 results in ggated bailing with the error:

# ggated -v -a 10.10.0.19
info: Reading exports file (/etc/gg.exports).
debug: Added 10.10.0.0/24 /dev/amrd1 RW to exports list.
debug: Added 10.10.0.0/24 /dev/amrd3 RW to exports list.
info: Exporting 2 object(s).
error: Cannot open stream socket: No buffer space available.
error: Exiting.

Setting net.inet.tcp.{send,recv}space to 131072 allows me to start ggated with the default R and S values of 131072; anything higher results in "no buffer space" errors. At 131072 ggated starts, but then I cannot even open a new connection (like ssh) to the server as the ssh client bails with "no buffer space available".

More information:

# netstat -m
514/641/1155 mbufs in use (current/cache/total)
512/284/796/32768 mbuf clusters in use (current/cache/total/max)
512/256 mbuf+clusters out of packet secondary zone in use (current/cache)
0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
1152K/728K/1880K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/4/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

This is on a FreeBSD 6.2-RELENG box, i386 SMP, using the amr driver (SATA RAID using LSI MegaRAID). The odd thing is that even after I set the send and recvspace down to values like 65536, I continue to get the no buffer error when trying to connect to it remotely again.
Sven
Re: ggate + gmirror write performance woes
On Fri, 2007-04-06 at 16:18 +0300, Nikolay Pavlov wrote:
> On Thursday, 5 April 2007 at 16:15:35 -0400, Sven Willenberger wrote:
> > On Thu, 2007-04-05 at 17:38 +0100, Tom Judge wrote:
> > > Dmitriy Kirhlarov wrote:
> > > > On Thu, Apr 05, 2007 at 10:58:56AM -0400, Sven Willenberger wrote:
> > > >> I am trying to set up a HA type system involving two identical boxes and have gone through the following to set up the systems:
> > > >>
> > > >> Slave server:
> > > >> ggated -R 196608 -S 196608
> > > >> (exporting /dev/amrd1)
> > > >> net.inet.tcp.sendspace: 65536
> > > >> net.inet.tcp.recvspace: 131072
> > > >
> > > > Try
> > > > net.local.stream.recvspace=65535
> > > > net.local.stream.sendspace=65535
> > > >
> > > > Also, try increasing these sysctls with
> > > > net.inet.tcp.rfc1323=1
> > > >
> > > > I use it on FreeBSD 5.x with:
> > > > net.inet.tcp.sendspace=131072
> > > > net.inet.tcp.recvspace=131072
> > > > net.local.stream.recvspace=65535
> > > > net.local.stream.sendspace=65535
> > > >
> > > > ggated -R 1048576 -S 1048576
> > > > ggatec -R 1048576 -S 1048576
> > > >
> > > > WBR.
> > > > Dmitriy
> > >
> > > I have seen sustained writes of 30Mb/s using the following configuration:
> > >
> > > cat /boot/loader.conf
> > > kern.ipc.nmbclusters="32768"
> > >
> > > cat /etc/sysctl.conf
> > > net.inet.tcp.sendspace=1048576
> > > net.inet.tcp.recvspace=1048576
> > >
> > > Server:
> > > /sbin/ggated -S 1310720 -R 1310720 -a 172.31.0.18 /etc/gg.exports
> > >
> > > Client:
> > > /sbin/ggatec create -q 2048 -t 5 -S 1310720 -R 1310720 172.31.0.18 /dev/amrd0s2
> > >
> > > The raid array is a RAID 1 volume on a dell PERC4 (Dell PE1850) with adaptive read ahead and write back caching.
> > >
> > > Tom
> >
> > I have tried both of the settings ideas suggested above but I cannot even get out of the gate with those. Setting net.inet.tcp.{send,recv}space to anything higher than 131072 results in ggated bailing with the error:
> >
> > # ggated -v -a 10.10.0.19
> > info: Reading exports file (/etc/gg.exports).
> > debug: Added 10.10.0.0/24 /dev/amrd1 RW to exports list.
> > debug: Added 10.10.0.0/24 /dev/amrd3 RW to exports list.
> > info: Exporting 2 object(s).
> > error: Cannot open stream socket: No buffer space available.
> > error: Exiting.
>
> For values of net.inet.tcp.{send,recv}space more than 524288 you also need to adjust kern.ipc.maxsockbuf.
>
> Try this configuration for example:
> kern.ipc.maxsockbuf=2049152
> net.inet.tcp.recvspace=1024576
> net.inet.tcp.sendspace=1024576

kern.ipc.maxsockbuf was the issue here; I increased its value and now I no longer get the buffer space error. Furthermore, the write speed issue was also tied to a hardware raid controller issue. After fixing that issue, and with just the following:

kern.ipc.maxsockbuf=1048576
net.inet.tcp.sendspace=131072
net.inet.tcp.recvspace=131072

I can start ggated with -R 262144 -S 262144 as well as the ggatec and see write speeds of 60+MB/s. I may play around with the settings more (and see if any further speed improvements occur), but this is quite acceptable at this point. (For the record nmbclusters is set to 32768.) The next part of the project will be writing the freevrrp failover scripts to deal with I/O locking issues that will happen if the ggated server fails, etc.
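Pulling the working values from this thread together in one place (everything below is taken from the messages above; the server address and devices are from the earlier transcripts):

    # /boot/loader.conf
    kern.ipc.nmbclusters="32768"

    # /etc/sysctl.conf
    kern.ipc.maxsockbuf=1048576
    net.inet.tcp.sendspace=131072
    net.inet.tcp.recvspace=131072

    # slave (exporting side)
    ggated -R 262144 -S 262144 -a 10.10.0.19
    # master (importing side)
    ggatec create -u 0 -R 262144 -S 262144 -o rw 10.10.0.19 /dev/amrd1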
CARP and em0 timeout watchdog
I currently have a FreeBSD 6.2-RELEASE-p3 SMP with dual intel PRO/1000PM nics configured as follows:

em0: flags=8943 mtu 1500
        options=b
        inet 192.168.0.18 netmask 0xff00 broadcast 192.168.0.255
        ether 00:30:48:8d:5c:0a
        media: Ethernet autoselect (1000baseTX )
        status: active
em1: flags=8843 mtu 4096
        options=b
        inet 10.10.0.18 netmask 0xfff8 broadcast 10.10.0.23
        ether 00:30:48:8d:5c:0b
        media: Ethernet autoselect (1000baseTX )
        status: active

The em0 interface connects to the LAN while the em1 interface is connected to an identical box via CAT6 crossover cable (for ggate/gmirror).

Now, I have also configured a carp interface:

carp0: flags=49 mtu 1500
        inet 192.168.0.20 netmask 0x
        carp: MASTER vhid 1 advbase 1 advskew 0

There are twin boxes here and I am running Samba. The problem is that with transfers across the carp IP (192.168.0.20) I end up with em0 resetting after a watchdog timeout error. This occurs whether I transfer files from a windows box using a share (samba) or via ftp. This problem does *not* occur if I ftp to the 192.168.0.19 interface (non-virtual). I suspected cabling at first so had all the cabling in question replaced with fresh CAT6, to no avail. Several gigs of data can be transferred to the real interface (em0) without any issue at all; a max of maybe 1-2 Gig can be transferred connected to the carp'ed IP before the em0 reset. Any ideas here?

Sven
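For reference, the rc.conf side of a carp setup like this on 6.x looks roughly as follows. A sketch only: the password is a placeholder, and the /24 mask is an assumption based on the broadcast address shown above:

    # /etc/rc.conf
    cloned_interfaces="carp0"
    ifconfig_carp0="vhid 1 advbase 1 advskew 0 pass <secret> 192.168.0.20/24"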
Re: CARP and em0 timeout watchdog
On Wed, 2007-04-18 at 11:50 -0400, Sven Willenberger wrote:
> I currently have a FreeBSD 6.2-RELEASE-p3 SMP with dual intel PRO/1000PM nics configured as follows:
>
> em0: flags=8943 mtu 1500
>         options=b
>         inet 192.168.0.18 netmask 0xff00 broadcast 192.168.0.255
>         ether 00:30:48:8d:5c:0a
>         media: Ethernet autoselect (1000baseTX )
>         status: active
> em1: flags=8843 mtu 4096
>         options=b
>         inet 10.10.0.18 netmask 0xfff8 broadcast 10.10.0.23
>         ether 00:30:48:8d:5c:0b
>         media: Ethernet autoselect (1000baseTX )
>         status: active
>
> The em0 interface connects to the LAN while the em1 interface is connected to an identical box via CAT6 crossover cable (for ggate/gmirror).
>
> Now, I have also configured a carp interface:
>
> carp0: flags=49 mtu 1500
>         inet 192.168.0.20 netmask 0x
>         carp: MASTER vhid 1 advbase 1 advskew 0
>
> There are twin boxes here and I am running Samba. The problem is that with transfers across the carp IP (192.168.0.20) I end up with em0 resetting after a watchdog timeout error. This occurs whether I transfer files from a windows box using a share (samba) or via ftp. This problem does *not* occur if I ftp to the 192.168.0.19 interface (non-virtual). I suspected cabling at first so had all the cabling in question replaced with fresh CAT6, to no avail. Several gigs of data can be transferred to the real interface (em0) without any issue at all; a max of maybe 1-2 Gig can be transferred connected to the carp'ed IP before the em0 reset. Any ideas here?
>
> Sven

Having done more diagnostics I have found out it is not CARP related at all. It turns out that the same timeouts will happen when ftp'ing to the physical address IPs as well. There is also an odd situation here depending on which protocol I use. The two boxes are connected to a Dell Powerconnect 2616 gig switch with CAT6. If I scp files from the 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a hiccup (I used dd to create various sized testfiles from 32M to 1G in size and just scp testfile* to the other box). On the other hand, if I connect to 192.168.0.19 using ftp (either active or passive) where ftp is being run through inetd, the interface resets (watchdog) within seconds (a few MBs) of traffic. Enabling polling does nothing, nor does changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing such behavioral differences between scp and ftp?

Sven
Re: CARP and em0 timeout watchdog
On Fri, 2007-04-20 at 09:04 -0700, Jeremy Chadwick wrote:
> On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
> > Having done more diagnostics I have found out it is not CARP related at all. It turns out that the same timeouts will happen when ftp'ing to the physical address IPs as well. There is also an odd situation here depending on which protocol I use. The two boxes are connected to a Dell Powerconnect 2616 gig switch with CAT6. If I scp files from the 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a hiccup (I used dd to create various sized testfiles from 32M to 1G in size and just scp testfile* to the other box). On the other hand, if I connect to 192.168.0.19 using ftp (either active or passive) where ftp is being run through inetd, the interface resets (watchdog) within seconds (a few MBs) of traffic. Enabling polling does nothing, nor does changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing such behavioral differences between scp and ftp?
>
> You'll get a much higher throughput rate with FTP than you will with SSH, simply because encryption overhead is quite high (even with the Blowfish cipher). With a very fast processor and on a gigE network you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP. That's the only difference I can think of.
>
> The watchdog resets I can't explain; Jack Vogel should be able to assist with that. But it sounds like the resets only happen under very high throughput conditions (which is why you'd see it with FTP but not SSH).

I guess it is possible that the traffic from ftp (or smb) is overloading the interface; fwiw, if I increase the {recv,send}space to 131072 I can achieve 32MB+/s using scp (and ftp shows similar values). The real question is how to avoid these watchdog timeouts during heavy traffic; the whole point here was to replace windows-based fileshare servers with FreeBSD for the local network, but at the moment it is proving ineffectual as any samba file transfers stall (much like ftp). I see no other error messages in the logfiles other than the watchdog timeouts plus interface down/up messages.

Sven
Re: CARP and em0 timeout watchdog
On Fri, 2007-04-20 at 10:17 -0700, Jack Vogel wrote:
> On 4/20/07, Jeremy Chadwick <[EMAIL PROTECTED]> wrote:
> > On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
> > > Having done more diagnostics I have found out it is not CARP related at all. It turns out that the same timeouts will happen when ftp'ing to the physical address IPs as well. There is also an odd situation here depending on which protocol I use. The two boxes are connected to a Dell Powerconnect 2616 gig switch with CAT6. If I scp files from the 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a hiccup (I used dd to create various sized testfiles from 32M to 1G in size and just scp testfile* to the other box). On the other hand, if I connect to 192.168.0.19 using ftp (either active or passive) where ftp is being run through inetd, the interface resets (watchdog) within seconds (a few MBs) of traffic. Enabling polling does nothing, nor does changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing such behavioral differences between scp and ftp?
> >
> > You'll get a much higher throughput rate with FTP than you will with SSH, simply because encryption overhead is quite high (even with the Blowfish cipher). With a very fast processor and on a gigE network you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP. That's the only difference I can think of.
> >
> > The watchdog resets I can't explain; Jack Vogel should be able to assist with that. But it sounds like the resets only happen under very high throughput conditions (which is why you'd see it with FTP but not SSH).
>
> What kind of hardware is this interface? Watchdogs mean TX cleanup isn't happening in a reasonable time; without further data it's hard to know what might be going on.
>
> Jack

From pciconf:

[EMAIL PROTECTED]:0:0: class=0x02 card=0x108c15d9 chip=0x108c8086 rev=0x03 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'PRO/1000 PM'
    class = network
    subclass = ethernet
[EMAIL PROTECTED]:0:0: class=0x02 card=0x109a15d9 chip=0x109a8086 rev=0x00 hdr=0x00
    vendor = 'Intel Corporation'
    class = network
    subclass = ethernet

em0 is the interface in question.

From dmesg:

em0: port 0x4000-0x401f mem 0xe030-0xe031 irq 16 at device 0.0 on pci13
em1: port 0x5000-0x501f mem 0xe040-0xe041 irq 17 at device 0.0 on pci14

Sven
Re: CARP and em0 timeout watchdog
On Fri, 2007-04-20 at 18:46 +0200, Clayton Milos wrote:
> - Original Message -
> From: "Sven Willenberger" <[EMAIL PROTECTED]>
> To: "Jeremy Chadwick" <[EMAIL PROTECTED]>
> Cc:
> Sent: Friday, April 20, 2007 6:25 PM
> Subject: Re: CARP and em0 timeout watchdog
>
> > On Fri, 2007-04-20 at 09:04 -0700, Jeremy Chadwick wrote:
> >> On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
> >> > Having done more diagnostics I have found out it is not CARP related at all. It turns out that the same timeouts will happen when ftp'ing to the physical address IPs as well. There is also an odd situation here depending on which protocol I use. The two boxes are connected to a Dell Powerconnect 2616 gig switch with CAT6. If I scp files from the 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a hiccup (I used dd to create various sized testfiles from 32M to 1G in size and just scp testfile* to the other box). On the other hand, if I connect to 192.168.0.19 using ftp (either active or passive) where ftp is being run through inetd, the interface resets (watchdog) within seconds (a few MBs) of traffic. Enabling polling does nothing, nor does changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing such behavioral differences between scp and ftp?
> >>
> >> You'll get a much higher throughput rate with FTP than you will with SSH, simply because encryption overhead is quite high (even with the Blowfish cipher). With a very fast processor and on a gigE network you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP. That's the only difference I can think of.
> >>
> >> The watchdog resets I can't explain; Jack Vogel should be able to assist with that. But it sounds like the resets only happen under very high throughput conditions (which is why you'd see it with FTP but not SSH).
> >
> > I guess it is possible that the traffic from ftp (or smb) is overloading the interface; fwiw, if I increase the {recv,send}space to 131072 I can achieve 32MB+/s using scp (and ftp shows similar values). The real question is how to avoid these watchdog timeouts during heavy traffic; the whole point here was to replace windows-based fileshare servers with FreeBSD for the local network, but at the moment it is proving ineffectual as any samba file transfers stall (much like ftp). I see no other error messages in the logfiles other than the watchdog timeouts plus interface down/up messages.
> >
> > Sven
>
> Sorry for jumping on a thread here. I've had issues with em NIC's as well, especially with heavy loads. What helped for me was turning on polling. I recompiled the kernel with polling and turned it on in rc.conf, and my problems disappeared.
>
> Are you running with polling on?

At first I did not have polling compiled in, so no. Then I compiled in polling (and used options HZ=2000) but it didn't change anything. Whether I have polling enabled or disabled on the interface, the outcome is the same.

Sven
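For anyone trying the polling route on 6.x, the pieces involved look roughly like this (a sketch; HZ=2000 is the value Sven mentions, and per-interface enabling via ifconfig is the 6.x mechanism):

    # kernel config additions
    options DEVICE_POLLING
    options HZ=2000
    # toggle on the interface at runtime
    ifconfig em0 polling
    ifconfig em0 -polling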
Re: CARP and em0 timeout watchdog
On Fri, 2007-04-20 at 11:27 -0700, Jack Vogel wrote:
> On 4/20/07, Sven Willenberger <[EMAIL PROTECTED]> wrote:
> > On Fri, 2007-04-20 at 10:17 -0700, Jack Vogel wrote:
> > > On 4/20/07, Jeremy Chadwick <[EMAIL PROTECTED]> wrote:
> > > > On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
> > > > > Having done more diagnostics I have found out it is not CARP related at all. It turns out that the same timeouts will happen when ftp'ing to the physical address IPs as well. There is also an odd situation here depending on which protocol I use. The two boxes are connected to a Dell Powerconnect 2616 gig switch with CAT6. If I scp files from the 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a hiccup (I used dd to create various sized testfiles from 32M to 1G in size and just scp testfile* to the other box). On the other hand, if I connect to 192.168.0.19 using ftp (either active or passive) where ftp is being run through inetd, the interface resets (watchdog) within seconds (a few MBs) of traffic. Enabling polling does nothing, nor does changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing such behavioral differences between scp and ftp?
> > > >
> > > > You'll get a much higher throughput rate with FTP than you will with SSH, simply because encryption overhead is quite high (even with the Blowfish cipher). With a very fast processor and on a gigE network you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP. That's the only difference I can think of.
> > > >
> > > > The watchdog resets I can't explain; Jack Vogel should be able to assist with that. But it sounds like the resets only happen under very high throughput conditions (which is why you'd see it with FTP but not SSH).
> > >
> > > What kind of hardware is this interface? Watchdogs mean TX cleanup isn't happening in a reasonable time; without further data it's hard to know what might be going on.
> > >
> > > Jack
> >
> > From pciconf:
> >
> > [EMAIL PROTECTED]:0:0: class=0x02 card=0x108c15d9 chip=0x108c8086 rev=0x03 hdr=0x00
> >     vendor = 'Intel Corporation'
> >     device = 'PRO/1000 PM'
> >     class = network
> >     subclass = ethernet
> > [EMAIL PROTECTED]:0:0: class=0x02 card=0x109a15d9 chip=0x109a8086 rev=0x00 hdr=0x00
> >     vendor = 'Intel Corporation'
> >     class = network
> >     subclass = ethernet
> >
> > em0 is the interface in question.
> >
> > From dmesg:
> >
> > em0: port 0x4000-0x401f mem 0xe030-0xe031 irq 16 at device 0.0 on pci13
> > em1: port 0x5000-0x501f mem 0xe040-0xe041 irq 17 at device 0.0 on pci14
>
> OH, this is an 82573, and I've posted a firmware patcher a couple different times; there is a bit in the MANC register that is incorrectly programmed in some vendors' systems. Can you search email for that patcher? It needs to run from DOS. If you are unable to find it let me know and I'll resend you a copy.
>
> Jack

If you are referring to the dcgdis.ThisIsZip attachment, I found it in earlier threads, thanks. Will work on patching the nics and will keep the list updated.

Thanks again.

Sven
Re: CARP and em0 timeout watchdog
On Fri, 2007-04-20 at 14:44 -0400, Sven Willenberger wrote:
> On Fri, 2007-04-20 at 11:27 -0700, Jack Vogel wrote:
> > On 4/20/07, Sven Willenberger <[EMAIL PROTECTED]> wrote:
> > > On Fri, 2007-04-20 at 10:17 -0700, Jack Vogel wrote:
> > > > On 4/20/07, Jeremy Chadwick <[EMAIL PROTECTED]> wrote:
> > > > > On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
> > > > > > Having done more diagnostics I have found out it is not CARP related at all. It turns out that the same timeouts will happen when ftp'ing to the physical address IPs as well. There is also an odd situation here depending on which protocol I use. The two boxes are connected to a Dell Powerconnect 2616 gig switch with CAT6. If I scp files from the 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a hiccup (I used dd to create various sized testfiles from 32M to 1G in size and just scp testfile* to the other box). On the other hand, if I connect to 192.168.0.19 using ftp (either active or passive) where ftp is being run through inetd, the interface resets (watchdog) within seconds (a few MBs) of traffic. Enabling polling does nothing, nor does changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing such behavioral differences between scp and ftp?
> > > > >
> > > > > You'll get a much higher throughput rate with FTP than you will with SSH, simply because encryption overhead is quite high (even with the Blowfish cipher). With a very fast processor and on a gigE network you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP. That's the only difference I can think of.
> > > > >
> > > > > The watchdog resets I can't explain; Jack Vogel should be able to assist with that. But it sounds like the resets only happen under very high throughput conditions (which is why you'd see it with FTP but not SSH).
> > > >
> > > > What kind of hardware is this interface? Watchdogs mean TX cleanup isn't happening in a reasonable time; without further data it's hard to know what might be going on.
> > > >
> > > > Jack
> > >
> > > From pciconf:
> > >
> > > [EMAIL PROTECTED]:0:0: class=0x02 card=0x108c15d9 chip=0x108c8086 rev=0x03 hdr=0x00
> > >     vendor = 'Intel Corporation'
> > >     device = 'PRO/1000 PM'
> > >     class = network
> > >     subclass = ethernet
> > > [EMAIL PROTECTED]:0:0: class=0x02 card=0x109a15d9 chip=0x109a8086 rev=0x00 hdr=0x00
> > >     vendor = 'Intel Corporation'
> > >     class = network
> > >     subclass = ethernet
> > >
> > > em0 is the interface in question.
> > >
> > > From dmesg:
> > >
> > > em0: port 0x4000-0x401f mem 0xe030-0xe031 irq 16 at device 0.0 on pci13
> > > em1: port 0x5000-0x501f mem 0xe040-0xe041 irq 17 at device 0.0 on pci14
> >
> > OH, this is an 82573, and I've posted a firmware patcher a couple different times; there is a bit in the MANC register that is incorrectly programmed in some vendors' systems. Can you search email for that patcher? It needs to run from DOS. If you are unable to find it let me know and I'll resend you a copy.
> >
> > Jack
>
> If you are referring to the dcgdis.ThisIsZip attachment, I found it in earlier threads, thanks.
> Will work on patching the nics and will keep the list updated.
>
> Thanks again.
>
> Sven

I am happy to report that the firmware patch seems to have fixed the issue and I can transfer data across the gigE network without the watchdog timeouts and lockups. Thanks again!!

Sven
Re: Another em0 watchdog timeout
On Tue, 2007-05-01 at 11:05 -0700, Michael Collette wrote:
> I realize there is a previous thread discussing this, but my symptoms seem to be a little bit different. Here's the stats...
>
> FreeBSD 6.2-STABLE #1: Fri Apr 27 17:28:22 PDT 2007
>
> [EMAIL PROTECTED]:0:0: class=0x02 card=0x108c15d9 chip=0x108c8086 rev=0x03 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = 'PRO/1000 PM'
>     class = network
>     subclass = ethernet
> [EMAIL PROTECTED]:0:0: class=0x02 card=0x109a15d9 chip=0x109a8086 rev=0x00 hdr=0x00
>     vendor = 'Intel Corporation'
>     class = network
>     subclass = ethernet
>
> em0: port 0x5000-0x501f mem 0xea30-0xea31 irq 16 at device 0.0 on pci13
> em0: Ethernet address: 00:30:48:5c:cc:84
> em1: port 0x6000-0x601f mem 0xea40-0xea41 irq 17 at device 0.0 on pci14
> em1: Ethernet address: 00:30:48:5c:cc:85
>
> I'm seeing the following entries in my messages log pop up about 2-4 times a day...
>
> May 1 08:29:38 alpha kernel: em0: watchdog timeout -- resetting
> May 1 08:29:38 alpha kernel: em0: link state changed to DOWN
> May 1 08:29:41 alpha kernel: em0: link state changed to UP
>
> I've gone and added the DEVICE_POLLING option in the kernel, but this doesn't seem to help. The problem only seems to happen during the hours that my users would be hitting this box, so it really gets noticed when those 3 seconds go by. And yes, it's almost always a 3 second drop on the interface.
>
> Is there anything I can do to prevent this from happening? I saw mention of a firmware update I might try, but haven't been able to locate the file in question.
>
> Thanks,

Search the list for a post by Jack Vogel that contains an attachment named "dcgdis.ThisIsZip". That firmware patch solved my em watchdog timeout issues.
LSI Megaraid (amr) performance woes
I am having some issues getting any (write) performance out of an LSI Megaraid (320-1) SCSI raid card (using the amr driver). The system is an i386 (p4 xeon) with on-board adaptec scsi controllers and a SUPER GEM318 Saf-te backplane with 6 ea 146GB U320 10k rpm Hitachi drives. dmesg highlights at message end.

The main problem I am having is getting anywhere near a decent write performance using the card. I compared having the backplane connected to the on-board adaptec controller to having it connected to the LSI controller. I tried 3 methods of benchmarking. "Adaptec Connected" involved using the on-board adaptec scsi controller; "LSI Connected" involved using the LSI controller as a simple controller, with each drive its own logical raid0 drive. LSI write-through and write-back involved using the LSI controller to set up 2 single raid0 drives as their own logical unit and a "spanned" mirror of 4 drives (raid10) as a logical unit (write-back and write-through simply referring to the write method used).

In the case of the "XXX Connected" setups, I created a raid10 configuration with 4 of the drives as follows (shown are the commands for adaptec; for lsi I simply used amrd2, amrd3, etc. for the drives):

gmirror label -b load md1 da2 da3
gmirror label -b load md2 da4 da5
gmirror load
gstripe label -s 65536 md0 /dev/mirror/md1 /dev/mirror/md2
newfs /dev/stripe/md0
mkdir /bench
mount /dev/stripe/md0 /bench

To test read and write performance I used dd as follows:

dd if=/dev/zero of=/raid_or_single_drive/bench64 bs=64k count=32768

which created 2GB files. The summary of results (measured in bytes/sec) is as follows:

                  |     SINGLE DRIVE     |      RAID DRIVE      |
Connection Method |  Write   |   Read    |  Write   |   Read    |
------------------|----------|-----------|----------|-----------|
adaptec connected | 58808057 |  78188838 | 78625494 | 127331944 |
lsi singles       | 43944507 |  81238863 | 95104511 | 111626492 |
lsi write-through | 45716204 |  81748996 |*10299554*| 108620637 |
lsi write-back    | 31689131 |  37241934 | 50382152 |  56053085 |

With the drives connected to the adaptec controller and using geom, I get the expected increase in write and read performance when moving from a single drive to a raid10 system. Likewise, when using the LSI controller to manage the drives as single units and using geom to create the raid, I get a marked increase in write performance (less of a read increase). However, when using the LSI to create the raid, I end up with a *miserable* 10MB/sec write speed (while achieving acceptable read speeds) in write-through mode, and mediocre write speeds in write-back mode (which, without a battery-backed raid card, I would rather not do) along with, for some reason, a marked decrease in read speeds (over the write-through values).

So the question arises as to whether this is an issue with the way the LSI card (320-1) handles "spans" (which I call stripes - versus mirrors) or the way the amr driver views such spans, or an issue with the card not playing nicely with the supermicro motherboard, or perhaps even a defective card. Has anyone else had experience with this card and motherboard combination? As a side note, I also tried dragonfly-bsd (1.4.0), which also uses the amr driver, and experienced similar results, and linux (slackware 10.2 default install), which showed write speeds of 45MB/s or so and read speeds of 140MB/s or so using the default LSI controller settings (write-through, 64k stripe size, etc.).
Any help or ideas here would be really appreciated in an effort to get anywhere near acceptable write speeds without relying on the unsafe write-back method or excessively sacrificing read speeds.

dmesg highlights:

FreeBSD 6.0-RELEASE #0: Thu Nov 3 09:36:13 UTC 2005
    [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC
ACPI APIC Table:
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2799.22-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0xf29 Stepping = 9
  Features=0xbfebfbff
  Features2=0x4400
  Hyperthreading: 2 logical CPUs
real memory = 1073217536 (1023 MB)
avail memory = 1041264640 (993 MB)
pcib5: at device 29.0 on pci4
pci5: on pcib5
amr0: mem 0xfe20-0xfe20 irq 96 at device 1.0 on pci5
amr0: Firmware 1L37, BIOS G119, 64MB RAM
pci4: at device 30.0 (no driver attached)
pcib6: at device 31.0 on pci4
pci6: on pcib6
ahd0: port 0x4400-0x44ff,0x4000-0x40ff mem 0xfc40-0xfc401fff irq 76 at device 2.0 on pci6
ahd0: [GIANT-LOCKED]
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 101-133Mhz, 512 SCBs
ahd1: port 0x4c00-0x4cff,0x4800-0x48ff mem 0xfc402000-0xfc403fff irq 77 at device 2.1 on pci6
ahd1: [GIANT-LOCKED]
aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 101-133Mhz, 512 SCBs

Thanks,
Sven
Re: LSI Megaraid (amr) performance woes
On Thu, 2006-02-23 at 15:53 -0500, Kris Kennaway wrote:
> On Thu, Feb 23, 2006 at 03:41:06PM -0500, Sven Willenberger wrote:
> > I am having some issues getting any (write) performance out of an LSI Megaraid (320-1) SCSI raid card (using the amr driver). The system is an i386 (p4 xeon) with on-board adaptec scsi controllers and a SUPER GEM318 Saf-te backplane with 6 ea 146GB U320 10k rpm Hitachi drives.
>
> Try again with 6.1, performance should be much better.
>
> Kris

I cvsupped a 6.1 prerelease and found no performance improvements. I did some further tests and the performance issues seem very specific to the mirroring aspect of the raid.

The server has 6 drives, so I set up a single-drive raid0, dual-drive raid0 and triple-drive raid0 using the lsi configuration tool. Since these were simple stripes I would expect increasing performance with each additional drive, and the results matched these expectations. Write speeds were roughly 50MB/sec, 100MB/sec, and 150MB/sec for the single, dual, and triple drive stripes respectively, with read speeds on the order of 110MB/s or so.

I then went back to the beginning and set up a simple 2-drive mirror, and a 4-drive raid 10 (spanning/striping over 2 mirrors). After reinstalling the OS I ended up with the following results: on the 2-drive mirror write speeds were an abysmal 7MB/sec, and on the 4-disk raid10 write speeds were 8MB/sec. Looking at iostat during the write (using dd if=/dev/zero of=filename bs=64k count=32768) I saw that the 2-drive mirror seemed to jump between 35 tps and 65 tps with an average 128kb per transaction, while the 4-drive array maintained a more consistent 65 tps. Read speeds on the 4-drive array were around 110MB/sec.

I cannot rule out the possibility that the card itself is bad, but in an effort to try and do so I tried these tests using an install of slackware 10.2 (using the 2.6 kernel and megaraid.ko module) and the reiserfs filesystem. On the 2-drive mirror I achieved write speeds of 22MB/sec and the 4-drive array saw write speeds of about 45MB/sec. Read speeds on the 4-drive array were roughly 150MB/sec.

In summary, it would seem that the amr driver (which I also tested on a tyan transport system (using the same hard drives) using FreeBSD amd64 (as it is an AMD system), with the same results as on the i386 system) has issues when any type of data-duplication (mirror) scheme is in place. Is there anything I can try to do to troubleshoot this on a lower level? Is it possible that the card is actually just defective?

Sven
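For anyone repeating these measurements, watching the transaction size and tps while the dd runs is a two-terminal job (a sketch; the device and dd parameters are the ones used in this thread, the output file path is a placeholder):

    # terminal 1: per-device statistics every second
    iostat -w 1 amrd0
    # terminal 2: the 2GB write test used throughout this thread
    dd if=/dev/zero of=/bench/testfile2G bs=64k count=32768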
Re: LSI Megaraid (amr) performance woes
On Wed, 2006-03-01 at 15:08 -0500, Mike Tancsa wrote:
> At 02:10 PM 01/03/2006, Sven Willenberger wrote:
>
> > I cvsupped a 6.1 prerelease and found no performance improvements. I did
> > some further tests and the performance issues seem very specific to the
> > mirroring aspect of the raid:
>
> I am not familiar with the LSI cards, but with older 3ware and the
> ARECA cards, the raid sets when in any sort of redundancy mode must
> initialize in the background before normal use. Until that is
> complete, performance is seriously slow. Is the LSI doing that, and
> perhaps just not telling you?
>
> ---Mike

I had thought of this too, so I disabled the rapid (background) initialization option and let the arrays build to completion the slow way. So unless it is still building even after it reports done (or is doing some other odd processor-intensive CRC checking or something), I don't think this is the source of the problem.

Sven
vinum to gvinum help
I have an i386 system currently running 5.2.1-RELEASE with a vinum mirror array (2 drives comprising /usr). I want to upgrade this to 5.5-RELEASE which, if I understand correctly, no longer supports vinum arrays. Would simply changing /boot/loader.conf to load gvinum instead of vinum work, or would the geom layer prevent this from working properly? If not, is there a recommended way of upgrading a vinum array to a gvinum or gmirror array?

Thanks,
Sven
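P.S. To make the contemplated change concrete, here is a sketch of the loader.conf swap (knob names assumed: vinum_load pulls in the old kernel vinum, while on 5.x the GEOM-based gvinum is normally loaded via geom_vinum_load):

    # /boot/loader.conf
    #vinum_load="YES"         # old monolithic vinum
    geom_vinum_load="YES"     # GEOM-based gvinum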
Re: vinum to gvinum help
On Mon, 2006-06-26 at 19:15 +0200, Roland Smith wrote:
> On Mon, Jun 26, 2006 at 12:22:07PM -0400, Sven Willenberger wrote:
> > I have an i386 system currently running 5.2.1-RELEASE with a vinum
> > mirror array (2 drives comprising /usr). I want to upgrade this to
> > 5.5-RELEASE which, if I understand correctly, no longer supports vinum
> > arrays. Would simply changing /boot/loader.conf to load gvinum
> > instead of vinum work or would the geom layer prevent this from
> > working properly? If not, is there a recommended way of upgrading a
> > vinum array to a gvinum or gmirror array?
>
> Lots of things have changed between 5.2.1 and 5.5. I think it would be
> best to make a backup and do a clean reinstall.
>
> Roland

Sadly this may not be an option; this is a production server that can at best stand an hour or so of downtime. Between all the custom symlinked directories, applications, etc., plus the sheer volume of data that would need to be backed up, an in-place upgrade would be infinitely more desirable. If it comes to the point of having to back up and do a fresh install, I suspect I would be using the 6.x series anyway. I was really hoping that some way of upgrading vinum in place were available.

Sven
Re: /var/spool/clientmqueue 185meg
Mike Tancsa presumably uttered the following on 04/16/05 19:31:
> At 08:56 AM 16/04/2005, Warren wrote:
> > /var/spool/clientmqueue <-- 185meg
> > How do I get the messages from the above to the root mail folder of my
> > machine? I'm willing to provide any necessary information to help.
>
> Take a look at /var/log/maillog to see why it's not being processed. If
> necessary, bump up the loglevel in sendmail. In /etc/mail/sendmail.cf
> change
>
> O LogLevel=9
> to
> O LogLevel=14
>
> cd /etc/mail
> make stop
> make start

The general recommendation is to *never* edit the sendmail.cf directly. Rather, cd to /etc/mail and edit your freebsd.mc file, adding:

define(`confLOG_LEVEL',`14')dnl

(those are backtick, value, singlequote). Then:

rm `hostname`.??
make && make install-cf
make restart

If you want to revert, you can simply delete the line from your freebsd.mc file and then remake and install the newly generated cf file.

Sven
Manipulating disk cache (buf) settings
We are running a PostgreSQL server (8.0.3) on a dual Opteron system with 8G of RAM. If I interpret top and vfs.hibufspace correctly, they show values of 215MB and 225771520 (which equals 215MB) respectively. My understanding from having searched the archives is that this is the value used by the system/kernel in determining how much disk data to cache. If that is in fact the case, then my question is how best to increase the amount of memory the system can use for disk caching. Ideally I would like to have upwards of 1G for this type of caching/buffering. I suspect it would not be as easy as simply adjusting vfs.hibufspace upwards, but would instead involve adding either a loader.conf or kernel option for some "master" setting that affects hibufspace, bufspace, and related tunables. Or would this involve editing one of the system files?

Sven
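P.S. For anyone wanting to inspect the same values, the sysctls in question can be read together (names as they appear on 5.x):

    # buffer-space ceiling and current usage, in bytes
    sysctl vfs.hibufspace vfs.bufspace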
Re: Manipulating disk cache (buf) settings
On Mon, 2005-05-23 at 10:44 -0700, John-Mark Gurney wrote:
> Sven Willenberger wrote this message on Mon, May 23, 2005 at 10:58 -0400:
> > We are running a PostgreSQL server (8.0.3) on a dual opteron system with
> > 8G of RAM. If I interpret top and vfs.hibufspace correctly (which show
> > values of 215MB and 225771520 (which equals 215MB) respectively. My
> > understanding from having searched the archives is that this is the
> > value that is used by the system/kernel in determining how much disk
> > data to cache.
>
> This is incorrect... FreeBSD merged the vm and buf systems a while back,
> so all of memory is used as a disk cache.. The buf cache is still used
> for filesystem meta data (and for pending writes of files, but those buf's
> reference the original page, not local storage)...
>
> Just as an experiment, on a quiet system do:
> dd if=/dev/zero of=somefile bs=1m count=2048
> and then read it back in:
> dd if=somefile of=/dev/null bs=1m
> and watch systat or iostat and see if any of the file is read... You'll
> probably see that none of it is...

Yes, confirmed as stated; this is great news then. In essence the PostgreSQL planner can be told that the effective cache size is *much* larger than that calculated using vfs.hibufspace, which should result in some [hopefully] marked performance boosts.

btw:

> dd if=/dev/zero of=zerofile bs=1m count=2048
2048+0 records in
2048+0 records out
2147483648 bytes transferred in 43.381462 secs (49502335 bytes/sec)
> dd if=zerofile of=/dev/null bs=1m
2048+0 records in
2048+0 records out
2147483648 bytes transferred in 5.304807 secs (404818435 bytes/sec)

and that was on a 3GB RAM system, so the caching scheme works great.

Sven
BKVASIZE for large block-size filesystems
FreeBSD 5.4-STABLE amd64 on a dual-Opteron system with an LSI MegaRAID and a 400G+ partition. The filesystem was created with:

newfs -b 65536 -f 8192 -e 15835 /dev/amrd2s1d

This is the data filesystem for a PostgreSQL database; as the default page size (files) is 8k, the above newfs scheme has 8k fragments which should fit nicely with the PostgreSQL page size. Now, by default param.h defines BKVASIZE as 16384 (which has been pointed out in other posts as being *not* twice the default blocksize of 16k). I have modified it to be set at 32768 but still see a high and increasing value of vfs.bufdefragcnt, which makes sense given the blocksize of the major filesystem in use.

My question is: are there any caveats about increasing BKVASIZE to 65536? The system has 8G of RAM, and I understand that nbuf decreases with increasing BKVASIZE; how can I either determine whether the resulting nbuf will be sufficient, or calculate what is needed based on RAM and system usage? Also, will increasing BKVASIZE require a complete make buildworld or, if not, how can I remake the portions of the system affected by BKVASIZE?

Sven
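P.S. The counter worth watching while experimenting (sysctl name as on 5.x):

    # a steadily climbing count here means buffers are being
    # defragmented, i.e. BKVASIZE is too small for the filesystem
    sysctl vfs.bufdefragcnt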
Re: stack backtrace
On Sun, 2005-05-29 at 11:29 -0700, Derek Kuliński wrote:
> Hello,
>
> Today I noticed following message in the log:
>
> > KDB: stack backtrace:
> > kdb_backtrace(c07163b8,2,c661994c,0,22) at kdb_backtrace+0x2e
> > getdirtybuf(d109ebac,0,1,c661994c,1) at getdirtybuf+0x2b
> > flush_deplist(c282f4cc,1,d109ebd4,d109ebd8,0) at flush_deplist+0x57
> > flush_inodedep_deps(c15ba800,22a8a,c2760a7c,d109ec34,c04f7f87) at flush_inodedep_deps+0x9e
> > softdep_sync_metadata(d109eca4,c2760a50,50,c06ea8f0,0) at softdep_sync_metadata+0x9d
> > ffs_fsync(d109eca4,0,0,0,0) at ffs_fsync+0x4b2
> > fsync(c17ba000,d109ed14,4,d109ed3c,c0515916) at fsync+0x1a1
> > syscall(c069002f,2f,2f,81522b0,81522b0) at syscall+0x370
> > Xint0x80_syscall() at Xint0x80_syscall+0x1f
> > --- syscall (95, FreeBSD ELF32, fsync), eip = 0x28143dcf, esp = 0xbfbfd34c, ebp = 0xbfbfd358 ---
> > KDB: stack backtrace:
> > kdb_backtrace(c07163b8,2,c6683118,0,22) at kdb_backtrace+0x2e
> > getdirtybuf(d1098bac,0,1,c6683118,1) at getdirtybuf+0x2b
> > flush_deplist(c282facc,1,d1098bd4,d1098bd8,0) at flush_deplist+0x57
> > flush_inodedep_deps(c15ba800,1e9a4,c24cc974,d1098c34,c04f7f87) at flush_inodedep_deps+0x9e
> > softdep_sync_metadata(d1098ca4,c24cc948,50,c06ea8f0,0) at softdep_sync_metadata+0x9d
> > ffs_fsync(d1098ca4,0,0,0,0) at ffs_fsync+0x4b2
> > fsync(c17b9c00,d1098d14,4,c17b9c00,7) at fsync+0x1a1
> > syscall(2f,2f,bfbf002f,8111fe0,0) at syscall+0x370
> > Xint0x80_syscall() at Xint0x80_syscall+0x1f
> > --- syscall (95, FreeBSD ELF32, fsync), eip = 0x282dfdcf, esp = 0xbfbf9a8c, ebp = 0xbfbfb468 ---
>
> System didn't seem to crash, what does it mean?
>
> The OS is FreeBSD 5.4-RELEASE, it was compiled using:
> CPUTYPE?=i686
> COPTFLAGS= -O -pipe

Apparently this is still somewhat of a mystery, but you are not the first person to witness this:

http://lists.freebsd.org/pipermail/freebsd-stable/2005-April/013679.html
http://lists.freebsd.org/pipermail/freebsd-current/2004-July/031576.html

I don't know if anyone is actually looking into this (behind the scenes maybe) or whether we just need to accumulate a critical mass of similar notices to raise an eyebrow. If your system does not lock up as a result (the way it used to in the earlier 5.x series) then perhaps it is harmless.

Sven
PostgreSQL's vacuumdb fails to allocate memory for non-root users
FreeBSD 5.4-RELEASE
PostgreSQL 8.0.3

I noticed that the nightly cron consisting of a vacuumdb was failing due to "unable to allocate memory". I do have maintenance_work_mem set at 512MB, and the /boot/loader.conf file sets the max data size to 1GB (verified by limit). The odd thing is that if I run the command (either vacuumdb from the command line or vacuum verbose analyze from a psql session) as the Unix user root (and any psql superuser), the vacuum runs fine. It is when the Unix user is non-root (e.g. su -l pgsql -c "vacuumdb -a -z") that this memory error occurs. All users use the "default" class for login.conf purposes, which has not been modified from its installed settings. Any ideas on how to a) troubleshoot this or b) fix this (if it is something obvious that I just cannot see)?

Thanks,
Sven
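P.S. For completeness, the failing nightly job is essentially the following (the crontab placement and schedule here are hypothetical; only the su invocation is from the actual setup):

    # /etc/crontab -- nightly database maintenance
    0 3 * * * root su -l pgsql -c "vacuumdb -a -z"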
Re: [GENERAL] PostgreSQL's vacuumdb fails to allocate memory for non-root users
On Wed, 2005-06-29 at 09:43 -0400, Douglas McNaught wrote:
> Sven Willenberger <[EMAIL PROTECTED]> writes:
>
> > FreeBSD 5.4-Release
> > PostgreSQL 8.0.3
> >
> > I noticed that the nightly cron consisting of a vacuumdb was failing due
> > to "unable to allocate memory". I do have maintenance_work_mem set at
> > 512MB, and the /boot/loader.conf file sets the max data size to 1GB
> > (verified by limit). The odd thing is that if I run the command (either
> > vacuumdb from the command line or vacuum verbose analyze from a psql
> > session) as the Unix user root (and any psql superuser) the vacuum runs
> > fine. It is when the unix user is non-root (e.g. su -l pgsql -c
> > "vacuumdb -a -z") that this memory error occurs. All users use the
> > "default" class for login.conf purposes which has not been modified from
> > its installed settings. Any ideas on how to a) troubleshoot this or b)
> > fix this (if it is something obvious that I just cannot see).
>
> Is the out-of-memory condition occurring on the server or client side?
> Is there anything in the Postgres logs?

In this case they are one and the same machine; i.e. whether invoked from the command line as vacuumdb or invoked from psql (connecting to localhost) as "vacuum analyze;", the memory error occurs. The logfile reveals:

ERROR:  out of memory
DETAIL:  Failed on request of size 536870910.

> You might put a 'ulimit -a' command in your cron script to make sure
> your memory limit settings are propagating correctly...

I created a cron job consisting of just that command (ulimit -a), and the output revealed nothing abnormal (i.e. max data segment still 1G, etc.). This occurs outside of cron also; it was just the failing cronjob that brought it to my attention. Again, if I log in as myself and try to run vacuumdb -a -z it fails; if I su to root and repeat, it works fine. I am trying to narrow this down to a PostgreSQL issue vs. a FreeBSD issue.

Sven
Re: [GENERAL] PostgreSQL's vacuumdb fails to allocate memory for
On Wed, 2005-06-29 at 11:21 -0400, Tom Lane wrote:
> Sven Willenberger <[EMAIL PROTECTED]> writes:
> > ERROR: out of memory
> > DETAIL: Failed on request of size 536870910.
>
> That's a server-side failure ...
>
> > Again, if I log in as myself and try to run
> > the command vacuumdb -a -z it fails; if I su to root and repeat it works
> > fine. I am trying to narrow this down to a PostgreSQL issue vs FreeBSD
> > issue.
>
> That's fairly hard to believe, assuming that you are talking to the same
> server in both cases (I have seen trouble reports that turned out to be
> because the complainant was accidentally using two different servers...)
> The ulimit the backend is running under couldn't change depending on
> where the client is su'd to.
>
> Is it possible that you've got per-user configuration settings that
> affect this, like a different maintenance_mem value for the root user?
>
> regards, tom lane

I have done some more tests and tried to keep the results of vacuumdb distinct from connecting to the backend (psql -U pgsql ...) and running vacuum analyze. Apparently hopping back and forth between both methods interfered with my original interpretation of what appeared to be happening. Anyway, here is what I see.

First, the psql connection version: psql, then vacuum analyze => works fine whether the current Unix user is root or a plain user (ran this a couple of times via new psql connections to verify).

Then quit psql and move to the command-line vacuumdb => whether run as su -l pgsql -c "vacuumdb -a -z" (or specifying a dbname instead of all) or directly as a user, the out-of-memory error occurs. If I then connect via psql to the backend and try to run vacuum analyze, I receive an out-of-memory error. This last behavior (a psql connection failing after a failed vacuumdb) was what confounded my earlier interpretation of the error being tied to the Unix user.

top shows:

  PID USERNAME PRI NICE  SIZE    RES STATE  C  TIME  WCPU   CPU COMMAND
 6754 pgsql      4    0  602M 88688K sbwait 0  0:03 0.00% 0.00% postgres

until I disconnect the psql session. I can then psql again, and the same error happens (out of memory), and top shows the same again. At this point I am not sure if it is a memory issue of vacuumdb, of vacuum itself, or of the FreeBSD memory management system. Again, if enough time passes (or some other events occur) since I last tried vacuumdb, then running vacuum [verbose][analyze] via a psql connection works fine.

Sven
Re: [GENERAL] PostgreSQL's vacuumdb fails to allocate memory for non-root users
On Wed, 2005-06-29 at 14:59 -0400, Vivek Khera wrote:
> On Jun 29, 2005, at 9:01 AM, Sven Willenberger wrote:
>
> > Unix user root (and any psql superuser) the vacuum runs fine. It is
> > when the unix user is non-root (e.g. su -l pgsql -c "vacuumdb -a -z")
> > that this memory error occurs. All users use the "default" class for
> > login.conf purposes which has not been modified from its installed
> > settings. Any ideas on how to a) troubleshoot this or b) fix this (if
> > it is something obvious that I just cannot see).
>
> This doesn't make sense: the actual command is executed by the
> backend postgres server, so the uid of the client program doesn't
> make a bit of difference.
>
> You need to see exactly who is generating that error. It certainly
> is not the Pg backend.

The idea that the issue is tied to a certain "login" user has been negated by further testing (the illusion that it was user-based arose from the order in which I ran the tests while trying to find out what was going on); at this point it does seem tied to invoking vacuumdb. As a point of clarification: when maxdsiz and dfldsiz are set, those values are per process, not per user, correct? Something I have noticed: when the memory error occurs during the psql session (after a failed vacuumdb attempt), the memory stays at 600+MB in top (under SIZE) until the psql session is closed; that may just be the way top reports it, though.

Sven
Re: PostgreSQL's vacuumdb fails to allocate memory for non-root users
On Wed, 2005-06-29 at 21:54 +0200, Juergen Dankoweit wrote:
> Hello,
>
> On Wednesday, 2005-06-29, at 14:59 -0400, Vivek Khera wrote:
> > On Jun 29, 2005, at 9:01 AM, Sven Willenberger wrote:
> >
> > > Unix user root (and any psql superuser) the vacuum runs fine. It is
> > > when the unix user is non-root (e.g. su -l pgsql -c "vacuumdb -a -z")
> > > that this memory error occurs. All users use the "default" class for
> > > login.conf purposes which has not been modified from its installed
> > > settings. Any ideas on how to a) troubleshoot this or b) fix this (if
> > > it is something obvious that I just cannot see).
> >
> > This doesn't make sense: the actual command is executed by the
> > backend postgres server, so the uid of the client program doesn't
> > make a bit of difference.
> >
> > You need to see exactly who is generating that error. It certainly
> > is not the Pg backend.
>
> Sorry for a possibly stupid question, but why do you think that the PG
> backend does not generate the error? I have used PostgreSQL under
> FreeBSD for many years and this is the first time I have heard of such
> an error.

As the postgres logfiles have the out-of-memory error in them, it would appear that it is indeed the backend generating this error. Since dfldsiz and maxdsiz (set in loader.conf at 850MB and 1G respectively) are, I am assuming, per process, and since I have maintenance_work_mem set at 512M (all this on a 3G box), I am not sure how it fails to allocate the memory; although top reports only 25-30MB free, there is some 2.5G in Inactive, so there is plenty of memory available. I am currently running memtest to see if I may have flaky RAM.

Sven
Re: [GENERAL] PostgreSQL's vacuumdb fails to allocate memory for non-root users
On Wed, 2005-06-29 at 16:40 -0400, Charles Swiger wrote:
> On Jun 29, 2005, at 4:12 PM, Sven Willenberger wrote:
> [ ... ]
> > Something I have noticed,
> > when the memory error occurs during the psql session (after a failed
> > vacuumdb attempt) the memory stays at 600+MB in top (under size) until
> > the psql session is closed -- that may just be the way top reports it
> > though.
>
> Double-check your system limits via "ulimit -a" or "ulimit -aH". By
> default, FreeBSD will probably restrict the maximum data size of the
> process to 512MB, which may be what you are running into. You can
> rebuild the kernel to permit a larger data size, or else tweak
> /boot/loader.conf:
>
> echo 'kern.maxdsiz="1024M"' >> /boot/loader.conf

:> ulimit -a
cpu time (seconds, -t)           unlimited
file size (512-blocks, -f)       unlimited
data seg size (kbytes, -d)       1048576
stack size (kbytes, -s)          65536
core file size (512-blocks, -c)  unlimited
max memory size (kbytes, -m)     unlimited
locked memory (kbytes, -l)       unlimited
max user processes (-u)          5547
open files (-n)                  11095
virtual mem size (kbytes, -v)    unlimited
sbsize (bytes, -b)               unlimited

:> cat /boot/loader.conf
kern.maxdsiz="1073741824"
kern.dfldsiz="891289600"

and if I don't run vacuumdb at all, but rather connect to the backend via psql and run vacuum, it works ok with full memory allocation. Still testing RAM to see if the issue is physical.

Sven
Re: [GENERAL] PostgreSQL's vacuumdb fails to allocate memory for
On Wed, 2005-06-29 at 16:58 -0400, Sven Willenberger wrote:
> On Wed, 2005-06-29 at 16:40 -0400, Charles Swiger wrote:
> > Double-check your system limits via "ulimit -a" or "ulimit -aH". By
> > default, FreeBSD will probably restrict the maximum data size of the
> > process to 512MB, which may be what you are running into. You can
> > rebuild the kernel to permit a larger data size, or else tweak
> > /boot/loader.conf:
> >
> > echo 'kern.maxdsiz="1024M"' >> /boot/loader.conf
>
> [ulimit -a output and loader.conf contents snipped; see previous message]
>
> and if I don't run vacuumdb at all, but rather connect to the backend
> via psql and run vacuum, it works ok with full memory allocation. Still
> testing RAM to see if the issue is physical.

I have found the answer/problem. On a hunch I increased maxdsiz to 1.5G in the loader.conf file and rebooted. I ran vacuumdb and watched top as the process proceeded. What I saw was SIZE sitting at 603MB (512MB plus another 91MB, which corresponded nicely to the value of RES for the process). A bit into the process I saw SIZE jump to 1115MB; i.e. another 512MB of RAM was requested and this time allocated. At one point SIZE dropped back to 603MB and then went back up to 1115MB. I suspect the same type of issue was occurring in a regular vacuum from the psql client connecting to the backend, for some reason less frequently. I gather that maintenance work mem is either not being recognized as having already been allocated, so another malloc is made, or the process thinks the memory was released and tries to grab a chunk of memory again. This would correspond to the situation where I saw SIZE stuck at 603MB after a failed memory allocation (when maxdsiz was only 1G). I am not sure whether I will run into the situation where yet another 512MB request is made (when 1115MB already appears in SIZE), but if so, the same problem will arise. I will keep an eye on it.

Sven
Re: [GENERAL] PostgreSQL's vacuumdb fails to allocate memory for
Tom Lane presumably uttered the following on 06/29/05 19:12:
> Sven Willenberger <[EMAIL PROTECTED]> writes:
> > I have found the answer/problem. On a hunch I increased maxdsiz to 1.5G
> > in the loader.conf file and rebooted. I ran vacuumdb and watched top as
> > the process proceeded. What I saw was SIZE sitting at 603MB (512MB plus
> > another 91MB, which corresponded nicely to the value of RES for the
> > process). A bit into the process I saw SIZE jump to 1115 -- i.e. another
> > 512 MB of RAM was requested and this time allocated. At one point SIZE
> > dropped back to 603 and then back up to 1115. I suspect the same type of
> > issue was occurring in regular vacuum from the psql client connecting to
> > the backend, for some reason not as frequently. I am gathering that
> > maintenance work mem is either not being recognized as having already
> > been allocated and another malloc is made, or the process is thinking
> > the memory was released and tried to grab a chunk of memory again.
>
> Hmm. It's probably a fragmentation issue. VACUUM will allocate a
> maintenance work mem-sized chunk during command startup, but that's
> likely not all that gets allocated, and if any stuff allocated after it
> is not freed at the same time, the process size won't go back down.
> Which wouldn't be a killer in itself, but unless the next iteration is
> able to fit that array in the same space, you'd see the above behavior.

So maintenance work mem is not a measure of the max that can be allocated by a maintenance procedure, but rather an increment of memory that is requested by a maintenance process (which currently are vacuum and index creation, no?), if my reading of the above is correct.

> BTW, do you have any evidence that it's actually useful to set
> maintenance work mem that high for VACUUM? A quick and dirty solution
> would be to bound the dead-tuples array size at something more sane...

I was under the assumption that on systems with RAM to spare, it was beneficial to set maintenance work mem high to make those processes more efficient. Again, my thinking was that the value set for that variable determined a *max* allocation by any given maintenance process, not a memory-allocation request size. If, as my tests would indicate, the process can request and receive more memory than specified in maintenance work mem, then to play it safe I imagine I could drop that value to 256MB or so.

Sven
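P.S. If that increment reading is right, the conservative setting would look like the following (postgresql.conf; 8.0 takes this value in plain kilobytes, and the figure is just the 256MB from above):

    # leave headroom under kern.maxdsiz in case VACUUM requests a
    # second maintenance_work_mem-sized chunk
    maintenance_work_mem = 262144   # kB, i.e. 256MB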
Re: SCSI troubles
On Wed, 2005-07-06 at 00:29 -0700, Ade Lovett wrote:
> Niki Denev wrote:
> > From what i understand this is not exactly a FreeBSD problem, but rather
> > a consequence of U320 being really hard on the hardware with pushing it
> > to the limits.
>
> Incorrect. The relevant parts of the output you pasted are:
>
> ahd
> Seagate drives
>
> Attaching more than one Seagate drive to a single Adaptec chain will
> result in various weird and wonderful behavior as you've described.
>
> This is above and beyond well known (and documented) issues with data
> loss and corruption with certain firmware revisions on Seagate drives.
>
> You have essentially two options:
>
> (1) disable the (on-board) adaptec controller, and use something else
> (LSI cards work pretty good)
>
> (2) chunk the Seagate drives, and replace them with some other vendor
> (Hitachi, for example, in our high-stress environments, show equivalent
> MTBFs)

We went with option 2 about a year or so ago (Hitachi drives in our case), as dealing with Seagate on this issue turned into an exercise in frustration (they suggested things like turning off SMP or using a PCI network card instead of the onboard (em) network interface). As pointed out, the issue really crops up with more than one Seagate drive on the Adaptec (ahd) controller, even with the drives updated to their latest firmware. Switching to a different hard drive manufacturer solved our woes.

Sven
Re: ahd0: Invalid Sequencer interrupt occurred.
On Fri, 2005-11-11 at 22:57 -0800, Ade Lovett wrote:
> On Nov 11, 2005, at 12:51, Amit Rao wrote:
> > 0) Upgrade to Seagate 10K.7 drive firmware level 0008. That seems
> > to help. One "ahd sequencer error" message still appears at boot,
> > but after that it seems to work (with your fingers crossed).
>
> Of course, you then spend far too much time ensuring that any
> replacement drives are flashed appropriately (which, afaict,
> *requires* Windows to do), and also running the gauntlet of further
> problems down the road when you throw the drives into a new machine
> with a subtly different HBA bios.
>
> No thanks, I'll stick with option (2). A few more months, and
> Seagate drives will be a nice distant memory that I can look back on
> in a few years, and laugh nervously about.
>
> -aDe

There was a hand-rolled flash utility that was able to run on FreeBSD, and I did successfully flash some Seagate drives' firmware with it. It didn't help any as far as the error messages went, so we dropped Seagate drives altogether a little over a year ago. Since then we have been using the IBM/Hitachi drives with no issues (much easier to change drive manufacturers than to respec the servers we were using or do some of the borderline-absurd workarounds that Seagate suggested).

Sven
Creating a system RAID-10 device
I hope this is the appropriate mailing list for this question; if not, please redirect me as needed. My goal is to create a filesystem consisting of a stripe over 3 mirrors that would encompass the entire filesystem, including the /boot and root partitions. Generally I would create a RAID10 from 6 disks as follows (the boot-time module knobs are sketched in the P.S. below):

gmirror label -v -b round-robin md1 da0 da1
gmirror label -v -b round-robin md2 da2 da3
gmirror label -v -b round-robin md3 da4 da5
gstripe label -v -s 131072 md0 /dev/mirror/md1 /dev/mirror/md2 /dev/mirror/md3
newfs /dev/stripe/md0

Naturally the problem here is that this cannot be done on a system that booted from da0. I have seen the example of setting up a mirrored system drive (http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/geom-mirror.html), which won't quite work for my case either. Using this method I could probably get the one mirror (md1) to work, but I know of no way of then adding the other 2 mirror sets and redoing the system to stripe across all 3 mirrored sets. The only thing I could think of was to boot from the livecd, create the 6-disk array, and then try to install FreeBSD onto this filesystem. In order to do this, the installer would have to recognize /dev/stripe/md0 as a valid "drive" -- is there any way to make that happen?

Sven
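P.S. One assumption worth stating explicitly: for the system to assemble these providers at boot, both GEOM classes must be loaded before the root mount. The standard knobs (module names as shipped in the base system):

    # /boot/loader.conf
    geom_mirror_load="YES"
    geom_stripe_load="YES"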
Re: serious vinum bug in 4-10 RELEASE?-solved?
Steve Shorter wrote:
> On Mon, Jul 12, 2004 at 06:40:01AM +0930, Greg 'groggy' Lehey wrote:
> > I see no drives. Ideas?
>
> I have concluded that this is the result of some kind of vinum/hardware
> incompatibility. The problem in question occurred during the upgrade to
> faster disks, specifically Seagate Cheetah ST318453LC on a DELL 2450. If
> I swap back the old Quantum Atlas 36G disk, the problem entirely
> disappears. The new disks function ok with UFS partitions but not vinum.
> It is 100% repeatable. Don't know why.

We have had issues with Cheetah U320 hard drives (at least the 10K 80-pin varieties on our Supermicro boxes), with a high percentage of drive failures (>10%) and communications errors across the SCSI bus. These errors disappeared when we reverted to IBM/Hitachi drives. The Seagate issues occur with both the FreeBSD 4.x and 5.x series, so there is something in the Seagate firmware (I believe) that is interacting poorly with FreeBSD. We have experienced these issues with vinum setups and other configurations where there are either multiple Seagate drives, or multiple drives where one of them is a Seagate. Firmware updates did not help. I have not had this problem where there is only one drive and it occupies da0.

Sven
Re: Why the mode of /dev/null is changed after system reboot?
On Mon, 2004-11-01 at 09:30 -0800, Kris Kennaway wrote:
> On Mon, Nov 01, 2004 at 09:19:12AM +0800, Iva Hesy wrote:
> > I have a gateway running FreeBSD 4.10-p3. Normally, the mode of
> > /dev/null should be 666, but recently, I find that its mode is changed
> > to 600 automatically after reboot, I have checked all /etc/rc* and
> > /usr/local/etc/rc.d/*, but I can't get anything that led to it...:-(
>
> Probably a local error. Try changing scripts to #!/bin/sh -x and
> carefully watch (or log) the boot process. Start with any local
> scripts you have since it's most likely to be a problem there.
>
> Kris

Actually, I have found this happening on my 4.10 boxen as well. I thought it was some one-time glitch and just chmodded the thing back. I didn't even think about it until I saw this post. I will try to catch the circumstances surrounding it if it happens again.

Sven