Re: Fix for grub 2.00/bzr kfreebsd to boot 9.1 kernels
Hi Juergen, My email address has changed so don't be alarmed if your CC bounces. :) An update of grub2 is long overdue, so I'll work on that right now. -Rick On 2012/07/21 11:58, Juergen Lock wrote: > Hi! > > I'm in the process of testing 9.1 on the laptop where I use grub2 > because I had to put bsd in an `extended' slice, and I found out > grub2 won't boot the 9.1 kernel. Asked on #grub where phcoder > found the fix after I made him a test iso using grub-mkrescue: > > http://paste.debian.net/180121/ > > Applied that to grub 2.00 from here: > > http://ftp.gnu.org/gnu/grub/grub-2.00.tar.xz > > (built on a Linux debian slice with checkinstall), and that got > the 9.1 kernel booting. So maybe the sysutils/grub2 maintainer > (Cc'd) wants to update the port to 2.00 and add the patch in > files/? :) (It's still at 1.98 currently where the patch doesn't > apply.) > > The kfreebsd way to boot this affects is the same as in this > earlier post: > > > http://lists.freebsd.org/pipermail/freebsd-multimedia/2011-March/011828.html > > (i.e. this is not chainloading bsd's loader but grub loading the > kernel and klds itself. I think this way to boot was originally > added by the debian kfreebsd guys, hence the command kfreebsd...) > > HTH, > Juergen > ___ > freebsd-curr...@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org" ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: NFSv4 - how to set up at FreeBSD 8.1 ?
> Rick Macklem wrote:
>
> > Sun did add a separate file locking protocol called the NLM
> > or rpc.lockd if you prefer, but that protocol design was
> > fundamentally flawed imho and, as such, using it is in the
> > "your mileage may vary" category.
>
> I suppose it was not all that bad, considering that what it sought
> to accomplish is incomputable. There is simply no way for either
> the server or the client to distinguish between "the other end has
> crashed" and "there is a temporary communication failure" until the
> other end comes back up or communication is restored.
>
Yep. The blocking lock operation is also a trainwreck looking for a
place to happen, imho. (In the NLM, the client can do an RPC that says
"get a lock, waiting as long as necessary for it, and then let me know".)

> On a good day, in a completely homogeneous environment (server and
> all clients running the same OS revision and patchlevel), I trust
> lockd about as far as I can throw 10GB of 1980's SMD disk drives :)
>
Heh, heh. For those too young to have had the privilege, a 1980s SMD
drive was big and HEAVY. I just about got a hernia every time one had to
go in a 19inch rack. You definitely didn't throw them far:-)

> Exporting /var/spool/mail read/write tends to ensure that good days
> will be rare. Been there, done that, seen the result. Never again.
> That's what IMAP is for.
>
Great post. I couldn't have said it as well, rick
Re: NFSv4 - how to set up at FreeBSD 8.1 ?
> John Baldwin wrote: > > > ... even NFS UDP mounts maintain their own set of "socket" state > > to manage retries and retransmits for UDP RPCs. > > Not according to what I remember of the SunOS NFS documentation, > which indicated that the driving force behind using UDP instead of > TCP was to have the server be _completely_ stateless. (Of course > locking is inherently stateful; they made it very clear that the > locking protocol was considered to be an adjunct rather than part > of the NFS protocol itself.) > For UDP, in the server all requests show up at socket/port 2049. They pretty quickly discovered that retries of non-idempotent RPCs trashed things, so the Duplicate Request Cache was invented, which is really state that doesn't have to be recovered after a server crash. (By Chet Jacuzak at DEC, if I recall correctly, who is living on a little island on a lake up in Maine, last I heard.) My recollection of why Sun didn't use TCP was that "they knew that the overhead would be excessive", which wasn't completely untrue, given the speed of an MC68020. > It's been quite a few years since I read that, and I didn't get > into the details, but I suppose the handle returned to a client (in > response to a mount or open request) must have contained both a > representation of the inode number and a unique identification of > the filesystem (so that, in the case where server crash recovery > included a newfs and reload from backup, the FS ID would not match > and the client would get a "stale handle" response). All of the > retry and retransmit burden had to have been managed by the client, > for both reading and writing. Yea, it depended on how the backup was done. To avoid "stale handle" the backup/reload had to retain the same i-nodes, including the generation number in them. 
(But, then, those 1980s SMD disks never trashed the file systems, or did they?:-) You shouldn't get me reminiscing on the good ole days, rick
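The Duplicate Request Cache idea mentioned in this message can be illustrated with a toy userspace sketch. This is not the DEC or FreeBSD implementation: the names (`struct drc`, `drc_handle`), the fixed-size table, and keying on a client id plus XID are all assumptions made for illustration. The point is only that a retransmitted non-idempotent request gets the cached reply instead of being executed a second time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DRC_SLOTS 64            /* hypothetical cache size */

struct drc_entry {
	int      valid;
	uint32_t client;        /* stand-in for the client's address */
	uint32_t xid;           /* RPC transaction id */
	int      reply;         /* result of the first execution */
};

struct drc {
	struct drc_entry slot[DRC_SLOTS];
	int executions;         /* how often the operation really ran */
};

void
drc_init(struct drc *c)
{
	assert(c != NULL);
	memset(c, 0, sizeof(*c));
}

/* Stand-in for a non-idempotent operation such as REMOVE or RENAME. */
static int
do_nonidempotent_op(struct drc *c, int arg)
{
	c->executions++;
	return (arg * 2);       /* arbitrary "result" for the demo */
}

/*
 * Handle one request. A retransmission (same client and xid) is answered
 * from the cache. A colliding entry is simply overwritten, so this toy
 * cache can forget old replies.
 */
int
drc_handle(struct drc *c, uint32_t client, uint32_t xid, int arg)
{
	struct drc_entry *e;

	assert(c != NULL);
	e = &c->slot[(client ^ xid) % DRC_SLOTS];
	if (e->valid && e->client == client && e->xid == xid)
		return (e->reply);      /* duplicate: replay the old reply */

	e->reply = do_nonidempotent_op(c, arg);
	e->client = client;
	e->xid = xid;
	e->valid = 1;
	return (e->reply);
}
```

Because the table lives only in memory and is rebuilt from nothing after a restart, it matches the description above of state that does not have to be recovered after a server crash.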
Re: NFSv4 - how to set up at FreeBSD 8.1 ?
> > Not according to what I remember of the SunOS NFS documentation,
> > which indicated that the driving force behind using UDP instead of
> > TCP was to have the server be _completely_ stateless. (Of course
> > locking is inherently stateful; they made it very clear that the
> > locking protocol was considered to be an adjunct rather than part
> > of the NFS protocol itself.)
>
When I said I recalled that they didn't do TCP because of excessive
overhead, I forgot to mention that my recollection could be wrong.
Also, I suspect you are correct w.r.t. the above statement. (i.e. Sun's
official position vs something I heard.)

Anyhow, apologies if I gave the impression that I was correcting your
statement. My intent was just to throw out another statement that I
vaguely recalled someone at Sun stating.

rick
Re: NFSv4 - how to set up at FreeBSD 8.1 ?
> On 7 January 2011 08:16, Rick Macklem wrote:
>
> > When I said I recalled that they didn't do TCP because of excessive
> > overhead, I forgot to mention that my recollection could be wrong.
> > Also, I suspect you are correct w.r.t. the above statement. (i.e.
> > Sun's official position vs something I heard.)
> >
> > Anyhow, apologies if I gave the impression that I was correcting
> > your statement. My intent was just to throw out another statement
> > that I vaguely recalled someone at Sun stating.
>
> After hitting yet another serious bug in 8.2 ; I reverted back to 8.1
>
> Interestingly, it now complains about having V4: / in /etc/exports
>
At one time the V4: line was required to be at the end of the /etc/exports
file. (You could consider that a bug left over from the OpenBSD port, where
it was a separate section of /etc/exports.) I removed that restriction from
mountd.c at some point, but maybe after 8.1. So, try just moving the "V4:"
line to the end of /etc/exports.

> NFSv4 isn't available in 8.1 ?
>
It should be there, rick
Re: NFS - DNS fail stops boot in mountlate
> On Thu, Jan 06, 2011 at 09:19:06PM -0500, grarpamp wrote:
> > So what was unclear?
> >
> > mount_nfs emits a nonzero exit status upon failing to look
> > up an FQDN causing mountlate to trigger a dump to shell
> > on boot during rc processing. That's a *showstopper*. The
> > right thing to do is to hack mount_nfs to punt to background
> > mounting in this case with an appropriate exit status.
> >
> > Personally I'd distinguish mount_nfs exit codes between:
> > 0 - mounted
> > 1 - backgrounded, for any reason
> > 2 - none of the above
> > and adjust the rc's to deal with it accordingly.
> >
> > Words are subject to interpretation and take time. Though
> > perhaps masked by brevity, I believe all the above elements
> > were in the prior concise post. Thanks everybody :)
>
> So basically the problem is that the "bg" option in mount_nfs only
> applies to "network unreachable" conditions and not "DNS resolution
> failed" conditions.
>
> Initially I was going to refute the above request until I looked closely
> at the mount_nfs(8) man page which has the following clauses:
>
>     For non-critical file systems, the bg and retrycnt options
>     provide mechanisms to prevent the boot process from hanging
>     if the server is unavailable.
>
>     [...describing the "bg" option...]
>
>     Useful for fstab(5), where the file system mount is not
>     critical to multiuser operation.
>
> I read these statements to mean "if -o bg is used, the system should not
> hang/stall/fail during the boot process". Dumping to /bin/sh on boot as
> a result of a DNS lookup failure violates those statements, IMHO.
>
> I would agree that DNS resolution should be part of the bg/retry feature
> of "bg" in mount_nfs. How/whether this is feasible to implement is
> unknown to me.
>
I don't think punting to "bg" when a DNS failure occurs is a particularly
good idea, mostly because it doesn't help for critical mounts. (I haven't
looked to see if the change is feasible, either.)

It would be nice to get DNS working more reliably early in boot and then,
of course, there is what Doug stated w.r.t. use IP numbers or put entries
in /etc/hosts for NFS servers.

rick
ps: I do think that "server unavailable" doesn't imply "server is available,
    but DNS can't resolve its address".
Re: Hang in VOP_LOCK1_APV on 8-STABLE with NFS.
> Hi, > > OpenOffice hangs on NFS when I try to save a file or even when I try > to > open the save dialog in this case. > > > $ 17:25:35 ron...@ronald [~] > procstat -kk 85575 > PID TID COMM TDNAME KSTACK > 85575 100322 soffice.bin initial thread mi_switch+0x176 > sleepq_wait+0x3b __lockmgr_args+0x655 vop_stdlock+0x39 > VOP_LOCK1_APV+0x46 > _vn_lock+0x44 vget+0x67 vfs_hash_get+0xeb nfs_nget+0xa8 > nfs_lookup+0x65e > VOP_LOOKUP_APV+0x40 lookup+0x48a namei+0x518 kern_statat_vnhook+0x82 > kern_statat+0x15 lstat+0x22 syscallenter+0x186 syscall+0x40 > 85575 100502 soffice.bin - mi_switch+0x176 > sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _sleep+0x1a0 > do_cv_wait+0x639 __umtx_op_cv_wait+0x51 syscallenter+0x186 > syscall+0x40 > Xfast_syscall+0xe2 > 85575 100576 soffice.bin - mi_switch+0x176 > sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _sleep+0x1a0 > do_cv_wait+0x639 __umtx_op_cv_wait+0x51 syscallenter+0x186 > syscall+0x40 > Xfast_syscall+0xe2 > 85575 100577 soffice.bin - mi_switch+0x176 > sleepq_catch_signals+0x309 sleepq_wait_sig+0xc _sleep+0x25d > kern_accept+0x19c accept+0xfe syscallenter+0x186 syscall+0x40 > Xfast_syscall+0xe2 > 85575 100578 soffice.bin - mi_switch+0x176 > sleepq_catch_signals+0x309 sleepq_wait_sig+0xc _cv_wait_sig+0x10e > seltdwait+0xed poll+0x457 syscallenter+0x186 syscall+0x40 > Xfast_syscall+0xe2 > 85575 100579 soffice.bin - mi_switch+0x176 > sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 > _cv_timedwait_sig+0x11d seltdwait+0x79 poll+0x457 syscallenter+0x186 > syscall+0x40 Xfast_syscall+0xe2 > > $ 17:25:35 ron...@ronald [~] > uname -a > FreeBSD ronald.office.base.nl 8.2-PRERELEASE FreeBSD 8.2-PRERELEASE > #6: > Mon Dec 27 23:49:30 CET 2010 > r...@ronald.office.base.nl:/usr/obj/usr/src/sys/GENERIC amd64 > I think all the above tells us is that the thread is waiting for a vnode lock. The question then becomes "what is holding a lock on that vnode and why?". > It is not possible to exit or kill soffice.bin. 
> I had a slightly different procstat stack before, but that was fixed
> a couple of days ago.

Yea, it will be in an uninterruptible sleep when waiting for a vnode lock.

> Any thoughts? Enabling local locks in NFS doesn't fix it.

Here are some things you could try:
1 - apply the attached patch. It fixes a known problem w.r.t. the client
    side of the krpc. Not likely to fix this, but I can hope:-)
2 - If #1 doesn't fix the problem:
    - before making it hang, start capturing packets via:
      # tcpdump -s 0 -w xxx host server
    - then make it hang, kill the above and
      # procstat -ka
      # ps axHlww
      and capture the output of both of these.
    Hopefully these 2 commands will indicate what is holding the vnode lock
    and maybe, why. The "xxx" file can be looked at in wireshark to see
    what/if any NFS traffic is happening.
    If you aren't comfortable looking at the above, you can email them to me
    and I'll take a stab at them someday.
3 - Try the experimental client to see if it behaves differently. The mount
    command is:
    # mount -t newnfs -o nfsv3, server:/path /mntpath
    (This might identify if the regular client has an infrequently executed
    code path that forgets to unlock the vnode, since it uses a somewhat
    different RPC layer. The buffer cache handling etc are almost the same,
    but the RPC stuff is fairly different.)

> The nfs server is an up-to-date Linux Debian 5 with kernel 2.6.26.
>
I'm afraid I can't blame Linux (at least not until we have more info;-).

> If more info is needed. I can easily reproduce this.

See above #2.

Good luck with it and let us know how it goes, rick
Re: Specifying root mount options on diskless boot.
> Daniel Braniss writes... > > > I have it pxebooting nicely and running with an NFS root > > but it then reports locking problems: devd, syslogd, moused (and > > maybe > > others) lock their PID file to protect against multiple instances. > > Unfortunately, these daemons all start before statd/lockd and so the > > locking fails and reports "operation not supported". > > Are you mounting /var via nfs? You can use the "nolockd" mount option to make locking happen locally in the client. (Only a problem if the file(s) being locked are concurrently shared with other clients.) I don't know if this would fix your diskless problem. rick ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Re; NFS performance
> It has been suggested that I move this thread to freebsd-stable. The > thread so far (deficient NFS performance in FreeBSD 8): > > http://lists.freebsd.org/pipermail/freebsd-hackers/2011-January/034006.html > > I updated my kernel to FreeBSD 8.2-PRERELEASE. This improved my > throughput, but still not to the level I got from 7.3-STABLE. Here's > an updated table from my original message: > > Observed bytes per second (dd if=filename of=/dev/null bs=65536): > Source machine: mattapan scollay sullivan > Destination machine: > wonderland/7.3-STABLE 870K 5.2M 1.8M > wonderland/8.1-STABLE 496K 690K 420K > wonderland/8.2-PRERELEASE 800K 1.2M 447K > > Furthermore, I was still able to induce the NFS "server not > responding" > message with 8.2-PRERELEASE. So I applied the patch from Rick Macklem. > The throughput did not change, but I haven't seen the NFS "server not > responding" message yet. So, did the patch get rid of the 1min + stalls you reported earlier? Beyond that, all I can suggest is trying to fiddle with some of the options on the net device driver, such as rxcsum, txcsum and tso. (I think tso has had some issues for some drivers, but I don't know any specifics.) When I've seen poor NFS perf. it has usually been a problem at the network device driver level. (If you have a different kind of network card handy, you could try swapping them. Basically one with a different net chipset so that it uses a different net device driver.) rick ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Hang in VOP_LOCK1_APV on 8-STABLE with NFS.
> > Hi, > > I have got the first steps set up. No solution yet. > 1. With the patch OpenOffice opens my homedir (yeah!), but it gives an > I/O > error when saving a file and everything hangs after that. Hmm, I don't think you mentioned what server you were using. It wouldn't happen to be a FreeBSD one exported ZFS? If so, make sure you have this patch in it: http://people.freebsd.org/~rmacklem/freebsd8.0-patches/freebsd8-nfsserver-estale.patch (With it a stale file handle can result in EIO from a server exporting ZFS and that can make the client loop around, retrying the RPC.) > 2. I have dumps and stuff. I will mail some links in private e-mail. I'll take a look at some point. > 3. Didn't work. It mount, but ls -l /home gives "Operation not > permitted". > It should work. This hints at a server issue. Anyhow, I'll look at the dumps at some point, rick ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Hang in VOP_LOCK1_APV on 8-STABLE with NFS.
> > > > Hi, > > > > I have got the first steps set up. No solution yet. > > 1. With the patch OpenOffice opens my homedir (yeah!), but it gives > > an > > I/O > > error when saving a file and everything hangs after that. > > Hmm, I don't think you mentioned what server you were using. It > wouldn't happen to be a FreeBSD one exported ZFS? If so, make > sure you have this patch in it: > http://people.freebsd.org/~rmacklem/freebsd8.0-patches/freebsd8-nfsserver-estale.patch > (With it a stale file handle can result in EIO from a server exporting Oops, I meant "Without the patch a stale file handle...", rick ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: nfsd stuck in *rc_lock state
> Hello Rick, > > Am 11.11.2010 23:54, schrieb Rick Macklem: > > That patch is "self contained", so I think it should be fine to > > apply it > > to an 8.0 server. > > > > You might also want > > > > http://people.freebsd.org/~rmacklem/freebsd8.0-patches/freebsd8-svc-mbufleak.patch > > which plugged an mbuf leak in the regular FreeBSD8.0 server. > > > > Good luck with it, rick > > the patch fixes the 100% cpu utilization, but we now had two times the > issue, that all boxes lost connection to the nfs server (/home not > responding), but nfsd was at about 1%. > > Top did not show a strange behaviour here: > > > PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND > 703 root 55 0 4772K 1384K RUN 5 329:12 1.37% > {nfsd: service} > 703 root 56 0 4772K 1384K rpcsvc 0 326:41 0.59% > {nfsd: service} > 703 root 52 0 4772K 1384K rpcsvc 6 326:28 0.29% > {nfsd: service} > 703 root 60 0 4772K 1384K rpcsvc 5 328:42 0.00% > {nfsd: master} > 703 root 54 0 4772K 1384K rpcsvc 0 327:44 0.00% > {nfsd: service} > 703 root 53 0 4772K 1384K rpcsvc 1 327:37 0.00% > {nfsd: service} > 703 root 54 0 4772K 1384K rpcsvc 6 326:51 0.00% > {nfsd: service} > 703 root 57 0 4772K 1384K rpcsvc 2 326:44 0.00% > {nfsd: service} > 703 root 50 0 4772K 1384K rpcsvc 1 326:20 0.00% > {nfsd: service} > 703 root 71 0 4772K 1384K rpcsvc 2 323:11 0.00% > {nfsd: service} > 703 root 47 0 4772K 1384K rpcsvc 7 321:11 0.00% > {nfsd: service} > 703 root 46 0 4772K 1384K tx->tx 2 320:00 0.00% > {nfsd: service} > > there was nothing special in the logfiles, too. > How to debug such a situation? > First off, I hope you don't mind me adding the mailing list as a cc. I'd like this stuff captured in the archive for others to see. (If people don't like the noise, I'll take the heat:-) Ok, I'm sure others have better techniques, but here's how I would start trying to resolve the above, done when the server is stuck. 1 - Make sure the network is still functioning for other things like ssh. 
2 - Do a "ps axHlww" and look at all the nfsd threads. I am primarily
    interested in the MWCHAN field. If it is:
    rpcsvc - the thread is just waiting for an RPC-->normal
    ufs or zfs - waiting for a vnode lock on the underlying file system
    anything else - I need to look in the kernel sources for the "sleep"
                    with that argument.
    If I can't easily explain what all the nfsd threads are waiting for,
    wading through a "procstat -ka" is my next step. (I find this rather
    painful, so I tend to delay doing this as long as possible.:-)
3 - Do a "nfsstat -s" repeatedly and see if any of the counters are
    increasing.
4 - Fire up a "tcpdump" and see if there is any NFS traffic. (If there is,
    I'll capture it and put it in wireshark.)
5 - Do a "vmstat -z | fgrep mbuf" and look at the mbuf allocation. (If the
    machine is running out of mbufs, all sorts of quirky behaviour is
    possible.)

What top shows above isn't much, although I'd wonder what mbuf usage looks
like?

If you haven't applied the patch mentioned in the above message, you should
do that.

I don't know if this helps, but... rick
Re: NFS performance
> > So, did the patch get rid of the 1min + stalls you reported earlier?
>
> Yes. The stalls (and the "server not responding" log messages) are
> gone. Thanks! -- George
>
Ok, that's a start anyhow. Maybe someday we can explain the slow
read rates you are still observing.

Thanks for letting us know, rick
important NFS client patch for FreeBSD8.n
I just committed a patch (r217242) to head. Anyone who is using client
side NFS on FreeBSD8.n should apply this patch. It is also available at:
http://people.freebsd.org/~rmacklem/krpc.patch

It fixes a problem where the kernel rpc assumes that 4 bytes of data
exist in the first mbuf without checking. If the data straddles multiple
mbufs, it uses garbage and then a typical case will wedge for a minute or
so until it times out and establishes a new TCP connection.

It also replaces m_pullup() with m_copydata(), since m_pullup() can fail
in rare cases even when the data is available. (m_pullup() uses
MGET(, M_DONTWAIT,) which can fail when mbuf allocation is constrained,
for example.)

Thanks to john.gemignani at isilon.com for spotting this problem, rick
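To make the "4 bytes straddling multiple mbufs" case concrete, here is a toy userspace model. It is not kernel code and does not use the real mbuf API: `struct seg`, `chain_copydata()` and `chain_read_u32()` are invented for illustration, and unlike the kernel's m_copydata() the copy helper here returns an error code when the chain is too short. The idea is simply to walk the chain and copy into a local buffer instead of assuming the first segment holds all four bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for an mbuf: one contiguous piece of a longer message. */
struct seg {
	const unsigned char *data;
	size_t               len;
	const struct seg    *next;
};

/*
 * Copy len bytes starting at offset off out of the chain into dst,
 * walking as many segments as needed. Returns 0 on success or -1 if
 * the chain holds fewer than off + len bytes.
 */
int
chain_copydata(const struct seg *s, size_t off, size_t len, void *dst)
{
	unsigned char *out = dst;

	assert(dst != NULL);
	/* Skip whole segments that lie before the requested offset. */
	while (s != NULL && off >= s->len) {
		off -= s->len;
		s = s->next;
	}
	while (len > 0 && s != NULL) {
		size_t n = s->len - off;

		if (n > len)
			n = len;
		if (n > 0)
			memcpy(out, s->data + off, n);
		out += n;
		len -= n;
		off = 0;
		s = s->next;
	}
	return (len == 0 ? 0 : -1);
}

/* Read a 32-bit big-endian value, e.g. an XID, from the start of a chain. */
int
chain_read_u32(const struct seg *s, uint32_t *val)
{
	unsigned char b[4];

	assert(val != NULL);
	if (chain_copydata(s, 0, sizeof(b), b) != 0)
		return (-1);
	*val = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	    ((uint32_t)b[2] << 8) | (uint32_t)b[3];
	return (0);
}
```

Code that dereferenced only the first segment's data as a 32-bit word would read past a 1-byte segment and pick up garbage, which is the failure mode described above.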
Re: NFS 75 second stall
> We're occasionally seeing these same types of stalls (+ repeated "is
> not responding" "is alive again" messages in quick succession). We're
> seeing it only on our 8.1-RELEASE systems against a variety of NFS
> servers (6.3-RELEASE, 7.2-RELEASE, and 8-STABLE from before the release
> of 8.1). We also see it happen with a variety of client hardware and
> network adapters (em, bce, bge); the only common denominator is
> 8.1-RELEASE on the clients.
>
I think this might be fixed by r217242 in head, which went into stable/8
as r217527. This krpc patch is also available at:
http://people.freebsd.org/~rmacklem/krpc.patch

Thanks go to John Gemignani for spotting this bug in the krpc code.

It will not be in 8.2, so please grab the patch if you are using
either NFS client in any FreeBSD8.n system, rick
Re: statd/lockd startup failure
I've seen this intermittently for mountd. I think the problem is that
the code finds an unused port for udp/ip6 and then tries to use the same
port# for tcp/ip6, udp/ip4, tcp/ip4. All three daemons have essentially
the same function for doing this.

The attached patches change the behaviour so that it tries to get an
unused port for each of the 4 cases. (This all applies to the "wildcard"
case, where no port# or hosts have been specified as command args.)

If you have the chance to try these patches, please let us know how they
work for you?

rick
ps: I lost track of the thread, so I don't know who started it, but
    hopefully, they are on the cc list?

--- usr.sbin/mountd/mountd.c.sav 2011-02-17 21:45:32.0 -0500
+++ usr.sbin/mountd/mountd.c 2011-02-17 23:23:37.0 -0500
@@ -510,6 +510,7 @@ create_service(struct netconfig *nconf)
     int r;
     int registered = 0;
     u_int32_t host_addr[4]; /* IPv4 or IPv6 */
+    int mallocd_svcport = 0;
 
     if ((nconf->nc_semantics != NC_TPI_CLTS) &&
         (nconf->nc_semantics != NC_TPI_COTS) &&
@@ -620,7 +621,7 @@ create_service(struct netconfig *nconf)
                 sin->sin_addr.s_addr = htonl(INADDR_ANY);
                 res->ai_addr = (struct sockaddr*) sin;
                 res->ai_addrlen = (socklen_t)
-                    sizeof(res->ai_addr);
+                    sizeof(struct sockaddr_in);
                 break;
             case AF_INET6:
                 sin6 = malloc(sizeof(struct sockaddr_in6));
@@ -631,10 +632,12 @@ create_service(struct netconfig *nconf)
                 sin6->sin6_addr = in6addr_any;
                 res->ai_addr = (struct sockaddr*) sin6;
                 res->ai_addrlen = (socklen_t)
-                    sizeof(res->ai_addr);
+                    sizeof(struct sockaddr_in6);
                 break;
             default:
-                break;
+                syslog(LOG_ERR, "bad addr fam %d",
+                    res->ai_family);
+                exit(1);
             }
         } else {
             if ((aicode = getaddrinfo(NULL, svcport_str,
@@ -700,6 +703,7 @@ create_service(struct netconfig *nconf)
         svcport_str = malloc(NI_MAXSERV * sizeof(char));
         if (svcport_str == NULL)
             out_of_mem();
+        mallocd_svcport = 1;
 
         if (getnameinfo(res->ai_addr,
             res->ai_addr->sa_len, NULL, NI_MAXHOST,
@@ -715,6 +719,12 @@ create_service(struct netconfig *nconf)
             exit(1);
         }
 
+        if (mallocd_svcport != 0) {
+            free(svcport_str);
+            svcport_str = NULL;
+            mallocd_svcport = 0;
+        }
+
         servaddr.buf = malloc(res->ai_addrlen);
         memcpy(servaddr.buf, res->ai_addr, res->ai_addrlen);
         servaddr.len = res->ai_addrlen;
--- usr.sbin/rpc.statd/statd.c.sav 2011-02-17 23:36:15.0 -0500
+++ usr.sbin/rpc.statd/statd.c 2011-02-17 23:37:53.0 -0500
@@ -233,6 +233,7 @@ create_service(struct netconfig *nconf)
     int r;
     int registered = 0;
     u_int32_t host_addr[4]; /* IPv4 or IPv6 */
+    int mallocd_svcport = 0;
 
     if ((nconf->nc_semantics != NC_TPI_CLTS) &&
         (nconf->nc_semantics != NC_TPI_COTS) &&
@@ -326,7 +327,7 @@ create_service(struct netconfig *nconf)
                 sin->sin_addr.s_addr = htonl(INADDR_ANY);
                 res->ai_addr = (struct sockaddr*) sin;
                 res->ai_addrlen = (socklen_t)
-                    sizeof(res->ai_addr);
+                    sizeof(struct sockaddr_in);
                 break;
             case AF_INET6:
                 sin6 = malloc(sizeof(struct sockaddr_in6));
@@ -336,10 +337,13 @@ create_service(struct netconfig *nconf)
                 sin6->sin6_port = htons(0);
                 sin6->sin6_addr = in6addr_any;
                 res->ai_addr = (struct sockaddr*) sin6;
-                res->ai_addrlen = (socklen_t) sizeof(res->ai_addr);
+                res->ai_addrlen = (socklen_t)
+                    sizeof(struct sockaddr_in6);
                 break;
             default:
-                break;
+                syslog(LOG_ERR, "bad addr fam %d",
+                    res->ai_family);
+                exit(1);
             }
         } else {
             if ((aicode = getaddrinfo(NULL, svcport_str,
@@ -401,6 +405,7 @@ create_service(struct netconfig *nconf)
         svcport_str = malloc(NI_MAXSERV * sizeof(char));
         if (svcport_str == NULL)
             out_of_mem();
+        mallocd_svcport = 1;
 
         if (getnameinfo(res->ai_addr,
             res->ai_addr->sa_len, NULL, NI_MAXHOST,
@@ -416,6 +421,12 @@ create_service(struct netconfig *nconf)
             exit(1);
         }
 
+        if (mallocd_svcport != 0) {
+            free(svcport_str);
+            svcport_str = NULL;
+            mallocd_svcport = 0;
+        }
+
         servaddr.buf = malloc(res->ai_addrlen);
         memcpy(servaddr.buf, res->ai_addr, res->ai_addrlen);
         servaddr.len = res->ai_addrlen;
--- usr.sbin/rpc.lockd/lockd.c.sav 2011-02-17 23:29:48.0 -0500
+++ usr.sbin/rpc.lockd/lockd.c 2011-02-17 23:35:47.0 -0500
@@ -403,6 +403,7 @@ create_service(struct netconfig *nconf)
     int r;
     int registered = 0;
     u_int32_t host_addr[4]; /* IPv4 or IPv6 */
+    int mallocd_svcport = 0;
 
     if ((nconf->nc_semantics != NC_TPI_CLTS) &&
         (nconf->nc_semantics != NC_TPI_COTS) &&
@@ -497,7 +498,7 @@ create_service(struct netconfig *nconf)
                 sin->sin_a
Re: statd/lockd startup failure
> On 02/18/2011 10:08, Rick Macklem wrote:
> > The attached patches change the behaviour so that it tries to
> > get an unused port for each of the 4 cases.
>
> Am I correct in assuming that what you're proposing is to (potentially)
> have different ports for all 4 combinations? I would suggest that this
> is not the right way to solve the problem. If I misunderstand, I
> apologize.
>
Well, that was what I was proposing. I could be wrong, but as far as I
know, this is allowed by Sun RPC. The port#s are assigned dynamically and
registered with rpcbind. (I don't necessarily agree with the design, but
this was/is how Sun RPC does it. The philosophy was/is that apps. don't know
what port# is being used and shouldn't care. If sysadmins want to use a
fixed port#, they can use command line options to override the default
dynamic assignment. And, yes, this is one reason that Sun RPC is a pita
w.r.t. firewalls. 1980s design...)

I don't know an easy way to get an unassigned port# that is available
for all 4 combinations of udp,tcp X ip4,ip6.

If others know how to get a port# that is available for all 4 cases, I
could implement that.

rick
Re: NFS client over udp
> --- On Fri, 2/18/11, Kirill Yelizarov
> > > On Fri, Feb 18, 2011 at 05:27:00AM -0800, Kirill Yelizarov wrote:
> > > > I have a reproducible memory leak when using nfs client with an old
> > > > nfs server
>
> and mbufs used
> 8193/1722/9915 mbufs in use (current/cache/total)
> 8192/1264/9456/25600 mbuf clusters in use (current/cache/total/max)
> 8192/605 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/768/768/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> 18432K/6030K/24462K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/0/0 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines
>
> Kirill
>
You could try the attached patch. It fixes the only places in the client
side krpc over udp that seem like they might cause a leak. I have no idea
if it will help, since these cases should rarely, if ever, happen in
practice.

Please let us know if you have the chance to try the patch and whether or
not it helped.

rick

--- rpc/clnt_dg.c.sav 2011-02-19 19:52:41.0 -0500
+++ rpc/clnt_dg.c 2011-02-20 10:43:11.0 -0500
@@ -704,9 +704,9 @@ got_reply:
         (reply_msg.acpted_rply.ar_stat == SUCCESS))
         errp->re_status = stat = RPC_SUCCESS;
     else
-        stat = _seterr_reply(&reply_msg, &(cu->cu_error));
+        stat = _seterr_reply(&reply_msg, errp);
 
-    if (errp->re_status == RPC_SUCCESS) {
+    if (stat == RPC_SUCCESS) {
         results = xdrmbuf_getall(&xdrs);
         if (! AUTH_VALIDATE(auth, xid,
             &reply_msg.acpted_rply.ar_verf,
@@ -1089,11 +1089,14 @@ clnt_dg_soupcall(struct socket *so, void
         /*
          * The XID is in the first uint32_t of the reply.
          */
-        if (m->m_len < sizeof(xid) && m_length(m, NULL) < sizeof(xid))
+        if (m->m_len < sizeof(xid) && m_length(m, NULL) < sizeof(xid)) {
             /*
              * Should never happen.
              */
+            printf("clnt_dg_soupcall: received garbage\n");
+            m_freem(m);
             continue;
+        }
         m_copydata(m, 0, sizeof(xid), (char *)&xid);
         xid = ntohl(xid);
Re: statd/lockd startup failure
> Hi--
>
> On Feb 19, 2011, at 1:16 PM, Rick Macklem wrote:
> > Well, that was what I was proposing. I could be wrong, but as far as I
> > know, this is allowed by Sun RPC. The port#s are assigned dynamically and
> > registered with rpcbind. (I don't necessarily agree with the design, but
> > this was/is how Sun RPC does it. The philosophy was/is that apps. don't
> > know what port# is being used and shouldn't care. If sysadmins want to
> > use a fixed port#, they can use command line options to override the
> > default dynamic assignment. And, yes, this is one reason that Sun RPC is
> > a pita w.r.t. firewalls. 1980s design...)
>
> Trying to force SunRPC and old NFS through fixed ports in order to
> pass through a firewall sounds like a lot more work, and weakens the
> security of a firewall to such a significant extent that I have to
> wonder if it is the right problem to solve. :-)
>
> Why not setup a VPN via OpenVPN/IPSec/ssh+ppp/etc...?
>
Well, the discussion was how to fix a problem where the dynamically
assigned port# for one of (udp,tcp X ip6,ip4) wasn't available for
the others. The test patch I posted allowed each of the four to select
different port#s.

The daemons already allow specification of a fixed port# (-p option) for
anyone who wants a fixed port#. (And yes, I see not being able to run this
stuff through a firewall as a feature and not a bug.)

rick
Re: statd/lockd startup failure
> On 02/19/2011 13:16, Rick Macklem wrote:
> >> On 02/18/2011 10:08, Rick Macklem wrote:
> >>> The attached patches change the behaviour so that it tries to
> >>> get an unused port for each of the 4 cases.
> >>
> >> Am I correct in assuming that what you're proposing is to (potentially)
> >> have different ports for all 4 combinations? I would suggest that this
> >> is not the right way to solve the problem. If I misunderstand, I
> >> apologize.
> >>
> > Well, that was what I was proposing.
>
> I think that would be a bad idea. It's hard enough to deal with tracking
> these services when they are on the same port. :)
>
> I don't think there is a single function that you can call that will
> provide you an open port on all 4, although it would probably be nice if
> we had one. Something along the line of open a port for 1, then try to
> open the same port on the other 3. If one of them fails, start the
> process over. In the common case (starting the services when the system
> starts) it shouldn't be difficult to find a port that is open on all 4.
>
Yea, it would be a much bigger patch, but should be doable. I don't know
about "tracking" (whatever that means?), but I can see the argument for
doing this so that the # of ports used is minimized.

I'll wait to see if the patch fixes the problem before I proceed.

Btw, one issue w.r.t. the above algorithm is "how many iterations do you
do before giving up, when it fails?". I'd say "forever", but logging
something every 10 attempts, since the likelihood of N failures before a
success only decreases, but never hits 0.

rick
Re: NFS client over udp
> --- On Sun, 2/20/11, Rick Macklem wrote:
>
> > From: Rick Macklem
> > Subject: Re: NFS client over udp
> > To: "Kirill Yelizarov"
> > Cc: freebsd-stable@freebsd.org
> > Date: Sunday, February 20, 2011, 9:02 PM
> >
> > [original report and netstat -m output snipped]
> >
> > You could try the attached patch. It fixes the only places in the
> > client side krpc over udp that seem like they might cause a leak. I
> > have no idea if it will help, since these cases should rarely, if
> > ever, happen in practice.
> >
> > Please let us know if you have the chance to try the patch and
> > whether or not it helped.
> >
> > rick
> >
> Rick, i tried your patch. Fortunately it didn't help me. There are no
> warnings on console and memory is climbing up during syncs and not
> freed later. I'll try to switch to tcp this evening. Thanks for help
>
I'll assume that's unfortunately;-) Since the two cases patched probably
never happen, I'm not surprised.

The only other thing I can think of that you could try is switching to
the experimental client. This would identify if the bug is in the regular
client or somewhere further down in the rpc transport.

The mount command would look something like:
# mount -t newnfs -o nfsv3,udp :

Thanks for trying it and letting me know, rick
Re: NFS client over udp
> --- On Tue, 2/22/11, Rick Macklem wrote:
>
> > From: Rick Macklem
> > Subject: Re: NFS client over udp
> > To: "Kirill Yelizarov"
> > Cc: freebsd-stable@freebsd.org
> > Date: Tuesday, February 22, 2011, 2:10 AM
> >
> > [earlier quoted messages snipped]
> >
> > > Rick, i tried your patch. Fortunately it didn't help me. There are no
> > > warnings on console and memory is climbing up during syncs and not
> > > freed later. I'll try to switch to tcp this evening. Thanks for help
> > >
> > I'll assume that's unfortunately;-) Since the two cases patched probably
> > never happen, I'm not surprised.
> >
> > The only other thing I can think of that you could try is switching to
> > the experimental client. This would identify if the bug is in the regular
> > client or somewhere further down in the rpc transport.
> >
> > The mount command would look something like:
> > # mount -t newnfs -o nfsv3,udp :
> >
> I added options NFSCL to my kernel and tried to mount. mount shows
> everything is ok:
> 192.168.0.35:/home on /mnt (newnfs)
> but when i try to cd /mnt i get permission denied.
> My export allows root and everything is done as root. What am i doing
> wrong?
>
Try adding the "resvport" option. I don't think it's a default for the
experimental client at this time. (I have a series of patches for the
client that will go into head in April that brings it in line with the
regular client, including same default mount options.)

rick
Re: NFS client over udp
> Rick,
> I have good news. I upgraded to 8.2-stable and I ran all four
> different tests (nfs client new and old, each over udp and tcp) and
> found that there is no leak in any of them. All of them behave almost
> the same, I couldn't find any difference. The speed I achieved on 1Gb
> link is 52Mb/s. The only difference is that I can't umount new nfs
> client even if there are no processes using this mount point.
> Thanks for help
> Kirill
>
Ok, sounds good, although I have no idea what might have plugged the leak.
(I looked at the revision log for clnt_dg.c and udp_usrreq.c and I
couldn't spot anything that might have fixed an mbuf leak.)

As for the "can't unmount", I assume that it reports the mount pt as
busy? (I've never seen this for the exp. client, so I have no idea what
the cause might be. Possibly some "failure" code path that lacks a
vput()/vrele() that I've never exercised?)

Anyhow, good to hear about the above, rick
Re: statd/lockd startup failure
>
> Thanks for the analysis. The reason I originally posted is to see why
> this might have popped up in 8.x, as it never happened in 7.x.
> -- George Mitchell
>
I suspect two things make this occur more frequently with 8.x. One is
that it does IPv6 first (I suspect IPv6 wasn't enabled by default on
7.x?). The other is the port randomization code, which probably results
in more frequent collisions with port #s used by other things.
(Basically, the code selects an unused port# for either UDP or TCP over
IPv6 (I can't remember which comes first:-) and then expects that port to
be available for the other 3 combinations of UDP/TCP x IPv6/IPv4.)

rick
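To make the mechanism above concrete, here is a minimal sketch of "let the
kernel pick a port, then ask for that same port again". It is not the
statd/lockd/mountd code: bind_ipv4() is a made-up helper, and for brevity it
only covers IPv4, whereas the daemons deal with UDP/TCP over both IPv6 and
IPv4.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Bind a new IPv4 socket of the given type (SOCK_DGRAM or SOCK_STREAM)
 * to the wildcard address. "port" is in host byte order and 0 asks the
 * kernel to choose an unused port. On success the descriptor is returned
 * and the port actually bound is stored in *bound. On failure -1 is
 * returned with errno set by the failing call.
 */
static int
bind_ipv4(int socktype, unsigned short port, unsigned short *bound)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int s, saved;

	assert(bound != NULL);
	s = socket(AF_INET, socktype, 0);
	if (s < 0)
		return (-1);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_ANY);
	sin.sin_port = htons(port);
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
	    getsockname(s, (struct sockaddr *)&sin, &len) < 0) {
		saved = errno;	/* close() may overwrite errno */
		close(s);
		errno = saved;
		return (-1);
	}
	*bound = ntohs(sin.sin_port);
	return (s);
}
```

A caller would do bind_ipv4(SOCK_DGRAM, 0, &port) and then
bind_ipv4(SOCK_STREAM, port, &port). The second call fails with EADDRINUSE
whenever something else already holds that TCP port, which is the kind of
collision described above.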
Re: statd/lockd startup failure
> >> On 02/18/2011 10:08, Rick Macklem wrote:
> >> > The attached patches change the behaviour so that it tries to
> >> > get an unused port for each of the 4 cases.
> >>
> >> can you send me the patches?
> >> thanks,
> >> danny
> >
> > They're attached. If you get to test them, please let me know
> > how it goes.
> >
> > rick
>
> Hi Rick,
> the good side of living on different time zones :-)
> I got impatient, so I came up with a different fix.
> The rationale is that IMHO, there is no need for all listeners
> to be on the same port:
> rnd> rpcinfo protonew |grep mountd
>   15   1   udp6   ::.3.141        mountd   superuser
>   15   3   udp6   ::.3.141        mountd   superuser
>   15   1   tcp6   ::.3.141        mountd   superuser
>   15   3   tcp6   ::.3.141        mountd   superuser
>   15   1   udp    0.0.0.0.3.141   mountd   superuser
>   15   3   udp    0.0.0.0.3.141   mountd   superuser
>   15   1   tcp    0.0.0.0.3.92    mountd   superuser  <---
>   15   3   tcp    0.0.0.0.3.92    mountd   superuser  <---
> rnd> rpcinfo -t protonew mountd
> program 15 version 1 ready and waiting
> rpcinfo: RPC: Program/version mismatch; low version = 1, high version = 3
> program 15 version 2 is not available
> program 15 version 3 ready and waiting
>
> the patches are in:
> ftp://ftp.cs.huji.ac.il/users/danny/freebsd/patches/address_already_in_use/
>
> cheers,
> danny
>
Yep, a patch that doesn't make them all use the same port# is much
simpler. However, others, such as Doug Barton, feel that it is important
that they use the same port#. (Something he called "tracking".)

rick
Re: statd/lockd startup failure
> On 03/12/2011 02:21, Daniel Braniss wrote:
> > The problem with trying to get the same port for all tcp/udp/inet/inet6
> > though might succeed most of the time, will fail sometimes, then what?
>
> Can you please describe the scenario when it's completely impossible to
> find a port that's open on all 4 families?
>
> > I saw Doug's comment, and also the :), it's not as simple as tracking
> > port 80 or 25, needs some effort, but it's deterministic/programmable,
> > and worst case you can still use the -p option (which again will fail
> > sometimes:-).
>
> Given that Rick has already written the patch, I don't think it's at all
> unreasonable to put it in as the first choice, perhaps with a fallback
> to picking any available port if there isn't one available for all 4
> families.
>
I suppose the patch could be changed to switch to "allow any port#" after
N failed attempts at getting the same one. (I'll admit I have trouble
seeing why getting the same port# would fail "forever" unless all ports
are in use and, if that's the case, you're snookered.)

My only concern with the "same port# patch" is that it is more complex
and, therefore, somewhat riskier w.r.t. my having gotten it wrong.

> Meanwhile, I don't think I'm the only person who has ever had trouble
> trying to track down network traffic from "random" ports that would
> prefer that doing so not be made harder by having the same service on
> the same host using 4 different ports.
>
Re: statd/lockd startup failure
> On 03/13/2011 08:23, Daniel Braniss wrote:
> >> On 03/12/2011 02:21, Daniel Braniss wrote:
> >>> The problem with trying to get the same port for all
> >>> tcp/udp/inet/inet6 though might succeed most of the time, will fail
> >>> sometimes, then what?
> >>
> >> Can you please describe the scenario when it's completely impossible
> >> to find a port that's open on all 4 families?
> > i did not say impossible, considering that Rick asked how many times he
> > should try, unless N is forever, it could fail.
>
> And what I'm asking is that you describe the circumstances which might
> lead to that failure.
>
> >>> I saw Doug's comment, and also the :), it's not as simple as tracking
> >>> port 80 or 25, needs some effort, but it's
> >>> deterministic/programmable, and worst case you can still use the -p
> >>> option (which again will fail sometimes:-).
> >>
> >> Given that Rick has already written the patch, I don't think it's at
> >> all unreasonable to put it in as the first choice, perhaps with a
> >> fallback to picking any available port if there isn't one available
> >> for all 4 families.
> >>
> > as Rick mentioned, the patch is not trivial, and to quote him:
> > "My only concern with the "same port# patch" is that it is more complex
> > and, therefore, somewhat riskier w.r.t. my having gotten it wrong."
>
> Yeah, I saw that, did you see my response? I'm very much in favor of
> keeping things simple, but only as simple as they can be made.
>
[some good stuff snipped for brevity]

Ok, well I believe that the patches I currently have aren't broken. How
about I change the patches so that after N attempts fail, it does a
final attempt allowing different port#s for the 4 cases. (If that fails,
I don't think there is anything that can be done, since it means that no
port# is available for at least one of the four cases?)

Does that sound reasonable? rick

ps: I was thinking N should be somewhere in the 10<->100 range.
    Anyone want to suggest a value for N?
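The policy proposed above (up to N tries at one shared port#, then a final
attempt where each of the four cases takes whatever it can get) can be
sketched as follows. This is only an illustration, not the actual patch:
struct binder, pick_ports() and the fake_* simulation are invented names, and
the real daemons would bind sockets where the sketch calls try_bind().

```c
#include <assert.h>
#include <stddef.h>

#define NCOMBO 4	/* e.g. udp6, tcp6, udp4, tcp4 */

/*
 * Hypothetical callbacks supplied by the daemon:
 *   try_bind(ctx, combo, port) binds combination "combo" to "port", where
 *     0 means "let the kernel choose"; returns the bound port or -1.
 *   unbind(ctx, combo) releases whatever try_bind() set up for "combo".
 */
struct binder {
	int (*try_bind)(void *ctx, int combo, int port);
	void (*unbind)(void *ctx, int combo);
	void *ctx;
};

/*
 * Returns 1 if all combinations share one port, 0 if the fallback gave
 * them individual ports, and -1 if some combination could not be bound.
 */
static int
pick_ports(const struct binder *b, int max_attempts, int ports[NCOMBO])
{
	int attempt, i, port;

	assert(b != NULL && ports != NULL);
	for (attempt = 0; attempt < max_attempts; attempt++) {
		port = b->try_bind(b->ctx, 0, 0);
		if (port < 0)
			continue;
		for (i = 1; i < NCOMBO; i++)
			if (b->try_bind(b->ctx, i, port) < 0)
				break;
		if (i == NCOMBO) {
			for (i = 0; i < NCOMBO; i++)
				ports[i] = port;
			return (1);
		}
		/* Collision on combination i: release 0..i-1, start over. */
		while (--i >= 0)
			b->unbind(b->ctx, i);
	}
	/* Final attempt: each combination may use a different port. */
	for (i = 0; i < NCOMBO; i++) {
		ports[i] = b->try_bind(b->ctx, i, 0);
		if (ports[i] < 0) {
			while (--i >= 0)
				b->unbind(b->ctx, i);
			return (-1);
		}
	}
	return (0);
}

/*
 * Simulated environment for trying the policy out: the "kernel" hands
 * out next_port, next_port + 1, ... and busy_port is already taken on
 * combination 3 only.
 */
struct fake_env {
	int next_port;
	int busy_port;
	int bound;	/* combinations currently bound */
};

static int
fake_try_bind(void *ctx, int combo, int port)
{
	struct fake_env *e = ctx;

	if (port == 0)
		port = e->next_port++;
	if (combo == 3 && port == e->busy_port)
		return (-1);
	e->bound++;
	return (port);
}

static void
fake_unbind(void *ctx, int combo)
{
	struct fake_env *e = ctx;

	(void)combo;
	e->bound--;
}
```

With the simulated environment, a first pick that collides on the fourth
combination is released and retried, and with max_attempts set to 1 the
function drops straight to the "different port#s" fallback. The value of N is
left to the caller, since the thread has not settled on one.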
Re: zfs, nfs and zil
>
> I'm mounting the FreeBSD-server from a couple of vmware esxi 4.1
> servers using nfs, but when there is a lot of i/o the server becomes
> unresponsive, easily triggered by installing e.g. ms-sql. The server
> itself is up but is not reachable from the network. When I take the
> nic down and up again connection to the network is reestablished
> (ip-wise).
>
Others have made good comments w.r.t. the zil, however if all IP
activity on the server has stopped (and is fixed via
"ifconfig XX down; ifconfig XX up") it sounds more like a network device
driver issue to me?

So, is it just NFS that wedges or all IP activity, and does NFS come back
to life after the "ifconfig XX up"?

rick
Heads up: you'll need to do a fresh "config KERNEL" etc
Hi,

Just a heads up that after a commit going into stable/8 in a few
minutes, you'll need to do a fresh kernel build, starting at
"config GENERIC", including rebuilding the NFS related modules.

rick
Re: Heads up: you'll need to do a fresh "config KERNEL" etc
> On Sat, May 14, 2011 at 08:05:41PM -0400, Rick Macklem wrote:
> > Just a heads up that after a commit going into stable/8 in a few
> > minutes, you'll need to do a fresh kernel build, starting at
> > "config GENERIC", including rebuilding the NFS related modules.
>
> Rick,
>
> Can you explain why a kernel reconfig would be required if the
> kernel configuration (e.g. GENERIC) hasn't been changed?
>
> http://www.freshbsd.org/?branch=RELENG_8&project=freebsd
>
> Possibly the commit site doesn't have the most recent commits?
>
> I guess what I'm asking is: why is a kernel reconfig required if only
> the NFS code itself changed? A buildworld/buildkernel should be
> sufficient, no?
>
The commit moved the files used for a diskless root NFS from sys/nfsclient
to sys/nfs. As such, sys/conf/files has changed and, therefore, fresh
kernel Makefiles need to be built. I thought that "config KERNEL" is what
does that, but if buildkernel does, then "config KERNEL" isn't needed.

rick
Re: Heads up: you'll need to do a fresh "config KERNEL" etc
> On 05/14/11 20:05, Rick Macklem wrote:
> > Hi,
> >
> > Just a heads up that after a commit going into stable/8 in a few
> > minutes, you'll need to do a fresh kernel build, starting at
> > "config GENERIC", including rebuilding the NFS related modules.
> >
> > rick
>
> Sensational! With this update, I finally get NFS client performance
> as good as (or better than) 7.x, and I have a warm, fuzzy feeling
> about 8.x at last. (Except for SCHED_ULE, which gives terrible
> performance on a single-core machine with a compute-bound process
> running in the background.) Thanks! -- George Mitchell
>
There's a weird (and you need to have a weird sense of humour to enjoy it)
flick called "Stranger than Paradise".

Anyhow, the above sounds like good news, although the commit it was
related to should have had no effect on perf, from what I can see.

Assuming that you are using the regular 8.n client (and not the new
one), there have been some commits related to krpc bugs that could have
fixed cases which would have caused poor perf., although all of those
(except one where a client would hang on a TCP reconnect attempt) are in
8.2.

So, happy to hear it works for you now, but have no idea why;-) rick
Re: Heads up: you'll need to do a fresh "config KERNEL" etc
>
> On May 19, 2011, at 6:53 PM, Rick Macklem wrote:
> > Assuming that you are using the regular 8.n client (and not the new
> > one), there have been some commits related to krpc bugs that could have
> > fixed cases which would have caused poor perf., although all of those
> > (except one where a client would hang on a TCP reconnect attempt) are in
> > 8.2.
>
> Are you referring to r221934? If not, which change?
>
> (Trying to make sure I have them all...)
>
r221934 is the one referred to above by "(except one..)", so if you have
it, you have all the krpc related fixes.

rick
Re: NFS related include files and make delete-old
> Hi,
>
> For a few months now, during the usual make delete-old after
> make installworld the files
>
> /usr/include/nfs/krpc.h
>
> and
>
> /usr/include/nfs/nfsdiskless.h
>
> turn up time and again. I have them deleted, but they get reinstalled
> during the next make installworld. This is a fairly old installation,
> but running an up-to-date 8.2-STABLE and these header files are also
> present in the directory /usr/include/nfsclient.
>
I moved them from sys/nfsclient to sys/nfs, so that it would be more
obvious that they are shared by both NFS clients (in sys/nfsclient and
sys/fs/nfsclient).

So that the ones at the new location /usr/include/nfs would not be
deleted, the entry in ObsoleteFiles.inc that removed them from
/usr/include/nfs was deleted (by someone else, after discussing it with
me).

I felt that they should remain in the old location for backwards
compatibility. (The "userland" contents of the two copies are identical,
so it shouldn't matter which copy any userland app includes. One problem
here is that I have no idea if any software outside of /usr/src includes
these.)

> Could it be that either the wrong files are specified in
> /usr/src/ObsoleteFiles.inc or the headers are installed in the wrong
> directory during make installworld?
>
> On my 9.0-CURRENT systems I also have the headers at both locations,
> but there only those in /usr/include/nfsclient get reinstalled and
> there is no entry in /usr/src/ObsoleteFiles.inc.
>
Actually, only the ones in /usr/include/nfs should get updated, because
they now live in sys/nfs and not sys/nfsclient. I plan on adding an entry
to ObsoleteFiles.inc in head/current for the /usr/include/nfsclient ones.
(Thanks for the reminder w.r.t. this.)

Should I MFC this to stable/8? (I had assumed that I should leave them in
the old location for backwards compatibility and therefore wasn't going
to MFC deletion of them in /usr/include/nfsclient. If I MFC that, the
entries for them in ObsoleteFiles.inc for /usr/include/nfs need to be
deleted, so they remain in the new location.)

rick
ps: Maybe I shouldn't have MFC'd the changes for making the two NFS
    clients use the shared diskless boot code, but that would have made
    later MFCs difficult.
Re: NFS related include files and make delete-old
>
> At this moment make installworld only installs the headers in the new
> location, on both 8/stable and head/current. On 8/stable they are
> immediately removed again when running make delete-old, because they are
> in ObsoleteFiles.inc. On head/current they are left alone, they are not
> in ObsoleteFiles.inc (i.e. not anymore).
>
> If the files at the old location are still there, it is as a leftover
> from previous installations. On a freshly installed /usr/include
> hierarchy they will be missing.
>
Yes. I am working on MFC'ing the patch (r221333) to stable/8 so that it
doesn't delete the ones in /usr/include/nfs. Having said that, to the
best of my knowledge (I looked a while back), nothing in /usr/src outside
of the kernel includes them. Also, I can't think of any reason why a
third party app. would have any use for what is in them. As such, I doubt
it matters if they exist under /usr/include or where they end up.

Do you have software that includes either of these files? If so, I would
like to hear what that software is and why it includes them. (Even
bootstraps for diskless NFS root systems shouldn't need what's in them,
as far as I understand how it works.)

> > I felt that they should remain in the old location for backwards
> > compatibility.
> > (The "userland" contents of the two copies are identical, so it
> > shouldn't matter which copy any userland app includes. One problem
> > here is that I have no idea if any software outside of /usr/src
> > includes these.)
> >
>
> I can confirm that the copies are identical (if they are present),
> apart from version control information. I think that you have to install
> the copies explicitly if you want them to be there, also on a fresh
> install, for compatibility with 8.2-RELEASE and earlier. I would only do
> that for 8/stable, if at all.
>
> > > Could it be that either the wrong files are specified in
> > > /usr/src/ObsoleteFiles.inc or the headers are installed in the wrong
> > > directory during make installworld?
> > >
> > > On my 9.0-CURRENT systems I also have the headers at both locations,
> > > but there only those in /usr/include/nfsclient get reinstalled and
> > > there is no entry in /usr/src/ObsoleteFiles.inc.
> > >
> > Actually, only the ones in /usr/include/nfs should get updated, because
> > they now live in sys/nfs and not sys/nfsclient. I plan on adding an
> > entry to ObsoleteFiles.inc in head/current for the
> > /usr/include/nfsclient ones.
> > (Thanks for the reminder w.r.t. this.)
> >
>
> Even as a relative outsider to the FreeBSD project I am all for it. I
> don't know the schedule for 9.0, but if anything breaks (e.g. in ports,
> not in /usr/src) it had better break now.
>
> > Should I MFC this to stable/8?
> > (I had assumed that I should leave them in the old location for
> > backwards compatibility and therefore wasn't going to MFC deletion of
> > them in /usr/include/nfsclient. If I MFC that, the entries for them in
> > ObsoleteFiles.inc for /usr/include/nfs need to be deleted, so they
> > remain in the new location.)
> >
>
> It would save you the effort of finding a way to actually install the
> copies at the old location. However, in a sense it would change the API,
> and I do not know how the keepers of the code tree think about that 8-)
>
> And, as already explained above, there is already an entry in
> ObsoleteFiles.inc on stable/8, but probably for the wrong directory.
>
> Kind regards,
>
> Hans Ottevanger
Re: recommendations for laptop and desktop
Kevin Oberman wrote:
> On Jul 13, 2011 7:31 AM, "Zoran Kolic" wrote:
> >
> > > There is this list for laptops:
> > > http://laptop.bsdgroup.de/freebsd/
> >
> > Been there. Seen that. Obsolete.
> > My very idea would be to have recent models in some kind
> > of wiki. I believe that at least a hundred guys on the list
> > could post quality articles on the subject regarding laptops
> > they regularly use.
> >
> > > See the recent thread on the freebsd-mobile list with subject
> > > "Laptop recommendations?"
> >
> > Mostly older stuff recommended. Hard to find or I dislike
> > what I see on the review for particular model.
> > Thank you for answering my question.
> >
> > Zoran
>
> I agree that a wiki would be ideal, but it would require active
> management. That's the real issue.
>
> It's also the reason wiki.FreeBSD.org would not be practical. I might be
> able to admin such a wiki, but I have no place to put it. But I'm
> retired, so I should have time.
>
I recently installed Fedora15 and I thought it had a fairly clever idea
in it. At the end of the install (so presumably it had worked for the
hardware), it asked you if you wanted to email your hardware config to
them.

Something that just captured such emails and put them in a list
(especially if it could catch duplicates) for people to look at might be
nice. The list would get long (and not really indicate how well the
hardware worked), but at least it would be up-to-date and not require
manual maintenance.

I'm not volunteering to do this;-) although I'm retired too, but it might
be a useful thing to have?

rick
Re: Sleeping thread owns a nonsleepable lock panic (& lor)
Kostik Belousov wrote: > On Tue, Jul 26, 2011 at 01:17:52PM +0200, Herve Boulouis wrote: > > Le 26/07/2011 12:06, Kostik Belousov a Иcrit: > > > On Tue, Jul 26, 2011 at 11:49:13AM +0200, Herve Boulouis wrote: > > > > Le 25/07/2011 11:59, Kostik Belousov a ?crit: > > > > > > > > Ok the patched server crashed this morning strangely : all httpd > > > > processes were stuck in nfs or vmopar > > > > and were unkillable. Below is the full ps. > > > > > > Please see the > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html > > > for information required to debug the deadlocks. > > > > the box was not stricly deadlocked since I was able to interact with > > it but I suppose you want me to > > break into debugger when the symptoms appears again and report all > > the commands listed in the handbook > > deadlock section ? > > Exactly. > > I think everything was hung that accessed an nfs mount point. > From the usermode, procstat -kk could catch some interesting > information, > but it is redundant if ddb output is captured. Would it be worth considering reverting r223054? (Note that I don't understand the VM side, so this may be completely wrong:-) The sleeps on vmopar could be happening because a dirty page is busy and r223054 changes the VM_PAGER_xx value set a couple of ways. 1 - When it returns VM_PAGER_ERROR instead of VM_PAGER_AGAIN, the return value of "runlen" from vm_pageout_flush() changes. 2 - I'm not sure, but I think the pre-r223054 code marked a partially written page as VM_PAGER_OK instead of VM_PAGER_AGAIN? (I'm wondering about this one, since the problem seems to happen when the file's size has been truncated.) Herve Boulouis, if you want to see what r223054 changes, just go to http://svn.freebsd.org/viewvc/stable/8/sys/nfsclient and then click on nfs_bio.c. (The changes are small and could easily be reverted with a manual edit.) Since r223054 went into stable/8 on Jun 13, it seems a possible explanation? 
rick
Re: debugging frequent kernel panics on 8.2-RELEASE
Steven Hartland wrote:
> - Original Message -
> From: "Andriy Gapon"
>
> >>> I would really appreciate if you could try to reproduce the
> >>> problem with the patch that I sent earlier.
> >>
> >> Hi Andriy, what's the risk of this patch causing other issues?
> >
> > I can not estimate.
> > The code is supposed to affect only things that happen after panic,
> > so make your guess.
>
> So in theory should be good.
>
> >> I ask as to get results from this we're going to have to roll it
> >> out to over 130+ production machines, so I'd like to be clear on
> >> the risks before I sign that off.
> >
> > I will be happy if you try the patch on a single machine
> > provided the problem is that reproducible.
>
> Unfortunately although its happening a lot its taking the
> large numbers of machines to make it that way.
>
> Over the 130+ machines we're seeing between 3 and 8 panics
> a day, so based on that we could be waiting quite some time
> for a specific machine to panic :(
>
> Don't think we're going to make any progress on this in the current
> state so I think we'll give it a shot.
>
Just a random thought that is probably not relevant, but...
Is it possible that some change for the upgrade is making the machines run hotter and they're failing when they overheat?

rick

> Regards
> Steve
Re: increase in dropped udp datagrams
Robert Schulze wrote:
> Hi,
>
> we are running an NFS server (8.2-STABLE/amd64 from 2011-09-08) with
> several 8.1-RELEASE-p1/amd64 clients (webservers). After updating the
> server, we now realize a slight increase in udp datagrams "dropped due
> to full socket buffers", which wasn't the case in the previously
> installed 8.2-STABLE from 2011-05-08. This also results in higher
> averages in HTTP request time on the clients.
>
Well, there isn't much that changed in the krpc or NFS server's handling of UDP during that period. The only change in the krpc is r225384 and I can't think how that would have this effect. (All it does is avoid a loop in the kernel when msleep() returns EINTR or ERESTART.)

> Nothing was changed on the configuration regarding sysctls or loader
> tunables.
>
> kern.ipc.maxsockbuf is at 1024000 on the server.
>
> Was there a change which could cause the drops? Should this tunable
> be
> increased further?
>
You can certainly try this. (It defaults to 2*1024*1024 in -current now.)

> Regarding maxsockbuf: is this number per-socket? So does one have to
> keep in mind that there could be maxsockbuf*maxsockets allocated for
> UDP?
>
It's per-socket, but in the case of the NFS server, it only uses one UDP socket for all requests. (I have no idea what else you are running that uses UDP.)

> I'm willing to provide further information upon question.
>
Sounds to me like you've bumped into some second-order effect. One thing I'd try (you didn't mention where you currently have it set) is to bump the # of nfsd threads up. For example, you can set it to 64 by putting this line in your /etc/rc.conf:
  nfs_server_flags="-u -t -n 64"

If you have too many of them, the extra ones just sit waiting for a request and don't take up many resources. On the other hand, if you don't have enough of them, requests will get backed up in the socket's receive queue.

I'd also suggest considering TCP mounts from the clients. (I can't remember if we've already had this discussion:-)

Good luck with it, rick

> with kind regards,
> Robert Schulze
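One rough way to tell whether the drops are still accumulating is to sample the counter twice and compare. The snippet below is only a sketch: the helper name udp_full_sockbuf_drops is made up for illustration, and it assumes that `netstat -s -p udp` prints the counter as the first field on the line containing "dropped due to full socket buffers" (the phrase quoted above).

```shell
# Hypothetical helper (not part of FreeBSD): reads `netstat -s -p udp`
# output on stdin and prints the "full socket buffers" drop counter.
# Assumes the counter is the first field on the matching line.
udp_full_sockbuf_drops() {
    awk '/dropped due to full socket buffers/ { print $1 }'
}

# Usage on the NFS server; run it again a few minutes later and compare:
#   netstat -s -p udp | udp_full_sockbuf_drops
#   sysctl kern.ipc.maxsockbuf
```

If the number keeps climbing after raising the nfsd thread count, that would point more toward the socket buffer size than toward a shortage of threads.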
Re: NFSD hang
> > > 1666 100443 nfsd nfsd: service mi_switch+0x176
> > > sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12
> > > _cv_timedwait_sig+0x11d svc_run_internal+0x939
> > > svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
> > > 1666 100444 nfsd nfsd: service mi_switch+0x176
> > > sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12
> > > _cv_timedwait_sig+0x11d svc_run_internal+0x939
> > > svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
> > > 1666 100445 nfsd nfsd: service mi_switch+0x176
> > > sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12
> > > _cv_timedwait_sig+0x11d svc_run_internal+0x939
> > > svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
> > > 1666 100446 nfsd nfsd: service mi_switch+0x176
> > > sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12
> > > _cv_timedwait_sig+0x11d svc_run_internal+0x939
> > > svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
> > > 1666 100447 nfsd nfsd: service mi_switch+0x176
> > > sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12
> > > _cv_timedwait_sig+0x11d svc_run_internal+0x939
> > > svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
> > >
> > > From: Jeremy Chadwick
> > > To: Kirill Yelizarov
> > > Cc: "freebsd-stable@freebsd.org"
> > > Sent: Monday, September 26, 2011 10:32 AM
> > > Subject: Re: NFSD hang
> > >
> > > On Sun, Sep 25, 2011 at 11:14:30PM -0700, Kirill Yelizarov wrote:
> > > > Good Day!
> > > > I've got a problem with nfs share on zfs volume. Everything
> > > > worked fine for a few months and now it hangs. This share stores
> > > > logs from 9 servers at night, about 1-2Gb from each server. ZFS
> > > > is filled to 26% and it is v28
> > > >
> > > > last pid: 46573;  load averages: 195.82, 199.86, 200.12  up 108+21:56:50 10:05:06
> > > > 432 processes: 208 running, 224 sleeping
> > > > CPU:  0.0% user,  0.0% nice,  100% system,  0.0% interrupt,  0.0% idle
> > > > Mem: 280M Active, 1469M Inact, 9584M Wired, 161M Cache, 1232M Buf, 311M Free
> > > > Swap: 16G Total, 16G Free
> > > >
> > > >   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME     WCPU COMMAND
> > > >  1666 root      256  76    0  5788K  5120K RUN    14 476.8H 1508.64% nfsd
> > > >
> > > > # zpool list
> > > > NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
> > > > data  3.62T   954G  2.69T    25%  1.00x  ONLINE  -
> > > >
> > > > # zfs list
> > > > NAME   USED  AVAIL  REFER  MOUNTPOINT
> > > > data   954G  2.64T   954G  /data
> > > >
> > > > # zfs mount
> > > > data    /data
> > > >
> > > > What should i look for to resolve it?
> > >
> > > What version of FreeBSD exactly, and what build date?
> > >
> > > Please provide output from "procstat -k -k 1666" (yes, two -k's).
> >
> > Can you explain the correlation between the "sync" parameter (which
> > I
> > have to assume was set to "standard" -- the default -- on all of
> > your
> > filesystems) and your nfsd issue? I do not see the correlation.
> >
> > My intention of asking for procstat -k -k output (which you did
> > provide;
> > thank you) was for Rick Macklem (who's currently working on NFS on
> > FreeBSD) to chime in with some insights. He may be busy, but I've
> > CC'd
> > him here.
> >
> > I found it in the wiki http://wiki.freebsd.org/ZFSTuningGuide. So i
> > gave it a try. I thought it is somehow related with zfs because i
> > couldn't even run ls on zfs volume. I had to reset this server
> > because it didn't respond to init commands.
>
> I still don't see any indication in the procstat output that your
> problem is ZFS-related. To me looks like nfsd is spinning hard; on
> what
> I do not know, but I don't see any ZFS functions in the stack list.
>
> I would strongly recommend you reconsider tinkering with the "sync"
> parameter, and instead wait for Rick to chime in with some information
> or requests for further details.
>
Well, I didn't chime in because I don't really have anything useful to say. I don't think several nfsd threads should be in "run state", but I don't have any insight as to why they would be. (The other nfsd threads are just waiting for RPC requests from clients, which is normal.)

I suspect something is making those threads loop, but I don't know how to figure out where. (In the bad old days, I would have escaped to a debugger and looked where the program counter was, then repeated after a "cont" a few times, to see where they were executing. However, I have no idea how to do that on a multicore system, even if you still had the system sitting there. If someone does know how to do this, please feel free to chime in;-)

Beyond the above, if it happens again, trying to look for some resource exhaustion might help. "vmstat -m" and "vmstat -z" give you the dynamically allocated stuff.

I know nothing about zfs, but if "sync=disabled" makes it ignore VOP_FSYNC() ops, it will be "risky" for NFS exported volumes. (NFS assumes everything related to a file is committed to stable storage such that it won't be lost upon a crash/reboot, once VOP_FSYNC() has been called for the vnode. If that isn't the case and your server crashes, you could lose recent file modifications. Some care, some don't, but you need to be aware of this.)

rick
ps: You could try the new/experimental server by adding the "-e" option to both mountd and nfsd. It might make a difference, since it does certain things, like the duplicate request cache, differently.

> Furthermore, your reply removed Rick from the thread. I've put him
> back
> in the CC list. Please follow mailing list etiquette. Thank you.
>
> --
> | Jeremy Chadwick jdc at parodius.com |
> | Parodius Networking http://www.parodius.com/ |
> | UNIX Systems Administrator Mountain View, CA, US |
> | Making life hard for others since 1977. PGP 4BD6C0CB |
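The "look where they are executing, then repeat a few times" idea can be approximated from userland by taking several procstat -k -k samples and hiding the idle threads. This is only a sketch under stated assumptions: the helper name busy_nfsd_threads is invented, and it treats any thread whose stack shows _cv_timedwait_sig followed by svc_run_internal (the pattern in the stacks quoted in this thread) as idle and waiting for an RPC.

```shell
# Hypothetical filter: reads `procstat -k -k <pid>` output on stdin and
# drops threads parked waiting for an RPC (the idle stack quoted above),
# leaving the header line and the threads that are doing something else.
busy_nfsd_threads() {
    awk '!/_cv_timedwait_sig.*svc_run_internal/'
}

# Usage (1666 is the nfsd pid from this thread); sample a few times:
#   for i in 1 2 3; do procstat -k -k 1666 | busy_nfsd_threads; sleep 1; done
```

Functions that keep showing up in the remaining stacks across samples are the likely place where the threads are looping.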
Re: Low nfs write throughput
Bane Ivosev wrote:
> and if you use zfs also try this
>
> zfs set sync=disabled
>
I know diddly about zfs, but I believe some others have improved zfs performance for NFS writing by moving the ZIL to a dedicated device, sometimes an SSD. Apparently (again I'm not knowledgeable) you do have to be careful what SSD you use and how full you make it, if you want good write performance on the SSD.

I should also note that use of these options (vfs.nfsrv.async=1 and the above for zfs) is risky in the sense that recently written data can be lost when a server crashes/reboots, because the NFS clients don't know to hold onto the data and re-write it after a server crash/reboot.

rick
ps: NFS write performance has been an issue since Sun released their first implementation of it in 1985. The "big" server vendors typically solve the problem with lots of non-volatile RAM in the server boxes. (This solution requires server code that specifically knows how to use this non-volatile RAM. Such code is not in the FreeBSD servers.)

> On 11/18/11 04:10, Daryl Sayers wrote:
> > Can anyone suggest why I am getting poor write performance from my
> > nfs setup.
> > I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus
> > mother boards,
> > 4G mem and Dual core 3g processor using 147G 15k Seagate SAS drives
> > with
> > onboard Gb network cards connected to an idle network. The results
> > below show
> > that I get nearly 100Mb/s with a dd over rsh but only 15Mbs using
> > nfs. It
> > improves if I use async but a smbfs mount still beats it. I am using
> > the same
> > file, source and destinations for all tests. I have tried alternate
> > Network
> > cards with no resulting benefit.
> >
> > oguido# dd if=/u0/tmp/D2 | rsh castor dd of=/dsk/ufs/D2
> > 1950511+1 records in
> > 1950511+1 records out
> > 998661755 bytes transferred in 10.402483 secs (96002246 bytes/sec)
> > 1950477+74 records in
> > 1950511+1 records out
> > 998661755 bytes transferred in 10.115458 secs (98726301 bytes/sec)
> > (98Mb/s)
> >
> >
> > oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp gemini:/dsk/ufs /mnt
> > oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
> > 7619+1 records in
> > 7619+1 records out
> > 998661755 bytes transferred in 62.570260 secs (15960646 bytes/sec)
> > (15Mb/s)
> >
> >
> > oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp,async gemini:/dsk/ufs /mnt
> > oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
> > 7619+1 records in
> > 7619+1 records out
> > 998661755 bytes transferred in 50.697024 secs (19698627 bytes/sec)
> > (19Mb/s)
> >
> >
> > oguido# mount -t smbfs //gemini/ufs /mnt
> > oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
> > 7619+1 records in
> > 7619+1 records out
> > 998661755 bytes transferred in 29.787616 secs (33526072 bytes/sec)
> > (33Mb/s)
> >
> > Looking at a systat -v on the destination I see that the nfs test
> > does not
> > exceed 16KB/t with 100% busy where the other tests reach up to
> > 128KB/t.
> > For the record I get reads of 22Mb/s without and 77Mb/s with async
> > turned on
> > for the nfs mount.
> >
> >
> > A copy of dmesg:
> >
> >
> > Copyright (c) 1992-2011 The FreeBSD Project.
> > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993,
> > 1994
> > The Regents of the University of California. All rights
> > reserved.
> > FreeBSD is a registered trademark of The FreeBSD Foundation.
> > FreeBSD 8.2-STABLE #0: Tue Jul 26 02:49:49 UTC 2011
> > root@fm32-8-1106:/usr/obj/usr/src/sys/LOCAL i386
> > Timecounter "i8254" frequency 1193182 Hz quality 0
> > CPU: Intel(R) Core(TM)2 Duo CPU E6850 @ 3.00GHz (2995.21-MHz
> > 686-class CPU)
> > Origin = "GenuineIntel" Id = 0x6fb Family = 6 Model = f Stepping =
> > 11
> >
> > Features=0xbfebfbff
> >
> > Features2=0xe3fd
> > AMD Features=0x2010
> > AMD Features2=0x1
> > TSC: P-state invariant
> > real memory = 4294967296 (4096 MB)
> > avail memory = 3141234688 (2995 MB)
> > ACPI APIC Table:
> > FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> > FreeBSD/SMP: 1 package(s) x 2 core(s)
> > cpu0 (BSP): APIC ID: 0
> > cpu1 (AP): APIC ID: 1
> > ioapic0 irqs 0-23 on motherboard
> > kbd1 at kbdmux0
> > cryptosoft0: on motherboard
> > acpi0: on motherboard
> > acpi0: [ITHREAD]
> > acpi0: Power Button (fixed)
> > acpi0: reservation of
Re: TCP Reassembly Issues
George Mitchell wrote:
> On 11/24/11 21:00, Jeremy Chadwick wrote:
> >[...]
> > If none of this solves the problem, then I consider this a priority
> > 0
> > blocker (read: "all hands on deck") issue with the IP stack in
> > FreeBSD
> > 9.x and will need immediate attention.
> >
> > I would strongly recommend a developer or clueful end-user begin
> > tracking down who committed all of these bits and CC them into the
> > thread. I would start by looking who implemented the
> > net.inet.tcp.reass.cursegments sysctl, because that isn't in
> > RELENG_8 at
> > all.
> >
>
> I've tried out the 9.0 release candidates, and what I notice is that
> for
> a few minutes after the system starts, I get wonderful NFS read
> throughput (7+ MB/s over a 100 megabit interface) -- more than twice
> as
> fast as 7.n or 8.n on the same hardware -- quickly degrading to
> abysmal
> (less than 0.5 MB/s). Is this possibly related to the problem under
> discussion? -- George Mitchell
>
Well, when I've seen NFS perf. degrade like this, it has usually been related to RPC transport (and TCP is the default for 9.0). Just from reading some of the thread, it sounds like this problem will cause the FAIL count (the last #) in "vmstat -z" for tcpreass to increase and/or net.inet.tcp.reass.cursegments to climb to net.inet.tcp.reass.maxsegments.

I'd suggest that, after the NFS perf has degraded, you:
# vmstat -z | fgrep tcpreass
- and see how big the last # is
# sysctl -a | fgrep reass
- and see how cursegments compares with maxsegments

If these don't indicate that it is the TCP Reassembly Issue, then...
There are many other possibilities w.r.t. the NFS perf. degradation. Most often I've seen it when the net interface hardware/device driver starts dropping packets (like happens on this laptop with an el-cheapo re net interface in it).

You can capture a packet trace after the performance has degraded with tcpdump and look to see if TCP segments are being lost/retransmitted. (Although wireshark knows NFS and is nice for this, because it shows relative sequence numbers, the tcpdump output will show you the TCP level retries, etc.)

Good luck with it, rick

> P.S. A lot of other 9.0 features look very nice indeed!
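The cursegments/maxsegments comparison suggested above can be scripted so it is easy to repeat while throughput is degrading. This is a minimal sketch, not an established tool: reass_status is a made-up helper name, and it assumes both sysctls exist on the system being checked (the thread says cursegments is not in RELENG_8).

```shell
# Hypothetical helper: compares the two reassembly sysctl values.
# $1 = net.inet.tcp.reass.cursegments, $2 = net.inet.tcp.reass.maxsegments
reass_status() {
    if [ "$1" -ge "$2" ]; then
        echo "at limit"
    else
        echo "below limit"
    fi
}

# Usage on the affected machine:
#   reass_status "$(sysctl -n net.inet.tcp.reass.cursegments)" \
#                "$(sysctl -n net.inet.tcp.reass.maxsegments)"
```

A result of "at limit" while NFS reads are slow would be consistent with the reassembly problem discussed in this thread; "below limit" suggests looking at packet loss instead.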
Re: kerberized NFS
Yuri Pankov wrote:
> On Fri, Jan 27, 2012 at 06:58:47PM +0100, Giulio Ferro wrote:
> > I'm trying to setup a kerberized NFS system made of a server and a
> > client (both freebsd 9 amd64 stable)
> >
> > I've tried to follow this howto:
> > http://code.google.com/p/macnfsv4/wiki/FreeBSD8KerberizedNFSSetup
> >
> > But couldn't get much out of it.
> >
> > First question : is this howto still valid or something more recent
> > should be followed? I've searched with Google but I've come up
> > empty.
> >
> > I've set up kerberos heimdal, created the dns entries for both
> > client and server, set up krb5.keytab and copied it to client, set
> > up nfs4 according to man nfsv4:
> >
> > (server)
> > cat /etc/exports
> > V4: /usr/src -sec=krb5:krb5i:krb5p
> >
> > and then tried to mount it from the client:
> >
> > mount_nfs -o ntfsv4,sec=krb5i,gssname=nfs
> > nfsinternal1.dcssrl.it:/usr/src /usr/src
> >
> > but it failed with :
> > [tcp] nfsinternal1.dcssrl.it:/usr/src: Permission denied
> >
> > Can you point me to something that I might have got wrong?
>
> Not really related to Kerberos question, but.. Some problems here:
> - ntfsv4 - probably a typo
> - more serious one - V4: line specifies the ROOT of NFSv4 exported FS
> - nfsinternal1.dcssrl.it:/usr/src points to /usr/src/usr/src.
>
> What your /etc/exports could look like (the way it works for me,
> doesn't
> mean that it's correct though):
>
> /usr/src
> V4: / -sec=krb5:krb5i:krb5p
>
>
> Yuri

Btw, Giulio, your email address bounces for me, so hopefully you read the mailing list and see the previous messages.

rick
Re: kerberized NFS
Yuri Pankov wrote:
> On Fri, Jan 27, 2012 at 06:58:47PM +0100, Giulio Ferro wrote:
> > I'm trying to setup a kerberized NFS system made of a server and a
> > client (both freebsd 9 amd64 stable)
> >
> > I've tried to follow this howto:
> > http://code.google.com/p/macnfsv4/wiki/FreeBSD8KerberizedNFSSetup
> >
> > But couldn't get much out of it.
> >
> > First question : is this howto still valid or something more recent
> > should be followed? I've searched with Google but I've come up
> > empty.
> >
> > I've set up kerberos heimdal, created the dns entries for both
> > client and server, set up krb5.keytab and copied it to client, set
> > up nfs4 according to man nfsv4:
> >
> > (server)
> > cat /etc/exports
> > V4: /usr/src -sec=krb5:krb5i:krb5p
> >
> > and then tried to mount it from the client:
> >
> > mount_nfs -o ntfsv4,sec=krb5i,gssname=nfs
> > nfsinternal1.dcssrl.it:/usr/src /usr/src
> >
> > but it failed with :
> > [tcp] nfsinternal1.dcssrl.it:/usr/src: Permission denied
> >
> > Can you point me to something that I might have got wrong?
>
> Not really related to Kerberos question, but.. Some problems here:
> - ntfsv4 - probably a typo
> - more serious one - V4: line specifies the ROOT of NFSv4 exported FS
> - nfsinternal1.dcssrl.it:/usr/src points to /usr/src/usr/src.
>
> What you /etc/exports could look like (the way it works for me,
> doesn't
> mean that it's correct though):
>
> /usr/src
> V4: / -sec=krb5:krb5i:krb5p
>
Yes. If you specify "/", then the tree starts at the root. The main problem with doing this is that, for ZFS, you then have to export all file systems from "/" down to where you want to mount. (Again, these are done by export lines separate from the "V4:" line.)

If you specify:
V4: /usr/src -sec=krb5:krb5i:krb5p
/usr/src -sec=krb5:krb5i:krb5p

then the client mounts /usr/src via:
% mount -t nfs -o nfsv4,sec=krb5i server:/ /mntpoint

rick

>
> Yuri
Re: kerberized NFS
Giulio Ferro wrote:
> I'm trying to setup a kerberized NFS system made of a server and a
> client (both freebsd 9 amd64 stable)
>
> I've tried to follow this howto:
> http://code.google.com/p/macnfsv4/wiki/FreeBSD8KerberizedNFSSetup
>
> But couldn't get much out of it.
>
> First question : is this howto still valid or something more recent
> should be followed? I've searched with Google but I've come up empty.
>
It's all there is. I don't think anything has changed since it was written. (I haven't had a kerberos setup for about 2 years, so I know I haven't changed anything recently.) It was a google wiki, since I hoped others would add to it, but I don't think that has happened?

> I've set up kerberos heimdal, created the dns entries for both
> client and server, set up krb5.keytab and copied it to client, set
> up nfs4 according to man nfsv4:
>
> (server)
> cat /etc/exports
> V4: /usr/src -sec=krb5:krb5i:krb5p
>
The V4: line doesn't export any file system. It only defines where the root of the directory tree is for NFSv4 and what authentication can be used for "system operations" which do not take any file handle and, therefore, aren't tied to any server file system. For example, the above would need to be something like:
V4: /usr/src -sec=krb5:krb5i:krb5p
/usr/src -sec=krb5:krb5i:krb5p
- If /usr/src is not the root of a file system on the server, it is less confusing to export the root of the file system, such as "/usr" or "/".

> and then tried to mount it from the client:
>
> mount_nfs -o ntfsv4,sec=krb5i,gssname=nfs
> nfsinternal1.dcssrl.it:/usr/src /usr/src
>
To make the "gssname" case work, you need a couple of things:
- You need the patch the howto refers to applied to the client's kernel, so it can handle "host based initiator credentials".
- After applying the patch, you also need to have an entry in the client's /etc/krb5.keytab that looks like:
  nfs/client-host.dnsdomain@YOUR.REALM

Without the above, the client can only do an NFSv4 mount as a user (not root) that has a valid credential. For example:
- non-root mounts enabled via
  # sysctl vfs.usermount=1
- then a user logs in
- gets a kerberos TGT via "kinit"
- then does a mount command that looks like:
  % mount -t nfs -o nfsv4,sec=krb5i :/path
- this mount breaks if this user's TGT expires, so it either must be maintained via some utility (there are a couple out there, but I can't remember the name of one offhand) or manually by doing "kinit" again before it expires
- this user must umount the file system when done with it

(I know, it would be nice if the host based initiator cred. worked "out of the box", but the patch is ugly and the reviewer understandably didn't agree with it. However, I don't know how to do it another way for the version of Heimdal in FreeBSD. There is a bug, apparently fixed in newer Heimdal releases, where it gets confused w.r.t. encryption type for the keytab entry unless it is forced to one encryption type only.)

Also, you need the following in the server's /etc/rc.conf:
nfsv4_server_enable="YES"
gssd_enable="YES"
and in the client's:
nfsuserd_enable="YES"
gssd_enable="YES"

Finally, I'd suggest that you get NFSv4 mounts over "sys" working first and then you can try Kerberos.

> but it failed with :
> [tcp] nfsinternal1.dcssrl.it:/usr/src: Permission denied
>
> Can you point me to something that I might have got wrong?
>
> Thanks in advance.
Re: kerberized NFS
Giulio Ferro wrote:
> I forgot to mention that I compiled both servers with
> option KGSSAPI and device crypto, and I enabled gssd
> on both.
>
> Is there anyone who was able to configure this setup?
>
I had a server at the nfsv4 testing event last June and it worked ok. I haven't tried one since then.

Step 1: make sure that nfsv4 mounts work over auth_sys.
You'll need to add "sys" to the sec= flavours, so your /etc/exports will look something like:
V4: /usr/src -sec=sys:krb5:krb5i:krb5p
/usr/src -sec=sys:krb5:krb5i:krb5p
Then on the client:
# mount -t nfs -o nfsv4 :/ /
(Where "<" and ">" indicate "replace this with yours".)
- Then cd / and do an "ls -l" to see that the file ownership looks ok. If it doesn't, it will be related to "nfsuserd", which must be running in both client and server.

Once Step 1 looks fine:
Step 2: Check that Kerberos is working ok in the server.
- Log into the server as root and do the following:
# kinit -k nfs/@
- This should work ok.
# klist
- This should list a TGT for nfs/@
If this doesn't work, something isn't right in the Kerberos setup on the server. The NFS server (not client) must have a /etc/krb5.keytab file with an entry for:
nfs/@
in it. You should create it on your KDC with encryption type DES-CBC-CRC initially and you should specify that as your default enctype in your /etc/krb5.conf.

Once that is working, make sure all the daemons are running on the server:
mountd, nfsd, nfsuserd and gssd

If this all looks good, go to the client:
# sysctl vfs.usermount=1
- make sure these daemons are running:
nfsuserd, gssd
- Log in as a non-root user:
% kinit
% klist
- there should be a TGT for the user you are logged in as
- Now, try a kerberos mount, as follows:
% mount -t nfs -o nfsv4,sec=krb5 :/ /
- if that works
% cd /
% ls -l

If these last steps fail, it is not easy to figure out why. (Look in /var/log/messages for any errors. If you get what the gssd calls a minor status, that is the kerberos error.)
rick
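Several commands in the steps above lost their "<...>" placeholders in the archive, so here is a small sketch of how the pieces fit together. Everything in it is an assumption made for illustration: the host name, realm and mount point are invented, and the two helper functions are hypothetical. They only print the principal and the mount commands; nothing is executed against a KDC or a server.

```shell
# All three values are placeholders invented for illustration.
SERVER="nfs-server.example.org"
REALM="EXAMPLE.ORG"
MNT="/mnt"

# Service principal expected in the server's /etc/krb5.keytab,
# in the nfs/host.domain@REALM form used earlier in this thread.
nfs_principal() {
    printf 'nfs/%s@%s\n' "$1" "$2"
}

# Build (but do not run) the client mount command for a sec= flavour.
nfsv4_mount_cmd() {
    printf 'mount -t nfs -o nfsv4,sec=%s %s:/ %s\n' "$1" "$2" "$3"
}

nfs_principal "$SERVER" "$REALM"        # argument for: kinit -k <principal>
nfsv4_mount_cmd sys  "$SERVER" "$MNT"   # Step 1: plain auth_sys mount
nfsv4_mount_cmd krb5 "$SERVER" "$MNT"   # Step 2: Kerberos mount
```

Printing the commands first makes it easy to check the server name and mount point before trying Step 1, and then to change only the sec= flavour for the Kerberos attempt.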
Re: Why do I get 32767 id mapping when using NSFv4 with LDAP?
Olav Gjerde wrote:
> I've configured a server with 9-STABLE compiled late january. I've
> played a bit with NFSv4 and it works great. Except that I can't get it
> to play nice with OpenLDAP. If I mirror the passwd and group files
> between the client and server the mapping is correct. If I add
> pam_ldap to the /etc/pam.d/system file it works fine on both systems
> when I browse local files, however NFSv4 map both the uid and gid as
> 32767. The files should belong to user olav with uid and gid 1001. Does
> anyone know how I can get this to work properly? At least what I should
> look into? Do I need kerberos?

Nope, you shouldn't need Kerberos. The 32767 is what you get when it can't find a mapping.

All nfsuserd does is call library functions like getpwuid()/getpwnam() to get a mapping for a uid when it gets an upcall from the kernel asking for a mapping for that uid/user. I've never used ldap, so I can't help with that except to suggest that, for some reason, the libc calls aren't working.

You can run nfsuserd with "-verbose" and it will log all mapping attempts. (Maybe what it logs in /var/log/messages will give you a hint.)

You can also "tcpdump -s 0 -w xxx host " and then look at "xxx" in wireshark. Then, look in the Getattr reply and see what the Owner and Owner_group replies look like. This will tell you if it is the server that isn't doing the mappings or the client after it receives the name. (For Getattr, the server should translate uid/gid to @ and then the client should turn that back into the same uid/gid.)

Good luck with it, rick
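Since nfsuserd relies on the ordinary libc lookups, a quick sanity check is to see whether the name service switch can resolve the uid and the user name at all on each machine. This is only a sketch: idmap_lookup is a made-up helper, and it assumes getent(1) is available and goes through the same lookup path as getpwuid()/getpwnam().

```shell
# Hypothetical helper: reports whether the name service switch (the same
# path getpwuid()/getpwnam() take) can resolve a uid or user name.
idmap_lookup() {
    if getent passwd "$1" >/dev/null 2>&1; then
        echo "mapped"
    else
        echo "unmapped"
    fi
}

# Usage, on both client and server, for the account from this thread:
#   idmap_lookup 1001
#   idmap_lookup olav
```

If either lookup reports "unmapped" on one side, that side is where the 32767 fallback is most likely coming from.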
Re: Why won't 8.2 umount -f?
Doug Barton wrote:
> Is there some magic I'm missing to convince an 8.2 system to umount
> -f?
> I had an NFS server crash, so I'm trying to get the mounts updated.
> All
> of the 7.x systems happily did 'umount -f', but the 8.x systems
> (mostly
> 8.2-pN) are just hanging forever.
>
> Is this a bug, or is it something I'm missing?
>
Well, I didn't realize that a 7.n system would "umount -f" an NFS mount when the server was down and there were dirty blocks that needed to be written back, but I don't know. (I seem to recall that someone encouraged me to MFC one of my changes related to this back to stable/7, but I'm not sure if it mattered?)

I have pretty well fixed the new client w.r.t. this except for the case where you do a "umount " and that gets hung. Once a non "-f" umount gets hung, there is nothing you can do, because the mount point is locked up, so a subsequent "umount -f" can't get as far as nfs_umount().

My guess is that the old (default for 8.n) client isn't fixed for this. If you "grep MNTK_UNMOUNTF" in the sources, you'll see it used some in the old/regular client, but not as much as in the new one.

You also need a fairly recent (can't remember if that is in 8.2) version of umount.c, since the code had a "sync();" at the beginning of it that would hang before even getting to the umount(2) syscall.

Bottom line, I think the newnfs client (the default for 9.0) can do this, but I'm doubtful the old/regular one can. (I also wouldn't be surprised if there is still a bug other than the above mentioned one w.r.t. doing a "umount /mnt" and getting that hung before trying "umount -f /mnt".)

rick

>
> Doug
>
> --
>
> It's always a long day; 86400 doesn't fit into a short.
>
> Breadth of IT experience, and depth of knowledge in the DNS.
> Yours for the right price. :) http://SupersetSolutions.com/
Re: Why won't 8.2 umount -f?
Doug Barton wrote: > On 02/13/2012 18:23, Rick Macklem wrote: > > Doug Barton wrote: > >> Is there some magic I'm missing to convince an 8.2 system to umount > >> -f? > >> I had an NFS server crash, so I'm trying to get the mounts updated. > >> All > >> of the 7.x systems happily did 'umount -f', but the 8.x systems > >> (mostly > >> 8.2-pN) are just hanging forever. > >> > >> Is this a bug, or is it something I'm missing? > >> > > Well, I didn't realize that a 7.n system would "umount -f" an NFS > > mount when the server was down and there were dirty blocks that > > needed to be written back, but I don't know. > > I'm doubtful that any of those systems had dirty blocks. > > > (I seem to recall that > > someone encouraged me to MFC one of my changes related to this back > > to stable/7, but I'm not sure if it mattered?) > > Please don't unless you can verify that it doesn't make this situation > worse. :) > sbruno did the MFC. I don't think the changes would make it worse. > > I have pretty well fixed the new client w.r.t. this except for the > > case where you do a "umount " and that gets hung. Once a non > > "-f" > > umount gets hung, there is nothing you can do, because the mount > > point is > > locked up, so a subsequent "umount -f" can't get as far as > > nfs_umount(). > > I'm aware of this issue, and I did 'umount -f' first. But I wonder if > this isn't something that should be fixed because I think most users > would expect that 'umount -> umount -f' would be the natural > progression, similar to 'kill -> kill -9'. > > > My guess is that the old (default for 8.n) client isn't fixed for > > this. If you "grep MNTK_UNMOUNTF" in the sources, you'll see it > > used some in the old/regular client, but not as much as the new one. > > > > You also need a fairly recent (can't remember if that is in 8.2) > > version of umount.c, since the code had a "sync();" at the beginning > > of it that would hang before even getting to the umount(2) syscall. 
> > I just looked and at least some of the fixes were MFC'd to stable/8 about 8 months ago. So, they aren't in 8.2, but will be in 8.3. > > Bottom line, I think the newnfs client (the default for 9.0) can > > do this, but I'm doubtful the old/regular one can. (I also wouldn't > > be surprised if there is still a bug other than the above mentioned > > one w.r.t. doing a "umount /mnt" and getting that hung before trying > > "umount -f /mnt".) > > Is the new client in 8-stable up to date relative to 9.0, and/or is it > considered safe to use in production? > It looks like stable/8 might be ok using either client. The newnfs in stable/8 should be up to date w.r.t. bugfixes in the new/regular client in 9.0. > > Thanks, > > Doug > > -- > > It's always a long day; 86400 doesn't fit into a short. > > Breadth of IT experience, and depth of knowledge in the DNS. > Yours for the right price. :) http://SupersetSolutions.com/
Re: Why won't 8.2 umount -f?
Doug Barton wrote: > On 02/13/2012 18:23, Rick Macklem wrote: > > Doug Barton wrote: > >> Is there some magic I'm missing to convince an 8.2 system to umount > >> -f? > >> I had an NFS server crash, so I'm trying to get the mounts updated. > >> All > >> of the 7.x systems happily did 'umount -f', but the 8.x systems > >> (mostly > >> 8.2-pN) are just hanging forever. > >> > >> Is this a bug, or is it something I'm missing? > >> > > Well, I didn't realize that a 7.n system would "umount -f" an NFS > > mount when the server was down and there were dirty blocks that > > needed to be written back, but I don't know. > > I'm doubtful that any of those systems had dirty blocks. > > > (I seem to recall that > > someone encouraged me to MFC one of my changes related to this back > > to stable/7, but I'm not sure if it mattered?) > > Please don't unless you can verify that it doesn't make this situation > worse. :) > > > I have pretty well fixed the new client w.r.t. this except for the > > case where you do a "umount " and that gets hung. Once a non > > "-f" > > umount gets hung, there is nothing you can do, because the mount > > point is > > locked up, so a subsequent "umount -f" can't get as far as > > nfs_umount(). > > I'm aware of this issue, and I did 'umount -f' first. But I wonder if > this isn't something that should be fixed because I think most users > would expect that 'umount -> umount -f' would be the natural > progression, similar to 'kill -> kill -9'. > I suspect that it is "very difficult" to fix. The regular "umount /mnt" will get stuck somewhere inside vinvalbuf() trying to flush blocks back to the server while holding a lock on the mount point. Although kib@ is the guy who would most likely know, I don't think it would be easy to get it to come out ok. For example, one approach might be to make all the sleeps interruptible and then add code to gracefully handle an EINTR return from them and then release locks as they return and... well, it's not something I would want to tackle.
> > My guess is that the old (default for 8.n) client isn't fixed for > > this. If you "grep MNTK_UNMOUNTF" in the sources, you'll see it > > used some in the old/regular client, but not as much as the new one. > > > > You also need a fairly recent (can't remember if that is in 8.2) > > version of umount.c, since the code had a "sync();" at the beginning > > of it that would hang before even getting to the umount(2) syscall. > > > > Bottom line, I think the newnfs client (the default for 9.0) can > > do this, but I'm doubtful the old/regular one can. (I also wouldn't > > be surprised if there is still a bug other than the above mentioned > > one w.r.t. doing a "umount /mnt" and getting that hung before trying > > "umount -f /mnt".) > > Is the new client in 8-stable up to date relative to 9.0, and/or is it > considered safe to use in production? > > > Thanks, > > Doug > > -- > > It's always a long day; 86400 doesn't fit into a short. > > Breadth of IT experience, and depth of knowledge in the DNS. > Yours for the right price. :) http://SupersetSolutions.com/
Re: Why won't 8.2 umount -f?
Doug Barton wrote: > On 02/13/2012 19:13, Rick Macklem wrote: > > I just looked and at least some of the fixes were MFC'd to stable/8 > > about > > 8 months ago. So, they aren't in 8.2, but will be in 8.3. > > Well 8.3 is about to enter code freeze, any way we can check to be > sure > all of the relevant fixes can be mfc'ed? > I took a look and they seem to have been MFC'd. rick > > Doug > > -- > > It's always a long day; 86400 doesn't fit into a short. > > Breadth of IT experience, and depth of knowledge in the DNS. > Yours for the right price. :) http://SupersetSolutions.com/
Re: kerberized NFS
Giulio Ferro wrote: > Thanks everybody again for your help with setting up a working > kerberized nfsv4 system. > > I was able to user-mount a nfsv4 share with krb5 security, and I was > trying to do the same as root. > > Unfortunately the patch I found here: > http://people.freebsd.org/~rmacklem/rpcsec_gss.patch > > fails to apply cleanly on a 9 stable system. > I'll try and generate an updated patch. I guess some commit has changed the code enough that "patch" gets confused and it's a little big to do the patch manually. (I'm pretty sure any changes done to the sys/rpc/rpcsec_gss code haven't broken the patch, but I have no way of doing Kerberos testing these days.) > Is there a more recent patch available or some better way to > automatically > mount the share at boot time? > > Thanks again.
Re: kerberized NFS
Giulio Ferro wrote: > Thanks everybody again for your help with setting up a working > kerberized nfsv4 system. > > I was able to user-mount a nfsv4 share with krb5 security, and I was > trying to do the same as root. > > Unfortunately the patch I found here: > http://people.freebsd.org/~rmacklem/rpcsec_gss.patch > > fails to apply cleanly on a 9 stable system. > There is now a patch called: http://people.freebsd.org/~rmacklem/rpcsec_gss-9.patch that should apply to a FreeBSD 9 or later kernel. For the kernel to build after applying the patch, you will need a kernel config with "options KGSSAPI" in it, since the patch adds a function that can't be called via one of the XXX_call() functions using the function pointers. Also, review the section of the wiki where it discusses setting vfs.rpcsec.keytab_enctype because the host based initiator keytab entry won't work unless it is set correctly. Good luck with it, rick > Is there a more recent patch available or some better way to > automatically > mount the share at boot time? > > Thanks again.
Re: "File too large" error when appending to a file of 130 MB
Jeremie Le Hen wrote: > Hi list, > > Can you please Cc: me when replying as I'm not subscribed, thanks. > > I have a problem with procmail which gets a "File too large" error > when > it tries to write at the end of some mailbox file. > > I truss'ed it and I found the following: > > % stat("/home/jlh/Mail//mbox1",{ mode=-rw--- > ,inode=336983,size=138744672,blksize=131072 }) = 0 (0x0) > % open("/home/jlh/Mail//mbox1",O_WRONLY|O_APPEND|O_CREAT,0667) = 5 > (0x5) > % lseek(5,0x0,SEEK_END) = 138744672 (0x8451360) > % wait4(0x,0x0,0x1,0x0,0x3,0x5) ERR#10 'No child processes' > % lseek(5,0x0,SEEK_CUR) = 138744672 (0x8451360) > % fcntl(5,F_SETLKW,0xd9a4) = 0 (0x0) > % lseek(5,0x0,SEEK_END) = 138744672 (0x8451360) > % write(5,"F",1) ERR#27 'File too large' > % fstat(5,{ mode=-rw--- > ,inode=336983,size=138744672,blksize=131072 }) = 0 (0x0) > % write(5,"rom lionel.messien+caf_=jlh=chch"...,3627) ERR#27 'File too > large' > > I can append something to the file manually. I wonder if the error > doesn't come from the SETLKW fnctl(2) call, but I cannot experiment it > because truss(1) doesn't show the content of the flock structure. > > If I change the procmail recipe to write to another file (which > doesn't > exist), the file is successfully created and messages can be appended. > I narrowed down the failure threshold between 48 MB and 49 MB (in > steps > of 64 KB, it failed between 781 and 782 blocks). > > > This is a 8.2 32 bits jail on a 8.2 amd64 host. In the jail, /home is > a > nullfs mounted ZFS filesystem. 
The mailbox is not that big: > > % felucia:jlh$ ls -l Mail/mbox1 > % -rw---+ 1 jlh jlh 138744672 Feb 19 11:46 Mail/mbox1 > > > (( For some unknown reason some ACLs keep appearing, but the problem is > still there > anyway if I do setfacl -b on it: > > % felucia:jlh$ getfacl Mail/mbox1 > % # file: Mail/mbox1 > % # owner: jlh > % # group: jlh > % owner@:rw-p--aARWcCos:--:allow > % group@:--a-R-c--s:--:allow > % everyone@:--a-R-c--s:--:allow > )) > > > Does anyone have an idea about this error? Besides, if someone knows > why those ACLs keep appearing, I would be glad to know it :). > AFAIK, NFSv4 style ACLs are always enabled for ZFS and cannot be turned off. > Thanks. > -- > Jeremie Le Hen > > Men are born free and equal. Later on, they're on their own. > Jean Yanne
Re: "File too large" error when appending to a file of 130 MB
Jeremie Le Hen wrote: > Hi Rick, > > On Sun, Feb 19, 2012 at 07:46:42PM -0500, Rick Macklem wrote: > > > This is a 8.2 32 bits jail on a 8.2 amd64 host. In the jail, /home > > > is > > > a > > > nullfs mounted ZFS filesystem. The mailbox is not that big: > > > > > > % felucia:jlh$ ls -l Mail/mbox1 > > > % -rw---+ 1 jlh jlh 138744672 Feb 19 11:46 Mail/mbox1 > > > > > > > > > (( For some unknown reason some ACLs keep appearing, but the > > > problem is > > > still there > > > anyway if I do setfacl -b on it: > > > > > > % felucia:jlh$ getfacl Mail/mbox1 > > > % # file: Mail/mbox1 > > > % # owner: jlh > > > % # group: jlh > > > % owner@:rw-p--aARWcCos:--:allow > > > % group@:--a-R-c--s:--:allow > > > % everyone@:--a-R-c--s:--:allow > > > )) > > > > > > > > > Does anyone have an idea about this error? Besides, if someone > > > knows > > > why those ACLs keep appearing, I would be glad to know it :). > > > > > AFAIK, NFSv4 style ACLs are always enabled for ZFS and cannot be > > turned > > off. > > Weirdly, some of my mailboxes don't have such ACLs, including some > I've > recently written into. How is it possible? > Sorry, I have no idea. Maybe one of the ZFS folks knows when ACLs are generated. (It might happen as a side effect of a "chmod". You could experiment with that?) rick > -- > Jeremie Le Hen > > Men are born free and equal. Later on, they're on their own. > Jean Yanne
Re: panic in 8.3-PRERELEASE
Hiroki Sato wrote: > Hi, > > Just a report, but I got the following panic on an NFS server running > 8.3-PRERELEASE: > > (from here) > pool.allbsd.org dumped core - see /var/crash/vmcore.0 > > Tue Feb 21 10:59:44 JST 2012 > > FreeBSD pool.allbsd.org 8.3-PRERELEASE FreeBSD 8.3-PRERELEASE #7: Thu > Feb 16 19:29:19 JST 2012 h...@pool.allbsd.org:/usr/obj/usr/src/sys/POOL > amd64 > > panic: Assertion lock == sq->sq_lock failed at > /usr/src/sys/kern/subr_sleepqueue.c:335 > Oops, I didn't know that mixing msleep() and tsleep() calls on the same event wasn't allowed. There are two places in the code where it did a: mtx_unlock(); tsleep(); left over from the days when it was written for OpenBSD. I don't think the mix would actually break anything, except that the MPASS() assertion fails, but I've cc'd jhb@ since he seems to have been the author of the sleep() stuff. Anyhow, please try the attached patch which replaces the mtx_unlock(); tsleep(); with msleep()s using PDROP. If the attachment gets lost, the patch is also here: http://people.freebsd.org/~rmacklem/tsleep.patch Thanks for reporting this, rick ps: Is mtx_lock() now preferred over msleep()? > GNU gdb 6.1.1 [FreeBSD] > Copyright 2004 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and > you are > welcome to change it and/or distribute copies of it under certain > conditions. > Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. Type "show warranty" for > details. > This GDB was configured as "amd64-marcel-freebsd"... > > Unread portion of the kernel message buffer: > > > Reading symbols from /boot/kernel/geom_mirror.ko...Reading symbols > from /boot/kernel/geom_mirror.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/geom_mirror.ko > Reading symbols from /boot/kernel/zfs.ko...Reading symbols from > /boot/kernel/zfs.ko.symbols...done. > done. 
> Loaded symbols for /boot/kernel/zfs.ko > Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols > from /boot/kernel/opensolaris.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/opensolaris.ko > Reading symbols from /boot/kernel/ipfw.ko...Reading symbols from > /boot/kernel/ipfw.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/ipfw.ko > #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:263 > 263 if (textdump_pending) > (kgdb) #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:263 > #1 0x801f8cfc in db_fncall (dummy1=Variable "dummy1" is not > available. > ) > at /usr/src/sys/ddb/db_command.c:548 > #2 0x801f9031 in db_command (last_cmdp=0x80d37f40, > cmd_table=Variable "cmd_table" is not available. > ) > at /usr/src/sys/ddb/db_command.c:445 > #3 0x801f9280 in db_command_loop () > at /usr/src/sys/ddb/db_command.c:498 > #4 0x801fb369 in db_trap (type=Variable "type" is not > available. > ) at /usr/src/sys/ddb/db_main.c:229 > #5 0x8069e021 in kdb_trap (type=3, code=0, > tf=0xff86c5f7e640) > at /usr/src/sys/kern/subr_kdb.c:548 > #6 0x80946766 in trap (frame=0xff86c5f7e640) > at /usr/src/sys/amd64/amd64/trap.c:595 > #7 0x8092d324 in calltrap () > at /usr/src/sys/amd64/amd64/exception.S:228 > #8 0x8069de7b in kdb_enter (why=0x80a891dd "panic", > msg=0xa ) at cpufunc.h:63 > #9 0x8066afc0 in panic (fmt=Variable "fmt" is not available. > ) at /usr/src/sys/kern/kern_shutdown.c:597 > #10 0x806a9360 in sleepq_add (wchan=0xff0073b97a00, > lock=0x80d6af00, wmesg=0x80a7bb28 "nfsrc", flags=0, > queue=0) at /usr/src/sys/kern/subr_sleepqueue.c:335 > #11 0x80673e4f in _sleep (ident=0xff0073b97a00, > lock=0x80d6af00, priority=Variable "priority" is not > available. 
> ) at /usr/src/sys/kern/kern_synch.c:218 > #12 0x805fe01e in nfsrvd_updatecache (nd=0xff86c5f7e960, > so=0xff002217c000) at > /usr/src/sys/fs/nfsserver/nfs_nfsdcache.c:697 > #13 0x805ea934 in nfssvc_program (rqst=0xff0476070800, > xprt=0xff000edd0a00) at > /usr/src/sys/fs/nfsserver/nfs_nfsdkrpc.c:333 > #14 0x8084c76b in svc_run_internal (pool=0xff000c876600, > ismaster=0) at /usr/src/sys/rpc/svc.c:895 > #15 0x8084cc8b in svc_thread_start (arg=Variable "arg" is not > available. > ) > at /usr/src/sys/rpc/svc.c:1200 > #16 0x80640865 in fork_exit ( > callout=0x8084cc80 , arg=0xff000c876600, > frame=0xff86c5f7ec50) at /usr/src/sys/kern/kern_fork.c:876 > #17 0x8092d86e in for
Re: panic in 8.3-PRERELEASE
John Baldwin wrote: > On Wednesday, February 22, 2012 2:24:14 pm Konstantin Belousov wrote: > > On Wed, Feb 22, 2012 at 11:29:40AM -0500, Rick Macklem wrote: > > > Hiroki Sato wrote: > > > > Hi, > > > > > > > > Just a report, but I got the following panic on an NFS server > > > > running > > > > 8.3-PRERELEASE: > > > > > > > > (from here) > > > > pool.allbsd.org dumped core - see /var/crash/vmcore.0 > > > > > > > > Tue Feb 21 10:59:44 JST 2012 > > > > > > > > FreeBSD pool.allbsd.org 8.3-PRERELEASE FreeBSD 8.3-PRERELEASE > > > > #7: Thu > > > > Feb 16 19:29:19 JST 2012 > > > > h...@pool.allbsd.org:/usr/obj/usr/src/sys/POOL > > > > amd64 > > > > > > > > panic: Assertion lock == sq->sq_lock failed at > > > > /usr/src/sys/kern/subr_sleepqueue.c:335 > > > > > > > Oops, I didn't know that mixing msleep() and tsleep() calls on the > > > same > > > event wasn't allowed. > > > There are two places in the code where it did a: > > > mtx_unlock(); > > > tsleep(); > > > left over from the days when it was written for OpenBSD. > > This sequence allows to lost the wakeup which is happen right after > > cache unlock (together with clearing the RC_WANTED flag) but before > > the thread enters sleep state. The tsleep has a timeout so thread > > should > > recover in 10 seconds, but still. > > > > Anyway, you should use consistent outer lock for the same wchan, > > i.e. > > no lock (tsleep) or mtx (msleep), but not mix them. > > Correct. > > > > I don't think the mix would actually break anything, except that > > > the > > > MPASS() assertion fails, but I've cc'd jhb@ since he seems to have > > > been > > > the author of the sleep() stuff. > > > > > > Anyhow, please try the attached patch which replaces the > > > mtx_unlock(); > tsleep(); with > > > msleep()s using PDROP. If the attachment gets lost, the patch is > > > also > here: > > > http://people.freebsd.org/~rmacklem/tsleep.patch > > > > > > Thanks for reporting this, rick > > > ps: Is mtx_lock() now preferred over msleep()? 
> > What do you mean ? > > mtx_sleep() is preferred over msleep(), but I doubt I will remove > msleep() > anytime soon. > Ok, I'll redo the patch with mtx_sleep() and get one of you guys to review it. One question. Do you think this is serious enough to worry about for 8.3? (Just wondering if I need to rush a patch into head with a 1 week MFC. I realize it would still be up to re@, even if I rush it.) Thanks for the useful comments, rick > -- > John Baldwin
Re: panic in 8.3-PRERELEASE
Konstantin Belousov wrote: > On Wed, Feb 22, 2012 at 11:29:40AM -0500, Rick Macklem wrote: > > Hiroki Sato wrote: > > > Hi, > > > > > > Just a report, but I got the following panic on an NFS server > > > running > > > 8.3-PRERELEASE: > > > > > > (from here) > > > pool.allbsd.org dumped core - see /var/crash/vmcore.0 > > > > > > Tue Feb 21 10:59:44 JST 2012 > > > > > > FreeBSD pool.allbsd.org 8.3-PRERELEASE FreeBSD 8.3-PRERELEASE #7: > > > Thu > > > Feb 16 19:29:19 JST 2012 > > > h...@pool.allbsd.org:/usr/obj/usr/src/sys/POOL > > > amd64 > > > > > > panic: Assertion lock == sq->sq_lock failed at > > > /usr/src/sys/kern/subr_sleepqueue.c:335 > > > > > Oops, I didn't know that mixing msleep() and tsleep() calls on the > > same > > event wasn't allowed. > > There are two places in the code where it did a: > > mtx_unlock(); > > tsleep(); > > left over from the days when it was written for OpenBSD. > This sequence allows to lost the wakeup which is happen right after > cache unlock (together with clearing the RC_WANTED flag) but before > the thread enters sleep state. The tsleep has a timeout so thread > should > recover in 10 seconds, but still. > Yes. > Anyway, you should use consistent outer lock for the same wchan, i.e. > no lock (tsleep) or mtx (msleep), but not mix them. > > > > I don't think the mix would actually break anything, except that the > > MPASS() assertion fails, but I've cc'd jhb@ since he seems to have > > been > > the author of the sleep() stuff. > > > > Anyhow, please try the attached patch which replaces the > > mtx_unlock(); tsleep(); with > > msleep()s using PDROP. If the attachment gets lost, the patch is > > also here: > > http://people.freebsd.org/~rmacklem/tsleep.patch > > > > Thanks for reporting this, rick > > ps: Is mtx_lock() now preferred over msleep()? > What do you mean ? It appears jhb@ figured out the typo. I meant to type mtx_sleep(), not mtx_lock(). 
rick
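The lost-wakeup window kib@ describes (unlock, then sleep as two separate steps) has a close userland analogue, which may make the point easier to see. The following is only a sketch using POSIX threads, not the nfsd code: struct entry, entry_wait(), entry_release() and run_demo() are hypothetical names invented for the illustration. pthread_cond_wait() plays roughly the role that msleep()/mtx_sleep() play when they are handed the mutex, in that the mutex is released and the thread blocks as one atomic step. One difference worth keeping in mind: pthread_cond_wait() re-acquires the mutex on return, whereas a kernel sleep called with PDROP does not.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cache entry; not the nfsd data structure. */
struct entry {
	pthread_mutex_t mtx;
	pthread_cond_t  cv;
	bool            busy;	/* true while another thread owns the entry */
};

void
entry_init(struct entry *e)
{
	assert(e != NULL);
	pthread_mutex_init(&e->mtx, NULL);
	pthread_cond_init(&e->cv, NULL);
	e->busy = true;
}

/*
 * Waiter. The busy check and the sleep both happen with the mutex held,
 * and pthread_cond_wait() drops the mutex and blocks atomically, so a
 * release that happens after the check cannot slip by unnoticed.
 * Doing "unlock; sleep" as two steps would leave exactly that gap.
 */
void
entry_wait(struct entry *e)
{
	pthread_mutex_lock(&e->mtx);
	while (e->busy)
		pthread_cond_wait(&e->cv, &e->mtx);
	pthread_mutex_unlock(&e->mtx);
}

/* Waker. The state change and the wakeup are made under the same mutex. */
void
entry_release(struct entry *e)
{
	pthread_mutex_lock(&e->mtx);
	e->busy = false;
	pthread_cond_broadcast(&e->cv);
	pthread_mutex_unlock(&e->mtx);
}

static void *
waker(void *arg)
{
	entry_release(arg);
	return (NULL);
}

/* Runs one waiter and one waker; returns 0 if the waiter saw the release. */
int
run_demo(void)
{
	struct entry e;
	pthread_t t;

	entry_init(&e);
	if (pthread_create(&t, NULL, waker, &e) != 0)
		return (-1);
	entry_wait(&e);
	pthread_join(t, NULL);
	pthread_cond_destroy(&e.cv);
	pthread_mutex_destroy(&e.mtx);
	return (e.busy ? -1 : 0);
}
```

Calling run_demo() from a small main() and building with -pthread should be enough to try it. Whichever thread runs first, the waiter either finds busy already cleared or is woken by the broadcast, which is the property the unlock-then-tsleep sequence cannot guarantee without its timeout.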
Re: Resume broken in 8.3-PRERELEASE
Alexey Dokuchaev wrote: > On Mon, Feb 27, 2012 at 09:28:15PM +0700, Alexey Dokuchaev wrote: > > I was mistaken, the latest kernel with working resume is from Jan 4 > > 00:00 > > UTC, kernel from Jan 4 01:00 UTC does not allow my laptop to come > > back from > > zzz(8) successfully. It seems that offending change is rev. 1.9.2.5 > > of > > sys/nfsclient/nfs_krpc.c by rmacklem@ (SVN rev 229450). To be sure, > > I've > > reverted just this change in the latest RELENG_8 sources -- and the > > problem > > goes away. > > Hmm, apparently the problem lies deeply/earlier. Backing out SVN rev > 229450 allows me to resume twice, but third time it fails with the > same > symptoms as before (no keyboard while VTY switching works and > screensaver > fires, no network but ping(8) works, fans are bursting up). Stay tuned > while I investigate more... > Yes, I can't think of how r229450 would affect "resume". All it does is clear the high order bit in an error reply from an NFS server, since that bit should never be set in an NFS error reply and, if set, it results in an mbuf list being free'd twice. The bit is erroneously set by "amd" sometimes. If you are using "amd", that might be related to the resume problem? rick ps: I suspect you saw it, but there was a recent thread related to known suspend/resume issues and discussed how they might be fixed in the future. Sorry, I don't remember which list or the exact subject line. > ./danfe
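For readers wondering what "clear the high order bit in an error reply" amounts to, here is a minimal sketch. It is not the code from r229450, which is not shown in this thread; the function name nfs_sanitize_error() and the assumption that the status is handled as a 32-bit unsigned value are both made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: mask off bit 31 of a status word taken from an
 * NFS reply, leaving the low 31 bits untouched. The name and the mask
 * are illustrative only and are not taken from r229450.
 */
uint32_t
nfs_sanitize_error(uint32_t status)
{
	return (status & UINT32_C(0x7fffffff));
}
```

A value that already has the bit clear passes through unchanged, so applying the mask to every error reply is harmless for well-behaved servers.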
Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak
Oliver Brandmueller wrote: > Hi, > > After figuring an easy way to repeat the behaviour and hunting it down > to the combination of ZFS+newNFS and removal of files or directories I > opened PR kern/167266 > Good work isolating this! I now see the problem. The new NFS server code assumed that VOP_LOOKUP() calls would not set SAVENAME, so it expected the path buffer to be free'd by the nfsvno_namei() when it hadn't set SAVENAME. It turns out ZFS sets SAVENAME in zfs_lookup() for the DELETE case. The attached patch, which is also here, should fix the problem for now: http://people.freebsd.org/~namei-leak.patch Please test this patch and let me know if it fixes the leak. jwd@ is working on a patch that will avoid using uma_zalloc() to get a path buffer for most cases for performance reasons. Once that patch goes in, the code should be patched so that it checks for SAVENAME being set for all cases where uma_zalloc() has allocated a path buffer, so that no more leaks like this will happen when underlying file systems set SAVENAME. rick
--- fs/nfsserver/nfs_nfsdport.c.sav	2012-04-25 16:50:05.0 -0400
+++ fs/nfsserver/nfs_nfsdport.c	2012-04-25 17:08:43.0 -0400
@@ -1047,6 +1047,8 @@ nfsvno_removesub(struct nameidata *ndp,
 	else
 		vput(ndp->ni_dvp);
 	vput(vp);
+	if ((ndp->ni_cnd.cn_flags & SAVENAME) != 0)
+		nfsvno_relpathbuf(ndp);
 	NFSEXITCODE(error);
 	return (error);
 }
@@ -1086,6 +1088,8 @@ out:
 	else
 		vput(ndp->ni_dvp);
 	vput(vp);
+	if ((ndp->ni_cnd.cn_flags & SAVENAME) != 0)
+		nfsvno_relpathbuf(ndp);
 	NFSEXITCODE(error);
 	return (error);
 }
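The ownership rule behind this leak can be modelled in a few lines of userland C. This is a toy model under stated assumptions, not kernel code: struct lookup, SAVENAME_FLAG, do_namei(), remove_sub() and outstanding_buffers() are hypothetical names, malloc() stands in for uma_zalloc(), and the "patched" argument simply switches the check added by the patch on or off.

```c
#include <assert.h>
#include <stdlib.h>

#define SAVENAME_FLAG	0x1u	/* hypothetical stand-in for SAVENAME */

/* Hypothetical, loosely modelled on the fields the patch touches. */
struct lookup {
	unsigned flags;
	char    *pathbuf;
};

static int outstanding;		/* path buffers allocated and not yet freed */

static void
relpathbuf(struct lookup *l)
{
	free(l->pathbuf);
	l->pathbuf = NULL;
	outstanding--;
}

/*
 * Model of the lookup step: a buffer is allocated, the file system may
 * decide to set the "save name" flag, and the buffer is freed here only
 * if the flag ended up clear. If it is set, the caller owns the buffer.
 */
int
do_namei(struct lookup *l, int fs_sets_savename)
{
	assert(l != NULL);
	l->pathbuf = malloc(1024);
	if (l->pathbuf == NULL)
		return (-1);
	outstanding++;
	if (fs_sets_savename)
		l->flags |= SAVENAME_FLAG;
	if ((l->flags & SAVENAME_FLAG) == 0)
		relpathbuf(l);
	return (0);
}

/*
 * Model of the remove path. Without the check (patched == 0) a buffer
 * kept alive by the flag is never released; with it, the caller frees
 * what the lookup left behind.
 */
void
remove_sub(struct lookup *l, int patched)
{
	/* ... the remove itself would happen here ... */
	if (patched && (l->flags & SAVENAME_FLAG) != 0)
		relpathbuf(l);
}

int
outstanding_buffers(void)
{
	return (outstanding);
}
```

In this model, a lookup where the file system sets the flag followed by the unpatched remove_sub() leaves outstanding_buffers() one higher each time, which is the shape of the growth later reported from "vmstat -z" in this thread. With the patched path the count returns to where it started.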
Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak
Oliver Brandmueller wrote: > Hi, > > After figuring an easy way to repeat the behaviour and hunting it down > to the combination of ZFS+newNFS and removal of files or directories I > opened PR kern/167266 > Oops, the patch for this is at: http://people.freebsd.org/~rmacklem/namei-leak.patch rick
Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak
Oliver Brandmueller wrote: > Hi, > > On Wed, Apr 25, 2012 at 05:34:05PM -0400, Rick Macklem wrote: > > Good work isolating this! > > Thank you! > > > I now see the problem. The new NFS server code assumed that > > VOP_LOOKUP() > > calls would not set SAVENAME, so it expected the path buffer to be > > free'd > > by the nfsvno_namei() when it hadn't set SAVENAME. > > > > It turns out ZFS sets SAVENAME in zfs_lookup() for the DELETE case. > > > > The attached patch, which is also here, should fix the problem for > > now: > >http://people.freebsd.org/~namei-leak.patch > > > > Please test this patch and let me know if it fixes the leak. > > Thanks for the explanation - and coming up with a patch that fast! > > I applied the patch and in my testing environment I don't see the leak > anymore. I will not be able to apply it to our prod environment before > about mid of May, since I don't want to leave my fellow co-workers > with > any problems while being on holidays :) > > > jwd@ is working on a patch that will avoid using uma_zalloc() to get > > a path buffer for most cases for performance reasons. Once that > > patch > > goes it, the code should be patched so that it checks for SAVENAME > > being > > set for all cases where uma_zalloc() has allocated a path buffer, so > > that > > no more leaks like this will happen when underlying file systems set > > SAVENAME. > > So is it likely that this final version will at some time be included > into 9-STABLE or will the current patch (given more positive results > come up) make it into -STABLE until the other one is ready? > Well, I think I can commit it to head with an MFC of 1 month. That way, hopefully you will have been able to test it in your production environment before it gets MFC'd to 9-STABLE. I suspect John's patch will be committed sometime later, but I'll leave that up to him. (He runs a server with ZFS, so he should be able to check for the leak.) > > Greeting and many thanks. > And thanks for tracking it down.
It's surprising only one other person noticed this. I guess others don't have enough removes going on for the leak to get serious. rick
Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak
Steven Hartland wrote: > - Original Message - > From: "Rick Macklem" > To: "Oliver Brandmueller" > Cc: > Sent: Thursday, April 26, 2012 1:24 AM > Subject: Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak > > > > Oliver Brandmueller wrote: > >> Hi, > >> > >> After figuring an easy way to repeat the behaviour and hunting it > >> down > >> to the combination of ZFS+newNFS and removal of files or > >> directories I > >> opened PR kern/167266 > >> > > Oops, the patch for this is at: > > http://people.freebsd.org/~rmacklem/namei-leak.patch > > Is this specific to 9.x or is 8.x affected? > At a glance, it looks to me like 8.x is affected. Note that the bug only affects the new NFS server (the experimental one for 8.x) when exporting ZFS volumes. (UFS exported volumes don't leak.) If you are running a server that might be affected, just run: # vmstat -z | fgrep -i namei on the server and see if the 3rd number shown is increasing. rick > Regards > Steve
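If the check above is to be scripted rather than eyeballed, the NAMEI line can be picked apart with sscanf(). This is a minimal sketch: parse_namei_line() is a hypothetical helper, and it assumes the line has the shape seen later in this thread ("NAMEI: 1024, 0, 122975, ...") with the first number being the item size and the third the USED count.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: extract the item size (1st number) and the USED
 * count (3rd number) from a "vmstat -z" NAMEI line. Returns 0 on
 * success and -1 if the line does not have the expected shape.
 */
int
parse_namei_line(const char *line, long long *size, long long *used)
{
	long long limit;

	assert(line != NULL && size != NULL && used != NULL);
	if (sscanf(line, " NAMEI: %lld, %lld, %lld", size, &limit, used) != 3)
		return (-1);
	return (0);
}
```

Feeding it two samples taken some minutes apart and comparing the USED values gives the same answer as watching the third number by hand.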
Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak
Steven Hartland wrote: > Original Message - > From: "Rick Macklem" > > At a glance, it looks to me like 8.x is affected. Note that the > > bug only affects the new NFS server (the experimental one for 8.x) > > when exporting ZFS volumes. (UFS exported volumes don't leak) > > > > If you are running a server that might be affected, just: > > # vmstat -z | fgrep -i namei > > on the server and see if the 3rd number shown is increasing. > > Many thanks Rick, wasn't aware we had anything experimental enabled > but I think that would be a yes looking at these numbers:- > > vmstat -z | fgrep -i namei > NAMEI: 1024, 0, 1, 1483, 25285086096, 0 > vmstat -z | fgrep -i namei > NAMEI: 1024, 0, 0, 1484, 25285945725, 0 > I don't think so, since the 3rd number (USED) is 0 here. If that # is increasing over time, you have the leak. You are probably running the old (default in 8.x) NFS server. rick > Regards > Steve
Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak
Daniel Braniss wrote: > > Rick Macklem wrote > > in > > <1527622626.3418715.1335445225510.javamail.r...@erie.cs.uoguelph.ca>: > > > > rm> Steven Hartland wrote: > > rm> > Original Message - > > rm> > From: "Rick Macklem" > > rm> > > At a glance, it looks to me like 8.x is affected. Note that > > the > > rm> > > bug only affects the new NFS server (the experimental one > > for 8.x) > > rm> > > when exporting ZFS volumes. (UFS exported volumes don't > > leak) > > rm> > > > > rm> > > If you are running a server that might be affected, just: > > rm> > > # vmstat -z | fgrep -i namei > > rm> > > on the server and see if the 3rd number shown is increasing. > > rm> > > > rm> > Many thanks Rick wasnt aware we had anything experimental > > enabled > > rm> > but I think that would be a yes looking at these number:- > > rm> > > > rm> > vmstat -z | fgrep -i namei > > rm> > NAMEI: 1024, 0, 1, 1483, 25285086096, 0 > > rm> > vmstat -z | fgrep -i namei > > rm> > NAMEI: 1024, 0, 0, 1484, 25285945725, 0 > > rm> > > > rm> I don't think so, since the 3rd number (USED) is 0 here. > > rm> If that # is increasing over time, you have the leak. You are > > rm> probably running the old (default in 8.x) NFS server. > > > > Just a report, I confirmed it affected 8.x servers running newnfs. > > > > Actually I have been suffered from memory starvation symptom on > > that > > server (24GB RAM) for a long time and watching vmstat -z > > periodically. It stopped working once a week. I investigated the > > vmstat log again and found the amount of NAMEI leak was 11,543,956 > > (about 11GB!) just before the locked-up. After applying the patch, > > the leak disappeared. Thank you for fixing it! > > > > -- Hiroki And thanks Hiroki for testing it on 8.x.
> this is on 8.2-STABLE/amd64 from around August: > same here, this zfs+newnfs has been hanging every few months, and I > can see > now the leak, it's slowly increasing: > NAMEI: 1024, 0, 122975, 529, 15417248, 0 > NAMEI: 1024, 0, 122984, 520, 15421772, 0 > NAMEI: 1024, 0, 123002, 502, 15424743, 0 > NAMEI: 1024, 0, 123008, 496, 15425464, 0 > > cheers, > danny Maybe you could try the patch, too. It's at: http://people.freebsd.org/~rmacklem/namei-leak.patch I'll commit it to head soon with a 1 month MFC, so that hopefully Oliver will have a chance to try it on his production server before the MFC. Thanks everyone, for your help with this, rick ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
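A minimal sketch of the check discussed in this thread, for anyone who would rather log it than eyeball it. It assumes the "vmstat -z" line has the comma-separated layout shown in the posts (item size first, USED as the 3rd number). The function names are made up for illustration. The memory estimate is simply USED multiplied by the item size (1024 here), which is how a USED count of 11,543,956 works out to roughly 11GB.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of FreeBSD.

# Print the USED column (3rd number) of the NAMEI line read from stdin.
# Expects input such as: "NAMEI: 1024, 0, 122975, 529, 15417248, 0"
namei_used() {
    awk -F',' 'tolower($1) ~ /^namei:/ { print $3 + 0 }'
}

# Report whether USED grew between two samples: namei_grew OLD NEW
namei_grew() {
    if [ "$2" -gt "$1" ]; then echo "growing"; else echo "stable"; fi
}

# Rough leaked memory in MiB: leak_mib USED ITEM_SIZE_BYTES
leak_mib() {
    echo $(( $1 * $2 / 1048576 ))
}

# Typical use on the server (not run here):
#   a=$(vmstat -z | namei_used); sleep 600; b=$(vmstat -z | namei_used)
#   namei_grew "$a" "$b"; leak_mib "$b" 1024
```

A single "growing" result proves little on a busy server; the leak shows up as USED climbing steadily across many samples and never coming back down.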
Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak
Daniel Braniss wrote: > > Daniel Braniss wrote: > > > > Security_Multipart(Fri_Apr_27_13_35_56_2012_748)-- > > > > Content-Type: Text/Plain; charset=us-ascii > > > > Content-Transfer-Encoding: 7bit > > > > > > > > Rick Macklem wrote > > > > in > > > > <1527622626.3418715.1335445225510.javamail.r...@erie.cs.uoguelph.ca>: > > > > > > > > rm> Steven Hartland wrote: > > > > rm> > Original Message - > > > > rm> > From: "Rick Macklem" > > > > rm> > > At a glance, it looks to me like 8.x is affected. Note > > > > that > > > > the > > > > rm> > > bug only affects the new NFS server (the experimental > > > > one > > > > for 8.x) > > > > rm> > > when exporting ZFS volumes. (UFS exported volumes don't > > > > leak) > > > > rm> > > > > > > rm> > > If you are running a server that might be affected, > > > > just: > > > > rm> > > # vmstat -z | fgrep -i namei > > > > rm> > > on the server and see if the 3rd number shown is > > > > increasing. > > > > rm> > > > > > rm> > Many thanks Rick wasnt aware we had anything experimental > > > > enabled > > > > rm> > but I think that would be a yes looking at these number:- > > > > rm> > > > > > rm> > vmstat -z | fgrep -i namei > > > > rm> > NAMEI: 1024, 0, 1, 1483, 25285086096, 0 > > > > rm> > vmstat -z | fgrep -i namei > > > > rm> > NAMEI: 1024, 0, 0, 1484, 25285945725, 0 > > > > rm> > > > > > rm> ^ > > > > rm> I don't think so, since the 3rd number (USED) is 0 here. > > > > rm> If that # is increasing over time, you have the leak. You > > > > are > > > > rm> probably running the old (default in 8.x) NFS server. > > > > > > > > Just a report, I confirmed it affected 8.x servers running > > > > newnfs. > > > > > > > > Actually I have been suffered from memory starvation symptom on > > > > that > > > > server (24GB RAM) for a long time and watching vmstat -z > > > > periodically. It stopped working once a week. I investigated > > > > the > > > > vmstat log again and found the amount of NAMEI leak was > > > > 11,543,956 > > > > (about 11GB!) 
just before the locked-up. After applying the > > > > patch, > > > > the leak disappeared. Thank you for fixing it! > > > > > > > > -- Hiroki > > And thanks Hiroki for testing it on 8.x. > > > > > this is on 8.2-STABLE/amd64 from around August: > > > same here, this zfs+newnfs has been hanging every few months, and > > > I > > > can see > > > now the leak, it's slowly increasing: > > > NAMEI: 1024, 0, 122975, 529, 15417248, 0 > > > NAMEI: 1024, 0, 122984, 520, 15421772, 0 > > > NAMEI: 1024, 0, 123002, 502, 15424743, 0 > > > NAMEI: 1024, 0, 123008, 496, 15425464, 0 > > > > > > cheers, > > > danny > > Maybe you could try the patch, too. > > > > It's at: > >http://people.freebsd.org/~rmacklem/namei-leak.patch > > > > I'll commit it to head soon with a 1 month MFC, so that hopefully > > Oliver will have a chance to try it on his production server before > > the MFC. > > > > Thanks everyone, for your help with this, rick > > I haven't applied the patch yet, but in the meanime I have been > running some > experiments on a zfs/nfs server running 8.3-STABLE, and don't see any > leaks > what triggers the leak? > Fortunately Oliver isolated this. It should leak when you do a successful "rm" or "rmdir" while running the new/experimental server. rick > thanks, > danny > > > ___ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to > "freebsd-stable-unsubscr...@freebsd.org" ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
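Since the trigger is a successful "rm" or "rmdir", one way to exercise it on a test box is to create and remove a batch of files and directories on a client mount of the ZFS export, then look at the NAMEI line on the server. The sketch below only generates the removals. The function name and the file naming scheme are hypothetical, and it assumes the directory you pass really is on the NFS mount you want to test.

```shell
#!/bin/sh
# Create and remove COUNT files and COUNT directories under DIR.
# Prints the number of successful removals.
# Usage: rm_churn DIR COUNT
rm_churn() {
    dir=$1
    count=$2
    i=0
    removed=0
    while [ "$i" -lt "$count" ]; do
        f="$dir/leaktest.$$.$i"
        # One successful "rm" ...
        : > "$f" && rm "$f" && removed=$((removed + 1))
        # ... and one successful "rmdir" per iteration.
        mkdir "$f.d" && rmdir "$f.d" && removed=$((removed + 1))
        i=$((i + 1))
    done
    echo "$removed"
}
```

For example, "rm_churn /mnt/export 10000" on the client (the mount point is a placeholder), with "vmstat -z | fgrep -i namei" run on the server before and after.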
Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak
Daniel Braniss wrote: > > Daniel Braniss wrote: > > > > Daniel Braniss wrote: > > > > > > Security_Multipart(Fri_Apr_27_13_35_56_2012_748)-- > > > > > > Content-Type: Text/Plain; charset=us-ascii > > > > > > Content-Transfer-Encoding: 7bit > > > > > > > > > > > > Rick Macklem wrote > > > > > > in > > > > > > > > > > > > <1527622626.3418715.1335445225510.javamail.r...@erie.cs.uoguelph.ca>: > > > > > > > > > > > > rm> Steven Hartland wrote: > > > > > > rm> > Original Message - > > > > > > rm> > From: "Rick Macklem" > > > > > > rm> > > At a glance, it looks to me like 8.x is affected. > > > > > > Note > > > > > > that > > > > > > the > > > > > > rm> > > bug only affects the new NFS server (the > > > > > > experimental > > > > > > one > > > > > > for 8.x) > > > > > > rm> > > when exporting ZFS volumes. (UFS exported volumes > > > > > > don't > > > > > > leak) > > > > > > rm> > > > > > > > > rm> > > If you are running a server that might be affected, > > > > > > just: > > > > > > rm> > > # vmstat -z | fgrep -i namei > > > > > > rm> > > on the server and see if the 3rd number shown is > > > > > > increasing. > > > > > > rm> > > > > > > > rm> > Many thanks Rick wasnt aware we had anything > > > > > > experimental > > > > > > enabled > > > > > > rm> > but I think that would be a yes looking at these > > > > > > number:- > > > > > > rm> > > > > > > > rm> > vmstat -z | fgrep -i namei > > > > > > rm> > NAMEI: 1024, 0, 1, 1483, 25285086096, 0 > > > > > > rm> > vmstat -z | fgrep -i namei > > > > > > rm> > NAMEI: 1024, 0, 0, 1484, 25285945725, 0 > > > > > > rm> > > > > > > > rm> ^ > > > > > > rm> I don't think so, since the 3rd number (USED) is 0 here. > > > > > > rm> If that # is increasing over time, you have the leak. > > > > > > You > > > > > > are > > > > > > rm> probably running the old (default in 8.x) NFS server. > > > > > > > > > > > > Just a report, I confirmed it affected 8.x servers running > > > > > > newnfs. 
> > > > > > > > > > > > Actually I have been suffered from memory starvation > > > > > > symptom on > > > > > > that > > > > > > server (24GB RAM) for a long time and watching vmstat -z > > > > > > periodically. It stopped working once a week. I > > > > > > investigated > > > > > > the > > > > > > vmstat log again and found the amount of NAMEI leak was > > > > > > 11,543,956 > > > > > > (about 11GB!) just before the locked-up. After applying the > > > > > > patch, > > > > > > the leak disappeared. Thank you for fixing it! > > > > > > > > > > > > -- Hiroki > > > > And thanks Hiroki for testing it on 8.x. > > > > > > > > > this is on 8.2-STABLE/amd64 from around August: > > > > > same here, this zfs+newnfs has been hanging every few months, > > > > > and > > > > > I > > > > > can see > > > > > now the leak, it's slowly increasing: > > > > > NAMEI: 1024, 0, 122975, 529, 15417248, 0 > > > > > NAMEI: 1024, 0, 122984, 520, 15421772, 0 > > > > > NAMEI: 1024, 0, 123002, 502, 15424743, 0 > > > > > NAMEI: 1024, 0, 123008, 496, 15425464, 0 > > > > > > > > > > cheers, > > > > > danny > > > > Maybe you could try the patch, too. > > > > > > > > It's at: > > > >http://people.freebsd.org/~rmacklem/namei-leak.patch > > > > > > > > I'll commit it to head soon with a 1 month MFC, so that > > > > hopefully > > > > Oliver will have a chance to try it on his production server > > > > before > > > > the MFC. > > > > > > > > Thanks everyone, for your help with this, rick > > > > > > I haven't applied the patch yet, but in the meanime I have been > > > running some > > > experiments on a zfs/nfs server running 8.3-STABLE, and don't see > > > any > > > leaks > > > what triggers the leak? > > > > > Fortunately Oliver isolated this. It should leak when you do a > > successful > > "rm" or "rmdir" while running the new/experimental server. > > > but that's what I did, I'm running the new/experimental nfs server > (or so I think :-), and did a huge rm -rf and nothing, nada, no leak. 
> To check the patch, I have to upgrade the production server, the one with the leak,
> but I wanted to test it on a non-production one first. Anyway, I'll patch the kernel
> and try it on the leaking production server tomorrow.

Well, I think the patch should be harmless.

You can check which server you are running by doing:
# nfsstat -e -s
and see if the numbers are increasing. If they're zero or not increasing,
you are running the old (default on 8.x) server.

rick

> danny
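The "are the numbers increasing" check can be scripted without depending on the exact nfsstat layout: sum every integer in the output, wait, and compare. This is only a sketch. The helper names are hypothetical, and the only command taken from the post is "nfsstat -e -s".

```shell
#!/bin/sh
# Sum every whitespace-separated integer token read from stdin.
sum_counters() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^[0-9]+$/) s += $i }
         END { printf "%d\n", s }'
}

# Guess which server is handling RPCs from two sums: server_guess BEFORE AFTER
# Zero or unchanged counters suggest the old (default on 8.x) server.
server_guess() {
    if [ "$2" -gt "$1" ]; then echo "new"; else echo "old"; fi
}

# Typical use on the server while clients are active (not run here):
#   a=$(nfsstat -e -s | sum_counters); sleep 60
#   b=$(nfsstat -e -s | sum_counters); server_guess "$a" "$b"
```

The comparison only means something if clients are actually doing NFS operations during the wait.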
Support for Intel 82599ES?
Hi All, I did not see the Intel 82599ES chipset in the hardware release notes for 8.3 or 9.0. Are these controllers supported at this time? -- Take care Rick Miller
Re: Support for Intel 82599ES?
Thanks, Jack!

Also, another support question for the list: is the Broadcom BCM5719 supported? I can find neither chip in the hardware notes for 8.3 or 9.0.

On Fri, Jun 1, 2012 at 12:49 PM, Jack Vogel wrote:
> Yes, it is supported in the ixgbe driver.
>
> Jack
>
> On Fri, Jun 1, 2012 at 8:36 AM, Rick Miller wrote:
>> Hi All,
>>
>> I did not see the Intel 82599ES chipset in the hardware release notes
>> for 8.3 or 9.0. Are these controllers supported at this time?
>>
>> --
>> Take care
>> Rick Miller

--
Take care
Rick Miller
Re: Support for Intel 82599ES?
Thanks, Michael!

I took a look at the manpage and it does appear that the BCM5719 is supported by the bge driver. The manpage also lists the 572x controllers as supported, but I heard a rumor that the BCM5720 in particular does not work even though the manpage says it does. I was unable to verify this, which is why I was asking for clarification. I will assume it works at this point.

On Fri, Jun 1, 2012 at 1:25 PM, Michael Butler wrote:
> On 06/01/12 13:06, Rick Miller wrote:
>> Thanks, Jack!
>>
>> Also another support question for the list: is the Broadcom BCM5719
>> supported? I can find neither in the hardware notes for 8.3 nor 9.0.
>
> man bge

--
Take care
Rick Miller
Kernel trap with stable/8 on DL360p G8 w/ BCM5719
Hi all, I am attempting to build stable/8 (as of 21 May 2012) on a DL360p G8 with a BCM5719. I receive a kernel panic very similar to the one at this URL: http://freebsd.1045724.n5.nabble.com/Fatal-trap-19-Stopped-at-bge-init-locked-and-bge-booting-problems-td5504461.html . The hardware notes don't specify that the BCM5719 is supported, but the bge manpage appears to indicate it is supported. Is there a definitive answer whether or not it is supported? -- Take care Rick Miller
Re: Ports from a particular date in the past... Re: Why Are You NOT Using FreeBSD?
I, for one, appreciate you changing the subject, because I didn't know this either and it's an important function in my use case, where point-in-time snapshots are important to the architects and ops folks!

On 6/6/12, grenville armitage wrote:
>
> On 06/07/2012 00:16, Chris Rees wrote:
>> On 6 June 2012 14:12, Erich wrote:
> [..]
>
>>> is my English really this bad?
>>>
>>> From the handbook:
>>>
>>> '. In particular, use only tag=. for the ports-* collections.'
>>
>> Your English is fine, but "being told to use tag=." != "tag=. is the
>> only tag that exists".
>
> Another data point:
>
> In Erich's defense, I'd say his interpretation is quite understandable.
> "...use only tag=. for the ports-* collections" also left me with the
> distinct impression (some many moons in the past) that there are no
> other meaningful (or safe) tags when csup'ing the Ports tree.
>
> In 12 years of using FreeBSD I've never really sought out Erich's use
> case (viz. roll back /usr/ports to some past known-good version), I
> just assumed it wasn't possible. So this thread has taught at least one
> person (me) a new thing -- I never fully grokked that adding "date="
> to the supfile could achieve this desired result when csup'ing the
> Ports tree. Now I know, and I've changed the Subject line of this email
> in the hope it helps some future soul googling for the answer.
>
> cheers,
> gja

--
Sent from my mobile device

Take care
Rick Miller
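For the archives, a sketch of what a date-pinned ports supfile might look like. Only "tag=." and the existence of "date=" come from this thread. The mirror host, the base and prefix paths, and the date format (believed to be yyyy.mm.dd.hh.mm.ss) are assumptions that should be checked against csup(1) before relying on them. The function name is hypothetical.

```shell
#!/bin/sh
# Print a ports supfile pinned to a point in time.
# $1: date in the form csup expects (assumed: yyyy.mm.dd.hh.mm.ss)
# $2: CVSup mirror host (placeholder default; pick a real mirror)
ports_supfile_at() {
    date_spec=$1
    host=${2:-cvsup.example.org}
    cat <<EOF
*default host=$host
*default base=/var/db
*default prefix=/usr
*default release=cvs tag=. date=$date_spec
*default delete use-rel-suffix
ports-all
EOF
}
```

Something like "ports_supfile_at 2012.01.01.00.00.00 > /tmp/ports-supfile" followed by "csup /tmp/ports-supfile" would then roll /usr/ports to that date, assuming the format above is right.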
Re: Why Are You NOT Using FreeBSD ?
Phil Regnauld wrote: > David Magda (dmagda) writes: > > On Jun 1, 2012, at 09:12, Phil Regnauld wrote: > > > > > * Gluster > > > > > > For very large FSes, nothing beats it, especially now that 3.3 > > > has been > > > released. > > > > Isilon built their OneFS on top of FreeBSD, does that count? :) > > > > Panasas too IIRC. > In the case of Panasas, I believe that they only provide a "client driver" for Linux to talk to their object storage appliance. There is an NFSv4.1 pNFS object layout that they have developed, but it requires an ODS2 (I think I got that right?) stack and it's unlikely that the client I am working on will be able to do this any time soon. rick > Good pointers, thanks. It's still "appliance", but good to know that > FreeBSD is out there :) > > Phil > ___ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to > "freebsd-stable-unsubscr...@freebsd.org" ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Why Are You NOT Using FreeBSD ?
David Magda wrote: > On Jun 1, 2012, at 21:03, Chris Nehren wrote: > > > You say your'e using ZVOLs but then recommend gluster for large > > filesystems. I would like to take a moment to point out that one of > > the > > design goals of ZFS was to scale beyond the capabilities of current > > hardware. > > > > What does gluster do that ZFS does not? I'm not trying to troll > > here, > > but am genuinely curious about ZFS's shortfalls in one of the > > problem > > domains it seeks to address. > > ZFS is for storing file systems on locally connected block devices. > Gluster is a network file system where data can be distributed over > many nodes. > > So ZFS can ensure that bits-on-disk stay safe through checksums and > mirroring / RAIDZ, while Gluster allows entire file servers to go > offline and the files are still accessible because you have a kind of > network-level RAID going on. This also helps in performance since > instead of clients pounding on one file server (as usually happens > with NFS), every write is sent to many data nodes so you're striping > across many network elements. Think of it as NFS on steroids. > > A competitive open source equivalent would be Lustre, while Isilon and > Panasas would probably be commercial alternatives (though they do NFS > / CIFS on the 'front-end' and the distributed "magic" occurs on a > 'back-end' network between the appliances). > > http://en.wikipedia.org/wiki/GlusterFS > http://en.wikipedia.org/wiki/Lustre_(file_system) > Just fyi, someone is currently working on an NFSv4.1 pNFS layout type for Lustre. As such, once that layout is implemented, the NFSv4.1 client I am working on should be able to use a Lustre server cluster. So, it could be a while (next summer, maybe?), but that should be FreeBSD eventually. (I have no idea how easy porting of the Lustre server to FreeBSD would be?) 
Having said the above, I am not familiar with either Gluster or Lustre, so take the above as based on what little I currently know, rick
Re: Kernel trap with stable/8 on DL360p G8 w/ BCM5719
On Wed, Jun 13, 2012 at 12:07 PM, Philipp Wuensche wrote:
> Rick Miller wrote:
>> Hi all,
>>
>> I am attempting to build stable/8 (as of 21 May 2012) on a DL360p G8
>> with a BCM5719. I receive a kernel panic very similar to the one at
>> this URL:
>> http://freebsd.1045724.n5.nabble.com/Fatal-trap-19-Stopped-at-bge-init-locked-and-bge-booting-problems-td5504461.html
>
> When booting 9.0-RELEASE on the exact same machine everything is fine,
> until I set an interface to UP which has an active link.

I can install stable/8 from physical media, but like you, I get the panic under the same circumstances.

--
Take care
Rick Miller
Intel X520-DA2 Supported in stable/8?
Hi All, Wondering if the Intel X520-DA2 10G Fibre NIC is supported in stable/8. Hardware notes don't specify it, but I have a system up and the interfaces appear to be loaded by the ix driver. However, status indicates "no carrier". -- Take care Rick Miller
Re: Intel X520-DA2 Supported in stable/8?
On Fri, Jun 22, 2012 at 3:13 PM, Rick Miller wrote:
> Hi All,
>
> Wondering if the Intel X520-DA2 10G Fibre NIC is supported in
> stable/8. Hardware notes don't specify it, but I have a system up and
> the interfaces appear to be loaded by the ix driver. However, status
> indicates "no carrier".

Ok, brain fart. Please forgive my ineptitude.

I once sent an email inquiring about the Intel 82599, which is this NIC. Responses to that mail say it's supported by the ixgbe driver. My stable/8 installation (5/21/2012) probes it with an ix driver that I cannot find any info on. The ixgbe manpage indicates it only supports 82598-based controllers. Not sure what to think here...
Re: Intel X520-DA2 Supported in stable/8?
On Fri, Jun 22, 2012 at 3:54 PM, Andrew Boyer wrote:
> The ixgbe driver creates devices named ix0, etc.
>
> I believe you need to run 'ifconfig ix0 up' before it will attempt to get
> link.

Thanks for clarifying that tidbit. At least I know the driver that is loading is the correct one :)

I did try bringing the interface up. It shows the interface as up, but status is still no carrier. I've had confirmation that the cable itself is good. I wonder if it matters that the upstream switch has VLAN tagging enabled?
Re: Intel X520-DA2 Supported in stable/8?
dmesg and ifconfig output below...

On Fri, Jun 22, 2012 at 4:02 PM, Rick Miller wrote:
> On Fri, Jun 22, 2012 at 3:54 PM, Andrew Boyer wrote:
>> The ixgbe driver creates devices named ix0, etc.
>>
>> I believe you need to run 'ifconfig ix0 up' before it will attempt to get
>> link.
>
> Thanks for clarifying that tidbit. At least I know the driver loading
> is the correct driver :)
>
> I did try ifup'ing the interface...it shows the interface up, status
> is still no carrier. I've had confirmation that the cable itself is
> good. I wonder if it matters that the upstream switch has VLAN
> tagging enabled?

ix0: port 0x7000-0x701f mem 0xf6b8-0xf6bf,0xf6b7-0xf6b73fff irq 40 at device 0.0 on pci7
ix0: Using MSIX interrupts with 9 vectors
ix0: RX Descriptors exceed system mbuf max, using default instead!
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: [ITHREAD]
ix0: Ethernet address: 90:e2:ba:15:e2:60
ix0: PCI Express Bus: Speed 5.0Gb/s Width x8
ix1: port 0x7020-0x703f mem 0xf6a8-0xf6af,0xf6a7-0xf6a73fff irq 44 at device 0.1 on pci7
ix1: Using MSIX interrupts with 9 vectors
ix1: RX Descriptors exceed system mbuf max, using default instead!
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: [ITHREAD]
ix1: Ethernet address: 90:e2:ba:15:e2:61
ix1: PCI Express Bus: Speed 5.0Gb/s Width x8

ix0: flags=8843 metric 0 mtu 1500
        options=401bb
        ether 90:e2:ba:XX:XX:XX
        inet 10.1.2.50 netmask 0xfe00 broadcast 10.1.3.255
        media: Ethernet autoselect
        status: no carrier
ix1: flags=8802 metric 0 mtu 1500
        options=401bb
        ether 90:e2:ba:XX:XX:XX
        media: Ethernet autoselect
        status: no carrier

--
Take care
Rick Miller
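When chasing a "no carrier" like this one, it can help to pull just the status field out of the ifconfig output so it can be polled after bringing the interface up. A small sketch, assuming the "status: ..." wording shown in the output above. The function name is hypothetical.

```shell
#!/bin/sh
# Print the link status from ifconfig output read from stdin,
# e.g. "no carrier" or "active". Only the first status line is used.
link_status() {
    awk '/status:/ { sub(/.*status: */, ""); print; exit }'
}

# Typical use (not run here):
#   ifconfig ix0 up; sleep 5; ifconfig ix0 | link_status
```

If this still prints "no carrier" after the interface is up, the host side has done what it can and the cable, optics, or switch port are the next things to check.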
Re: Intel X520-DA2 Supported in stable/8?
On Fri, Jun 22, 2012 at 5:21 PM, Jack Vogel wrote:
> Increase your system mbuf pool size, you do not want that failure to happen.

Thanks, Jack. I saw a thread where you discussed this. You are referring to kern.ipc.nmbclusters, correct? Should I also adjust the following?

hw.ixgbe.rxd
hw.ixgbe.txd
hw.ixgbe.num_queues
hw.intr_storm_threshold
Re: Intel X520-DA2 Supported in stable/8?
On Fri, Jun 22, 2012 at 7:23 PM, Jack Vogel wrote:
> Would probably be good to take care of the storm threshold if you haven't,
> set it to 0 and you disable the check, that's what we do internally. As for
> the queues and number of descriptors, that's kind of up to you, different
> work loads and environments work best with different setups.
>
> Hopefully, when you get rid of the rx ring setup failure you will get things
> working.

Thanks, Jack. I did get rid of the rx ring failure. Link status still shows no carrier. I think everything looks right from the host's perspective.

--
Take care
Rick Miller
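To summarize the two settings from this exchange in a form that can be pasted into a config file, here is a small sketch. The storm threshold of 0 is from Jack's reply. The thread gives no value for the mbuf cluster limit and never explicitly confirms that kern.ipc.nmbclusters is the knob meant, so treat both that name and whatever number you pass as assumptions. The function name is hypothetical.

```shell
#!/bin/sh
# Print name=value lines for the two tunables discussed above.
# $1: desired kern.ipc.nmbclusters value (site-specific; not from the thread)
ix_tuning_lines() {
    printf 'kern.ipc.nmbclusters=%s\n' "$1"
    # 0 disables the interrupt storm check, per the advice in the thread.
    printf 'hw.intr_storm_threshold=0\n'
}
```

The lines are in the name=value form that sysctl(8) accepts. Queue and descriptor counts are left out on purpose, since the advice was that they depend on workload.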
Re: Intel X520-DA2 Supported in stable/8?
Turns out the GBIC in the switch was bad. I didn't think there was a problem on the host, but you all still gave me some good info. I appreciate it!

On 6/25/12, Rick Miller wrote:
> On Fri, Jun 22, 2012 at 7:23 PM, Jack Vogel wrote:
>> Would probably be good to take care of the storm threshold if you haven't,
>> set it to 0 and you disable the check, that's what we do internally. As for
>> the queues and number of descriptors, that's kind of up to you, different
>> work loads and environments work best with different setups.
>>
>> Hopefully, when you get rid of the rx ring setup failure you will get
>> things working.
>
> Thanks, Jack. I did get rid of the rx ring failure. Link status
> still shows no carrier. I think everything looks right from the
> host's perspective.
>
> --
> Take care
> Rick Miller

--
Sent from my mobile device

Take care
Rick Miller
Re: Need help with nfsv4 and krb5 access denied
Herbert Poeckl wrote:
> Hi everybody.
>
> We are new to this list and need technical help.
>
> We are getting an access denied error on our Debian clients when mounting
> NFSv4 network drives with Kerberos 5 authentication.
>
> What is weird about this is that it works with one server, but not with
> a second server. The configuration on both machines is identical,
> which we have tested by booting from the same USB drive.

Ok, if I understand you correctly, you are booting the 2 machines using
the same USB root disk? Are they using DHCP to configure their network?
(I'm just checking, since they would need to boot with the same hostname
and IP address if they are using the same /etc/krb5.keytab file, i.e. they
must both think they are:
  tmp2.ist.intra@IST.INTRA
including name<->IP# resolution (/etc/hosts, DNS, or ???).)

If they are the "same host", then the only other thought is to make sure
that their time-of-day clocks are correctly set.

One simple check you can do on the server to confirm that the keytab entry
is ok is to do:
# kinit -k nfs/tmp2.ist.intra@IST.INTRA
and make sure it can put an entry in root's credential cache from the keytab.

Beyond that, I have no idea why one would work and the other not. (I always
avoid multiple encryption types for keytabs, since I've seen Heimdal get
confused about which one to use, but that normally happened to me when I was
trying to get initiator credentials from a keytab entry.)

Hopefully someone else conversant with Kerberos can help, rick

> The one where it works is an Intel-based standard workstation (HP
> DC7800). The machine where it does not work is an AMD Opteron-based
> server (Sun X4540). Any other Kerberos authentication (like smb and
> netatalk) works fine.
> > We basically followed these instructions: > http://code.google.com/p/macnfsv4/wiki/FreeBSD8KerberizedNFSSetup > > Our system configuration looks as follows: > -- 8< - >8 -- > root@tmp2:/root # uname -a > FreeBSD tmp2.ist.intra 9.0-STABLE FreeBSD 9.0-STABLE #4: Thu Jun 14 > 08:58:14 UTC 2012 r...@srv.ist.intra:/usr/obj/system/usr/src/sys/SRV > amd64 > > > root@tmp2:/root # diff /usr/src/sys/amd64/conf/GENERIC > /usr/src/sys/amd64/conf/SRV > 348a349,354 > > > > > > options KGSSAPI > > device crypto > > > > options NETATALK > > > root@tmp2:/root # cat /etc/krb5.conf > [libdefaults] > default_realm = IST.INTRA > forwardable = true > proxiable = true > > > root@tmp2:/root # ktutil list > FILE:/etc/krb5.keytab: > > Vno Type Principal > 1 aes256-cts-hmac-sha1-96 nfs/tmp2.ist.intra@IST.INTRA > 1 des3-cbc-sha1 nfs/tmp2.ist.intra@IST.INTRA > 1 arcfour-hmac-md5 nfs/tmp2.ist.intra@IST.INTRA > > ktutil: krb5_kt_start_seq_get krb4:/etc/srvtab: open(/etc/srvtab): No > such file or directory > > > root@tmp2:/root # cat /etc/exports > > V4: /tmp -sec=krb5p -network 192.168.1.0 -mask 255.255.255.0 > /tmp/blah -sec=krb5p -network 192.168.1.0 -mask 255.255.255.0 > root@tmp2:/root # > > > > root@tmp2:/root # less /var/run/dmesg.boot > FreeBSD 9.0-STABLE #4: Thu Jun 14 08:58:14 UTC 2012 > r...@srv.ist.intra:/usr/obj/system/usr/src/sys/SRV amd64 > CPU: Six-Core AMD Opteron(tm) Processor 2435 (2600.16-MHz K8-class > CPU) > Origin = "AuthenticAMD" Id = 0x100f80 Family = 10 Model = 8 > Stepping = 0 > > Features=0x178bfbff > Features2=0x802009 > AMD > Features=0xee500800 > AMD > Features2=0x37ff > TSC: P-state invariant > -- 8< - >8 -- > > Any help is greatly appreciated. 
> > Kind regards, > Herbert Poeckl > > ___ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to > "freebsd-stable-unsubscr...@freebsd.org" ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
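The keytab check suggested in this message can be wrapped up so the same principal string is used for both the keytab lookup and the kinit. A sketch only: "ktutil list" and "kinit -k" are the commands already shown in this thread, while the helper names are hypothetical and the principal is built purely from the host and realm you pass in.

```shell
#!/bin/sh
# Build the host-based NFS service principal: nfs_principal HOST REALM
nfs_principal() {
    printf 'nfs/%s@%s\n' "$1" "$2"
}

# Succeed if the keytab listing on stdin mentions the given principal.
# Usage: ktutil list | keytab_has PRINCIPAL
keytab_has() {
    grep -F -q -- "$1"
}

# Typical use on the server, as root (not run here):
#   p=$(nfs_principal tmp2.ist.intra IST.INTRA)
#   ktutil list | keytab_has "$p" && kinit -k "$p" && klist
```

If the kinit step fails on one machine but not the other with the same keytab, clock skew and name resolution are the first things to compare, as noted above.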
Re: Need help with nfsv4 and krb5 access denied
Herbert Poeckl wrote:
> Hello everyone,
>
> we did more testing on this topic.
>
> After we found a few hosts, basically HP desktop workstations with Intel
> onboard NICs, that worked and more hosts that didn't work, we placed a
> second PCI-based NIC into one of the hosts that worked.
>
> The surprising result is:
> With the onboard NIC, the NFS Kerberos mount works fine. When the second NIC
> takes over, we get an access denied!
>
> Here is the keylog of what we did.
>
> A few explanations: em0 is the embedded onboard card, em1 is the PCI
> card we plugged into the machine[1].
>
> 192.168.1.164 is the IP address the server is configured for (which is
> tmp2.ist.intra in our DNS resolution). 192.168.6.2 is just a placeholder
> address. Both NICs are connected to the same switch (there is no
> firewall or VPN configured).

Ok, from my limited knowledge of Kerberos, here is how I understand that
a host based keytab entry is used.

The NFS server will authenticate nfs/tmp2.ist.intra against the Kerberos
KDC, using the information in the keytab entry. The whole idea behind a
host based principal like "nfs/tmp2.ist.intra" is that it can only be
used by the host "tmp2.ist.intra". As such, when the Kerberos KDC receives
an authentication request for nfs/tmp2.ist.intra, it will DNS resolve
tmp2.ist.intra (to 192.168.1.164 it seems) and will compare that to the
IP address the authentication request is received from. I think this
means the KDC will fail the request if it is sent to the KDC from
192.168.6.2.

Your KDC should be logging something when this fails and the traffic you'd
need to look at is the traffic between the NFS server and the KDC.
(I'd use wireshark, since it probably knows a fair bit about Kerberos.)

My guess is that this is what is causing your failure, rick

> The system boots up with em0 as 192.168.1.164 and em1 as 192.168.6.2.[2]
> This is the configuration that works, see also the attached tcpdump on
> that interface[5].
> > Now we change the IP addresses of em0 to the placeholder address and > em1 > to the servers address and proof that the name resolution is still > available[3]. This is were we get a access denied on the linux nfs > client, see tcpdump[6]. > > When we switch the IP addresses back[4], everything starts working > again. > > > Please note: It doesn't make any difference if we configure em1 as the > server IP address and em0 as placeholder at startup time, the result > is > the same. > > > We do hope that the dump is of any use. If not, or if there are better > ways to debug the problem, your help would be welcome. > > King regards, > Herbert Poeckl > > > [1] > --- 8< >8 --- > root@tmp2:/root # dmesg | grep em0 > em0: port 0x3100-0x311f > mem > 0xf310-0xf311,0xf3125000-0xf3125fff irq 19 at device 25.0 on > pci0 > em0: Using an MSI interrupt > em0: Ethernet address: 00:0f:fe:e7:1c:ae > em0: link state changed to UP > > > root@tmp2:/root # dmesg | grep em1 > em1: port > 0x1100-0x113f mem 0xf304-0xf305,0xf300-0xf303 irq 20 > at > device 4.0 on pci7 > em1: Ethernet address: 00:1b:21:00:8b:2b > em1: link state changed to UP > --- 8< >8 --- > > > [2] > --- 8< >8 --- > root@tmp2:/root # grep em0 /etc/rc.conf > ifconfig_em0="inet 192.168.1.164 netmask 255.255.255.0" > > root@tmp2:/root # grep em1 /etc/rc.conf > ifconfig_em1="inet 192.168.6.2 netmask 255.255.255.0" > > root@tmp2:/root # grep defaultrouter /etc/rc.conf > defaultrouter="192.168.1.1" > > root@tmp2:/root # host tmp2 > tmp2.ist.intra has address 192.168.1.164 > --- 8< >8 --- > > > [3] > --- 8< >8 --- > root@tmp2:/root # ifconfig em0 192.168.6.2 netmask 255.255.255.0 ; > ifconfig em1 192.168.1.164 netmask 255.255.255.0 ; /etc/rc.d/routing > restart > route: writing to routing socket: No such process > delete net default: gateway 192.168.1.1: not in table > delete net :::0.0.0.0: gateway ::1 > delete net ::0.0.0.0: gateway ::1 > delete net fe80::: gateway ::1 > delete net ff02::: gateway ::1 > add net default: 
gateway 192.168.1.1 > add net :::0.0.0.0: gateway ::1 > add net ::0.0.0.0: gateway ::1 > add net fe80::: gateway ::1 > add net ff02::: gateway ::1 > root@tmp2:/root # > > root@tmp2:/root # host tmp2 > tmp2.ist.intra has address 192.168.1.164 > --- 8< >8 --- > > [4] > --- 8< --
Re: Need help with nfsv4 and krb5 access denied
Herbert Poeckl wrote: > On 06/28/2012 02:07 AM, Rick Macklem wrote: > > The NFS server will authenticate nfs/tmp2.ist.intra against the > > Kerberos > > KDC, using the information in the keytab entry. The whole idea > > behind a > > host based principal like "nfs/tmp2.ist.intra" is that it can only > > be > > used by the host "tmp2.ist.intra". As such, when the Kerberos KDC > > receives > > an authentication request for nfs/tmp2.ist.intra, it will DNS > > resolve > > tmp2.ist.intra (to 192.168.1.164 it seems) and will compare that to > > the > > IP address the authentication request is received from. I think this > > means the KDC will fail the request if it is sent to the KDC from > > 192.168.6.2. > > Yes, of course. There is and will be no traffic on 192.168.6.2. > > What I've tried to say (and probably failed), is that we have a > network > card in the machine, where the result is always access denied (with > the > correct server IP address set for that NIC). > Hmm, have you tried krb5 or krb5i? krb5p (which was the only one you had exported) means that the NFS RPCs are DES encrypted on the wire. This makes looking at them pretty useless in wireshark. (This comment doesn't apply to the traffic between the NFS server and the KDC, but wireshark will do a good job of decoding krb5, krb5i NFS traffic.) The only other thought I had (I have no idea if this is even possible?) is that some sort of hardware offload in the network card is screwing things up. (I don't know the em hardware, but you might try disabling TSO etc, in case the packets are somehow getting corrupted?) Good luck with it. It would be nice to know why this is happening. Since the NIC is way below the NFS layer, I can't think of any reason why NFS would care which NIC is used. rick > > > Your KDC should be logging something when this fails and the traffic > > you'd > > need to look at is the traffic between the NFS server and the KDC. 
> > (I'd use > > wireshark, since it probably knows a fair bit about Kerberos.) > > Thank you, I will give it a try. > > Kind regards, > Herbert
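A minimal sketch of the "try disabling TSO etc" suggestion above. It only prints the ifconfig commands and does not run them. The function name offload_off_cmds and the flag list (tso, rxcsum, txcsum) are assumptions made for illustration; which offloads the em hardware actually supports should be checked against ifconfig(8) and the driver's manual page.

```shell
#!/bin/sh
# Print (do not run) the ifconfig commands that would turn off some common
# hardware offloads on one interface. The flag list is an assumption, not
# something taken from the thread; adjust it to what the NIC supports.
offload_off_cmds() {
    ifname=$1
    for flag in tso rxcsum txcsum; do
        echo "ifconfig $ifname -$flag"
    done
}

# em1 is the PCI card from the posting above.
offload_off_cmds em1
```

If the access denied goes away once the printed commands have been run as root on the server, that would point at the offload path rather than at NFS or Kerberos.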
FreeBSD 8-STABLE on R620 w/ X520-DA2/Intel 82599
Hi All, I have 2 hosts, HP DL360 G8 and Dell R620. Both have the X520-DA2/Intel 82599 10G Fiber NIC. Both also have the same FreeBSD 8-STABLE image. The Dell displays the following in dmesg and we are unable to configure the ix0 or ix1 interfaces where the HP works just fine. Wondering if anyone else has experienced this? pci4: at device 0.0 (no driver attached) pci4: at device 0.1 (no driver attached) -- Take care Rick Miller
Re: FreeBSD 8-STABLE on R620 w/ X520-DA2/Intel 82599
On Fri, Jun 29, 2012 at 11:56 AM, Gary Palmer wrote: > On Fri, Jun 29, 2012 at 10:50:52AM -0400, Rick Miller wrote: >> Hi All, >> >> I have 2 hosts, HP DL360 G8 and Dell R620. Both have the >> X520-DA2/Intel 82599 10G Fiber NIC. Both also have the same FreeBSD >> 8-STABLE image. The Dell displays the following in dmesg and we are >> unable to configure the ix0 or ix1 interfaces where the HP works just >> fine. Wondering if anyone else has experienced this? >> >> pci4: at device 0.0 (no driver attached) >> pci4: at device 0.1 (no driver attached) > > Please see > > http://lists.freebsd.org/pipermail/freebsd-net/2012-June/032579.html > > it may be of some assistance. It looks like adding the Dell specific > PCI IDs may be all that's required. Hrmm, very interesting indeed. How do I identify if/when/where the source has been updated? -- Take care Rick Miller
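One way to approach the question above is to read the card's PCI IDs and compare them by hand with the IDs listed in the driver source of the tree in question. A rough sketch of the first half follows. The function name chip_ids is made up, the sample line is invented (only the Intel vendor ID 0x8086 is real), and it assumes the classic "pciconf -l" format in which chip= carries the device ID in the high 16 bits and the vendor ID in the low 16 bits.

```shell
#!/bin/sh
# chip_ids: read "pciconf -l" style lines on stdin and print
# "<name> vendor=0x.... device=0x...." for each line with a chip= field.
# Assumes chip=0xDDDDVVVV (device ID first, vendor ID last).
chip_ids() {
    sed -n 's/^\([^@]*\)@.*chip=0x\([0-9a-fA-F]\{4\}\)\([0-9a-fA-F]\{4\}\).*/\1 vendor=0x\3 device=0x\2/p'
}

# Invented sample line; real input would come from running: pciconf -l
echo 'none0@pci0:4:0:0: class=0x020000 card=0x00001028 chip=0x12348086 rev=0x01 hdr=0x00' | chip_ids
```

A device that shows up as "none" here with an Intel vendor ID but a device ID missing from the driver's ID list would fit the "no driver attached" messages quoted above.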
Re: FreeBSD 8-STABLE on R620 w/ X520-DA2/Intel 82599
On Fri, Jun 29, 2012 at 11:56 AM, Gary Palmer wrote: > On Fri, Jun 29, 2012 at 10:50:52AM -0400, Rick Miller wrote: >> Hi All, >> >> I have 2 hosts, HP DL360 G8 and Dell R620. Both have the >> X520-DA2/Intel 82599 10G Fiber NIC. Both also have the same FreeBSD >> 8-STABLE image. The Dell displays the following in dmesg and we are >> unable to configure the ix0 or ix1 interfaces where the HP works just >> fine. Wondering if anyone else has experienced this? >> >> pci4: at device 0.0 (no driver attached) >> pci4: at device 0.1 (no driver attached) > > Please see > > http://lists.freebsd.org/pipermail/freebsd-net/2012-June/032579.html > > it may be of some assistance. It looks like adding the Dell specific > PCI IDs may be all that's required. We removed an Intel branded equivalent from the DL360 and tried it in the R620. FreeBSD detected it with no problem. The only problem was that we could not see it in the BIOS, which is not a huge deal to us. -- Rick Miller
Re: new Heimdal version, was NFSv3 + krb5 mysteries - need help tracking down
Someone was/is recently working on a Heimdal upgrade, but I'm not sure if they are doing it as a port or part of the base system. Otherwise, the version of Kerberos in FreeBSD is quite old (around Heimdal 1.0.5 I think?) and it would be no surprise that the new gssapi wouldn't be supported. Maybe the person working on the newer Heimdal can comment? (I've changed the subject line so they might notice.) rick - Original Message - > Hi, > > I have a FreeBSD 9-STABLE acting as a kerberized NFSv3 server. > > server# ktutil list > FILE:/etc/krb5.keytab: > > Vno Type Principal > 5 aes256-cts-hmac-sha1-96 nfs/server.linguamatics@linguamatics.com > 5 des3-cbc-sha1 nfs/server.linguamatics@linguamatics.com > 5 arcfour-hmac-md5 nfs/server.linguamatics@linguamatics.com > > ntp in sync everywhere > > The network is a lagg device composed of two bce interfaces (an add-in > card). > > -- 8< [nfstest.sh] -- > #!/bin/bash > > i=0 > fail=0 > while [ $i -lt 100 ] > do > i=$[i+1] > echo "RUN: $i" > umount -f /mnt > sleep 1 > mount -v -o sec=krb5i,vers=3,proto=tcp server:/export/share /mnt || fail=$[fail+1] > done > echo "$fail times failed" > -- 8< -- > > centos62# ./nfstest.sh > 54 times failed > > ubuntu1204# ./nfstest.sh > 98 times failed > > ubuntu1104# ./nfstest.sh > 0 times failed > > centos58# ./nfstest.sh > 0 times failed > > I started rpc.gssd -v on all linux clients. > > The clients which did not fail are using gssapi v1 with DES. > Jun 29 18:17:41 centos58 rpc.gssd[1452]: prepare_krb5_rfc1964_buffer: > serializing keys with enctype 4 and length 8 > Jun 29 18:04:36 ubuntu1104 rpc.gssd[911]: prepare_krb5_rfc1964_buffer: > serializing keys with enctype 4 and length 8 > The failing clients are using the newer gssapi v2 with AES256. 
> Jun 29 17:59:37 ubuntu1204 rpc.gssd[756]: prepare_krb5_rfc4121_buffer: > serializing key with enctype 18 and size 32 > Jun 29 17:55:48 centos62 rpc.gssd[1183]: prepare_krb5_rfc4121_buffer: > serializing key with enctype 18 and size 32 > > Note the different RFC being used. This is just a suspicion, this may > not be related to the problem. > The cipher being used is different too. > > Then I changed my script to proto=udp. > from ubuntu1104 fails 0 times. > from centos62 fails 0 times. > > On centos58 and ubuntu1204 mount locks up all the time. > > Then I added to krb5.conf [libdefaults] > default_tgs_enctypes = dec-cbc-crc and rebooted both centos58 and > ubuntu1204. > > After rebooting centos58 and ubuntu1204: > > nfstest fails 0 times on centos58 with udp > I get very long response times for ubuntu1204 mounts and always a > permission denied. > > This is a mystery. > > I have not tested NFSv4 yet. > > I need some help to track down this problem. > > Attila > > PS: This may be the same problem as this thread: > http://lists.freebsd.org/pipermail/freebsd-stable/2012-June/068619.html
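The failure-counting loop from nfstest.sh above can be written as a small reusable helper. This is only a sketch in plain sh; count_failures is a made-up name, and the two calls at the end use true and false as harmless stand-ins for the real mount attempt.

```shell
#!/bin/sh
# count_failures N CMD [ARGS...]
# Run CMD N times and print how many of the runs exited non-zero.
count_failures() {
    runs=$1
    shift
    fail=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        i=$((i + 1))
        "$@" >/dev/null 2>&1 || fail=$((fail + 1))
    done
    echo "$fail"
}

# Stand-in commands: one that always fails, one that always succeeds.
echo "$(count_failures 5 false) of 5 failed"
echo "$(count_failures 3 true) of 3 failed"
```

On a client, the stand-in would be replaced by a shell function that does the umount, the sleep and the mount with the sec=, vers= and proto= options under test, so that the same helper can be rerun for tcp and udp.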
Re: mountd suddenly stopped working
Jason Hellenthal wrote: > On Sat, Jun 30, 2012 at 09:07:17PM +, Marcin Wisnicki wrote: > > On Sat, 30 Jun 2012 16:21:18 +0200, Ronald Klop wrote: > > > > > On Sat, 30 Jun 2012 15:53:53 +0200, Marcin Wisnicki > > > wrote: > > > > > >> I have just rebooted an old system after 100 days of uptime and > > >> this > > >> came up: > > >> > > >> Jun 30 15:39:00 ghost mountd[1592]: can't change attributes for > > >> /tftpboot Jun 30 15:39:00 ghost mountd[1592]: bad exports list > > >> line > > >> /tftpboot -ro -mapall > > >> > > > > > > Then probably somebody changed something else. Can you provide the > > > content of your exports file? > > > > > > > OK, I've found the reason. > > > > There were two paths exported with same attributes: > > > > /tftpboot -ro -mapall=nobody > > /vol/tank1 -ro -mapall=nobody > > > > As long as there is a filesystem mounted on /vol/tank1, above > > exports > > will work. Since I've disconnected that drive, there was nothing > > mounted > > this time. > > > > Apparently mountd does not allow exporting multiple paths from a > > single > > filesystem on separate lines if they happen to have identical > > attributes. > > > > It's been like that for a long time. I remember running into this way > back on 6.X and 5.X. > > The solution is to: > > /tftpboot /vol/tank1 -ro -mapall=nobody > Yea, since it doesn't specify any host/network, it is the "default" entry that covers "the rest of the world". As such, there can only be one per server file system. So, if /tftpboot and /vol/tank1 are on the same server file system, the above makes mountd happy. (I tend to use -alldirs instead of listing the mount directories, because I find it less confusing, but that's personal taste.) 
rick > -- > > - (2^(N-1))
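To spot this situation before mountd complains, the "default" entries in an exports file can be listed with a short script. This is a rough sketch under stated assumptions: default_exports is a made-up name, the classification is simplistic (a word starting with / is a path, a word starting with - is an option, anything else or a -network option counts as a host spec), and the script cannot tell which paths share a file system, so the output is only a list of candidates to check by hand.

```shell
#!/bin/sh
# default_exports: read exports(5)-style lines on stdin and print
# "<options><TAB><path>" for every path on a line that names no host and
# no -network, i.e. a "default" entry in the sense described above.
default_exports() {
    awk '
    /^[ \t]*#/ { next }          # skip comment lines
    NF == 0    { next }          # skip blank lines
    {
        npaths = 0; opts = ""; hosts = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^\//)
                paths[++npaths] = $i
            else if ($i ~ /^-network/)
                hosts = 1
            else if ($i ~ /^-/)
                opts = opts (opts == "" ? "" : " ") $i
            else
                hosts = 1
        }
        if (!hosts)
            for (i = 1; i <= npaths; i++)
                printf "%s\t%s\n", opts, paths[i]
    }' | sort
}

# The two lines from the posting above.
printf '%s\n' '/tftpboot -ro -mapall=nobody' '/vol/tank1 -ro -mapall=nobody' | default_exports
```

Real use would feed it /etc/exports on stdin. Any two paths it prints that turn out to live on the same server file system belong on one line, as in the fix quoted above.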
Re: bge problems in RELENG_9, bge0: watchdog timeout -- resetting
Hi Anders I've not had good luck with the BCM5719 in stable/8 either. Not sure if the driver has been updated in 9 or not, but I have a blog post explaining my woes with the BCM5719 at http://blog.hostileadmin.com/ On 7/3/12, Anders Nordby wrote: > Hi, > > I'm having lots of difficulties with BCM5719, which is the default > network card of HP Proliant DL 360 G8 servers. I can get a few ping > replies before I get a couple of these: > > bge0: watchdog timeout -- resetting > > bge0: watchdog timeout -- resetting > > > Then everything hangs. Can not log in using ssh. > > I'm running: FreeBSD-9.0-RELENG_9-20120701-JPSNAP-amd64 > > Info about the NIC: > > # devinfo -rv | grep phy > brgphy0 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=1 > > brgphy1 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=2 > > brgphy2 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=3 > > brgphy3 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=4 > > # grep bge /var/run/dmesg.boot > bge0: mem > 0xf6bf-0xf6bf, > 0xf6be-0xf6be,0xf6bd-0xf6bd irq 32 at device 0.0 on pci3 > > bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E > > miibus0: on bge0 > > bge0: Ethernet address: 2c:76:8a:54:08:14 > > bge1: mem > 0xf6bc-0xf6bc, > 0xf6bb-0xf6bb,0xf6ba-0xf6ba irq 36 at device 0.1 on pci3 > > bge1: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E > > miibus1: on bge1 > > bge1: Ethernet address: 2c:76:8a:54:08:15 > > bge2: mem > 0xf6b9-0xf6b9, > 0xf6b8-0xf6b8,0xf6b7-0xf6b7 irq 32 at device 0.2 on pci3 > > bge2: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E > > miibus2: on bge2 > > bge2: Ethernet address: 2c:76:8a:54:08:16 > > bge3: mem > 0xf6b6-0xf6b6, > 0xf6b5-0xf6b5,0xf6b4-0xf6b4 irq 36 at device 0.3 on pci3 > > bge3: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E > > miibus3: on bge3 > > bge3: Ethernet address: 2c:76:8a:54:08:17 > > > Searching other bug reports and posts, I've tried: > > hw.bge.allow_asf="0" > > hw.pci.enable_msi="0" > > > But it didn't 
help. Any ideas? > > If I don't use the loader.conf settings above, I also get (before the > watchdog timeouts): > > bge0: 2 link states coalesced > > bge0: 2 link states coalesced > > bge0: 2 link states coalesced > > > Best regards, > > -- > Anders. > -- Sent from my mobile device Take care Rick Miller
Re: NFS trouble on 7.3-STABLE i386
On Fri, 21 May 2010, Mark Morley wrote:

> Having an issue with a file server here (7.3-STABLE i386)
>
> The nfsd processes are hanging. Client access to the nfs shares stops working and the nfsd processes on the server cannot be killed by any means. There are no errors showing up anywhere on the server. The network connection to the server seems fine (ie: anything other than nfs traffic seems ok).
>
> Rebooting the server fixes the problem for a while, but it doesn't reboot easily. It times out on terminating the nfsd processes. When it finally does reboot the file system isn't marked clean, resulting in a long wait for fsck (although it doesn't find any problems, it's a multi terabyte share and it takes a while).
>
> This morning it did it again. This time I tried manually killing nfsd but nothing I did would make them die. No errors.

Next time it happens, do a "ps axlH" to see what the nfsd threads are waiting for. It might give you a hint as to what is happening.

rick
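With 32 nfsd threads, the "ps axlH" output is easier to read when summarised per wait channel. A minimal sketch follows; wchan_counts and sample_ps are made-up names and the sample listing is invented for illustration. The wait-channel column is located by its header (MWCHAN or WCHAN) rather than by position, since the column order is an assumption here.

```shell
#!/bin/sh
# wchan_counts: read "ps axlH"-style output on stdin and print, for lines
# mentioning nfsd only, how many threads sleep on each wait channel.
wchan_counts() {
    awk '
    NR == 1 {
        for (i = 1; i <= NF; i++)
            if ($i == "MWCHAN" || $i == "WCHAN") col = i
        next
    }
    col && $0 ~ /nfsd/ { count[$col]++ }
    END { for (w in count) printf "%s %d\n", w, count[w] }' | sort
}

# Invented sample listing; real input would come from: ps axlH
sample_ps() {
    cat <<'EOF'
UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
0 850 849 0 4 0 3060 1024 ufs D ?? 0:00.10 nfsd: server (nfsd)
0 850 849 0 4 0 3060 1024 ufs D ?? 0:00.12 nfsd: server (nfsd)
0 850 849 0 4 0 3060 1024 rpcsvc S ?? 0:01.00 nfsd: server (nfsd)
0 700 1 0 44 0 3200 900 select S ?? 0:00.01 sshd
EOF
}

sample_ps | wchan_counts
```

On the server itself the pipeline would be "ps axlH | wchan_counts", run a few times to see whether the count for one channel keeps growing.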
hung on ufs vnode lock, was Re: NFS trouble on 7.3-STABLE i386
hung on ufs vnode lock, continued. On Tue, 25 May 2010, Mark Morley wrote:

> On Fri, 21 May 2010 11:32:33 -0400 (EDT) Rick Macklem wrote:
> > On Fri, 21 May 2010, Mark Morley wrote:
> > > Having an issue with a file server here (7.3-STABLE i386)
> > >
> > > The nfsd processes are hanging. Client access to the nfs shares stops working and the nfsd processes on the server cannot be killed by any means. There are no errors showing up anywhere on the server. The network connection to the server seems fine (ie: anything other than nfs traffic seems ok).
> > >
> > > Rebooting the server fixes the problem for a while, but it doesn't reboot easily. It times out on terminating the nfsd processes. When it finally does reboot the file system isn't marked clean, resulting in a long wait for fsck (although it doesn't find any problems, it's a multi terabyte share and it takes a while).
> > >
> > > This morning it did it again. This time I tried manually killing nfsd but nothing I did would make them die. No errors.
> >
> > Next time it happens, do a "ps axlH" to see what the nfsd threads are waiting for. It might give you a hint as to what is happening.
>
> Ok, it did it again. ps axlH shows all the nfsd processes stuck in the _ufs_ state. The server isn't doing anything else, no other processes seem to be monopolizing resources or disks in any way.

If the nfsd threads are sleeping on WCHAN "ufs", I think that means that they are waiting for a ufs vnode lock. I don't know what has changed between FreeBSD7.1 and FreeBSD7.3 that might have caused this. I changed the Subject: line in the hopes that someone who might know the answer to this will take a look.

> rpcinfo doesn't show anything amiss as far as I can tell (ie: rpc is running)
>
> After a reboot, one of the 32 nfsd's almost immediately goes into the "ufs" state and never leaves it (and never racks up any CPU time either). The others are fine. Slowly over time more and more enter this state. When I rebooted it today, all but one were in that state.
>
> The clients were bogging down, presumably because the one and only functioning nfsd was overworked.
>
> One client is running 8.1-prerelease as a test, and only that particular client will start getting lots of timeouts accessing the nfs share (even with less load than the other clients). Just in case it's tickling something on the server I've shut it down this time and I'm leaving it off for the time being.

I don't think that the 8.1-prerelease client is an issue. It's just that the FreeBSD8 krpc likes to generate the "not responding" messages more aggressively. They are pretty well meaningless, imho.

rick
Re: NFS trouble on 7.3-STABLE i386
On Tue, 25 May 2010, Mark Morley wrote:

> On Fri, 21 May 2010 11:32:33 -0400 (EDT) Rick Macklem wrote:
> > On Fri, 21 May 2010, Mark Morley wrote:
> > > Having an issue with a file server here (7.3-STABLE i386)
> > >
> > > The nfsd processes are hanging. Client access to the nfs shares stops working and the nfsd processes on the server cannot be killed by any means. There are no errors showing up anywhere on the server. The network connection to the server seems fine (ie: anything other than nfs traffic seems ok).
> > >
> > > Rebooting the server fixes the problem for a while, but it doesn't reboot easily. It times out on terminating the nfsd processes. When it finally does reboot the file system isn't marked clean, resulting in a long wait for fsck (although it doesn't find any problems, it's a multi terabyte share and it takes a while).
> > >
> > > This morning it did it again. This time I tried manually killing nfsd but nothing I did would make them die. No errors.
> >
> > Next time it happens, do a "ps axlH" to see what the nfsd threads are waiting for. It might give you a hint as to what is happening.
>
> Ok, it did it again. ps axlH shows all the nfsd processes stuck in the _ufs_ state. The server isn't doing anything else, no other processes seem to be monopolizing resources or disks in any way.
>
> rpcinfo doesn't show anything amiss as far as I can tell (ie: rpc is running)
>
> After a reboot, one of the 32 nfsd's almost immediately goes into the "ufs" state and never leaves it (and never racks up any CPU time either). The others are fine. Slowly over time more and more enter this state. When I rebooted it today, all but one were in that state.
>
> The clients were bogging down, presumably because the one and only functioning nfsd was overworked.

You could try this patch. (It reverts the only vnode locking change that I can see was done to the nfs server between 7.1 and 7.3.):

--- nfs_serv.c.sav 2010-05-25 19:40:29.0 -0400
+++ nfs_serv.c 2010-05-25 19:41:38.0 -0400
@@ -3236,7 +3236,7 @@
 	io.uio_rw = UIO_READ;
 	io.uio_td = NULL;
 	eofflag = 0;
-	vn_lock(vp, LK_SHARED | LK_RETRY, td);
+	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, td);
 	if (cookies) {
 		free((caddr_t)cookies, M_TEMP);
 		cookies = NULL;
@@ -3518,7 +3518,7 @@
 	io.uio_rw = UIO_READ;
 	io.uio_td = NULL;
 	eofflag = 0;
-	vn_lock(vp, LK_SHARED | LK_RETRY, td);
+	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, td);
 	if (cookies) {
 		free((caddr_t)cookies, M_TEMP);
 		cookies = NULL;

If you get a chance to try it, please let us know if it helps,

rick
Re: NFS trouble on 7.3-STABLE i386
On Wed, 26 May 2010, Mark Morley wrote:

> Thanks, but unfortunately it didn't work. Rebooted it four hours ago with the patch in place and at the moment I have seven nfsd processes stuck in that state.
>
> Could it indicate a problem with the underlying disk system? It's an aac0 raid, but it has no errors and the controller indicates all is well, so I doubt it.

Just about anything is possible. All we seem to know at this point is that it is some change that went in between 7.1->7.3. It also doesn't appear to be the only vnode locking change that was made to the nfs server during this period. Any change applied to the aac driver might be a factor, but??

Is anyone else seeing this problem (nfsd threads stuck in wchan "ufs") on FreeBSD7.3?

rick
Re: Re: freeBSD nullfs together with nfs and "silly rename"
On Sat, 12 Jun 2010, Kostik Belousov wrote:

> On Sat, Jun 12, 2010 at 11:56:10AM +0300, Mikolaj Golub wrote:
> > On Sun, 6 Jun 2010 16:44:43 +0200 Leon Meßner wrote:
> >
> > LM> Hi,
> > LM> I hope this is not the wrong list to ask. Didn't get any answers on
> > LM> -questions.
> > LM> When you try to do the following inside a nullfs mounted directory,
> > LM> where the nullfs origin is itself mounted via nfs you get an error:
> > LM> # foo
> > LM> # tail -f foo&
> > LM> # rm -f foo
> > LM> tail: foo: Stale NFS file handle
> > LM> # fg
> > LM> This is really a problem when running services inside jails and using
> > LM> NFS as storage. As of [2] it looks like this problem is known for a
> > LM> while. On a normal NFS mount this does not happen as "silly renaming"
> > LM> [1] works there (producing nasty little .nfs files).
> >
> > nfs_sillyrename() is called when vnode's usecount is more than 1. It is expected that unlink() syscall increases vnode's usecount in namei() and if the file has been already opened usecount will be more than 1.
> >
> > But with nullfs layer present the reference counts are held by the upper node, not the lower (nfs) one, so when unlink() is called it increases usecount of the upper vnode, not nfs vnode and nfs_sillyrename() is never called.
> >
> > The straightforward solution looks like to implement null_remove() that will increase lower vnode's refcount before calling null_bypass() and then decrement it after the call. See the attached patch (it works for me on both 8-STABLE and CURRENT).
>
> The upper vnode holds a reference to the lower vnode, as you noted. Now, with your patch, I believe that _all_ calls to the nfs_remove() happen with refcount > 1.

I'm not familiar with the nullfs so this might be way off, but would this patch be ok by any chance?

Index: sys/fs/nullfs/null_vnops.c
===
--- sys/fs/nullfs/null_vnops.c (revision 208960)
+++ sys/fs/nullfs/null_vnops.c (working copy)
@@ -499,6 +499,27 @@
 }
 
 /*
+ * Increasing refcount of lower vnode is needed at least for the case
+ * when lower FS is NFS to do sillyrename if the file is in use.
+ */
+static int
+null_remove(struct vop_remove_args *ap)
+{
+	int retval;
+	struct vnode *lvp;
+
+	if (ap->a_vp->v_usecount > 1) {
+		lvp = NULLVPTOLOWERVP(ap->a_vp);
+		VREF(lvp);
+	} else
+		lvp = NULL;
+	retval = null_bypass(&ap->a_gen);
+	if (lvp != NULL)
+		vrele(lvp);
+	return (retval);
+}
+
+/*
  * We handle this to eliminate null FS to lower FS
  * file moving. Don't know why we don't allow this,
  * possibly we should.
@@ -809,6 +830,7 @@
 	.vop_open = null_open,
 	.vop_print = null_print,
 	.vop_reclaim = null_reclaim,
+	.vop_remove = null_remove,
 	.vop_rename = null_rename,
 	.vop_setattr = null_setattr,
 	.vop_strategy = VOP_EOPNOTSUPP,
Re: Re: Re: freeBSD nullfs together with nfs and "silly rename"
On Sat, 12 Jun 2010, Kostik Belousov wrote:

> Yes, I hoped that Mikolaj ends up with something similar :).
>
> Please note that this is racy, since we cannot know why usecount is greater than 1. This might cause the silly rename to kick in some time where it should not, but the race is rare.

I'd say that having silly rename happen once in a while for unlink when it doesn't have to happen is better than having the file deleted on the server while it is still open on the client.

rick
Re: nfsv4_server_enable="YES": link_elf: symbol svcpool_destroy undefined
On Sun, 13 Jun 2010, Dmitry Pryanishnikov wrote:

> Hello!
>
> I'm trying to start the experimental NFSv4 server in RELENG_8 w/o building it into the kernel, as nfsv4(4) suggests:
>
> ... or start mountd(8) and nfsd(8) with the ``-e'' option to force use of the experimental server. The nfsuserd(8) daemon must also be running. This will occur if
>
> nfs_server_enable="YES"
> nfsv4_server_enable="YES"
> nfsuserd_enable="YES"
>
> are set in rc.conf(5).
>
> However, mountd fails to start nfsd; the same problem exists when doing it by hand: "kldload nfsd" gives
>
> kernel: link_elf: symbol svcpool_destroy undefined
>
> error. Can this problem be solved w/o building kernel with "options NFSD"?

Well, if you build a kernel with any of the options that cause "krpc" to be compiled into the kernel, it works. (I usually test with a GENERIC kernel that has NFSCLIENT and NFSSERVER defined in it, so nfsd.ko loads fine.)

Basically "nfsd" is defined as dependent on "nfscommon", then "nfscommon" is defined as dependent on "krpc" and "nfssvc". This gets everything to load, but when it tries to load "nfsd.ko", it can't find the symbols in "krpc.ko" or "nfssvc.ko" if they weren't linked into the kernel.

For example, here's what I saw:

nfsv4-laptop# kldstat
Id Refs Address    Size     Name
 1   12 0xc040     d1f338   kernel
 4    1 0xc2eff000 1e000    nfsclient.ko
 5    1 0xc2ea9000 2000     nfs_common.ko
 6    2 0xc2f1d000 15000    krpc.ko
11    1 0xc2fe3000 16000    nfscommon.ko
12    1 0xc2fc5000 2000     nfssvc.ko
nfsv4-laptop# nm /boot/nkernel/krpc.ko | fgrep svcpool
cdf0 t svcpool_active
de40 t svcpool_create
e590 t svcpool_destroy
e1d0 t svcpool_maxthread_sysctl
e2b0 t svcpool_minthread_sysctl

and "nfsd" wouldn't load because it couldn't find "svcpool_destroy", just like you saw.

If you apply this patch and rebuild the module, it will find the symbols. (Is that what is supposed to happen or is something broken?)

--- fs/nfsserver/nfs_nfsdport.c.sav 2010-06-12 20:27:53.0 -0400
+++ fs/nfsserver/nfs_nfsdport.c 2010-06-12 20:37:09.0 -0400
@@ -3147,4 +3147,6 @@
 MODULE_VERSION(nfsd, 1);
 MODULE_DEPEND(nfsd, nfscommon, 1, 1, 1);
 MODULE_DEPEND(nfsd, nfslockd, 1, 1, 1);
+MODULE_DEPEND(nfsd, krpc, 1, 1, 1);
+MODULE_DEPEND(nfsd, nfssvc, 1, 1, 1);
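The kldstat check above can be scripted when the same test has to be repeated after rebuilding modules. A small sketch follows; missing_modules and sample_kldstat are made-up names, and the sample listing simply repeats the values shown in the posting. Note that it only looks at loaded .ko files, so modules compiled into the kernel would be reported as missing.

```shell
#!/bin/sh
# missing_modules NAME...: read kldstat output on stdin and print each NAME
# that does not appear in the last (Name) column.
missing_modules() {
    loaded=$(awk 'NR > 1 { print $NF }')
    for m in "$@"; do
        echo "$loaded" | grep -qxF "$m" || echo "$m"
    done
}

# Listing as shown in the posting above; real input would come from: kldstat
sample_kldstat() {
    cat <<'EOF'
Id Refs Address    Size     Name
 1   12 0xc040     d1f338   kernel
 4    1 0xc2eff000 1e000    nfsclient.ko
 5    1 0xc2ea9000 2000     nfs_common.ko
 6    2 0xc2f1d000 15000    krpc.ko
11    1 0xc2fe3000 16000    nfscommon.ko
12    1 0xc2fc5000 2000     nfssvc.ko
EOF
}

sample_kldstat | missing_modules krpc.ko nfssvc.ko nfsd.ko
```

In the situation described above, krpc.ko and nfssvc.ko are loaded and only nfsd.ko is missing, which matches the point that the symbols exist but nfsd has no declared dependency on the modules that provide them.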
Re: Re: Re: Re: freeBSD nullfs together with nfs and "silly rename"
On Sun, 13 Jun 2010, Kostik Belousov wrote:

> My note was not an objection, only a note.
>
> Also, when committing, please add a comment explaining what is going on.

Righto, and my response was just my opinion. I'm assuming Mikolaj is looking at committing this?

rick
Re: diskless boot, nfs server behind router
On Fri, 25 Jun 2010, al...@ulgsm.ru wrote:

> Hi all.
>
> I tried to set up a server for booting diskless hosts from different networks. In one network booting is ok. I see that the realtek 8139 pxe can't load the pxeboot file from a tftp server on another network. By changing options in the dhcp server, I got pxeboot to load the kernel from this server, but when the kernel tries to mount the nfs root file system, it fails. Later mounting from /etc/fstab is ok.
>
> Maybe I'm wrong, but is diskless booting in several networks possible only by using several servers, one in each network? I think the pxe rom on the nic and the kernel nfs root mounting can't work across layer 3.

From a quick glance at the code, I think that the dhcp server must return the gateway the client uses to get to the server. (ie. it must be an ip addr on the diskless client's network for the gateway to where the server is) It looks like this will then be used to set boot.netif.gateway to the correct value for the kernel.

So you might want to check how your dhcpd is configured w.r.t. gateway address?

rick
Re: diskless boot, nfs server behind router
On Sat, 26 Jun 2010, al...@ulgsm.ru wrote:

[stuff snipped]

> dhcp seems ok.
>
> [alexs:ul-it13:~]>kenv
> LINES="24"
> acpi_load="YES"
> boot.netif.gateway="10.144.140.1"
> boot.netif.hwaddr="00:1c:c0:5a:f4:72"
> boot.netif.ip="10.144.142.78"
> boot.netif.netmask="255.255.252.0"
> boot.nfsroot.nfshandle="Xbb55e849c6f9fd520c00011c0600c20af931X"
> boot.nfsroot.path="/exp/fbsd71"
> boot.nfsroot.server="10.144.140.160"
>
> It is from a diskless host in the same subnet as the tftp and nfs server.

Ok, this is when the server is on the same subnet and works ok. Am I correct?

> If I use a server from another subnet, pxe can't load pxeboot, and the kernel can't mount boot.nfsroot.path="/exp/fbsd71".

I thought you had said that you had gotten pxeboot to work across to the other subnet and that it was the mount of "/" that then failed? (It is that case where "boot.netif.gateway" needs to be checked to see that it has the gateway for the client's subnet.)

> About pxe, google found
> http://www.appdeploy.com/faq/detail.asp?id=10
> As I understand it, the router must provide an ip helper. I have seen it in cisco routers, not freebsd.

If you haven't been able to get pxeboot to work when the server is on a different subnet, I'm afraid I can't help with that.

rick
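The check suggested above, that boot.netif.gateway is an address on the client's own subnet, is simple arithmetic on the kenv values. A minimal sketch in plain sh; the function name same_subnet is made up, and the addresses in the demonstration are the ones from the kenv output above.

```shell
#!/bin/sh
# same_subnet IP1 IP2 NETMASK: succeed if both IPv4 addresses fall in the
# same network under NETMASK (all three in dotted-quad form).
same_subnet() {
    old_ifs=$IFS
    IFS=.
    set -- $1 $2 $3      # split the three dotted quads into 12 octets
    IFS=$old_ifs
    [ $(( $1 & $9 ))    -eq $(( $5 & $9 )) ]    &&
    [ $(( $2 & ${10} )) -eq $(( $6 & ${10} )) ] &&
    [ $(( $3 & ${11} )) -eq $(( $7 & ${11} )) ] &&
    [ $(( $4 & ${12} )) -eq $(( $8 & ${12} )) ]
}

# boot.netif.ip, boot.netif.gateway and boot.netif.netmask from above.
if same_subnet 10.144.142.78 10.144.140.1 255.255.252.0; then
    echo "gateway is on the client's subnet"
else
    echo "gateway is NOT on the client's subnet"
fi
```

With the /22 netmask shown, 142 & 252 and 140 & 252 both give 140, so the gateway and the client are in the same network. For a client on another subnet, the same test with that client's values shows whether dhcpd handed out a usable gateway.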