classes and kernel_cookie was Re: Specifying root mount options on diskless boot.
...
> I note that the response to your message from "danny" offers the ability
> to pass arguments to the nfs mount command, but also seems to offer a fix
> for the fact that "classes" are not supported under PXE:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/90368
>
> I hope "danny" will offer a patch to mainline code - it would be an
> important improvement (and already promised in the documentation).
...

I'm willing to try to add the missing pieces, but I need a better
explanation of what they are. For example, I have no clue what the
kernel_cookie is used for, nor what ${class} is all about.

BTW, it would be kind if this line in pxeboot(8):
	As PXE is still in its infancy ...
could be changed :-)

"danny"
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Living on gmirror: need to reincarnate /etc/rc.early
> On 01/25/2011 12:28, Kostik Belousov wrote:
> > No, my use for rc.early is different. I use it to load modules
> > before filesystems are mounted.
>
> Ok, I'll bite ... what is deficient about doing this in /boot/loader.conf?
>

In the diskless case, where the root (and with it /boot/loader.conf) is
shared, it's nice to be able to configure clients via rc.conf.

danny
harmless zfs warnings?
hi,
I have one disk, labeled r0 (/dev/mfid0), which I gpart'ed like so:

=>        34  1952448445  mfid0  GPT  (931G)
          34         128      1  freebsd-boot  (64K)
         162     4194304      2  freebsd-ufs  (2.0G)
     4194466   100663296      3  freebsd-swap  (48G)
   104857762  1847590717      4  freebsd-zfs  (881G)

and a second 'disk' labeled r5 (/dev/mfid1).
Now, doing a 'zpool import':

  pool: z
    id: 784424638598804
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:
	z             ONLINE
	  gpt/r0/zfs  ONLINE

  pool: h
    id: 535400138652241
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:
	h           ONLINE
	  label/r5  ONLINE

what caught my attention was the following message on the console:

ZFS WARNING: Unable to attach to gpt/r0/swap.
ZFS WARNING: Unable to attach to mfid0p3.

Should I really get worried?

thanks,
danny
Re: statd/lockd startup failure
> Under 8.2-PRERELEASE (GENERIC kernel), about 15% of the times I boot up
> (with rpc.statd and rpc.lockd enabled in rc.conf), I get:
>
> Feb 4 07:31:11 wonderland rpc.statd: bindresvport_sa: Address already in use
> Feb 4 07:31:11 wonderland root: /etc/rc: WARNING: failed to start statd
>
> and slightly later:
>
> Feb 4 07:31:36 wonderland kernel: NLM: unexpected error contacting NSM,
> stat=5, errno=35
>
> I can start rpc.statd and rpc.lockd manually at this point (and I have to
> start them to run firefox and mail with my NFS-mounted home directory and
> mail spool). But what might cause the above errors? -- George Mitchell

We have been seeing this too, with the addition of mountd, so I decided to
try to track it down.
rpc.lockd, rpc.statd and mountd all share the same code for allocating the
address/port. I added some more info to be displayed in case of error,
mainly the ai_family and port, and after many successful reboots I got:

Mar 9 09:18:19 chamsa mountd[1070]: bindresvport_sa: (2/617) Address already in use

but:
chamsa> rpcinfo | grep mountd
    100005    1    udp       0.0.0.0.2.105    mountd    superuser
    100005    3    udp       0.0.0.0.2.105    mountd    superuser
    100005    1    tcp       0.0.0.0.2.105    mountd    superuser
    100005    3    tcp       0.0.0.0.2.105    mountd    superuser

BTW, the trailing 2.105 in 0.0.0.0.2.105 is port 617 (2*256 + 105), and the
2 in (2/617) is AF_INET.

The above is weird, since the rpc stuff happens after the
bindresvport_sa(...) call.

danny
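For readers who want to check the port arithmetic against rpcinfo output, here is a minimal sketch in Python. It assumes the usual universal-address layout, where the last two dot-separated fields are the high and low bytes of the port; the function name uaddr_to_port is made up for this note and is not part of any FreeBSD tool.

```python
def uaddr_to_port(uaddr: str) -> int:
    """Return the port encoded in the last two fields of an RPC universal address.

    The address part may be IPv4 ("0.0.0.0") or IPv6 ("::"); only the final
    two dot-separated fields (port high byte, port low byte) are used.
    """
    hi, lo = uaddr.rsplit(".", 2)[-2:]
    return int(hi) * 256 + int(lo)


if __name__ == "__main__":
    # The address from the rpcinfo listing above.
    print(uaddr_to_port("0.0.0.0.2.105"))  # 617
```

The same helper works on the IPv6 form that rpcinfo prints (for example "::.3.141"), since only the two trailing fields are read.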
Re: statd/lockd startup failure
> >
> > Thanks for the analysis. The reason I originally posted is to see why
> > this might have popped up in 8.x, as it never happened in 7.x.
> > -- George Mitchell
> >
> I suspect two things make this occur more frequently with 8.x. One is
> that it does IPv6 first (I suspect IPv6 wasn't enabled by default on 7.x?).
>
> The other is the port randomization code, which probably results in
> more frequent collisions with port #s used by other things. (Basically,
> the code selects an unused port# for either UDP or TCP over IPv6 (I can't
> remember which comes first:-) and then expects that port to be available
> for the other 3 combinations of UDP/TCP x IPv6/IPv4.

Another reason is probably multi-core machines: most of these daemons fork
very early, and very quickly they are competing for resources in parallel.

danny
PS: rick, can you send me your patch?
Re: statd/lockd startup failure
>> On 02/18/2011 10:08, Rick Macklem wrote:
>> > The attached patches changes the behaviour so that it tries to
>> > get an unused port for each of the 4 cases.
>>
>> can you send me the patches?
>> thanks,
>> danny
> They're attached. If you get to test them, please let me know
> how it goes.
>
> rick

Hi Rick,
the good side of living in different time zones :-)
I got impatient, so I came up with a different fix.
The rationale is that, IMHO, there is no need for all listeners
to be on the same port:

rnd> rpcinfo protonew |grep mountd
    100005    1    udp6      ::.3.141         mountd    superuser
    100005    3    udp6      ::.3.141         mountd    superuser
    100005    1    tcp6      ::.3.141         mountd    superuser
    100005    3    tcp6      ::.3.141         mountd    superuser
    100005    1    udp       0.0.0.0.3.141    mountd    superuser
    100005    3    udp       0.0.0.0.3.141    mountd    superuser
    100005    1    tcp       0.0.0.0.3.92     mountd    superuser  <---
    100005    3    tcp       0.0.0.0.3.92     mountd    superuser  <---
rnd> rpcinfo -t protonew mountd
program 100005 version 1 ready and waiting
rpcinfo: RPC: Program/version mismatch; low version = 1, high version = 3
program 100005 version 2 is not available
program 100005 version 3 ready and waiting

the patches are in:
ftp://ftp.cs.huji.ac.il/users/danny/freebsd/patches/address_already_in_use/

cheers,
danny
Re: statd/lockd startup failure
> > >> On 02/18/2011 10:08, Rick Macklem wrote:
> > >> > The attached patches changes the behaviour so that it tries to
> > >> > get an unused port for each of the 4 cases.
> > >>
> > >> can you send me the patches?
> > >> thanks,
> > >> danny
> > >
> > > They're attached. If you get to test them, please let me know
> > > how it goes.
> > >
> > > rick
> >
> > Hi Rick,
> > the good side of living on different time zones :-)
> > I got impatient, so I came up with a different fix.
> > The rational is that IMHO, there is no need for all listeners
> > to be on the same port:
> > rnd> rpcinfo protonew |grep mountd
> >     100005    1    udp6      ::.3.141         mountd    superuser
> >     100005    3    udp6      ::.3.141         mountd    superuser
> >     100005    1    tcp6      ::.3.141         mountd    superuser
> >     100005    3    tcp6      ::.3.141         mountd    superuser
> >     100005    1    udp       0.0.0.0.3.141    mountd    superuser
> >     100005    3    udp       0.0.0.0.3.141    mountd    superuser
> >     100005    1    tcp       0.0.0.0.3.92     mountd    superuser  <---
> >     100005    3    tcp       0.0.0.0.3.92     mountd    superuser  <---
> > rnd> rpcinfo -t protonew mountd
> > program 100005 version 1 ready and waiting
> > rpcinfo: RPC: Program/version mismatch; low version = 1, high version = 3
> > program 100005 version 2 is not available
> > program 100005 version 3 ready and waiting
> >
> > the patches are in:
> > ftp://ftp.cs.huji.ac.il/users/danny/freebsd/patches/address_already_in_use/
> >
> > cheers,
> > danny
> >
> Yep, a patch that doesn't make them all use the same port# is much
> simpler. However, others, such as Doug Barton feel that it is important
> that they use the same port#. (Something he called "tracking".)

The problem is that trying to get the same port for all of
tcp/udp/inet/inet6, though it might succeed most of the time, will fail
sometimes, and then what?
I saw Doug's comment, and also the :). It's not as simple as tracking port
80 or 25 and needs some effort, but it's deterministic/programmable, and in
the worst case you can still use the -p option (which again will fail
sometimes :-).
IMHO, having a system that might fail to reboot is not very pleasant.

danny
Re: statd/lockd startup failure
> On 03/12/2011 02:21, Daniel Braniss wrote:
> > The problem with trying to get the same port for all tcp/udp/inet/inet6
> > though might succeed most of the time, will fail sometimes, then what?
>
> Can you please describe the scenario when it's completely impossible to
> find a port that's open on all 4 families?

I did not say impossible. Considering that Rick asked how many times he
should try, unless N is forever, it could fail.

> > I saw Doug's comment, and also the :), it's not as simple as tracking port
> > 80 or 25, needs some effort, but it's deterministic/programmable, and worst
> > case you can still use the -p option (which again will fail sometimes :-).
>
> Given that Rick has already written the patch, I don't think it's at all
> unreasonable to put it in as the first choice, perhaps with a fallback
> to picking any available port if there isn't one available for all 4
> families.
>

As Rick mentioned, the patch is not trivial, and to quote him:
"My only concern with the "same port# patch" is that it is more complex
and, therefore, somewhat riskier w.r.t. my having gotten it wrong."

> Meanwhile, I don't think I'm the only person who has ever had trouble
> trying to track down network traffic from "random" ports that would
> prefer that doing so not be made harder by having the same service on
> the same host using 4 different ports.

To track rpc based traffic, which means a random port to start with, you
have to check with rpcinfo anyway. So yes, it's harder than tracking 1
port, but IMHO less complex than the patch required :-). BTW, mountd is
already heavily patched, rpc.statd less so, and rpc.lockd is, so far, the
only one that is not complaining - guess Rick is a good programmer!
And I consider myself lucky that we don't use NIS/yellow-pages.

danny
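The two positions in this thread (same port for all four listeners versus any free port per listener) can be combined as Doug suggests: try for a shared port first, then fall back. Below is a rough Python simulation of that policy, not the actual mountd or rpc.statd code and not Rick's patch. The names pick_ports and COMBOS, the default of 10 tries, and the 600-1023 reserved range are all assumptions made for illustration; "in_use" stands in for whatever the kernel would report via bind() failures.

```python
import random

# The four listener combinations discussed in the thread.
COMBOS = [("udp", "inet6"), ("tcp", "inet6"), ("udp", "inet"), ("tcp", "inet")]


def pick_ports(in_use, tries=10, lo=600, hi=1023, rng=random):
    """Pick a port for each (proto, family) pair.

    in_use maps each pair in COMBOS to the set of ports already taken.
    First choice: one port that is free for all four pairs, found within
    `tries` random attempts. Fallback: any free port per pair, so startup
    does not fail just because no shared port was found in time.
    """
    for _ in range(tries):
        port = rng.randint(lo, hi)
        if all(port not in in_use[c] for c in COMBOS):
            return {c: port for c in COMBOS}

    # No shared port found within the retry budget: choose independently.
    result = {}
    for c in COMBOS:
        free = [p for p in range(lo, hi + 1) if p not in in_use[c]]
        if not free:
            raise RuntimeError("no free reserved port for %s/%s" % c)
        result[c] = rng.choice(free)
    return result


if __name__ == "__main__":
    nothing_taken = {c: set() for c in COMBOS}
    print(pick_ports(nothing_taken))
```

With nothing in use, all four listeners end up on one port; the fallback only kicks in when the retry budget runs out, which is the "unless N is forever" case raised above.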
mountd stuck
I am running mountd with -e (experimental :-)
Lately it happens too often that mountd just stops responding:

mountd 11762 [dp->dp_config_rwlock] 8.93r 0.00u 0.00s 0% 1320k

any help/clues?

thanks,
danny
mountd stuck in ZFS code.
I have been running the experimental nfs/mountd for some time now, and it
mostly works, except in this particular case, where mountd just gets
stuck:

mountd 11762 [dp->dp_config_rwlock] 8.93r 0.00u 0.00s 0% 1320k

and stops responding.

I can't reproduce it at will, but it happens quite often.
The host in question is an nfs/zfs server, running 8-stable with:
	ZFS pool version 15
	ZFS filesystem version 4

cheers,
danny
portmaster goes into a loop
hi,
this:
	portmaster p5-libwww-5.837

goes into a loop:
...
===>>> The dependency for net/p5-Net-HTTP
       seems to be handled by p5-libwww-5.837

===>>> Launching child to update p5-libwww-5.837 to p5-libwww-6.02
	p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >>
	[... "p5-libwww-5.837 >>" repeated many more times ...]
	p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837

===>>> Port directory: /usr/ports/www/p5-libwww
...

how can I fix this?
(the loop I kill with ^C :-), it's the going into a loop that I want to fix.

thanks,
danny
Re: portmaster goes into a loop
> hi,
> this:
> portmaster p5-libwww-5.837
>
> goes into a loop:
> ...
> ===>>> The dependency for net/p5-Net-HTTP
>        seems to be handled by p5-libwww-5.837
>
> ===>>> Launching child to update p5-libwww-5.837 to p5-libwww-6.02
> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >>
> [... "p5-libwww-5.837 >>" repeated many more times ...]
>
> ===>>> Port directory: /usr/ports/www/p5-libwww
> ...
>
> how can I fix this?
> (the loop I kill with ^C :-), it's the going into a loop that I want to fix.
>

p5-libwww depends on p5-Net-HTTP, but p5-Net-HTTP says it conflicts with
p5-libwww-5*.
Maybe portmaster could catch this conflict better?

danny
Re: can't ping local address
> Hi,
>
> Does any one have this [1] problem? or just know how to fix it?
>
> [1] http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/159103
>

The bug has been around for a while, and the fix won't work on a diskless
host :-), since downing the link will hang the host.

danny

> --
> Andrey Zonov
8.3-PRERELEASE and ATA_CAM
with the latest svn, I can't compile a kernel with options ATA_CAM:

...
linking kernel.debug
ata-disk.o(.text+0x93): In function `ad_init':
/r+d/stable/8.3/sys/dev/ata/ata-disk.c:389: undefined reference to `ata_setmode'
ata-disk.o(.text+0xaa):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:397: undefined reference to `ata_wc'
ata-disk.o(.text+0xc5):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:398: undefined reference to `ata_controlcmd'
ata-disk.o(.text+0x113):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:400: undefined reference to `ata_controlcmd'
ata-disk.o(.text+0x133):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:393: undefined reference to `ata_controlcmd'
ata-disk.o(.text+0x16d):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:407: undefined reference to `ata_controlcmd'
ata-disk.o(.text+0x21a): In function `ad_shutdown':
/r+d/stable/8.3/sys/dev/ata/ata-disk.c:196: undefined reference to `ata_controlcmd'
ata-disk.o(.text+0x45c): In function `ad_detach':
/r+d/stable/8.3/sys/dev/ata/ata-disk.c:182: undefined reference to `ata_fail_requests'
...

danny
Re: 8.3-PRERELEASE and ATA_CAM
> On Fri, Apr 06, 2012 at 10:48:13AM +0300, Daniel Braniss wrote:
> > with the latest svn, I can't compile kernel with options ATA_CAM:
> >
> > ...
> > linking kernel.debug
> > ata-disk.o(.text+0x93): In function `ad_init':
> > /r+d/stable/8.3/sys/dev/ata/ata-disk.c:389: undefined reference to `ata_setmode'
> > ata-disk.o(.text+0xaa):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:397: undefined reference to `ata_wc'
> > ata-disk.o(.text+0xc5):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:398: undefined reference to `ata_controlcmd'
> > ata-disk.o(.text+0x113):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:400: undefined reference to `ata_controlcmd'
> > ata-disk.o(.text+0x133):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:393: undefined reference to `ata_controlcmd'
> > ata-disk.o(.text+0x16d):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:407: undefined reference to `ata_controlcmd'
> > ata-disk.o(.text+0x21a): In function `ad_shutdown':
> > /r+d/stable/8.3/sys/dev/ata/ata-disk.c:196: undefined reference to `ata_controlcmd'
> > ata-disk.o(.text+0x45c): In function `ad_detach':
> > /r+d/stable/8.3/sys/dev/ata/ata-disk.c:182: undefined reference to `ata_fail_requests'
> > ...
> >
>
> You seem to be using a mutually exclusive set of ata(4) options and
> devices (previously, this erroneously wasn't a bug). When including
> options ATA_CAM you do _not_ want to also include any of the following
> devices:
> device	atapicam
> device	atadisk
> device	ataraid
> device	atapicd
> device	atapifd
> device	atapist
>
> Instead you need the corresponding driver from the following set:
> device	scbus
> device	ch
> device	da
> device	sa
> device	cd
> device	pass
>
> Marius
>

They are included by GENERIC, which I include, bummer.
What about ATA_STATIC_ID? I guess that is also a no-no?

thanks,
danny
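Marius's rule (no legacy ata(4) devices together with options ATA_CAM) is easy to check mechanically before a long kernel build. The following is a small Python sketch under stated assumptions: it only looks at the text it is given, so it does not follow "include GENERIC" lines, and the helper name ata_cam_conflicts is invented here. The device list is taken from the message above.

```python
# Legacy ata(4) devices that Marius lists as incompatible with ATA_CAM.
LEGACY_ATA_DEVICES = {
    "atapicam", "atadisk", "ataraid", "atapicd", "atapifd", "atapist",
}


def ata_cam_conflicts(config_text):
    """Return the legacy devices still enabled alongside 'options ATA_CAM'.

    Lines are processed in order, so a later 'nodevice X' cancels an earlier
    'device X'. Comments starting with '#' are ignored. 'include' lines are
    NOT followed: paste the included file's text in front if needed.
    """
    devices = set()
    ata_cam = False
    for raw in config_text.splitlines():
        words = raw.split("#", 1)[0].split()
        if len(words) < 2:
            continue
        keyword, name = words[0], words[1]
        if keyword == "options" and name == "ATA_CAM":
            ata_cam = True
        elif keyword == "device":
            devices.add(name)
        elif keyword == "nodevice":
            devices.discard(name)
    return sorted(devices & LEGACY_ATA_DEVICES) if ata_cam else []


if __name__ == "__main__":
    sample = """
    device   atadisk    # ATA disk drives
    device   atapicd    # ATAPI CDROM drives
    options  ATA_CAM
    nodevice atapicd
    """
    print(ata_cam_conflicts(sample))  # ['atadisk']
```

An empty list means no conflict was found in the text that was checked; it says nothing about devices pulled in by an included file.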
Re: Time Clock Stops in FreeBSD 9.0 guest running under ESXi 5.0
> On Thu, 2012-03-22 at 18:13 +0200, Volodymyr Kostyrko wrote:
> > Andriy Gapon wrote:
> > > on 22/03/2012 17:33 Volodymyr Kostyrko said the following:
> > >> Andriy Gapon wrote:
> > >>> on 22/03/2012 15:19 Mike Tkachuk said the following:
> > >>>> kern.eventtimer.periodic: 0
> > >>>
> > >>> It might make sense to try 1 here.
> > >>> Also you could attempt to involve mav@ directly - he is the author of
> > >>> the code and an expert on it.
> > >>
> > >> Better ask before setting as this doubles hpet0 (with HPET) or
> > >> cpu0:timer (with LAPIC) interrupt rate for me.
> > >
> > > Does it make your system unusable?
> > > Are you comparing with pre-eventtimers version of FreeBSD?
> >
> > In short term - no. Haven't tested it thoroughly. Results are the same
> > (double interrupt rate according to `systat 1 -v`) for:
> > * i386 and amd64 9-STABLE;
> > * amd64 9.0.
> >
> > As everything related to timing/freq/acpi can be unpredictive I wouldn't
> > recommend this to anyone. I own at least two Intel CPU's failing
> > somewhere near timing/apic when loading cpufreq and enabling powerd.
> >
>
> I'm not sure I understand that advice. We have someone whose system is
> failing (time stops counting) when using the new event timer code. The
> recommendation is to set kern.eventtimer.periodic=1, which as I
> understand it makes the new code work more like it did before. That
> seems to be a reasonable attempt to work around the problem.
>
> If it works, the system becomes 100% more usable than it is now, even if
> that comes at the cost of timers interrupting twice as fast as they did
> in previous OS releases. It also generates another datapoint that might
> somehow help track down why the event timer code has trouble on some
> hardware. Enough such datapoints may eventually lead to an "aha -- it
> happens on all systems that have the xyz chipset."

Just a me too, but this machine was running 8.2-stable!
Since it's a production machine, I had no choice but to reboot it.
Also the BIOS time got stuck, so I had to fix the time manually! ntpd
doesn't like to advance past a certain delta.

cheers,
danny
lost devices in 8.3
hi,
I'm trying to upgrade this old opteron box, which is running 8.2,
but when booting 8.3 the disks disappear.

with 8.2:
...
atapci1@pci0:0:7:1:	class=0x01018a card=0x74691022 chip=0x74691022 rev=0x03 hdr=0x00
    vendor     = 'Advanced Micro Devices (AMD)'
    device     = 'UltraATA/133 Controller (AMD-8111)'
    class      = mass storage
    subclass   = ATA
...
atapci0@pci0:3:5:0:	class=0x010400 card=0x61141095 chip=0x31141095 rev=0x02 hdr=0x00
    vendor     = 'Silicon Image Inc (Was: CMD Technology Inc)'
    device     = 'SATALink/SATARaid Controller (Sil 3114)'
    class      = mass storage
    subclass   = RAID

but none on 8.3:
none0@pci0:0:7:1:	class=0x01018a card=0x74691022 chip=0x74691022 rev=0x03 hdr=0x00
    vendor     = 'Advanced Micro Devices (AMD)'
    device     = 'UltraATA/133 Controller (AMD-8111)'
    class      = mass storage
    subclass   = ATA
...
none3@pci0:3:5:0:	class=0x018000 card=0x31141095 chip=0x31141095 rev=0x02 hdr=0x00
    vendor     = 'Silicon Image Inc (Was: CMD Technology Inc)'
    device     = 'SATALink/SATARaid Controller (Sil 3114)'
    class      = mass storage

and the only diff in the configuration is that 8.3 has:

options 	ATA_CAM
nodevice	ata
nodevice	atadisk		# ATA disk drives
nodevice	ataraid		# ATA RAID drives
nodevice	atapicd		# ATAPI CDROM drives
nodevice	atapifd		# ATAPI floppy drives
nodevice	atapist		# ATAPI tape drives

cheers,
danny
Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak
> Rick Macklem wrote
> in <1527622626.3418715.1335445225510.javamail.r...@erie.cs.uoguelph.ca>:
>
> rm> Steven Hartland wrote:
> rm> > Original Message -
> rm> > From: "Rick Macklem"
> rm> > > At a glance, it looks to me like 8.x is affected. Note that the
> rm> > > bug only affects the new NFS server (the experimental one for 8.x)
> rm> > > when exporting ZFS volumes. (UFS exported volumes don't leak)
> rm> > >
> rm> > > If you are running a server that might be affected, just:
> rm> > > # vmstat -z | fgrep -i namei
> rm> > > on the server and see if the 3rd number shown is increasing.
> rm> >
> rm> > Many thanks Rick wasnt aware we had anything experimental enabled
> rm> > but I think that would be a yes looking at these number:-
> rm> >
> rm> > vmstat -z | fgrep -i namei
> rm> > NAMEI: 1024, 0, 1, 1483, 25285086096, 0
> rm> > vmstat -z | fgrep -i namei
> rm> > NAMEI: 1024, 0, 0, 1484, 25285945725, 0
> rm> >
> rm> ^
> rm> I don't think so, since the 3rd number (USED) is 0 here.
> rm> If that # is increasing over time, you have the leak. You are
> rm> probably running the old (default in 8.x) NFS server.
>
> Just a report, I confirmed it affected 8.x servers running newnfs.
>
> Actually I have been suffered from memory starvation symptom on that
> server (24GB RAM) for a long time and watching vmstat -z
> periodically. It stopped working once a week. I investigated the
> vmstat log again and found the amount of NAMEI leak was 11,543,956
> (about 11GB!) just before the locked-up. After applying the patch,
> the leak disappeared. Thank you for fixing it!
>
> -- Hiroki

This is on 8.2-STABLE/amd64 from around August.
Same here: this zfs+newnfs server has been hanging every few months, and
now I can see the leak, it's slowly increasing:

NAMEI: 1024, 0, 122975, 529, 15417248, 0
NAMEI: 1024, 0, 122984, 520, 15421772, 0
NAMEI: 1024, 0, 123002, 502, 15424743, 0
NAMEI: 1024, 0, 123008, 496, 15425464, 0

cheers,
danny
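Rick's check ("see if the 3rd number shown is increasing") can be scripted over a series of saved samples. The Python sketch below assumes lines shaped like the NAMEI lines quoted in this thread, with SIZE as the 1st number and USED as the 3rd. The function names are invented for this note, and "strictly increasing across all samples" is a deliberately crude heuristic, not a definitive leak test.

```python
def namei_fields(line):
    """Split a 'vmstat -z' NAMEI line into its integer columns."""
    _, _, rest = line.partition(":")
    return [int(f) for f in rest.split(",")]


def namei_used(line):
    """USED is the 3rd number after the zone name."""
    return namei_fields(line)[2]


def namei_bytes(line):
    """Approximate memory held by the zone: item SIZE times USED."""
    fields = namei_fields(line)
    return fields[0] * fields[2]


def looks_like_leak(samples):
    """True if USED rises in every consecutive pair of samples."""
    used = [namei_used(s) for s in samples]
    return len(used) >= 2 and all(b > a for a, b in zip(used, used[1:]))


if __name__ == "__main__":
    samples = [
        "NAMEI: 1024,0, 122975, 529, 15417248,0",
        "NAMEI: 1024,0, 122984, 520, 15421772,0",
        "NAMEI: 1024,0, 123002, 502, 15424743,0",
        "NAMEI: 1024,0, 123008, 496, 15425464,0",
    ]
    print(looks_like_leak(samples))  # True
```

The SIZE times USED product is also how Hiroki's "about 11GB" follows from 11,543,956 leaked items of 1024 bytes each. On a busy server USED will fluctuate, so samples taken hours apart are more telling than back-to-back ones.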
Re: Restricting users from certain privileges
> Hi:
>
> I could not figure out how to restrict users or other users from certain
> privileges to execute certain commands in FreeBSD/NanoBSD?
>
> What I meant is I want to create a NanoBSD image in which there will be an
> additional user, say 'admin'. I need to give this new user (admin) some
> privileges to run some root-can-only-execute commands, but not all (ACL
> similar to the firmwares in adsl modems from ISPs).
>
> I read Dru Lavigne's 'BSD Hacks' and Joseph Kong's 'Designing BSD
> Rootkits' besides FreeBSD handbook, but I simply could not figure out.
> Could anyone throw some light on this? Appreciate it!
>
> Thanks!
>
> /zenny

try sudo from ports, security/sudo

cheers,
danny
Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak
> Daniel Braniss wrote: > > > Security_Multipart(Fri_Apr_27_13_35_56_2012_748)-- > > > Content-Type: Text/Plain; charset=us-ascii > > > Content-Transfer-Encoding: 7bit > > > > > > Rick Macklem wrote > > > in > > > <1527622626.3418715.1335445225510.javamail.r...@erie.cs.uoguelph.ca>: > > > > > > rm> Steven Hartland wrote: > > > rm> > Original Message - > > > rm> > From: "Rick Macklem" > > > rm> > > At a glance, it looks to me like 8.x is affected. Note that > > > the > > > rm> > > bug only affects the new NFS server (the experimental one > > > for 8.x) > > > rm> > > when exporting ZFS volumes. (UFS exported volumes don't > > > leak) > > > rm> > > > > > rm> > > If you are running a server that might be affected, just: > > > rm> > > # vmstat -z | fgrep -i namei > > > rm> > > on the server and see if the 3rd number shown is increasing. > > > rm> > > > > rm> > Many thanks Rick wasnt aware we had anything experimental > > > enabled > > > rm> > but I think that would be a yes looking at these number:- > > > rm> > > > > rm> > vmstat -z | fgrep -i namei > > > rm> > NAMEI: 1024, 0, 1, 1483, 25285086096, 0 > > > rm> > vmstat -z | fgrep -i namei > > > rm> > NAMEI: 1024, 0, 0, 1484, 25285945725, 0 > > > rm> > > > > rm> ^ > > > rm> I don't think so, since the 3rd number (USED) is 0 here. > > > rm> If that # is increasing over time, you have the leak. You are > > > rm> probably running the old (default in 8.x) NFS server. > > > > > > Just a report, I confirmed it affected 8.x servers running newnfs. > > > > > > Actually I have been suffered from memory starvation symptom on > > > that > > > server (24GB RAM) for a long time and watching vmstat -z > > > periodically. It stopped working once a week. I investigated the > > > vmstat log again and found the amount of NAMEI leak was 11,543,956 > > > (about 11GB!) just before the locked-up. After applying the patch, > > > the leak disappeared. Thank you for fixing it! > > > > > > -- Hiroki > And thanks Hiroki for testing it on 8.x. 
> > > this is on 8.2-STABLE/amd64 from around August:
> > same here, this zfs+newnfs has been hanging every few months, and I
> > can see now the leak, it's slowly increasing:
> > NAMEI: 1024, 0, 122975, 529, 15417248, 0
> > NAMEI: 1024, 0, 122984, 520, 15421772, 0
> > NAMEI: 1024, 0, 123002, 502, 15424743, 0
> > NAMEI: 1024, 0, 123008, 496, 15425464, 0
> >
> > cheers,
> > danny
> Maybe you could try the patch, too.
>
> It's at:
>   http://people.freebsd.org/~rmacklem/namei-leak.patch
>
> I'll commit it to head soon with a 1 month MFC, so that hopefully
> Oliver will have a chance to try it on his production server before
> the MFC.
>
> Thanks everyone, for your help with this, rick

I haven't applied the patch yet, but in the meantime I have been running
some experiments on a zfs/nfs server running 8.3-STABLE, and I don't see
any leaks.
What triggers the leak?

thanks,
danny
Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak
> Daniel Braniss wrote:
> > > Daniel Braniss wrote:
> > > > > Rick Macklem wrote in
> > > > > <1527622626.3418715.1335445225510.javamail.r...@erie.cs.uoguelph.ca>:
> > > > >
> > > > > rm> Steven Hartland wrote:
> > > > > rm> > Original Message -
> > > > > rm> > From: "Rick Macklem"
> > > > > rm> > > At a glance, it looks to me like 8.x is affected. Note that the
> > > > > rm> > > bug only affects the new NFS server (the experimental one for 8.x)
> > > > > rm> > > when exporting ZFS volumes. (UFS exported volumes don't leak)
> > > > > rm> > >
> > > > > rm> > > If you are running a server that might be affected, just:
> > > > > rm> > > # vmstat -z | fgrep -i namei
> > > > > rm> > > on the server and see if the 3rd number shown is increasing.
> > > > > rm> >
> > > > > rm> > Many thanks Rick, wasn't aware we had anything experimental enabled
> > > > > rm> > but I think that would be a yes looking at these numbers:
> > > > > rm> >
> > > > > rm> > vmstat -z | fgrep -i namei
> > > > > rm> > NAMEI: 1024, 0, 1, 1483, 25285086096, 0
> > > > > rm> > vmstat -z | fgrep -i namei
> > > > > rm> > NAMEI: 1024, 0, 0, 1484, 25285945725, 0
> > > > > rm> >
> > > > > rm> I don't think so, since the 3rd number (USED) is 0 here.
> > > > > rm> If that # is increasing over time, you have the leak. You are
> > > > > rm> probably running the old (default in 8.x) NFS server.
> > > > >
> > > > > Just a report, I confirmed it affected 8.x servers running newnfs.
> > > > >
> > > > > Actually I have been suffering from memory starvation symptoms on that
> > > > > server (24GB RAM) for a long time and watching vmstat -z
> > > > > periodically. It stopped working once a week. I investigated the
> > > > > vmstat log again and found the amount of NAMEI leak was 11,543,956
> > > > > (about 11GB!) just before the lock-up. After applying the patch,
> > > > > the leak disappeared. Thank you for fixing it!
> > > > >
> > > > > -- Hiroki
> > > And thanks Hiroki for testing it on 8.x.
> > >
> > > > this is on 8.2-STABLE/amd64 from around August:
> > > > same here, this zfs+newnfs has been hanging every few months, and I
> > > > can see now the leak, it's slowly increasing:
> > > > NAMEI: 1024, 0, 122975, 529, 15417248, 0
> > > > NAMEI: 1024, 0, 122984, 520, 15421772, 0
> > > > NAMEI: 1024, 0, 123002, 502, 15424743, 0
> > > > NAMEI: 1024, 0, 123008, 496, 15425464, 0
> > > >
> > > > cheers,
> > > > danny
> > > Maybe you could try the patch, too.
> > >
> > > It's at:
> > > http://people.freebsd.org/~rmacklem/namei-leak.patch
> > >
> > > I'll commit it to head soon with a 1 month MFC, so that hopefully
> > > Oliver will have a chance to try it on his production server before
> > > the MFC.
> > >
> > > Thanks everyone, for your help with this, rick
> >
> > I haven't applied the patch yet, but in the meantime I have been running some
> > experiments on a zfs/nfs server running 8.3-STABLE, and don't see any leaks.
> > what triggers the leak?
> >
> Fortunately Oliver isolated this. It should leak when you do a successful
> "rm" or "rmdir" while running the new/experimental server.
>
but that's what I did, I'm running the new/experimental nfs server (or so I
think :-), and did a huge rm -rf and nothing, nada, no leak.
To check the patch, I have to upgrade the production server, the one with the
leak, but I wanted to test it on a non-production one first. Anyway, I'll
patch the kernel and try it on the leaking production server tomorrow.

danny
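To put a number on the leak discussed in this thread, the USED count from the NAMEI line can be multiplied by the item size. The following is only a sketch: the function name namei_used_bytes is made up for illustration, and it assumes the "NAME: SIZE, LIMIT, USED, FREE, REQUESTS, FAILURES" layout seen in the vmstat -z output quoted above.

```sh
#!/bin/sh
# Sketch (assumed helper, not part of FreeBSD): read `vmstat -z` output on
# stdin and print the number of bytes currently held by the NAMEI zone,
# computed as SIZE * USED (2nd and 4th fields when splitting on ':' and ',').
namei_used_bytes() {
    awk -F'[:,]' 'toupper($1) ~ /NAMEI/ { printf "%.0f\n", $2 * $4 }'
}

# Typical use on the NFS server (run it a few times and compare):
#   vmstat -z | namei_used_bytes
```

With the sample line "NAMEI: 1024, 0, 122975, 529, 15417248, 0" this gives roughly 120 MB in use; Hiroki's 11,543,956 items of 1024 bytes come to about 11 GB, which matches the figure in his report.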
Re: su problem
> Sami Halabi wrote:
> > Hi Oliver,
> > I saw you had similar problem for console on 2010
> > http://freebsd.1045724.n5.nabble.com/Serial-console-problems-with-stable-8-td3950684.html
>
> No, I don't think that the problem is related. My problem
> was with the serial console, while you don't have a serial
> console attached at all (at least you didn't mention it).
>
> > but the thread wasn't ended by recommendation or conclusions by you.
> >
> > did you solve that problem then?
>
> No, I came to the conclusion that the serial console support
> in FreeBSD 8 was broken somehow. So I removed the console
> cable; it's running with an old VGA CRT as the console for
> now. Fortunately I require console access very seldom, so
> I don't have to drive to that machine often. It's still
> annoying, but I didn't find a better solution; downgrading
> to 7.x isn't an option.
>
just for the record, serial on 8.x works fine! the device naming has changed
from sio to uart, and maybe some features. We use it on all our servers, even
redirecting it where possible via ILO, IPMI, DRAC, and it is great for debugging
or saving long trips :-)

WARNING: control access to these devices, especially since root can login
on the console!

danny
Re: su problem
> On 6/10/12 1:52 PM, Daniel Braniss wrote:
> >> Sami Halabi wrote:
> >> > Hi Oliver,
> >> > I saw you had similar problem for console on 2010
> >> > http://freebsd.1045724.n5.nabble.com/Serial-console-problems-with-stable-8-td3950684.html
> >>
> >> No, I don't think that the problem is related. My problem
> >> was with the serial console, while you don't have a serial
> >> console attached at all (at least you didn't mention it).
> >>
> >> > but the thread wasn't ended by recommendation or conclusions by you.
> >> >
> >> > did you solve that problem then?
> >>
> >> No, I came to the conclusion that the serial console support
> >> in FreeBSD 8 was broken somehow. So I removed the console
> >> cable; it's running with an old VGA CRT as the console for
> >> now. Fortunately I require console access very seldom, so
> >> I don't have to drive to that machine often. It's still
> >> annoying, but I didn't find a better solution; downgrading
> >> to 7.x isn't an option.
> >>
> > just for the record, serial on 8.x works fine! the device naming has changed
> > from sio to uart, and maybe some features. We use it on all our servers, even
> > redirecting it where possible via ILO, IPMI, DRAC, and it is great for debugging
> > or saving long trips :-)
> >
> > WARNING: control access to these devices, especially since root can login
> > on the console!
> >
> > danny
>
> Daniel, would you kindly elaborate on the DRAC console redirection thingy ?
>
> We're using Dells here and I loathe having to use their web interface
> and the java app to get a console shell.

you need the drac module - sometimes it's optional, but if you can access it
via the web you probably have it.
you will have to:
set the bios to allow serial over ethernet (I can't remember the details
offhand).
configure /boot/loader.conf:
    console="comconsole,vidconsole"
    comconsole_speed="38400"    -- the speed is whatever you set in the bios
configure /boot/device.hints:
    hint.uart.0.flags="0x10"    -- or .1. depending on the bios settings
install from ports sysutils/ipmitool
connect the ethernet port
and finally:
    ipmitool -A MD5 -H c -U root -I lanplus sol activate

danny
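The loader.conf and device.hints settings from the steps above can be generated for a given uart unit and speed. This is only a sketch: serial_console_conf is a made-up helper name, and the defaults (unit 0, 38400) simply mirror the values in the message; the real values depend on what the BIOS is set to.

```sh
#!/bin/sh
# Sketch (hypothetical helper): print the serial console settings described
# above. $1 = uart unit (default 0), $2 = console speed (default 38400).
# It only prints the lines; it does not edit any file.
serial_console_conf() {
    unit=${1:-0}
    speed=${2:-38400}
    echo "# /boot/loader.conf"
    echo "console=\"comconsole,vidconsole\""
    echo "comconsole_speed=\"${speed}\""
    echo "# /boot/device.hints"
    echo "hint.uart.${unit}.flags=\"0x10\""
}

# Example: settings for the second serial port at 115200
#   serial_console_conf 1 115200
```

Printing instead of editing keeps the sketch harmless; the lines still have to be merged by hand into the two files.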
Re: FreeBSD and IPMI how-to (was Re: su problem)
> Hi, all,
>
> On 15.06.2012 at 03:27, Matthew X. Economou wrote:
> > Daniel Braniss writes:
> >
> >> just for the record, serial on 8.x works fine! the device naming
> >> has changed from sio to uart, and maybe some features. We use it
> >> on all our servers, even redirecting it where possible via
> >> ILO, IPMI, DRAC, and it is great for debugging or saving long trips :-)
> >
> > Would some kind soul point me to a howto for configuring IPMI on
> > FreeBSD? I have a Dell PowerEdge 840 that supports IPMI, but I have
> > no idea how to set it up - either in the BIOS or in FreeBSD. I've
> > messed around with ipmitool a little, but I haven't gotten it to
> > work.
>
> Did you
>
>     kldload ipmi
> ?
>
> What's the output of
>
>     dmesg
>     kldstat
>
> after loading the module?
>
> With the module loaded, you should be able to get something like this:
>
> devel# ipmitool sensor
> Ambient     | 23.500 | degrees C | ok | na | 1.000 | 6.000 | 37.000 | 42.000 | na
> Systemboard | 32.000 | degrees C | ok | na | na    | na    | 60.000 | 65.000 | na
> CPU1        | 49.000 | degrees C | ok | na | na    | na    | 93.000 | 97.000 | na
> CPU2        | 48.000 | degrees C | ok | na | na    | na    | 93.000 | 97.000 | na
> ...
[...]

the ipmi kernel module allows interfacing/communicating with the 'local
system', which is nice, unless the kernel went bonkers.
You can - after some configuring(*) - connect from another host via something
like:
    ipmitool -A MD5 -H <host> -U root -I lanplus sol activate
and get the remote host console, or do a power cycle:
    ipmitool -A MD5 -H <host> -U root power cycle

danny
*: you need to configure/enable the bios/drac.
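The two remote commands in this message differ only in their tail, so they can be wrapped in a small dry-run helper. This is a sketch under assumptions: ipmi_remote_cmd and the action names "console" and "cycle" are made up here, the ipmitool flags are copied from the message as-is, and the function only prints the command so it can be checked before running it against a real BMC.

```sh
#!/bin/sh
# Sketch (hypothetical helper): print the ipmitool command line for a remote
# action. $1 = BMC/DRAC host, $2 = action (console | cycle),
# $3 = user (default root). Nothing is executed.
ipmi_remote_cmd() {
    host=$1
    action=$2
    user=${3:-root}
    case "$action" in
        console) echo "ipmitool -A MD5 -H $host -U $user -I lanplus sol activate" ;;
        cycle)   echo "ipmitool -A MD5 -H $host -U $user power cycle" ;;
        *)       echo "usage: ipmi_remote_cmd host console|cycle [user]" >&2
                 return 1 ;;
    esac
}

# Example: review the command, then run it by hand
#   ipmi_remote_cmd drac-host console
```

Given the warning earlier in the thread about root logins on the console, keeping the actual execution a manual step seems the safer default.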
Re: PORTS_MODULES
> Howdy,
>
> This is an FYI to let people know about a really nice feature for those
> that have ports installed which include kernel modules. You can place a
> list in /etc/src.conf like this:
>
> PORTS_MODULES= emulators/virtualbox-ose-kmod sysutils/fusefs-kmod x11/nvidia-driver
>
> which will cause those modules to be built and installed with all the
> proper matching stuff at the same time as buildkernel and installkernel.
>
> This feature has existed for a while, but has had "issues." Thanks to a
> team effort it's a lot more robust now, and ready for prime time (in
> HEAD, and the -STABLE branches for now, soon to be in 9.1-RELEASE).
>
> Enjoy,
>
> Doug

nice!
does it also work when cross-compiling?
i.e., using an amd64-freebsd-8.3 kernel to compile for i386-freebsd-8.2

thanks,
danny
nfs problems
Hi,
starting about last week, I'm getting:

rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: write failed on "/net/rnd/dist/tmp/local/amd64.FreeBSD_8.3-wip/compat/linux/usr/lib/locale/locale-archive.tmpl": Permission denied (13)
rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.9]
rsync: connection unexpectedly closed (21872 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]

the server is running 8.2, but the client is very up to date, 8.3-stable as of
this morning (local time).

after running rsync several times, it finally gets synced.

another item is that I'm using am-utils, but I don't see it causing the
problem.

I will try using tcp (instead of udp) soon.

any insights?

cheers,
danny
Re: nfs problems
> Hi,
> starting about last week, I'm getting:
>
> rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
> rsync: write failed on "/net/rnd/dist/tmp/local/amd64.FreeBSD_8.3-wip/compat/linux/usr/lib/locale/locale-archive.tmpl": Permission denied (13)
> rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.9]
> rsync: connection unexpectedly closed (21872 bytes received so far) [sender]
> rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]
>
> the server is running 8.2, but the client is very up to date, 8.3-stable as of
> this morning (local time).
>
> after running rsync several times, it finally gets synced.
>
> another item is that I'm using am-utils, but I don't see it causing the
> problem.
>
> I will try using tcp (instead of udp) soon.
>
> any insights?
>
> cheers,
> danny

the problem is most probably NFS/UDP related.

I took am-utils out of the equation.
mounted using TCP, and no problems
mounted using UDP:
Jun 29 12:38:14 pe-02 kernel: nfs server nrnfdn:sf/s ds isseterr:vve ernr o trrnn dd:r:e/s/pddoinisdsitt::n nngoo
Jun 29 12:38:14 pe-02 kernel: tt
Jun 29 12:38:14 pe-02 kernel:
Jun 29 12:38:14 pe-02 kernel: <<66>> rreessppoonnddiinngg
Jun 29 12:38:14 pe-02 kernel:
Jun 29 12:38:14 pe-02 kernel: nfs server rnd:/dist: not responding
Jun 29 12:38:14 pe-02 last message repeated 11 times
Jun 29 12:38:27 pe-02 kernel: nfs server rnd:/dist: is alive again

the above happens about every 15 seconds
(you have to learn to read in between the bytes :-)

cheers,
danny
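To repeat the TCP versus UDP comparison by hand, the same export can be mounted with an explicit transport option. This is a sketch, not what was actually typed here: nfs_mount_cmd is a made-up helper, /mnt is an assumed mount point, and the function only prints the mount command (using the tcp and udp options of mount_nfs) rather than running it.

```sh
#!/bin/sh
# Sketch (hypothetical helper): print an NFS mount command with an explicit
# transport. $1 = tcp | udp, $2 = server:/export, $3 = mount point.
nfs_mount_cmd() {
    proto=$1
    remote=$2
    mnt=$3
    case "$proto" in
        tcp|udp) echo "mount -t nfs -o $proto $remote $mnt" ;;
        *)       echo "unknown transport: $proto" >&2
                 return 1 ;;
    esac
}

# Example: compare the two transports against the same export
#   nfs_mount_cmd tcp rnd:/dist /mnt
#   nfs_mount_cmd udp rnd:/dist /mnt
```

Running the same rsync against each mount in turn, with am-utils out of the way, is the comparison this message describes.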
Re: nfs problems
> On 29/06/2012 10:45, Daniel Braniss wrote:
> >> Hi,
> >> starting about last week, I'm getting:
> >>
> >> rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
> >> rsync: write failed on "/net/rnd/dist/tmp/local/amd64.FreeBSD_8.3-wip/compat/linux/usr/lib/locale/locale-archive.tmpl": Permission denied (13)
> >> rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.9]
> >> rsync: connection unexpectedly closed (21872 bytes received so far) [sender]
> >> rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]
> >>
> >> the server is running 8.2, but the client is very up to date, 8.3-stable as of
> >> this morning (local time).
> >>
> >> after running rsync several times, it finally gets synced.
> >>
> >> another item is that I'm using am-utils, but I don't see it causing the
> >> problem.
> >>
> >> I will try using tcp (instead of udp) soon.
> >>
> >> any insights?
> >>
> >> cheers,
> >> danny
> > the problem is most probably NFS/UDP related.
> >
> > I took am-utils out of the equation.
> > mounted using TCP, and no problems
> > mounted using UDP:
> > Jun 29 12:38:14 pe-02 kernel: nfs server nrnfdn:sf/s ds isseterr:vve ernr o trrnn dd:r:e/s/pddoinisdsitt::n nngoo
> > Jun 29 12:38:14 pe-02 kernel: tt
> > Jun 29 12:38:14 pe-02 kernel:
> > Jun 29 12:38:14 pe-02 kernel: <<66>> rreessppoonnddiinngg
> > Jun 29 12:38:14 pe-02 kernel:
> > Jun 29 12:38:14 pe-02 kernel: nfs server rnd:/dist: not responding
> > Jun 29 12:38:14 pe-02 last message repeated 11 times
> > Jun 29 12:38:27 pe-02 kernel: nfs server rnd:/dist: is alive again
> >
> > the above happens about every 15 seconds
> > (you have to learn to read in between the bytes :-)
> >
> > cheers,
> > danny
> >
> It's also possible you are hitting a bug I came across recently.
> See http://lists.freebsd.org/pipermail/freebsd-current/2012-June/034860.html
> Basically, mountd may give incorrect permission denied errors when it is
> refreshing the exports list due to non-atomic operations.
> see
>
> kern/131342
> kern/136865

Hi Vince,
I thought so too. There used to be a bug caused by am-utils umounting,
succeeding even if the mount was active, then re-mounting, which caused
all kinds of problems; the workaround was to increase the timeout.
But I don't think it's the case here, unless mountd has a life of its own.
Furthermore, rsync works without a glitch when mounted nfs/tcp.

thanks,
danny
Re: [Stable 7] CPIO breakage/
> On 2010/06/15 17:05, Sean Bruno wrote:
> > On Tue, 2010-06-15 at 17:10 -0500, Sean Bruno wrote:
> >> http://svn.freebsd.org/viewvc/base?limit_changes=0&view=revision&revision=208361
> >>
> >> I'm not sure what's up with this update, but it hosed up the default
> >> behavior of cpio.
> >>
> >> It appears now that -o won't do the same things that it used to:
> >>
> >> + cd /
> >> + find -x .
> >> + egrep -v '^\.(/snap|/usr/sup|/boot/kernel/kernel
> >> \.[[:alpha:]_]+\.[[:digit:]]+|/boot/kernel/kernel
> >> \.old|/etc/start_if.*|/etc/ssh/ssh_host_.*key|/etc/hostid|/etc/(master.passwd|passwd|spwd.db|pwd.db))'
> >> + '[' -n '' ']'
> >> + '[' 7 = 4 ']'
> >> + '[' -n '' -a -z '' ']'
> >> + '[' -n /home/backup ']'
> >> + echo 'dumping / ...'
> >> dumping / ...
> >> + cpio -o --quiet --format crc -O /home/backup/root.amd64.cpio
> >> cpio: ./dev not dumped: minor number would be truncated
> >> cpio: Removing leading `/' from member names
> >> cpio: ./proc not dumped: minor number would be truncated
> >> cpio: Removing leading `../' from member names
> >>
> >> We've had to revert this change from our local tree, suggestions?
> >>
> >> Sean
> >
> > A little more background. It looks like symlinks are getting stripped
> > of their '/' which sucks. Ideas?
> >
> > Sean
> >
> > e.g. /home/foo/bar -> /opt/baz/blob
> >
> > becomes
> >
> > home/foo/bar -> opt/baz/blob
> >
> > Yuck.
>
> This is a security measure, I think.
>
> --absolute-filenames disables this behavior.

A similar 'security feature' was introduced some time ago, which 'silently'
broke the firefox installation: it refused to allow symlinks in the destination
directory. Of course the error was ignored by 'make install', so it took some
time to find out that nothing was installed - my /usr/local is symlinked.
The solution was to 'fix' cpio to behave as before, since adding the
ignore-symlinks feature to firefox's makefile was beyond me :-)

danny
Re: diskless boot, nfs server behind router
> On Mon, 28 Jun 2010, al...@ulgsm.ru wrote:
>
> > kernel built with:
> > options BOOTP          # Use BOOTP to obtain IP address/hostname
> > options BOOTP_NFSROOT  # NFS mount root file system using BOOTP info
> > options BOOTP_NFSV3
> >
> Try building a kernel without the above options, but with
> options NFS_ROOT
> specified. I think that's what most pxeboot users do and it was what
> I had assumed when I looked at the code.
>
> If that doesn't fix the problem...I haven't got a solution for you, rick

I use:
options BOOTP_NFSV3    # Use NFS v3 to NFS mount root

but the best advice I can give: run tcpdump/wireshark on the server,
it is very enlightening.

danny
Re: diskless boot, nfs server behind router
> On Mon, 28 Jun 2010, Daniel Braniss wrote:
>
> >> On Mon, 28 Jun 2010, al...@ulgsm.ru wrote:
> >>
> >>> kernel built with:
> >>> options BOOTP          # Use BOOTP to obtain IP address/hostname
> >>> options BOOTP_NFSROOT  # NFS mount root file system using BOOTP info
> >>> options BOOTP_NFSV3
> >>>
> >> Try building a kernel without the above options, but with
> >> options NFS_ROOT
> >> specified. I think that's what most pxeboot users do and it was what
> >> I had assumed when I looked at the code.
> >>
> >> If that doesn't fix the problem...I haven't got a solution for you, rick
> >
> > I use:
> > options BOOTP_NFSV3    # Use NFS v3 to NFS mount root
> >
>
> Here's the critical snippet of code:
> #if defined(BOOTP_NFSROOT) && defined(BOOTP)
>         bootpc_init();          /* use bootp to get nfs_diskless filled in */
> #elif defined(NFS_ROOT)
>         nfs_setup_diskless();
> #endif
>
> Just fyi, as you can see, unless you have BOOTP_NFSROOT and BOOTP options,
> it does things the NFS_ROOT way and basically ignores BOOTP_NFSV3.
> (At least that's the way it looks to me. I've been tricked by convoluted
> code before :-)

you are correct, I missed the NFS_ROOT which is defined in GENERIC,
and yes, convoluted is an understatement :-)

danny
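The effect of the #if/#elif quoted in this message can be checked quickly for a given set of kernel options. The following is only an illustration of that one conditional: diskless_path is a made-up helper, it takes option names as arguments, and it says nothing about the rest of the diskless boot code.

```sh
#!/bin/sh
# Sketch (hypothetical helper): given kernel option names as arguments, print
# which call the quoted conditional would compile in:
#   BOOTP and BOOTP_NFSROOT both set -> bootpc_init
#   otherwise NFS_ROOT set           -> nfs_setup_diskless
#   otherwise                        -> none
diskless_path() {
    bootp=0
    bootp_nfsroot=0
    nfs_root=0
    for opt in "$@"; do
        case "$opt" in
            BOOTP)         bootp=1 ;;
            BOOTP_NFSROOT) bootp_nfsroot=1 ;;
            NFS_ROOT)      nfs_root=1 ;;
        esac
    done
    if [ "$bootp" -eq 1 ] && [ "$bootp_nfsroot" -eq 1 ]; then
        echo "bootpc_init"
    elif [ "$nfs_root" -eq 1 ]; then
        echo "nfs_setup_diskless"
    else
        echo "none"
    fi
}

# Example: BOOTP_NFSV3 on top of GENERIC (which defines NFS_ROOT)
#   diskless_path BOOTP_NFSV3 NFS_ROOT
```

For the configuration discussed here (BOOTP_NFSV3 plus GENERIC's NFS_ROOT) this prints nfs_setup_diskless, which is the point made in the reply: BOOTP_NFSV3 alone does not select the BOOTP path.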
problems with 8.1-PRERELEASE
Hi,
I'm running a recent 8.1-PRERELEASE (Friday July 2nd), but I've seen this in
previous ones too: make buildworld -j will sometimes fail, or even panic.
when it fails it's usually some 'internal compiler error', or
'panic: page fault'.
I've seen the failures on different hardware, all running the amd64 version,
so I doubt it's hardware. Another common point: they are all multicore, both
intel and amd.
Anyone else seeing this?

danny
Re: net-booting the install disks (Re: 8.x grudges)
> On Thu, Jul 08, 2010 at 11:08:04AM -0400, Mikhail T. wrote:
> > 08.07.2010 09:53, Jeremy Chadwick wrote:
> > >Then don't modify loader.conf. Instead, once the "Welcome to FreeBSD!"
> > >portion of loader appears, press "6" to shell to the loader prompt
> > >and type:
> > >
> > >set vfs.root.mountfrom="ufs:/dev/md0"
> > >boot
> > Yes, that works... It just should not be necessary.
>
> Okay, so let me get this straight. First the complaint was that you had
> to modify loader.conf, which involved extracting the CD image, editing
> the file, yadda yadda. Now that you've been shown you don't have to
> edit loader.conf, the complaint is "it shouldn't be necessary". :-)
>
> There's actually quite a bit about FreeBSD that "shouldn't be necessary"
> (from an administrator's point of view), but that's a
> completely separate issue when compared to your "when I do thing X in
> the kernel config, it breaks". Which of those two approaches do you
> want to focus on?
>
> > Red Hat's "kickstart" does not require one to extract CD-images to
> > fiddle with a couple of lines, and FreeBSD comes tantalizingly close
> > to offer the same functionality. Just not quite :-(
>
> I've PXE booted Ubuntu and Debian. It was easy to accomplish (read:
> easier than FreeBSD) because they offer pxelinux vs. FreeBSD's pxeboot.
>
> pxelinux[1] offers the ability to read a configuration file via TFTP,
> which configures pxelinux itself. The configuration capabilities are
> very impressive[2]. FreeBSD folks interested in PXE should really take
> a look at this thing. I believe the configuration file is read and
> applied immediately, so things like serial port speed changes happen
> before pxelinux outputs anything (e.g. no need to rebuild pxelinux just
> to get a faster rate).
>
> That said, given that FreeBSD's pxeboot requires a bunch of extra work
> (rebuilding for faster serial speed, and a bunch of other stuff -- it's
> in my doc), I'm surprised you're not complaining about that. :-)
>
> The bottom line: the PXE booting framework in FreeBSD could be improved.

It has been improved, though not the documentation :-(
you can configure most of the stuff via DHCP, take a look at
src/lib/libstand/bootp.c

example lines from dhcpd.conf:
    option FBSD.ind0 "hint.uart.0.flags=0x10"
    option FBSD.ind1 "kern.ipc.semmni=256"
    option FBSD.ind2 "kern.ipc.semmns=2048"

and with this code in rc.initdiskless:

    # mount the shared configuration directory, if conf-path names host:path
    confpath=`kenv conf-path`
    if [ -n "$confpath" ] ; then
        if [ "`expr $confpath : '\(.*\):'`" ] ; then
            echo Mounting $confpath on /conf
            mount_nfs $confpath /conf
            chkerr $? "mount_nfs $confpath /conf"
            to_umount="${to_umount} $confpath"
        fi
    fi

    # import the rc.* kernel environment variables as shell variables
    eval `kenv | sed -n 's/^rc\.//p'`
    rm -f /etc/rc.conf /etc/rc.conf.local

    # assemble /etc/rc.conf from the listed fragments
    for fc in $conf0 $conf1 $conf2 $conf3 $conf4 $conf5 $conf6 $conf7 $conf8 $conf9 rc.conf.$hostname
    do
        ho=`expr $fc : '\(.*\):'`
        fl=`expr $fc : '.*/\(.*\)'`
        if [ "${ho}" != "" ]; then
            # host:/path/file form: mount the directory and copy the file
            mp=`expr $fc : '\(.*\)/.*'`
            mount_nfs $mp /mnt > /dev/null 2>&1
            if [ -f /mnt/$fl ]; then
                echo "# from $fc /mnt/$fl" >> /etc/rc.conf
                cat /mnt/$fl >> /etc/rc.conf
            fi
            umount /mnt > /dev/null 2>&1
        elif [ -e /conf/$fc ] ; then
            # plain name: take it from the shared /conf directory
            echo "# from /conf/$fc" >> /etc/rc.conf
            cat /conf/$fc >> /etc/rc.conf
        fi
    done

and these lines in dhcpd.conf
    option FBSD.conf-path="fr-01:/vol/system/share/conf"
    option FBSD.rc-conf3 "rc.ws8"
    ...
will generate a 'personalized' rc.conf

danny
PS: this is not the first time I have posted this.
[...]
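The three expr patterns in the loop above decide whether a fragment is a plain file name under /conf or a full host:/path/file entry. A small standalone sketch shows what each pattern extracts; split_conf_entry is a made-up name, and the combined entry in the example is hypothetical (built from the conf-path and rc.ws8 values in the message), not something taken from the real setup.

```sh
#!/bin/sh
# Sketch (hypothetical helper): split one rc.conf fragment entry using the
# same expr patterns as the rc.initdiskless loop, and print the pieces.
#   host  = everything before the last ':'   (empty for a plain name)
#   mount = everything before the last '/'   (what mount_nfs would be given)
#   file  = everything after the last '/'
# expr exits non-zero when nothing matches, hence the "|| true".
split_conf_entry() {
    fc=$1
    ho=$(expr "$fc" : '\(.*\):' || true)
    fl=$(expr "$fc" : '.*/\(.*\)' || true)
    mp=$(expr "$fc" : '\(.*\)/.*' || true)
    echo "host=$ho mount=$mp file=$fl"
}

# Examples:
#   split_conf_entry fr-01:/vol/system/share/conf/rc.ws8
#   split_conf_entry rc.ws8
```

For a plain name such as rc.ws8 all three pieces come back empty, which is why the loop falls through to the /conf/$fc branch in that case.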
latest 8.1 hangs on xpt_config
It seems that the latest changes (last 7 days) introduced this problem:
...
run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config
...

I'll try to hunt this down, but any help is welcome.

danny
Re: latest 8.1 hangs on xpt_config
> On Fri, Jul 23, 2010 at 09:35:55AM +0300, Daniel Braniss wrote:
> > It seems that the latest changes (last 7 days) introduced this problem:
> > ...
> > run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
> > run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
> > run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
> > run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config
> > ...
> >
> > i'll try to hunt this down, but any help is welcome.
>
> Recent to semi-recent commits relevant to xpt that I can find. The
> problem might not even be in xpt though. Which xpt piece pertains to
> you probably depends on your system setup/configuration. Dates/times
> are in PDT/UTC-0700:
>
> -rw-r--r--  1 root  wheel    6037  1 Mar 22:48 /usr/src/sys/cam/cam_xpt_internal.h
> -rw-r--r--  1 root  wheel  124773  9 May 10:19 /usr/src/sys/cam/cam_xpt.c
> -rw-r--r--  1 root  wheel   72556 23 May 10:41 /usr/src/sys/cam/scsi/scsi_xpt.c
> -rw-r--r--  1 root  wheel   56663 19 Jul 05:28 /usr/src/sys/cam/ata/ata_xpt.c
>
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/cam_xpt_internal.h
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/cam_xpt.c
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/scsi/scsi_xpt.c
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/ata/ata_xpt.c

thanks Jeremy, I'll try and make some sense of the changes.

here is some more info:
there are one disk and one dvd connected via SATA

CPU: Intel(R) Core(TM) i5 CPU 660 @ 3.33GHz (3325.02-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x20652  Family = 6  Model = 25  Stepping = 2
  Features=0xbfebfbff
  Features2=0x298e3ff
  AMD Features=0x28100800
  AMD Features2=0x1
...
atapci0: port 0xf0f0-0xf0f7,0xf0e0-0xf0e3,0xf0d0-0xf0d7,0xf0c0-0xf0c3,0xf0b0-0xf0bf irq 18 at device 22.2 on pci0
atapci0: Reserved 0x10 bytes for rid 0x20 type 4 at 0xf0b0
ioapic0: routing intpin 18 (PCI IRQ 18) to lapic 0 vector 49
atapci0: [MPSAFE]
atapci0: [ITHREAD]
ata2: on atapci0
atapci0: Reserved 0x8 bytes for rid 0x10 type 4 at 0xf0f0
atapci0: Reserved 0x4 bytes for rid 0x14 type 4 at 0xf0e0
ata2: reset tp1 mask=03 ostat0=7f ostat1=7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat1=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: reset tp2 stat0=ff stat1=ff devices=0x0
ata2: [MPSAFE]
ata2: [ITHREAD]
ata3: on atapci0
atapci0: Reserved 0x8 bytes for rid 0x18 type 4 at 0xf0d0
atapci0: Reserved 0x4 bytes for rid 0x1c type 4 at 0xf0c0
ata3: reset tp1 mask=03 ostat0=7f ostat1=7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat1=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: reset tp2 stat0=ff stat1=ff devices=0x0
ata3: [MPSAFE]
ata3: [ITHREAD]
...
ahci0: port 0xf090-0xf097,0xf080-0xf083,0xf070-0xf077,0xf060-0xf063,0xf020-0xf03f mem 0xfe425000-0xfe4257ff irq 19 at device 31.2 on pci0
ahci0: Reserved 0x800 bytes for rid 0x24 type 3 at 0xfe425000
ahci0: attempting to allocate 1 MSI vectors (1 supported)
msi: routing MSI IRQ 257 to local APIC 0 vector 53
ahci0: using IRQ 257 for MSI
ahci0: [MPSAFE]
ahci0: [ITHREAD]
ahci0: AHCI v1.30 with 6 3Gbps ports, Port Multiplier not supported
ahci0: Caps: 64bit NCQ SNTF MPS ALP AL CLO 3Gbps PMD SSC PSC 32cmd EM eSATA 6ports
ahci0: Caps2: APST
ahci0: EM Caps: ALHD XMT SMB LED
ahcich0: at channel 0 on ahci0
ahcich0: [MPSAFE]
ahcich0: [ITHREAD]
ahcich0: Caps:
ahcich1: at channel 1 on ahci0
ahcich1: [MPSAFE]
ahcich1: [ITHREAD]
ahcich1: Caps:
ahcich2: at channel 4 on ahci0
ahcich2: [MPSAFE]
ahcich2: [ITHREAD]
ahcich2: Caps: HPCP ESP
...
ata2: Identifying devices:
ata2: New devices:
ata3: Identifying devices:
ata3: New
WITNESS is the culprit was Re: latest 8.1 hangs on xpt_config
> Daniel Braniss wrote:
> >> On Fri, Jul 23, 2010 at 09:35:55AM +0300, Daniel Braniss wrote:
> >>> It seems that the latest changes (last 7 days) introduced this problem:
> >>> ...
> >>> run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
> >>> run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
> >>> run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
> >>> run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config
> >>> ...
> >>>
> >>> i'll try to hunt this down, but any help is welcome.
> >> Recent to semi-recent commits relevant to xpt that I can find. The
> >> problem might not even be in xpt though. Which xpt piece pertains to
> >> you probably depends on your system setup/configuration. Dates/times
> >> are in PDT/UTC-0700:
> >>
> >> -rw-r--r--  1 root  wheel    6037  1 Mar 22:48 /usr/src/sys/cam/cam_xpt_internal.h
> >> -rw-r--r--  1 root  wheel  124773  9 May 10:19 /usr/src/sys/cam/cam_xpt.c
> >> -rw-r--r--  1 root  wheel   72556 23 May 10:41 /usr/src/sys/cam/scsi/scsi_xpt.c
> >> -rw-r--r--  1 root  wheel   56663 19 Jul 05:28 /usr/src/sys/cam/ata/ata_xpt.c
> >>
> >> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/cam_xpt_internal.h
> >> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/cam_xpt.c
> >> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/scsi/scsi_xpt.c
> >> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/ata/ata_xpt.c
> >
> > thanks Jeremy, i'll try and make some sense of the changes.
> >
> > here is some more info:
> > there are one disk and one dvd connected via SATA
>
> I recently had a report about a similar problem with a "PIONEER DVD-RW DVR-215
> 1.19" drive. Don't you have the same device or another Pioneer?
>
> In that case this patch:
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/ata/ata_xpt.c.diff?r1=1.3.2.29;r2=1.3.2.30;f=h
> allowed the system to boot, though the problem seemed to be hardware. Try this
> patch, it at least may give additional info about the problem.

That was my first guess, so I detached the DVD, but the problem persisted.
the device is:
<ATAPI DVD A DH16AAS JL34> Removable CD-ROM SCSI-0 device

anyway, I compiled a kernel without WITNESS, and it now works ok!

thanks all,
danny
Re: WITNESS is the culprit was Re: latest 8.1 hangs on xpt_config
> That's hardly a solution or reason.

of course it's not, just that the last successful boot had WITNESS configured,
and with the latest patches it hangs; compiling without WITNESS allowed
the boot to proceed.

> Still please try the patch (or fresh
> 8-STABLE with it), may be it tell more.
>
according to my logs (i do some hula-hoops sync'ing via svn/hg, which makes
it a bit of a problem following versions :-) your patch seems to be already
in my kernel:

hg log ata_xpt.c -p
changeset:   2939:440362ab79cb
branch:      8
tag:         tip
parent:      2938:fc1c9d5f4b38
parent:      2937:846cb2242d34
user:        da...@cs.huji.ac.il
date:        Fri Jul 23 08:41:24 2010 +0300
summary:     -- merge from head --

diff -r fc1c9d5f4b38 -r 440362ab79cb sys/cam/ata/ata_xpt.c
--- a/sys/cam/ata/ata_xpt.c	Fri Jul 23 08:40:46 2010 +0300
+++ b/sys/cam/ata/ata_xpt.c	Fri Jul 23 08:41:24 2010 +0300
@@ -134,6 +134,7 @@
 	uint32_t	pm_prv;
 	int		restart;
 	int		spinup;
+	int		faults;
 	u_int		caps;
 	struct cam_periph *periph;
 } probe_softc;
@@ -738,14 +739,28 @@
 	ident_buf = &path->device->ident_data;
 
 	if ((done_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) {
-device_fail:	if ((!softc->restart) &&
-		    cam_periph_error(done_ccb, 0, 0, NULL) == ERESTART) {
+		if (softc->restart) {
+			if (bootverbose) {
+				cam_error_print(done_ccb,
+				    CAM_ESF_ALL, CAM_EPF_ALL);
+			}
+		} else if (cam_periph_error(done_ccb, 0, 0, NULL) == ERESTART)
 			return;
-		} else if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) {
+		if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) {
 			/* Don't wedge the queue */
 			xpt_release_devq(done_ccb->ccb_h.path, /*count*/1,
 					 /*run_queue*/TRUE);
 		}
+		if (softc->restart) {
+			softc->faults++;
+			if ((done_ccb->ccb_h.status & CAM_STATUS_MASK) ==
+			    CAM_CMD_TIMEOUT)
+				softc->faults += 4;
+			if (softc->faults < 10)
+				goto done;
+			else
+				softc->restart = 0;
+		} else
 		/* Old PIO2 devices may not support mode setting. */
 		if (softc->action == PROBE_SETMODE &&
 		    ata_max_pmode(ident_buf) <= ATA_PIO2 &&
@@ -761,7 +776,7 @@
 		 * already marked unconfigured, notify the peripheral
 		 * drivers that this device is no more.
 		 */
-		if ((path->device->flags & CAM_DEV_UNCONFIGURED) == 0)
+device_fail:	if ((path->device->flags & CAM_DEV_UNCONFIGURED) == 0)
 			xpt_async(AC_LOST_DEVICE, path, NULL);
 		found = 0;
 		goto done;
@@ -1209,6 +1224,12 @@
 	    !(work_ccb->cpi.hba_misc & PIM_NOBUSRESET) &&
 	    !timevalisset(&request_ccb->ccb_h.path->bus->last_reset)) {
 		reset_ccb = xpt_alloc_ccb_nowait();
+		if (reset_ccb == NULL) {
+			request_ccb->ccb_h.status = CAM_RESRC_UNAVAIL;
+			xpt_free_ccb(work_ccb);
+			xpt_done(request_ccb);
+			return;
+		}
 		xpt_setup_ccb(&reset_ccb->ccb_h, request_ccb->
 		    ccb_h.path, CAM_PRIORITY_NONE);
 		reset_ccb->ccb_h.func_code = XPT_RESET_BUS;
@@ -1228,6 +1249,7 @@
 	    malloc(sizeof(ata_scan_bus_info), M_CAMXPT, M_NOWAIT);
 	if (scan_info == NULL) {
 		request_ccb->ccb_h.status = CAM_RESRC_UNAVAIL;
+		xpt_free_ccb(work_ccb);
 		xpt_done(request_ccb);
 		return;
 	}

cheers,
	danny
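[editor's sketch] The retry accounting that the diff above adds to the probe
path can be modelled outside the kernel. This is only an illustration under
stated assumptions: struct probe_state, probe_note_failure() and
failures_until_giveup() are hypothetical names that do not exist in
ata_xpt.c, and the surrounding CAM logic is left out. Only the arithmetic is
taken from the quoted patch: each failure adds 1 to the score, a timeout adds
4 more, and retrying stops once the score reaches 10.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two probe_softc fields the patch touches. */
struct probe_state {
	int restart;	/* non-zero while the probe is in retry mode */
	int faults;	/* accumulated fault score */
};

/*
 * Account for one failed probe command, following the arithmetic in the
 * quoted diff.  Returns true while the probe may carry on (the diff's
 * "goto done"), false once retry mode has been switched off.
 */
bool
probe_note_failure(struct probe_state *s, bool timed_out)
{
	if (!s->restart)
		return (false);
	s->faults++;
	if (timed_out)
		s->faults += 4;
	if (s->faults < 10)
		return (true);
	s->restart = 0;
	return (false);
}

/* Count how many consecutive failures it takes to leave retry mode. */
int
failures_until_giveup(bool timed_out)
{
	struct probe_state s = { .restart = 1, .faults = 0 };
	int n = 0;

	do {
		n++;
	} while (probe_note_failure(&s, timed_out));
	return (n);
}

/* Small self-check of the model. */
void
probe_selfcheck(void)
{
	assert(failures_until_giveup(true) == 2);	/* 5, then 10 */
	assert(failures_until_giveup(false) == 10);	/* 1, 2, ... 10 */
}
```

Under this model a device that only ever times out is dropped from retry
mode on its second failure, while one that fails quickly gets ten attempts.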
Re: Diskless/readonly root booting issues
> Hi all,
>
> I've been working on updating my semi-embedded images to
> 7.3-stable of late (I generally wait for .3+ releases), it's been a
> few years since the last time I did one of these and I'm having some
> issues getting my netboot test environment to behave itself.
>
> I'm sure it's something simple but I've spent quite a bit of time
> looking for answers and poking the system but no joy yet.
>
> Basically I use a PXE booted NFS root to test my reduced footprint
> image builds, the boot is working but init is attempting to remount /
> rw (in spite of it being marked ro in fstab) which of course fails
> because the directory is exported ro from the NFS server at which
> point the system dumps me to single user mode;
>
> === OUTPUT ===
>
> Starting file system checks:
> udp: Netconfig database not found
> Mounting root filesystem rw failed, startup aborted
> ERROR: ABORTING BOOT (sending SIGTERM to parent)!
> Sep 30 09:60:02 init: /bin/sh on /etc/rc terminated abnormally, going
> to single user mode
> Enter full pathname of shell or RETURN for /bin/sh:
>
>
> Relevant configs from the diskless root
>
> == rc.conf ==
>
> ifconfig_le0="DHCP"
>
> diskless_mount=/etc/rc.initdiskless
>
> varsize=8192
> varmfs="YES"
>
> tmpsize=8192
> tmpmfs="YES"
>
> nfs_client_enable="YES"
>
> dumpdev="NO"
>
> =
>
> rc.initdiskless is the version from /usr/share/examples/rc.initdiskless
>
> == fstab ==
>
> 192.168.2.2:/usr/fbtest / nfs ro 0 0
> proc /proc procfs rw 0 0
>
>
> == loader.conf ==
>
> verbose_loading="YES"
>
> autoboot_delay="2"
>
>
> Kernel is (obviously) built with NFS_ROOT and NFSCLIENT, relatively
> minimalist otherwise, have also tested with GENERIC, same result.
>
> I must be forgetting something simple in all of this, I don't recall
> it being terribly difficult to get this stuff working when I was doing
> my original work with 6.3, though I don't recall the use of the
> initdiskless script, IIRC I was using rc.diskless2 which (again IIRC)
> was later replaced by /etc/rc.d/diskless but I've not been able to
> find this script anywhere.
>
> Any suggestions would be greatly appreciated at this point.
>
> Thanks,
>
> Morgan Reed

firstly, you should be using the latest pxeboot: it passes the root
file-handle to the kernel, so there is no need to remount it; remove the
root (/) line from the fstab.

secondly, try using /etc/rc.initdiskless - which is the default.

use the KISS method :-)

danny
boot0cfg problems
In a not so distant past, boot0cfg -sn ... used to work; then it only
partially worked: it would modify the data in boot but not the mbr, for
which 'gpart -s set active -in ...' modified the mbr. Now

# boot0cfg -s1 -v /dev/mfid0
boot0cfg: write_mbr: /dev/mfid0: Operation not permitted

but:

# boot0cfg -v /dev/mfid0
#   flag     start chs   type       end chs       offset         size
1   0x80      0:  1: 1   0xa5   1023:212:63           63     41943006
2   0x00   1023:255:63   0xa5   1023:169:63     41943069     41943006
3   0x00   1023:255:63   0xa5   1023:126:63     83886075     41943006
4   0x00   1023:255:63   0xa5   1023:201:63    125829081   1046478825

version=2.0  drive=0x80  mask=0x3  ticks=182  bell=# (0x23)
options=packet,update,nosetdrv
volume serial ID 9090-9090
default_selection=F2 (Slice 2)
Re: boot0cfg problems
> On Fri, Oct 01, 2010 at 09:26:41AM +0200, Daniel Braniss wrote:
> > In a not so distant past, boot0cfg -sn ... used to work, then it only
> > partialy worked, it would modify the data in boot but not the mbr, for
> > which 'gpart -s set active -in ...' modified the mbr. Now
> > # boot0cfg -s1 -v /dev/mfid0
> > boot0cfg: write_mbr: /dev/mfid0: Operation not permitted
> > but:
> > # boot0cfg -v /dev/mfid0
> > #   flag     start chs   type       end chs       offset         size
> > 1   0x80      0:  1: 1   0xa5   1023:212:63           63     41943006
> > 2   0x00   1023:255:63   0xa5   1023:169:63     41943069     41943006
> > 3   0x00   1023:255:63   0xa5   1023:126:63     83886075     41943006
> > 4   0x00   1023:255:63   0xa5   1023:201:63    125829081   1046478825
> >
> > version=2.0  drive=0x80  mask=0x3  ticks=182  bell=# (0x23)
> > options=packet,update,nosetdrv
> > volume serial ID 9090-9090
> > default_selection=F2 (Slice 2)
>
> Can you try doing "sysctl kern.geom.debugflags=16" first?
>
this is not really foot-shooting :-), but
	- the error msg is gone,
	- the slice info is updated,
	- but the active bit in the mbr is not! - some bioses rely on it.

looking at the changes done to boot0cfg.c, there is now an err(...) call which
does an exit before the boot is updated. I changed it to a warn(...) and the
old behaviour is back.

BTW,
	a- the gpart command should have been: gpart set -a active -i n ...
	b- this works with kern.geom.debugflags=0.

thanks,
	danny
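[editor's sketch] The "active bit" discussed above is the flag byte of a
slice entry in the MBR partition table (the "flag" column, 0x80, in the
boot0cfg -v listing). The following sketch works on an in-memory copy of
sector 0 only. It assumes the classic PC MBR layout (four 16-byte entries
starting at byte 446, signature bytes 0x55 0xAA at offsets 510 and 511); the
function names are hypothetical, and this is not how boot0cfg or gpart are
implemented.

```c
#include <assert.h>
#include <stdint.h>

#define MBR_SIZE	512
#define MBR_PART_OFF	446	/* first of the four partition entries */
#define MBR_PART_LEN	16	/* bytes per partition entry */
#define MBR_NPART	4
#define MBR_ACTIVE	0x80	/* flag byte value of the active slice */

/*
 * Return the 1-based number of the first slice flagged active,
 * 0 if no slice is flagged, or -1 if the boot signature is missing.
 */
int
mbr_active_slice(const uint8_t mbr[MBR_SIZE])
{
	if (mbr[510] != 0x55 || mbr[511] != 0xAA)
		return (-1);
	for (int i = 0; i < MBR_NPART; i++)
		if (mbr[MBR_PART_OFF + i * MBR_PART_LEN] & MBR_ACTIVE)
			return (i + 1);
	return (0);
}

/*
 * Flag slice n (1..4) as active and clear the flag on the others.
 * Only the in-memory buffer is changed.  Returns 0, or -1 for a bad n.
 */
int
mbr_set_active(uint8_t mbr[MBR_SIZE], int n)
{
	if (n < 1 || n > MBR_NPART)
		return (-1);
	for (int i = 0; i < MBR_NPART; i++)
		mbr[MBR_PART_OFF + i * MBR_PART_LEN] =
		    (i == n - 1) ? MBR_ACTIVE : 0x00;
	return (0);
}

/* Small self-check on an otherwise empty sector. */
void
mbr_selfcheck(void)
{
	uint8_t buf[MBR_SIZE] = { 0 };

	buf[510] = 0x55;
	buf[511] = 0xAA;
	assert(mbr_active_slice(buf) == 0);
	assert(mbr_set_active(buf, 2) == 0);
	assert(mbr_active_slice(buf) == 2);
}
```

This flag is separate from the default selection that boot0 keeps for
itself, which is why the thread treats "slice info updated" and "active bit
updated" as two different outcomes.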
Re: boot0cfg problems
> On Fri, Oct 01, 2010 at 01:20:42PM +0200, Daniel Braniss wrote:
> > > On Fri, Oct 01, 2010 at 09:26:41AM +0200, Daniel Braniss wrote:
> > > > In a not so distant past, boot0cfg -sn ... used to work, then it only
> > > > partialy worked, it would modify the data in boot but not the mbr, for
> > > > which 'gpart -s set active -in ...' modified the mbr. Now
> > > > # boot0cfg -s1 -v /dev/mfid0
> > > > boot0cfg: write_mbr: /dev/mfid0: Operation not permitted
> > > > but:
> > > > # boot0cfg -v /dev/mfid0
> > > > #   flag     start chs   type       end chs       offset         size
> > > > 1   0x80      0:  1: 1   0xa5   1023:212:63           63     41943006
> > > > 2   0x00   1023:255:63   0xa5   1023:169:63     41943069     41943006
> > > > 3   0x00   1023:255:63   0xa5   1023:126:63     83886075     41943006
> > > > 4   0x00   1023:255:63   0xa5   1023:201:63    125829081   1046478825
> > > >
> > > > version=2.0  drive=0x80  mask=0x3  ticks=182  bell=# (0x23)
> > > > options=packet,update,nosetdrv
> > > > volume serial ID 9090-9090
> > > > default_selection=F2 (Slice 2)
> > >
> > > Can you try doing "sysctl kern.geom.debugflags=16" first?
> > >
> > this is not realy foot-shooting :-), but
> > - the error msg is gone,
> > - the slice info is updated,
> > - but the active bit in the mbr is not! - some bioses rely on it.
> > looking at changes done to boot0cfg.c there is now an err(...) call which
> > does an exit, before the boot is updated. I changed it to a warn(...) and
> > the old behaviour is back.
> > BTW,
> > a- gpart command should have been: gpart set -a active -i n ...
> > b- this works with kern.geom.debugflags=0.
>
> Bit 4 (hence 0x10, or 16 decimal) in kern.geom.debugflags is described
> as:
>
> 0x10 (allow foot shooting)
>      Allow writing to Rank 1 providers. This would, for example,
>      allow the super-user to overwrite the MBR on the root disk or
>      write random sectors elsewhere to a mounted disk. The
>      implications are obvious.
>
> I read this as: "you can't modify the MBR of a root disk unless bit 4 of
> this sysctl is set". Sector 0 holds the MBR, and boot0cfg modifies the
> MBR. So can you explain what you mean by "this really isn't
> foot-shooting?" I mean, even the NOTE section of the boot0cfg(8) man
> page documents what I'm trying to say.
>
> Anyway, if the MBR did get updated without kern.geom.debugflags having
> bit 4 set, then wouldn't this indicate there's a bug in GEOM's "sector
> 0" protection?

but the mbr did NOT get updated by boot0cfg. gpart does however succeed, but
gpart knows nothing about the other bits boot0cfg knows, like which slice to
boot from (not to be confused with the current active slice), what bell to
ring, etc. these are (or used to be) updated before the last change.

anyways, as you correctly pointed out, the problem is in GEOM, being
somewhat over protective :-)
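[editor's sketch] The "bit 4, hence 0x10, or 16 decimal" arithmetic from the
quoted text can be written down as a plain bit test. The macro and function
names are hypothetical; only the value 0x10 and its meaning come from the
description quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical name for the flag described as "0x10 (allow foot shooting)". */
#define DEBUGFLAG_FOOTSHOOT	(1u << 4)	/* 0x10, 16 decimal */

/* True if a kern.geom.debugflags value has the foot-shooting bit set. */
bool
geom_foot_shooting_allowed(unsigned int debugflags)
{
	return ((debugflags & DEBUGFLAG_FOOTSHOOT) != 0);
}

/* Small self-check of the bit arithmetic. */
void
debugflags_selfcheck(void)
{
	assert(DEBUGFLAG_FOOTSHOOT == 16);
	assert(geom_foot_shooting_allowed(16));
	assert(!geom_foot_shooting_allowed(0));
}
```

On a FreeBSD system the value would normally be read with
"sysctl kern.geom.debugflags"; the sketch deliberately takes it as a
parameter so that it does not depend on the platform.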
latest -stable: still waiting after ...
hi,
with the latest -stable, the boot process gets stuck with
...
ugen2.2: at usbus2
uhub6: on usbus2
uhub6: 3 ports with 3 removable, self powered
ugen3.2: at usbus3
ukbd0: on usbus3
kbd2 at ukbd0
ums0: on usbus3
ums0: 3 buttons and [Z] coordinates ID=0     <- stuck here
run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
SMP: AP CPU #1 Launched!
SMP: AP CPU #7 Launched!
SMP: AP CPU #3 Launched!

this does not happen with an older -stable (August) kernel

Cheers,
	danny
Re: NFS deadlock (unkillable nfsd and no mounts work)
> on 05/11/2010 23:27 Kostik Belousov said the following:
> > I agree that the fix a right fix for real issue. It should only
> > affect the filesystems that do support VFS_VGET(). In other words,
> > it is relevant for e.g. UFS exports, but not for ZFS, that is the
> > Andrey case.
>
> Actually ZFS does implement vfs_vget, but with a special quirk for .zfs/ and
> stuff under it:
>
> static int
> zfs_vget(vfs_t *vfsp, ino_t ino, int flags, vnode_t **vpp)
> {
>         zfsvfs_t        *zfsvfs = vfsp->vfs_data;
>         znode_t         *zp;
>         int             err;
>
>         /*
>          * zfs_zget() can't operate on virtual entires like .zfs/ or
                                                 entries
                                                 === ==
>          * .zfs/snapshot/ directories, that's why we return EOPNOTSUPP.
>          * This will make NFS to switch to LOOKUP instead of using VGET.
>          */
>         if (ino == ZFSCTL_INO_ROOT || ino == ZFSCTL_INO_SNAPDIR)
>                 return (EOPNOTSUPP);
> ...
> ...
>
>
> --
> Andriy Gapon
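[editor's sketch] The quoted comment describes a fallback: when vget answers
EOPNOTSUPP for one of the virtual control inodes, the caller resolves the
object by name lookup instead. A toy model of that decision follows, with
hypothetical names and inode values throughout; it is not the ZFS or NFS
server code, and only the "EOPNOTSUPP means fall back to LOOKUP" rule is
taken from the quoted source.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical inode numbers standing in for ZFSCTL_INO_ROOT and
 * ZFSCTL_INO_SNAPDIR; the real values are not given in the thread. */
enum { CTL_INO_ROOT = 1, CTL_INO_SNAPDIR = 2 };

enum resolve_path { RESOLVE_BY_VGET, RESOLVE_BY_LOOKUP };

/* Toy vget: refuses the virtual control-directory inodes, as quoted. */
int
toy_vget(unsigned long ino)
{
	if (ino == CTL_INO_ROOT || ino == CTL_INO_SNAPDIR)
		return (EOPNOTSUPP);
	return (0);
}

/* Caller-side policy: fall back to a name lookup if vget is unsupported. */
enum resolve_path
choose_resolve_path(unsigned long ino)
{
	if (toy_vget(ino) == EOPNOTSUPP)
		return (RESOLVE_BY_LOOKUP);
	return (RESOLVE_BY_VGET);
}

/* Small self-check of the policy. */
void
vget_selfcheck(void)
{
	assert(choose_resolve_path(CTL_INO_SNAPDIR) == RESOLVE_BY_LOOKUP);
	assert(choose_resolve_path(42) == RESOLVE_BY_VGET);
}
```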
panic on boot
the hardware is Sun Fire X2200 M2, and it's diskless, PXE booted.

this seems to have started sometime before 8.2, and it
'sometimes happens':

FreeBSD 8.2-PRERELEASE #15 r4274: Wed Dec 22 09:11:27 IST 2010c40, rbp = 0x80ef5c60 ---
    da...@rnd:/home/obj/rnd/r+d/stable/8/sys/HUJI amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Dual-Core AMD Opteron(tm) Processor 2218 (2613.40-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0x40f13  Family = f  Model = 41  Stepping = 3
  Features=0x178bfbff
  Features2=0x2001
  AMD Features=0xea500800
  AMD Features2=0x1f
...
SMP: AP CPU #3 Launched!
(cd0:ata0:0:0:0): SCSI status: Check Condition
cpu3 AP:
(cd0:ata0:0:0:0): SCSI sense: NOT READY asc:3a,0 (Medium not present)
ID: 0x0300 VER: 0x80050010 LDR: 0x DFR: 0x
(cd0: lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
ata0:0: timer: 0x000200ef therm: 0x0001 err: 0x00f00: pmc: 0x000104000):
Error 6, Unretryable error
SMP: AP CPU #2 Launched!
cd0 at ata0 bus 0 scbus0 target 0 lun 0
cpu2 AP:
cd0: ID: 0x0200 VER: 0x80050010 LDR: 0x DFR: 0x
Removable CD-ROM SCSI-0 device
lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
cd0: 33.300MB/s transfers timer: 0x000200ef therm: 0x0001 err: 0x00f0 ( pmc: 0x00010400UDMA2,
ATAPI 12bytes, ioapic0: routing intpin 3 (PIO 65534bytesISA IRQ 3)) to lapic 1 vector 48
f
loiwotaapbilce0 :c lreoaunteirn gs tianrttpeidn
4 (cd0: Attempt to query device size failed: NOT READY, Medium not present
ISA IRQ 4) to lapic 2 vector 48
ioapic0: routing intpin 9 (ISA IRQ 9) to lapic 3 vector 48
ioapic0: routing intpin 15 (ISA IRQ 15) to lapic 1 vector 49
ioapic0: routing intpin 17 (PCI IRQ 17) to lapic 2 vector 49
ioapic0: routing intpin 18 (PCI IRQ 18) to lapic 3 vector 49
ioapic0: routing intpin 22 (PCI IRQ 22) to lapic 1 vector 50
ioapic0: routing intpin 23 (PCI IRQ 23) to lapic 2 vector 50
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x10
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0x808b1581
stack pointer           = 0x28:0x80ef5b20
frame pointer           = 0x28:0x80ef5b50
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 0 (swapper)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
panic() at panic+0x187
trap_fatal() at trap_fatal+0x290
trap_pfault() at trap_pfault+0x28f
trap() at trap+0x3df
calltrap() at calltrap+0x8
--- trap 0xc, rip = 0x808b1581, rsp = 0x80ef5b20, rbp = 0x80ef5b50 ---
intr_execute_handlers() at intr_execute_handlers+0x21
lapic_handle_intr() at lapic_handle_intr+0x37
Xapic_isr1() at Xapic_isr1+0xa5
--- interrupt, rip = 0x808b6cf3, rsp = 0x80ef5c40, rbp = 0x80ef5c60 ---
spinlock_exit() at spinlock_exit+0x33
ioapic_assign_cpu() at ioapic_assign_cpu+0x123
intr_shuffle_irqs() at intr_shuffle_irqs+0x9d
mi_startup() at mi_startup+0x77
btext() at btext+0x2c
Uptime: 2s
Re: panic on boot
> On Wednesday, December 22, 2010 5:12:03 am Daniel Braniss wrote:
> > the hardware is Sun Fire X2200 M2, and it's discless, PXE booted.
> >
> > this seems to have started sometime before 8.2, and it
> > 'sometimes happens':
> >
> > FreeBSD 8.2-PRERELEASE #15 r4274: Wed Dec 22 09:11:27 IST 2010c40, rbp =
> > 0x80ef5c60 ---
> > da...@rnd:/home/obj/rnd/r+d/stable/8/sys/HUJI amd64
> > Timecounter "i8254" frequency 1193182 Hz quality 0
> > CPU: Dual-Core AMD Opteron(tm) Processor 2218 (2613.40-MHz K8-class CPU)
> > Origin = "AuthenticAMD" Id = 0x40f13 Family = f Model = 41 Stepping =
> > 3
> >
> > Features=0x178bfbff
> > CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
> > Features2=0x2001
> > AMD Features=0xea500800
> > AMD Features2=0x1f
> > ...
> > SMP: AP CPU #3 Launched!
> > (cd0:ata0:0:0:0): SCSI status: Check Condition
> > cpu3 AP:
> > (cd0:ata0:0:0:0): SCSI sense: NOT READY asc:3a,0 (Medium not present)
> > ID: 0x0300 VER: 0x80050010 LDR: 0x DFR: 0x
> > (cd0: lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
> > ata0:0: timer: 0x000200ef therm: 0x0001 err: 0x00f00: pmc:
> > 0x000104000):
> > Error 6, Unretryable error
> > SMP: AP CPU #2 Launched!
> > cd0 at ata0 bus 0 scbus0 target 0 lun 0
> > cpu2 AP:
> > cd0: ID: 0x0200 VER: 0x80050010 LDR: 0x DFR: 0x
> > Removable CD-ROM SCSI-0 device
> > lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
> > cd0: 33.300MB/s transfers timer: 0x000200ef therm: 0x0001 err:
> > 0x00f0 ( pmc: 0x00010400UDMA2,
> > ATAPI 12bytes, ioapic0: routing intpin 3 (PIO 65534bytesISA IRQ 3)) to
> > lapic 1 vector 48
> > f
> > loiwotaapbilce0 :c lreoaunteirn gs tianrttpeidn
> > 4 (cd0: Attempt to query device size failed: NOT READY, Medium not present
> > ISA IRQ 4) to lapic 2 vector 48
> > ioapic0: routing intpin 9 (ISA IRQ 9) to lapic 3 vector 48
> > ioapic0: routing intpin 15 (ISA IRQ 15) to lapic 1 vector 49
> > ioapic0: routing intpin 17 (PCI IRQ 17) to lapic 2 vector 49
> > ioapic0: routing intpin 18 (PCI IRQ 18) to lapic 3 vector 49
> > ioapic0: routing intpin 22 (PCI IRQ 22) to lapic 1 vector 50
> > ioapic0: routing intpin 23 (PCI IRQ 23) to lapic 2 vector 50
> > kernel trap 12 with interrupts disabled
> >
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; apic id = 00
> > fault virtual address = 0x10
> > fault code = supervisor read data, page not present
> > instruction pointer = 0x20:0x808b1581
> > stack pointer = 0x28:0x80ef5b20
> > frame pointer = 0x28:0x80ef5b50
> > code segment= base 0x0, limit 0xf, type 0x1b
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags= resume, IOPL = 0
> > current process = 0 (swapper)
> > trap number = 12
> > panic: page fault
> > cpuid = 0
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> > kdb_backtrace() at kdb_backtrace+0x37
> > panic() at panic+0x187
> > trap_fatal() at trap_fatal+0x290
> > trap_pfault() at trap_pfault+0x28f
> > trap() at trap+0x3df
> > calltrap() at calltrap+0x8
> > --- trap 0xc, rip = 0x808b1581, rsp = 0x80ef5b20, rbp =
> > 0x80ef5b50 ---
> > intr_execute_handlers() at intr_execute_handlers+0x21
> > lapic_handle_intr() at lapic_handle_intr+0x37
> > Xapic_isr1() at Xapic_isr1+0xa5
> > --- interrupt, rip = 0x808b6cf3, rsp = 0x80ef5c40, rbp =
> > 0x80ef5c60 ---
> > spinlock_exit() at spinlock_exit+0x33
> > ioapic_assign_cpu() at ioapic_assign_cpu+0x123
> > intr_shuffle_irqs() at intr_shuffle_irqs+0x9d
> > mi_startup() at mi_startup+0x77
> > btext() at btext+0x2c
> > Uptime: 2s
>
> Can you do 'l *intr_execute_handlers+0x21' and 'l *ioapic_assign_cpu+0x123'
> in 'gdb kernel.debug' of your kernel?

sure, as soon as it happens, and it ain't happening now :-(
but when it does happen, I think it won't let me into the debugger -
I will probably have to recompile.

thanks
	danny
Re: panic on boot
ok, it happened ...

Cannot dump. Device not defined or unavailable.
Automatic reboot in 15 seconds - press a key on the console to abort
--> Press a key on the console to reboot,
--> or switch off the system now.

but
	a- the 15 seconds never happen :-)
	b- there is some magic to get into the debugger but can't find it.

danny
Re: panic on boot
> On Wednesday, December 22, 2010 10:58:56 am Daniel Braniss wrote:
> > > On Wednesday, December 22, 2010 5:12:03 am Daniel Braniss wrote:
> > > > the hardware is Sun Fire X2200 M2, and it's discless, PXE booted.
> > > >
> > > > this seems to have started sometime before 8.2, and it
> > > > 'sometimes happens':
> > > >
> > > > FreeBSD 8.2-PRERELEASE #15 r4274: Wed Dec 22 09:11:27 IST 2010c40, rbp
> > > > =
> > > > 0x80ef5c60 ---
> > > > da...@rnd:/home/obj/rnd/r+d/stable/8/sys/HUJI amd64
> > > > Timecounter "i8254" frequency 1193182 Hz quality 0
> > > > CPU: Dual-Core AMD Opteron(tm) Processor 2218 (2613.40-MHz K8-class CPU)
> > > > Origin = "AuthenticAMD" Id = 0x40f13 Family = f Model = 41
> > > > Stepping = 3
> > > >
> > > > Features=0x178bfbff
> > > > CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
> > > > Features2=0x2001
> > > > AMD
> > > > Features=0xea500800
> > > > AMD Features2=0x1f
> > > > ...
> > > > SMP: AP CPU #3 Launched!
> > > > (cd0:ata0:0:0:0): SCSI status: Check Condition
> > > > cpu3 AP:
> > > > (cd0:ata0:0:0:0): SCSI sense: NOT READY asc:3a,0 (Medium not present)
> > > > ID: 0x0300 VER: 0x80050010 LDR: 0x DFR: 0x
> > > > (cd0: lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR:
> > > > 0x01ff
> > > > ata0:0: timer: 0x000200ef therm: 0x0001 err: 0x00f00: pmc:
> > > > 0x000104000):
> > > > Error 6, Unretryable error
> > > > SMP: AP CPU #2 Launched!
> > > > cd0 at ata0 bus 0 scbus0 target 0 lun 0
> > > > cpu2 AP:
> > > > cd0: ID: 0x0200 VER: 0x80050010 LDR: 0x DFR:
> > > > 0x
> > > > Removable CD-ROM SCSI-0 device
> > > > lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
> > > > cd0: 33.300MB/s transfers timer: 0x000200ef therm: 0x0001 err:
> > > > 0x00f0 ( pmc: 0x00010400UDMA2,
> > > > ATAPI 12bytes, ioapic0: routing intpin 3 (PIO 65534bytesISA IRQ 3)) to
> > > > lapic 1 vector 48
> > > > f
> > > > loiwotaapbilce0 :c lreoaunteirn gs tianrttpeidn
> > > > 4 (cd0: Attempt to query device size failed: NOT READY, Medium not
> > > > present
> > > > ISA IRQ 4) to lapic 2 vector 48
> > > > ioapic0: routing intpin 9 (ISA IRQ 9) to lapic 3 vector 48
> > > > ioapic0: routing intpin 15 (ISA IRQ 15) to lapic 1 vector 49
> > > > ioapic0: routing intpin 17 (PCI IRQ 17) to lapic 2 vector 49
> > > > ioapic0: routing intpin 18 (PCI IRQ 18) to lapic 3 vector 49
> > > > ioapic0: routing intpin 22 (PCI IRQ 22) to lapic 1 vector 50
> > > > ioapic0: routing intpin 23 (PCI IRQ 23) to lapic 2 vector 50
> > > > kernel trap 12 with interrupts disabled
> > > >
> > > >
> > > > Fatal trap 12: page fault while in kernel mode
> > > > cpuid = 0; apic id = 00
> > > > fault virtual address = 0x10
> > > > fault code = supervisor read data, page not present
> > > > instruction pointer = 0x20:0x808b1581
> > > > stack pointer = 0x28:0x80ef5b20
> > > > frame pointer = 0x28:0x80ef5b50
> > > > code segment= base 0x0, limit 0xf, type 0x1b
> > > > = DPL 0, pres 1, long 1, def32 0, gran 1
> > > > processor eflags= resume, IOPL = 0
> > > > current process = 0 (swapper)
> > > > trap number = 12
> > > > panic: page fault
> > > > cpuid = 0
> > > > KDB: stack backtrace:
> > > > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> > > > kdb_backtrace() at kdb_backtrace+0x37
> > > > panic() at panic+0x187
> > > > trap_fatal() at trap_fatal+0x290
> > > > trap_pfault() at trap_pfault+0x28f
> > > > trap() at trap+0x3df
> > > > calltrap() at calltrap+0x8
> > > > --- trap 0xc, rip = 0x808b1581, rsp = 0x80ef5b20, rbp =
> > > > 0x80ef5b50 ---
> > > > intr_execute_handlers() at intr_execute_handlers+0x21
> > > > lapic_handle_intr() at lapic_handle_intr+0x37
> > > > Xapic_isr1() at Xapic_isr1+0xa5
> > > > --- inter
Re: panic on boot
> On Thursday, December 23, 2010 1:47:39 am Daniel Braniss wrote:
> > > On Wednesday, December 22, 2010 10:58:56 am Daniel Braniss wrote:
> > > > > On Wednesday, December 22, 2010 5:12:03 am Daniel Braniss wrote:
> > > > > > the hardware is Sun Fire X2200 M2, and it's discless, PXE booted.
> > > > > >
> > > > > > this seems to have started sometime before 8.2, and it
> > > > > > 'sometimes happens':
> > > > > >
> > > > > > FreeBSD 8.2-PRERELEASE #15 r4274: Wed Dec 22 09:11:27 IST 2010c40,
> > > > > > rbp =
> > > > > > 0x80ef5c60 ---
> > > > > > da...@rnd:/home/obj/rnd/r+d/stable/8/sys/HUJI amd64
> > > > > > Timecounter "i8254" frequency 1193182 Hz quality 0
> > > > > > CPU: Dual-Core AMD Opteron(tm) Processor 2218 (2613.40-MHz K8-class
> > > > > > CPU)
> > > > > > Origin = "AuthenticAMD" Id = 0x40f13 Family = f Model = 41
> > > > > > Stepping = 3
> > > > > >
> > > > > > Features=0x178bfbff
> > > > > > CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
> > > > > > Features2=0x2001
> > > > > > AMD
> > > > > > Features=0xea500800
> > > > > > AMD Features2=0x1f
> > > > > > ...
> > > > > > SMP: AP CPU #3 Launched!
> > > > > > (cd0:ata0:0:0:0): SCSI status: Check Condition
> > > > > > cpu3 AP:
> > > > > > (cd0:ata0:0:0:0): SCSI sense: NOT READY asc:3a,0 (Medium not
> > > > > > present)
> > > > > > ID: 0x0300 VER: 0x80050010 LDR: 0x DFR:
> > > > > > 0x
> > > > > > (cd0: lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR:
> > > > > > 0x01ff
> > > > > > ata0:0: timer: 0x000200ef therm: 0x0001 err: 0x00f00: pmc:
> > > > > > 0x000104000):
> > > > > > Error 6, Unretryable error
> > > > > > SMP: AP CPU #2 Launched!
> > > > > > cd0 at ata0 bus 0 scbus0 target 0 lun 0
> > > > > > cpu2 AP:
> > > > > > cd0: ID: 0x0200 VER: 0x80050010 LDR: 0x DFR:
> > > > > > 0x
> > > > > > Removable CD-ROM SCSI-0 device
> > > > > > lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR:
> > > > > > 0x01ff
> > > > > > cd0: 33.300MB/s transfers timer: 0x000200ef therm: 0x0001 err:
> > > > > > 0x00f0 ( pmc: 0x00010400UDMA2,
> > > > > > ATAPI 12bytes, ioapic0: routing intpin 3 (PIO 65534bytesISA IRQ 3))
> > > > > > to lapic 1 vector 48
> > > > > > f
> > > > > > loiwotaapbilce0 :c lreoaunteirn gs tianrttpeidn
> > > > > > 4 (cd0: Attempt to query device size failed: NOT READY, Medium not
> > > > > > present
> > > > > > ISA IRQ 4) to lapic 2 vector 48
> > > > > > ioapic0: routing intpin 9 (ISA IRQ 9) to lapic 3 vector 48
> > > > > > ioapic0: routing intpin 15 (ISA IRQ 15) to lapic 1 vector 49
> > > > > > ioapic0: routing intpin 17 (PCI IRQ 17) to lapic 2 vector 49
> > > > > > ioapic0: routing intpin 18 (PCI IRQ 18) to lapic 3 vector 49
> > > > > > ioapic0: routing intpin 22 (PCI IRQ 22) to lapic 1 vector 50
> > > > > > ioapic0: routing intpin 23 (PCI IRQ 23) to lapic 2 vector 50
> > > > > > kernel trap 12 with interrupts disabled
> > > > > >
> > > > > >
> > > > > > Fatal trap 12: page fault while in kernel mode
> > > > > > cpuid = 0; apic id = 00
> > > > > > fault virtual address = 0x10
> > > > > > fault code = supervisor read data, page not present
> > > > > > instruction pointer = 0x20:0x808b1581
> > > > > > stack pointer = 0x28:0x80ef5b20
> > > > > > frame pointer = 0x28:0x80ef5b50
> > > > > > code segment= base 0x0, limit 0xf, type 0x1b
> > > > > > = DPL 0, pres 1, long 1, def32 0, gran 1
> > > > > > processor eflags= resume, IOPL = 0
> > > > > > current process = 0 (swapper)
unable to pwd in ZFS snapshot
hi,
this is still broken in 8.2-PRERELEASE, there seems to be a patch, but
it's almost a year old.
	http://people.freebsd.org/~jh/patches/zfs-ctldir-vptocnp.diff

danny
Re: unable to pwd in ZFS snapshot
> On Sun, Dec 26, 2010 at 09:26:13AM +0200, Daniel Braniss wrote:
> > this is still broken in 8.2-PRERELEASE, there seems to be a patch, but
> > it's almost a year old.
> > http://people.freebsd.org/~jh/patches/zfs-ctldir-vptocnp.diff
>
> Setting snapdir to visible should fix this right away:
> # zfs set snapdir=visible tank/foo
>
it did indeed!
any reason why this should not be the default behaviour?

thanks,
	danny
Re: unable to pwd in ZFS snapshot
>
> On 26 Dec 2010, at 10:05, Daniel Braniss wrote:
>
> >> On Sun, Dec 26, 2010 at 09:26:13AM +0200, Daniel Braniss wrote:
> >>> this is still broken in 8.2-PRERELEASE, there seems to be a patch, but
> >>> it's almost a year old.
> >>> http://people.freebsd.org/~jh/patches/zfs-ctldir-vptocnp.diff
> >>
> >> Setting snapdir to visible should fix this right away:
> >> # zfs set snapdir=visible tank/foo
> >>
> > it did indeed!
> > any reason why this should not be the default behaviour?
>
> Personally, I want to have the snapshot, but not see the directory otherwise
> so that it doesn't get scooped up by rsync et al inadvertently

I agree, so the point is that, as usual, the solution fixes one problem by
creating another one :-)
so basically, the bug is still there, or is it a feature?
ie:
    ls /h/.zfs/snapshot/20101225/
        works
    cd /h/.zfs/snapshot/20101225/
        works
    pwd
        pwd: .: No such file or directory

btw, why use rsync if 'zfs send | zfs recv' works really nicely?

cheers,
danny
Re: Specifying root mount options on diskless boot.
> > --2iBwrppp/7QCDedR > Content-Type: text/plain; charset=us-ascii > Content-Disposition: inline > Content-Transfer-Encoding: quoted-printable > > [I'm not sure if -stable is the best list for this but anyway...] > > I'm trying to convert an old laptop running FreeBSD 8.0 into a diskless > client (since its internal HDD is growing bad spots faster than I can > repair them). I have it pxebooting nicely and running with an NFS root > but it then reports locking problems: devd, syslogd, moused (and maybe > others) lock their PID file to protect against multiple instances. > Unfortunately, these daemons all start before statd/lockd and so the > locking fails and reports "operation not supported". > > It's not practical to reorder the startup sequence to make lockd start > early enough (I've tried). > > Since the filesystem is reserved for this client, there's no real need > to forward lock requests across the wire and so specifying "nolockd" > would be another solution. Looking through sys/nfsclient/bootp_subr.c, > DHCP option 130 should allow NFS mount options to be specified (though > it's not clear that the relevant code path is actually followed because > I don't see the associated printf()s anywhere on the console. After > getting isc-dhcpd to forward this option (made more difficult because > its documentation is incorrect), it still doesn't work. 
> > Understanding all this isn't helped by kenv(8) reporting three different > sets of root filesystem options: > boot.nfsroot.path=3D"/tank/m3" > boot.nfsroot.server=3D"192.168.123.200" > dhcp.option-130=3D"nolockd" > dhcp.root-path=3D"192.168.123.200:/tank/m3" > vfs.root.mountfrom=3D"nfs:server:/tank/m3" > vfs.root.mountfrom.options=3D"rw,tcp,nolockd" > > And the console also reports conflicting root definitions: > Trying to mount root from nfs:server:/tank/m3 > NFS ROOT: 192.168.123.200:/tank/m3 > > Working through all these: > boot.nfsroot.* appears to be initialised by sys/boot/i386/libi386/pxe.c > but, whilst nfsclient/nfs_diskless.c can parse boot.nfsroot.options, > there's no code to initialise that kenv name in pxe.c > > dhcp.* appears to be initialised by lib/libstand/bootp.c - which does > include code to populate boot.nfsroot.options (using vendor specific > DHCP option 20) but this code is not compiled in. Further studying > of bootp.c shows that it's possible to initialise arbitrary kenv's > using DHCP options 246-254 - but the DHCPDISCOVER packets do not > request these options so they don't work without special DHCP server > configuration (to forward options that aren't requested). > > vfs.root.* is parsed out of /etc/fstab but, other than being > reported in the console message above, it doesn't appear to be > used in this environment (it looks like the root entry can be > commented out of /etc/fstab without problem). > > My final solution was to specify 'boot.nfsroot.options=3D"nolockd"' in > loader.conf - and this seems to actually work. > > It seems rather unfortunate that FreeBSD has code to allow NFS root > mount options to be specified via DHCP (admittedly in several > incompatible ways) but none actually work. A quick look at -current > suggests that the situation there remains equally broken. > > Has anyone else tried to use any of this? And would anyone be interested > in trying to make it actually work? 
Hi Peter,
I have been doing diskless booting for a long time, and am very pleased
(though 8.2-prerelease is causing some problems :-).
In my case /var is mfs, or ufs/zfs, and I have no lockd problems.
here is what you need to do:
either change in libstand/bootp.c:
    #define DHCP_ENV DHCP_ENV_NO_VENDOR
to
    #define DHCP_ENV DHCP_ENV_FREEBSD
or pick my version from:
    ftp://ftp.cs.huji.ac.il/users/danny/freebsd/diskless-boot/
and compile a new pxeboot.
this new pxeboot will allow you to pass some key options via dhcp.
next, take a look at
    ftp://ftp.cs.huji.ac.il/users/danny/freebsd/diskless-boot/rc.initdiskless
and make sure that your exported root has /.etc
If your /var is also nfs mounted, maybe unionfs might help too.

just writing quickly so you won't feel discouraged, and to say that diskless
actually works.

danny
gstripe/gpart problems.
Hi,
I have 2 ada disks striped:

# gstripe list
Geom name: s1
State: UP
Status: Total=2, Online=2
Type: AUTOMATIC
Stripesize: 65536
ID: 2442772675
Providers:
1. Name: stripe/s1
   Mediasize: 1000215674880 (932G)
   Sectorsize: 512
   Stripesize: 65536
   Stripeoffset: 0
   Mode: r0w0e0
Consumers:
1. Name: ada0
   Mediasize: 500107862016 (466G)
   Sectorsize: 512
   Mode: r0w0e0
   Number: 0
2. Name: ada1
   Mediasize: 500107862016 (466G)
   Sectorsize: 512
   Mode: r0w0e0
   Number: 1

boot complains:

GEOM_STRIPE: Device s1 created (id=2442772675).
GEOM_STRIPE: Disk ada0 attached to s1.
GEOM: ada0: corrupt or invalid GPT detected.
GEOM: ada0: GPT rejected -- may not be recoverable.
GEOM_STRIPE: Disk ada1 attached to s1.
GEOM_STRIPE: Device s1 activated.

# gpart show
=>        34  1953546173  stripe/s1  GPT  (932G)
          34         128          1  freebsd-boot  (64K)
         162  1953546045             - free -  (932G)

# gpart add -t freebsd-ufs -s 20g stripe/s1
GEOM: ada0: corrupt or invalid GPT detected.
GEOM: ada0: GPT rejected -- may not be recoverable.
stripe/s1p2 added
# gpart show
=>        34  1953546173  stripe/s1  GPT  (932G)
          34         128          1  freebsd-boot  (64K)
         162    41943040          2  freebsd-ufs  (20G)
    41943202  1911603005             - free -  (912G)

if I go the MBR road, all seems ok, but as soon as I try to write
the boot block (boot0cfg -B /dev/stripe/s1) again the kernel
starts to complain about corrupted GEOM too.

any ideas?

thanks,
danny
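One possible reading of the GEOM messages above, offered as a rough sketch and not as a diagnosis: if the stripe lays its 64K units out round-robin across ada0 and ada1, then the sectors at the start of stripe/s1 are also the sectors at the start of ada0, while the end of the stripe lands on ada1. The Python below only does that arithmetic with the sizes reported by `gstripe list`. The round-robin layout and the helper name `map_lba` are assumptions made for illustration; the only GPT fact used is that the primary header sits at LBA 1 and the backup header at the last LBA of the device.

```python
SECTOR = 512                       # Sectorsize from `gstripe list`
STRIPE_SECTORS = 65536 // SECTOR   # Stripesize 65536 bytes = 128 sectors
NDISKS = 2                         # consumers: ada0 (index 0), ada1 (index 1)

STRIPE_TOTAL = 1000215674880 // SECTOR   # stripe/s1 Mediasize, in sectors
ADA0_TOTAL = 500107862016 // SECTOR      # ada0 Mediasize, in sectors


def map_lba(lba):
    """Map a stripe LBA to (disk index, LBA on that disk).

    Assumes a plain round-robin RAID0 layout starting at sector 0 of each
    disk (Stripeoffset: 0). This is an illustrative assumption, not code
    taken from gstripe.
    """
    unit, within = divmod(lba, STRIPE_SECTORS)
    disk = unit % NDISKS          # which consumer holds this stripe unit
    row = unit // NDISKS          # how many units precede it on that disk
    return disk, row * STRIPE_SECTORS + within


primary_header = 1                  # GPT primary header: LBA 1
backup_header = STRIPE_TOTAL - 1    # GPT backup header: last LBA of the device

print("primary header ->", map_lba(primary_header))
print("backup header  ->", map_lba(backup_header))
print("backup LBA exists on ada0 alone:", backup_header < ADA0_TOTAL)
```

Under these assumptions, the primary GPT header of stripe/s1 is physically sector 1 of ada0, so anything that inspects ada0 on its own finds a header describing a device about twice its size, with the backup header on the other disk. That would be consistent with the "corrupt or invalid GPT" complaints about ada0, but nothing in the thread confirms it.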
Re: gstripe/gpart problems.
> On Tue, Jan 04, 2011 at 04:21:31PM +0200, Daniel Braniss wrote: > > Hi, > > I have 2 ada disks striped: > > > > # gstripe list > > Geom name: s1 > > State: UP > > Status: Total=2, Online=2 > > Type: AUTOMATIC > > Stripesize: 65536 > > ID: 2442772675 > > Providers: > > 1. Name: stripe/s1 > >Mediasize: 1000215674880 (932G) > >Sectorsize: 512 > >Stripesize: 65536 > >Stripeoffset: 0 > >Mode: r0w0e0 > > Consumers: > > 1. Name: ada0 > >Mediasize: 500107862016 (466G) > >Sectorsize: 512 > >Mode: r0w0e0 > >Number: 0 > > 2. Name: ada1 > >Mediasize: 500107862016 (466G) > >Sectorsize: 512 > >Mode: r0w0e0 > >Number: 1 > > > > boot complains: > > > > GEOM_STRIPE: Device s1 created (id=2442772675). > > GEOM_STRIPE: Disk ada0 attached to s1. > > GEOM: ada0: corrupt or invalid GPT detected. > > GEOM: ada0: GPT rejected -- may not be recoverable. > > GEOM_STRIPE: Disk ada1 attached to s1. > > GEOM_STRIPE: Device s1 activated. > > > > # gpart show > > =>34 1953546173 stripe/s1 GPT (932G) > > 34 128 1 freebsd-boot (64K) > > 162 1953546045 - free - (932G) > > # gpart show > > =>34 1953546173 stripe/s1 GPT (932G) > > 34 128 1 freebsd-boot (64K) > > 162 1953546045 - free - (932G) > > > > # gpart add -t freebsd-ufs -s 20g stripe/s1 > > GEOM: ada0: corrupt or invalid GPT detected. > > GEOM: ada0: GPT rejected -- may not be recoverable. > > stripe/s1p2 added > > # gpart show > > =>34 1953546173 stripe/s1 GPT (932G) > > 34 128 1 freebsd-boot (64K) > > 16241943040 2 freebsd-ufs (20G) > > 41943202 1911603005 - free - (912G) > > > > if I go the MBR road, all seems ok, but as soon as I try to write > > the boot block (boot0cfg -B /dev/stripe/s1) again the kernel > > starts to complain about corrupted GEOM too. > > So are you trying to partition the drives and then stripe the > partitions within the drives, or are you trying to partition the > stripe? 
> > It seems here as though you might be trying to first partition the > drives (not clear on that) then stripe the whole drives - which will > mean the partition info is wrong for the resulting striped drive set - > and then repartition the striped drive set, and neither is ending up > valid. > > If what you are intending is to partition after striping the raw > drives, then you are doing the right steps, but when the geom layer > tries to look at the info on the individual drives as at boot, it will > find it invalid. If it the gpart layer is actually refusing to write > partition info to the drives which is wrong for the drives taken > individually, that would account for your problems. > > One valid order to do things in would be partition the drives with > gpart, creating identical sets of partitions on both drives, then > stripe the partitions created within them (syntax not exact): > > gpart add -t freebsd-ufs0 -s 10g ada0 > gpart add -t freebsd-ufs1 -s 10g ada1 > gstripe label freebsd-ufs freebsd-ufs0 freebsd-ufs1 > > That would give you a 20GB stripe, with valid partition info on each > drive. > > If this will be your boot drive, depending on how much needs to be read > from the drive before the geom_stripe kernel module gets loaded, I > would think there could also be a problem booting from the drive. This > is not like gmirroring two drives or partitions, where the info read > from either disk early in boot will be identical, and identical (except > for the last block of the partition) to what the OS sees later after > the mirror is formed. > > I assume you're bearing in mind that if you lose either drive to a > hardware fault you lose the whole thing, and consider the risk worth > the potential speed/size gain. 
> -- Clifton

Hi Clifton,
I was getting very frustrated yesterday, hence the cryptic message;
your response requires some background :-)
the box is a Sun Fire X2200, which has bays for 2 disks (we have several
of these). before the latest upgrade, the 2 disks were 'raided' via
'nVidia MediaShield' and appeared as ar0; when I upgraded to 8.2, it
disappeared, since I had ATA_CAM in the kernel config file.
So I started fiddling with gstripe, which 'recovered' the data.
Next, since the kernel boot kept complaining about GEOM errors, (and not w
Re: Serial console not working in 7.2-p4 and 7.2-STABLE
> All,
>
> I'm pulling my hair out on this one! Can't get the serial console to
> work with nanoBSD, either 7.2-p4 or 7-STABLE. A 8.0 nanoBSD image
> works fine (which I have not created myself). The symptom is that all
> kernel output goes to VGA. Whatever I do. This happens in VMware
> Player (where I actually see the VGA output) and on my ALIX
> (Soekris-like) board (which does not have a VGA card).
>
> boot0 is boot0sio, boot.config contains -h and the loader works fine
> over the serial port. console=comconsole there so that should work,
> right? No, because still my kernel outputs everything to VGA...
>
> I'm using the sio device. Even tried putting flags on 0x30 -> no
> difference at all. Tried the uart device and removing sio from my
> kernel but that resulted in having NO serial ports at all...
>
> Any help is much appreciated!
>
> Sven

hi,
put
    hint.uart.0.flags="0x10"
in /boot/device.hints, or better, make sure you have an updated one
from /sys/i386/conf/GENERIC.hints

another thing, make sure the speed (baud rate) is correct, else you
probably won't see any output either.
in /boot/loader.conf you need
    console="comconsole,vidconsole"
and
    comconsole_speed="115200"
to set the speed.

to get a login you will need, in /etc/ttys:
    ttyu0 "/usr/libexec/getty 3wire.115200" dialup on secure

hope this helps
danny
can't boot table-8 on HP Proliant DL580 G5
Hi,
the boot stops somewhere after probing ata0; so far, playing with the BIOS
(disabling stuff) does not help.
BTW, linux boots ok (except it has problems with IPMI).

So, any success stories there?

danny
PCengines ALIX boot0sio serial input failes
hi,
FreeBSD-8 works great on these boards, but there are some
gotchas with the boot and the serial: output works fine, but input
is 'problematic'. the pxeboot serial handling is ok, the boot menu
is ok, but when booting off the CF (using boot0sio), the input is 'screwy':
at the selection of partition it is ignored, and at the OK: prompt
from the boot (i had no kernel in the slice), the input is usually
doubled:
    sshooww instead of show
which is probably similar to what is happening with boot0sio but it
only echoes # (the current bell).

Once the kernel is up, the serial works fine.

any ideas?

thanks,
danny
Re: PCengines ALIX boot0sio serial input failes
> On 12/9/2009 11:13 AM, Daniel Braniss wrote: > > hi, > > FreeBSD-8 works great on these boards, but there are some > > gotchas, the boot and the serial: output works fine, but input > > is 'problematic'. the pxeboot serial handling is ok, the boot menu > > is ok, but booting off the CF (using boot0sio), the input 'screwy' > > at the selection of partition it is ignored, at the OK: prompt > > from the boot (i had no kernel in the slice), the input is usually > > doubled: > > sshooww instead of show > > which is probably similar to what is happening with boot0sio but it > > only echoes # (the current bell). > > > > Once the kernel is up, the serial works fine. > > The development version of pfSense (2.0) is running on FreeBSD 8.0 using > NanoBSD and its serial input/output works pretty well on ALIX, the > 2d3.2d13 version at least (and others, but those are the only two I have > used personally). > > My test ALIX is at home unplugged at the moment, but based on what I see > in the image file there are a few things that were done: > > /boot/device.hints contains: > hint.uart.0.at="isa" > hint.uart.0.port="0x3F8" > hint.uart.0.flags="0x10" > hint.uart.0.irq="4" > > /boot.config contains: > -h > > The initial boot0cfg on an image is done with: > boot0cfg -B -b /path/to/boot/boot0sio -o packet -s 1 -m 3 > > Here is what shows up when I mount an md device from a CF image: > # boot0cfg -v /dev/md0 > # flag start chs type end chs offset size > 1 0x80 0: 1: 1 0xa5444: 15:63 63 448497 > 2 0x00445: 1: 1 0xa5889: 15:63 448623 448497 > 3 0x00890: 0: 1 0xa5991: 15:63 897120 102816 > > version=2.0 drive=0x80 mask=0x3 ticks=182 bell=# (0x23) > options=packet,update,nosetdrv > volume serial ID 9090-9090 > default_selection=F1 (Slice 1) > > Seems to work pretty well there. If you want the details, you can check > out the pfSense tools git repository which contains the build scripts > that generate the images. I have the same /boot/device.hints. 
can you confirm that
1) when booting from CF, the boot0sio accepts input
2) the /boot/boot accepts input from the serial?

thanks,
danny
Re: PCengines ALIX boot0sio serial input failes
> On Wednesday 09 December 2009 17:13:57 Daniel Braniss wrote: > > hi, > > FreeBSD-8 works great on these boards, but there are some > > gotchas, the boot and the serial: output works fine, but input > > is 'problematic'. the pxeboot serial handling is ok, the boot menu > > is ok, but booting off the CF (using boot0sio), the input 'screwy' > > at the selection of partition it is ignored, at the OK: prompt > > from the boot (i had no kernel in the slice), the input is usually > > doubled: > > sshooww instead of show > > which is probably similar to what is happening with boot0sio but it > > only echoes # (the current bell). > > > > Once the kernel is up, the serial works fine. > > > > any ideas? > > > > Which ALIX board exactly? There are some differences (even various BIOSes). > Any chance you have vga driver in kernel? TinyBIOS emulates VGA a bit, > redirects output to serial port. If at the beginning you are trying both VGA > and serial port, output is doubled. Similar behavior is observed on older > WRAP boards, too. I have tried ALIX-1 and 2 here is an example: PC Engines ALIX.3 v0.99h 640 KB Base Memory 261120 KB Extended Memory Waiting for HDD ... 01F0 Master 848A SanDisk SDCFH2-002G Phys C/H/S 3970/16/63 Log C/H/S 992/64/63 1 FreeBSD 2 FreeBSD 3 FreeBSD 6 PXE Boot: 1 any key I hit, it echoes as # and is ignored. at this point the kernel is not yet involved, so having vga+kb support is not the reason, though I will try out the alix-3, which has vga support, and a different BIOS soon. thanks, danny ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: PCengines ALIX boot0sio serial input failes
> On 12/10/2009 2:32 AM, Daniel Braniss wrote:
> >> Which ALIX board exactly? There are some differences (even various BIOSes).
> >> Any chance you have vga driver in kernel? TinyBIOS emulates VGA a bit,
> >> redirects output to serial port. If at the beginning you are trying both VGA
> >> and serial port, output is doubled. Similar behavior is observed on older
> >> WRAP boards, too.
> >
> > I have tried ALIX-1 and 2
> > here is an example:
> >
> > PC Engines ALIX.3 v0.99h
> > 640 KB Base Memory
> > 261120 KB Extended Memory
> > Waiting for HDD ...
> >
> > 01F0 Master 848A SanDisk SDCFH2-002G
> > Phys C/H/S 3970/16/63 Log C/H/S 992/64/63
> >
> > 1 FreeBSD
> > 2 FreeBSD
> > 3 FreeBSD
> >
> > 6 PXE
> > Boot: 1
> >
> > any key I hit, it echoes as # and is ignored.
> > at this point the kernel is not yet involved, so having vga+kb support
> > is not the reason, though I will try out the alix-3, which has vga support,
> > and a different BIOS soon.

i have now, and the results are:
- serial works
- bios boot skips boot0, and goes straight to boot slice 1.

The good side is that PXE boot works, but switching to boot from disk is
a pain. on other systems, hitting ^C at the dhcp will stop it, and the boot
will continue from disk; if that fails (forgot some critical setup :-),
reboot, fix, boot ^C ...

>
> A lot of users have seen that happen, but typically it has been cleared
> up by using ALIX BIOS v0.99h, which that box already appears to have,
> and setting the BIOS for CHS mode.
>
> I haven't tried any of the ALIX models with VGA, but I have heard they
> are working as long as you set the BIOS for APM power management.

it's actually setting Power Management to anything but ACPI.

> (See
> the previous -STABLE thread titled "8.0-rc2 dropped hardsupport".
>
> Jim

cheers,
danny
Re: iSCSI initiator and Dell PowerVault MD3000i
> Hi all, > I am playing with iscsi_initiator on FreeBSD 7-STABLE and Dell > PowerVault MD3000i. This is the first time I am testing iSCSI... > > Does anyone have FreeBSD's iSCSI initiator in production / heavy load? > Or does somebody have experiences with Dell MD3000i? > > One thing is "poor performance" ~ 60 - 70MB/s depending on RAID level > used. (poor performance compared to plain SATA disk which have 110MB/s - > both tested for reading as it is our planned load - multimedia streaming > and downloads) > > > The other thing is some problem with compatibility of initiator and Dell > MD3000i. > > If I setup RAID 5 'Disk Group' consisted of 4x 1TB SATA drives (in > MD3000i) and then created for example 2 'Virtual Disks', both are > detected by iscontrol and added to /dev/ as da0 and da1, but da1 spams > log with messages like this: > > Dec 15 04:00:38 dust kernel: da0 at iscsi0 bus 0 target 0 lun 0 > Dec 15 04:00:38 dust kernel: da0: Fixed Direct > Access SCSI-5 device > Dec 15 04:00:38 dust kernel: da1 at iscsi0 bus 0 target 0 lun 1 > Dec 15 04:00:38 dust kernel: da1: Fixed Direct > Access SCSI-5 device > Dec 15 04:00:38 dust iscontrol[48576]: cam_open_btl: no passthrough > device found at 0:0:2 > Dec 15 04:00:38 dust iscontrol[48576]: cam_open_btl: no passthrough > device found at 0:0:3 > Dec 15 04:00:39 dust kernel: (da1:iscsi0:0:0:1): READ(6)/WRITE(6) not > supported, increasing minimum_cmd_size to 10. > Dec 15 04:00:39 dust kernel: (da1:iscsi0:0:0:1): READ(10). CDB: 28 0 0 0 > 0 0 0 0 1 0 > Dec 15 04:00:39 dust kernel: (da1:iscsi0:0:0:1): CAM Status: SCSI Status > Error > Dec 15 04:00:39 dust kernel: (da1:iscsi0:0:0:1): SCSI Status: Check > Condition > Dec 15 04:00:39 dust kernel: (da1:iscsi0:0:0:1): ILLEGAL REQUEST asc:94,1 > Dec 15 04:00:39 dust kernel: (da1:iscsi0:0:0:1): Vendor Specific ASC > Dec 15 04:00:39 dust kernel: (da1:iscsi0:0:0:1): Unretryable error > Dec 15 04:00:40 dust kernel: (da1:iscsi0:0:0:1): READ(10). 
CDB: 28 0 c > 7f df ff 0 0 1 0 > Dec 15 04:00:40 dust kernel: (da1:iscsi0:0:0:1): CAM Status: SCSI Status > Error > Dec 15 04:00:40 dust kernel: (da1:iscsi0:0:0:1): SCSI Status: Check > Condition > Dec 15 04:00:40 dust kernel: (da1:iscsi0:0:0:1): ILLEGAL REQUEST asc:94,1 > Dec 15 04:00:40 dust kernel: (da1:iscsi0:0:0:1): Vendor Specific ASC > Dec 15 04:00:40 dust kernel: (da1:iscsi0:0:0:1): Unretryable error > Dec 15 04:00:41 dust kernel: (da1:iscsi0:0:0:1): READ(10). CDB: 28 0 0 0 > 0 0 0 0 1 0 > > The message repeated many times. > > If I created more 'Virtual Disks' (7 for example), 3 of them are > producing same errors (da1, da3, da5) > > If there is only one 'Virtual Disk', it seems fine... until I configured > second path to the virtual disk as I want to try gmultipath or geom_fox > (MD3000i is dual controller with 4 NICs), then second session produces > same errors. > > First path - OK: > > Dec 15 22:47:57 dust kernel: da0 at iscsi0 bus 0 target 0 lun 0 > Dec 15 22:47:57 dust kernel: da0: Fixed Direct > Access SCSI-5 device > Dec 15 22:47:57 dust iscontrol[52226]: cam_open_btl: no passthrough > device found at 0:0:1 > Dec 15 22:47:57 dust iscontrol[52226]: cam_open_btl: no passthrough > device found at 0:0:2 > Dec 15 22:47:57 dust iscontrol[52226]: cam_open_btl: no passthrough > device found at 0:0:3 > > > Second path - error: > > Dec 15 22:48:04 dust kernel: da1 at iscsi0 bus 0 target 1 lun 0 > Dec 15 22:48:04 dust kernel: da1: Fixed Direct > Access SCSI-5 device > Dec 15 22:48:05 dust kernel: (da1:iscsi0:0:1:0): READ(6)/WRITE(6) not > supported, increasing minimum_cmd_size to 10. > Dec 15 22:48:05 dust kernel: (da1:iscsi0:0:1:0): READ(10). 
CDB: 28 0 0 0 > 0 0 0 0 1 0 > Dec 15 22:48:05 dust kernel: (da1:iscsi0:0:1:0): CAM Status: SCSI Status > Error > Dec 15 22:48:05 dust kernel: (da1:iscsi0:0:1:0): SCSI Status: Check > Condition > Dec 15 22:48:05 dust kernel: (da1:iscsi0:0:1:0): ILLEGAL REQUEST asc:94,1 > Dec 15 22:48:05 dust kernel: (da1:iscsi0:0:1:0): Vendor Specific ASC > Dec 15 22:48:05 dust kernel: (da1:iscsi0:0:1:0): Unretryable error > Dec 15 22:48:05 dust iscontrol[52230]: cam_open_btl: no passthrough > device found at 0:1:1 > Dec 15 22:48:05 dust iscontrol[52230]: cam_open_btl: no passthrough > device found at 0:1:2 > Dec 15 22:48:05 dust iscontrol[52230]: cam_open_btl: no passthrough > device found at 0:1:3 > Dec 15 22:48:06 dust kernel: (da1:iscsi0:0:1:0): READ(16). CDB: 88 0 0 0 > 0 1 5d 21 1f ff 0 0 0 1 0 0 > Dec 15 22:48:06 dust kernel: (da1:iscsi0:0:1:0): CAM Status: SCSI Status > Error > Dec 15 22:48:06 dust kernel: (da1:iscsi0:0:1:0): SCSI Status: Check > Condition > Dec 15 22:48:06 dust kernel: (da1:iscsi0:0:1:0): ILLEGAL REQUEST asc:94,1 > Dec 15 22:48:06 dust kernel: (da1:iscsi0:0:1:0): Vendor Specific ASC > Dec 15 22:48:06 dust kernel: (da1:iscsi0:0:1:0): Unretryable error > Dec 15 22:48:07 dust kernel: (da1:iscsi0:0:1:0): READ(10). CDB: 28 0 0 0 > 0 0 0 0 1 0 > Dec 15 22:48:07 dust kernel: (da
Re: iSCSI initiator and Dell PowerVault MD3000i
> please Cc: me, I am not subscribed to freebsd-scsi > > Sossi Andrej wrote: > >> On 16. 12. 2009 15:57, Miroslav Lachman wrote: > >> [...] > >> I use MD300i with FreeBSD 7.0 and 7.1 with iscsi-2.2.2. It work fine. > >> But be careful to configure MD3000i. MD3000i assign by default first > >> disk to preferred controller 0, second disk to preferred controller 1, > >> third disk to preferred controller 0, and so on. First, third, fifth... > >> disks is usable from FreeBSD, but second, fourth,... disks result > unusable. > >> Work around: manually assign all disks to controller 0. > > > > When you say "unusable" do you mean you can't access it at all / it > > errors even if it's the only path (drive) used? It would be normal if > > you have for example two paths to each drive and can't mount the other > > path if one path to the drive is mounted - this is not a usable > > combination. You can use geom_multipath to get multipath failover. > > I got errors even in unmounted state. > I tried iscsi-2.2.3 and got same errors. I tried second path first > (device da0) and it produces same errors, then I run iscontrol for the > first path (device da1) and everything is fine. > > path throught second controller: ERROR > # diskinfo -t /dev/da0 > /dev/da0 > 512 # sectorsize > 2998998663168 # mediasize in bytes (2.7T) > 5857419264 # mediasize in sectors > 364607 # Cylinders according to firmware. > 255 # Heads according to firmware. > 63 # Sectors according to firmware. > > Seek times: > Full stroke:diskinfo: read error or disk too small for > test.: Invalid argument > > > path throught first controller: OK > # diskinfo -t /dev/da1 > /dev/da1 > 512 # sectorsize > 2998998663168 # mediasize in bytes (2.7T) > 5857419264 # mediasize in sectors > 364607 # Cylinders according to firmware. > 255 # Heads according to firmware. > 63 # Sectors according to firmware. 
>
> Seek times:
>         Full stroke:      250 iter in   2.483517 sec =    9.934 msec
>         Half stroke:      250 iter in   2.575778 sec =   10.303 msec
>         Quarter stroke:   500 iter in   2.926170 sec =    5.852 msec
>         Short forward:    400 iter in   0.916901 sec =    2.292 msec
>         Short backward:   400 iter in   2.181790 sec =    5.454 msec
>         Seq outer:       2048 iter in   0.520920 sec =    0.254 msec
>         Seq inner:       2048 iter in   0.545300 sec =    0.266 msec
> Transfer rates:
>         outside:       102400 kbytes in   1.414997 sec =    72368 kbytes/sec
>         middle:        102400 kbytes in   1.45 sec =    70405 kbytes/sec
>         inside:        102400 kbytes in   1.422527 sec =    71985 kbytes/sec
>

the numbers seem ok to me, considering that the net is 1Gb.
can you configure the target virtual disk to have luns?
in any case the errors seem to be in the md3000i, can you see/check its
error log?

>
> Do you have experiences with iSCSI multipath? I read about geom_fox and
> gmultipath...

i have no experience with it, and personally see no benefit in it (but then
others might disagree :-)

danny
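To put "the numbers seem ok, considering that the net is 1Gb" into figures, here is a small back-of-the-envelope sketch in Python. It recomputes two of the diskinfo transfer rates quoted above and expresses them as a fraction of the raw 1 Gbit/s line rate. Treating diskinfo's "kbytes" as 1024-byte units and ignoring all Ethernet, TCP and iSCSI overhead are assumptions of this sketch, so the real ceiling is somewhat lower than the raw rate used here.

```python
GIGABIT_BITS_PER_SEC = 1_000_000_000
KBYTE = 1024   # assumption: diskinfo "kbytes" are 1024-byte units


def rate_kbytes_per_sec(kbytes, seconds):
    """Transfer rate as diskinfo reports it: amount read divided by elapsed time."""
    return kbytes / seconds


def fraction_of_gigabit(kbytes_per_sec):
    """Share of the raw 1 Gbit/s line rate, ignoring protocol overhead."""
    return kbytes_per_sec * KBYTE * 8 / GIGABIT_BITS_PER_SEC


# Figures taken from the quoted `diskinfo -t /dev/da1` output.
outside = rate_kbytes_per_sec(102400, 1.414997)
inside = rate_kbytes_per_sec(102400, 1.422527)

for name, rate in (("outside", outside), ("inside", inside)):
    print(f"{name}: {rate:.0f} kbytes/sec, "
          f"{fraction_of_gigabit(rate):.0%} of raw 1 Gbit/s")
```

The "middle" row is left out because its elapsed time appears truncated in the archive.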
NFS/UDP and vfs.nfs.nfs_ip_paranoia=0 does not help
Hi,
While trying to find out why our NFS/ZFS servers now hang about once a week,
I got hold of a similar box, and got a bit more ambitious: I connected it
via 2 NICs and, to complicate things a bit, the server boots via pxeboot
(ie, is dataless).
After fiddling with the default gateway, and adding -h to rpcbind and mountd,
things seem ok, but UDP is 'problematic'. I could do with TCP, except that
am-utils does a fsinfo via UDP when doing a /net/ and will hang the
client.
even with vfs.nfs.nfs_ip_paranoia=0, when the response from the server arrives
with the 'wrong' ip, an ICMP destination unreachable (port unreachable) is
replied.
in short, on the client:
this works:
    mount_nfs -o mntudp server-ip-vlanA:/mnt /mnt
this fails:
    mount_nfs -o mntudp server-ip-vlanB:/mnt /mnt
since the response is coming from server-ip-vlanA.

Q: why does this work for TCP and fail for UDP?
Q: is there a workaround?

danny
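As a hedged illustration of the first question, and not a statement about what the FreeBSD NFS or mount code actually does: a UDP socket that has been connect()ed to one peer only accepts datagrams from exactly that peer, whereas a TCP reply always travels back over the established connection. The Python sketch below shows the UDP half on a single host. Two sockets bound to different ports on 127.0.0.1 stand in for the server's vlanA and vlanB addresses; that substitution, and the function name `reply_reaches_client`, are assumptions made so the example can run anywhere.

```python
import socket


def reply_reaches_client(reply_from_expected_source, timeout=0.5):
    """Return True if a connected UDP client receives the server's reply.

    addr_b plays the address the client sends its request to, addr_a plays
    the other server address the reply might come from instead.
    """
    addr_b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    addr_a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for s in (addr_b, addr_a, client):
            s.bind(("127.0.0.1", 0))      # let the OS pick a free port
            s.settimeout(timeout)
        # After connect(), the client only accepts datagrams from addr_b.
        client.connect(addr_b.getsockname())
        client.send(b"request")
        _, client_addr = addr_b.recvfrom(1024)

        # The "server" answers either from the expected address or the other one.
        sender = addr_b if reply_from_expected_source else addr_a
        sender.sendto(b"reply", client_addr)
        try:
            return client.recv(1024) == b"reply"
        except socket.timeout:
            return False                  # datagram from the wrong source was dropped
    finally:
        for s in (addr_b, addr_a, client):
            s.close()


print("reply from expected source accepted:", reply_reaches_client(True))
print("reply from other source accepted:   ", reply_reaches_client(False))
```

If the client side behaves like the connected socket here, a reply sourced from server-ip-vlanA to a request sent to server-ip-vlanB has no matching socket, which would fit the ICMP port unreachable described above. Whether that is the actual code path in this case is not established in the thread.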
Re: RELENG_8 -- NFSv3 credentials/permissions issue
> I'm willing to bet this is something simple I've overlooked, but I'm out > of ideas. Client is 8.0-RELEASE i386, server is 8.0-STABLE amd64 > (kernel/world 2010/01/16). NFS version used is v3. Server filesystem > is UFS2. at boot time, the NFS is V2!, if the server is FreeBSD it can be upgraded later in the boot progress to V3 > > Client configuration is off-kilter: it's a PXE booted machine. Initial > PXE booting uses TFTP, then switches to NFS to load the kernel and > kernel modules. The TFTP part works, with a caveat[1], but the NFS > portion fails. TFTP is as old as the Internet, so it mostly works, and security was in dipers, so the T for trivial also means un-secure :-) > > With NFS, I'm forced to change permissions on all the exported > files/directories to be 0644/0755 (specifically, setting other/global > read/write access) otherwise the client gets back "Permission denied". > The nfsd(8) man page implies that this shouldn't be necessary; adding > -mapall=nobody:nobody or -maproot=nobody doesn't fix things either. > why not use -maproot=root? by adding -ro, the client will be able to read but not modify. That's what we do here, the /etc is mounted via unionfs to a md, but that is yet another solution. > In the absence of -maproot and -mapall options, remote accesses by root > will result in using a credential of -2:-2. All other users will be > mapped to their remote credential. If a -maproot option is given, remote > access by root will be mapped to that credential instead of -2:-2. If a > -mapall option is given, all users (including root) will be mapped to > that credential in place of their own. > > Configuration data, tcpdump validation (client=192.168.1.140, > server=192.168.1.51), and syslog data is below. > > Ideas? 
>
> [1]: TFTP works as long as the file it's trying to request (in this case
> /usr/local/freebsd8/boot/pxeboot) has its other/global read bit set,
> otherwise EACCESS is returned; I had to look in the tftpd source to
> figure this out. I'm not sure what the justification is there, given
> that use of -s and/or -u switches credentials to user/group nobody...
>

Only root can read a file with mode 0, so you need to set the read bit for
any non-root user.

> --
> | Jeremy Chadwick                           j...@parodius.com |
> | Parodius Networking               http://www.parodius.com/ |
> | UNIX Systems Administrator          Mountain View, CA, USA |
> | Making life hard for others since 1977.      PGP: 4BD6C0CB |
>
>
> Relevant server configuration bits:
>
> /etc/rc.conf
> ==
> rpcbind_enable="yes"
> rpcbind_flags="-l"
> mountd_enable="yes"
> mountd_flags="-r -l"
> nfs_server_enable="yes"
>
> /etc/exports
> ==
> /usr/local/freebsd8 -network 192.168.1 -mask 255.255.255.0
>
> Permissions
> =
> drwxr-xr-x 22 root wheel 512 Feb 6 12:25 /
> drwxr-xr-x 17 root wheel 512 Feb 12 03:38 /usr
> drwxr-xr-x 15 root wheel 512 Feb 19 10:41 /usr/local
> drwx-- 5 nobody nobody 512 Feb 19 10:42 /usr/local/freebsd8
> drwx-- 7 nobody nobody 1024 Nov 21 08:11 /usr/local/freebsd8/boot
> drwx-- 2 nobody nobody 12800 Nov 21 08:11 /usr/local/freebsd8/boot/kernel
> -r 1 nobody nobody 11492703 Nov 21 07:48 /usr/local/freebsd8/boot/kernel/kernel
>
> tcpdump
> =
> {...snipping TFTP portion...}
> 10:57:20.601313 IP 192.168.1.140.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:30:48:71:60:6b, length 548
> 10:57:20.601442 IP 192.168.1.51.67 > 192.168.1.140.68: BOOTP/DHCP, Reply, length 323
> 10:57:20.601688 IP 192.168.1.140.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:30:48:71:60:6b, length 548
> 10:57:20.601782 IP 192.168.1.51.67 > 192.168.1.140.68: BOOTP/DHCP, Reply, length 323
> 10:57:20.613056 IP 192.168.1.140.1023 > 192.168.1.51.111: UDP, length 76
> 10:57:20.613369 IP 192.168.1.51.111 > 192.168.1.140.1023: UDP, length 28
> 10:57:20.613556 IP 192.168.1.140.1023 > 192.168.1.51.947: UDP, length 84
> 10:57:20.613921 IP 192.168.1.51.947 > 192.168.1.140.1023: UDP, length 60
> 10:57:20.614055 IP 192.168.1.140.1023 > 192.168.1.51.111: UDP, length 76
> 10:57:20.614291 IP 192.168.1.51.111 > 192.168.1.140.1023: UDP, length 28
> 10:57:20.614432 IP 192.168.1.140.4 > 192.168.1.51.2049: 100 lookup fh 1197,150310/6618112 "boot"
> 10:57:20.614458 IP 192.168.1.51.2049 > 192.168.1.140.4: reply ok 28 lookup ERROR: Permission denied
> 10:57:20.615436 IP 192.168.1.140.1022 > 192.168.1.51.947: UDP, length 84
> 10:57:20.615677 IP 192.168.1.51.947 > 192.168.1.140.1022: UDP, length 60
> 10:57:20.615806 IP 192.168.1.140.6 > 192.168.1.51.2049: 100 lookup fh 1197,150310/6618112 "boot"
> 10:57:20.615824 IP 192.168.1.51.2049 > 192.168.1.140.6: reply ok 28 lookup ERROR: Permission denied
> 10:57:20.615929 IP 192.168.1.140.1021 > 192.168.1.51.947: UDP, length 84
> 10:57:20.616164 IP 192.168.1.51.9
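The credential-mapping rules quoted from the man page in this message can be modelled in a few lines of C. This is only an illustrative sketch: `struct export_opts`, `map_cred()` and `NOBODY_ID` are hypothetical names made up for the example, not mountd(8) or kernel code, and the real server tracks more state than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the quoted rules; not the real mountd/nfsd code. */
#define NOBODY_ID (-2)            /* the "-2:-2" credential from the man page */

struct cred { long uid; long gid; };

struct export_opts {
    bool has_maproot;             /* -maproot=... present on the exports line */
    struct cred maproot;
    bool has_mapall;              /* -mapall=... present on the exports line */
    struct cred mapall;
};

/* Return the credential the server would use for a request from `remote`. */
struct cred map_cred(const struct export_opts *opts, struct cred remote)
{
    assert(opts != NULL);
    if (opts->has_mapall)         /* everyone, root included, is remapped */
        return opts->mapall;
    if (remote.uid == 0) {        /* remote root */
        if (opts->has_maproot)
            return opts->maproot;
        return (struct cred){ NOBODY_ID, NOBODY_ID };
    }
    return remote;                /* other users keep their remote credential */
}
```

With no options at all, a request from remote root comes out as -2:-2. The exports line shown in this message has no mapping options, so for a client that sends root credentials this is the case that applies.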
Re: RELENG_8 -- NFSv3 credentials/permissions issue
> On Sun, Feb 21, 2010 at 09:25:45AM +0200, Daniel Braniss wrote:
> > > I'm willing to bet this is something simple I've overlooked, but I'm out
> > > of ideas. Client is 8.0-RELEASE i386, server is 8.0-STABLE amd64
> > > (kernel/world 2010/01/16). NFS version used is v3. Server filesystem
> > > is UFS2.
> > at boot time, the NFS is V2!, if the server is FreeBSD it can be upgraded
> > later in the boot process to V3
> > >
> > > Client configuration is off-kilter: it's a PXE booted machine. Initial
> > > PXE booting uses TFTP, then switches to NFS to load the kernel and
> > > kernel modules. The TFTP part works, with a caveat[1], but the NFS
> > > portion fails.
> > TFTP is as old as the Internet, so it mostly works, and security was in
> > diapers,
> > so the T for trivial also means un-secure :-)
> > >
> > > With NFS, I'm forced to change permissions on all the exported
> > > files/directories to be 0644/0755 (specifically, setting other/global
> > > read/write access) otherwise the client gets back "Permission denied".
> > > The nfsd(8) man page implies that this shouldn't be necessary; adding
> > > -mapall=nobody:nobody or -maproot=nobody doesn't fix things either.
> > >
> > why not use -maproot=root?
> > by adding -ro, the client will be able to read but not modify.
> > That's what we do here, the /etc is mounted via unionfs to a md, but
> > that is yet another solution.
>
> I'll have to try that (shouldn't take me long), but I remember messing
> with -maproot and -mapall both and wasn't able to get anywhere. I'll
> try again and report back.
>
> > > Configuration data, tcpdump validation (client=192.168.1.140,
> > > server=192.168.1.51), and syslog data is below.
> > >
> > > Ideas?
> > >
> > > [1]: TFTP works as long as the file it's trying to request (in this case
> > > /usr/local/freebsd8/boot/pxeboot) has its other/global read bit set,
> > > otherwise EACCESS is returned; I had to look in the tftpd source to
> > > figure this out. I'm not sure what the justification is there, given
> > > that use of -s and/or -u switches credentials to user/group nobody...
> > >
> > only root can read a file with mode 0, so you need to set the read bit for
> > any non root user.
>
> I'm not sure if you're referring to NFS here, or my TFTP comment. My
> TFTP comment should be discussed elsewhere -- it's broken/odd behaviour,
> but the workaround for TFTP (to set the file permissions to 0644 for
> read) I'm fine with -- it's TFTP! :-)
>

If the owner does not have read permission, it won't be able to read the
file, even if the other read bits are set:

	% date > 0
	% chmod 04 0
	% cat 0
	cat: 0: Permission denied
	% chmod 040 0
	% cat 0
	cat: 0: Permission denied
	% chmod 0400 0
	% cat 0
	Sun Feb 21 11:47:32 IST 2010
	%

This answers the TFTP problem.

> With regards to NFS: none of the files below are mode . The request
> made via NFS should have gotten "translated" to being done by
> nobody:nobody on the NFS server, since there's no -mapall or -maproot
> line in the exports; user nobody has read access to everything shown
> below, so "Permission denied" makes no sense.
>

As I mentioned above, maybe not so clearly, the initial NFS transactions
are done via NFS/V2, which is problematic/broken[1], so the access
permissions are probably not exactly what we expect.

[1]: rm /any-file in a read-only exported fs will hang the client

> > > Permissions
> > > =
> > > drwxr-xr-x 22 root wheel 512 Feb 6 12:25 /
> > > drwxr-xr-x 17 root wheel 512 Feb 12 03:38 /usr
> > > drwxr-xr-x 15 root wheel 512 Feb 19 10:41 /usr/local
> > > drwx-- 5 nobody nobody 512 Feb 19 10:42 /usr/local/freebsd8
> > > drwx-- 7 nobody nobody 1024 Nov 21 08:11 /usr/local/freebsd8/boot
> > > drwx-- 2 nobody nobody 12800 Nov 21 08:11 /usr/local/freebsd8/boot/kernel
> > > -r 1 nobody nobody 11492703 Nov 21 07:48 /usr/local/freebsd8/boot/kernel/kernel
> > >
> > > tcpdump
> > > =
> > > {...snipping TFTP portion...}
> > > 10:57:20.601313 IP 192.168.1.140.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:30:48:71:60:6b, length 548
> > > 10:57:20.601442 IP 192.168.1.51.67 > 192.168.1.140.68: BOOTP/DHCP, Reply, length 323
>
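The chmod transcript in this message follows from the classic UNIX rule that exactly one permission class is consulted, with the owner class checked first. A minimal sketch of that rule, assuming a non-root caller, no ACLs and no supplementary groups (`may_read()` is a made-up name for this example, not a system call):

```c
#define _XOPEN_SOURCE 700   /* ask for the POSIX/XSI names in <sys/stat.h> */
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Simplified model of the classic UNIX read check for a non-root caller.
 * Assumptions: no ACLs, no supplementary groups, caller is not root.
 */
bool may_read(mode_t mode, uid_t file_uid, gid_t file_gid,
              uid_t uid, gid_t gid)
{
    assert(uid != 0);                 /* root bypasses these bits */
    if (uid == file_uid)              /* owner: only the owner bit counts */
        return (mode & S_IRUSR) != 0;
    if (gid == file_gid)              /* group member: only the group bit */
        return (mode & S_IRGRP) != 0;
    return (mode & S_IROTH) != 0;     /* everyone else: the "other" bit */
}
```

For the owner, `may_read(0004, ...)` and `may_read(0040, ...)` are both false and only `may_read(0400, ...)` is true, which matches the three `cat` results above.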
Re: RELENG_8 -- NFSv3 credentials/permissions issue
> On Sun, Feb 21, 2010 at 12:02:28PM +0200, Daniel Braniss wrote:
> > > I'm not sure if you're referring to NFS here, or my TFTP comment. My
> > > TFTP comment should be discussed elsewhere -- it's broken/odd behaviour,
> > > but the workaround for TFTP (to set the file permissions to 0644 for
> > > read) I'm fine with -- it's TFTP! :-)
> > >
> > if the owner does not have read permission, it won't be able to read the file,
> > no matter that the other read bits are enabled.
> >
> > % date > 0
> > % chmod 04 0
> > % cat 0
> > cat: 0: Permission denied
> > % chmod 040 0
> > % cat 0
> > cat: 0: Permission denied
> > % chmod 0400 0
> > % cat 0
> > Sun Feb 21 11:47:32 IST 2010
> > %
> > this answers the TFTP problem.
>
> Actually it doesn't. Are you familiar with C? If so, have a look at
> this piece of the source code (src/libexec/tftpd/tftpd.c):
>
> int
> validate_access(char **filep, int mode)
> {
> ...
>         if (stat(filename, &stbuf) < 0)
>                 return (errno == ENOENT ? ENOTFOUND : EACCESS);
>         if ((stbuf.st_mode & S_IFMT) != S_IFREG)
>                 return (ENOTFOUND);
>         if (mode == RRQ) {
>                 if ((stbuf.st_mode & S_IROTH) == 0)
>                         return (EACCESS);
>         } else {
>                 if ((stbuf.st_mode & S_IWOTH) == 0)
>                         return (EACCESS);
>         }
> ...
>         return (0);
> }
>
> This function is called whenever there's a request of any sort via TFTP
> (such as file retrieval (read) or file storage (write)). In this
> context, RRQ = "read request".
>
> The above code explicitly requires the global/other read (or write, if
> the request is to write data) bit be set on the files being
> requested/written to, otherwise EACCESS ("Access Denied") is returned to
> the client. This is *regardless* of who owns the file. See the stat(2)
> man page for verification of S_IROTH and S_IWOTH bits.
>
> This is justified *unless* UID switching is present -- meaning, if the
> -s option (and/or -u) is used. If -s is used but no -u is specified,
> the daemon switches to user "nobody" by default. But regardless of what
> user the daemon switches to, its code still explicitly requires the
> global read or global write bits be set on the files.
>
> IMHO, the above permissions checks should be removed if -s is in effect.
>

The code is only useful if running as root (and questionable even then).
I agree, the code is useless; it should use access(2), but tftpd predates
it :-(

> > > With regards to NFS: none of the files below are mode . The request
> > > made via NFS should have gotten "translated" to being done by
> > > nobody:nobody on the NFS server, since there's no -mapall or -maproot
> > > line in the exports; user nobody has read access to everything shown
> > > below, so "Permission denied" makes no sense.
> > >
> > as I mentioned before/above, maybe not so clearly, the initial NFS
> > transactions
> > are done via NFS/V2 - which is problematic/broken[1], and so probably
> > the access permissions are not exactly what we expect.
> >
> > [1]: rm /any-file in a read-only exported fs will hang the client
> >
> > > > > Permissions
> > > > > =
> > > > > drwxr-xr-x 22 root wheel 512 Feb 6 12:25 /
> > > > > drwxr-xr-x 17 root wheel 512 Feb 12 03:38 /usr
> > > > > drwxr-xr-x 15 root wheel 512 Feb 19 10:41 /usr/local
> > > > > drwx-- 5 nobody nobody 512 Feb 19 10:42 /usr/local/freebsd8
> > > > > drwx-- 7 nobody nobody 1024 Nov 21 08:11 /usr/local/freebsd8/boot
> > > > > drwx-- 2 nobody nobody 12800 Nov 21 08:11 /usr/local/freebsd8/boot/kernel
> > > > > -r 1 nobody nobody 11492703 Nov 21 07:48 /usr/local/freebsd8/boot/kernel/kernel
>
> Okay, so then you're saying it's a bug of some sort in NFSv2, not NFSv3.
>

Yes.

> But the above (and below, see tcpdump) files are not attempting to be
> removed nor written to -- they're attempting to be read.

I mentioned the rm bug to show that there is at least one well-known
problem, and your problem seems to point to yet another one.

> Should I file a PR for this problem? IMHO, it's a pretty serious
> oversight (it effectively means user/group ownership means jack squat
> with NFSv2).

Well, V2 is quite dead, and I doubt anyone is willing to look into it.
What would be nice is if pxeboot were upgraded to use NFS/V3, before that
becomes obsolete too :-)

danny
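The mode test in the quoted validate_access() can be isolated to show why file ownership and the daemon's user never enter into it. This is a hedged sketch of that one test only: `tftpd_mode_ok()` and `enum tftp_req` are hypothetical names for this example, and the stat(2) call and path handling of the real function are left out.

```c
#define _XOPEN_SOURCE 700   /* ask for S_IFMT and friends in <sys/stat.h> */
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical request type standing in for tftpd's RRQ/WRQ opcodes. */
enum tftp_req { REQ_READ, REQ_WRITE };

/*
 * Mirrors only the mode test from the quoted validate_access():
 * the file must be a regular file with the "other" read (or write) bit
 * set, no matter who owns it or which user the daemon runs as.
 */
bool tftpd_mode_ok(mode_t st_mode, enum tftp_req req)
{
    assert(req == REQ_READ || req == REQ_WRITE);
    if ((st_mode & S_IFMT) != S_IFREG)
        return false;
    if (req == REQ_READ)
        return (st_mode & S_IROTH) != 0;
    return (st_mode & S_IWOTH) != 0;
}
```

A file with mode 0400 is rejected for reading whoever owns it. A check based on access(2), as suggested in the reply, would instead depend on the credentials the daemon is running with.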
ahcich3: Timeout on slot 0 ...
Hi,
with the latest 8-stable I can't boot; it gets stuck with:
...
ahci0: port 0xb880-0xb887,0xb800-0xb803,0xb480-0xb487,0xb400-0xb403,0xb080-0xb09f mem 0xfe7fa800-0xfe7fafff irq 22 at device 31.2 on pci0
ahci0: [ITHREAD]
ahci0: AHCI v1.20 with 4 3Gbps ports, Port Multiplier supported
ahcich0: at channel 0 on ahci0
ahcich0: [ITHREAD]
ahcich1: at channel 1 on ahci0
ahcich1: [ITHREAD]
ahcich2: at channel 4 on ahci0
ahcich2: [ITHREAD]
ahcich3: at channel 5 on ahci0
ahcich3: [ITHREAD]
...
umass0:4:0:-1: Attached to scbus4
(probe0:umass-sim0:0:0:0): TEST UNIT READY. CDB: 0 0 0 0 0 0
(probe0:umass-sim0:0:0:0): CAM status: SCSI Status Error
(probe0:umass-sim0:0:0:0): SCSI status: Check Condition
(probe0:umass-sim0:0:0:0): SCSI sense: NOT READY asc:3a,0 (Medium not present)
(probe0:umass-sim0:0:0:1): TEST UNIT READY. CDB: 0 20 0 0 0 0
(probe0:umass-sim0:0:0:1): CAM status: SCSI Status Error
(probe0:umass-sim0:0:0:1): SCSI status: Check Condition
(probe0:umass-sim0:0:0:1): SCSI sense: NOT READY asc:3a,0 (Medium not present)
(probe0:umass-sim0:0:0:2): TEST UNIT READY. CDB: 0 40 0 0 0 0
(probe0:umass-sim0:0:0:2): CAM status: SCSI Status Error
(probe0:umass-sim0:0:0:2): SCSI status: Check Condition
(probe0:umass-sim0:0:0:2): SCSI sense: NOT READY asc:3a,0 (Medium not present)
(probe0:umass-sim0:0:0:3): TEST UNIT READY. CDB: 0 60 0 0 0 0
(probe0:umass-sim0:0:0:3): CAM status: SCSI Status Error
(probe0:umass-sim0:0:0:3): SCSI status: Check Condition
(probe0:umass-sim0:0:0:3): SCSI sense: NOT READY asc:3a,0 (Medium not present)
ahcich3: Timeout on slot 0
ahcich3: is cs ss rs 0001 tfd 50 serr
run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
ahcich3: Timeout on slot 0
ahcich3: is cs ss rs 0001 tfd 50 serr
ahcich3: Timeout on slot 0
ahcich3: is cs ss rs 0001 tfd 50 serr
run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
...

With a slightly older kernel all is ok:
...
Trying to mount root from nfs:
(probe0:umass-sim0:0:0:0): TEST UNIT READY. CDB: 0 0 0 0 0 0
(probe0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(probe0:umass-sim0:0:0:0): SCSI Status: Check Condition
(probe0:umass-sim0:0:0:0): NOT READY asc:3a,0
(probe0:umass-sim0:0:0:0): Medium not present
(probe0:umass-sim0:0:0:0): Unretryable error
da0 at umass-sim0 bus 0 scbus4 target 0 lun 0
da0: Removable Direct Access SCSI-0 device
da0: 40.000MB/s transfers
da0: Attempt to query device size failed: NOT READY, Medium not present

The only weird thing I see is the
	Trying to mount root from nfs:
line, which does not appear with the failing kernel.

danny
Re: ahcich3: Timeout on slot 0 ...
> Daniel Braniss wrote: > > with latest 8-stable, I can't boot since it's stuck with: > > > the only wierd thing I see, is the > > Trying to mount root from nfs: > > which does not happen in the failing kernel. > > Could you show full verbose boot messages? here it comes ... GDB: no debug ports present KDB: debugger backends: ddb KDB: current backend: ddb SMAP type=01 base= len=0009ec00 SMAP type=02 base=0009ec00 len=1400 SMAP type=02 base=000e4000 len=0001c000 SMAP type=01 base=0010 len=7f58 SMAP type=03 base=7f68 len=e000 SMAP type=04 base=7f68e000 len=00052000 SMAP type=02 base=7f6e len=0002 SMAP type=02 base=fee0 len=1000 SMAP type=02 base=fff0 len=0010 Copyright (c) 1992-2010 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 8.0-STABLE #0 r1589: Tue Feb 23 09:10:52 IST 2010 da...@sunfire:/r+d/obj/sunfire/r+d/stable/8/sys/HUJI amd64 Preloaded elf kernel "/boot/kernel/kernel" at 0x80e8f000. Preloaded elf obj module "/boot/kernel/ahci.ko" at 0x80e8f1c0. Timecounter "i8254" frequency 1193182 Hz quality 0 Calibrating TSC clock ... 
TSC clock: 258570 Hz CPU: Intel(R) Core(TM)2 Duo CPU E6850 @ 3.00GHz (2999.96-MHz K8-class CPU) Origin = "GenuineIntel" Id = 0x6fb Stepping = 11 Features=0xbfebfbff Features2=0xe3fd AMD Features=0x20100800 AMD Features2=0x1 TSC: P-state invariant real memory = 2147483648 (2048 MB) Physical memory chunk(s): 0x1000 - 0x0009afff, 630784 bytes (154 pages) 0x00ebd000 - 0x7ba66fff, 2059051008 bytes (502698 pages) avail memory = 2046427136 (1951 MB) ACPI APIC Table: INTR: Adding local APIC 1 as a target FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs FreeBSD/SMP: 1 package(s) x 2 core(s) cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 APIC: CPU 0 has ACPI ID 1 APIC: CPU 1 has ACPI ID 2 ULE: setup cpu 0 ULE: setup cpu 1 ACPI: RSDP 0xfb790 00014 (v0 ACPIAM) ACPI: RSDT 0x7f68 00040 (v1 _ASUS_ Notebook 1829 MSFT 0097) ACPI: FACP 0x7f680200 00084 (v2 A_M_I_ OEMFACP 1829 MSFT 0097) ACPI: DSDT 0x7f6805c0 087ED (v1 A0827 A0827000 INTL 20060113) ACPI: FACS 0x7f68e000 00040 ACPI: APIC 0x7f680390 0006C (v1 A_M_I_ OEMAPIC 1829 MSFT 0097) ACPI: MCFG 0x7f680400 0003C (v1 A_M_I_ OEMMCFG 1829 MSFT 0097) ACPI: SLIC 0x7f680440 00176 (v1 _ASUS_ Notebook 1829 MSFT 0097) ACPI: OEMB 0x7f68e040 00081 (v1 A_M_I_ AMI_OEM 1829 MSFT 0097) ACPI: HPET 0x7f688db0 00038 (v1 A_M_I_ OEMHPET 1829 MSFT 0097) ACPI: GSCI 0x7f68e0d0 02024 (v1 A_M_I_ GMCHSCI 1829 MSFT 0097) MADT: Found IO APIC ID 2, Interrupt 0 at 0xfec0 ioapic0: Routing external 8259A's -> intpin 0 MADT: Interrupt override: source 0, irq 2 ioapic0: Routing IRQ 0 -> intpin 2 MADT: Interrupt override: source 9, irq 9 ioapic0: intpin 9 trigger: level ioapic0 irqs 0-23 on motherboard cpu0 BSP: ID: 0x VER: 0x00050014 LDR: 0x DFR: 0x lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff timer: 0x000100ef therm: 0x0001 err: 0x0001000f pcm: 0x00010400 wlan: <802.11 Link Layer> kbd: new array size 4 kbd1 at kbdmux0 nfslock: pseudo-device mem: null: random: io: hptrr: RocketRAID 17xx/2xxx SATA controller driver v1.2 acpi0: <_ASUS_ Notebook> on 
motherboard PCIe: Memory Mapped configuration base @ 0xe000 ioapic0: routing intpin 9 (ISA IRQ 9) to lapic 0 vector 48 acpi0: [MPSAFE] acpi0: [ITHREAD] ACPI: Executed 1 blocks of module-level executable AML code acpi0: Power Button (fixed) acpi0: wakeup code va 0xff806000 pa 0x4000 AcpiOsDerivePciId: \_SB_.PCI0.SBRG.IELK.RXA0 -> bus 0 dev 0 func 0 AcpiOsDerivePciId: \_SB_.PCI0.SBRG.PIX0 -> bus 0 dev 31 func 0 acpi0: reservation of 0, a (3) failed acpi0: reservation of 10, 7f60 (3) failed ACPI timer: 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 -> 10 Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0 pci_link0:Index IRQ Rtd Ref IRQs Initial Probe 0 10 N 0 3 4 5 6 7 10 11 12 14 15 Validation 0 10 N 0 3 4 5 6 7 10 11 12 14 15 After Disable 0 255 N 0 3 4 5 6 7 10 11 12 14 15 pci_link1:Index IRQ Rtd Ref IRQs Initial Probe 0 11 N 0 3 4 5 6 7 10 11 12 14 15 Validation 0 11 N 0 3 4 5 6 7 10 11 12 14 15 After Disable 0 255 N 0 3 4 5 6 7 10 11 12 14 15 pci_link2:
Re: em0 freezes on ZFS server
> On Fri, 26 Feb 2010 13:31:38 +0100 Gerrit Kühn
> wrote about Re: em0 freezes on ZFS server:
>
> GK> JC> Note how close the "current" value is to that of "total". I'm not
> GK> JC> too surprised you're seeing what you are as a result of this.
> GK> JC> What on earth is this machine doing at all times?
>
> GK> Is there any way I could find out what is actually using these buffers?
>
> Sorry for replying to my own email:
> At least in my case I found out what is eating the buffers: nfsd does!
> The buffers stop increasing as soon as I stop nfsd. However, they start
> increasing as soon as I start nfsd again.
> Are there any ideas how to fix this? Downgrading back to 7-stable is not
> really an easy task as far as I know, and I need the server to run without
> having to reboot it once or twice a day...

I want to add some spices to this stew: :-)

I have this big server (> 10 TB) which was running pretty much without
major problems, until one morning it started panicking because of some
'ZFS * credential *' issue. Since this server is used by many and uptime
is a priority, I upgraded it to 8-stable. The panic went away, one problem
solved.

A few days later it hung, and it's now hanging every few days.
Most of the hangs are because there is no network, but the NIC is bce,
not em! I doubled kern.ipc.nmbclusters; let's see what happens ...

netstat -m:
23066/6634/29700 mbufs in use (current/cache/total)
22072/5942/28014/51200 mbuf clusters in use (current/cache/total/max)
22021/2939 mbuf+clusters out of packet secondary zone in use (current/cache)

Hope this helps in finding a cure,
	danny
Re: em0 freezes on ZFS server
> On Fri, 26 Feb 2010 15:04:37 +0200 Daniel Braniss
> wrote about Re: em0 freezes on ZFS server :
>
> DB> > At least in my case I found out what is eating the buffers: nfsd
> DB> > does! The buffers stop increasing as soon as I stop nfsd. However,
> DB> > they start increasing as soon as I start nfsd again.
> DB> > Are there any ideas how to fix this? Downgrading back to 7-stable is
> DB> > not really an easy task as far as I know, and I need the server to
> DB> > run without having to reboot it once or twice a day...
>
> DB> I want to add some spices to this stew: :-)
>
> You're welcome. :-)
>
> DB> Some few day later it hung, and it's now hanging every few days.
> DB> Most of the hangs are because there is no network, but the NIC is bce
> DB> not em! I doubled kern.ipc.nmbclusters and lets see what happens ...
>
> Do you have nfsd running and serving clients? If so, we should maybe
> change the topic to something like "possible nfs mbuf leakage"...
>

Its only purpose in life is to be an NFS server, but I wouldn't exclude
ZFS from the equation yet. I have other NFS servers, not doing ZFS, and I
don't see this there.

> DB> 23066/6634/29700 mbufs in use (current/cache/total)
>
> My server is at 22k now, and the buffer number is still increasing every
> few seconds...
> Can you monitor your mbuf usage and report if it grows?
>

I am, and in the last 2 hours it grew by about 300. It does oscillate,
i.e. it grows some, then it goes down, but it seems that the low always
increases. When I have enough data I'll plot it.

Cheers,
	danny
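Monitoring of this kind can be scripted by sampling `netstat -m` at intervals and pulling the numbers out of its first line. A minimal parsing sketch follows; `struct mbuf_usage` and `parse_mbuf_line()` are made-up names, and the line format is assumed to be exactly the one quoted in this thread (other releases may print it differently).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct mbuf_usage { long current, cache, total; };

/*
 * Parse the first line of `netstat -m` as quoted in this thread, e.g.
 *   "23066/6634/29700 mbufs in use (current/cache/total)"
 * Returns true only if all three numbers and the "mbufs in use" text match.
 */
bool parse_mbuf_line(const char *line, struct mbuf_usage *out)
{
    int end = 0;   /* set by %n only if the whole pattern matched */
    int n;

    assert(line != NULL && out != NULL);
    n = sscanf(line, "%ld/%ld/%ld mbufs in use%n",
               &out->current, &out->cache, &out->total, &end);
    return n == 3 && end > 0;
}
```

The `%n` guard is there so that the similar-looking "mbuf clusters" line, which has four numbers, is not mistaken for the mbuf line.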
Re: em0 freezes on ZFS server
> when I have enough data i'll plot it.
>

Check:
	ftp://ftp.cs.huji.ac.il/users/danny/freebsd/plot.ps
x is seconds, y is the current mbuf count.

> Cheers,
> 	danny
Re: mbuf leakage with nfs/zfs? (was: em0 freezes on ZFS server)
> On Fri, 26 Feb 2010 17:41:02 +0200 Daniel Braniss
> wrote about Re: em0 freezes on ZFS server :
>
> DB> check:
> DB> ftp://ftp.cs.huji.ac.il/users/danny/freebsd/plot.ps
> DB> x is seconds, y is mbus current.
>
> Looks not as bad as mine. I had 37k when I rebooted the machine some
> minutes ago (and it's basically idle, just serving a few nfs clients that
> don't do much).
> But from the values Jeremy has posted and from my own comparisons here I
> would think that something like 5k of mbuf clusters would be normal for my
> machine (and probably also for yours).
>
> Some more info from my side:
> In the meantime I also tried a different network interface. The
> nfe-interface that is onboard causes the same problems, so it is probably
> not an em-specific issue.
> Furthermore I found this via Google:
> <http://lists.freebsd.org/pipermail/freebsd-current/2009-December/014062.html>.

I'll have to do some packet snooping to check whether it's TCP or UDP NFS
traffic, since some of the clients are Linux ...

> I patched and recompiled my kernel with this, just to try it out. Right
> now I have
>
> 2264/1321/3585 mbufs in use (current/cache/total)
> 1239/1017/2256/65000 mbuf clusters in use (current/cache/total/max)
> 1239/809 mbuf+clusters out of packet secondary zone in use (current/cache)
>
> but the uptime is only 12min so far. In some hours I'll know for certain
> if this patch has anything to do with the problem.

At the moment there is not much activity, but if you check the latest
plot.ps you will see that the bottom is slowly increasing, so my bet is
that there must be some leakage!

cheers
	danny
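The "bottom is slowly increasing" observation can be turned into a simple check on a series of sampled "current" values: split the series into windows and see whether each window's minimum is higher than the previous one's. The function names and the window approach are assumptions of this sketch, not something used in the thread, and a rising floor is only a hint of a leak, not proof.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Smallest value in samples[0..n-1]; n must be at least 1. */
static long window_min(const long *samples, size_t n)
{
    long lo = samples[0];
    for (size_t i = 1; i < n; i++)
        if (samples[i] < lo)
            lo = samples[i];
    return lo;
}

/*
 * Split the series into consecutive windows of `win` samples and report
 * whether the per-window minimum rises strictly from each window to the
 * next. Trailing samples that do not fill a window are ignored.
 */
bool floor_is_rising(const long *samples, size_t n, size_t win)
{
    assert(samples != NULL && win > 0);
    if (n < 2 * win)                 /* need at least two full windows */
        return false;
    long prev = window_min(samples, win);
    for (size_t off = win; off + win <= n; off += win) {
        long cur = window_min(samples + off, win);
        if (cur <= prev)
            return false;
        prev = cur;
    }
    return true;
}
```

The window should be long enough to cover at least one full load cycle, so that each window contains a genuine idle low point.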
Re: mbuf leakage with nfs/zfs?
> On Fri, 26 Feb 2010 23:12:39 +0100 Willem Jan Withagen
> wrote about Re: mbuf leakage with nfs/zfs?:
>
> WJW> > DB> I'll have to do some packet snooping to check if it's TCP or
> WJW> > DB> UDP nfs traffic, since some of the clients are Linux ...
>
> WJW> > I have Linux clients, too. Some use tcp, some udp.
>
> WJW> I have Linux and FreeBSD clients running. The build system runs on
> WJW> Linux. All Linux's are UDP
>
> Another shot in the dark:
> After upgrading the server, all my Linux clients hang with "stale nfs
> dir/file handle/whatever". I was not able to umount them (not even
> forcefully). I had to use either lazy forceful umount (-fl) or reboot. Some
> of these clients are still hanging around, because they are physically
> hard to access (clean room installs etc.). Maybe these clients still try to
> establish connections that eat up the buffers and never come back?

I doubt it, but here is another shot:
are we all running samba? I'm asking because the lock manager keeps dying
and ...

cheers,
	danny
PS: I dropped Jack from the CC, I think em is innocent :-)
Re: mbuf leakage with nfs/zfs?
> On Sat, 27 Feb 2010 09:24:10 +0200 Daniel Braniss
> wrote about Re: mbuf leakage with nfs/zfs? :
>
> DB> I doubt it, but here is another shot:
> DB> are we all running samba? I'm asking because the lock manager keeps
> DB> dying and ...
>
> Nope, no samba on my side. I am running lockd and statd on the server, but
> stopping them does not change anything. All clients are using option
> nolock anyway.
>

It was a shot in the dark.
Anyway, I am running tests on an 'unused' server, with only me using it
to 'make world', and it's leaking.

cheers,
	danny
Re: mbuf leakage with nfs/zfs?
> On Sat, 27 Feb 2010 11:14:56 +0200 Daniel Braniss
> wrote about Re: mbuf leakage with nfs/zfs? :
>
> DB> anyways, I am running tests on an 'unused' server, only me using it to
> DB> 'make world'
> DB> and it's leaking.
>
> Hm, I've got a server with 8-PRE from somewhen in Nov09 that is serving
> nfs from zfs fine and shows no leakage...
>
>
> cu
> Gerrit

The binary search has started!
Sorry, I have to go now :-) [really], but I should be back in a couple of
hours. Let me know if you manage to pin it down, else I can continue.

danny
Re: mbuf leakage with nfs/zfs?
> On Sat, 27 Feb 2010 12:26:02 +0200 Daniel Braniss
> wrote about Re: mbuf leakage with nfs/zfs? :
>
>
> DB> > Hm, I've got a server with 8-PRE from somewhen in Nov09 that is
> DB> > serving nfs from zfs fine and shows no leakage...
>
> DB> the binary search has started!
>
> After considering the last email from Willem: My 8-PRE server does not
> have udp Linux clients, only Linux with tcp. If indeed Linux with udp is
> causing the problem, it may very well even be in 8-PRE, and I just did not
> see it so far.

I have been running 8-rel for the last few hours, and the only client is
another 8-stable machine. Furthermore, there is no ZFS, just plain UFS,
and the leak is there!
I am now trying 8-rc2 but will check in the morning; it is, after all,
Saturday night :-)

danny
Re: mbuf leakage with nfs/zfs?
> On Sat, Feb 27, 2010 at 10:53:00PM +0100, Willem Jan Withagen wrote:
> > On 27-2-2010 21:32, Eirik Øverby wrote:
> > >I've had a discussion with some folks on this for a while. I can easily
> > >reproduce this situation by mounting a FreeBSD ZFS filesystem via
> > >NFS-UDP from an OpenBSD machine. Telling the OpenBSD machine to use TCP
> > >instead of UDP makes the problem go away.
> > >
> > >Other FreeBSD systems mounting the same share, either using UDP or TCP,
> > >does not cause the problem to show up.
> > >
> > >A patch was suggested by Rick Macklem, but that did not solve the issue:
> > ><http://lists.freebsd.org/pipermail/freebsd-current/2009-December/014181.html>
> > >
> >
> > I concur.
> > Everything in my network is now on TCP, and there is no mbuf leakage.
> > I just don't get over the 5500 mark, no matter what I throw at it.
> >
> > I do feel that TCP is not as well performing on a local net with Linux,
> > hence the choice for UDP. But TCP is workable as next best.
>
> I'm pulling in Robert Watson, who has some familiarity with the UDP
> stack/code in FreeBSD. I'm not sure he'll be a sufficient source of
> knowledge for this specific issue since it appears (?) to be specific to
> NFS; Rick Macklem would be a better choice, but as reported, he's MIA.
>
> Robert, are you aware of any changes or implementation issues which
> might cause excessive (read: leaking) mbuf use under UDP-based NFS? Do
> you know of a way folks could determine the source of the leak, either
> via DDB or while the system is live?

I have been running some tests in a controlled environment.

Server and client are both 64-bit Xeon/X5550 @ 2.67GHz with 16GB of memory:
	FreeBSD/SMP: 2 package(s) x 4 core(s) x 2 SMT threads

The client is running the latest 8.0-stable.
The load is created by running 'make -j32 buildworld' and sleeping 150
seconds in between runs; this is the straight line you will see in the
graphs. Both the src and obj directories are NFS mounted from the server,
which exports regular UFS.

When the server is running 7.2-stable no leakage is seen:
	see ftp://ftp.cs.huji.ac.il/users/danny/freebsd/mbufs/{tcp,udp}-7.2.ps
When the server is running 8.0-stable:
	see ftp://ftp.cs.huji.ac.il/users/danny/freebsd/mbufs/{tcp,udp}-8.0.ps
you can see that udp is leaking!

cheers,
	danny
ps: I think the subject should be changed again, removing zfs ...
Re: mbuf leakage with nfs/udp (was: mbuf leakage with nfs/zfs)
>
> On Feb 28, 2010, at 12:11 PM, Daniel Braniss wrote:
>
> >> I'm pulling in Robert Watson, who has some familiarity with the UDP
> >> stack/code in FreeBSD. I'm not sure he'll be a sufficient source of
> >> knowledge for this specific issue since it appears (?) to be specific to
> >> NFS; Rick Macklem would be a better choice, but as reported, he's MIA.
> >>
> >> Robert, are you aware of any changes or implementation issues which
> >> might cause excessive (read: leaking) mbuf use under UDP-based NFS? Do
> >> you know of a way folks could determine the source of the leak, either
> >> via DDB or while the system is live?
> >
> > I have been running some tests in a controlled environment.
> >
> > server and client are both 64bit Xeon/X5550 @ 2.67GHz with 16Gb of memory
> > FreeBSD/SMP: 2 package(s) x 4 core(s) x 2 SMT threads
> >
> > the client is running latest 8.0 stable
> > the load is created by running 'make -j32 buildworld' and sleeping 150 sec.
> > in between runs, this is the straight line you will see in the graphs.
> > Both the src and obj directories are NFS mounted from the server, regular
> > UFS.
> >
> > when server is running 7.2-stable no leakage is seen.
> > see ftp://ftp.cs.huji.ac.il/users/danny/freebsd/mbufs/{tcp,udp}-7.2.ps
> > when server is running 8.0-stable
> > see ftp://ftp.cs.huji.ac.il/users/danny/freebsd/mbufs/{tcp,udp}-8.0.ps
> > you can see that udp is leaking!
> >
> > cheers,
> > danny
> > ps: I think the subject should be changed again, removing zfs ...
>
> This type of problem (occurs with one client but not another) is almost
> always the result of the access pattern of a particular client triggering
> a specific (and perhaps single) bug in error-handling. For example, we
> might not be properly freeing the received request when generating an
> EPERM in an edge case. The hard bit is identifying which it is. If it's
> reproducible with UDP, then usually the process is:
>
> - Build a minimal test case to trigger the problem -- ideally with as
>   little complexity as possible.
> - Run netstat -m at the beginning of the test and the end of the test on
>   the server to count the number of leaked mbufs
> - Run wireshark throughout the test
> - Walk the wireshark trace looking for some error that occurs at about the
>   same or slightly lower number of times than the number of mbufs leaked
> - Iterate, narrowing the test case until it's either obvious exactly
>   what's going on, or you've identified a relatively constrained code path
>   and can just spot the bug by reading the code
>
> It's almost certainly one or a small number of very specific RPCs that are
> triggering it -- maybe OpenBSD does an extra lookup, or stat, or
> something, on a name that may not exist anymore, or does it sooner than
> the other clients. Hard to say, other than to wave hands at the
> possibilities.
>
> And it may well be we're looking at two bugs: Danny may see one bug,
> perhaps triggered by a race condition, but it may be different from the
> OpenBSD client-triggered bug (to be clear: it's definitely a FreeBSD bug,
> although we might only see it when an OpenBSD client is used because
> perhaps OpenBSD also has a bug or feature).
>
> Robert

Well, I have further reduced the problem: it happens with NFS/UDP writes.
I'll try the wireshark road, but I'm very rusty with RPC. The other road
is to check the changes: my oldest kernel is from late October (RC2),
where it's happening, while Gerrit tried 8-pre from November and it
worked, so it will be fun trying to nail it down :-)

cheers,
	danny
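The counting steps in Robert's procedure can be sketched as two small helpers: one for the number of mbufs leaked over a test run, one for deciding whether an error seen in the trace occurs often enough to be a suspect. Everything here is an assumption made for illustration, in particular the `slack_pct` tolerance and the choice to read "about the same" as "not more than"; none of it comes from the thread or from any FreeBSD tool.

```c
#include <assert.h>
#include <stdbool.h>

/* mbufs leaked over a test run: `netstat -m` "current" after minus before. */
long leaked_mbufs(long before, long after)
{
    return after - before;
}

/*
 * An error seen in the trace is a suspect if it occurs as often as, or
 * slightly less often than, the number of leaked mbufs.
 * `slack_pct` (how far below still counts as "slightly lower") is an
 * assumption of this sketch, not a figure given in the thread.
 */
bool is_suspect(long error_count, long leaked, int slack_pct)
{
    assert(slack_pct >= 0 && slack_pct <= 100);
    if (leaked <= 0 || error_count <= 0)
        return false;
    long lowest = leaked - (leaked * slack_pct) / 100;
    return error_count >= lowest && error_count <= leaked;
}
```

With 500 leaked mbufs and a 10% tolerance, an error seen 480 times would be flagged while one seen 100 times would not. If each failing request leaked more than one mbuf, the comparison would have to be scaled accordingly.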
Re: mbuf leakage with nfs/zfs?
> On Sat, 27 Feb 2010, Jeremy Chadwick wrote:
>
> >> I concur.
> >> Everything in my network is now on TCP, and there is no mbuf leakage.
> >> I just don't get over the 5500 mark, no matter what I throw at it.
> >>
> >> I do feel that TCP is not as well performing on a local net with Linux,
> >> hence the choice for UDP. But TCP is workable as next best.
> >
> > NFS; Rick Macklem would be a better choice, but as reported, he's MIA.
> >
>
> Not exactly MIA, but only able to read email from time to time at this
> point. I don't know when I'll be able to do more than that.
>
> So, it does sound like it is UDP specific. Robert mentioned one scenario,
> which was an infrequently executed code path that is being tickled and it
> has a missing m_freem().
>
> One thing someone could try is switching to the experimental nfs server
> ("-e" on both mountd and nfsd) and see if the leak goes away. If it does
> go away, it is almost certainly the above in the regular nfs server code.
>

running with the experimental nfs server all is ok!
(at least I can't see any mbuf leakage :-)

so now that we can assume that the problem is in NFS/UDP writes via the
classic nfsserver, where to look?

> If it doesn't go away, the problem is more likely in the krpc or the
> generic udp code. (When I looked at svc_dg.c, I could only spot one
> possible leak and you've already determined that patch doesn't help.)
> The other big difference when using udp on the FreeBSD8 krpc is the
> reply cache code. I seem to recall it's an lru cache with a fixed upper
> bound, but it might be broken and leaking.
>
> If you change the server to set sp_rcache = NULL in the initialization
> function in sys/nfsserver/nfs_srvkrpc.c, I think that disables the replay
> cache. You wouldn't want to run this way in production, but it would
> determine if the leak is in it.
>
> Change the 3 lines in nfsrv_init() to:
>	nfsrv_pool->sp_rcache = NULL;
>	nfsrv_pool->sp_assign = NULL;
>	nfsrv_pool->sp_done = NULL;
>
> and I think the krpc replay cache will be disabled.
>
> Good luck with it and please report back if you get to try the above.
>
> I'll get back to committing etc one of these days, rick

just keep sending insights/pointers and enjoy life

danny
Re: mbuf leakage with nfs/udp (was mbuf leakage with nfs/zfs?)
> On Tue, 2 Mar 2010, Daniel Braniss wrote:
>
> > running with the experimental nfs server all is ok!
> > (at least I can't see any mbuf leakage :-)
> >
> > so now that we can assume that the problem is in NFS/UDP writes via
> > classic nfsserver, where to look?
> >
>
> It might also be the krpc reply cache, since the experimental server
> isn't using it (nfsv4 requires a rather twisted reply cache and it was
> easier to just use that one for nfsv2,3 for the experimental server,
> as well).
>
> >> If it doesn't go away, the problem is more likely in the krpc or the
> >> generic udp code. (When I looked at svc_dg.c, I could only spot one
> >> possible leak and you've already determined that patch doesn't help.)
> >> The other big difference when using udp on the FreeBSD8 krpc is the
> >> reply cache code. I seem to recall it's an lru cache with a fixed upper
> >> bound, but it might be broken and leaking.
> >>
> >> If you change the server to set sp_rcache = NULL in the initialization
> >> function in sys/nfsserver/nfs_srvkrpc.c, I think that disables the replay
> >> cache. You wouldn't want to run this way in production, but it would
> >> determine if the leak is in it.
> >>
> >> Change the 3 lines in nfsrv_init() to:
> >>	nfsrv_pool->sp_rcache = NULL;
> >>	nfsrv_pool->sp_assign = NULL;
> >>	nfsrv_pool->sp_done = NULL;
> >>
> >> and I think the krpc replay cache will be disabled.
> >>
>
> If someone gets a chance to try the above (not in production mode:-),
> it will determine if the problem is in the reply cache or the nfs server's
> write code.
>
> >> Good luck with it and please report back if you get to try the above.
> >>
>
> Thanks for trying the experimental server. It is getting narrowed down,
> due to everyone's work on it.
>

disabling the krpc reply cache does it, no visible damage. Somehow
this reminds me of my old 1970 Beetle: parts would fall off but it would
continue working :-)

where to go from here?
danny
Re: mbuf leakage with nfs/udp (was mbuf leakage with nfs/zfs?)
> On Wed, 3 Mar 2010, Daniel Braniss wrote:
>
> > disabling the krpc reply cache does it, no visible damage. Somehow
> > this reminds me of my old 1970 beetle, parts would fall off but it would
> > continue working :-)
> > where to go from here?
> >
> Ok, so it sounds like the leak is in the krpc reply cache code, if I
> understand this? (ie. you are running the regular server with the reply
> cache disabled and the UDP client mounts aren't causing the leak.)

correct. The interesting side effect is that I can't see any negative
issues when disabling the cache.

>
> Good work on tracking this down!
>

it was a coordinated effort :-)

> I guess the next step is to look through the code for the leak. I'll
> do that someday, but if anyone else is inspired to do so, they are
> more than welcome.:-)
>
> Thanks for working through this, rick

thank you! I have a vested interest in having this fixed. On the other
hand, nfsd seems ok: I have been running it now on a semi-production
server and it's holding up quite nicely. The cache seems not up to
expectations:

store-mg-03# nfsstat -se
Server Info:
Getattr SetattrLookup Readlink Read WriteCreateRemove
48176764262687 12582599 19732 4225907 9186574780793818837
Rename Link Symlink Mkdir Rmdir Readdir RdirPlusAccess
7623 160 27753 59551 59552118216 0 1992779
MknodFsstatFsinfo PathConfCommit LookupP SetClId SetClIdCf
097900519 0 1644267 0 0 0
Open OpenAttr OpenDwnGr OpenCfrm DelePurge DeleRet GetFH Lock
0 0 0 0 0 0 0 0
LockT LockU CloseVerify NVerify PutFH PutPubFH PutRootFH
0 0 0 0 0 0 0 0
Renew RestoreFHSaveFH Secinfo RelLckOwn V4Create
0 0 0 0 0 0
Server:
RetfailedFaults Clients
0 0 0
OpenOwner Opens LockOwner LocksDelegs
0 0 0 0 0
Server Cache Stats:
Inprog Idem Non-idemMisses CacheSize TCPPeak
307 0 297 80943198 0 0

danny
Re: mbuf leakage with nfs/zfs?
> On Tue, 2 Mar 2010, Daniel Braniss wrote:
>
> > just keep sending insights/pointers and enjoy life
> >
>
> You could try this patch for sys/rpc/replay.c. Completely untested and
> just typed into email (so don't give it to "patch", just edit the file).
>
> - try adding these 2 lines just before the end of replay_setreply() in
>   sys/rpc/replay.c:
>
> -	}
> +	} else if (m)
> +		m_freem(m);
>	mtx_unlock(&rc->rc_lock);
> }
>
> It's the only place I can see in replay.c that might leak, rick
>

this is what I did:

--- a/sys/rpc/replay.c Mon Mar 01 18:29:54 2010 +0200
+++ b/sys/rpc/replay.c Fri Mar 05 09:24:17 2010 +0200
@@ -243,6 +243,9 @@
 		rce->rce_repbody = m;
 		if (m)
 			rc->rc_size += m_length(m, NULL);
+	} else if (m) {
+		printf("free m=%p ...\n", m);
+		m_freem(m);
 	}
 	mtx_unlock(&rc->rc_lock);
 }

but it didn't help: the new branch is never triggered.

Thanks for the explanation on the cache, things are beginning to make sense.
If I understand, the reason for this cache is to prevent re-applying an
already performed rpc, which could lead to data corruption.

btw, the list of CCs is rather big, so if anyone feels he would rather be
removed, please let me know.

cheers,
danny
Re: mbuf leakage with nfs/zfs?
[...]
> > but it didn't help, it's not triggered
> >
>
> Hmm, well that's the only place I could see in replay.c that could leak
> (and it's a pretty straightforward piece of code). This is getting
> interesting. Just to confirm where we currently are...
>
> - replay cache disabled --> no leak
> - replay cache enabled (with or without the above patch) --> leak
>

yes and yes.

> I'll take another look, but I doubt the leak is in replay.c so... maybe
> a reply from the cache is somehow handled incorrectly and that causes the
> leak elsewhere? (Just a random hunch at this point.)
>

it works ok in 7.2, so it would be interesting to compare changes ...

> > Thanks for the explanation on the cache, things are beginning to make sense.
> > If I understand, the reason for this cache is to prevent re-applying an
> > already performed rpc, which could lead to data corruption
> >
>
> Yep, you've got it. It is basically a bandaid for the poor transport
> semantics provided by UDP.
>
> Having fun with this one. Thanks for the help, rick
>

I'm glad :-)

danny
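[Editor's sketch] The purpose of the reply cache confirmed above (do not re-apply an RPC that was already performed when a UDP client retransmits it) can be illustrated with a small model. This is only a sketch of the general technique, assuming an LRU cache with a fixed upper bound keyed by client and transaction id; it is not the code in sys/rpc/replay.c, and every name in it is hypothetical.

```python
from collections import OrderedDict


class ReplayCache:
    """Bounded LRU cache of replies, keyed by (client, xid)."""

    def __init__(self, max_entries=4):
        self.max_entries = max_entries
        self.entries = OrderedDict()

    def lookup(self, client, xid):
        """Return the cached reply for a request, or None if unseen."""
        key = (client, xid)
        reply = self.entries.get(key)
        if reply is not None:
            self.entries.move_to_end(key)  # mark as most recently used
        return reply

    def store(self, client, xid, reply):
        """Remember a reply, evicting the least recently used entries."""
        key = (client, xid)
        self.entries[key] = reply
        self.entries.move_to_end(key)
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)


def handle_request(cache, client, xid, execute):
    """Run execute() once per (client, xid); replay the reply on a resend.

    Returns (reply, replayed).
    """
    cached = cache.lookup(client, xid)
    if cached is not None:
        return cached, True
    reply = execute()
    cache.store(client, xid, reply)
    return reply, False


if __name__ == "__main__":
    cache = ReplayCache(max_entries=2)
    applied = []

    def do_write():
        applied.append("write")  # stands in for a non-idempotent operation
        return "ok"

    print(handle_request(cache, "client-a", 7, do_write))
    print(handle_request(cache, "client-a", 7, do_write))  # retransmission
    print("times applied:", len(applied))
```

In a model like this, every stored reply is owned by the cache until it is evicted, so any path that replaces or drops an entry without releasing the old reply would grow memory without bound. That is the kind of path being hunted in this thread.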
is dtrace usable?
hi,
I get

	link_elf_obj: symbol lapic_cyclic_clock_func undefined

when trying

	kldload dtraceall

this is with a fairly recent 8-stable.

I'm trying to help Rick Macklem debug the NFS/UDP problem, and I thought it
would be a good chance to learn dtrace, but :-(

danny
Re: is dtrace usable?
> On Sat, 6 Mar 2010, Daniel Braniss wrote:
>
> > link_elf_obj: symbol lapic_cyclic_clock_func undefined
> >
> > when trying
> > kldload dtraceall
> > this is with a fairly recent 8-stable
> >
> > I'm trying to help Rick Macklem debug the NFS/UDP problem, and I thought it
> > would be a good chance to learn dtrace, but :-(
>
> Take a look at the DTrace configuration information here:
>
>   http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/dtrace.html
>
> And here:
>
>   http://wiki.freebsd.org/DTrace
>
> It looks like options KDTRACE_HOOKS may not be defined in your kernel
> configuration, but there are some other details, such as WITH_CTF=1, that
> you'll also need to make sure are appropriately set.
>
> Robert

I did all that, but booted the wrong kernel. Sorry for the noise.

danny
Re: Fwd: Re: NFS Client error
> Thanks for your kind reply, I'm forwarding it there...
>
> Original Message
> Subject: Re: NFS Client error
> Date: Mon, 08 Mar 2010 23:59:29 +0100
> From: vol...@vwsoft.com
> To: Giulio Ferro
> CC: freebsd-hack...@freebsd.org, freebsd-...@freebsd.org
>
> On 03/08/10 12:16, Giulio Ferro wrote:
> > Freebsd 8 stable amd64
> >
> > It mounts different file systems by NFS (with locking) on a
> > data server directly connected (gigabit) to the server
> >
> > Apache running in several jails on those nfs folders.
> >
> > Now and then I get a huge slow-down. When I look in the logs
> > I get thousands of lines like these:
> > Mar 5 11:50:52 virt2 kernel: vm_fault: pager read error, pid 46487 (httpd)
> > Mar 5 11:50:52 virt2 kernel: pid 46487 (httpd), uid 80: exited on
> > signal 11
> >
> > What should I do?

If the binary (httpd) is on an NFS server and the binary got modified,
this is what usually happens.

my 2c
danny

> Giulio,
>
> it seems this is anyhow not related to network (nfs) operations. It's
> looking like a problem in the VM. I think it makes sense to have a look
> at the httpd.core file if the binary has been linked with debugging
> symbols turned on. Also I think at first, it may not hurt to look at
> vmstat -m output.
>
> You may want to change ${subject} and post to stable@ to drive more
> attention to your problem.
>
> Volker
boot and boot0cfg problem
hi,
I have this SBC that boots off a CF card. When it boots, I can select
the boot partition via F1 or F2 and all is OK.
When I do it via boot0cfg, the 'default_selection' changes correctly, but
the 'active' partition is not changed, so boot ignores it.
I went ahead and changed boot0cfg.c to set the active partition, and now I'm
baffled:

alix-3# ./boot0cfg -v ad0
#   flag     start chs   type       end chs       offset         size
1   0x00      0:  1: 1   0xa5    519: 15:63           63       524097
2   0x80    520:  0: 1   0xa5   1023: 15:63       524160       524160 --+
3   0x00   1023:255:63   0xa5   1023: 15:63      1048320      2951424   |
                                                                        |
version=2.0 drive=0x80 mask=0xf ticks=182 bell=# (0x23)                 |
options=packet,update,nosetdrv                                          |
volume serial ID -800f                                                  |
default_selection=F2 (Slice 2) <----------------------------------------+

so far so good.

alix-3# ./boot0cfg -v -s1 ad0
...
1   0x80      0:  1: 1   0xa5    519: 15:63           63       524097
...
default_selection=F1 (Slice 1)

ok right? but no!

./boot0cfg -v ad0
...
2   0x80    520:  0: 1   0xa5   1023: 15:63       524160       524160
...
default_selection=F1 (Slice 1)

so it seems that someone is preventing changes to the partition table!
btw, this problem was not present in the older boot0 (1.0), where the active
partition flag is ignored.

help needed here!

danny
Re: boot and boot0cfg problem
> On 30.03.2010 12:05, Daniel Braniss wrote:
> > so it seems that someone is preventing changes to the partition table!
> > btw, this problem was not present in older boot0 (1.0) where the active
> > partition flag is ignored.
>
> You can change active partition via gpart(8).
>

Hi Andrey,
I'm sorry, I've reread the manual and can't find the right magic.

btw, boot0cfg does call geom but something seems to be broken.

cheers,
danny
Re: boot and boot0cfg problem
> 30.03.10, 14:03, "Daniel Braniss" :
>
> > > On 30.03.2010 12:05, Daniel Braniss wrote:
> > > > so it seems that someone is preventing changes to the partition table!
> > > > btw, this problem was not present in older boot0 (1.0) where the active
> > > > partition flag is ignored.
> > >
> > > You can change active partition via gpart(8).
> > >
> > Hi Andrey,
> > I'm sorry, I've reread the manual, and can't find the right magic.
>
> Yes, I also don't remember where it is documented. Only in g_part_mbr.c :)
> Try this:
> # gpart set -a active -i 1 ada2
> This will set the first partition on ada2 active:
> # gpart show ada2
> =>        63  1250263665  ada2  MBR  (596G)
>           63    40965687     1  !7  [active]  (20G)
>     40965750  1209292875     2  !7  (577G)
>   1250258625        5103        - free -  (2.5M)
>
> > btw, boot0cfg does call geom but something seems to be broken.
> I'll look at the boot0cfg code today and probably make a patch.

ok, that worked! Now if you can get boot0cfg to work, that would really be nice.

thanks,
danny
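[Editor's sketch] The "flag" column that boot0cfg prints and the [active] attribute that gpart sets both refer to the first byte of a slice's entry in the MBR partition table: four 16-byte entries starting at byte offset 446 of sector 0, with 0x80 marking the active slice. The sketch below decodes that table from a 512-byte sector. The function names are hypothetical, and the sample sector is built in memory from the offsets and sizes of slices 1 and 2 shown earlier in this thread rather than read from a real disk.

```python
import struct

MBR_TABLE_OFFSET = 446   # start of the four 16-byte slice entries
MBR_SIGNATURE = b"\x55\xaa"


def parse_mbr(sector):
    """Decode the four primary slice entries of an MBR sector."""
    if len(sector) < 512 or sector[510:512] != MBR_SIGNATURE:
        raise ValueError("not an MBR sector")
    slices = []
    for i in range(4):
        off = MBR_TABLE_OFFSET + 16 * i
        entry = sector[off:off + 16]
        flag, ptype = entry[0], entry[4]
        # LBA start and sector count are little-endian 32-bit values.
        start, size = struct.unpack_from("<II", entry, 8)
        slices.append({
            "index": i + 1,
            "active": flag == 0x80,
            "type": ptype,
            "start": start,
            "size": size,
        })
    return slices


def build_sample_sector():
    """Build a sector with slice 2 active (CHS fields left zero)."""
    sector = bytearray(512)

    def put(i, flag, ptype, start, size):
        off = MBR_TABLE_OFFSET + 16 * i
        sector[off] = flag
        sector[off + 4] = ptype
        struct.pack_into("<II", sector, off + 8, start, size)

    put(0, 0x00, 0xA5, 63, 524097)
    put(1, 0x80, 0xA5, 524160, 524160)
    sector[510:512] = MBR_SIGNATURE
    return bytes(sector)


if __name__ == "__main__":
    for s in parse_mbr(build_sample_sector()):
        print(s)
```

Reading the first 512 bytes of the disk device before and after running boot0cfg, and passing both to a function like this, would show whether the flag byte on disk actually changed.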
panic: vm_fault_copy_wired: page missing
Hi,
I'm getting this with FreeBSD-8-stable, it usually happens when
starting apache:

panic: vm_fault_copy_wired: page missing
cpuid = 3
KDB: enter: panic
[thread pid 1013 tid 100106 ]
Stopped at kdb_enter+0x3d: movq $0,0x68f170(%rip)
db> tr
Tracing pid 1013 tid 100106 td 0xff0007a66ae0
kdb_enter() at kdb_enter+0x3d
panic() at panic+0x17b
vm_fault_copy_entry() at vm_fault_copy_entry+0x283
vmspace_fork() at vmspace_fork+0x4d0
fork1() at fork1+0x35f
fork() at fork+0x1c
syscall() at syscall+0x1e7
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (2, FreeBSD ELF64, fork), rip = 0x8009f41ac, rsp = 0x7fffe7d8,
rbp = 0x800c34a80 ---

any help in tracking this?

thanks,
danny
Re: panic: vm_fault_copy_wired: page missing
> On Thu, Apr 15, 2010 at 12:22 AM, Daniel Braniss wrote:
> > Hi,
> > I'm getting this with FreeBSD-8-stable, it usually happens when
> > starting apache:
> >
> > panic: vm_fault_copy_wired: page missing
> > cpuid = 3
> > KDB: enter: panic
> > [thread pid 1013 tid 100106 ]
> > Stopped at kdb_enter+0x3d: movq $0,0x68f170(%rip)
> > db> tr
> > Tracing pid 1013 tid 100106 td 0xff0007a66ae0
> > kdb_enter() at kdb_enter+0x3d
> > panic() at panic+0x17b
> > vm_fault_copy_entry() at vm_fault_copy_entry+0x283
> > vmspace_fork() at vmspace_fork+0x4d0
> > fork1() at fork1+0x35f
> > fork() at fork+0x1c
> > syscall() at syscall+0x1e7
> > Xfast_syscall() at Xfast_syscall+0xe1
> > --- syscall (2, FreeBSD ELF64, fork), rip = 0x8009f41ac, rsp = 0x7fffe7d8,
> > rbp = 0x800c34a80 ---
> >
> > any help in tracking this?
>
> Hi Danny,
> Can you provide some details about your systems, like amd64 vs
> i386, processor model, amount of RAM, swap, etc?

sure, straight from the lion's mouth:

Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-STABLE #33 r2073: Wed Apr 14 15:29:07 IDT 2010
da...@sunfire:/r+d/obj/sunfire/r+d/stable/8/sys/HUJI amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Dual-Core AMD Opteron(tm) Processor 2218 (2613.41-MHz K8-class CPU)
  Origin = "AuthenticAMD" Id = 0x40f13 Family = f Model = 41 Stepping = 3
  Features=0x178bfbff
  Features2=0x2001
  AMD Features=0xea500800
  AMD Features2=0x1f
real memory = 17179869184 (16384 MB)
avail memory = 16562614272 (15795 MB)

the hardware is a Sun X2200.

thanks for any help! this machine is supposed to replace our old web server
and it's not happening :-(

danny
Re: panic: vm_fault_copy_wired: page missing
> On Thu, Apr 15, 2010 at 11:50:41AM +0300, Daniel Braniss wrote:
> > > On Thu, Apr 15, 2010 at 12:22 AM, Daniel Braniss wrote:
> > > > Hi,
> > > > I'm getting this with FreeBSD-8-stable, it usually happens when
> > > > starting apache:
> > > >
> > > > panic: vm_fault_copy_wired: page missing
> > > > cpuid = 3
> > > > KDB: enter: panic
> > > > [thread pid 1013 tid 100106 ]
> > > > Stopped at kdb_enter+0x3d: movq $0,0x68f170(%rip)
> > > > db> tr
> > > > Tracing pid 1013 tid 100106 td 0xff0007a66ae0
> > > > kdb_enter() at kdb_enter+0x3d
> > > > panic() at panic+0x17b
> > > > vm_fault_copy_entry() at vm_fault_copy_entry+0x283
> > > > vmspace_fork() at vmspace_fork+0x4d0
> > > > fork1() at fork1+0x35f
> > > > fork() at fork+0x1c
> > > > syscall() at syscall+0x1e7
> > > > Xfast_syscall() at Xfast_syscall+0xe1
> > > > --- syscall (2, FreeBSD ELF64, fork), rip = 0x8009f41ac, rsp = 0x7fffe7d8,
> > > > rbp = 0x800c34a80 ---
> > > >
> > > > any help in tracking this?
> > >
> > > Hi Danny,
> > > Can you provide some details about your systems, like amd64 vs
> > > i386, processor model, amount of RAM, swap, etc?
> > sure, straight from the lion's mouth:
> >
> > Copyright (c) 1992-2010 The FreeBSD Project.
> > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> > The Regents of the University of California. All rights reserved.
> > FreeBSD is a registered trademark of The FreeBSD Foundation.
> > FreeBSD 8.0-STABLE #33 r2073: Wed Apr 14 15:29:07 IDT 2010
> > da...@sunfire:/r+d/obj/sunfire/r+d/stable/8/sys/HUJI amd64
> > Timecounter "i8254" frequency 1193182 Hz quality 0
> > CPU: Dual-Core AMD Opteron(tm) Processor 2218 (2613.41-MHz K8-class CPU)
> > Origin = "AuthenticAMD" Id = 0x40f13 Family = f Model = 41 Stepping = 3
> >
> > Features=0x178bfbff
> CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
> > Features2=0x2001
> > AMD Features=0xea500800
> > AMD Features2=0x1f
> > real memory = 17179869184 (16384 MB)
> > avail memory = 16562614272 (15795 MB)
> >
> > the hardware is a Sun X2200.
> >
> > thanks for any help! this machine is supposed to replace our old web server
> > and it's not happening :-(
>
> Could you please provide the following?
>
> 1) Contents of /var/db/ports/apache-/options

sunfire> cat /var/db/ports/apache-xml-security-c/options
# This file is auto-generated by 'make config'.
# No user-servicable parts inside!
# Options for apache-xml-security-c-1.4.0
_OPTIONS_READ=apache-xml-security-c-1.4.0
WITH_XERCES_DEVEL=true

> 2) Contents of /etc/make.conf

sunfire> cat /etc/make.conf
OVERRIDE_LINUX_BASE_PORT=f8
OVERRIDE_LINUX_NONBASE_PORTS=f8
WRKDIRPREFIX=/home/pobj
PACKAGES=/r+d/packages
FETCH_ENV= HTTP_PROXY=http://wwwproxy.cs.huji.ac.il:8080/
# added by use.perl 2009-11-10 11:51:57
PERL_VERSION=5.10.1

> 3) Your kernel configuration file ("HUJI")
>

I'll try and send this as an attachment

sunfire> config -x /boot/kernel/kernel

> Thanks.
>
> --
> | Jeremy Chadwick j...@parodius.com |
> | Parodius Networking http://www.parodius.com/ |
> | UNIX Systems Administrator Mountain View, CA, USA |
> | Making life hard for others since 1977. PGP: 4BD6C0CB |
>

options CONFIG_AUTOGENERATED
ident HUJI
machine amd64
cpu HAMMER
makeoptions DEBUG=-g
options PRINTF_BUFR_SIZE=256
options ALTQ_HFSC
options ALTQ_PRIQ
options ALTQ_CBQ
options ALTQ
options DEVICE_POLLING
options CONSPEED=115200
options ALT_BREAK_TO_DEBUGGER
options BOOTP_NFSV3
options INCLUDE_CONFIG_FILE
options AH_SUPPORT_AR5416
options IEEE80211_SUPPORT_MESH
options IEEE80211_AMPDU_AGE
options IEEE80211_DEBUG
options AHD_REG_PRETTY_PRINT
options AHC_REG_PRETTY_PRINT
options ATA_REQUEST_TIMEOUT=3
options SMP
options GDB
options DDB
options KDB
options FLOWTABLE
options MAC
options AUDIT
options HWPMC_HOOKS
options KBD_INSTALL_CDEV
options _KPOSIX_PRIORITY_SCHEDULING
options P1003_1B_SEMAPHORES
options SYSVSEM
options SYSVMSG
options SYSVSHM
options STACK
options KTRACE
options SCSI_DELAY=500
options COMPAT_FREEBSD7
options COMPAT_FREEBSD6
options COMPAT_FREEBSD5
options COMPAT_FREEBSD4
options COMPAT_FREEBSD32
options COMPAT_43TTY
options GEOM_LABEL
options GEOM_PART_GPT
options PSEUDOFS
options PROCFS
options CD9660
options MSDOSFS
options NFS_ROOT
options NFSLOCKD
option
Re: panic: vm_fault_copy_wired: page missing
> On Thu, Apr 15, 2010 at 9:22 AM, Daniel Braniss wrote:
> > Hi,
> > I'm getting this with FreeBSD-8-stable, it usually happens when
> > starting apache:
>
> alc@ made some VM MFCs yesterday, could you try a 13th of April kernel
> and see if it works out for you?
>

asap. btw, I reduced the amount of physical memory and things seem ok.

cheers,
danny
Re: panic: vm_fault_copy_wired: page missing
> On Thu, Apr 15, 2010 at 9:22 AM, Daniel Braniss wrote:
> > Hi,
> > I'm getting this with FreeBSD-8-stable, it usually happens when
> > starting apache:
>
> alc@ made some VM MFCs yesterday, could you try a 13th of April kernel
> and see if it works out for you?

the kernel that panics does not include alc's MFC (I did the sync a few hours
before it), so now I'm compiling with the MFC.
BTW, with less memory the server is still running!

danny
Re: panic: vm_fault_copy_wired: page missing
> On Thu, Apr 15, 2010 at 10:22:20AM +0300, Daniel Braniss wrote:
> > Hi,
> > I'm getting this with FreeBSD-8-stable, it usually happens when
> > starting apache:
> >
> > panic: vm_fault_copy_wired: page missing
> > cpuid = 3
> > KDB: enter: panic
> > [thread pid 1013 tid 100106 ]
> > Stopped at kdb_enter+0x3d: movq $0,0x68f170(%rip)
> > db> tr
> > Tracing pid 1013 tid 100106 td 0xff0007a66ae0
> > kdb_enter() at kdb_enter+0x3d
> > panic() at panic+0x17b
> > vm_fault_copy_entry() at vm_fault_copy_entry+0x283
> > vmspace_fork() at vmspace_fork+0x4d0
> > fork1() at fork1+0x35f
> > fork() at fork+0x1c
> > syscall() at syscall+0x1e7
> > Xfast_syscall() at Xfast_syscall+0xe1
> > --- syscall (2, FreeBSD ELF64, fork), rip = 0x8009f41ac, rsp = 0x7fffe7d8,
> > rbp = 0x800c34a80 ---
> >
> > any help in tracking this?
> >
> > thanks,
> > danny
>
> Is it true that the process started, or at least some of loaded dso
> are from NFS mount ?

everything is nfs :-), the host is dataless,
but reducing the amount of physical memory has solved the
problem, so I don't think NFS is the problem.

danny
Re: panic: vm_fault_copy_wired: page missing
> On Thu, Apr 15, 2010 at 9:22 AM, Daniel Braniss wrote:
> > Hi,
> > I'm getting this with FreeBSD-8-stable, it usually happens when
> > starting apache:
>
> alc@ made some VM MFCs yesterday, could you try a 13th of April kernel
> and see if it works out for you?

with or without the MFC it's still panicking, and the memory size
does not affect the outcome :-(

danny
Re: panic: vm_fault_copy_wired: page missing
> On Thu, Apr 15, 2010 at 12:39:13PM +0300, Daniel Braniss wrote:
> > > On Thu, Apr 15, 2010 at 10:22:20AM +0300, Daniel Braniss wrote:
> > > > Hi,
> > > > I'm getting this with FreeBSD-8-stable, it usually happens when
> > > > starting apache:
> > > >
> > > > panic: vm_fault_copy_wired: page missing
> > > > cpuid = 3
> > > > KDB: enter: panic
> > > > [thread pid 1013 tid 100106 ]
> > > > Stopped at kdb_enter+0x3d: movq $0,0x68f170(%rip)
> > > > db> tr
> > > > Tracing pid 1013 tid 100106 td 0xff0007a66ae0
> > > > kdb_enter() at kdb_enter+0x3d
> > > > panic() at panic+0x17b
> > > > vm_fault_copy_entry() at vm_fault_copy_entry+0x283
> > > > vmspace_fork() at vmspace_fork+0x4d0
> > > > fork1() at fork1+0x35f
> > > > fork() at fork+0x1c
> > > > syscall() at syscall+0x1e7
> > > > Xfast_syscall() at Xfast_syscall+0xe1
> > > > --- syscall (2, FreeBSD ELF64, fork), rip = 0x8009f41ac, rsp = 0x7fffe7d8,
> > > > rbp = 0x800c34a80 ---
> > > >
> > > > any help in tracking this?
> > > >
> > > > thanks,
> > > > danny
> > >
> > > Is it true that the process started, or at least some of loaded dso
> > > are from NFS mount ?
> > everything is nfs :-), the host is dataless
> > but reducing the amount of physical memory has solved the
> > problem, so I don't think NFS is the problem.
>
> I do think that NFS is problem. Another key point is that your process
> is mlock'ed, right ? This is kind of known issue with NFS and mlock.
>

well, since it's panicking again, there goes the memsize theory.
This is getting weirder and weirder; it now panics on reboot:

Stopping cron.
Stopping sshd.
===> apache22 profile: httpd
===> apache22 profile: httpdyn
Stopping inetd.
Stopping ntpd.
Stopping lockd.
Waiting for PIDS: 1201.
Stopping statd.
Stopping nfsd.
Stopping mountd.
Stopping devd.
.
Apr 15 13:27:48 sf-02 syslogd: exiting on signal 15
panic: vm_fault_copy_wired: page missing
cpuid = 1
KDB: enter: panic
[thread pid 1014 tid 100118 ]
Stopped at kdb_enter+0x3d: movq $0,0x68f7a0(%rip)
db> tr
Tracing pid 1014 tid 100118 td 0xff000533f3a0
kdb_enter() at kdb_enter+0x3d
panic() at panic+0x17b
vm_fault_copy_entry() at vm_fault_copy_entry+0x283
vmspace_fork() at vmspace_fork+0x4d0
fork1() at fork1+0x35f
fork() at fork+0x1c
syscall() at syscall+0x1e7
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (2, FreeBSD ELF64, fork), rip = 0x8009f41ac, rsp = 0x7fffe7d8,
rbp = 0x800c34a00 ---