from:"Daniel Braniss"

classes and kernel_cookie was Re: Specifying root mount options on diskless boot.

2011-01-10 Thread Daniel Braniss

...
> I note that the response to your message from "danny" offers the ability 
> to pass arguments to the nfs mount command, but also seems to offer a fix 
> for the fact that "classes" are not supported under PXE:
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/90368
> 
> I hope "danny" will offer a patch to mainline code - it would be an 
> important improvement (and already promised in the documentation).
...
I'm willing to try to add the missing pieces, but I need a better
explanation of what they are. For example, I have no clue what the
kernel_cookie is used for, nor what ${class} is all about.
BTW, it would be nice if this line in pxeboot(8):
As PXE is still in its infancy ...
could be changed :-)

"danny"


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: Living on gmirror: need to reincarnate /etc/rc.early

2011-01-25 Thread Daniel Braniss

> On 01/25/2011 12:28, Kostik Belousov wrote:
> > No, my use for rc.early is different. I use it to load modules
> > before filesystems are mounted.
> 
> Ok, I'll bite ... what is deficient about doing this in /boot/loader.conf?
> 
In the diskless case, where the root (and so /boot/loader.conf) is shared,
it's nice to be able to configure clients via rc.conf.

danny




harmless zfs warnings?

2011-02-01 Thread Daniel Braniss

hi,
I have one disk, labeled r0 (/dev/mfid0), which I gpart'ed like so:
=>        34  1952448445  mfid0  GPT  (931G)
          34         128      1  freebsd-boot  (64K)
         162     4194304      2  freebsd-ufs  (2.0G)
     4194466   100663296      3  freebsd-swap  (48G)
   104857762  1847590717      4  freebsd-zfs  (881G)

and a second 'disk' labeled r5 (/dev/mfid1).
Now, doing a 'zpool import':
  pool: z
id: 784424638598804
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

z ONLINE
  gpt/r0/zfs  ONLINE

  pool: h
id: 535400138652241
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

h   ONLINE
  label/r5  ONLINE

what caught my attention was the following message on the console:
ZFS WARNING: Unable to attach to gpt/r0/swap.
ZFS WARNING: Unable to attach to mfid0p3.

Should I really be worried?

thanks,
danny



Re: statd/lockd startup failure

2011-03-09 Thread Daniel Braniss

> Under 8.2-PRERELEASE (GENERIC kernel), about 15% of the times I boot up
> (with rpc.statd and rpc.lockd enabled in rc.conf), I get:
> 
> Feb  4 07:31:11 wonderland rpc.statd: bindresvport_sa: Address already in use
> Feb  4 07:31:11 wonderland root: /etc/rc: WARNING: failed to start statd
> 
> and slightly later:
> 
> Feb  4 07:31:36 wonderland kernel: NLM: unexpected error contacting NSM, 
> stat=5, errno=35
> 
> I can start rpc.statd and rpc.lockd manually at this point (and I have to
> start them to run firefox and mail with my NFS-mounted home directory and
> mail spool).  But what might cause the above errors?   -- George Mitchell

We have been seeing this too, with the addition of mountd.
So I decided to try to track it down.
rpc.lockd, rpc.statd and mountd all share the same code for allocating
an address/port. I added some more info to be displayed in case of error,
mainly the ai_family and port, so after many successful reboots, I got:

Mar  9 09:18:19 chamsa mountd[1070]: bindresvport_sa: (2/617) Address already in use

but:

chamsa>  rpcinfo | grep mountd
    100005    1    udp       0.0.0.0.2.105          mountd     superuser
    100005    3    udp       0.0.0.0.2.105          mountd     superuser
    100005    1    tcp       0.0.0.0.2.105          mountd     superuser
    100005    3    tcp       0.0.0.0.2.105          mountd     superuser

BTW, 0.0.0.0.2.105 is port 617 (2*256 + 105), and the 2 in (2/617) is AF_INET

The above is weird, since the rpc stuff happens after the bindresvport_sa(...)
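The port is carried in the last two fields of the universal address that rpcinfo prints (high byte, then low byte). A minimal Python sketch of that arithmetic follows; the function name uaddr_port is made up for illustration and is not part of any RPC tool:

```python
def uaddr_port(uaddr: str) -> int:
    """Return the port encoded in an RPC universal address.

    The last two dot-separated fields are the high and low bytes of the
    port, so the same rule works for "0.0.0.0.2.105" and for "::.3.141".
    """
    fields = uaddr.split(".")
    high, low = int(fields[-2]), int(fields[-1])
    return high * 256 + low


if __name__ == "__main__":
    # Address taken from the rpcinfo output in this message.
    print(uaddr_port("0.0.0.0.2.105"))
```

For the address in the rpcinfo output above this gives 617, the same port that appears in the bindresvport_sa error.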

danny



Re: statd/lockd startup failure

2011-03-09 Thread Daniel Braniss

> > 
> > Thanks for the analysis. The reason I originally posted is to see why
> > this might have popped up in 8.x, as it never happened in 7.x.
> > -- George Mitchell
> >
> I suspect two things make this occur more frequently with 8.x. One is
> that it does IPv6 first (I suspect IPv6 wasn't enabled by default on 7.x?).
> 
> The other is the port randomization code, which probably results in
> more frequent collisions with port #s used by other things. (Basically,
> the code selects an unused port# for either UDP or TCP over IPv6 (I can't
> remember which comes first:-) and then expects that port to be available
> for the other 3 combinations of UDP/TCP x IPv6/IPv4.

Another reason is probably multi-core machines: most of these daemons
fork very early, and very quickly end up competing for resources in parallel.

danny
PS: rick, can you send me your patch?




Re: statd/lockd startup failure

2011-03-10 Thread Daniel Braniss

>> On 02/18/2011 10:08, Rick Macklem wrote:
>> > The attached patches changes the behaviour so that it tries to
>> > get an unused port for each of the 4 cases.
>> 
>> can you send me the patches?
>> thanks,
>> danny

> They're attached. If you get to test them, please let me know
> how it goes.
>
> rick

Hi Rick,
the good side of living in different time zones :-)
I got impatient, so I came up with a different fix.
The rationale is that, IMHO, there is no need for all listeners
to be on the same port:
rnd> rpcinfo protonew |grep mountd
    100005    1    udp6      ::.3.141               mountd     superuser
    100005    3    udp6      ::.3.141               mountd     superuser
    100005    1    tcp6      ::.3.141               mountd     superuser
    100005    3    tcp6      ::.3.141               mountd     superuser
    100005    1    udp       0.0.0.0.3.141          mountd     superuser
    100005    3    udp       0.0.0.0.3.141          mountd     superuser
    100005    1    tcp       0.0.0.0.3.92           mountd     superuser  <---
    100005    3    tcp       0.0.0.0.3.92           mountd     superuser  <---
rnd> rpcinfo -t protonew mountd
program 100005 version 1 ready and waiting
rpcinfo: RPC: Program/version mismatch; low version = 1, high version = 3
program 100005 version 2 is not available
program 100005 version 3 ready and waiting

the patches are in:

ftp://ftp.cs.huji.ac.il/users/danny/freebsd/patches/address_already_in_use/

cheers,
danny


Re: statd/lockd startup failure

2011-03-12 Thread Daniel Braniss

> > >> On 02/18/2011 10:08, Rick Macklem wrote:
> > >> > The attached patches changes the behaviour so that it tries to
> > >> > get an unused port for each of the 4 cases.
> > >>
> > >> can you send me the patches?
> > >> thanks,
> > >> danny
> > 
> > > They're attached. If you get to test them, please let me know
> > > how it goes.
> > >
> > > rick
> > 
> > Hi Rick,
> > the good side of living on different time zones :-)
> > I got impatient, so I came up with a different fix.
> > The rational is that IMHO, there is no need for all listeners
> > to be on the same port:
> > rnd> rpcinfo protonew |grep mountd
> > 100005 1 udp6 ::.3.141 mountd superuser
> > 100005 3 udp6 ::.3.141 mountd superuser
> > 100005 1 tcp6 ::.3.141 mountd superuser
> > 100005 3 tcp6 ::.3.141 mountd superuser
> > 100005 1 udp 0.0.0.0.3.141 mountd superuser
> > 100005 3 udp 0.0.0.0.3.141 mountd superuser
> > 100005 1 tcp 0.0.0.0.3.92 mountd superuser <---
> > 100005 3 tcp 0.0.0.0.3.92 mountd superuser <---
> > rnd> rpcinfo -t protonew mountd
> > program 100005 version 1 ready and waiting
> > rpcinfo: RPC: Program/version mismatch; low version = 1, high version = 3
> > program 100005 version 2 is not available
> > program 100005 version 3 ready and waiting
> > 
> > the patches are in:
> > ftp://ftp.cs.huji.ac.il/users/danny/freebsd/patches/address_already_in_use/
> > 
> > cheers,
> > danny
> > 
> Yep, a patch that doesn't make them all use the same port# is much
> simpler. However, others, such as Doug Barton feel that it is important
> that they use the same port#. (Something he called "tracking".)

The problem is that trying to get the same port for all of tcp/udp/inet/inet6,
though it might succeed most of the time, will fail sometimes, and then what?
I saw Doug's comment, and also the :). It's not as simple as tracking port
80 or 25 and needs some effort, but it's deterministic/programmable, and worst
case you can still use the -p option (which, again, will fail sometimes :-).

IMHO, having a system that might fail to reboot is not very pleasant.

danny



Re: statd/lockd startup failure

2011-03-13 Thread Daniel Braniss

> On 03/12/2011 02:21, Daniel Braniss wrote:
> > The problem is that trying to get the same port for all of tcp/udp/inet/inet6,
> > though it might succeed most of the time, will fail sometimes, and then what?
> 
> Can you please describe the scenario when it's completely impossible to 
> find a port that's open on all 4 families?
I did not say impossible. Considering that Rick asked how many times he
should try: unless N is forever, it could fail.
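The trade-off can be sketched as a pure simulation in Python. Nothing here opens a socket, and this is not Rick's patch or the mountd code: the table in_use, the candidate list and both function names are assumptions made up for illustration. The first function gives up after a bounded number of tries for a common port, the second simply takes the first free port per family.

```python
FAMILIES = ("udp6", "tcp6", "udp4", "tcp4")


def pick_common_port(in_use, candidates, max_tries):
    """Return a candidate port that is free in all four families.

    Only the first max_tries candidates are examined; None is returned
    when none of them is free everywhere.
    """
    for port in candidates[:max_tries]:
        if all(port not in in_use[family] for family in FAMILIES):
            return port
    return None


def pick_per_family(in_use, candidates):
    """Return, for each family, the first candidate port free in that family."""
    chosen = {}
    for family in FAMILIES:
        chosen[family] = next(
            (port for port in candidates if port not in in_use[family]), None
        )
    return chosen


if __name__ == "__main__":
    # Hypothetical state: port 909 is already taken for TCP over IPv4 only.
    in_use = {"udp6": set(), "tcp6": set(), "udp4": set(), "tcp4": {909}}
    print(pick_common_port(in_use, [909, 860], max_tries=1))
    print(pick_per_family(in_use, [909, 860]))
```

With a single try the common-port search finds nothing, while the per-family choice still yields a usable port for every listener, which is the "then what?" case argued above.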

> 
> > I saw Doug's comment, and also the :). It's not as simple as tracking port
> > 80 or 25 and needs some effort, but it's deterministic/programmable, and worst
> > case you can still use the -p option (which, again, will fail sometimes :-).
> 
> Given that Rick has already written the patch, I don't think it's at all 
> unreasonable to put it in as the first choice, perhaps with a fallback 
> to picking any available port if there isn't one available for all 4 
> families.
> 
as Rick mentioned, the patch is not trivial, and to quote him:
 "My only concern with the "same port# patch" is that it is more complex
  and, therefore, somewhat riskier w.r.t. my having gotten it wrong."


> Meanwhile, I don't think I'm the only person who has ever had trouble 
> trying to track down network traffic from "random" ports that would 
> prefer that doing so not be made harder by having the same service on 
> the same host using 4 different ports.

To track rpc-based traffic, which means a random port to start with, you have to
check with rpcinfo anyway. So yes, it's harder than tracking one port, but
IMHO, less complex than the patch required :-). And BTW, mountd is already
heavily patched, rpc.statd less so, and rpc.lockd is, so far, the only one
that is not complaining - guess Rick is a good programmer!

And I consider myself lucky that we don't use NIS/yellow-pages.

danny




mountd stuck

2011-03-28 Thread Daniel Braniss

I am running mountd with -e (experimental :-).
Lately it has been happening too often that mountd just stops responding:
mountd 11762 [dp->dp_config_rwlock] 8.93r 0.00u 0.00s 0% 1320k
Any help/clues?

thanks,
danny



mountd stuck in ZFS code.

2011-03-30 Thread Daniel Braniss

I have been running the experimental nfs/mount for some time now, and
it mostly works, except with this particular case, where the mountd just
gets stuck:
 
 mountd 11762 [dp->dp_config_rwlock] 8.93r 0.00u 0.00s 0% 1320k
 
and stops responding. I can't reproduce it at will, but it happens quite often.

The host in question is an nfs/zfs server, running 8-stable with:
ZFS pool version 15
ZFS filesystem version 4

cheers,
danny



portmaster goes into a loop

2011-07-14 Thread Daniel Braniss

hi,
this:
portmaster p5-libwww-5.837

goes into a loop:
...
===>>> The dependency for net/p5-Net-HTTP
   seems to be handled by p5-libwww-5.837

===>>> Launching child to update p5-libwww-5.837 to p5-libwww-6.02
p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
p5-libwww-5.837 >> p5-libw w-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
p5-libwww-5.837 >> p5-libwww-5.837 >> p5-l bwww-5.837 >> p5-libwww-5.837 >> 
p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
p5-libwww-5. 37 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p -libwww-5.837 >> 
p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
p5-libwww-5.837 >> p5-libwww 5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >  
p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
p5-libwww-5.837 >> p5-libwww-5.837 >> p5-lib ww-5.837 >> p5-libwww-5.837 >> 
p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
p5-libwww-5.83  >> p5-libwww-5.837 >> p5-libwww-5.837

===>>> Port directory: /usr/ports/www/p5-libwww
...

how can I fix this?
(I kill the loop with ^C :-); it's the going into a loop that I want to fix.

thanks,
danny



Re: portmaster goes into a loop

2011-07-14 Thread Daniel Braniss

> hi,
> this:
>   portmaster p5-libwww-5.837
> 
> goes into a loop:
> ...
> ===>>> The dependency for net/p5-Net-HTTP
>    seems to be handled by p5-libwww-5.837
> 
> ===>>> Launching child to update p5-libwww-5.837 to p5-libwww-6.02
> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
> p5-libwww-5.837 >> p5-libw w-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-l bwww-5.837 >> p5-libwww-5.837 >> 
> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
> p5-libwww-5. 37 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p -libwww-5.837 >> 
> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
> p5-libwww-5.837 >> p5-libwww 5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >  
> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-lib ww-5.837 >> p5-libwww-5.837 >> 
> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> p5-libwww-5.837 >> 
> p5-libwww-5.83  >> p5-libwww-5.837 >> p5-libwww-5.837
> 
> ===>>> Port directory: /usr/ports/www/p5-libwww
> ...
> 
> how can I fix this?
> (I kill the loop with ^C :-); it's the going into a loop that I want to fix.
> 

p5-libwww depends on p5-Net-HTTP, but p5-Net-HTTP says it conflicts
with p5-libwww-5*
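The situation can be sketched in Python. This is not how portmaster works internally; it only illustrates the kind of check meant here, with the dependency list, the conflict table and the function name all assumed for the example. Conflict patterns are treated as shell-style globs, as in p5-libwww-5*.

```python
from fnmatch import fnmatchcase


def blocked_dependencies(depends_on, conflicts, installed):
    """List dependencies whose conflict patterns match an installed package.

    Returns (dependency, pattern, matching installed packages) tuples.
    """
    blocked = []
    for dep in depends_on:
        for pattern in conflicts.get(dep, ()):
            hits = [pkg for pkg in installed if fnmatchcase(pkg, pattern)]
            if hits:
                blocked.append((dep, pattern, hits))
    return blocked


if __name__ == "__main__":
    # The new p5-libwww needs p5-Net-HTTP, which conflicts with the old p5-libwww.
    depends_on = ["net/p5-Net-HTTP"]
    conflicts = {"net/p5-Net-HTTP": ["p5-libwww-5*"]}
    installed = ["p5-libwww-5.837"]
    for dep, pattern, hits in blocked_dependencies(depends_on, conflicts, installed):
        print(dep, "conflicts with installed", hits, "via", pattern)
```

A non-empty result before the build starts would be the point at which a tool could stop and report the conflict instead of recursing.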

Maybe portmaster could catch this conflict better?

danny



Re: can't ping local address

2012-03-12 Thread Daniel Braniss

> Hi,
> 
> Does any one have this [1] problem? or just know how to fix it?
> 
> [1] http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/159103
> 
The bug has been around for a while, and the fix won't work for a diskless
host :-): downing the link will hang the host.

danny


> -- 
> Andrey Zonov

8.3-PRERELEASE and ATA_CAM

2012-04-06 Thread Daniel Braniss

With the latest svn, I can't compile a kernel with options ATA_CAM:

...
linking kernel.debug
ata-disk.o(.text+0x93): In function `ad_init':
/r+d/stable/8.3/sys/dev/ata/ata-disk.c:389: undefined reference to `ata_setmode'
ata-disk.o(.text+0xaa):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:397: undefined reference to `ata_wc'
ata-disk.o(.text+0xc5):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:398: undefined reference to `ata_controlcmd'
ata-disk.o(.text+0x113):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:400: undefined reference to `ata_controlcmd'
ata-disk.o(.text+0x133):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:393: undefined reference to `ata_controlcmd'
ata-disk.o(.text+0x16d):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:407: undefined reference to `ata_controlcmd'
ata-disk.o(.text+0x21a): In function `ad_shutdown':
/r+d/stable/8.3/sys/dev/ata/ata-disk.c:196: undefined reference to `ata_controlcmd'
ata-disk.o(.text+0x45c): In function `ad_detach':
/r+d/stable/8.3/sys/dev/ata/ata-disk.c:182: undefined reference to `ata_fail_requests'
...

danny



Re: 8.3-PRERELEASE and ATA_CAM

2012-04-07 Thread Daniel Braniss

> On Fri, Apr 06, 2012 at 10:48:13AM +0300, Daniel Braniss wrote:
> > with the latest svn, I can't compile kernel with  options ATA_CAM:
> > 
> > ...
> > linking kernel.debug
> > ata-disk.o(.text+0x93): In function `ad_init':
> > /r+d/stable/8.3/sys/dev/ata/ata-disk.c:389: undefined reference to 
> > `ata_setmode'
> > ata-disk.o(.text+0xaa):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:397: 
> > undefined 
> > reference to `ata_wc'
> > ata-disk.o(.text+0xc5):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:398: 
> > undefined 
> > reference to `ata_controlcmd'
> > ata-disk.o(.text+0x113):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:400: 
> > undefined 
> > reference to `ata_controlcmd'
> > ata-disk.o(.text+0x133):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:393: 
> > undefined 
> > reference to `ata_controlcmd'
> > ata-disk.o(.text+0x16d):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:407: 
> > undefined 
> > reference to `ata_controlcmd'
> > ata-disk.o(.text+0x21a): In function `ad_shutdown':
> > /r+d/stable/8.3/sys/dev/ata/ata-disk.c:196: undefined reference to 
> > `ata_controlcmd'
> > ata-disk.o(.text+0x45c): In function `ad_detach':
> > /r+d/stable/8.3/sys/dev/ata/ata-disk.c:182: undefined reference to 
> > `ata_fail_requests'
> > ...
> > 
> 
> You seem to be using a mutually exclusive set of ata(4) options and
> devices (previously, this erroneously wasn't a bug). When including
> options ATA_CAM you do _not_ want to also include any of the following
> devices:
> device          atapicam
> device          atadisk
> device          ataraid
> device          atapicd
> device          atapifd
> device          atapist
> 
> Instead you need the corresponding driver from the following set:
> device          scbus
> device          ch
> device          da
> device          sa
> device          cd
> device          pass
> 
> Marius
> 
They are included by GENERIC, which I include. Bummer.
What about ATA_STATIC_ID? I guess that is also a no-no?
thanks,
danny



Re: Time Clock Stops in FreeBSD 9.0 guest running under ESXi 5.0

2012-04-08 Thread Daniel Braniss

> On Thu, 2012-03-22 at 18:13 +0200, Volodymyr Kostyrko wrote:
> > Andriy Gapon wrote:
> > > on 22/03/2012 17:33 Volodymyr Kostyrko said the following:
> > >> Andriy Gapon wrote:
> > >>> on 22/03/2012 15:19 Mike Tkachuk said the following:
> >  kern.eventtimer.periodic: 0
> > >>>
> > >>> It might make sense to try 1 here.
> > >>> Also you could attempt to involve mav@ directly - here is an author of 
> > >>> the code
> > >>> and an expert on it.
> > >>
> > >> Better ask before setting as this doubles hpet0 (with HPET) or 
> > >> cpu0:timer (with
> > >> LAPIC) interrupt rate for me.
> > >
> > > Does it make your system unusable?
> > > Are you comparing with pre-eventtimers version of FreeBSD?
> > 
> > In short term - no. Haven't tested it thoroughly. Results are the same 
> > (double interrupt rate according to `systat 1 -v`) for:
> >   * i386 and amd64 9-STABLE;
> >   * amd64 9.0.
> > 
> > As everything related to timing/freq/acpi can be unpredictive I wouldn't 
> > recommend this to anyone. I own at least two Intel CPU's failing 
> > somewhere near timing/apic when loading cpufreq and enabling powerd.
> > 
> 
> I'm not sure I understand that advice.  We have someone whose system is
> failing (time stops counting) when using the new event timer code.  The
> recommendation is to set kern.eventtimer.periodic=1, which as I
> understand it makes the new code work more like it did before.  That
> seems to be a reasonable attempt to work around the problem.  
> 
> If it works, the system becomes 100% more usable than it is now, even if
> that comes at the cost of timers interrupting twice as fast as they did
> in previous OS releases.  It also generates another datapoint that might
> somehow help track down why the event timer code has trouble on some
> hardware.  Enough such datapoints may eventually lead to an "aha -- it
> happens on all systems that have the xyz chipset."

Just a me too,
but it was running 8.2-stable!
Since it's a production machine, I had no choice but to reboot it.
Also, the BIOS time got stuck, so I had to fix the time manually! ntpd doesn't
like to advance past a certain delta.

cheers,
danny




lost devices in 8.3

2012-04-22 Thread Daniel Braniss

hi,
I'm trying to upgrade this old opteron box, which is running 8.2, but
when booting 8.3 the disks disappear.

with 8.2:
...
atapci1@pci0:0:7:1: class=0x01018a card=0x74691022 chip=0x74691022 rev=0x03 hdr=0x00
    vendor     = 'Advanced Micro Devices (AMD)'
    device     = 'UltraATA/133 Controller (AMD-8111)'
    class      = mass storage
    subclass   = ATA
...
atapci0@pci0:3:5:0: class=0x010400 card=0x61141095 chip=0x31141095 rev=0x02 hdr=0x00
    vendor     = 'Silicon Image Inc (Was: CMD Technology Inc)'
    device     = 'SATALink/SATARaid Controller (Sil 3114)'
    class      = mass storage
    subclass   = RAID

but none on 8.3:
none0@pci0:0:7:1:   class=0x01018a card=0x74691022 chip=0x74691022 rev=0x03 hdr=0x00
    vendor     = 'Advanced Micro Devices (AMD)'
    device     = 'UltraATA/133 Controller (AMD-8111)'
    class      = mass storage
    subclass   = ATA
...
none3@pci0:3:5:0:   class=0x018000 card=0x31141095 chip=0x31141095 rev=0x02 hdr=0x00
    vendor     = 'Silicon Image Inc (Was: CMD Technology Inc)'
    device     = 'SATALink/SATARaid Controller (Sil 3114)'
    class      = mass storage

and the only diff in the configuration is that 8.3 has:
options         ATA_CAM
nodevice        ata
nodevice        atadisk         # ATA disk drives
nodevice        ataraid         # ATA RAID drives
nodevice        atapicd         # ATAPI CDROM drives
nodevice        atapifd         # ATAPI floppy drives
nodevice        atapist         # ATAPI tape drives

cheers,
danny



Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak

2012-04-26 Thread Daniel Braniss

> 
> Rick Macklem  wrote
>   in <1527622626.3418715.1335445225510.javamail.r...@erie.cs.uoguelph.ca>:
> 
> rm> Steven Hartland wrote:
> rm> >  Original Message -
> rm> > From: "Rick Macklem" 
> rm> > > At a glance, it looks to me like 8.x is affected. Note that the
> rm> > > bug only affects the new NFS server (the experimental one for 8.x)
> rm> > > when exporting ZFS volumes. (UFS exported volumes don't leak)
> rm> > >
> rm> > > If you are running a server that might be affected, just:
> rm> > > # vmstat -z | fgrep -i namei
> rm> > > on the server and see if the 3rd number shown is increasing.
> rm> >
> rm> > Many thanks Rick wasnt aware we had anything experimental enabled
> rm> > but I think that would be a yes looking at these number:-
> rm> >
> rm> > vmstat -z | fgrep -i namei
> rm> > NAMEI: 1024, 0, 1, 1483, 25285086096, 0
> rm> > vmstat -z | fgrep -i namei
> rm> > NAMEI: 1024, 0, 0, 1484, 25285945725, 0
> rm> >
> rm>   ^
> rm> I don't think so, since the 3rd number (USED) is 0 here.
> rm> If that # is increasing over time, you have the leak. You are
> rm> probably running the old (default in 8.x) NFS server.
> 
>  Just a report, I confirmed it affected 8.x servers running newnfs.
> 
>  Actually I have been suffered from memory starvation symptom on that
>  server (24GB RAM) for a long time and watching vmstat -z
>  periodically.  It stopped working once a week.  I investigated the
>  vmstat log again and found the amount of NAMEI leak was 11,543,956
>  (about 11GB!) just before the locked-up.  After applying the patch,
>  the leak disappeared.  Thank you for fixing it!
> 
> -- Hiroki

Same here. This is on 8.2-STABLE/amd64 from around August:
this zfs+newnfs server has been hanging every few months, and I can now
see the leak; it's slowly increasing:
NAMEI:   1024,0,   122975,  529, 15417248,0
NAMEI:   1024,0,   122984,  520, 15421772,0
NAMEI:   1024,0,   123002,  502, 15424743,0
NAMEI:   1024,0,   123008,  496, 15425464,0
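The check Rick describes (watch whether the 3rd number, USED, keeps increasing) can be scripted. A minimal Python sketch, assuming the first three fields of each line are SIZE, LIMIT and USED as in the samples above; the function names are made up for illustration, and leaked_bytes is only the rough USED times SIZE estimate:

```python
def namei_fields(line: str):
    """Parse a 'vmstat -z' NAMEI line into a list of integers."""
    name, _, rest = line.partition(":")
    if name.strip() != "NAMEI":
        raise ValueError("not a NAMEI line: %r" % line)
    return [int(field) for field in rest.split(",")]


def namei_used(line: str) -> int:
    """Return the USED column (the 3rd number) of a NAMEI line."""
    return namei_fields(line)[2]


def is_growing(samples) -> bool:
    """True if USED strictly increases across at least two samples."""
    used = [namei_used(sample) for sample in samples]
    return len(used) >= 2 and all(b > a for a, b in zip(used, used[1:]))


def leaked_bytes(line: str) -> int:
    """Rough memory held by the zone: USED items times item SIZE."""
    fields = namei_fields(line)
    return fields[0] * fields[2]


if __name__ == "__main__":
    # First and last samples from this message.
    samples = [
        "NAMEI:   1024,0,   122975,  529, 15417248,0",
        "NAMEI:   1024,0,   123008,  496, 15425464,0",
    ]
    print(is_growing(samples), leaked_bytes(samples[-1]))
```

A single pair of samples proves little; the zone has to be sampled over hours or days before a steady increase means a leak.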

cheers,
danny



Re: Restricting users from certain privileges

2012-04-28 Thread Daniel Braniss

> Hi:
> 
> I could not figure out how to restrict users or other users from certain
> privileges to execute certain commands in FreeBSD/NanoBSD?
> 
> What I meant is I want to create a NanoBSD image in which there will be an
> additional user, say 'admin'. I need to give this new user (admin) some
> privileges to run some root-can-only-execute commands, but not all (ACL
> similar to the firmwares in adsl modems from ISPs).
> 
> I read Dru Lavingne's 'BSD Hacks' and Joseph Kong's 'Designing BSD
> Rootkits' besides FreeBSD handbook, but I simply could not figure out.
> Could anyone throw some light on this? Appreciate it!
> 
> Thanks!
> 
> /zenny

try sudo from ports, security/sudo

cheers,
danny



Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak

2012-04-28 Thread Daniel Braniss

> Daniel Braniss wrote:
> > >
> > > Rick Macklem  wrote
> > >   in
> > >   <1527622626.3418715.1335445225510.javamail.r...@erie.cs.uoguelph.ca>:
> > >
> > > rm> Steven Hartland wrote:
> > > rm> >  Original Message -
> > > rm> > From: "Rick Macklem" 
> > > rm> > > At a glance, it looks to me like 8.x is affected. Note that
> > > the
> > > rm> > > bug only affects the new NFS server (the experimental one
> > > for 8.x)
> > > rm> > > when exporting ZFS volumes. (UFS exported volumes don't
> > > leak)
> > > rm> > >
> > > rm> > > If you are running a server that might be affected, just:
> > > rm> > > # vmstat -z | fgrep -i namei
> > > rm> > > on the server and see if the 3rd number shown is increasing.
> > > rm> >
> > > rm> > Many thanks Rick wasnt aware we had anything experimental
> > > enabled
> > > rm> > but I think that would be a yes looking at these number:-
> > > rm> >
> > > rm> > vmstat -z | fgrep -i namei
> > > rm> > NAMEI: 1024, 0, 1, 1483, 25285086096, 0
> > > rm> > vmstat -z | fgrep -i namei
> > > rm> > NAMEI: 1024, 0, 0, 1484, 25285945725, 0
> > > rm> >
> > > rm> ^
> > > rm> I don't think so, since the 3rd number (USED) is 0 here.
> > > rm> If that # is increasing over time, you have the leak. You are
> > > rm> probably running the old (default in 8.x) NFS server.
> > >
> > >  Just a report, I confirmed it affected 8.x servers running newnfs.
> > >
> > >  Actually I have been suffered from memory starvation symptom on
> > >  that
> > >  server (24GB RAM) for a long time and watching vmstat -z
> > >  periodically. It stopped working once a week. I investigated the
> > >  vmstat log again and found the amount of NAMEI leak was 11,543,956
> > >  (about 11GB!) just before the locked-up. After applying the patch,
> > >  the leak disappeared. Thank you for fixing it!
> > >
> > > -- Hiroki
> And thanks Hiroki for testing it on 8.x.
> 
> > this is on 8.2-STABLE/amd64 from around August:
> > same here, this zfs+newnfs has been hanging every few months, and I
> > can see
> > now the leak, it's slowly increasing:
> > NAMEI: 1024, 0, 122975, 529, 15417248, 0
> > NAMEI: 1024, 0, 122984, 520, 15421772, 0
> > NAMEI: 1024, 0, 123002, 502, 15424743, 0
> > NAMEI: 1024, 0, 123008, 496, 15425464, 0
> > 
> > cheers,
> > danny
> Maybe you could try the patch, too.
> 
> It's at:
> http://people.freebsd.org/~rmacklem/namei-leak.patch
> 
> I'll commit it to head soon with a 1 month MFC, so that hopefully
> Oliver will have a chance to try it on his production server before
> the MFC.
> 
> Thanks everyone, for your help with this, rick

I haven't applied the patch yet, but in the meantime I have been running some
experiments on a zfs/nfs server running 8.3-STABLE, and I don't see any leaks.
What triggers the leak?

thanks,
danny



Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak

2012-04-29 Thread Daniel Braniss

> Daniel Braniss wrote:
> > > Daniel Braniss wrote:
> > > > >
> > > > > Rick Macklem  wrote
> > > > >   in
> > > > >   
> > > > > <1527622626.3418715.1335445225510.javamail.r...@erie.cs.uoguelph.ca>:
> > > > >
> > > > > rm> Steven Hartland wrote:
> > > > > rm> >  Original Message -
> > > > > rm> > From: "Rick Macklem" 
> > > > > rm> > > At a glance, it looks to me like 8.x is affected. Note that the
> > > > > rm> > > bug only affects the new NFS server (the experimental one for 8.x)
> > > > > rm> > > when exporting ZFS volumes. (UFS exported volumes don't leak)
> > > > > rm> > >
> > > > > rm> > > If you are running a server that might be affected, just:
> > > > > rm> > > # vmstat -z | fgrep -i namei
> > > > > rm> > > on the server and see if the 3rd number shown is increasing.
> > > > > rm> >
> > > > > rm> > Many thanks Rick wasnt aware we had anything experimental enabled
> > > > > rm> > but I think that would be a yes looking at these number:-
> > > > > rm> >
> > > > > rm> > vmstat -z | fgrep -i namei
> > > > > rm> > NAMEI: 1024, 0, 1, 1483, 25285086096, 0
> > > > > rm> > vmstat -z | fgrep -i namei
> > > > > rm> > NAMEI: 1024, 0, 0, 1484, 25285945725, 0
> > > > > rm> >
> > > > > rm> ^
> > > > > rm> I don't think so, since the 3rd number (USED) is 0 here.
> > > > > rm> If that # is increasing over time, you have the leak. You are
> > > > > rm> probably running the old (default in 8.x) NFS server.
> > > > >
> > > > >  Just a report, I confirmed it affected 8.x servers running newnfs.
> > > > >
> > > > >  Actually I have been suffered from memory starvation symptom on that
> > > > >  server (24GB RAM) for a long time and watching vmstat -z
> > > > >  periodically. It stopped working once a week. I investigated the
> > > > >  vmstat log again and found the amount of NAMEI leak was 11,543,956
> > > > >  (about 11GB!) just before the locked-up. After applying the patch,
> > > > >  the leak disappeared. Thank you for fixing it!
> > > > >
> > > > > -- Hiroki
> > > And thanks Hiroki for testing it on 8.x.
> > >
> > > > this is on 8.2-STABLE/amd64 from around August:
> > > > same here, this zfs+newnfs has been hanging every few months, and I
> > > > can see
> > > > now the leak, it's slowly increasing:
> > > > NAMEI: 1024, 0, 122975, 529, 15417248, 0
> > > > NAMEI: 1024, 0, 122984, 520, 15421772, 0
> > > > NAMEI: 1024, 0, 123002, 502, 15424743, 0
> > > > NAMEI: 1024, 0, 123008, 496, 15425464, 0
> > > >
> > > > cheers,
> > > > danny
> > > Maybe you could try the patch, too.
> > >
> > > It's at:
> > >http://people.freebsd.org/~rmacklem/namei-leak.patch
> > >
> > > I'll commit it to head soon with a 1 month MFC, so that hopefully
> > > Oliver will have a chance to try it on his production server before
> > > the MFC.
> > >
> > > Thanks everyone, for your help with this, rick
> > 
> > I haven't applied the patch yet, but in the meantime I have been running
> > some experiments on a zfs/nfs server running 8.3-STABLE, and don't see
> > any leaks
> > what triggers the leak?
> > 
> Fortunately Oliver isolated this. It should leak when you do a successful
> "rm" or "rmdir" while running the new/experimental server.
>
but that's what I did, I'm running the new/experimental nfs server
(or so I think :-), and did a huge rm -rf and nothing, nada, no leak.
To check the patch, I have to upgrade the production server, the one with the
leak, but I wanted to test it on a non-production one first. Anyway, I'll patch
the kernel and try it on the leaking production server tomorrow.

danny
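
For anyone who wants to track the number instead of eyeballing it, here is a
minimal sketch that pulls the 3rd number (USED) out of the NAMEI line of
"vmstat -z". The function name namei_used is made up for illustration, and the
field layout is assumed to match the output quoted in this thread.

```shell
#!/bin/sh
# namei_used (hypothetical helper): read `vmstat -z` output on stdin
# and print the USED column (3rd number) of the NAMEI zone.
namei_used() {
    awk -F'[:,] *' '$1 == "NAMEI" { print $4 }'
}

# Sample lines copied from the thread; on a live server one would run
#   vmstat -z | namei_used
first=$(echo "NAMEI: 1024, 0, 122975, 529, 15417248, 0" | namei_used)
last=$(echo "NAMEI: 1024, 0, 123008, 496, 15425464, 0" | namei_used)
echo "USED grew by $((last - first))"
```

Run from cron with a timestamp, this gives a log in which a steadily growing
USED value points at the leak, while a value hovering near zero does not.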



Re: 9-STABLE, ZFS, NFS, ggatec - suspected memory leak

2012-04-30 Thread Daniel Braniss

> Daniel Braniss wrote:
> > > Daniel Braniss wrote:
> > > > > Daniel Braniss wrote:
> > > > > > >
> > > > > > > Rick Macklem  wrote
> > > > > > >   in
> > > > > > >   
> > > > > > > <1527622626.3418715.1335445225510.javamail.r...@erie.cs.uoguelph.ca>:
> > > > > > >
> > > > > > > rm> Steven Hartland wrote:
> > > > > > > rm> >  Original Message -
> > > > > > > rm> > From: "Rick Macklem" 
> > > > > > > rm> > > At a glance, it looks to me like 8.x is affected. Note that the
> > > > > > > rm> > > bug only affects the new NFS server (the experimental one for 8.x)
> > > > > > > rm> > > when exporting ZFS volumes. (UFS exported volumes don't leak)
> > > > > > > rm> > >
> > > > > > > rm> > > If you are running a server that might be affected, just:
> > > > > > > rm> > > # vmstat -z | fgrep -i namei
> > > > > > > rm> > > on the server and see if the 3rd number shown is increasing.
> > > > > > > rm> >
> > > > > > > rm> > Many thanks Rick wasnt aware we had anything experimental enabled
> > > > > > > rm> > but I think that would be a yes looking at these number:-
> > > > > > > rm> >
> > > > > > > rm> > vmstat -z | fgrep -i namei
> > > > > > > rm> > NAMEI: 1024, 0, 1, 1483, 25285086096, 0
> > > > > > > rm> > vmstat -z | fgrep -i namei
> > > > > > > rm> > NAMEI: 1024, 0, 0, 1484, 25285945725, 0
> > > > > > > rm> >
> > > > > > > rm> ^
> > > > > > > rm> I don't think so, since the 3rd number (USED) is 0 here.
> > > > > > > rm> If that # is increasing over time, you have the leak. You are
> > > > > > > rm> probably running the old (default in 8.x) NFS server.
> > > > > > >
> > > > > > >  Just a report, I confirmed it affected 8.x servers running newnfs.
> > > > > > >
> > > > > > >  Actually I have been suffered from memory starvation symptom on that
> > > > > > >  server (24GB RAM) for a long time and watching vmstat -z
> > > > > > >  periodically. It stopped working once a week. I investigated the
> > > > > > >  vmstat log again and found the amount of NAMEI leak was 11,543,956
> > > > > > >  (about 11GB!) just before the locked-up. After applying the patch,
> > > > > > >  the leak disappeared. Thank you for fixing it!
> > > > > > >
> > > > > > > -- Hiroki
> > > > > And thanks Hiroki for testing it on 8.x.
> > > > >
> > > > > > this is on 8.2-STABLE/amd64 from around August:
> > > > > > same here, this zfs+newnfs has been hanging every few months, and I
> > > > > > can see
> > > > > > now the leak, it's slowly increasing:
> > > > > > NAMEI: 1024, 0, 122975, 529, 15417248, 0
> > > > > > NAMEI: 1024, 0, 122984, 520, 15421772, 0
> > > > > > NAMEI: 1024, 0, 123002, 502, 15424743, 0
> > > > > > NAMEI: 1024, 0, 123008, 496, 15425464, 0
> > > > > >
>

Re: su problem

2012-06-10 Thread Daniel Braniss

> Sami Halabi  wrote:
>  > Hi Oliver,
>  > I saw you had similar problem for console on 2010
>  > 
> http://freebsd.1045724.n5.nabble.com/Serial-console-problems-with-stable-8-td3950684.html
> 
> No, I don't think that the problem is related.  My problem
> was with the serial console, while you don't have a serial
> console attached at all (at least you didn't mention it).
> 
>  > but the thread wasn't ended by recommendation or conclusions by you.
>  >
>  > did you solve that problem then?
> 
> No, I came to the conclusion that the serial console support
> in FreeBSD 8 was broken somehow.  So I removed the console
> cable; it's running with an old VGA CRT as the console for
> now.  Fortunately I require console access very seldom, so
> I don't have to drive to that machine often.  It's still
> annoying, but I didn't find a better solution; downgrading
> to 7.x isn't an option.
>
just for the record, serial on 8.x works fine! the device naming has changed
from sio to uart, and maybe some features. We use it on all our servers, even
redirecting it where possible via iLO, IPMI, DRAC. It is great for debugging
or saving long trips :-)

WARNING: control access to these devices, especially since root can login
on the console!

danny



Re: su problem

2012-06-14 Thread Daniel Braniss

> 
> 
> On 6/10/12 1:52 PM, Daniel Braniss wrote:
> >> Sami Halabi  wrote:
> >>  > Hi Oliver,
> >>  > I saw you had similar problem for console on 2010
> >>  > 
> >> http://freebsd.1045724.n5.nabble.com/Serial-console-problems-with-stable-8-td3950684.html
> >>
> >> No, I don't think that the problem is related.  My problem
> >> was with the serial console, while you don't have a serial
> >> console attached at all (at least you didn't mention it).
> >>
> >>  > but the thread wasn't ended by recommendation or conclusions by you.
> >>  >
> >>  > did you solve that problem then?
> >>
> >> No, I came to the conclusion that the serial console support
> >> in FreeBSD 8 was broken somehow.  So I removed the console
> >> cable; it's running with an old VGA CRT as the console for
> >> now.  Fortunately I require console access very seldom, so
> >> I don't have to drive to that machine often.  It's still
> >> annoying, but I didn't find a better solution; downgrading
> >> to 7.x isn't an option.
> >>
> > just for the record, serial on 8.x works fine! the device naming has changed
> > from sio to uart, and maybe some features. We use it on all our servers,
> > even redirecting it where possible via iLO, IPMI, DRAC. It is great for
> > debugging or saving long trips :-)
> > 
> > WARNING: control access to these devices, especially since root can login
> > on the console!
> > 
> > danny
> >
> 
> Daniel, would you kindly elaborate on the DRAC console redirection thingy ?
> 
> We're using Dells here and I loathe having to use their web interface
> and the java app to get a console shell.

you need the DRAC module - sometimes it's optional, but if you can access it
via the web you probably have it.

you will have to:
    set the BIOS to allow serial over ethernet (I can't remember the exact
    settings off the top of my head at the moment).
    configure /boot/loader.conf:
        console="comconsole,vidconsole"
        comconsole_speed="38400"    -- the speed is what you set in the BIOS
    configure /boot/device.hints:
        hint.uart.0.flags="0x10"    -- or .1. depending on the BIOS settings

    install sysutils/ipmitool from ports
    connect the ethernet port
and finally:
    ipmitool -A MD5 -H c  -U root -I lanplus sol activate

danny




Re: FreeBSD and IPMI how-to (was Re: su problem)

2012-06-15 Thread Daniel Braniss

> Hi, all,
> 
> 
> On 15.06.2012 at 03:27, Matthew X. Economou wrote:
> > Daniel Braniss writes:
> > 
> >> just for the record, serial on 8.x works fine! the device naming
> >> has changed from sio to uart, and maybe some features. We use it
> >> on all our servers, even redirecting it where possible via
> >> iLO, IPMI, DRAC. It is great for debugging or saving long trips :-)
> > 
> > Would some kind soul point me to a howto for configuring IPMI on
> > FreeBSD?  I have a Dell PowerEdge 840 that supports IPMI, but I have
> > no idea how to set it up - either in the BIOS or in FreeBSD.  I've
> > messed around with ipmitools a little, but I haven't gotten it to
> > work.
> 
> Did you
>   kldload ipmi
> ?
> 
> What's the output of
>   dmesg
>   kldstat
> after loading the module?
> 
> With the module loaded, you should be able to get something like this:
> 
> devel# ipmitool sensor
> Ambient      | 23.500 | degrees C | ok | na | 1.000 | 6.000 | 37.000 | 42.000 | na
> Systemboard  | 32.000 | degrees C | ok | na | na    | na    | 60.000 | 65.000 | na
> CPU1         | 49.000 | degrees C | ok | na | na    | na    | 93.000 | 97.000 | na
> CPU2         | 48.000 | degrees C | ok | na | na    | na    | 93.000 | 97.000 | na
> ...
[...]

the ipmi kernel module allows interfacing/communicating with the 'local
system', which is nice, unless the kernel went bonkers.

You can - after some configuring(*) - connect from another host via something
like:
 ipmitool -A MD5 -H  -U root -I lanplus sol activate
and get the remote host console, or do a power cycle:
 ipmitool -A MD5 -H   -U root power cycle


danny
*: you need to configure/enable the bios/drac.
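
If you do this for many machines, a tiny wrapper saves typing. What follows is
only a sketch: the function name bmc_cmd and the host bmc-01.example.org are
invented for illustration, and it merely prints the ipmitool command line
(with the same flags as above) rather than running it, so it can be checked
before being pointed at a real BMC.

```shell
#!/bin/sh
# bmc_cmd HOST ARGS... (hypothetical helper): print, but do not run,
# the ipmitool command line for a remote BMC, using the flags shown
# in the message above. HOST is whatever name/IP the DRAC/BMC has.
bmc_cmd() {
    host=$1
    shift
    echo "ipmitool -A MD5 -H $host -U root -I lanplus $*"
}

bmc_cmd bmc-01.example.org sol activate   # serial-over-LAN console
bmc_cmd bmc-01.example.org power cycle    # hard power cycle
```

Piping the output to sh would actually execute it; ipmitool will then ask for
the password of the given user.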




Re: PORTS_MODULES

2012-06-18 Thread Daniel Braniss

> Howdy,
> 
> This is an FYI to let people know about a really nice feature for those
> that have ports installed which include kernel modules. You can place a
> list in /etc/src.conf like this:
> 
> PORTS_MODULES=  emulators/virtualbox-ose-kmod sysutils/fusefs-kmod
> x11/nvidia-driver
> 
> which will cause those modules to be built and installed with all the
> proper matching stuff at the same time as buildkernel and installkernel.
> 
> This feature has existed for a while, but has had "issues." Thanks to a
> team effort it's a lot more robust now, and ready for prime time (in
> HEAD, and the -STABLE branches for now, soon to be in 9.1-RELEASE).
> 
> Enjoy,
> 
> Doug

nice!
does it also work when cross-compiling? ie, using an amd64-freebsd-8.3 kernel
to compile for i386-freebsd-8.2

thanks,
danny



nfs problems

2012-06-29 Thread Daniel Braniss

Hi,
starting about last week, I'm getting:

rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: write failed on "/net/rnd/dist/tmp/local/amd64.FreeBSD_8.3-wip/compat/linux/usr/lib/locale/locale-archive.tmpl": Permission denied (13)
rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.9]
rsync: connection unexpectedly closed (21872 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]

the server is running 8.2, but the client is very up to date, 8.3-stable as of
this morning (local time).

after running rsync several times, it finally gets synced.

another item is that I'm using am-utils, but I don't see it causing the problem.

I will try using tcp (instead of udp) soon.

any insights?

cheers,
danny



Re: nfs problems

2012-06-29 Thread Daniel Braniss

> Hi,
> starting about last week, I'm getting:
> 
> rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
> rsync: write failed on "/net/rnd/dist/tmp/local/amd64.FreeBSD_8.3-wip/compat/linux/usr/lib/locale/locale-archive.tmpl": Permission denied (13)
> rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.9]
> rsync: connection unexpectedly closed (21872 bytes received so far) [sender]
> rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]
> 
> the server is running 8.2, but the client is very up to date, 8.3-stable as of
> this morning (local time).
> 
> after running rsync several times, it finally gets synced.
> 
> another item is that I'm using am-utils, but I don't see it causing the problem.
> 
> I will try using tcp (instead of udp) soon.
> 
> any insights?
> 
> cheers,
>   danny

the problem is most probably NFS/UDP related.

I took am-utils out of the equation.
mounted using TCP, and no problems
mounted using UDP:
Jun 29 12:38:14 pe-02 kernel: nfs server nrnfdn:sf/s ds isseterr:vve ernr o 
trrnn dd:r:e/s/pddoinisdsitt::n  nngoo
Jun 29 12:38:14 pe-02 kernel: tt
Jun 29 12:38:14 pe-02 kernel: 
Jun 29 12:38:14 pe-02 kernel: <<66>>  rreessppoonnddiinngg
Jun 29 12:38:14 pe-02 kernel: 
Jun 29 12:38:14 pe-02 kernel: nfs server rnd:/dist: not responding
Jun 29 12:38:14 pe-02 last message repeated 11 times
Jun 29 12:38:27 pe-02 kernel: nfs server rnd:/dist: is alive again

the above happens about every 15 seconds
(you have to learn to read in between the bytes :-)

cheers,
danny




Re: nfs problems

2012-06-29 Thread Daniel Braniss

> On 29/06/2012 10:45, Daniel Braniss wrote:
> >> Hi,
> >> starting about last week, I'm getting:
> >>
> >> rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
> >> rsync: write failed on "/net/rnd/dist/tmp/local/amd64.FreeBSD_8.3-wip/compat/linux/usr/lib/locale/locale-archive.tmpl": Permission denied (13)
> >> rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.9]
> >> rsync: connection unexpectedly closed (21872 bytes received so far) [sender]
> >> rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]
> >>
> >> the server is running 8.2, but the client is very up to date, 8.3-stable as
> >> of this morning (local time).
> >>
> >> after running rsync several times, it finally gets synced.
> >>
> >> another item is that I'm using am-utils, but I don't see it causing the problem.
> >>
> >> I will try using tcp (instead of udp) soon.
> >>
> >> any insights?
> >>
> >> cheers,
> >>danny
> > the problem is most probably NFS/UDP related.
> >
> > I took am-utils out of the equation.
> > mounted using TCP, and no problems
> > mounted using UDP:
> > Jun 29 12:38:14 pe-02 kernel: nfs server nrnfdn:sf/s ds isseterr:vve ernr o 
> > trrnn dd:r:e/s/pddoinisdsitt::n  nngoo
> > Jun 29 12:38:14 pe-02 kernel: tt
> > Jun 29 12:38:14 pe-02 kernel: 
> > Jun 29 12:38:14 pe-02 kernel: <<66>>  rreessppoonnddiinngg
> > Jun 29 12:38:14 pe-02 kernel: 
> > Jun 29 12:38:14 pe-02 kernel: nfs server rnd:/dist: not responding
> > Jun 29 12:38:14 pe-02 last message repeated 11 times
> > Jun 29 12:38:27 pe-02 kernel: nfs server rnd:/dist: is alive again
> >
> > the above happens about every 15 seconds
> > (you have to learn to read in between the bytes :-)
> >
> > cheers,
> > danny
> >
> Its also possible you are hitting a bug I came across recently.
> See http://lists.freebsd.org/pipermail/freebsd-current/2012-June/034860.html
> basicly  mountd may give incorrect permission denied errors when it is
> refreshing the exports list due to non-atomic operations.
> see
> 
> kern/131342
> kern/136865
Hi Vince,
I thought so too, there used to be a bug caused by am-utils umounting, succeeding
even if the mount was active, then re-mounting, which caused all kinds
of problems, the workaround was to increase the timeout.
But I don't think it's the case here, unless mountd has a life of its
own. Furthermore, rsync works without a glitch when mounted nfs/tcp.
thanks,
danny



Re: [Stable 7] CPIO breakage/

2010-06-15 Thread Daniel Braniss

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
> 
> On 2010/06/15 17:05, Sean Bruno wrote:
> > On Tue, 2010-06-15 at 17:10 -0500, Sean Bruno wrote:
> >> http://svn.freebsd.org/viewvc/base?limit_changes=0&view=revision&revision=208361
> >>
> >> I'm not sure what's up with this update, but it hosed up the default
> >> behavior of cpio.
> >>
> >> It appears now that -o won't do the same things that it used to:
> >>
> >> + cd /
> >> + find -x .
> >> + egrep -v '^\.(/snap|/usr/sup|/boot/kernel/kernel
> >> \.[[:alpha:]_]+\.[[:digit:]]+|/boot/kernel/kernel
> >> \.old|/etc/start_if.*|/etc/ssh/ssh_host_.*key|/etc/hostid|/etc/(master.passwd|passwd|spwd.db|pwd.db))'
> >> + '[' -n '' ']'
> >> + '[' 7 = 4 ']'
> >> + '[' -n '' -a -z '' ']'
> >> + '[' -n /home/backup ']'
> >> + echo 'dumping / ...'
> >> dumping / ...
> >> + cpio -o --quiet --format crc -O /home/backup/root.amd64.cpio
> >> cpio: ./dev not dumped: minor number would be truncated
> >> cpio: Removing leading `/' from member names
> >> cpio: ./proc not dumped: minor number would be truncated
> >> cpio: Removing leading `../' from member names
> >>
> >> We've had to revert this change from our local tree, suggestions?
> >>
> >> Sean
> > 
> > 
> > A little more background.  It looks like symlinks are getting stripped
> > of their '/' which sucks.  Ideas?
> > 
> > Sean
> > 
> > e.g. /home/foo/bar -> /opt/baz/blob
> > 
> > becomes
> > 
> > home/foo/bar -> opt/baz/blob   
> > 
> > Yuck.
> 
> This is a security measurement I think.
> 
> - --absolute-filenames disables this behavior.

A similar 'security feature' was introduced some time ago, which 'silently'
broke the firefox installation: it refused to allow symlinks in the destination
directory. Of course the error was ignored by 'make install', so it took
some time to find out that nothing was installed - my /usr/local is
symlinked. The solution was to 'fix' cpio to behave as before, since adding
the ignore-symlinks feature to firefox's makefile was beyond me :-)

danny



Re: diskless boot, nfs server behind router

2010-06-28 Thread Daniel Braniss

> 
> 
> On Mon, 28 Jun 2010, al...@ulgsm.ru wrote:
> 
> >
> >
> > kernel built with:
> > options BOOTP  # Use BOOTP to obtain IP address/hostname
> > options BOOTP_NFSROOT  # NFS mount root file system using BOOTP info
> > options BOOTP_NFSV3
> >
> Try building a kernel without the above options, but with
> options NFS_ROOT
> specified. I think that's what most pxeboot users do and it was what
> I had assumed when I looked at the code.
> 
> If that doesn't fix the problem...I haven't got a solution for you, rick

I use:
options BOOTP_NFSV3 # Use NFS v3 to NFS mount root

but the best advice I can give: run tcpdump/wireshark on the server,
it is very enlightening.

danny



Re: diskless boot, nfs server behind router

2010-06-29 Thread Daniel Braniss

> 
> 
> On Mon, 28 Jun 2010, Daniel Braniss wrote:
> 
> >>
> >>
> >> On Mon, 28 Jun 2010, al...@ulgsm.ru wrote:
> >>
> >>>
> >>>
> >>> kernel built with:
> >>> options BOOTP  # Use BOOTP to obtain IP address/hostname
> >>> options BOOTP_NFSROOT  # NFS mount root file system using BOOTP info
> >>> options BOOTP_NFSV3
> >>>
> >> Try building a kernel without the above options, but with
> >> options NFS_ROOT
> >> specified. I think that's what most pxeboot users do and it was what
> >> I had assumed when I looked at the code.
> >>
> >> If that doesn't fix the problem...I haven't got a solution for you, rick
> >
> > I use:
> > options BOOTP_NFSV3 # Use NFS v3 to NFS mount root
> >
> 
> Here's the critical snippet of code:
> #if defined(BOOTP_NFSROOT) && defined(BOOTP)
>   bootpc_init();  /* use bootp to get nfs_diskless filled in */
> #elif defined(NFS_ROOT)
>   nfs_setup_diskless();
> #endif
> 
> Just fyi, as you can see, unless you have BOOTP_NFSROOT and BOOTP options, 
> it does things the NFS_ROOT way and basically ignores BOOTP_NFSV3.
> (At least thats the way it looks to me. I've been tricked by convoluted
> code before:-)

you are correct, I missed the NFS_ROOT which is defined in GENERIC, and yes,
convoluted is an understatement :-)

danny




problems with 8.1-PRERELEASE

2010-07-08 Thread Daniel Braniss

Hi,
I'm running a recent 8.1-Pre (Friday July 2nd), but I've seen this in previous
ones too: make buildworld -j will sometimes fail, or even panic.
When it fails it's usually some 'internal compiler error' or
panic: page fault. I've seen the failures on different hardware, all running
the amd64 version, so I doubt it's hardware. Another common point: they are all
multicore machines, both intel and amd.

Any one else seeing this?

danny



Re: net-booting the install disks (Re: 8.x grudges)

2010-07-08 Thread Daniel Braniss

> On Thu, Jul 08, 2010 at 11:08:04AM -0400, Mikhail T. wrote:
> > 08.07.2010 09:53, Jeremy Chadwick wrote:
> > >Then don't modify loader.conf.  Instead, once the "Welcome to FreeBSD!"
> > >portion of loader appears, press "6" to shell to the loader prompt
> > >and type:
> > >
> > >set vfs.root.mountfrom="ufs:/dev/md0"
> > >boot
> > Yes, that works... It just should not be necessary.
> 
> Okay, so let me get this straight.  First the complaint was that you had
> to modify loader.conf, which involved extracting the CD image, editing
> the file, yadda yadda.  Now that you've been shown you don't have to
> edit loader.conf, the complaint is "it shouldn't be necessary".  :-)
> 
> There's actually quite a bit about FreeBSD that "shouldn't be
> necessary" (from an administrator's point of view), but that's a
> completely separate issue when compared to your "when I do thing X in
> the kernel config, it breaks".  Which of those two approaches do you
> want to focus on?
> 
> > Red Hat's "kickstart" does not require one to extract CD-images to
> > fiddle with a couple of lines, and FreeBSD comes tantalizingly close
> > to offer the same functionality.  Just not quite :-(
> 
> I've PXE booted Ubuntu and Debian.  It was easy to accomplish (read:
> easier than FreeBSD) because they offer pxelinux vs. FreeBSD's pxeboot.
> 
> pxelinux[1] offers the ability to read a configuration file via TFTP,
> which configures pxelinux itself.  The configuration capabilities are
> very impressive[2].  FreeBSD folks interested in PXE should really take
> a look at this thing.  I believe the configuration file is read and
> applied immediately, so things like serial port speed changes happen
> before pxelinux outputs anything (e.g. no need to rebuild pxelinux just
> to get a faster rate).
> 
> That said, given that FreeBSD's pxeboot requires a bunch of extra work
> (rebuilding for faster serial speed, and a bunch of other stuff -- it's
> in my doc), I'm a surprised you're not complaining about that.  :-)
> 
> The bottom line: the PXE booting framework in FreeBSD could be improved.

It has been improved, though not the documentation :-(

you can configure most of the stuff via DHCP, take a look
at src/lib/libstand/bootp.c

example lines from dhcpd.conf:
option FBSD.ind0 "hint.uart.0.flags=0x10"
option FBSD.ind1 "kern.ipc.semmni=256"
option FBSD.ind2 "kern.ipc.semmns=2048"

and with this code in rc.initdiskless:

# mount the shared /conf if conf-path has the form host:/path
confpath=`kenv conf-path`
if [ -n "$confpath" ] ; then
    if [ "`expr $confpath : '\(.*\):'`" ] ; then
        echo Mounting $confpath on /conf
        mount_nfs $confpath /conf
        chkerr $? "mount_nfs $confpath /conf"
        to_umount="${to_umount} $confpath"
    fi
fi

# turn the rc.* kenv variables into shell variables ($conf0 ... $conf9)
eval `kenv | sed -n 's/^rc\.//p'`
rm -f /etc/rc.conf /etc/rc.conf.local
for fc in $conf0 $conf1 $conf2 $conf3 $conf4 $conf5 $conf6 $conf7 $conf8 \
    $conf9 rc.conf.$hostname
do
    ho=`expr $fc : '\(.*\):'`       # host part, empty for a local name
    fl=`expr $fc : '.*/\(.*\)'`     # file name after the last /
    if [ "${ho}" != "" ]; then
        # host:/dir/file - mount host:/dir and append file to rc.conf
        mp=`expr $fc : '\(.*\)/.*'`
        mount_nfs $mp /mnt > /dev/null 2>&1
        if [ -f /mnt/$fl ]; then
            echo "# from $fc /mnt/$fl" >> /etc/rc.conf
            cat /mnt/$fl >> /etc/rc.conf
        fi
        umount /mnt > /dev/null 2>&1
    elif [ -e /conf/$fc ] ; then
        # plain name - take it from the shared /conf
        echo "# from /conf/$fc" >> /etc/rc.conf
        cat /conf/$fc >> /etc/rc.conf
    fi
done

and these lines in dhcpd.conf
option FBSD.conf-path="fr-01:/vol/system/share/conf"
option FBSD.rc-conf3 "rc.ws8"
...

will generate a 'personalized' rc.conf

danny
PS: this is not the first time I have posted this.
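
In case the expr patterns in the loop above look cryptic, here is a small
standalone sketch of how they take a spec apart. The helper name split_spec
and the host:/path spec are made up for illustration (only the plain name
rc.ws8 appears in the dhcpd.conf lines above); the three patterns are the ones
from the script.

```shell
#!/bin/sh
# split_spec SPEC (hypothetical helper): show how the expr patterns
# from the rc.initdiskless fragment split "host:/dir/file".
split_spec() {
    fc=$1
    # expr exits non-zero when nothing matches, hence the || fallbacks
    ho=$(expr "$fc" : '\(.*\):') || ho=""      # everything before the ':'
    mp=$(expr "$fc" : '\(.*\)/.*') || mp=""    # everything before the last '/'
    fl=$(expr "$fc" : '.*/\(.*\)') || fl=""    # everything after the last '/'
    echo "host=$ho mount=$mp file=$fl"
}

split_spec "fr-01:/vol/system/share/conf/rc.ws8"
split_spec "rc.ws8"
```

The first call shows the NFS case (host set, so the script mounts the
directory and reads the file); the second shows a plain name, where all three
come back empty and the script falls through to /conf/rc.ws8.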


[...]



latest 8.1 hangs on xpt_config

2010-07-22 Thread Daniel Braniss

It seems that the latest changes (last 7 days) introduced this problem:
...
run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config
...

i'll try to hunt this down, but any help is welcome.

danny



Re: latest 8.1 hangs on xpt_config

2010-07-23 Thread Daniel Braniss

> On Fri, Jul 23, 2010 at 09:35:55AM +0300, Daniel Braniss wrote:
> > It seems that the latest changes (last 7 days) introduced this problem:
> > ...
> > run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
> > run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
> > run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
> > run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config
> > ...
> > 
> > i'll try to hunt this down, but any help is welcome.
> 
> Recent to semi-recent commits relevant to xpt that I can find.  The
> problem might not be even in xpt though.  Which xpt piece pertains to
> you probably depends on your system setup/configuration.  Dates/times
> are in PDT/UTC-0700:
> 
> -rw-r--r--  1 root wheel    6037  1 Mar 22:48 /usr/src/sys/cam/cam_xpt_internal.h
> -rw-r--r--  1 root wheel  124773  9 May 10:19 /usr/src/sys/cam/cam_xpt.c
> -rw-r--r--  1 root wheel   72556 23 May 10:41 /usr/src/sys/cam/scsi/scsi_xpt.c
> -rw-r--r--  1 root wheel   56663 19 Jul 05:28 /usr/src/sys/cam/ata/ata_xpt.c
> 
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/cam_xpt_internal.h
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/cam_xpt.c
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/scsi/scsi_xpt.c
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/ata/ata_xpt.c

thanks Jeremy, I'll try and make some sense of the changes.

here is some more info:
there are one disk and one dvd connected via SATA

CPU: Intel(R) Core(TM) i5 CPU 660  @ 3.33GHz (3325.02-MHz K8-class 
CPU)^M
  Origin = "GenuineIntel"  Id = 0x20652  Family = 6  Model = 25  Stepping = 2^M

Features=0xbfebfbff^M

Features2=0x298e3ff^M
  AMD Features=0x28100800^M
  AMD Features2=0x1^M
...

atapci0:  port 0xf0f0-0xf0f7,0xf0e0-0xf0e3,0xf0d0-0xf0d7,0xf0c0-0xf0c3,0xf0b0-0xf0bf irq 18 at device 22.2 on pci0
atapci0: Reserved 0x10 bytes for rid 0x20 type 4 at 0xf0b0
ioapic0: routing intpin 18 (PCI IRQ 18) to lapic 0 vector 49
atapci0: [MPSAFE]
atapci0: [ITHREAD]
ata2:  on atapci0
atapci0: Reserved 0x8 bytes for rid 0x10 type 4 at 0xf0f0
atapci0: Reserved 0x4 bytes for rid 0x14 type 4 at 0xf0e0
ata2: reset tp1 mask=03 ostat0=7f ostat1=7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: stat1=0x7f err=0x7f lsb=0x7f msb=0x7f
ata2: reset tp2 stat0=ff stat1=ff devices=0x0
ata2: [MPSAFE]
ata2: [ITHREAD]
ata3:  on atapci0
atapci0: Reserved 0x8 bytes for rid 0x18 type 4 at 0xf0d0
atapci0: Reserved 0x4 bytes for rid 0x1c type 4 at 0xf0c0
ata3: reset tp1 mask=03 ostat0=7f ostat1=7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat0=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: stat1=0x7f err=0x7f lsb=0x7f msb=0x7f
ata3: reset tp2 stat0=ff stat1=ff devices=0x0
ata3: [MPSAFE]
ata3: [ITHREAD]
...
ahci0:  port 0xf090-0xf097,0xf080-0xf083,0xf070-0xf077,0xf060-0xf063,0xf020-0xf03f mem 0xfe425000-0xfe4257ff irq 19 at device 31.2 on pci0
ahci0: Reserved 0x800 bytes for rid 0x24 type 3 at 0xfe425000
ahci0: attempting to allocate 1 MSI vectors (1 supported)
msi: routing MSI IRQ 257 to local APIC 0 vector 53
ahci0: using IRQ 257 for MSI
ahci0: [MPSAFE]
ahci0: [ITHREAD]
ahci0: AHCI v1.30 with 6 3Gbps ports, Port Multiplier not supported
ahci0: Caps: 64bit NCQ SNTF MPS ALP AL CLO 3Gbps PMD SSC PSC 32cmd EM eSATA 6ports
ahci0: Caps2: APST
ahci0: EM Caps: ALHD XMT SMB LED
ahcich0:  at channel 0 on ahci0
ahcich0: [MPSAFE]
ahcich0: [ITHREAD]
ahcich0: Caps:
ahcich1:  at channel 1 on ahci0
ahcich1: [MPSAFE]
ahcich1: [ITHREAD]
ahcich1: Caps:
ahcich2:  at channel 4 on ahci0
ahcich2: [MPSAFE]
ahcich2: [ITHREAD]
ahcich2: Caps: HPCP ESP
...
ata2: Identifying devices:
ata2: New devices:
ata3: Identifying devices:
ata3: New

WITNESS is the culprit was Re: latest 8.1 hangs on xpt_config

2010-07-23 Thread Daniel Braniss

> Daniel Braniss wrote:
> >> On Fri, Jul 23, 2010 at 09:35:55AM +0300, Daniel Braniss wrote:
> >>> It seems that the latest changes (last 7 days) introduced this problem:
> >>> ...
> >>> run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
> >>> run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
> >>> run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
> >>> run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config
> >>> ...
> >>>
> >>> i'll try to hunt this down, but any help is welcome.
> >> Recent to semi-recent commits relevant to xpt that I can find are below.
> >> The problem might not even be in xpt, though.  Which xpt piece pertains
> >> to you probably depends on your system setup/configuration.  Dates/times
> >> are in PDT/UTC-0700:
> >>
> >> -rw-r--r--  1 root wheel   6037  1 Mar 22:48 /usr/src/sys/cam/cam_xpt_internal.h
> >> -rw-r--r--  1 root wheel 124773  9 May 10:19 /usr/src/sys/cam/cam_xpt.c
> >> -rw-r--r--  1 root wheel  72556 23 May 10:41 /usr/src/sys/cam/scsi/scsi_xpt.c
> >> -rw-r--r--  1 root wheel  56663 19 Jul 05:28 /usr/src/sys/cam/ata/ata_xpt.c
> >>
> >> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/cam_xpt_internal.h
> >> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/cam_xpt.c
> >> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/scsi/scsi_xpt.c
> >> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/ata/ata_xpt.c
> > 
> > thanks Jeremy, i'll try and make some sense of the changes.
> > 
> > here is some more info:
> > there are one disk and one dvd connected via SATA
> 
> I recently had a report about a similar problem with the "PIONEER DVD-RW
> DVR-215 1.19" drive. Do you have the same device or another Pioneer?
> 
> In that case this patch:
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/ata/ata_xpt.c.diff?r1=1.3.2.29;r2=1.3.2.30;f=h
> allowed the system to boot, though the problem seemed to be hardware. Try
> this patch; it may at least give additional info about the problem.

That was my first guess, so I detached the DVD, but the problem persisted.
the device is:
<ATAPI DVD A  DH16AAS JL34> Removable CD-ROM SCSI-0 device

anyway, I compiled a kernel without WITNESS, and it now works ok!

thanks all,
danny



Re: WITNESS is the culprit was Re: latest 8.1 hangs on xpt_config

2010-07-23 Thread Daniel Braniss


> That's hardly a solution or reason.
 
of course it's not, it's just that the last successful boot had WITNESS
configured, and with the latest patches it hung; compiling without WITNESS
allowed the boot to proceed.

>  Still, please try the patch (or a fresh
> 8-STABLE with it), maybe it will tell more.
> 
according to my logs (i jump through some hoops sync'ing via svn/hg, which
makes following versions a bit of a problem :-) your patch seems to be already
in my kernel:
hg log ata_xpt.c -p
changeset:   2939:440362ab79cb
branch:  8
tag: tip
parent:  2938:fc1c9d5f4b38
parent:  2937:846cb2242d34
user:da...@cs.huji.ac.il
date:Fri Jul 23 08:41:24 2010 +0300
summary: -- merge from head --

diff -r fc1c9d5f4b38 -r 440362ab79cb sys/cam/ata/ata_xpt.c
--- a/sys/cam/ata/ata_xpt.c Fri Jul 23 08:40:46 2010 +0300
+++ b/sys/cam/ata/ata_xpt.c Fri Jul 23 08:41:24 2010 +0300
@@ -134,6 +134,7 @@
uint32_tpm_prv;
int restart;
int spinup;
+   int faults;
u_int   caps;
struct cam_periph *periph;
 } probe_softc;
@@ -738,14 +739,28 @@
ident_buf = &path->device->ident_data;
 
if ((done_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) {
-device_fail:   if ((!softc->restart) &&
-   cam_periph_error(done_ccb, 0, 0, NULL) == ERESTART) {
+   if (softc->restart) {
+   if (bootverbose) {
+   cam_error_print(done_ccb,
+   CAM_ESF_ALL, CAM_EPF_ALL);
+   }
+   } else if (cam_periph_error(done_ccb, 0, 0, NULL) == ERESTART)
return;
-   } else if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) {
+   if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) {
/* Don't wedge the queue */
xpt_release_devq(done_ccb->ccb_h.path, /*count*/1,
 /*run_queue*/TRUE);
}
+   if (softc->restart) {
+   softc->faults++;
+   if ((done_ccb->ccb_h.status & CAM_STATUS_MASK) ==
+   CAM_CMD_TIMEOUT)
+   softc->faults += 4;
+   if (softc->faults < 10)
+   goto done;
+   else
+   softc->restart = 0;
+   } else
/* Old PIO2 devices may not support mode setting. */
if (softc->action == PROBE_SETMODE &&
ata_max_pmode(ident_buf) <= ATA_PIO2 &&
@@ -761,7 +776,7 @@
 * already marked unconfigured, notify the peripheral
 * drivers that this device is no more.
 */
-   if ((path->device->flags & CAM_DEV_UNCONFIGURED) == 0)
+device_fail:   if ((path->device->flags & CAM_DEV_UNCONFIGURED) == 0)
xpt_async(AC_LOST_DEVICE, path, NULL);
found = 0;
goto done;
@@ -1209,6 +1224,12 @@
!(work_ccb->cpi.hba_misc & PIM_NOBUSRESET) &&
!timevalisset(&request_ccb->ccb_h.path->bus->last_reset)) {
reset_ccb = xpt_alloc_ccb_nowait();
+   if (reset_ccb == NULL) {
+   request_ccb->ccb_h.status = CAM_RESRC_UNAVAIL;
+   xpt_free_ccb(work_ccb);
+   xpt_done(request_ccb);
+   return;
+   }
xpt_setup_ccb(&reset_ccb->ccb_h, request_ccb->
ccb_h.path,
  CAM_PRIORITY_NONE);
reset_ccb->ccb_h.func_code = XPT_RESET_BUS;
@@ -1228,6 +1249,7 @@
malloc(sizeof(ata_scan_bus_info), M_CAMXPT, M_NOWAIT);
if (scan_info == NULL) {
request_ccb->ccb_h.status = CAM_RESRC_UNAVAIL;
+   xpt_free_ccb(work_ccb);
xpt_done(request_ccb);
return;
}

cheers,
danny
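
The retry accounting that the diff above adds to the probe path can be modelled outside the kernel. What follows is only a minimal sketch under stated assumptions: struct probe_state and probe_fault() are hypothetical names, not CAM code, and the function only mirrors the added branch. The constants (one point per fault, four more on a command timeout, a limit of 10) are the ones visible in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two probe_softc fields the diff touches. */
struct probe_state {
	int restart;	/* non-zero while the probe is in its restart pass */
	int faults;	/* accumulated fault score */
};

/*
 * Model of the added branch. Returns true when the diff would take
 * "goto done", and false when execution would fall through to the
 * failure handling that follows.
 */
bool
probe_fault(struct probe_state *st, bool timed_out)
{
	assert(st != NULL);
	if (!st->restart)
		return (false);		/* not restarting: nothing is counted */
	st->faults++;			/* every failed command costs 1 */
	if (timed_out)
		st->faults += 4;	/* a timeout costs 5 in total */
	if (st->faults < 10)
		return (true);		/* still under the limit */
	st->restart = 0;		/* limit reached: stop restarting */
	return (false);
}
```

Under this model two consecutive timeouts, or ten ordinary failures, are enough to clear the restart flag.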



Re: Diskless/readonly root booting issues

2010-09-30 Thread Daniel Braniss

> Hi all,
> 
> I've been working on updating my semi-embedded images to
> 7.3-stable of late (I generally wait for .3+ releases), it's been a
> few years since the last time I did one of these and I'm having some
> issues getting my netboot test environment to behave itself.
> 
> I'm sure it's something simple but I've spent quite a bit of time
> looking for answers and poking the system but no joy yet.
> 
> Basically I use a PXE booted NFS root to test my reduced footprint
> image builds, the boot is working but init is attempting to remount /
> rw (in spite of it being marked ro in fstab) which of course fails
> because the directory is exported ro from the NFS server at which
> point the system dumps me to single user mode;
> 
> === OUTPUT ===
> 
> Starting file system checks:
> udp: Netconfig database not found
> Mounting root filesystem rw failed, startup aborted
> ERROR: ABORTING BOOT (sending SIGTERM to parent)!
> Sep 30 09:60:02 init: /bin/sh on /etc/rc terminated abnormally, going
> to single user mode
> Enter full pathname of shell or RETURN for /bin/sh:
> 
> 
> 
> Relevant configs from the diskless root
> 
> == rc.conf ==
> 
> ifconfig_le0="DHCP"
> 
> diskless_mount=/etc/rc.initdiskless
> 
> varsize=8192
> varmfs="YES"
> 
> tmpsize=8192
> tmpmfs="YES"
> 
> nfs_client_enable="YES"
> 
> dumpdev="NO"
> 
> =
> 
> rc.initdiskless is the version from /usr/share/examples/rc.initdiskless
> 
> == fstab ==
> 
> 192.168.2.2:/usr/fbtest / nfs ro 0 0
> proc /proc procfs rw 0 0
> 
> 
> 
> == loader.conf ==
> 
> verbose_loading="YES"
> 
> autoboot_delay="2"
> 
> 
> 
> Kernel is (obviously) built with NFS_ROOT and NFSCLIENT, relatively
> minimalist otherwise, have also tested with GENERIC, same result.
> 
> I must be forgetting something simple in all of this, I don't recall
> it being terribly difficult to get this stuff working when I was doing
> my original work with 6.3, though I don't recall the use of the
> initdiskless script, IIRC I was using rc.diskless2 which (again IIRC)
> was later replaced by /etc/rc.d/diskless but I've not been able to
> find this script anywhere.
> 
> Any suggestions would be greatly appreciated at this point.
> 
> Thanks,
> 
> Morgan Reed

firstly, you should be using the latest pxeboot: it passes the root file-handle
to the kernel, so there is no need to remount /, and you can remove the root
line from the fstab.
secondly, try using /etc/rc.initdiskless - which is the default.
use the KISS method :-)

danny



boot0cfg problems

2010-10-01 Thread Daniel Braniss

In the not so distant past, boot0cfg -sn ... used to work, then it only
partially worked: it would modify the data in boot but not the mbr, for
which 'gpart -s set active -in ...' modified the mbr. Now
# boot0cfg -s1 -v /dev/mfid0
boot0cfg: write_mbr: /dev/mfid0: Operation not permitted
but:
# boot0cfg -v /dev/mfid0
#   flag     start chs   type       end chs       offset         size
1   0x80      0:  1: 1   0xa5   1023:212:63           63     41943006
2   0x00   1023:255:63   0xa5   1023:169:63     41943069     41943006
3   0x00   1023:255:63   0xa5   1023:126:63     83886075     41943006
4   0x00   1023:255:63   0xa5   1023:201:63    125829081   1046478825

version=2.0  drive=0x80  mask=0x3  ticks=182  bell=# (0x23)
options=packet,update,nosetdrv
volume serial ID 9090-9090
default_selection=F2 (Slice 2)




Re: boot0cfg problems

2010-10-01 Thread Daniel Braniss

> On Fri, Oct 01, 2010 at 09:26:41AM +0200, Daniel Braniss wrote:
> > In the not so distant past, boot0cfg -sn ... used to work, then it only
> > partially worked: it would modify the data in boot but not the mbr, for
> > which 'gpart -s set active -in ...' modified the mbr. Now
> > # boot0cfg -s1 -v /dev/mfid0
> > boot0cfg: write_mbr: /dev/mfid0: Operation not permitted
> > but:
> > # boot0cfg -v /dev/mfid0
> > #   flag     start chs   type       end chs       offset         size
> > 1   0x80      0:  1: 1   0xa5   1023:212:63           63     41943006
> > 2   0x00   1023:255:63   0xa5   1023:169:63     41943069     41943006
> > 3   0x00   1023:255:63   0xa5   1023:126:63     83886075     41943006
> > 4   0x00   1023:255:63   0xa5   1023:201:63    125829081   1046478825
> > 
> > version=2.0  drive=0x80  mask=0x3  ticks=182  bell=# (0x23)
> > options=packet,update,nosetdrv
> > volume serial ID 9090-9090
> > default_selection=F2 (Slice 2)
> 
> Can you try doing "sysctl kern.geom.debugflags=16" first?
>
this is not really foot-shooting :-), but
- the error msg is gone,
- the slice info is updated,
- but the active bit in the mbr is not! - some bioses rely on it.
looking at the changes done to boot0cfg.c, there is now an err(...) call which
does an exit before the boot is updated. I changed it to a warn(...) and the
old behaviour is back.
BTW,
a- the gpart command should have been: gpart set -a active -i n ...
b- this (gpart) works with kern.geom.debugflags=0.

thanks,
danny
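
The difference between err(3) and warn(3) that matters here is that err() prints its message and exits the process, while warn() prints the same message and returns to the caller. A minimal sketch of that difference follows. It is not the boot0cfg source: write_first_stub() and update_both() are hypothetical names, and the failing write is simulated by setting errno to EPERM.

```c
#include <err.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a first write that the kernel refuses. */
static int
write_first_stub(void)
{
	errno = EPERM;		/* "Operation not permitted", as in the report */
	return (-1);
}

/*
 * Returns true if execution reached the second update step.
 * With fatal == true, err(3) prints the message and exits, so the
 * function never returns. With fatal == false, warn(3) prints the
 * same message to stderr and execution continues.
 */
bool
update_both(bool fatal)
{
	if (write_first_stub() == -1) {
		if (fatal)
			err(1, "first write");
		warn("first write");
	}
	/* the second update step would go here */
	return (true);
}
```

Calling update_both(false) prints the warning and still reaches the second step; update_both(true) terminates the program with exit status 1 before that point.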



Re: boot0cfg problems

2010-10-01 Thread Daniel Braniss

> On Fri, Oct 01, 2010 at 01:20:42PM +0200, Daniel Braniss wrote:
> > > On Fri, Oct 01, 2010 at 09:26:41AM +0200, Daniel Braniss wrote:
> > > > In the not so distant past, boot0cfg -sn ... used to work, then it only
> > > > partially worked: it would modify the data in boot but not the mbr, for
> > > > which 'gpart -s set active -in ...' modified the mbr. Now
> > > > # boot0cfg -s1 -v /dev/mfid0
> > > > boot0cfg: write_mbr: /dev/mfid0: Operation not permitted
> > > > but:
> > > > # boot0cfg -v /dev/mfid0
> > > > #   flag     start chs   type       end chs       offset         size
> > > > 1   0x80      0:  1: 1   0xa5   1023:212:63           63     41943006
> > > > 2   0x00   1023:255:63   0xa5   1023:169:63     41943069     41943006
> > > > 3   0x00   1023:255:63   0xa5   1023:126:63     83886075     41943006
> > > > 4   0x00   1023:255:63   0xa5   1023:201:63    125829081   1046478825
> > > > 
> > > > version=2.0  drive=0x80  mask=0x3  ticks=182  bell=# (0x23)
> > > > options=packet,update,nosetdrv
> > > > volume serial ID 9090-9090
> > > > default_selection=F2 (Slice 2)
> > > 
> > > Can you try doing "sysctl kern.geom.debugflags=16" first?
> > >
> > this is not really foot-shooting :-), but
> > - the error msg is gone,
> > - the slice info is updated,
> > - but the active bit in the mbr is not! - some bioses rely on it.
> > looking at the changes done to boot0cfg.c, there is now an err(...) call which
> > does an exit before the boot is updated. I changed it to a warn(...) and
> > the old behaviour is back.
> > BTW, 
> > a- gpart command should have been: gpart set -a active -i n ...
> > b- this works with kern.geom.debugflags=0.
> 
> Bit 4 (hence 0x10, or 16 decimal) in kern.geom.debugflags is described
> as:
> 
>  0x10 (allow foot shooting)
>  Allow writing to Rank 1 providers.  This would, for example,
>  allow the super-user to overwrite the MBR on the root disk or
>  write random sectors elsewhere to a mounted disk.  The
>  implications are obvious.
> 
> I read this as: "you can't modify the MBR of a root disk unless bit 4 of
> this sysctl is set".  Sector 0 holds the MBR, and boot0cfg modifies the
> MBR.  So can you explain what you mean by "this really isn't
> foot-shooting?"  I mean, even the NOTE section of the boot0cfg(8) man
> page documents what I'm trying to say.
> 
> Anyway, if the MBR did get updated without kern.geom.debugflags having
> bit 4 set, then wouldn't this indicate there's a bug in GEOM's "sector
> 0" protection?

but the mbr did NOT get updated by boot0cfg. gpart does succeed, however, but
gpart knows nothing about the other bits boot0cfg knows, like which slice to
boot from (not to be confused with the current active slice), what bell to
ring, etc.; these are (or used to be) updated before the last change.
anyway, as you correctly pointed out, the problem is in GEOM, being somewhat
overprotective :-)
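
For reference, the "bit 4" arithmetic quoted above is an ordinary mask test: bit 4 has the value 0x10, which is 16 decimal, so "sysctl kern.geom.debugflags=16" sets exactly that bit. A minimal sketch, assuming a hypothetical macro name (GEOM_FOOT_SHOOTING is not taken from the kernel headers) and a plain integer instead of a real sysctl read:

```c
#include <stdbool.h>

/* Hypothetical name for bit 4 of kern.geom.debugflags (0x10 == 16). */
#define GEOM_FOOT_SHOOTING	0x10

/* True if the given debugflags value has the "allow foot shooting" bit set. */
bool
foot_shooting_allowed(int debugflags)
{
	return ((debugflags & GEOM_FOOT_SHOOTING) != 0);
}
```

With this test, the values 16 and 0x11 count as enabled, while 0 and 0x0f do not.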



latest -stable: still waiting after ...

2010-10-20 Thread Daniel Braniss

hi,
with the latest -stable, the boot process gets stuck with
...
ugen2.2:  at usbus2
uhub6:  on usbus2
uhub6: 3 ports with 3 removable, self powered
ugen3.2:  at usbus3
ukbd0:  on usbus3
kbd2 at ukbd0
ums0:  on usbus3
ums0: 3 buttons and [Z] coordinates ID=0 <- stuck here
run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config  
SMP: AP CPU #1 Launched!
SMP: AP CPU #7 Launched!
SMP: AP CPU #3 Launched!


this does not happen with an older -stable (August) kernel
Cheers,
danny



Re: NFS deadlock (unkillable nfsd and no mounts work)

2010-11-16 Thread Daniel Braniss

> on 05/11/2010 23:27 Kostik Belousov said the following:
> > I agree that the fix is the right fix for a real issue. It should only
> > affect the filesystems that do support VFS_VGET(). In other words,
> > it is relevant for e.g. UFS exports, but not for ZFS, which is
> > Andrey's case.
> 
> Actually ZFS does implement vfs_vget, but with a special quirk for .zfs/ and
> stuff under it:
> 
> static int
> zfs_vget(vfs_t *vfsp, ino_t ino, int flags, vnode_t **vpp)
> {
> zfsvfs_t*zfsvfs = vfsp->vfs_data;
> znode_t *zp;
> int err;
> 
> /*
>  * zfs_zget() can't operate on virtual entires like .zfs/ or
                                         =======
                                         entries
>  * .zfs/snapshot/ directories, that's why we return EOPNOTSUPP.
>  * This will make NFS to switch to LOOKUP instead of using VGET.
>  */
> if (ino == ZFSCTL_INO_ROOT || ino == ZFSCTL_INO_SNAPDIR)
> return (EOPNOTSUPP);
> ...
> ...
> 
> 
> -- 
> Andriy Gapon
> 
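
The comment in the quoted function describes a fallback: the by-inode path returns EOPNOTSUPP for the virtual entries, and the caller then switches to a lookup by name. A minimal user-space sketch of that pattern follows. All names here (CTL_INO_ROOT, get_by_ino, get_by_name, resolve) are hypothetical and stand in for the kernel interfaces; only the use of EOPNOTSUPP as the "try the other way" signal comes from the quoted code.

```c
#include <errno.h>
#include <stddef.h>

#define CTL_INO_ROOT	1UL	/* hypothetical inode number of a virtual entry */

/* Hypothetical by-inode path: refuses the virtual entry with EOPNOTSUPP. */
static int
get_by_ino(unsigned long ino, const char **out)
{
	if (ino == CTL_INO_ROOT)
		return (EOPNOTSUPP);
	*out = "by-inode";
	return (0);
}

/* Hypothetical by-name path, used as the fallback. */
static int
get_by_name(const char *name, const char **out)
{
	(void)name;
	*out = "by-name";
	return (0);
}

/* Try the inode path first; on EOPNOTSUPP fall back to the name lookup. */
int
resolve(unsigned long ino, const char *name, const char **out)
{
	int error = get_by_ino(ino, out);

	if (error == EOPNOTSUPP)
		error = get_by_name(name, out);
	return (error);
}
```

In this sketch resolve() reports through *out which path produced the result, so a caller can see that the virtual entry was served by the name lookup and any other inode by the direct path.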



panic on boot

2010-12-22 Thread Daniel Braniss

the hardware is a Sun Fire X2200 M2, and it's diskless, PXE booted.

this seems to have started sometime before 8.2, and it
'sometimes happens':

FreeBSD 8.2-PRERELEASE #15 r4274: Wed Dec 22 09:11:27 IST 2010c40, rbp = 
0x80ef5c60 ---
da...@rnd:/home/obj/rnd/r+d/stable/8/sys/HUJI amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Dual-Core AMD Opteron(tm) Processor 2218 (2613.40-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0x40f13  Family = f  Model = 41  Stepping = 3
  Features=0x178bfbff
  Features2=0x2001
  AMD Features=0xea500800
  AMD Features2=0x1f
...
SMP: AP CPU #3 Launched!
(cd0:ata0:0:0:0): SCSI status: Check Condition
cpu3 AP:
(cd0:ata0:0:0:0): SCSI sense: NOT READY asc:3a,0 (Medium not present)
 ID: 0x0300   VER: 0x80050010 LDR: 0x DFR: 0x
(cd0:  lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
ata0:0:  timer: 0x000200ef therm: 0x0001 err: 0x00f00: pmc: 
0x000104000): 
Error 6, Unretryable error
SMP: AP CPU #2 Launched!
cd0 at ata0 bus 0 scbus0 target 0 lun 0
cpu2 AP:
cd0:  ID: 0x0200   VER: 0x80050010 LDR: 0x DFR: 0x
 Removable CD-ROM SCSI-0 device 
  lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
cd0: 33.300MB/s transfers  timer: 0x000200ef therm: 0x0001 err: 0x00f0 
( pmc: 0x00010400UDMA2, 
ATAPI 12bytes, ioapic0: routing intpin 3 (PIO 65534bytesISA IRQ 3)) to lapic 1 
vector 48
f
loiwotaapbilce0 :c lreoaunteirn gs tianrttpeidn
 4 (cd0: Attempt to query device size failed: NOT READY, Medium not present
ISA IRQ 4) to lapic 2 vector 48
ioapic0: routing intpin 9 (ISA IRQ 9) to lapic 3 vector 48
ioapic0: routing intpin 15 (ISA IRQ 15) to lapic 1 vector 49
ioapic0: routing intpin 17 (PCI IRQ 17) to lapic 2 vector 49
ioapic0: routing intpin 18 (PCI IRQ 18) to lapic 3 vector 49
ioapic0: routing intpin 22 (PCI IRQ 22) to lapic 1 vector 50
ioapic0: routing intpin 23 (PCI IRQ 23) to lapic 2 vector 50
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x10
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x808b1581
stack pointer   = 0x28:0x80ef5b20
frame pointer   = 0x28:0x80ef5b50
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= resume, IOPL = 0
current process = 0 (swapper)
trap number = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
panic() at panic+0x187
trap_fatal() at trap_fatal+0x290
trap_pfault() at trap_pfault+0x28f
trap() at trap+0x3df
calltrap() at calltrap+0x8
--- trap 0xc, rip = 0x808b1581, rsp = 0x80ef5b20, rbp = 
0x80ef5b50 ---
intr_execute_handlers() at intr_execute_handlers+0x21
lapic_handle_intr() at lapic_handle_intr+0x37
Xapic_isr1() at Xapic_isr1+0xa5
--- interrupt, rip = 0x808b6cf3, rsp = 0x80ef5c40, rbp = 
0x80ef5c60 ---
spinlock_exit() at spinlock_exit+0x33
ioapic_assign_cpu() at ioapic_assign_cpu+0x123
intr_shuffle_irqs() at intr_shuffle_irqs+0x9d
mi_startup() at mi_startup+0x77
btext() at btext+0x2c
Uptime: 2s



Re: panic on boot

2010-12-22 Thread Daniel Braniss

> On Wednesday, December 22, 2010 5:12:03 am Daniel Braniss wrote:
> > ...
> 
> Can you do 'l *intr_execute_handlers+0x21' and 'l *ioapic_assign_cpu+0x123'
> in 'gdb kernel.debug' of your kernel?

sure, as soon as it happens, and it ain't happening now :-(
but when it does happen, I think it won't let me into the debugger
- I will probably have to recompile
thanks
danny




Re: panic on boot

2010-12-22 Thread Daniel Braniss

ok, it happened
...
Cannot dump. Device not defined or unavailable.
Automatic reboot in 15 seconds - press a key on the console to abort
--> Press a key on the console to reboot,
--> or switch off the system now.


but 
a- the 15 seconds never happen :-)
b- there is some magic to get into the debugger
   but can't find it.
danny



Re: panic on boot

2010-12-22 Thread Daniel Braniss

> On Wednesday, December 22, 2010 10:58:56 am Daniel Braniss wrote:
> > > On Wednesday, December 22, 2010 5:12:03 am Daniel Braniss wrote:
> > > > ...

Re: panic on boot

2010-12-23 Thread Daniel Braniss

> On Thursday, December 23, 2010 1:47:39 am Daniel Braniss wrote:
> > > On Wednesday, December 22, 2010 10:58:56 am Daniel Braniss wrote:
> > > > > On Wednesday, December 22, 2010 5:12:03 am Daniel Braniss wrote:
> > > > > > the hardware is Sun Fire X2200 M2, and it's discless, PXE booted.
> > > > > > 
> > > > > > this seems to have started sometime before 8.2, and it
> > > > > > 'sometimes happens':
> > > > > > 
> > > > > > FreeBSD 8.2-PRERELEASE #15 r4274: Wed Dec 22 09:11:27 IST 2010c40, 
> > > > > > rbp = 
> > > > > > 0x80ef5c60 ---
> > > > > > da...@rnd:/home/obj/rnd/r+d/stable/8/sys/HUJI amd64
> > > > > > Timecounter "i8254" frequency 1193182 Hz quality 0
> > > > > > CPU: Dual-Core AMD Opteron(tm) Processor 2218 (2613.40-MHz K8-class 
> > > > > > CPU)
> > > > > >   Origin = "AuthenticAMD"  Id = 0x40f13  Family = f  Model = 41  
> > > > > > Stepping = 3
> > > > > >   
> > > > > > Features=0x178bfbff > > > > > CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
> > > > > >   Features2=0x2001
> > > > > >   AMD 
> > > > > > Features=0xea500800
> > > > > >   AMD Features2=0x1f
> > > > > > ...
> > > > > > SMP: AP CPU #3 Launched!
> > > > > > (cd0:ata0:0:0:0): SCSI status: Check Condition
> > > > > > cpu3 AP:
> > > > > > (cd0:ata0:0:0:0): SCSI sense: NOT READY asc:3a,0 (Medium not 
> > > > > > present)
> > > > > >  ID: 0x0300   VER: 0x80050010 LDR: 0x DFR: 
> > > > > > 0x
> > > > > > (cd0:  lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 
> > > > > > 0x01ff
> > > > > > ata0:0:  timer: 0x000200ef therm: 0x0001 err: 0x00f00: pmc: 
> > > > > > 0x000104000): 
> > > > > > Error 6, Unretryable error
> > > > > > SMP: AP CPU #2 Launched!
> > > > > > cd0 at ata0 bus 0 scbus0 target 0 lun 0
> > > > > > cpu2 AP:
> > > > > > cd0:  ID: 0x0200   VER: 0x80050010 LDR: 0x DFR: 
> > > > > > 0x
> > > > > >  Removable CD-ROM SCSI-0 device 
> > > > > >   lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 
> > > > > > 0x01ff
> > > > > > cd0: 33.300MB/s transfers  timer: 0x000200ef therm: 0x0001 err: 
> > > > > > 0x00f0 ( pmc: 0x00010400UDMA2, 
> > > > > > ATAPI 12bytes, ioapic0: routing intpin 3 (PIO 65534bytesISA IRQ 3)) 
> > > > > > to lapic 1 vector 48
> > > > > > f
> > > > > > loiwotaapbilce0 :c lreoaunteirn gs tianrttpeidn
> > > > > >  4 (cd0: Attempt to query device size failed: NOT READY, Medium not 
> > > > > > present
> > > > > > ISA IRQ 4) to lapic 2 vector 48
> > > > > > ioapic0: routing intpin 9 (ISA IRQ 9) to lapic 3 vector 48
> > > > > > ioapic0: routing intpin 15 (ISA IRQ 15) to lapic 1 vector 49
> > > > > > ioapic0: routing intpin 17 (PCI IRQ 17) to lapic 2 vector 49
> > > > > > ioapic0: routing intpin 18 (PCI IRQ 18) to lapic 3 vector 49
> > > > > > ioapic0: routing intpin 22 (PCI IRQ 22) to lapic 1 vector 50
> > > > > > ioapic0: routing intpin 23 (PCI IRQ 23) to lapic 2 vector 50
> > > > > > kernel trap 12 with interrupts disabled
> > > > > > 
> > > > > > 
> > > > > > Fatal trap 12: page fault while in kernel mode
> > > > > > cpuid = 0; apic id = 00
> > > > > > fault virtual address   = 0x10
> > > > > > fault code  = supervisor read data, page not present
> > > > > > instruction pointer = 0x20:0x808b1581
> > > > > > stack pointer   = 0x28:0x80ef5b20
> > > > > > frame pointer   = 0x28:0x80ef5b50
> > > > > > code segment= base 0x0, limit 0xf, type 0x1b
> > > > > > = DPL 0, pres 1, long 1, def32 0, gran 1
> > > > > > processor eflags= resume, IOPL = 0
> > > > > > current process = 0 (swapper)

Re: panic on boot

2010-12-23 Thread Daniel Braniss

> On Thursday, December 23, 2010 1:47:39 am Daniel Braniss wrote:
> > > On Wednesday, December 22, 2010 10:58:56 am Daniel Braniss wrote:
> > > > > On Wednesday, December 22, 2010 5:12:03 am Daniel Braniss wrote:
> > > > > > the hardware is Sun Fire X2200 M2, and it's discless, PXE booted.
> > > > > > 
> > > > > > this seems to have started sometime before 8.2, and it
> > > > > > 'sometimes happens':
> > > > > > 
> > > > > > FreeBSD 8.2-PRERELEASE #15 r4274: Wed Dec 22 09:11:27 IST 2010c40, 
> > > > > > rbp = 
> > > > > > 0x80ef5c60 ---
> > > > > > da...@rnd:/home/obj/rnd/r+d/stable/8/sys/HUJI amd64
> > > > > > Timecounter "i8254" frequency 1193182 Hz quality 0
> > > > > > CPU: Dual-Core AMD Opteron(tm) Processor 2218 (2613.40-MHz K8-class 
> > > > > > CPU)
> > > > > >   Origin = "AuthenticAMD"  Id = 0x40f13  Family = f  Model = 41  
> > > > > > Stepping = 3
> > > > > >   
> > > > > > Features=0x178bfbff > > > > > CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
> > > > > >   Features2=0x2001
> > > > > >   AMD 
> > > > > > Features=0xea500800
> > > > > >   AMD Features2=0x1f
> > > > > > ...
> > > > > > SMP: AP CPU #3 Launched!
> > > > > > (cd0:ata0:0:0:0): SCSI status: Check Condition
> > > > > > cpu3 AP:
> > > > > > (cd0:ata0:0:0:0): SCSI sense: NOT READY asc:3a,0 (Medium not 
> > > > > > present)
> > > > > >  ID: 0x0300   VER: 0x80050010 LDR: 0x DFR: 
> > > > > > 0x
> > > > > > (cd0:  lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 
> > > > > > 0x01ff
> > > > > > ata0:0:  timer: 0x000200ef therm: 0x0001 err: 0x00f00: pmc: 
> > > > > > 0x000104000): 
> > > > > > Error 6, Unretryable error
> > > > > > SMP: AP CPU #2 Launched!
> > > > > > cd0 at ata0 bus 0 scbus0 target 0 lun 0
> > > > > > cpu2 AP:
> > > > > > cd0:  ID: 0x0200   VER: 0x80050010 LDR: 0x DFR: 
> > > > > > 0x
> > > > > >  Removable CD-ROM SCSI-0 device 
> > > > > >   lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 
> > > > > > 0x01ff
> > > > > > cd0: 33.300MB/s transfers  timer: 0x000200ef therm: 0x0001 err: 
> > > > > > 0x00f0 ( pmc: 0x00010400UDMA2, 
> > > > > > ATAPI 12bytes, ioapic0: routing intpin 3 (PIO 65534bytesISA IRQ 3)) 
> > > > > > to lapic 1 vector 48
> > > > > > f
> > > > > > loiwotaapbilce0 :c lreoaunteirn gs tianrttpeidn
> > > > > >  4 (cd0: Attempt to query device size failed: NOT READY, Medium not 
> > > > > > present
> > > > > > ISA IRQ 4) to lapic 2 vector 48
> > > > > > ioapic0: routing intpin 9 (ISA IRQ 9) to lapic 3 vector 48
> > > > > > ioapic0: routing intpin 15 (ISA IRQ 15) to lapic 1 vector 49
> > > > > > ioapic0: routing intpin 17 (PCI IRQ 17) to lapic 2 vector 49
> > > > > > ioapic0: routing intpin 18 (PCI IRQ 18) to lapic 3 vector 49
> > > > > > ioapic0: routing intpin 22 (PCI IRQ 22) to lapic 1 vector 50
> > > > > > ioapic0: routing intpin 23 (PCI IRQ 23) to lapic 2 vector 50
> > > > > > kernel trap 12 with interrupts disabled
> > > > > > 
> > > > > > 
> > > > > > Fatal trap 12: page fault while in kernel mode
> > > > > > cpuid = 0; apic id = 00
> > > > > > fault virtual address   = 0x10
> > > > > > fault code  = supervisor read data, page not present
> > > > > > instruction pointer = 0x20:0x808b1581
> > > > > > stack pointer   = 0x28:0x80ef5b20
> > > > > > frame pointer   = 0x28:0x80ef5b50
> > > > > > code segment= base 0x0, limit 0xf, type 0x1b
> > > > > > = DPL 0, pres 1, long 1, def32 0, gran 1
> > > > > > processor eflags= resume, IOPL = 0
> > > > > > current process = 0 (swapper)

unable to pwd in ZFS snapshot

2010-12-25 Thread Daniel Braniss

hi,
this is still broken in 8.2-PRERELEASE, there seems to be a patch, but
it's almost a year old.
http://people.freebsd.org/~jh/patches/zfs-ctldir-vptocnp.diff

danny



Re: unable to pwd in ZFS snapshot

2010-12-26 Thread Daniel Braniss

> On Sun, Dec 26, 2010 at 09:26:13AM +0200, Daniel Braniss wrote:
> > this is still broken in 8.2-PRERELEASE, there seems to be a patch, but
> > it's almost a year old.
> > http://people.freebsd.org/~jh/patches/zfs-ctldir-vptocnp.diff
> 
> Setting snapdir to visible should fix this right away:
> # zfs set snapdir=visible tank/foo
> 
it did indeed!
any reason why this should not be the default behaviour?

thanks,
danny



Re: unable to pwd in ZFS snapshot

2010-12-26 Thread Daniel Braniss

> > On 26 Dec 2010, at 10:05, Daniel Braniss wrote:
> > >> On Sun, Dec 26, 2010 at 09:26:13AM +0200, Daniel Braniss wrote:
> >>> this is still broken in 8.2-PRERELEASE, there seems to be a patch, but
> >>> it's almost a year old.
> >>>   http://people.freebsd.org/~jh/patches/zfs-ctldir-vptocnp.diff
> >> 
> >> Setting snapdir to visible should fix this right away:
> >> # zfs set snapdir=visible tank/foo
> >> 
> > it did indeed!
> > any reason why this should not be the default behaviour?
> 
> Personally, I want to have the snapshot, but not see the directory otherwise
> so that it doesn't get scooped up by rsync et al inadvertently

I agree, so the point is that, as usual, the solution fixes one problem
by creating another one :-)

so basically, the bug is still there, or is it a feature?
ie:
ls /h/.zfs/snapshot/20101225/   works
cd /h/.zfs/snapshot/20101225/   works
pwd
pwd: .: No such file or directory

btw, why use rsync if 'zfs send | zfs recv' works really nicely?

cheers,
danny
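
For reference, the commands touched on in this thread could be sketched as below. The dataset tank/foo and the snapshot name 20101225 come from the thread; the destination backup/foo is a hypothetical name, and the function only prints the commands so they can be reviewed before being run on a real pool.

```shell
#!/bin/sh
# Print (not run) the ZFS commands touched on in this thread.
#   $1 dataset, $2 snapshot name, $3 destination dataset (hypothetical)
zfs_plan() {
    dataset=$1
    snap=$2
    dest=$3
    # Workaround from the thread: make .zfs show up in directory listings.
    echo "zfs set snapdir=visible $dataset"
    # Go back to the inherited setting (hidden unless set on a parent).
    echo "zfs inherit snapdir $dataset"
    # Replicate a snapshot with send/recv instead of rsync.
    echo "zfs send $dataset@$snap | zfs recv $dest"
}

zfs_plan tank/foo 20101225 backup/foo
```

Whether the visible setting is acceptable depends on what else walks the tree; as noted in the thread, tools like rsync may then descend into .zfs unless told otherwise.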



Re: Specifying root mount options on diskless boot.

2011-01-02 Thread Daniel Braniss

> 
> [I'm not sure if -stable is the best list for this but anyway...]
> 
> I'm trying to convert an old laptop running FreeBSD 8.0 into a diskless
> client (since its internal HDD is growing bad spots faster than I can
> repair them).  I have it pxebooting nicely and running with an NFS root
> but it then reports locking problems: devd, syslogd, moused (and maybe
> others) lock their PID file to protect against multiple instances.
> Unfortunately, these daemons all start before statd/lockd and so the
> locking fails and reports "operation not supported".
> 
> It's not practical to reorder the startup sequence to make lockd start
> early enough (I've tried).
> 
> Since the filesystem is reserved for this client, there's no real need
> to forward lock requests across the wire and so specifying "nolockd"
> would be another solution.  Looking through sys/nfsclient/bootp_subr.c,
> DHCP option 130 should allow NFS mount options to be specified (though
> it's not clear that the relevant code path is actually followed because
> I don't see the associated printf()s anywhere on the console).  After
> getting isc-dhcpd to forward this option (made more difficult because
> its documentation is incorrect), it still doesn't work.
> 
> Understanding all this isn't helped by kenv(8) reporting three different
> sets of root filesystem options:
> boot.nfsroot.path="/tank/m3"
> boot.nfsroot.server="192.168.123.200"
> dhcp.option-130="nolockd"
> dhcp.root-path="192.168.123.200:/tank/m3"
> vfs.root.mountfrom="nfs:server:/tank/m3"
> vfs.root.mountfrom.options="rw,tcp,nolockd"
> 
> And the console also reports conflicting root definitions:
> Trying to mount root from nfs:server:/tank/m3
> NFS ROOT: 192.168.123.200:/tank/m3
> 
> Working through all these:
> boot.nfsroot.* appears to be initialised by sys/boot/i386/libi386/pxe.c
> but, whilst nfsclient/nfs_diskless.c can parse boot.nfsroot.options,
> there's no code to initialise that kenv name in pxe.c
> 
> dhcp.* appears to be initialised by lib/libstand/bootp.c - which does
> include code to populate boot.nfsroot.options (using vendor specific
> DHCP option 20) but this code is not compiled in.  Further studying
> of bootp.c shows that it's possible to initialise arbitrary kenv's
> using DHCP options 246-254 - but the DHCPDISCOVER packets do not
> request these options so they don't work without special DHCP server
> configuration (to forward options that aren't requested).
> 
> vfs.root.* is parsed out of /etc/fstab but, other than being
> reported in the console message above, it doesn't appear to be
> used in this environment (it looks like the root entry can be
> commented out of /etc/fstab without problem).
> 
> My final solution was to specify 'boot.nfsroot.options="nolockd"' in
> loader.conf - and this seems to actually work.
> 
> It seems rather unfortunate that FreeBSD has code to allow NFS root
> mount options to be specified via DHCP (admittedly in several
> incompatible ways) but none actually work.  A quick look at -current
> suggests that the situation there remains equally broken.
> 
> Has anyone else tried to use any of this?  And would anyone be interested
> in trying to make it actually work?

Hi Peter,
i have been doing diskless booting for a long time, and am very pleased
(though 8.2-prerelease is causing some problems :-).
In my case /var is mfs, or ufs/zfs, and I have no lockd problems.

here is what you need to do:
either change in libstand/bootp.c:
#define DHCP_ENV	DHCP_ENV_NO_VENDOR
to
#define DHCP_ENV	DHCP_ENV_FREEBSD

or pick my version from:
ftp://ftp.cs.huji.ac.il/users/danny/freebsd/diskless-boot/
and compile a new pxeboot.
this new pxeboot will allow you to pass some key options via dhcp.

next, take a look at
  ftp://ftp.cs.huji.ac.il/users/danny/freebsd/diskless-boot/rc.initdiskless
and make sure that your exported root has /.etc

If your /var is also nfs mounted, unionfs might help too.

just writing quickly so you won't feel discouraged: diskless
actually works.

danny



gstripe/gpart problems.

2011-01-04 Thread Daniel Braniss

Hi,
I have 2 ada disks striped:

# gstripe list
Geom name: s1
State: UP
Status: Total=2, Online=2
Type: AUTOMATIC
Stripesize: 65536
ID: 2442772675
Providers:
1. Name: stripe/s1
   Mediasize: 1000215674880 (932G)
   Sectorsize: 512
   Stripesize: 65536
   Stripeoffset: 0
   Mode: r0w0e0
Consumers:
1. Name: ada0
   Mediasize: 500107862016 (466G)
   Sectorsize: 512
   Mode: r0w0e0
   Number: 0
2. Name: ada1
   Mediasize: 500107862016 (466G)
   Sectorsize: 512
   Mode: r0w0e0
   Number: 1

boot complains:

GEOM_STRIPE: Device s1 created (id=2442772675).
GEOM_STRIPE: Disk ada0 attached to s1.
GEOM: ada0: corrupt or invalid GPT detected.
GEOM: ada0: GPT rejected -- may not be recoverable.
GEOM_STRIPE: Disk ada1 attached to s1.
GEOM_STRIPE: Device s1 activated.

# gpart show
=>        34  1953546173  stripe/s1  GPT  (932G)
          34         128          1  freebsd-boot  (64K)
         162  1953546045             - free -  (932G)
# gpart show
=>        34  1953546173  stripe/s1  GPT  (932G)
          34         128          1  freebsd-boot  (64K)
         162  1953546045             - free -  (932G)

# gpart add -t freebsd-ufs -s 20g stripe/s1
GEOM: ada0: corrupt or invalid GPT detected.
GEOM: ada0: GPT rejected -- may not be recoverable.
stripe/s1p2 added
# gpart show
=>        34  1953546173  stripe/s1  GPT  (932G)
          34         128          1  freebsd-boot  (64K)
         162    41943040          2  freebsd-ufs  (20G)
    41943202  1911603005             - free -  (912G)

if I go the MBR road, all seems ok, but as soon as I try to write
the boot block (boot0cfg -B /dev/stripe/s1) again the kernel
starts to complain about corrupted GEOM too.

any ideas?
thanks,
danny
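
A possible way to see where the GPT of stripe/s1 physically lands, sketched under the assumption of a conventional two-disk RAID0 layout (stripe units handed out round-robin, starting with ada0) and of gstripe keeping its metadata in the last sector of each disk. The sizes come from the "gstripe list" output above; the function names and the mapping formula are illustrative, not taken from the gstripe source.

```shell
#!/bin/sh
# Sketch of a conventional two-disk RAID0 layout; the figures come from the
# "gstripe list" output above, the mapping formula itself is an assumption.
SECTOR=512          # Sectorsize
STRIPE=65536        # Stripesize
NDISKS=2            # ada0 and ada1

# Usable size of the stripe: each disk gives up its last sector (assumed to
# hold the gstripe metadata) and is rounded down to whole stripe units.
stripe_mediasize() {
    disk_bytes=$1
    usable=$(( disk_bytes - SECTOR ))
    usable=$(( usable - usable % STRIPE ))
    echo $(( usable * NDISKS ))
}

# Map a sector (LBA) of stripe/s1 to "disk sector-on-that-disk".
stripe_map() {
    lba=$1
    per_unit=$(( STRIPE / SECTOR ))          # 128 sectors per stripe unit
    unit=$(( lba / per_unit ))               # which stripe unit the LBA is in
    disk=$(( unit % NDISKS ))                # units alternate between disks
    disk_lba=$(( unit / NDISKS * per_unit + lba % per_unit ))
    echo "ada$disk $disk_lba"
}

stripe_mediasize 500107862016   # compare with Mediasize of stripe/s1
stripe_map 1                    # primary GPT header
stripe_map 33                   # last sector of the primary GPT table
stripe_map 128                  # first sector of the second stripe unit
```

Under these assumptions the first call prints 1000215674880, the Mediasize reported for stripe/s1, and LBAs 1 to 33 of the stripe map to the same sector numbers on ada0. When GEOM looks at ada0 on its own it would therefore find a GPT header describing a 932G device, which seems consistent with the "corrupt or invalid GPT" messages.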



Re: gstripe/gpart problems.

2011-01-05 Thread Daniel Braniss

> On Tue, Jan 04, 2011 at 04:21:31PM +0200, Daniel Braniss wrote:
> > Hi,
> > I have 2 ada disks striped:
> > 
> > # gstripe list
> > Geom name: s1
> > State: UP
> > Status: Total=2, Online=2
> > Type: AUTOMATIC
> > Stripesize: 65536
> > ID: 2442772675
> > Providers:
> > 1. Name: stripe/s1
> >    Mediasize: 1000215674880 (932G)
> >    Sectorsize: 512
> >    Stripesize: 65536
> >    Stripeoffset: 0
> >    Mode: r0w0e0
> > Consumers:
> > 1. Name: ada0
> >    Mediasize: 500107862016 (466G)
> >    Sectorsize: 512
> >    Mode: r0w0e0
> >    Number: 0
> > 2. Name: ada1
> >    Mediasize: 500107862016 (466G)
> >    Sectorsize: 512
> >    Mode: r0w0e0
> >    Number: 1
> > 
> > boot complains:
> > 
> > GEOM_STRIPE: Device s1 created (id=2442772675).
> > GEOM_STRIPE: Disk ada0 attached to s1.
> > GEOM: ada0: corrupt or invalid GPT detected.
> > GEOM: ada0: GPT rejected -- may not be recoverable.
> > GEOM_STRIPE: Disk ada1 attached to s1.
> > GEOM_STRIPE: Device s1 activated.
> > 
> > # gpart show
> > =>        34  1953546173  stripe/s1  GPT  (932G)
> >           34         128          1  freebsd-boot  (64K)
> >          162  1953546045             - free -  (932G)
> > # gpart show
> > =>        34  1953546173  stripe/s1  GPT  (932G)
> >           34         128          1  freebsd-boot  (64K)
> >          162  1953546045             - free -  (932G)
> > 
> > # gpart add -t freebsd-ufs -s 20g stripe/s1
> > GEOM: ada0: corrupt or invalid GPT detected.
> > GEOM: ada0: GPT rejected -- may not be recoverable.
> > stripe/s1p2 added
> > # gpart show
> > =>        34  1953546173  stripe/s1  GPT  (932G)
> >           34         128          1  freebsd-boot  (64K)
> >          162    41943040          2  freebsd-ufs  (20G)
> >     41943202  1911603005             - free -  (912G)
> > 
> > if I go the MBR road, all seems ok, but as soon as I try to write
> > the boot block (boot0cfg -B /dev/stripe/s1) again the kernel
> > starts to complain about corrupted GEOM too.
> 
> So are you trying to partition the drives and then stripe the
> partitions within the drives, or are you trying to partition the
> stripe?
> 
> It seems here as though you might be trying to first partition the
> drives (not clear on that) then stripe the whole drives - which will
> mean the partition info is wrong for the resulting striped drive set -
> and then repartition the striped drive set, and neither is ending up
> valid.
> 
> If what you are intending is to partition after striping the raw
> drives, then you are doing the right steps, but when the geom layer
> tries to look at the info on the individual drives as at boot, it will
> find it invalid.  If it the gpart layer is actually refusing to write
> partition info to the drives which is wrong for the drives taken
> individually, that would account for your problems.
> 
> One valid order to do things in would be partition the drives with
> gpart, creating identical sets of partitions on both drives, then
> stripe the partitions created within them (syntax not exact):
>  
> gpart add -t freebsd-ufs0 -s 10g ada0
> gpart add -t freebsd-ufs1 -s 10g ada1
> gstripe label freebsd-ufs freebsd-ufs0 freebsd-ufs1
> 
> That would give you a 20GB stripe, with valid partition info on each
> drive.
> 
> If this will be your boot drive, depending on how much needs to be read
> from the drive before the geom_stripe kernel module gets loaded, I
> would think there could also be a problem booting from the drive.  This
> is not like gmirroring two drives or partitions, where the info read
> from either disk early in boot will be identical, and identical (except
> for the last block of the partition) to what the OS sees later after
> the mirror is formed.
> 
> I assume you're bearing in mind that if you lose either drive to a
> hardware fault you lose the whole thing, and consider the risk worth
> the potential speed/size gain.
>   -- Clifton 
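
Since the commands quoted above are marked "syntax not exact", here is one way the same idea (partition each disk first, then stripe the partitions) might be spelled out. This is only a sketch: the stripe label "data", the 10g size and the assumption that the new partition on each disk becomes p1 are illustrative, and the function prints the commands rather than running them, because they destroy whatever is on the disks.

```shell
#!/bin/sh
# Print (not run) one possible command sequence for striping partitions
# rather than whole disks. Review before running: these commands destroy data.
# The label, the size and the p1 partition index are assumptions.
stripe_partitions_plan() {
    label=$1
    size=$2
    shift 2
    members=""
    for disk in "$@"; do
        echo "gpart create -s gpt $disk"
        echo "gpart add -t freebsd-ufs -s $size $disk"
        # On a freshly created GPT the first partition is assumed to be p1.
        members="$members ${disk}p1"
    done
    # Stripe the partitions, not the raw disks.
    echo "gstripe label -v $label$members"
    echo "newfs /dev/stripe/$label"
}

stripe_partitions_plan data 10g ada0 ada1
```

With this order each disk carries its own valid GPT, so the partition tables read from ada0 and ada1 individually describe those disks and not the stripe.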

Hi Clifton,
I was getting very frustrated yesterday, hence the cryptic message; your
response requires some background :-)
the box is a Sun Fire X2200, which has bays for 2 disks (we have several
of these).
before the latest upgrade, the 2 disks were 'raided' via 'nVidia MediaShield'
and appeared as ar0; when I upgraded to 8.2 it disappeared, since I had
ATA_CAM in the kernel config file. So I started fiddling with gstripe,
which 'recovered' the data.
Next, since the kernel boot kept complaining about GEOM errors, (and not 
w

Re: Serial console not working in 7.2-p4 and 7.2-STABLE

2009-11-09 Thread Daniel Braniss

> All,
> 
> I'm pulling my hair out on this one! Can't get the serial console to
> work with nanoBSD, either 7.2-p4 or 7-STABLE. A 8.0 nanoBSD image
> works fine (which I have not created myself). The symptom is that all
> kernel output goes to VGA. Whatever I do. This happens in VMware
> Player (where I actually see the VGA output) and on my ALIX
> (Soekris-like) board (which does not have a VGA card).
> 
> boot0 is boot0sio, boot.config contains -h and the loader works fine
> over the serial port. console=comconsole there so that should work,
> right? No, because still my kernel outputs everything to VGA...
> 
> I'm using the sio device. Even tried putting flags on 0x30 -> no
> difference at all. Tried the uart device and removing sio from my
> kernel but that resulted in having NO serial ports at all...
> 
> Any help is much appreciated!
> 
> Sven
hi,
put hint.uart.0.flags="0x10" in /boot/device.hints, or better, make sure
you have an updated one from /sys/i386/conf/GENERIC.hints.
another thing, make sure the speed (baud rate) is correct, else you
probably won't see any output either.
in /boot/loader.conf you need

console="comconsole,vidconsole"
and
comconsole_speed="115200"
to set the speed.
to get a login you will need, in /etc/ttys:
ttyu0   "/usr/libexec/getty 3wire.115200"   dialup  on secure

hope this helps
danny



can't boot table-8 on HP Proliant DL580 G5

2009-11-12 Thread Daniel Braniss

Hi,
the boot stops somewhere after probing ata0, so far
playing with the BIOS (disabling stuff) does not help.
BTW, linux boots ok (except it has problems with IPMI)
So, any success stories there?

danny



PCengines ALIX boot0sio serial input failes

2009-12-09 Thread Daniel Braniss

hi,
FreeBSD-8 works great on these boards, but there are some
gotchas with the boot and the serial console: output works fine, but
input is 'problematic'. the pxeboot serial handling is ok, the boot menu
is ok, but when booting off the CF (using boot0sio) the input is 'screwy':
at the selection of partition it is ignored, and at the OK: prompt
from the boot (i had no kernel in the slice) the input is usually
doubled:
sshooww instead of show
which is probably similar to what is happening with boot0sio, but it
only echoes # (the current bell).

Once the kernel is up, the serial works fine.

any ideas?

thanks,
danny



Re: PCengines ALIX boot0sio serial input failes

2009-12-09 Thread Daniel Braniss

> On 12/9/2009 11:13 AM, Daniel Braniss wrote:
> > hi,
> > FreeBSD-8 works great on these boards, but there are some
> > gotchas, the boot and the serial: output works fine, but input
> > is 'problematic'. the pxeboot serial handling is ok, the boot menu
> > is ok, but booting off the CF (using boot0sio), the input 'screwy'
> > at the selection of partition it is ignored, at the OK: prompt
> > from the boot (i had no kernel in the slice), the input is usually
> > doubled:
> > sshooww instead of show
> > which is probably similar to what is happening with boot0sio but it
> > only echoes # (the current bell).
> > 
> > Once the kernel is up, the serial works fine.
> 
> The development version of pfSense (2.0) is running on FreeBSD 8.0 using
> NanoBSD and its serial input/output works pretty well on ALIX, the
> 2d3.2d13 version at least (and others, but those are the only two I have
> used personally).
> 
> My test ALIX is at home unplugged at the moment, but based on what I see
> in the image file there are a few things that were done:
> 
> /boot/device.hints contains:
> hint.uart.0.at="isa"
> hint.uart.0.port="0x3F8"
> hint.uart.0.flags="0x10"
> hint.uart.0.irq="4"
> 
> /boot.config contains:
>  -h
> 
> The initial boot0cfg on an image is done with:
> boot0cfg -B -b /path/to/boot/boot0sio -o packet -s 1 -m 3 
> 
> Here is what shows up when I mount an md device from a CF image:
> # boot0cfg -v /dev/md0
> #   flag start chs   type   end chs   offset size
> 1   0x80  0:  1: 1   0xa5444: 15:63   63   448497
> 2   0x00445:  1: 1   0xa5889: 15:63   448623   448497
> 3   0x00890:  0: 1   0xa5991: 15:63   897120   102816
> 
> version=2.0  drive=0x80  mask=0x3  ticks=182  bell=# (0x23)
> options=packet,update,nosetdrv
> volume serial ID 9090-9090
> default_selection=F1 (Slice 1)
> 
> Seems to work pretty well there. If you want the details, you can check
> out the pfSense tools git repository which contains the build scripts
> that generate the images.

I have the same /boot/device.hints.
can you confirm that
1) when booting from CF, the boot0sio accepts input
2) the /boot/boot accepts input from the serial?
thanks,
danny




Re: PCengines ALIX boot0sio serial input failes

2009-12-09 Thread Daniel Braniss

> On Wednesday 09 December 2009 17:13:57 Daniel Braniss wrote:
> > hi,
> > FreeBSD-8 works great on these boards, but there are some
> > gotchas, the boot and the serial: output works fine, but input
> > is 'problematic'. the pxeboot serial handling is ok, the boot menu
> > is ok, but booting off the CF (using boot0sio), the input 'screwy'
> > at the selection of partition it is ignored, at the OK: prompt
> > from the boot (i had no kernel in the slice), the input is usually
> > doubled:
> > sshooww instead of show
> > which is probably similar to what is happening with boot0sio but it
> > only echoes # (the current bell).
> >
> > Once the kernel is up, the serial works fine.
> >
> > any ideas?
> >
> 
> Which ALIX board exactly? There are some differences (even various BIOSes).
> Any chance you have vga driver in kernel? TinyBIOS emulates VGA a bit, 
> redirects output to serial port. If at the beginning you are trying both VGA 
> and serial port, output is doubled. Similar behavior is observed on older 
> WRAP boards, too.

I have tried ALIX-1 and 2
here is an example:

PC Engines ALIX.3 v0.99h
640 KB Base Memory
261120 KB Extended Memory
Waiting for HDD ...

01F0 Master 848A SanDisk SDCFH2-002G 
Phys C/H/S 3970/16/63 Log C/H/S 992/64/63

1  FreeBSD
2  FreeBSD
3  FreeBSD

6 PXE
Boot:  1 

any key I hit, it echoes as # and is ignored.
at this point the kernel is not yet involved, so having vga+kb support
is not the reason, though I will try out the alix-3, which has vga support, and
a different BIOS soon.

thanks,
danny



Re: PCengines ALIX boot0sio serial input failes

2009-12-10 Thread Daniel Braniss

> On 12/10/2009 2:32 AM, Daniel Braniss wrote:
> >> Which ALIX board exactly? There are some differences (even various BIOSes).
> >> Any chance you have vga driver in kernel? TinyBIOS emulates VGA a bit, 
> >> redirects output to serial port. If at the beginning you are trying both 
> >> VGA 
> >> and serial port, output is doubled. Similar behavior is observed on older 
> >> WRAP boards, too.
> > 
> > I have tried ALIX-1 and 2
> > here is an example:
> > 
> > PC Engines ALIX.3 v0.99h
> > 640 KB Base Memory
> > 261120 KB Extended Memory
> > Waiting for HDD ...
> > 
> > 01F0 Master 848A SanDisk SDCFH2-002G 
> > Phys C/H/S 3970/16/63 Log C/H/S 992/64/63
> > 
> > 1  FreeBSD
> > 2  FreeBSD
> > 3  FreeBSD
> > 
> > 6 PXE
> > Boot:  1 
> > 
> > any key I hit, it echoes as # and is ignored.
> > at this point the kernel is not yet involved, so having vga+kb support
> > is not the reason, though I will try out the alix-3, which has vga support, 
> > and
> > a different BIOS soon.
i have now, and the results are:
 - serial works
 - bios boot skips boot0, and goes straight to boot slice 1.

The good side is that PXE boot works, but switching to boot
from disk is a pain. on other systems, hitting ^C at the dhcp
will stop it, and the boot will continue from disk; if that fails
(forgot some critical setup :-): reboot, fix, boot, ^C ...

> 
> A lot of users have seen that happen, but typically it has been cleared
> up by using ALIX BIOS v0.99h, which that box already appears to have,
> and setting the BIOS for CHS mode.
> 
> I haven't tried any of the ALIX models with VGA, but I have heard they
> are working as long as you set the BIOS for APM power management.

it's actually setting Power Management to anything but ACPI.

>  (See
> the previous -STABLE thread titled "8.0-rc2 dropped hardsupport".
> 
> Jim

cheers,
danny




Re: iSCSI initiator and Dell PowerVault MD3000i

2009-12-15 Thread Daniel Braniss

> Hi all,
> I am playing with iscsi_initiator on FreeBSD 7-STABLE and Dell 
> PowerVault MD3000i. This is the first time I am testing iSCSI...
> 
> Does anyone have FreeBSD's iSCSI initiator in production / heavy load? 
> Or does somebody have experiences with Dell MD3000i?
> 
> One thing is "poor performance" ~ 60 - 70MB/s depending on RAID level 
> used. (poor performance compared to plain SATA disk which have 110MB/s - 
> both tested for reading as it is our planned load - multimedia streaming 
> and downloads)
> 
> 
> The other thing is some problem with compatibility of initiator and Dell 
> MD3000i.
> 
> If I setup RAID 5 'Disk Group' consisted of 4x 1TB SATA drives (in 
> MD3000i) and then created for example 2 'Virtual Disks', both are 
> detected by iscontrol and added to /dev/ as da0 and da1, but da1 spams 
> log with messages like this:
> 
> Dec 15 04:00:38 dust kernel: da0 at iscsi0 bus 0 target 0 lun 0
> Dec 15 04:00:38 dust kernel: da0:  Fixed Direct 
> Access SCSI-5 device
> Dec 15 04:00:38 dust kernel: da1 at iscsi0 bus 0 target 0 lun 1
> Dec 15 04:00:38 dust kernel: da1:  Fixed Direct 
> Access SCSI-5 device
> Dec 15 04:00:38 dust iscontrol[48576]: cam_open_btl: no passthrough 
> device found at 0:0:2
> Dec 15 04:00:38 dust iscontrol[48576]: cam_open_btl: no passthrough 
> device found at 0:0:3
> Dec 15 04:00:39 dust kernel: (da1:iscsi0:0:0:1): READ(6)/WRITE(6) not 
> supported, increasing minimum_cmd_size to 10.
> Dec 15 04:00:39 dust kernel: (da1:iscsi0:0:0:1): READ(10). CDB: 28 0 0 0 
> 0 0 0 0 1 0
> Dec 15 04:00:39 dust kernel: (da1:iscsi0:0:0:1): CAM Status: SCSI Status 
> Error
> Dec 15 04:00:39 dust kernel: (da1:iscsi0:0:0:1): SCSI Status: Check 
> Condition
> Dec 15 04:00:39 dust kernel: (da1:iscsi0:0:0:1): ILLEGAL REQUEST asc:94,1
> Dec 15 04:00:39 dust kernel: (da1:iscsi0:0:0:1): Vendor Specific ASC
> Dec 15 04:00:39 dust kernel: (da1:iscsi0:0:0:1): Unretryable error
> Dec 15 04:00:40 dust kernel: (da1:iscsi0:0:0:1): READ(10). CDB: 28 0 c 
> 7f df ff 0 0 1 0
> Dec 15 04:00:40 dust kernel: (da1:iscsi0:0:0:1): CAM Status: SCSI Status 
> Error
> Dec 15 04:00:40 dust kernel: (da1:iscsi0:0:0:1): SCSI Status: Check 
> Condition
> Dec 15 04:00:40 dust kernel: (da1:iscsi0:0:0:1): ILLEGAL REQUEST asc:94,1
> Dec 15 04:00:40 dust kernel: (da1:iscsi0:0:0:1): Vendor Specific ASC
> Dec 15 04:00:40 dust kernel: (da1:iscsi0:0:0:1): Unretryable error
> Dec 15 04:00:41 dust kernel: (da1:iscsi0:0:0:1): READ(10). CDB: 28 0 0 0 
> 0 0 0 0 1 0
> 
> The message repeated many times.
> 
> If I created more 'Virtual Disks' (7 for example), 3 of them are 
> producing same errors (da1, da3, da5)
> 
> If there is only one 'Virtual Disk', it seems fine... until I configured 
> second path to the virtual disk as I want to try gmultipath or geom_fox 
> (MD3000i is dual controller with 4 NICs), then second session produces 
> same errors.
> 
> First path - OK:
> 
> Dec 15 22:47:57 dust kernel: da0 at iscsi0 bus 0 target 0 lun 0
> Dec 15 22:47:57 dust kernel: da0:  Fixed Direct 
> Access SCSI-5 device
> Dec 15 22:47:57 dust iscontrol[52226]: cam_open_btl: no passthrough 
> device found at 0:0:1
> Dec 15 22:47:57 dust iscontrol[52226]: cam_open_btl: no passthrough 
> device found at 0:0:2
> Dec 15 22:47:57 dust iscontrol[52226]: cam_open_btl: no passthrough 
> device found at 0:0:3
> 
> 
> Second path - error:
> 
> Dec 15 22:48:04 dust kernel: da1 at iscsi0 bus 0 target 1 lun 0
> Dec 15 22:48:04 dust kernel: da1:  Fixed Direct 
> Access SCSI-5 device
> Dec 15 22:48:05 dust kernel: (da1:iscsi0:0:1:0): READ(6)/WRITE(6) not 
> supported, increasing minimum_cmd_size to 10.
> Dec 15 22:48:05 dust kernel: (da1:iscsi0:0:1:0): READ(10). CDB: 28 0 0 0 
> 0 0 0 0 1 0
> Dec 15 22:48:05 dust kernel: (da1:iscsi0:0:1:0): CAM Status: SCSI Status 
> Error
> Dec 15 22:48:05 dust kernel: (da1:iscsi0:0:1:0): SCSI Status: Check 
> Condition
> Dec 15 22:48:05 dust kernel: (da1:iscsi0:0:1:0): ILLEGAL REQUEST asc:94,1
> Dec 15 22:48:05 dust kernel: (da1:iscsi0:0:1:0): Vendor Specific ASC
> Dec 15 22:48:05 dust kernel: (da1:iscsi0:0:1:0): Unretryable error
> Dec 15 22:48:05 dust iscontrol[52230]: cam_open_btl: no passthrough 
> device found at 0:1:1
> Dec 15 22:48:05 dust iscontrol[52230]: cam_open_btl: no passthrough 
> device found at 0:1:2
> Dec 15 22:48:05 dust iscontrol[52230]: cam_open_btl: no passthrough 
> device found at 0:1:3
> Dec 15 22:48:06 dust kernel: (da1:iscsi0:0:1:0): READ(16). CDB: 88 0 0 0 
> 0 1 5d 21 1f ff 0 0 0 1 0 0
> Dec 15 22:48:06 dust kernel: (da1:iscsi0:0:1:0): CAM Status: SCSI Status 
> Error
> Dec 15 22:48:06 dust kernel: (da1:iscsi0:0:1:0): SCSI Status: Check 
> Condition
> Dec 15 22:48:06 dust kernel: (da1:iscsi0:0:1:0): ILLEGAL REQUEST asc:94,1
> Dec 15 22:48:06 dust kernel: (da1:iscsi0:0:1:0): Vendor Specific ASC
> Dec 15 22:48:06 dust kernel: (da1:iscsi0:0:1:0): Unretryable error
> Dec 15 22:48:07 dust kernel: (da1:iscsi0:0:1:0): READ(10). CDB: 28 0 0 0 
> 0 0 0 0 1 0
> Dec 15 22:48:07 dust kernel: (da

Re: iSCSI initiator and Dell PowerVault MD3000i

2009-12-17 Thread Daniel Braniss

> please Cc: me, I am not subscribed to freebsd-scsi
> 
> Sossi Andrej wrote:
>  >> On 16. 12. 2009 15:57, Miroslav Lachman wrote:
>  >> [...]
>  >> I use MD300i with FreeBSD 7.0 and 7.1 with iscsi-2.2.2. It work fine.
>  >> But be careful to configure MD3000i. MD3000i assign by default first
>  >> disk to preferred controller 0, second disk to preferred controller 1,
>  >> third disk to preferred controller 0, and so on. First, third, fifth...
>  >> disks is usable from FreeBSD, but second, fourth,... disks result 
> unusable.
>  >> Work around: manually assign all disks to controller 0.
>  >
>  > When you say "unusable" do you mean you can't access it at all / it
>  > errors even if it's the only path (drive) used? It would be normal if
>  > you have for example two paths to each drive and can't mount the other
>  > path if one path to the drive is mounted - this is not a usable
>  > combination. You can use geom_multipath to get multipath failover.
> 
> I got errors even in unmounted state.
> I tried iscsi-2.2.3 and got same errors. I tried second path first 
> (device da0) and it produces same errors, then I run iscontrol for the 
> first path (device da1) and everything is fine.
> 
>    path throught second controller: ERROR 
> # diskinfo -t /dev/da0
> /dev/da0
>  512 # sectorsize
>  2998998663168   # mediasize in bytes (2.7T)
>  5857419264  # mediasize in sectors
>  364607  # Cylinders according to firmware.
>  255 # Heads according to firmware.
>  63  # Sectors according to firmware.
> 
> Seek times:
>  Full stroke:diskinfo: read error or disk too small for 
> test.: Invalid argument
> 
> 
>    path throught first controller: OK 
> # diskinfo -t /dev/da1
> /dev/da1
>  512 # sectorsize
>  2998998663168   # mediasize in bytes (2.7T)
>  5857419264  # mediasize in sectors
>  364607  # Cylinders according to firmware.
>  255 # Heads according to firmware.
>  63  # Sectors according to firmware.
> 
> Seek times:
>  Full stroke:  250 iter in   2.483517 sec =9.934 msec
>  Half stroke:  250 iter in   2.575778 sec =   10.303 msec
>  Quarter stroke:   500 iter in   2.926170 sec =5.852 msec
>  Short forward:400 iter in   0.916901 sec =2.292 msec
>  Short backward:   400 iter in   2.181790 sec =5.454 msec
>  Seq outer:   2048 iter in   0.520920 sec =0.254 msec
>  Seq inner:   2048 iter in   0.545300 sec =0.266 msec
> Transfer rates:
>  outside:   102400 kbytes in   1.414997 sec =72368 kbytes/sec
>  middle:102400 kbytes in   1.45 sec =70405 kbytes/sec
>  inside:102400 kbytes in   1.422527 sec =71985 kbytes/sec
> 
the numbers seem ok to me, considering that the net is 1Gb.
can you configure the target virtual disk to have luns?
in any case the errors seem to be in the md3000i, can you see/check its
error log?
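
For reference, the two CDBs the target rejects in the kernel log quoted
earlier in this thread can be decoded by hand. A minimal sketch, assuming the
standard READ(10) and READ(16) layouts (big-endian LBA and transfer length);
the helper name decode_read_cdb is made up for illustration:

```python
def decode_read_cdb(cdb_text):
    """Decode a READ(10) or READ(16) CDB given as space-separated hex bytes.

    Returns (name, lba, blocks). Only the two opcodes seen in the log are
    handled; anything else raises ValueError.
    """
    cdb = [int(tok, 16) for tok in cdb_text.split()]
    opcode = cdb[0]
    if opcode == 0x28 and len(cdb) == 10:
        # READ(10): bytes 2-5 hold the LBA, bytes 7-8 the transfer length.
        lba = int.from_bytes(bytes(cdb[2:6]), "big")
        blocks = int.from_bytes(bytes(cdb[7:9]), "big")
        return ("READ(10)", lba, blocks)
    if opcode == 0x88 and len(cdb) == 16:
        # READ(16): bytes 2-9 hold the LBA, bytes 10-13 the transfer length.
        lba = int.from_bytes(bytes(cdb[2:10]), "big")
        blocks = int.from_bytes(bytes(cdb[10:14]), "big")
        return ("READ(16)", lba, blocks)
    raise ValueError("unsupported CDB: %s" % cdb_text)


if __name__ == "__main__":
    # CDB strings copied from the kernel log (wrapped lines joined).
    for text in ("28 0 0 0 0 0 0 0 1 0",
                 "88 0 0 0 0 1 5d 21 1f ff 0 0 0 1 0 0"):
        print(decode_read_cdb(text))
```

The READ(10) is a one-block read at LBA 0, and the READ(16) is a one-block
read at LBA 5857419263, which is the last sector of the 5857419264-sector
disk that diskinfo reports. So the failing commands are plain single-sector
reads of the first and last sectors, not reads past the end of the volume.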

> 
> Do you have experiences with iSCSI multipath? I read about geom_fox and 
> gmultipath...
i have no experience with it, and personally see no benefit in it (but then
others might disagree :-)

danny




NFS/UDP and vfs.nfs.nfs_ip_paranoia=0 does not help

2010-02-13 Thread Daniel Braniss

Hi,
While trying to find out why our NFS/ZFS servers now hang about once a
week, I got hold of a similar box and got a bit more ambitious: I connected
it via 2 NICs and, to complicate things a bit, the server boots via pxeboot
(i.e., it is dataless). After fiddling with the default gateway and adding -h
to rpcbind and mountd, things seem OK, but UDP is 'problematic'. I could do
with TCP, except that am-utils does an fsinfo via UDP when doing a /net/ and
will hang the client.
Even with vfs.nfs.nfs_ip_paranoia=0, when the response from the server
arrives with the 'wrong' IP, an ICMP destination unreachable (port unreachable)
is sent in reply.
in short, on the client:
this works: mount_nfs -o mntudp server-ip-vlanA:/mnt /mnt
this fails: mount_nfs -o mntudp server-ip-vlanB:/mnt /mnt
since the response is coming from server-ip-vlanA.

Q: why does this work for TCP and fail for UDP?
Q: is there a workaround?

danny



Re: RELENG_8 -- NFSv3 credentials/permissions issue

2010-02-20 Thread Daniel Braniss

> I'm willing to bet this is something simple I've overlooked, but I'm out
> of ideas.  Client is 8.0-RELEASE i386, server is 8.0-STABLE amd64
> (kernel/world 2010/01/16).  NFS version used is v3.  Server filesystem
> is UFS2.
at boot time, the NFS is V2! If the server is FreeBSD, it can be upgraded
to V3 later in the boot process.
> 
> Client configuration is off-kilter: it's a PXE booted machine.  Initial
> PXE booting uses TFTP, then switches to NFS to load the kernel and
> kernel modules.  The TFTP part works, with a caveat[1], but the NFS
> portion fails.
TFTP is as old as the Internet, so it mostly works, and security was in diapers,
so the T for trivial also means insecure :-)
> 
> With NFS, I'm forced to change permissions on all the exported
> files/directories to be 0644/0755 (specifically, setting other/global
> read/write access) otherwise the client gets back "Permission denied".
> The nfsd(8) man page implies that this shouldn't be necessary; adding
> -mapall=nobody:nobody or -maproot=nobody doesn't fix things either.
> 
why not use -maproot=root?
by adding -ro, the client will be able to read but not modify.
That's what we do here, the /etc is mounted via unionfs to a md, but
that is yet another solution.

>   In the absence of -maproot and -mapall options, remote accesses by root
>   will result in using a credential of -2:-2.  All other users will be
>   mapped to their remote credential.  If a -maproot option is given, remote
>   access by root will be mapped to that credential instead of -2:-2.  If a
>   -mapall option is given, all users (including root) will be mapped to
>   that credential in place of their own.  
> 
> Configuration data, tcpdump validation (client=192.168.1.140,
> server=192.168.1.51), and syslog data is below.
> 
> Ideas?
> 
> [1]: TFTP works as long as the file its trying to request (in this case
> /usr/local/freebsd8/boot/pxeboot) has its other/global read bit set,
> otherwise EACCESS is returned; I had to look in the tftpd source to
> figure this out.  I'm not sure what the justification is there, given
> that use of -s and/or -u switches credentials to user/group nobody...
> 
only root can read a file with mode 0, so you need to set the read bit for
any non root user.

> -- 
> | Jeremy Chadwick   j...@parodius.com |
> | Parodius Networking   http://www.parodius.com/ |
> | UNIX Systems Administrator  Mountain View, CA, USA |
> | Making life hard for others since 1977.  PGP: 4BD6C0CB |
> 
> 
> Relevant server configuration bits:
> 
> /etc/rc.conf
> ==
> rpcbind_enable="yes"
> rpcbind_flags="-l"
> mountd_enable="yes"
> mountd_flags="-r -l"
> nfs_server_enable="yes"
> 
> /etc/exports
> ==
> /usr/local/freebsd8   -network 192.168.1 -mask 255.255.255.0
> 
> Permissions
> =
> drwxr-xr-x  22 root    wheel       512 Feb  6 12:25 /
> drwxr-xr-x  17 root    wheel       512 Feb 12 03:38 /usr
> drwxr-xr-x  15 root    wheel       512 Feb 19 10:41 /usr/local
> drwx------   5 nobody  nobody      512 Feb 19 10:42 /usr/local/freebsd8
> drwx------   7 nobody  nobody     1024 Nov 21 08:11 /usr/local/freebsd8/boot
> drwx------   2 nobody  nobody    12800 Nov 21 08:11 /usr/local/freebsd8/boot/kernel
> -r--------   1 nobody  nobody 11492703 Nov 21 07:48 /usr/local/freebsd8/boot/kernel/kernel
> 
> tcpdump
> =
> {...snipping TFTP portion...}
> 10:57:20.601313 IP 192.168.1.140.68 > 255.255.255.255.67: BOOTP/DHCP, Request 
> from 00:30:48:71:60:6b, length 548
> 10:57:20.601442 IP 192.168.1.51.67 > 192.168.1.140.68: BOOTP/DHCP, Reply, 
> length 323
> 10:57:20.601688 IP 192.168.1.140.68 > 255.255.255.255.67: BOOTP/DHCP, Request 
> from 00:30:48:71:60:6b, length 548
> 10:57:20.601782 IP 192.168.1.51.67 > 192.168.1.140.68: BOOTP/DHCP, Reply, 
> length 323
> 10:57:20.613056 IP 192.168.1.140.1023 > 192.168.1.51.111: UDP, length 76
> 10:57:20.613369 IP 192.168.1.51.111 > 192.168.1.140.1023: UDP, length 28
> 10:57:20.613556 IP 192.168.1.140.1023 > 192.168.1.51.947: UDP, length 84
> 10:57:20.613921 IP 192.168.1.51.947 > 192.168.1.140.1023: UDP, length 60
> 10:57:20.614055 IP 192.168.1.140.1023 > 192.168.1.51.111: UDP, length 76
> 10:57:20.614291 IP 192.168.1.51.111 > 192.168.1.140.1023: UDP, length 28
> 10:57:20.614432 IP 192.168.1.140.4 > 192.168.1.51.2049: 100 lookup fh 
> 1197,150310/6618112 "boot"
> 10:57:20.614458 IP 192.168.1.51.2049 > 192.168.1.140.4: reply ok 28 lookup 
> ERROR: Permission denied
> 10:57:20.615436 IP 192.168.1.140.1022 > 192.168.1.51.947: UDP, length 84
> 10:57:20.615677 IP 192.168.1.51.947 > 192.168.1.140.1022: UDP, length 60
> 10:57:20.615806 IP 192.168.1.140.6 > 192.168.1.51.2049: 100 lookup fh 
> 1197,150310/6618112 "boot"
> 10:57:20.615824 IP 192.168.1.51.2049 > 192.168.1.140.6: reply ok 28 lookup 
> ERROR: Permission denied
> 10:57:20.615929 IP 192.168.1.140.1021 > 192.168.1.51.947: UDP, length 84
> 10:57:20.616164 IP 192.168.1.51.9

Re: RELENG_8 -- NFSv3 credentials/permissions issue

2010-02-21 Thread Daniel Braniss

> On Sun, Feb 21, 2010 at 09:25:45AM +0200, Daniel Braniss wrote:
> > > I'm willing to bet this is something simple I've overlooked, but I'm out
> > > of ideas.  Client is 8.0-RELEASE i386, server is 8.0-STABLE amd64
> > > (kernel/world 2010/01/16).  NFS version used is v3.  Server filesystem
> > > is UFS2.
> > at boot time, the NFS is V2!, if the server is FreeBSD it can be upgraded
> > later in the boot progress to V3
> > > 
> > > Client configuration is off-kilter: it's a PXE booted machine.  Initial
> > > PXE booting uses TFTP, then switches to NFS to load the kernel and
> > > kernel modules.  The TFTP part works, with a caveat[1], but the NFS
> > > portion fails.
> > TFTP is as old as the Internet, so it mostly works, and security was in 
> > dipers,
> > so the T for trivial also means un-secure :-)
> > > 
> > > With NFS, I'm forced to change permissions on all the exported
> > > files/directories to be 0644/0755 (specifically, setting other/global
> > > read/write access) otherwise the client gets back "Permission denied".
> > > The nfsd(8) man page implies that this shouldn't be necessary; adding
> > > -mapall=nobody:nobody or -maproot=nobody doesn't fix things either.
> > > 
> > why not use -maproot=root?
> > by adding -ro, the client will be able to read but not modify.
> > That's what we do here, the /etc is mounted via unionfs to a md, but
> > that is yet another solution.
> 
> I'll have to try that (shouldn't take me long), but I remember messing
> with -maproot and -mapall both and wasn't able to get anywhere.  I'll
> try again and report back.
> 
> > > Configuration data, tcpdump validation (client=192.168.1.140,
> > > server=192.168.1.51), and syslog data is below.
> > > 
> > > Ideas?
> > > 
> > > [1]: TFTP works as long as the file its trying to request (in this case
> > > /usr/local/freebsd8/boot/pxeboot) has its other/global read bit set,
> > > otherwise EACCESS is returned; I had to look in the tftpd source to
> > > figure this out.  I'm not sure what the justification is there, given
> > > that use of -s and/or -u switches credentials to user/group nobody...
> > > 
> > only root can read a file with mode 0, so you need to set the read bit for
> > any non root user.
> 
> I'm not sure if you're referring to NFS here, or my TFTP comment.  My
> TFTP comment should be discussed elsewhere -- it's broken/odd behaviour,
> but the workaround for TFTP (to set the file permissions to 0644 for
> read) I'm fine with -- it's TFTP!  :-)
> 
if the owner does not have read permission, it won't be able to read the file,
no matter whether the other read bits are enabled.

% date > 0
% chmod 04 0
% cat 0
cat: 0: Permission denied
% chmod 040 0
% cat 0
cat: 0: Permission denied
% chmod 0400 0
% cat 0
Sun Feb 21 11:47:32 IST 2010
%
this answers the TFTP problem.
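
The transcript above follows from how the classic UNIX check picks exactly
one permission class. A small model of that rule, as a sketch only: it
ignores ACLs and any other access-control layer, and the uid/gid numbers are
hypothetical.

```python
def may_read(mode, file_uid, file_gid, uid, gids):
    """Model of the traditional read check: only one class is consulted."""
    if uid == 0:
        return True                    # root may read regardless of the bits
    if uid == file_uid:
        return bool(mode & 0o400)      # owner class, even if group/other allow
    if file_gid in gids:
        return bool(mode & 0o040)      # group class
    return bool(mode & 0o004)          # everybody else


if __name__ == "__main__":
    # Hypothetical ids: file owned by uid 1001 / gid 1001, read by its owner.
    for mode in (0o004, 0o040, 0o400):
        print(oct(mode), may_read(mode, 1001, 1001, uid=1001, gids={1001}))
```

For the owner, only mode 0400 grants read access, which matches the three
chmod/cat steps. A different user outside the group would be allowed by 04.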

> With regards to NFS: none of the files below are mode 0000.  The request
> made via NFS should have gotten "translated" to being done by
> nobody:nobody on the NFS server, since there's no -mapall or -maproot
> line in the exports; user nobody has read access to everything shown
> below, so "Permission denied" makes no sense.
> 
as I mentioned before/above, maybe not so clearly, the initial NFS transactions
are done via NFS/V2 - which is problematic/broken[1], and so probably
the access permissions are not exactly what we expect.

[1]: rm /any-file in a read-only exported fs will hang the client

> > > Permissions
> > > =
> > > drwxr-xr-x  22 root    wheel       512 Feb  6 12:25 /
> > > drwxr-xr-x  17 root    wheel       512 Feb 12 03:38 /usr
> > > drwxr-xr-x  15 root    wheel       512 Feb 19 10:41 /usr/local
> > > drwx------   5 nobody  nobody      512 Feb 19 10:42 /usr/local/freebsd8
> > > drwx------   7 nobody  nobody     1024 Nov 21 08:11 /usr/local/freebsd8/boot
> > > drwx------   2 nobody  nobody    12800 Nov 21 08:11 /usr/local/freebsd8/boot/kernel
> > > -r--------   1 nobody  nobody 11492703 Nov 21 07:48 /usr/local/freebsd8/boot/kernel/kernel
> > > 
> > > tcpdump
> > > =
> > > {...snipping TFTP portion...}
> > > 10:57:20.601313 IP 192.168.1.140.68 > 255.255.255.255.67: BOOTP/DHCP, 
> > > Request from 00:30:48:71:60:6b, length 548
> > > 10:57:20.601442 IP 192.168.1.51.67 > 192.168.1.140.68: BOOTP/DHCP, Reply, 
> > > length 323
>

Re: RELENG_8 -- NFSv3 credentials/permissions issue

2010-02-21 Thread Daniel Braniss

> On Sun, Feb 21, 2010 at 12:02:28PM +0200, Daniel Braniss wrote:
> > > I'm not sure if you're referring to NFS here, or my TFTP comment.  My
> > > TFTP comment should be discussed elsewhere -- it's broken/odd behaviour,
> > > but the workaround for TFTP (to set the file permissions to 0644 for
> > > read) I'm fine with -- it's TFTP!  :-)
> > > 
> > if the owner does not have read permition, it wont be able to read the file,
> > no matter that the other read bits are enabled.
> > 
> > % date > 0
> > % chmod 04 0
> > % cat 0
> > cat: 0: Permission denied
> > % chmod 040 0
> > % cat 0
> > cat: 0: Permission denied
> > % chmod 0400 0
> > % cat 0
> > Sun Feb 21 11:47:32 IST 2010
> > %
> > this answers the TFTP problem.
> 
> Actually it doesn't.  Are you familiar with C?  If so, have a look at
> this piece of the source code (src/libexec/tftpd/tftpd.c):
> 
> int
> validate_access(char **filep, int mode)
> {
>     ...
>     if (stat(filename, &stbuf) < 0)
>         return (errno == ENOENT ? ENOTFOUND : EACCESS);
>     if ((stbuf.st_mode & S_IFMT) != S_IFREG)
>         return (ENOTFOUND);
>     if (mode == RRQ) {
>         if ((stbuf.st_mode & S_IROTH) == 0)
>             return (EACCESS);
>     } else {
>         if ((stbuf.st_mode & S_IWOTH) == 0)
>             return (EACCESS);
>     }
>     ...
>     return (0);
> }
> 
> This function is called whenever there's a request of any sort via TFTP
> (such as file retrieval (read) or file storage (write)).  In this
> context, RRQ = "read request".
> 
> The above code explicitly requires the global/other read (or write, if
> the request is to write data) bit be set on the files being
> requested/written to, otherwise EACCESS ("Access Denied") is returned to
> the client.  This is *regardless* of who owns the file.  See the stat(2)
> man page for verification of S_IROTH and S_IWOTH bits.
> 
> This is justified *unless* UID switching is present -- meaning, if the
> -s option (and/or -u) is used.  If -s is used but no -u is specified,
> the daemon switches to user "nobody" by default.  But regardless of what
> user the daemon switches to, its code still explicitly requires the
> global read or global write bits be set on the files.
> 
> IMHO, the above permissions checks should be removed if -s is in effect.
> 
the code is only useful if running as root (and questionable too).
I agree, the code is useless, it should use access(2), but tftpd predates it :-(
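
To make the difference concrete, here is a rough Python sketch (not the
tftpd source) that mirrors only the mode-bit test from the excerpt quoted
above and puts it next to an access(2)-style query. The function names are
made up; os.access() is Python's wrapper around access(2) and, like it,
checks against the real uid/gid of the calling process.

```python
import os
import stat
import tempfile


def tftpd_style_check(st_mode, is_read):
    """Mirror of the quoted mode test: only the 'other' bits are consulted.

    Returns None when the request would pass, else the TFTP error name.
    """
    if not stat.S_ISREG(st_mode):
        return "ENOTFOUND"
    wanted = stat.S_IROTH if is_read else stat.S_IWOTH
    if (st_mode & wanted) == 0:
        return "EACCESS"
    return None


def access_style_check(path, is_read):
    """Ask the kernel whether the current real uid/gid may do this."""
    return os.access(path, os.R_OK if is_read else os.W_OK)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        os.chmod(tmp.name, 0o400)      # readable by its owner only
        mode = os.stat(tmp.name).st_mode
        print("mode-bit test:", tftpd_style_check(mode, is_read=True))
        print("access() test:", access_style_check(tmp.name, is_read=True))
```

For a 0400 file owned by the user running it, the mode-bit test refuses the
read while the access() query allows it, which is the mismatch being
discussed here.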

> > > With regards to NFS: none of the files below are mode 0000.  The request
> > > made via NFS should have gotten "translated" to being done by
> > > nobody:nobody on the NFS server, since there's no -mapall or -maproot
> > > line in the exports; user nobody has read access to everything shown
> > > below, so "Permission denied" makes no sense.
> > > 
> > as I mentioned before/above, maybe not so clearly, the initial NFS 
> > transactions
> > are done via NFS/V2 - which is problematic/broken[1], and so probably
> > the access permitions are not exactly what we expect.
> > 
> > [1]: rm /any-file in a read-only exported fs will hang the client
> > 
> > > > > Permissions
> > > > > =
> > > > > drwxr-xr-x  22 root    wheel       512 Feb  6 12:25 /
> > > > > drwxr-xr-x  17 root    wheel       512 Feb 12 03:38 /usr
> > > > > drwxr-xr-x  15 root    wheel       512 Feb 19 10:41 /usr/local
> > > > > drwx------   5 nobody  nobody      512 Feb 19 10:42 /usr/local/freebsd8
> > > > > drwx------   7 nobody  nobody     1024 Nov 21 08:11 /usr/local/freebsd8/boot
> > > > > drwx------   2 nobody  nobody    12800 Nov 21 08:11 /usr/local/freebsd8/boot/kernel
> > > > > -r--------   1 nobody  nobody 11492703 Nov 21 07:48 /usr/local/freebsd8/boot/kernel/kernel
> 
> Okay, so then you're saying it's a bug of some sort in NFSv2, not NFSv3.
>
yes

> But the above (and below, see tcpdump) files are not attempting to be
> removed nor written to -- they're attempting to be read.

I mentioned the rm bug to show that there is at least a well known problem, and
your problem seems to point to yet another one.

> Should I file a PR for this problem?  IMHO, it's a pretty serious
> oversight (it effectively means user/group ownership means jack squat
> with NFSv2).

well, V2 is quite dead, and I doubt anyone is willing to look into it.
what would be nice is if pxeboot were upgraded to use NFS/V3 - before it becomes
obsolete too :-)

danny




ahcich3: Timeout on slot 0 ...

2010-02-23 Thread Daniel Braniss

hi,
with latest 8-stable, I can't boot since it's stuck with:
...
ahci0:  port 0xb880-0xb887,0xb800-0xb803,0xb480-0xb487,0xb400-0xb403,0xb080-0xb09f mem 0xfe7fa800-0xfe7fafff irq 22 at device 31.2 on pci0
ahci0: [ITHREAD]
ahci0: AHCI v1.20 with 4 3Gbps ports, Port Multiplier supported
ahcich0:  at channel 0 on ahci0
ahcich0: [ITHREAD]
ahcich1:  at channel 1 on ahci0
ahcich1: [ITHREAD]
ahcich2:  at channel 4 on ahci0
ahcich2: [ITHREAD]
ahcich3:  at channel 5 on ahci0
ahcich3: [ITHREAD]
...

umass0:4:0:-1: Attached to scbus4
(probe0:umass-sim0:0:0:0): TEST UNIT READY. CDB: 0 0 0 0 0 0 
(probe0:umass-sim0:0:0:0): CAM status: SCSI Status Error
(probe0:umass-sim0:0:0:0): SCSI status: Check Condition
(probe0:umass-sim0:0:0:0): SCSI sense: NOT READY asc:3a,0 (Medium not present)
(probe0:umass-sim0:0:0:1): TEST UNIT READY. CDB: 0 20 0 0 0 0 
(probe0:umass-sim0:0:0:1): CAM status: SCSI Status Error
(probe0:umass-sim0:0:0:1): SCSI status: Check Condition
(probe0:umass-sim0:0:0:1): SCSI sense: NOT READY asc:3a,0 (Medium not present)
(probe0:umass-sim0:0:0:2): TEST UNIT READY. CDB: 0 40 0 0 0 0 
(probe0:umass-sim0:0:0:2): CAM status: SCSI Status Error
(probe0:umass-sim0:0:0:2): SCSI status: Check Condition
(probe0:umass-sim0:0:0:2): SCSI sense: NOT READY asc:3a,0 (Medium not present)
(probe0:umass-sim0:0:0:3): TEST UNIT READY. CDB: 0 60 0 0 0 0 
(probe0:umass-sim0:0:0:3): CAM status: SCSI Status Error
(probe0:umass-sim0:0:0:3): SCSI status: Check Condition
(probe0:umass-sim0:0:0:3): SCSI sense: NOT READY asc:3a,0 (Medium not present)
ahcich3: Timeout on slot 0
ahcich3: is  cs  ss  rs 0001 tfd 50 serr 
run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
ahcich3: Timeout on slot 0
ahcich3: is  cs  ss  rs 0001 tfd 50 serr 
ahcich3: Timeout on slot 0
ahcich3: is  cs  ss  rs 0001 tfd 50 serr 
run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
...

with a slightly older kernel all is ok.
...
Trying to mount root from nfs:
(probe0:umass-sim0:0:0:0): TEST UNIT READY. CDB: 0 0 0 0 0 0 
(probe0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(probe0:umass-sim0:0:0:0): SCSI Status: Check Condition
(probe0:umass-sim0:0:0:0): NOT READY asc:3a,0
(probe0:umass-sim0:0:0:0): Medium not present
(probe0:umass-sim0:0:0:0): Unretryable error
da0 at umass-sim0 bus 0 scbus4 target 0 lun 0
da0:  Removable Direct Access SCSI-0 device 
da0: 40.000MB/s transfers
da0: Attempt to query device size failed: NOT READY, Medium not present

the only weird thing I see is the
Trying to mount root from nfs:
which does not happen in the failing kernel.

danny



Re: ahcich3: Timeout on slot 0 ...

2010-02-23 Thread Daniel Braniss

> Daniel Braniss wrote:
> > with latest 8-stable, I can't boot since it's stuck with:
> 
> > the only wierd thing I see, is the
> > Trying to mount root from nfs:
> > which does not happen in the failing kernel.
> 
> Could you show full verbose boot messages?

here it comes ...

GDB: no debug ports present
KDB: debugger backends: ddb
KDB: current backend: ddb
SMAP type=01 base= len=0009ec00
SMAP type=02 base=0009ec00 len=1400
SMAP type=02 base=000e4000 len=0001c000
SMAP type=01 base=0010 len=7f58
SMAP type=03 base=7f68 len=e000
SMAP type=04 base=7f68e000 len=00052000
SMAP type=02 base=7f6e len=0002
SMAP type=02 base=fee0 len=1000
SMAP type=02 base=fff0 len=0010
Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-STABLE #0 r1589: Tue Feb 23 09:10:52 IST 2010
da...@sunfire:/r+d/obj/sunfire/r+d/stable/8/sys/HUJI amd64
Preloaded elf kernel "/boot/kernel/kernel" at 0x80e8f000.
Preloaded elf obj module "/boot/kernel/ahci.ko" at 0x80e8f1c0.
Timecounter "i8254" frequency 1193182 Hz quality 0
Calibrating TSC clock ... TSC clock: 258570 Hz
CPU: Intel(R) Core(TM)2 Duo CPU E6850  @ 3.00GHz (2999.96-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x6fb  Stepping = 11
  
Features=0xbfebfbff
  Features2=0xe3fd
  AMD Features=0x20100800
  AMD Features2=0x1
  TSC: P-state invariant
real memory  = 2147483648 (2048 MB)
Physical memory chunk(s):
0x1000 - 0x0009afff, 630784 bytes (154 pages)
0x00ebd000 - 0x7ba66fff, 2059051008 bytes (502698 pages)
avail memory = 2046427136 (1951 MB)
ACPI APIC Table: 
INTR: Adding local APIC 1 as a target
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
APIC: CPU 0 has ACPI ID 1
APIC: CPU 1 has ACPI ID 2
ULE: setup cpu 0
ULE: setup cpu 1
ACPI: RSDP 0xfb790 00014 (v0 ACPIAM)
ACPI: RSDT 0x7f68 00040 (v1 _ASUS_ Notebook 1829 MSFT 0097)
ACPI: FACP 0x7f680200 00084 (v2 A_M_I_ OEMFACP  1829 MSFT 0097)
ACPI: DSDT 0x7f6805c0 087ED (v1  A0827 A0827000  INTL 20060113)
ACPI: FACS 0x7f68e000 00040
ACPI: APIC 0x7f680390 0006C (v1 A_M_I_ OEMAPIC  1829 MSFT 0097)
ACPI: MCFG 0x7f680400 0003C (v1 A_M_I_ OEMMCFG  1829 MSFT 0097)
ACPI: SLIC 0x7f680440 00176 (v1 _ASUS_ Notebook 1829 MSFT 0097)
ACPI: OEMB 0x7f68e040 00081 (v1 A_M_I_ AMI_OEM  1829 MSFT 0097)
ACPI: HPET 0x7f688db0 00038 (v1 A_M_I_ OEMHPET  1829 MSFT 0097)
ACPI: GSCI 0x7f68e0d0 02024 (v1 A_M_I_ GMCHSCI  1829 MSFT 0097)
MADT: Found IO APIC ID 2, Interrupt 0 at 0xfec0
ioapic0: Routing external 8259A's -> intpin 0
MADT: Interrupt override: source 0, irq 2
ioapic0: Routing IRQ 0 -> intpin 2
MADT: Interrupt override: source 9, irq 9
ioapic0: intpin 9 trigger: level
ioapic0  irqs 0-23 on motherboard
cpu0 BSP:
 ID: 0x   VER: 0x00050014 LDR: 0x DFR: 0x
  lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
  timer: 0x000100ef therm: 0x0001 err: 0x0001000f pcm: 0x00010400
wlan: <802.11 Link Layer>
kbd: new array size 4
kbd1 at kbdmux0
nfslock: pseudo-device
mem: 
null: 
random: 
io: 
hptrr: RocketRAID 17xx/2xxx SATA controller driver v1.2
acpi0: <_ASUS_ Notebook> on motherboard
PCIe: Memory Mapped configuration base @ 0xe000
ioapic0: routing intpin 9 (ISA IRQ 9) to lapic 0 vector 48
acpi0: [MPSAFE]
acpi0: [ITHREAD]
ACPI: Executed 1 blocks of module-level executable AML code
acpi0: Power Button (fixed)
acpi0: wakeup code va 0xff806000 pa 0x4000
AcpiOsDerivePciId: \_SB_.PCI0.SBRG.IELK.RXA0 -> bus 0 dev 0 func 0
AcpiOsDerivePciId: \_SB_.PCI0.SBRG.PIX0 -> bus 0 dev 31 func 0
acpi0: reservation of 0, a (3) failed
acpi0: reservation of 10, 7f60 (3) failed
ACPI timer: 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 -> 10
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
pci_link0:Index  IRQ  Rtd  Ref  IRQs
  Initial Probe   0   10   N 0  3 4 5 6 7 10 11 12 14 15
  Validation  0   10   N 0  3 4 5 6 7 10 11 12 14 15
  After Disable   0  255   N 0  3 4 5 6 7 10 11 12 14 15
pci_link1:Index  IRQ  Rtd  Ref  IRQs
  Initial Probe   0   11   N 0  3 4 5 6 7 10 11 12 14 15
  Validation  0   11   N 0  3 4 5 6 7 10 11 12 14 15
  After Disable   0  255   N 0  3 4 5 6 7 10 11 12 14 15
pci_link2:

Re: em0 freezes on ZFS server

2010-02-26 Thread Daniel Braniss

> On Fri, 26 Feb 2010 13:31:38 +0100 Gerrit Kühn
>  wrote about Re: em0 freezes on ZFS server:
> 
> GK> JC> Note how close the "current" value is to that of "total".  I'm not
> GK> JC> too surprised you're seeing what you are as a result of this.
> GK> JC> What on earth is this machine doing at all times?
> 
> GK> Is there any way I could find out what is actually using these buffers?
> 
> Sorry for replying to my own email:
> At least in my case I found out what is eating the buffers: nfsd does!
> The buffers stop increasing as soon as I stop nfsd. However, they start
> increasing as soon as I start nfsd again.
> Are there any ideas how to fix this? Downgrading back to 7-stable is not
> really an easy task as far as I know, and I need the server to run without
> having to reboot it once for twice a day...

I want to add some spices to this stew: :-)
I have this big server (> 10 TB) which was running pretty much without major
problems, till one morning it started panicking because of some
'ZFS * credential *'.
Since this server is used by many, and uptime is a priority,
I upgraded it to 8-stable; the panic went away, one problem solved.

A few days later it hung, and it's now hanging every few days.
Most of the hangs are because there is no network, but the NIC is bce, not em!
I doubled kern.ipc.nmbclusters, and let's see what happens ...

netstat -m:
23066/6634/29700 mbufs in use (current/cache/total)
22072/5942/28014/51200 mbuf clusters in use (current/cache/total/max)
22021/2939 mbuf+clusters out of packet secondary zone in use (current/cache)
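
To log these counters over time rather than eyeball them, the lines can be
parsed mechanically. A minimal sketch, assuming the line format stays exactly
as pasted above; parse_netstat_m is a made-up helper name and the regular
expression is fitted to these three lines only.

```python
import re


def parse_netstat_m(text):
    """Pull the slash-separated counters out of `netstat -m` style lines."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"\s*([\d/]+) (.+?) in use \(([\w/]+)\)", line)
        if not m:
            continue
        values = [int(v) for v in m.group(1).split("/")]
        names = m.group(3).split("/")
        stats[m.group(2)] = dict(zip(names, values))
    return stats


# The three lines pasted in the message above.
SAMPLE = """\
23066/6634/29700 mbufs in use (current/cache/total)
22072/5942/28014/51200 mbuf clusters in use (current/cache/total/max)
22021/2939 mbuf+clusters out of packet secondary zone in use (current/cache)
"""

if __name__ == "__main__":
    stats = parse_netstat_m(SAMPLE)
    clusters = stats["mbuf clusters"]
    print(clusters)
    print("cluster usage: %.1f%% of max"
          % (100.0 * clusters["total"] / clusters["max"]))
```

With the numbers above, the cluster total is a bit over half of the
(already doubled) maximum.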

hope this helps in finding a cure,
danny



Re: em0 freezes on ZFS server

2010-02-26 Thread Daniel Braniss

> On Fri, 26 Feb 2010 15:04:37 +0200 Daniel Braniss 
> wrote about Re: em0 freezes on ZFS server :
> 
> DB> > At least in my case I found out what is eating the buffers: nfsd
> DB> > does! The buffers stop increasing as soon as I stop nfsd. However,
> DB> > they start increasing as soon as I start nfsd again.
> DB> > Are there any ideas how to fix this? Downgrading back to 7-stable is
> DB> > not really an easy task as far as I know, and I need the server to
> DB> > run without having to reboot it once for twice a day...
> 
> DB> I want to add some spices to this stew: :-)
> 
> You're welcome. :-)
> 
> DB> Some few day later it hung, and it's now hanging every few days.
> DB> Most of the hangs are because there is no network, but the NIC is bce
> DB> not em! I doubled kern.ipc.nmbclusters and lets see what happens ...
> 
> Do you have nfsd running and serving clients? If so, we should maybe
> change the topic to something like "possible nfs mbuf leakage"...
> 
its only purpose in life is to be an nfs server.
but I wouldn't exclude zfs from the equation yet.
I have other nfs servers, not doing zfs, and I don't see this.

> DB> 23066/6634/29700 mbufs in use (current/cache/total)
> 
> My server is at 22k now, and the buffer number is still increasing every
> few seconds...
> Can you monitor your mbuf usage and report if it grows?
> 
I am, and in the last 2 hours it grew by about 300. It does oscillate, i.e. it
grows some, then it goes down, but it seems that the low always increases.

when I have enough data i'll plot it.
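
One way to test "the low always increases" without a plot is to compare the
minimum of successive sampling windows. A sketch under that assumption; the
sample values below are synthetic, not measurements from this server.

```python
def window_lows(samples, window):
    """Minimum of each consecutive full window of `window` samples."""
    return [min(samples[i:i + window])
            for i in range(0, len(samples) - window + 1, window)]


def floor_is_rising(samples, window):
    """True if every window's low is higher than the previous window's."""
    lows = window_lows(samples, window)
    return len(lows) >= 2 and all(b > a for a, b in zip(lows, lows[1:]))


if __name__ == "__main__":
    # Synthetic numbers for illustration only.
    leaking = [100, 140, 110, 150, 120, 160, 130, 170, 140]
    steady = [100, 140, 100, 150, 100, 160, 100, 170, 100]
    print(window_lows(leaking, 3), floor_is_rising(leaking, 3))
    print(window_lows(steady, 3), floor_is_rising(steady, 3))
```

An oscillating but healthy counter keeps returning to the same floor; a leak
shows up as a floor that creeps upward from window to window.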

Cheers,
danny



Re: em0 freezes on ZFS server

2010-02-26 Thread Daniel Braniss


> when I have enough data i'll plot it.
> 
check:
ftp://ftp.cs.huji.ac.il/users/danny/freebsd/plot.ps
x is seconds, y is mbufs current.

> Cheers,
>   danny
> 
> 
> 



Re: mbuf leakage with nfs/zfs? (was: em0 freezes on ZFS server)

2010-02-26 Thread Daniel Braniss

> On Fri, 26 Feb 2010 17:41:02 +0200 Daniel Braniss 
> wrote about Re: em0 freezes on ZFS server :
> 
> DB> check:
> DB>   ftp://ftp.cs.huji.ac.il/users/danny/freebsd/plot.ps
> DB> x is seconds, y is mbus current.
> 
> Looks not as bad as mine. I had 37k when I rebooted the machine some
> minutes ago (and it's basically idle, just serving a few nfs clients that
> don't do much).
> But from the values Jeremy has posted and from my own comparsisons here I
> would think that something like 5k of mbuf clusters would be normal for my
> machine (and probably also for yours).
> 
> Some more info from my side:
> In the meantime I also tried a different network interface. The
> nfe-interface that is onboard causes the same problems, so it is probably
> not an em-specific issue.
> Furthermore I found this via Google:
> <http://lists.freebsd.org/pipermail/freebsd-current/2009-December/014062.html>.
I'll have to do some packet snooping to check if it's TCP or UDP nfs traffic,
since some of the clients are Linux ...

> I patched and recompiled my kernel with this, just to try it out. Right
> now I have
> 
> 2264/1321/3585 mbufs in use (current/cache/total)
> 1239/1017/2256/65000 mbuf clusters in use (current/cache/total/max)
> 1239/809 mbuf+clusters out of packet secondary zone in use (current/cache)
> 
> but the uptime is only 12min so far. In some hours I'll know for certain
> if this patch has anything to do with the problem.

at the moment there is not much activity, but if you check the latest plot.ps
you will see that the bottom is slowly increasing, so my bet is that there
must be some leakage!

cheers
danny



Re: mbuf leakage with nfs/zfs?

2010-02-26 Thread Daniel Braniss

> On Fri, 26 Feb 2010 23:12:39 +0100 Willem Jan Withagen 
> wrote about Re: mbuf leakage with nfs/zfs?:
> 
> WJW> > DB>  I'll have to do some packet snooping to check if it's TCP or
> WJW> > DB> UDP nfs traffic, since some of the clients are Linux ...
> 
> WJW> > I have Linux clients, too. Some use tcp, some udp.
> 
> WJW> I have Linux and FreeBSD clients running. The build system runs on 
> WJW> Linux. All Linux's are UDP
> 
> Another shot in the dark:
> After upgrading the server, all my Linux clients hang with "stale nfs
> dir/file handle/whatever". I was not able to umount them (not even
> forcefully). I had to use either lazy forceful umount (-fl) or reboot. Some
> of these clients are still hanging around, because they are physically
> hard to access (clean room installs etc.). Maybe these clients still try to
> establish connections that eat up the buffers and never come back?

I doubt it, but here is another shot:
are we all running samba? I'm asking because the lock manager keeps dying
and ...

cheers,
danny
PS: I dropped Jack from the CC, I think em is innocent :-)


Re: mbuf leakage with nfs/zfs?

2010-02-27 Thread Daniel Braniss

> On Sat, 27 Feb 2010 09:24:10 +0200 Daniel Braniss 
> wrote about Re: mbuf leakage with nfs/zfs? :
> 
> DB> I doubt it, but here is another shot:
> DB> are we all running samba? I'm asking because the lock manager keeps
> DB> dying and ...
> 
> Nope, no samba on my side. I am running lockd and statd on the server, but
> stopping them does not change anything. All clients are using option
> nolock anyway.
> 
it was a shot in the dark.
anyways, I am running tests on an 'unused' server, only me using it to
'make world', and it's leaking.

cheers,
danny



Re: mbuf leakage with nfs/zfs?

2010-02-27 Thread Daniel Braniss

> On Sat, 27 Feb 2010 11:14:56 +0200 Daniel Braniss 
> wrote about Re: mbuf leakage with nfs/zfs? :
> 
> DB> anyways, I am running tests on an 'unused' server, only me using it to
> DB> 'make world'
> DB> and it's leaking.
> 
> Hm, I've got a server with 8-PRE from somewhen in Nov09 that is serving
> nfs from zfs fine and shows no leakage...
> 
> 
> cu
>   Gerrit

the binary search has started!
sorry, have to go now :-) [really], but should be back in a couple of hours,
let me know if you managed to pin it down, else I can continue.

danny



Re: mbuf leakage with nfs/zfs?

2010-02-27 Thread Daniel Braniss

> On Sat, 27 Feb 2010 12:26:02 +0200 Daniel Braniss 
> wrote about Re: mbuf leakage with nfs/zfs? :
> 
> 
> DB> > Hm, I've got a server with 8-PRE from somewhen in Nov09 that is
> DB> > serving nfs from zfs fine and shows no leakage...
> 
> DB> the binary search has started!
> 
> After considering the last email from Willem: My 8-PRE server does not
> have udp Linux clients, only Linux with tcp. If indeed Linux with udp is
> causing the problem, it may very well even be in 8-PRE, and I just did not
> see it so far.

I have been running 8-rel for the last few hours, and the only client is
another 8-stable; furthermore, no ZFS, just plain UFS, and the leak is there!
I am now trying 8-rc2 but will check in the morning, it is after all Saturday
night :-)

danny



Re: mbuf leakage with nfs/zfs?

2010-02-28 Thread Daniel Braniss

> On Sat, Feb 27, 2010 at 10:53:00PM +0100, Willem Jan Withagen wrote:
> > On 27-2-2010 21:32, Eirik Øverby wrote:
> > >I've had a discussion with some folks on this for a while. I can easily
> > >reproduce this situation by mounting a FreeBSD ZFS filesystem via
> > >NFS-UDP from an OpenBSD machine. Telling the OpenBSD machine to use TCP
> > >instead of UDP makes the problem go away.
> > >
> > >Other FreeBSD systems mounting the same share, either using UDP or TCP,
> > >do not cause the problem to show up.
> > >
> > >A patch was suggested by Rick Macklem, but that did not solve the issue:
> > ><http://lists.freebsd.org/pipermail/freebsd-current/2009-December/014181.html>
> > 
> > I concur.
> > Everything in my network is now on TCP, and there is no mbuf leakage.
> > I just don't get over the 5500 mark, no matter what I throw at it.
> > 
> > I do feel that TCP is not as well performing on a local net with Linux,
> > hence the choice for UDP. But TCP is workable as next best.
> 
> I'm pulling in Robert Watson, who has some familiarity with the UDP
> stack/code in FreeBSD.  I'm not sure he'll be a sufficient source of
> knowledge for this specific issue since it appears (?) to be specific to
> NFS; Rick Macklem would be a better choice, but as reported, he's MIA.
> 
> Robert, are you aware of any changes or implementation issues which
> might cause excessive (read: leaking) mbuf use under UDP-based NFS?  Do
> you know of a way folks could determine the source of the leak, either
> via DDB or while the system is live?

I have been running some tests in a controlled environment.

server and client are both 64bit Xeon/X5550 @  2.67GHz with 16GB of memory
FreeBSD/SMP: 2 package(s) x 4 core(s) x 2 SMT threads

the client is running latest 8.0 stable
the load is created by running 'make -j32 buildworld' and sleeping 150 sec.
in between runs, this is the straight line you will see in the graphs.
Both the src and obj directories are NFS mounted from the server, regular UFS.

when server is running 7.2-stable no leakage is seen.
 see ftp://ftp.cs.huji.ac.il/users/danny/freebsd/mbufs/{tcp,udp}-7.2.ps
when server is running 8.0-stable
 see ftp://ftp.cs.huji.ac.il/users/danny/freebsd/mbufs/{tcp,udp}-8.0.ps
you can see that udp is leaking!

cheers,
danny
ps: I think the subject should be changed again, removing zfs ...



Re: mbuf leakage with nfs/udp (was: mbuf leakage with nfs/zfs)

2010-02-28 Thread Daniel Braniss

>
> On Feb 28, 2010, at 12:11 PM, Daniel Braniss wrote:
> 
> >> I'm pulling in Robert Watson, who has some familiarity with the UDP
> >> stack/code in FreeBSD.  I'm not sure he'll be a sufficient source of
> >> knowledge for this specific issue since it appears (?) to be specific to
> >> NFS; Rick Macklem would be a better choice, but as reported, he's MIA.
> >> 
> >> Robert, are you aware of any changes or implementation issues which
> >> might cause excessive (read: leaking) mbuf use under UDP-based NFS?  Do
> >> you know of a way folks could determine the source of the leak, either
> >> via DDB or while the system is live?
> > 
> > I have been running some tests in a controlled environment.
> > 
> > server and client are both 64bit Xeon/X5550 @  2.67GHz with 16GB of memory
> > FreeBSD/SMP: 2 package(s) x 4 core(s) x 2 SMT threads
> > 
> > the client is running latest 8.0 stable
> > the load is created by running 'make -j32 buildworld' and sleeping 150 sec.
> > in between runs, this is the straight line you will see in the graphs.
> > Both the src and obj directories are NFS mounted from the server, regular
> > UFS.
> > 
> > when server is running 7.2-stable no leakage is seen.
> > see ftp://ftp.cs.huji.ac.il/users/danny/freebsd/mbufs/{tcp,udp}-7.2.ps
> > when server is running 8.0-stable
> > see ftp://ftp.cs.huji.ac.il/users/danny/freebsd/mbufs/{tcp,udp}-8.0.ps
> > you can see that udp is leaking!
> > 
> > cheers,
> > danny
> > ps: I think the subject should be changed again, removing zfs ...
> 
> This type of problem (occurs with one client but not another) is almost
> always the result of the access pattern of a particular client triggering
> a specific (and perhaps single) bug in error-handling. For example, we
> might not be properly freeing the received request when generating an
> EPERM in an edge case. The hard bit is identifying which it is. If it's
> reproducible with UDP, then usually the process is:
> 
> - Build a minimal test case to trigger the problem -- ideally with as
>   little complexity as possible.
> - Run netstat -m at the beginning of the test and the end of the test on
>   the server to count the number of leaked mbufs
> - Run wireshark throughout the test
> - Walk the wireshark trace looking for some error that occurs at about the
>   same or slightly lower number of times than the number of mbufs leaked
> - Iterate, narrowing the test case until it's either obvious exactly what's
>   going on, or you've identified a relatively constrained code path and can
>   just spot the bug by reading the code
> 
> It's almost certainly one or a small number of very specific RPCs that are
> triggering it -- maybe OpenBSD does an extra lookup, or stat, or
> something, on a name that may not exist anymore, or does it sooner than
> the other clients. Hard to say, other than to wave hands at the
> possibilities.
> 
> And it may well be we're looking at two bugs: Danny may see one bug,
> perhaps triggered by a race condition, but it may be different from the
> OpenBSD client-triggered bug (to be clear: it's definitely a FreeBSD bug,
> although we might only see it when an OpenBSD client is used because
> perhaps OpenBSD also has a bug or feature).
> 
> Robert

well, I have further reduced the problem, it happens with NFS/UDP writes.
I'll try the wireshark road, but I'm very rusty with RPC; the other road is to
check the changes: my oldest is from late October (RC2) where it's happening,
while Gerrit tried 8-pre from November and it worked, so it will be fun
trying to nail it down :-)
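
A rough sketch of the netstat -m bookkeeping step from Robert's list, in case
it helps anyone repeating the test. It assumes the summary line has the format
quoted earlier in this thread ("N/N/N mbufs in use (current/cache/total)");
the struct and function names are made up for illustration and are not part
of any FreeBSD tool:

```c
#include <stdio.h>
#include <string.h>

/* Counters from the "mbufs in use" line of `netstat -m` (assumed format). */
struct mbuf_counts {
	long current;
	long cache;
	long total;
};

/*
 * Parse a line such as
 *   "2264/1321/3585 mbufs in use (current/cache/total)"
 * Returns 0 on success, -1 if the line is not the mbuf summary line
 * (for example the "mbuf clusters" line, which has four counters).
 */
int
parse_mbuf_line(const char *line, struct mbuf_counts *out)
{
	char word[16];

	if (sscanf(line, "%ld/%ld/%ld %15s",
	    &out->current, &out->cache, &out->total, word) != 4)
		return (-1);
	return (strcmp(word, "mbufs") == 0 ? 0 : -1);
}

/* mbufs in use after the test that were not in use before it. */
long
mbuf_delta(const struct mbuf_counts *before, const struct mbuf_counts *after)
{
	return (after->current - before->current);
}
```

Feed it the mbuf line from netstat -m taken before and after a run; a delta
that keeps growing across idle periods is the number to compare against the
error counts in the wireshark trace.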

cheers,
danny



Re: mbuf leakage with nfs/zfs?

2010-03-02 Thread Daniel Braniss

> 
> 
> On Sat, 27 Feb 2010, Jeremy Chadwick wrote:
> 
> >> I concur.
> >> Everything in my network is now on TCP, and there is no mbuf leakage.
> >> I just don't get over the 5500 mark, no matter what I throw at it.
> >>
> >> I do feel that TCP is not as well performing on a local net with Linux,
> >> hence the choice for UDP. But TCP is workable as next best.
> >
> > NFS; Rick Macklem would be a better choice, but as reported, he's MIA.
> >
> 
> Not exactly MIA, but only able to read email from time to time at this
> point. I don't know when I'll be able to do more than that.
> 
> So, it does sound like it is UDP specific. Robert mentioned one scenario,
> which was an infrequently executed code path that is being tickled and it
> has a missing m_freem().
> 
> One thing someone could try is switching to the experimental nfs server
> ("-e" on both mountd and nfsd) and see if the leak goes away. If it does
> go away, it is almost certainly the above in the regular nfs server code.
> 
running with the experimental nfs server all is ok!
(at least I can't see any mbuf leakage :-)

so now that we can assume that the problem is in NFS/UDP writes via
classic nfsserver, where to look?

> If it doesn't go away, the problem is more likely in the krpc or the
> generic udp code. (When I looked at svc_dg.c, I could only spot one
> possible leak and you've already determined that patch doesn't help.)
> The other big difference when using udp on the FreeBSD8 krpc is the
> reply cache code. I seem to recall it's an lru cache with a fixed upper
> bound, but it might be broken and leaking.
> 
> If you change the server to set sp_rcache = NULL in the initialization
> function in sys/nfsserver/nfs_srvkrpc.c, I think that disables the replay
> cache. You wouldn't want to run this way in production, but it would 
> determine if the leak is in it.
> 
> Change the 3 lines in nfsrv_init() to:
> nfsrv_pool->sp_rcache = NULL;
> nfsrv_pool->sp_assign = NULL;
> nfsrv_pool->sp_done = NULL;
> 
> and I think the krpc replay cache will be disabled.
> 
> Good luck with it and please report back if you get to try the above.
> 
> I'll get back to committing etc one of these days, rick

just keep sending insights/pointers and enjoy life

danny



Re: mbuf leakage with nfs/udp (was mbuf leakage with nfs/zfs?)

2010-03-03 Thread Daniel Braniss

> 
> 
> On Tue, 2 Mar 2010, Daniel Braniss wrote:
> 
> > running with the experimental nfs server all is ok!
> > (at least I can't see any mbuf leakage :-)
> >
> > so now that we can assume that the problem is in NFS/UDP writes via
> > classic nfsserver, where to look?
> >
> 
> It might also be the krpc reply cache, since the experimental server
> isn't using it (nfsv4 requires a rather twisted reply cache and it was
> easier to just use that one for nfsv2,3 for the experimental server,
> as well).
> 
> >> If it doesn't go away, the problem is more likely in the krpc or the
> >> generic udp code. (When I looked at svc_dg.c, I could only spot one
> >> possible leak and you've already determined that patch doesn't help.)
> >> The other big difference when using udp on the FreeBSD8 krpc is the
> >> reply cache code. I seem to recall it's an lru cache with a fixed upper
> >> bound, but it might be broken and leaking.
> >>
> >> If you change the server to set sp_rcache = NULL in the initialization
> >> function in sys/nfsserver/nfs_srvkrpc.c, I think that disables the replay
> >> cache. You wouldn't want to run this way in production, but it would
> >> determine if the leak is in it.
> >>
> >> Change the 3 lines in nfsrv_init() to:
> >> nfsrv_pool->sp_rcache = NULL;
> >> nfsrv_pool->sp_assign = NULL;
> >> nfsrv_pool->sp_done = NULL;
> >>
> >> and I think the krpc replay cache will be disabled.
> >>
> 
> If someone gets a chance to try the above (not in production mode:-),
> it will determine if the problem is in the reply cache or the nfs server's
> write code.
> >> Good luck with it and please report back if you get to try the above.
> >>
> 
> Thanks for trying the experimental server. It is getting narrowed down,
> due to everyone's work on it.
> 
disabling the krpc reply cache does it, no visible damage. Somehow
this reminds me of my old 1970 beetle, parts would fall off but it would
continue working :-)
where to go from here?

danny



Re: mbuf leakage with nfs/udp (was mbuf leakage with nfs/zfs?)

2010-03-03 Thread Daniel Braniss

> 
> 
> On Wed, 3 Mar 2010, Daniel Braniss wrote:
> 
> > disabling the krpc reply cache does it, no visible damage. Somehow
> > this reminds me of my old 1970 beetle, parts would fall off but it would
> > continue working :-)
> > where to go from here?
> >
> Ok, so it sounds like the leak is in the krpc reply cache code, if I
> understand this? (ie. you are running the regular server with the reply
> cache disabled and the UDP client mounts aren't causing the leak.)

correct. The interesting side effect is that I can't see any negative
issues when disabling the cache.
> 
> Good work on tracking this down!
> 
it was a coordinated effort :-)

> I guess the next step is to look through the code for the leak. I'll
> do that someday, but if anyone else is inspired to do so, they are
> more than welcome.:-)
> 
> Thanks for working through this, rick

thank you! I have a vested interest in having this fixed; on the other hand,
nfsd seems ok. I have been running it now on a semi-production server and
it's holding up quite nicely; the cache seems not up to expectations:

store-mg-03# nfsstat -se
Server Info:
  Getattr   SetattrLookup  Readlink  Read WriteCreateRemove
 48176764262687  12582599 19732   4225907   9186574780793818837
   Rename  Link   Symlink Mkdir Rmdir   Readdir  RdirPlusAccess
 7623   160 27753 59551 59552118216 0   1992779
MknodFsstatFsinfo  PathConfCommit   LookupP   SetClId SetClIdCf
097900519 0   1644267 0 0 0
 Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet GetFH  Lock
0 0 0 0 0 0 0 0
LockT LockU CloseVerify   NVerify PutFH  PutPubFH PutRootFH
0 0 0 0 0 0 0 0
Renew RestoreFHSaveFH   Secinfo RelLckOwn  V4Create
0 0 0 0 0 0
Server:
RetfailedFaults   Clients
0 0 0
OpenOwner Opens LockOwner LocksDelegs 
0 0 0 0 0 
Server Cache Stats:
   Inprog      Idem  Non-idem    Misses CacheSize   TCPPeak
      307         0       297  80943198         0         0

danny



Re: mbuf leakage with nfs/zfs?

2010-03-04 Thread Daniel Braniss

> 
> 
> On Tue, 2 Mar 2010, Daniel Braniss wrote:
> 
> >
> > just keep sending insights/pointers and enjoy life
> >
> 
> 
> You could try this patch for sys/rpc/replay.c. Completely untested and
> just typed into email (so don't give it to "patch", just edit the file).
> 
> - try adding these 2 lines just before the end of replay_setreply() in
>sys/rpc/replay.c:
> 
> - }
> + } else if (m)
> + m_freem(m);
>   mtx_unlock(&rc->rc_lock);
> }
> 
> It's the only place I can see in replay.c that might leak, rick
> 
this is what I did:
--- a/sys/rpc/replay.c  Mon Mar 01 18:29:54 2010 +0200
+++ b/sys/rpc/replay.c  Fri Mar 05 09:24:17 2010 +0200
@@ -243,6 +243,9 @@
 		rce->rce_repbody = m;
 		if (m)
 			rc->rc_size += m_length(m, NULL);
+	} else if (m) {
+		printf("free m=%p ...\n", m);
+		m_freem(m);
 	}
 	mtx_unlock(&rc->rc_lock);
 }

but it didn't help, it's not triggered

Thanks for the explanation on the cache, things are beginning to make sense.
If I understand, the reason for this cache is to prevent re-applying an
already performed rpc, which could lead to data corruption.
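
A toy sketch of that idea, under loud assumptions: this is NOT the code in
sys/rpc/replay.c. All names are invented, the table is a fixed-size array
indexed by the RPC transaction id (xid) alone, and the "reply" is just a
number standing in for the saved reply body. It only shows why a
retransmitted non-idempotent request must be answered from the cache instead
of being executed a second time:

```c
#include <stdint.h>
#include <string.h>

#define DRC_SLOTS 64		/* hypothetical fixed cache size */

struct drc_entry {
	int      valid;
	uint32_t xid;		/* a real cache would also key on the client address */
	long     reply;		/* stand-in for the saved reply */
};

struct drc {
	struct drc_entry slot[DRC_SLOTS];
};

struct toy_file {
	long length;
};

void
drc_init(struct drc *c)
{
	memset(c, 0, sizeof(*c));
}

/* Returns 1 and fills *reply if this xid was already answered, else 0. */
int
drc_lookup(const struct drc *c, uint32_t xid, long *reply)
{
	const struct drc_entry *e = &c->slot[xid % DRC_SLOTS];

	if (e->valid && e->xid == xid) {
		*reply = e->reply;
		return (1);
	}
	return (0);
}

/* Remember the reply for this xid, overwriting whatever shared the slot. */
void
drc_store(struct drc *c, uint32_t xid, long reply)
{
	struct drc_entry *e = &c->slot[xid % DRC_SLOTS];

	e->valid = 1;
	e->xid = xid;
	e->reply = reply;
}

/* A non-idempotent operation: each execution grows the file. */
long
handle_append(struct drc *c, struct toy_file *f, uint32_t xid, long nbytes)
{
	long reply;

	if (drc_lookup(c, xid, &reply))
		return (reply);	/* retransmission: resend, do not redo */
	f->length += nbytes;
	reply = f->length;
	drc_store(c, xid, reply);
	return (reply);
}
```

With the cache in place, a client that retransmits xid 7 over UDP gets the
same answer back and the file grows only once; without it the append would be
applied twice.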

btw, the list of CCs is rather big, so if anyone feels he would rather be
removed, please let me know.

cheers,
danny





Re: mbuf leakage with nfs/zfs?

2010-03-05 Thread Daniel Braniss

[...]
> > but it didn't help, it's not triggered
> >
> 
> Hmm, well that's the only place I could see in replay.c that could leak
> (and it's a pretty straightforward piece of code). This is getting
> interesting. Just to confirm where we currently are...
> 
> - replay cache disabled --> no leak
> - replay cache enabled (with or without the above patch) --> leak
> 
yes and yes.

> I'll take another look, but I doubt the leak is in replay.c so... maybe
> a reply from the cache is somehow handled incorrectly and that causes the
> leak elsewhere? (Just a random hunch at this point.)
> 
it works ok in 7.2, so it would be interesting to compare changes ...

> > Thanks for the explanation on the cache, things are beginning to make sense.
> > If I understand, the reason for this cache is to prevent re-applying an
> > already performed rpc, which could lead to data corruption
> >
> 
> Yep, you've got it. It is basically a bandaid for the poor transport
> semantics provided by UDP.
> 
> Having fun with this one. Thanks for the help, rick
> 
I'm glad :-)

danny



is dtrace usable?

2010-03-06 Thread Daniel Braniss

hi,
I get 
link_elf_obj: symbol lapic_cyclic_clock_func undefined

when trying
kldload dtraceall
this is with a fairly recent 8-stable

I'm trying to help Rick Macklem debug the NFS/UDP problem, and I
thought it would be a good chance to learn dtrace, but :-(

danny



Re: is dtrace usable?

2010-03-06 Thread Daniel Braniss

> 
> On Sat, 6 Mar 2010, Daniel Braniss wrote:
> 
> > link_elf_obj: symbol lapic_cyclic_clock_func undefined
> >
> > when trying
> > kldload dtraceall
> > this is with a fairly recent 8-stable
> >
> > I'm trying to help Rick Macklem debug the NFS/UDP problem, and I thought it
> > would be a good chance to learn dtrace, but :-(
> 
> Take a look at the DTrace configuration information here:
> 
>http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/dtrace.html
> 
> And here:
> 
>http://wiki.freebsd.org/DTrace
> 
> It looks like options KDTRACE_HOOKS may not be defined in your kernel 
> configuration, but there are some other details, such as WITH_CTF=1, that 
> you'll also need to make sure are appropriately set.
> 
> Robert

I did all that, but booted the wrong kernel,
sorry for the noise

danny



Re: Fwd: Re: NFS Client error

2010-03-09 Thread Daniel Braniss

> Thanks for your kind reply, I'm forwarding it there...
> 
> 
>  Original Message 
> Subject:  Re: NFS Client error
> Date: Mon, 08 Mar 2010 23:59:29 +0100
> From: vol...@vwsoft.com
> To:   Giulio Ferro 
> CC:   freebsd-hack...@freebsd.org, freebsd-...@freebsd.org
> 
> 
> 
> On 03/08/10 12:16, Giulio Ferro wrote:
> >  Freebsd 8 stable amd64
> >
> >  It mounts different file systems by NFS (with locking) on a
> >  data server directly connected (gigabit) to the server
> >
> >  Apache running in several jails on those nfs folders.
> >
> >  Now and then I get huge slow-down. When I look in the logs
> >  I get thousands of lines like these:
> >  Mar  5 11:50:52 virt2 kernel: vm_fault: pager read error, pid 46487 (httpd)
> >  Mar  5 11:50:52 virt2 kernel: pid 46487 (httpd), uid 80: exited on
> >  signal 11
> >
> >
> >  What should I do?

If the binary (httpd) is on an NFS server and the binary got
modified, this is what usually happens.

my 2c
danny

> 
> Giulio,
> 
> it seems this is anyhow not related to network (nfs) operations. It's
> looking like a problem in the VM. I think it makes sense to have a look
> at the httpd.core file if the binary has been linked with debugging
> symbols turned on. Also I think at first, it may not hurt to look at
> vmstat -m output.
> 
> You may want to change ${subject} and post to stable@ to drive more
> attention to your problem.
> 
> Volker
> 
> 
> 



boot and boot0cfg problem

2010-03-30 Thread Daniel Braniss

hi,
I have this SBC that boots off a CF card,
when it boots, I can select the boot partition via F1 or F2
and all is OK.
when I do it via boot0cfg the 'default_selection' changes
correctly, but the 'active' partition is not changed, so boot
ignores it.
I went ahead and changed boot0cfg.c to set the active
partition and now I'm baffled:

alix-3# ./boot0cfg -v ad0
#   flag     start chs   type       end chs       offset         size
1   0x00      0:  1: 1   0xa5    519: 15:63           63       524097
2   0x80    520:  0: 1   0xa5   1023: 15:63       524160       524160 --+
3   0x00   1023:255:63   0xa5   1023: 15:63      1048320      2951424   |
                                                                        |
version=2.0  drive=0x80  mask=0xf  ticks=182  bell=# (0x23)             |
options=packet,update,nosetdrv                                          |
volume serial ID -800f                                                  |
default_selection=F2 (Slice 2) <----------------------------------------+
so far so good.

alix-3# ./boot0cfg -v -s1 ad0
...
1   0x80      0:  1: 1   0xa5    519: 15:63           63       524097
...
default_selection=F1 (Slice 1)

ok right? but no!
./boot0cfg -v ad0
...
2   0x80    520:  0: 1   0xa5   1023: 15:63       524160       524160
...
default_selection=F1 (Slice 1)

so it seems that someone is preventing changes to the partition table!
btw, this problem was not present in older boot0 (1.0) where the active
partition flag is ignored.
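
For reference, the "flag" column above is the first byte of each 16-byte MBR
slice entry, and 0x80 marks the active slice. A minimal sketch of flipping it
in an in-memory copy of sector 0, assuming only the standard MBR layout
(table at offset 446, four entries, 0x55 0xaa signature at 510). The function
name is made up, and this says nothing about what boot0cfg or GEOM actually
do with the sector:

```c
#include <stdint.h>

#define MBR_TABLE_OFF	446	/* start of the slice table in sector 0 */
#define MBR_ENTRY_SIZE	16	/* bytes per slice entry */
#define MBR_NENTRIES	4
#define MBR_ACTIVE	0x80	/* value of the flag byte for the active slice */

/*
 * Mark slice `slice` (1..4) active in a 512-byte copy of the MBR and clear
 * the flag on the other three. Returns 0 on success, -1 on a bad slice
 * number or a missing boot signature.
 */
int
mbr_set_active(uint8_t *sector, int slice)
{
	int i;

	if (slice < 1 || slice > MBR_NENTRIES)
		return (-1);
	if (sector[510] != 0x55 || sector[511] != 0xaa)
		return (-1);
	for (i = 0; i < MBR_NENTRIES; i++) {
		uint8_t *flag = &sector[MBR_TABLE_OFF + i * MBR_ENTRY_SIZE];

		*flag = (i == slice - 1) ? MBR_ACTIVE : 0x00;
	}
	return (0);
}
```

The observation in this message is that the equivalent change made through
boot0cfg does not stick on disk, so the sketch only pins down what "setting
the active partition" means at the byte level.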

help needed here!

danny



Re: boot and boot0cfg problem

2010-03-30 Thread Daniel Braniss

> On 30.03.2010 12:05, Daniel Braniss wrote:
> > so it seems that someone is preventing changes to the partition table!
> > btw, this problem was not present in older boot0 (1.0) where the active
> > partition flag is ignored.
> 
> You can change active partition via gpart(8).
> 
Hi Andrey,
I'm sorry, I've reread the manual, and can't find the right magic.
btw, boot0cfg does call geom but something seems to be broken.

cheers,
danny



Re: boot and boot0cfg problem

2010-03-30 Thread Daniel Braniss

> 30.03.10, 14:03, "Daniel Braniss" :
> 
> > > On 30.03.2010 12:05, Daniel Braniss wrote:
> >  > > so it seems that someone is preventing changes to the partition table!
> >  > > btw, this problem was not present in older boot0 (1.0) where the active
> >  > > partition flag is ignored.
> >  > 
> >  > You can change active partition via gpart(8).
> >  > 
> >  Hi Andrey,
> >  I'm sorry, I've reread the manual, and can't find the right magic.
> 
> Yes, I also don't remember where it can be read. Only in g_part_mbr.c :)
> Try this:
> # gpart set -a active -i 1 ada2
> This will set the first partition on ada2 active:
> # gpart show ada2
> =>        63  1250263665  ada2  MBR  (596G)
>           63    40965687     1  !7  [active]  (20G)
>     40965750  1209292875     2  !7  (577G)
>   1250258625        5103        - free -  (2.5M)
> 
> >  btw, boot0cfg does call geom but something seems to be broken.
> I'll look at the boot0cfg code today and probably make a patch.
ok, that worked!
now if you can get boot0cfg to work that would really be nice.
thanks,
danny



panic: vm_fault_copy_wired: page missing

2010-04-15 Thread Daniel Braniss

Hi,
I'm getting this with FreeBSD-8-stable, it usually happens when
starting apache:

panic: vm_fault_copy_wired: page missing
cpuid = 3
KDB: enter: panic
[thread pid 1013 tid 100106 ]
Stopped at      kdb_enter+0x3d: movq    $0,0x68f170(%rip)
db> tr
Tracing pid 1013 tid 100106 td 0xff0007a66ae0
kdb_enter() at kdb_enter+0x3d
panic() at panic+0x17b
vm_fault_copy_entry() at vm_fault_copy_entry+0x283
vmspace_fork() at vmspace_fork+0x4d0
fork1() at fork1+0x35f
fork() at fork+0x1c
syscall() at syscall+0x1e7
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (2, FreeBSD ELF64, fork), rip = 0x8009f41ac, rsp = 0x7fffe7d8, 
rbp = 0x800c34a80 ---

any help in tracking this?

thanks,
danny



Re: panic: vm_fault_copy_wired: page missing

2010-04-15 Thread Daniel Braniss

> On Thu, Apr 15, 2010 at 12:22 AM, Daniel Braniss wrote:
> > Hi,
> > I'm getting this with FreeBSD-8-stable, it usually happens when
> > starting apache:
> >
> > panic: vm_fault_copy_wired: page missing
> > cpuid = 3
> > KDB: enter: panic
> > [thread pid 1013 tid 100106 ]
> > Stopped at      kdb_enter+0x3d: movq    $0,0x68f170(%rip)
> > db> tr
> > Tracing pid 1013 tid 100106 td 0xff0007a66ae0
> > kdb_enter() at kdb_enter+0x3d
> > panic() at panic+0x17b
> > vm_fault_copy_entry() at vm_fault_copy_entry+0x283
> > vmspace_fork() at vmspace_fork+0x4d0
> > fork1() at fork1+0x35f
> > fork() at fork+0x1c
> > syscall() at syscall+0x1e7
> > Xfast_syscall() at Xfast_syscall+0xe1
> > --- syscall (2, FreeBSD ELF64, fork), rip = 0x8009f41ac, rsp = 0x7fffe7d8,
> > rbp = 0x800c34a80 ---
> >
> > any help in tracking this?
>
> Hi Danny,
> Can you provide some details about your systems, like amd64 vs
> i386, processor model, amount of RAM, swap, etc?
sure, straight from the lion's mouth:

Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-STABLE #33 r2073: Wed Apr 14 15:29:07 IDT 2010
da...@sunfire:/r+d/obj/sunfire/r+d/stable/8/sys/HUJI amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Dual-Core AMD Opteron(tm) Processor 2218 (2613.41-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0x40f13  Family = f  Model = 41  Stepping = 3
  Features=0x178bfbff
  Features2=0x2001
  AMD Features=0xea500800
  AMD Features2=0x1f
real memory  = 17179869184 (16384 MB)
avail memory = 16562614272 (15795 MB)

the hardware is a Sun X2200.

thanks for any help! this machine is supposed to replace our old web server
and it's not happening :-(

danny



Re: panic: vm_fault_copy_wired: page missing

2010-04-15 Thread Daniel Braniss

> On Thu, Apr 15, 2010 at 11:50:41AM +0300, Daniel Braniss wrote:
> > > On Thu, Apr 15, 2010 at 12:22 AM, Daniel Braniss wrote:
> > > > Hi,
> > > > I'm getting this with FreeBSD-8-stable, it usually happens when
> > > > starting apache:
> > > >
> > > > panic: vm_fault_copy_wired: page missing
> > > > cpuid = 3
> > > > KDB: enter: panic
> > > > [thread pid 1013 tid 100106 ]
> > > > Stopped at      kdb_enter+0x3d: movq    $0,0x68f170(%rip)
> > > > db> tr
> > > > Tracing pid 1013 tid 100106 td 0xff0007a66ae0
> > > > kdb_enter() at kdb_enter+0x3d
> > > > panic() at panic+0x17b
> > > > vm_fault_copy_entry() at vm_fault_copy_entry+0x283
> > > > vmspace_fork() at vmspace_fork+0x4d0
> > > > fork1() at fork1+0x35f
> > > > fork() at fork+0x1c
> > > > syscall() at syscall+0x1e7
> > > > Xfast_syscall() at Xfast_syscall+0xe1
> > > > --- syscall (2, FreeBSD ELF64, fork), rip = 0x8009f41ac, rsp = 0x7fffe7d8,
> > > > rbp = 0x800c34a80 ---
> > > >
> > > > any help in tracking this?
> > >
> > > Hi Danny,
> > > Can you provide some details about your systems, like amd64 vs
> > > i386, processor model, amount of RAM, swap, etc?
> > sure, straight from the lion's mouth:
> > 
> > Copyright (c) 1992-2010 The FreeBSD Project.
> > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> > The Regents of the University of California. All rights reserved.
> > FreeBSD is a registered trademark of The FreeBSD Foundation.
> > FreeBSD 8.0-STABLE #33 r2073: Wed Apr 14 15:29:07 IDT 2010
> > da...@sunfire:/r+d/obj/sunfire/r+d/stable/8/sys/HUJI amd64
> > Timecounter "i8254" frequency 1193182 Hz quality 0
> > CPU: Dual-Core AMD Opteron(tm) Processor 2218 (2613.41-MHz K8-class CPU)
> >   Origin = "AuthenticAMD"  Id = 0x40f13  Family = f  Model = 41  Stepping = 3
> >   
> > Features=0x178bfbff > CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
> >   Features2=0x2001
> >   AMD Features=0xea500800
> >   AMD Features2=0x1f
> > real memory  = 17179869184 (16384 MB)
> > avail memory = 16562614272 (15795 MB)
> > 
> > the hardware is a Sun X2200.
> > 
> > thanks for any help! this machine is supposed to replace our old web server
> > and it's not happening :-(
> 
> Could you please provide the following?
> 
> 1) Contents of /var/db/ports/apache-/options
sunfire> cat  /var/db/ports/apache-xml-security-c/options
# This file is auto-generated by 'make config'.
# No user-servicable parts inside!
# Options for apache-xml-security-c-1.4.0
_OPTIONS_READ=apache-xml-security-c-1.4.0
WITH_XERCES_DEVEL=true

> 2) Contents of /etc/make.conf
sunfire> cat /etc/make.conf
OVERRIDE_LINUX_BASE_PORT=f8
OVERRIDE_LINUX_NONBASE_PORTS=f8
WRKDIRPREFIX=/home/pobj
PACKAGES=/r+d/packages
FETCH_ENV=  HTTP_PROXY=http://wwwproxy.cs.huji.ac.il:8080/
# added by use.perl 2009-11-10 11:51:57
PERL_VERSION=5.10.1

> 3) Your kernel configuration file ("HUJI")
I'll try and send this as an attachment
sunfire> config -x /boot/kernel/kernel

> Thanks.
> 
> -- 
> | Jeremy Chadwick   j...@parodius.com |
> | Parodius Networking   http://www.parodius.com/ |
> | UNIX Systems Administrator  Mountain View, CA, USA |
> | Making life hard for others since 1977.  PGP: 4BD6C0CB |
> 

options CONFIG_AUTOGENERATED
ident   HUJI
machine amd64
cpu HAMMER
makeoptions DEBUG=-g
options PRINTF_BUFR_SIZE=256
options ALTQ_HFSC
options ALTQ_PRIQ
options ALTQ_CBQ
options ALTQ
options DEVICE_POLLING
options CONSPEED=115200
options ALT_BREAK_TO_DEBUGGER
options BOOTP_NFSV3
options INCLUDE_CONFIG_FILE
options AH_SUPPORT_AR5416
options IEEE80211_SUPPORT_MESH
options IEEE80211_AMPDU_AGE
options IEEE80211_DEBUG
options AHD_REG_PRETTY_PRINT
options AHC_REG_PRETTY_PRINT
options ATA_REQUEST_TIMEOUT=3
options SMP
options GDB
options DDB
options KDB
options FLOWTABLE
options MAC
options AUDIT
options HWPMC_HOOKS
options KBD_INSTALL_CDEV
options _KPOSIX_PRIORITY_SCHEDULING
options P1003_1B_SEMAPHORES
options SYSVSEM
options SYSVMSG
options SYSVSHM
options STACK
options KTRACE
options SCSI_DELAY=500
options COMPAT_FREEBSD7
options COMPAT_FREEBSD6
options COMPAT_FREEBSD5
options COMPAT_FREEBSD4
options COMPAT_FREEBSD32
options COMPAT_43TTY
options GEOM_LABEL
options GEOM_PART_GPT
options PSEUDOFS
options PROCFS
options CD9660
options MSDOSFS
options NFS_ROOT
options NFSLOCKD
option

Re: panic: vm_fault_copy_wired: page missing

2010-04-15 Thread Daniel Braniss

> On Thu, Apr 15, 2010 at 9:22 AM, Daniel Braniss  wrote:
> > Hi,
> > I'm getting this with FreeBSD-8-stable, it usually happens when
> > starting apache:
> 
> alc@ made some VM MFCs yesterday, could you try a 13th of April kernel
> and see if it works out for you?
> 
asap, btw, I reduced the amount of physical memory and things seem ok.

cheers,
danny



Re: panic: vm_fault_copy_wired: page missing

2010-04-15 Thread Daniel Braniss

> On Thu, Apr 15, 2010 at 9:22 AM, Daniel Braniss  wrote:
> > Hi,
> > I'm getting this with FreeBSD-8-stable, it usually happens when
> > starting apache:
> 
> alc@ made some VM MFCs yesterday, could you try a 13th of April kernel
> and see if it works out for you?
the kernel that panics does not include alc's MFC (I did the sync a few
hours before), so now I'm compiling with the MFC.
BTW, with less memory the server is still running!

danny



Re: panic: vm_fault_copy_wired: page missing

2010-04-15 Thread Daniel Braniss

> 
> On Thu, Apr 15, 2010 at 10:22:20AM +0300, Daniel Braniss wrote:
> > Hi,
> > I'm getting this with FreeBSD-8-stable, it usually happens when
> > starting apache:
> >
> > panic: vm_fault_copy_wired: page missing
> > cpuid = 3
> > KDB: enter: panic
> > [thread pid 1013 tid 100106 ]
> > Stopped at      kdb_enter+0x3d: movq    $0,0x68f170(%rip)
> > db> tr
> > Tracing pid 1013 tid 100106 td 0xff0007a66ae0
> > kdb_enter() at kdb_enter+0x3d
> > panic() at panic+0x17b
> > vm_fault_copy_entry() at vm_fault_copy_entry+0x283
> > vmspace_fork() at vmspace_fork+0x4d0
> > fork1() at fork1+0x35f
> > fork() at fork+0x1c
> > syscall() at syscall+0x1e7
> > Xfast_syscall() at Xfast_syscall+0xe1
> > --- syscall (2, FreeBSD ELF64, fork), rip = 0x8009f41ac, rsp = 0x7fffe7d8,
> > rbp = 0x800c34a80 ---
> >
> > any help in tracking this?
> >
> > thanks,
> > danny
> 
> Is it true that the process started, or at least some of loaded dso
> are from NFS mount ?
everything is NFS :-), the host is dataless,
but reducing the amount of physical memory has solved the
problem, so I don't think NFS is the problem.

danny



Re: panic: vm_fault_copy_wired: page missing

2010-04-15 Thread Daniel Braniss

> On Thu, Apr 15, 2010 at 9:22 AM, Daniel Braniss  wrote:
> > Hi,
> > I'm getting this with FreeBSD-8-stable, it usually happens when
> > starting apache:
> 
> alc@ made some VM MFCs yesterday, could you try a 13th of April kernel
> and see if it works out for you?

with or without the MFC it's still panicking, and the memory size does not
affect the outcome :-(

danny



Re: panic: vm_fault_copy_wired: page missing

2010-04-15 Thread Daniel Braniss

> 
> On Thu, Apr 15, 2010 at 12:39:13PM +0300, Daniel Braniss wrote:
> > > On Thu, Apr 15, 2010 at 10:22:20AM +0300, Daniel Braniss wrote:
> > > > Hi,
> > > > I'm getting this with FreeBSD-8-stable, it usually happens when
> > > > starting apache:
> > > >
> > > > panic: vm_fault_copy_wired: page missing
> > > > cpuid = 3
> > > > KDB: enter: panic
> > > > [thread pid 1013 tid 100106 ]
> > > > Stopped at      kdb_enter+0x3d: movq    $0,0x68f170(%rip)
> > > > db> tr
> > > > Tracing pid 1013 tid 100106 td 0xff0007a66ae0
> > > > kdb_enter() at kdb_enter+0x3d
> > > > panic() at panic+0x17b
> > > > vm_fault_copy_entry() at vm_fault_copy_entry+0x283
> > > > vmspace_fork() at vmspace_fork+0x4d0
> > > > fork1() at fork1+0x35f
> > > > fork() at fork+0x1c
> > > > syscall() at syscall+0x1e7
> > > > Xfast_syscall() at Xfast_syscall+0xe1
> > > > --- syscall (2, FreeBSD ELF64, fork), rip = 0x8009f41ac, rsp = 0x7fffe7d8,
> > > > rbp = 0x800c34a80 ---
> > > >
> > > > any help in tracking this?
> > > >
> > > > thanks,
> > > > danny
> > >
> > > Is it true that the process started, or at least some of loaded dso
> > > are from NFS mount ?
> > everything is NFS :-), the host is dataless,
> > but reducing the amount of physical memory has solved the
> > problem, so I don't think NFS is the problem.
> 
> I do think that NFS is problem. Another key point is that your process
> is mlock'ed, right ? This is kind of known issue with NFS and mlock.
> 
well, since it's panicking again, there goes the memsize theory.
this is getting weirder and weirder, it now panics on reboot:

Stopping cron.
Stopping sshd.
===> apache22 profile: httpd
===> apache22 profile: httpdyn
Stopping inetd.
Stopping ntpd.
Stopping lockd.
Waiting for PIDS: 1201.
Stopping statd.
Stopping nfsd.
Stopping mountd.
Stopping devd.
.
Apr 15 13:27:48 sf-02 syslogd: exiting on signal 15
panic: vm_fault_copy_wired: page missing
cpuid = 1
KDB: enter: panic
[thread pid 1014 tid 100118 ]
Stopped at      kdb_enter+0x3d: movq    $0,0x68f7a0(%rip)
db>  tr
Tracing pid 1014 tid 100118 td 0xff000533f3a0
kdb_enter() at kdb_enter+0x3d
panic() at panic+0x17b
vm_fault_copy_entry() at vm_fault_copy_entry+0x283
vmspace_fork() at vmspace_fork+0x4d0
fork1() at fork1+0x35f
fork() at fork+0x1c
syscall() at syscall+0x1e7
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (2, FreeBSD ELF64, fork), rip = 0x8009f41ac, rsp = 0x7fffe7d8, 
rbp = 0x800c34a00 ---
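
For readers following the backtrace: the panic is raised on the fork() path
(fork -> vmspace_fork -> vm_fault_copy_entry), which is where the kernel copies
wired pages into the child instead of marking them copy-on-write. The sketch
below only exercises that user-level sequence (wire a private page with mlock,
then fork) and is not a reproducer for the panic. It assumes the wired pages
come from mlock, as suggested in the quoted reply; the function name
fork_with_wired_page is made up for illustration, and the NFS-backed mapping
that seems to matter here is not modelled at all.

```python
import ctypes
import ctypes.util
import mmap
import os


def fork_with_wired_page():
    """mlock one private anonymous page, fork, and let the child read it.

    Returns (locked, child_exit_status). 'locked' is False when mlock is
    refused (for example by RLIMIT_MEMLOCK); the fork still happens.
    """
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.munlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

    length = mmap.PAGESIZE
    # Private mapping: a shared one would be shared with the child, not copied.
    buf = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    buf[:5] = b"wired"                    # touch the page so it is resident
    view = ctypes.c_char.from_buffer(buf)
    addr = ctypes.addressof(view)

    locked = libc.mlock(addr, length) == 0

    pid = os.fork()
    if pid == 0:
        # Child: it should see the parent's contents in its own copy.
        os._exit(0 if buf[:5] == b"wired" else 1)

    _, status = os.waitpid(pid, 0)
    if locked:
        libc.munlock(addr, length)
    del view                              # release the buffer export
    buf.close()
    return locked, os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(fork_with_wired_page())
```

On a healthy kernel the child simply exits with status 0; the point is only
to show which user-level operations lead into vmspace_fork with a wired entry.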

