Re: bridge, ipv6 and rtadvd

2011-02-06 Thread Stefan Bethke

On 06.02.2011 at 13:23, Spil Oss wrote:

> Hi All,
> 
> Don't know if this is expected behaviour.
> 
> My LAN (bge0) and WLAN (wlan0) are bridged in bridge0. I tried to run
> rtadvd on bridge0 but that didn't result in ipv6 addresses on my
> network. Tried running rtadvd directly /usr/sbin/rtadvd -c
> /etc/rtadvd.conf -f -D and saw the requests coming in from the client
> but that didn't result in a working ipv6 network. "Wild guessing" I
> tried loading it with /usr/sbin/rtadvd -f -D bge0 and I had a
> functional ipv6 network.
> 
> Is this intended behaviour? Am I doing something wrong?

It appears to be intentional; there was some discussion a couple of years back, 
and the current behavior is that virtual interfaces such as bridges do not 
automatically receive a link-local address.  rtadvd needs one on the interface 
it advertises on.

Since I prefer to have bridge0 as the "main" interface, I simply manually 
configured a link local address:

ipv6_enable="YES"
ipv6_gateway_enable="YES"
ipv6_network_interfaces="bridge0 gif0"
ipv6_ifconfig_bridge0="fe80::21c:c0ff:fe7d:8c50%bridge0"
ipv6_ifconfig_bridge0_alias0="2001:470:1f0b:::1 prefixlen 64"
ipv6_ifconfig_gif0="2001:470:1f0a:::2 2001:470:1f0a:::1 prefixlen 128"

$ cat /etc/rtadvd.conf 
bridge0:\
:addrs#1:addr="2001:470:1f0b:::":raflags#64:
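
To check that the manual link-local address really ended up on the bridge
before starting rtadvd, something like the following can help.  It is only a
sketch: the helper name has_link_local is made up for illustration, and it
simply looks for an fe80::/10 address in ifconfig output fed to it on stdin.

```shell
# Succeed if the ifconfig output on stdin lists an IPv6 link-local
# (fe80::/10) address.  Hypothetical helper, not part of FreeBSD.
has_link_local() {
    grep -Eq '^[[:space:]]*inet6 fe[89ab][0-9a-f]:'
}

# Usage on the router itself:
#   ifconfig bridge0 | has_link_local || echo "bridge0 has no link-local address"
```

If the check fails, rtadvd on bridge0 is unlikely to produce working router
advertisements, which matches the symptom described above.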

The IPv4 side of gif0 is brought up through a linkup script triggered by mpd 
when my DSL connection comes up; that also updates the endpoint address for the 
HE tunnel.

Oh, this is on -stable from Dec 4.

HTH,
Stefan

-- 
Stefan Bethke   Fon +49 151 14070811



___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: How to bind a static ether address to bridge?

2011-02-26 Thread Stefan Bethke

On 25.02.2011 at 07:56, Zhihao Yuan wrote:

> My server is behind a DHCP-enabled router, and it has two network
> interfaces, wlan0 and bge0. I want to use them together, so I bind
> them, plus tap0 to bridge0. But bridge has a random MAC address for
> each time it was created, which makes me hard to reserve an IP for it
> (since I need to forward some ports to this server). So I set
> net.link.bridge.inherit_mac=1, which makes bridge0 to use bge0's MAC
> address, always. But this causes another problem: the packets sent to
> bridge0 is also sent to bge0, -- the packets are duplicated! The
> kernel have to drop half of them. So how can I bind a distinct MAC
> address to a bridge?

This is in my router's rc.conf:
ifconfig_bridge0="ether 02:00:00:00:00:01 addm tap0 addm vlan1"
ifconfig_bridge0_alias0="inet 192.168.0.1/24"

vlan1 is on em0; neither has an address assigned.
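
The 02: prefix is not arbitrary: in the first octet of a MAC address, bit 0x02
marks the address as locally administered and bit 0x01 marks it as multicast,
so 02:... is a local unicast address that will not collide with a
vendor-assigned one.  A small sketch for checking a candidate address (the
function name is hypothetical):

```shell
# Succeed if the MAC address given as $1 is locally administered and unicast,
# judged by the two low bits of its first octet.
is_local_unicast_mac() {
    first_octet=${1%%:*}
    value=$((0x$first_octet))
    [ $((value & 2)) -ne 0 ] && [ $((value & 1)) -eq 0 ]
}

# Usage:
#   is_local_unicast_mac 02:00:00:00:00:01 && echo "fine for a bridge"
```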

And if you want to put IPv6 on there, you also have to add a link-local address 
to make rtadvd happy, something like:
ipv6_network_interfaces="bridge0 gif0"
ipv6_ifconfig_bridge0="fe80::21c:c0ff:fe7d:8c50%bridge0"
ipv6_ifconfig_bridge0_alias0="2001:470:1f0b:::1 prefixlen 64"


-- 
Stefan Bethke   Fon +49 151 14070811




Re: gptzfsboot serial support

2011-06-02 Thread Stefan Bethke

On 01.06.2011 at 10:16, Arnaud Houdelette wrote:

> Hi.
> 
> I know that there is 2 versions of boot0 : With and Without serial support.

boot0 and boot0sio are FreeBSD's versions of the MBR; they are what shows the 
F1..F4 prompt.

> Is it the same with gptzfsboot ?
> 
> How to build gptzfsboot with serial support, setting serial speed at 19200 
> baud ?

Serial support is built by default for boot1/2, loader and its variations.  To 
change the default from video console to serial, or to change the speed, see 
boot(8), and add the appropriate flags to /boot.config.

If you only require loader(8) to interact with the serial console, you can set 
loader.conf(5) variables to pick the console and the speed.
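
As a rough illustration for the 19200 baud case asked about, the two places
could look like this.  Treat it as a sketch to verify against boot(8) and
loader.conf(5) on your release; whether your boot blocks honor -S depends on
how they were built.

```shell
# /boot.config (read by the boot blocks; flags are documented in boot(8)):
#   -h -S19200
#
# /boot/loader.conf (read by loader(8); see loader.conf(5)):
console="comconsole"
comconsole_speed="19200"
```

Using -D instead of -h in /boot.config asks for both the video and the serial
console rather than serial only.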


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: Crashes with Promise controller

2011-06-18 Thread Stefan Bethke

On 13.06.2011 at 16:22, Christian Baer wrote:

> I have to slightly explain the word "crash" here: I don't actually have
> to hard reset the system myself. My box just does a reboot by itself. No
> filesystem is unmounted cleanly and because the machine isn't really new
> and powerful fsck takes pretty long.

I can't help you with your controllers, but anyone in a position to help will 
likely want to know if the box simply resets, or if the kernel panics.  And if 
there are going to be any patches, you most certainly will want to get familiar 
with the debugger to help try stuff out.  The handbook has information on how 
to enable crash dumps and getting the kernel debugger going.  If you haven't 
done so already, try and get a serial console going, it helps tremendously to 
be able to cut&paste debugger info instead of trying to hand transcribe it.
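
For reference, the crash dump part usually comes down to two rc.conf
settings.  This is a minimal sketch, assuming a swap partition large enough
to hold the dump; the handbook remains the authoritative description.

```shell
# /etc/rc.conf
dumpdev="AUTO"        # let the system pick the configured swap device for dumps
dumpdir="/var/crash"  # where savecore(8) stores the dump after the reboot

# Kernel configuration file (not rc.conf), only needed if the debugger is not
# already compiled into your kernel:
#   options KDB
#   options DDB
```

After the next panic, the dump should show up under /var/crash and can be
inspected with kgdb.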


HTH,
Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: OS X Lion time machine => (afpd|iSCSI) => ZFS question

2011-07-24 Thread Stefan Bethke

On 21.07.2011 at 23:56, Bakul Shah wrote:

> I am in no hurry to upgrade my MBP to OS X Lion but given Lion
> time machine and netatalk issues, I got wondering if iSCSI on
> FreeBSD is stable enough for time machine use. How much duct
> tape and baling wire are needed to make it work?!

After getting odd behavior from TM on a netatalk volume, I switched over to 
istgt and the globalSAN iSCSI initiator, using a ZVOL.  I found the istgt 
configuration non-obvious, but I also have little background in iSCSI.  It took 
me about an hour to get it up and running without authentication; I haven't 
bothered trying to get authentication to work since.


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: /usr/bin/script eating 100% cpu with portupgrade and xargs

2011-10-14 Thread Stefan Bethke


On 14.10.2011 at 14:03, Jilles Tjoelker wrote:

> On Wed, Oct 12, 2011 at 11:25:35PM +0100, Adrian Wontroba wrote:
>> On Sat, Oct 08, 2011 at 01:27:07AM +0100, Adrian Wontroba wrote:
>>> I won't be in a position to create a simpler test case, raise a PR or
>>> try patches till Tuesday evening (UK) at the earliest.
> 
>> So far I have been unable to reproduce the problem with portupgrade (and
>> will probably move to portmaster).
> 
>> I have however found a different but possibly related problem with the
>> new version of script in RELENG_8, for which I have raised this PR:
> 
>> misc/161526: script outputs corrupt if input is not from a terminal
> 
>> Blast, should of course been bin/
> 
> The extra ^D\b\b are the EOF character being echoed. These EOF
> characters are being generated by the new script(1) to pass through the
> EOF condition on stdin.
> 
> One fix would be to change the termios settings temporarily to disable
> the echoing but this may cause problems if the application is changing
> termios settings concurrently and generally feels bad.
> 
> It may be best to remove writing EOF characters, perhaps adding an
> option to enable it again if there is a concrete use case for it.

I finally figured out why my ports aren't updating anymore: when running 
portupgrade -a --batch from cron, stdin is /dev/null, and that produces the 
gobs of ^D in the output, as well as the script file that portupgrade creates.  
What's worse is that the upgrade never completes.

You can easily see this for yourself:
# portupgrade -a --batch

This is on 8-stable from October 5th.

-- 
Stefan Bethke   Fon +49 151 14070811




Re: /usr/bin/script eating 100% cpu with portupgrade and xargs

2011-10-15 Thread Stefan Bethke

On 15.10.2011 at 09:36, Mikolaj Golub wrote:

> 
> On Fri, 14 Oct 2011 22:50:32 +0200 Stefan Bethke wrote:
> 
> SB> I finally figured out why my ports aren't updating anymore: when running 
> portupgrade -a --batch from cron, stdin is /dev/null, and that produces the 
> gobs of ^D in the output, as well as the script file that portupgrade 
> creates.  What's worse is that the upgrade never completes.
> 
> SB> You can easily see this for yourself:
> SB> # portupgrade -a --batch  
> SB> This is on 8-stable from October 5th.
> 
> Could you please try the patch I attached to another my mail in this thread to
> see if it helps?


Seems to do the trick, thanks!


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: unix browsers problem

2011-10-15 Thread Stefan Bethke

On 15.10.2011 at 12:34, kapral wrote:

> When I connect with any freebsd 8.2 browsers  like epithany konqueror
> firefox7.0.1 and under open bsd 4.9 with firefox 3.6.13 en i have  strange
> connections too le100.net i checkt the ip and this is seedo in germany i
> also have other strange connections any idea what to do with this bug ?

I think you need to be much more specific about what you're seeing, and why you 
think that it is a problem.

I got curious, so I fired up a fresh Firefox, and indeed saw that I have open 
connections to a couple of IPs:
$ netstat -anfinet
Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
tcp4       0      0 92.231.160.45.14944    74.125.43.120.80       ESTABLISHED
tcp4       0      0 92.231.160.45.62441    74.125.43.104.80       ESTABLISHED
tcp4       0      0 92.231.160.45.41710    74.125.43.104.80       ESTABLISHED
tcp4       0      0 92.231.160.45.56705    195.95.193.85.80       ESTABLISHED
tcp4       0      0 92.231.160.45.14474    195.95.193.78.80       TIME_WAIT
tcp4       0      0 92.231.160.45.64333    195.95.193.78.80       ESTABLISHED
tcp4       0      0 92.231.160.45.36191    68.232.35.119.80       TIME_WAIT
tcp4       0      0 92.231.160.45.40339    63.245.217.43.443      TIME_WAIT
tcp4       0      0 92.231.160.45.51785    63.245.217.43.443      ESTABLISHED
tcp4       0      0 92.231.160.45.51321    74.125.43.190.443      ESTABLISHED

The 74.125.43.* addresses do resolve to 1e100.net, as you're seeing:
$ host 74.125.43.190
190.43.125.74.in-addr.arpa domain name pointer bw-in-f190.1e100.net.
$ host 74.125.43.104
104.43.125.74.in-addr.arpa domain name pointer bw-in-f104.1e100.net.

whois shows who's using them:
$ whois 74.125.43.104
...
NetRange:   74.125.0.0 - 74.125.255.255
CIDR:       74.125.0.0/16
OriginAS:
NetName:    GOOGLE
NetHandle:  NET-74-125-0-0-1
Parent:     NET-74-0-0-0-0
NetType:    Direct Allocation
RegDate:    2007-03-13
Updated:    2007-05-22
Ref:        http://whois.arin.net/rest/net/NET-74-125-0-0-1

Considering that Firefox by default will open up the Firefox Google page, I 
don't find this surprising at all.


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: Accessing tun devices from inside a Jail

2011-10-21 Thread Stefan Bethke


On 21.10.2011 at 04:02, Morgan Reed wrote:

> Hi all,
> 
>  I'm currently attempting to setup, I suppose you'd call it a
> multi-VPN-tunnel gateway. Basically I have several OpenVPN Servers in
> different locations, I want to have various tunnels up to them and be
> able to choose an exit by way of pointing my browser at a particular
> instance of Squid running in a particular jail which routes via a
> particular tunnel (HTTP/S traffic is the primary concern at this
> point, though I might want to extend the concept to all traffic in
> future).

I have a similar setup, but the OpenVPN endpoints are on OpenWrt, with 
tinyproxy running there.  I have a central squid that knows which tinyproxy to 
use for which URL pattern, and that works quite well.

> First issue I ran into was routing tables, that was resolved by
> recompiling my kernel with option ROUTETABLES=10 and pointing each of
> my jails to their own FIB, however as it's not possible to configure
> route tables from inside the jail (as far as I'm aware anyway) I need
> to bring the OpenVPN tunnel up from the host and utilise a route-up
> script to configure the routing table for the jail (utilising setfib),
> I run into problems though, as even though the tun device is visible
> in the jail it does not appear to be configured (no IP addersses, etc)
> so the jail is unable to route traffic.
> 
> All the stuff I've been able to find online has been geared to static
> addresses on each end of the tunnel, this is not the case with my VPN
> provider, tunnel addresses are dynamically assigned.
> 
> I think that worst case I can probably use pf on the host to route
> traffic from a given jail via a particular interface or possibly
> cobble something up around VIMAGE, but I think I'd rather not have to
> go down those paths.
> 
> I'm not sure if what I'm looking for is actually possible, any
> suggestions would be much appreciated.


I was trying to enable a set of processes to use a separate DSL interface, with 
the FreeBSD box terminating the PPPoE connection.  I've tried a couple of 
things:
- I couldn't come up with pf rules that would allow certain processes (i.e. 
those in a specific jail, or running under a specific user id) to have separate 
forwarding applied to them.  I believe IPFW might be better suited, but I 
haven't tried.
- VIMAGE and mpd don't like each other, so VIMAGE was out as well
- VBox with the interface bridged to the DSL interface works fine, but has a 
lot of overhead.

My OpenVPN hub server is running inside a jail, but the tun interface is 
preconfigured from outside; the config substitutes /bin/true for ifconfig and 
route.

HTH, and please report back on any success, I'm definitely interested!


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: Accessing tun devices from inside a Jail

2011-10-22 Thread Stefan Bethke

On 22.10.2011 at 01:19, Nikos Vassiliadis wrote:

> On 10/21/2011 5:08 PM, Stefan Bethke wrote:
>> - VIMAGE and mpd don't like each other, so VIMAGE was out as well
> 
> Could you explain please? In my limited testing they seem to get along fine:)

Sorry, I misremembered.  The issue is actually pf and VIMAGE.  A couple of 
years back, there were issues with VIMAGE and netgraph, but those seem to have 
been resolved.

Stefan

-- 
Stefan Bethke   Fon +49 151 14070811


Re: Accessing tun devices from inside a Jail

2011-10-22 Thread Stefan Bethke


On 22.10.2011 at 14:04, Matthew Seaman wrote:

> On 22/10/2011 12:49, Stefan Bethke wrote:
>> On 22.10.2011 at 01:19, Nikos Vassiliadis wrote:
>> 
>>>> On 10/21/2011 5:08 PM, Stefan Bethke wrote:
>>>>>> - VIMAGE and mpd don't like each other, so VIMAGE was out as well
>>>> 
>>>> Could you explain please? In my limited testing they seem to get along 
>>>> fine:)
>> Sorry, I misremembered.  The issue is actually pf and VIMAGE.  A couple of 
>> years back, there were issues with VIMAGE and netgraph, but those seem to 
>> have been resolved.
> 
> pf and VIMAGE seems to have been fixed in 9.0

Oh cool, I'll give it another shot then!


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Running portupgrade from cron (was: /usr/bin/script eating 100% cpu with portupgrade and xargs)

2011-10-23 Thread Stefan Bethke

On 23.10.2011 at 08:47, Chris Rees wrote:

> Worst of all, you're running portupgrade from cron without reading UPDATING,
> which is just asking for trouble.

What specifically is your concern here?

I've been running portupgrade from cron for six years on a multitude of systems 
with great success.  Yes, occasionally, things break, and reading UPDATING 
becomes a necessity, but I much prefer the nightly upgrade to running 
portupgrade by hand less frequently, especially when you have to do it on 20 or 
more machines.

Note that most of these boxes have very limited SLAs, and dealing with the 
occasional breakage is much less work than regular manual maintenance.

I decided to do this after I got bitten one too many times trying to upgrade 
ports after three to four months, and getting stuck in all kinds of bad 
dependencies.  Daily upgrades usually mean that I catch every single update by 
itself, so even complex interdependencies are in exactly the state the 
committer tested.


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: SIOCGIFADDR broken on 9.0-RC1?

2011-11-15 Thread Stefan Bethke

On 15.11.2011 at 23:35, GR wrote:

> So, I switched to static assignement and it changes the behaviour (and 
> "fixes" the "bug").
> My guess is that during the time waiting for the DHCP offer, all aliases are 
> already configured on the network interface, and the IP address given by DHCP 
> is added at the end of the tail.
> 
> Is that a wanted behaviour? I find it dangerous (i.e. not exactly what a user 
> is expecting).

A bit of background, as best I understand it and remember from Stevens:

Interfaces in BSD do not have a notion of "primary" and "additional" addresses; 
interfaces just have any number of addresses associated with them.  There's no 
inherent ordering in this list (except for how the current implementation seems 
to keep them in the order they were configured).

To be able to associate proper routes with interface addresses, the 
recommendation for multiple IPv4 addresses on an Ethernet interface is to have 
one of them have the proper netmask for the network, and configure the 
remainder with a netmask of 255.255.255.255.  But that's solely for the benefit 
of the routing table; the interface itself doesn't really care.
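
In rc.conf terms, that recommendation looks roughly like this.  The interface
name and the 192.0.2.0/24 documentation addresses are placeholders, not taken
from anyone's actual setup.

```shell
# /etc/rc.conf: one address carries the real netmask of the network, the
# remaining addresses in the same network use a host netmask.
ifconfig_em0="inet 192.0.2.10 netmask 255.255.255.0"
ifconfig_em0_alias0="inet 192.0.2.11 netmask 255.255.255.255"
ifconfig_em0_alias1="inet 192.0.2.12 netmask 255.255.255.255"
```

The "alias" in the variable names is rc.conf bookkeeping only; once
configured, the kernel treats all three as plain addresses on em0.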

Reading the rc.conf man page could give you the impression that there are 
primary and alias addresses, but the networking code doesn't really work like 
that.  The new ipv4_addrs_<interface> syntax exposes the actual behavior in a 
more direct way.

Jeremy gave you a hint on how to fix your immediate problem, but the real 
answer is that the program needs to be fixed: it should not attach any special 
meaning to the first configured IPv4 address.
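
For scripts, the same idea can be sketched as follows: enumerate all
addresses and select one by an explicit criterion instead of taking whatever
comes first.  Both function names are made up for this example, and the
parsing assumes the usual "inet <address> ..." lines of ifconfig output on
stdin.

```shell
# Print every IPv4 address found in ifconfig output read from stdin,
# one per line, in the order ifconfig lists them.
list_inet_addrs() {
    awk '$1 == "inet" { print $2 }'
}

# Print the first address that starts with the prefix given as $1, so the
# choice depends on the address itself and not on its position in the list.
pick_inet_addr() {
    list_inet_addrs | awk -v prefix="$1" 'index($0, prefix) == 1 { print; exit }'
}

# Usage:
#   ifconfig em0 | pick_inet_addr 192.0.2.
```

A C program would do the equivalent by walking the list returned by
getifaddrs(3) rather than relying on a single SIOCGIFADDR call.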


HTH,
Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: TCP Reassembly Issues

2011-11-24 Thread Stefan Bethke

On 24.11.2011 at 21:30, Kris Bauer wrote:

> On Thu, Nov 24, 2011 at 1:20 PM, Raul  wrote:
> 
> I am seeing the same sorts of things in netstat & vmstat:
> 
> # netstat -s -p tcp |grep mem
> 742935 discarded due to memory problems
> 
> # vmstat -z |grep tcpreass
> tcpreass: 40, 16464, 16340, 124, 131485,955443, 0

Same here:
root@diesel:~# netstat -s -p tcp |grep mem
529211 discarded due to memory problems
root@diesel:~# vmstat -z |grep tcpreass
tcpreass:40,   1680,1679,   1,  118846,831450,   0
root@diesel:~# uname -a
FreeBSD diesel.lassitu.de 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #20: Fri Nov 18 
21:57:59 CET 2011 r...@diesel.lassitu.de:/usr/obj/usr/src/sys/DIESEL  amd64
root@diesel:~# uptime
11:01PM  up 5 days, 23:15, 1 user, load averages: 0.14, 0.04, 0.01
root@diesel:~# svn info /usr/src
Path: /usr/src
Working Copy Root Path: /usr/src
URL: http://mirror.hanse.de/svn/freebsd/base/stable/9
Repository Root: http://mirror.hanse.de/svn/freebsd/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 227665
Node Kind: directory
Schedule: normal
Last Changed Author: fabient
Last Changed Rev: 227664
Last Changed Date: 2011-11-18 15:41:48 +0100 (Fri, 18 Nov 2011)

I regularly copy large files off my Tivo trans-atlantic (125ms RTT), and TCP 
connections currently stall after about 500 megs, never recovering.  I suspect 
this is connected, as it started immediately after upgrading the machine to 
9-stable.

As far as I can tell, the problem does not exist with 8-stable.
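
To watch the zone over time, the vmstat line can be boiled down to the
interesting numbers.  This is a sketch that assumes the column order SIZE,
LIMIT, USED, FREE, REQ, FAIL after the zone name (check the header line of
vmstat -z on your system); the function name is invented for the example.

```shell
# Read "vmstat -z" output on stdin and summarize the tcpreass zone.
# Assumed columns after the zone name: size, limit, used, free, req, fail.
tcpreass_usage() {
    awk -F'[:,]' '$1 == "tcpreass" {
        printf "limit=%d used=%d failures=%d\n", $3 + 0, $4 + 0, $7 + 0
    }'
}

# Usage:
#   vmstat -z | tcpreass_usage
```

With the output above, this shows used sitting right at the limit, which fits
the "discarded due to memory problems" counter climbing.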


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: TCP Reassembly Issues

2011-11-25 Thread Stefan Bethke

On 25.11.2011 at 00:35, Adrian Chadd wrote:

> Have you tried disabling the tcp offload features of your NIC?


I'm using my em0 as a VLAN trunk, and I'm under the impression that this 
disables all the hardware assists in the controller.  Also, the LAN vlan is 
bridged via OpenVPN and tap, making the whole bunch promiscuous, which I 
believe also forces off the acceleration.

em0: flags=8943 metric 0 mtu 1500
        options=219b
        ether 00:1c:c0:7d:8c:50
        inet6 fe80::21c:c0ff:fe7d:8c50%em0 prefixlen 64 scopeid 0x1
        nd6 options=29
        media: Ethernet autoselect (1000baseT )
        status: active
bridge0: flags=8843 metric 0 mtu 1500
        ether 02:00:00:00:00:01
        inet6 2001:470:1f0b:1064::1 prefixlen 64
        inet 44.128.65.1 netmask 0xffc0 broadcast 44.128.65.63
        inet6 fe80::21c:c0ff:fe7d:8c50%bridge0 prefixlen 64 scopeid 0xd
        nd6 options=21
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 100 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: vlan1 flags=143
                ifmaxaddr 0 port 15 priority 128 path cost 55
        member: tap0 flags=143
                ifmaxaddr 0 port 14 priority 128 path cost 200
vlan1: flags=8943 metric 0 mtu 1500
        options=3
        ether 00:1c:c0:7d:8c:50
        nd6 options=29
        media: Ethernet autoselect (1000baseT )
        status: active
        vlan: 1 parent interface: em0

em0@pci0:0:25:0: class=0x02 card=0x50038086 chip=0x10cd8086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82567LF-2 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 09[e0] = vendor (length 6) Intel cap 2 version 0


-- 
Stefan Bethke   Fon +49 151 14070811




Re: TCP Reassembly Issues

2011-11-26 Thread Stefan Bethke

> I think I've got it - a stupid 1 line logic bug. My apologies for missing it 
> when I reviewed the patch which introduced the bug (patch was committed to 
> head as r226113, MFCed to stable/9 as r226228).
> 
> Due to some miscommunication, the initial patch was committed to and MFCed 
> from head much later than it should have been in the 9.0 release cycle and 
> instead of being included in the BETAs, didn't make it in until 9.0-RC1 I 
> believe i.e. only RC1 and RC2 should be experiencing the issue.
> 
> Could those who have reported the bug and are able to recompile their kernel 
> to test a patch please try the following and report back to the list:
> 
> http://people.freebsd.org/~lstewart/patches/misctcp/tcp_reass_plugzoneleak_10.x.r227986.patch
> 
> The patch is against head r227986 but will apply and work correctly for 9.0 
> as well.

I'm a happy camper!


Thanks,
Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: UFS corruption panic

2012-01-15 Thread Stefan Bethke


On 15.01.2012 at 05:20, Joe Holden wrote:

> Guys
> 
> Is a panic **really** appropriate for a filesystem that isn't even in
> fstab?
> 
> ie;
> panic: ufs_dirbad: /mnt: bad dir ino 3229 at offset 0: mangled entry
> 
> Which happened to be an file-backed md volume that got changed as I forgot
> to unmount it beforehand, however as a result there is now inconsistencies
> and probably data corruption or even missing data on other important
> filesystems (ie; /, /var etc) because there wasn't even a sync or any kind
> of other sensible behaviour.

Yes, a panic is the correct action here.  While I agree that it's super 
annoying, the filesystem has noticed that something is *really* wrong.  Instead 
of letting the problem fester and continue to corrupt data, it stops the system.

Most filesystems work under the assumption that they're the sole owner of the 
disk.  This means that any changes to the on-disk data must come from the 
filesystem code itself; if that data is inconsistent, the filesystem has to 
assume a bug in its own code.  At this point, panic is the only course of 
action to avoid even greater damage to the data.

In other words: don't do that then :-)


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: Custom kernel poll summary (was: Re: Reducing the need to compile a custom kernel)

2012-02-17 Thread Stefan Bethke

On 14.02.2012 at 12:37, Alexander Leidinger wrote:

> 1 FLOWTABLE

The last time I included this in a kernel it seemed to have odd effects on TCP 
connections.  Admittedly, that was probably two years or so ago, and I never 
bothered to find out what was happening in detail.  Is it safe now?


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: random problem with 8.3 from yesterday

2012-02-23 Thread Stefan Bethke

On 22.02.2012 at 07:34, Erich Dollansky wrote:

> 
> tunefs -L NewDeviceName /dev/da0a
> 
> Either this call or the mount command does not work randomly.
> 
> When I then try to mount the device on /dev/da0a it does not work always.
> 
> I do not know what this causes, I am only randomly able to reproduce it.
> 
> It might be affected by removing the device or keeping it plugged in.

You need to be more specific: what does "does not work" mean?  What output and 
results do you get?


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811


panic: GPF in kernel

2012-02-26 Thread Stefan Bethke

Setting up new hardware (i5 CPU, 16 GB RAM) and doing a burn-in test running 
make buildworld in a loop.  After a couple of hours, I got this panic.  
8-stable is from January, ZFS root.

Does this correlate with any recently fixed bugs, or is this likely a hardware 
issue?


# uname -a
FreeBSD dhcp62.lassitu.de 8.2-STABLE FreeBSD 8.2-STABLE #0: Fri Feb 24 23:22:57 
UTC 2012 r...@dhcp62.lassitu.de:/usr/obj/freebsd/checkout/src/sys/EISENBOOT 
 amd64


Fatal trap 9: general protection fault while in kernel mode
cpuid = 3; apic id = 06
instruction pointer     = 0x20:0x805460c5
stack pointer           = 0x28:0xff84830119d0
frame pointer           = 0x28:0xff8483011a60
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 38198 (cc1)
trap number             = 9
panic: general protection fault
cpuid = 3
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
panic() at panic+0x187
trap_fatal() at trap_fatal+0x290
trap() at trap+0x180
calltrap() at calltrap+0x8
--- trap 0x9, rip = 0x805460c5, rsp = 0xff84830119d0, rbp = 0xff8483011a60 ---
pmap_remove_pages() at pmap_remove_pages+0x275
vmspace_exit() at vmspace_exit+0x9a
exit1() at exit1+0x3b3
sys_exit() at sys_exit+0xe
amd64_syscall() at amd64_syscall+0x24f
Xfast_syscall() at Xfast_syscall+0xfc
--- syscall (1, FreeBSD ELF64, sys_exit), rip = 0x84ba3c, rsp = 0x7fffdf68, rbp = 0x7fffdfa0 ---



(kgdb) bt
#0  doadump () at /freebsd/checkout/src/sys/kern/kern_shutdown.c:263
#1  0x802eab00 in boot (howto=260)
at /freebsd/checkout/src/sys/kern/kern_shutdown.c:441
#2  0x802eafa1 in panic (fmt=Variable "fmt" is not available.
)
at /freebsd/checkout/src/sys/kern/kern_shutdown.c:614
#3  0x8054dc80 in trap_fatal (frame=0x9, eva=Variable "eva" is not 
available.
)
at /freebsd/checkout/src/sys/amd64/amd64/trap.c:825
#4  0x8054e2a0 in trap (frame=0xff8483011920)
at /freebsd/checkout/src/sys/amd64/amd64/trap.c:621
#5  0x80534cb8 in calltrap ()
at /freebsd/checkout/src/sys/amd64/amd64/exception.S:228
#6  0x805460c5 in pmap_remove_pages (pmap=0xff01982838d8)
at /freebsd/checkout/src/sys/amd64/amd64/pmap.c:4087
#7  0x8051a61a in vmspace_exit (td=0xff017efca8a0)
at /freebsd/checkout/src/sys/vm/vm_map.c:405
#8  0x802b93d3 in exit1 (td=0xff017efca8a0, rv=Variable "rv" is not 
available.
)
at /freebsd/checkout/src/sys/kern/kern_exit.c:298
#9  0x802ba61e in sys_exit (td=Variable "td" is not available.
)
at /freebsd/checkout/src/sys/kern/kern_exit.c:106
#10 0x8054d1ff in amd64_syscall (td=0xff017efca8a0, traced=0)
at subr_syscall.c:114
#11 0x80534fac in Xfast_syscall ()
at /freebsd/checkout/src/sys/amd64/amd64/exception.S:387
#12 0x0084ba3c in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) 

-- 
Stefan Bethke   Fon +49 151 14070811




make world fails in usr.sbin/config?

2010-05-24 Thread Stefan Bethke

I have a feeling I screwed something up, but I can't find anything wrong 
locally.

# uname -a
FreeBSD diesel.lassitu.de 8.0-STABLE FreeBSD 8.0-STABLE #9 r204100: Sat Feb 20 
09:53:14 CET 2010 r...@diesel.lassitu.de:/usr/obj/usr/src/sys/DIESEL  amd64
# svn info
Path: .
URL: svn://svn.freebsd.org/base/stable/8
Repository Root: svn://svn.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 208493
Node Kind: directory
Schedule: normal
Last Changed Author: mav
Last Changed Rev: 208492
Last Changed Date: 2010-05-24 13:01:56 +0200 (Mon, 24 May 2010)

[...]
===> usr.sbin/config (obj,depend,all,install)
/usr/obj/usr/src/tmp/usr/src/usr.sbin/config created for 
/usr/src/usr.sbin/config
yacc -d /usr/src/usr.sbin/config/config.y
cp y.tab.c config.c
lex -t  /usr/src/usr.sbin/config/lang.l > lang.c
file2c 'char kernconfstr[] = {' ',0};' < /usr/src/usr.sbin/config/kernconf.tmpl 
> kernconf.c
rm -f .depend
mkdep -f .depend -a-I. -I/usr/src/usr.sbin/config 
-I/usr/obj/usr/src/tmp/legacy/usr/include config.c 
/usr/src/usr.sbin/config/main.c lang.c /usr/src/usr.sbin/config/mkmakefile.c 
/usr/src/usr.sbin/config/mkheaders.c /usr/src/usr.sbin/config/mkoptions.c 
kernconf.c
echo config: /usr/lib/libc.a /usr/lib/libl.a /usr/lib/libsbuf.a 
/usr/obj/usr/src/tmp/legacy/usr/lib/libegacy.a >> .depend
cc -O2 -pipe -I. -I/usr/src/usr.sbin/config   
-I/usr/obj/usr/src/tmp/legacy/usr/include -c config.c
config.c:214: error: expected '=', ',', ';', 'asm' or '__attribute__' before 
'*' token
config.c:215: error: expected '=', ',', ';', 'asm' or '__attribute__' before 
'yyval'
config.c:216: error: expected '=', ',', ';', 'asm' or '__attribute__' before 
'yylval'
config.c:219: error: expected '=', ',', ';', 'asm' or '__attribute__' before 
'*' token
/usr/src/usr.sbin/config/config.y: In function 'yyerror':
/usr/src/usr.sbin/config/config.y:312: error: 'yyfile' undeclared (first use in 
this function)
/usr/src/usr.sbin/config/config.y:312: error: (Each undeclared identifier is 
reported only once
/usr/src/usr.sbin/config/config.y:312: error: for each function it appears in.)
/usr/src/usr.sbin/config/config.y:312: error: 'yyline' undeclared (first use in 
this function)
/usr/src/usr.sbin/config/config.y: In function 'yywrap':
/usr/src/usr.sbin/config/config.y:318: error: 'found_defaults' undeclared 
(first use in this function)
/usr/src/usr.sbin/config/config.y:319: error: 'PREFIX' undeclared (first use in 
this function)
/usr/src/usr.sbin/config/config.y:319: error: 'stdin' undeclared (first use in 
this function)
/usr/src/usr.sbin/config/config.y:319: warning: comparison between pointer and 
integer
/usr/src/usr.sbin/config/config.y:321: error: 'yyfile' undeclared (first use in 
this function)
/usr/src/usr.sbin/config/config.y:322: error: 'yyline' undeclared (first use in 
this function)
/usr/src/usr.sbin/config/config.y: In function 'newfile':
/usr/src/usr.sbin/config/config.y:337: error: dereferencing pointer to 
incomplete type
/usr/src/usr.sbin/config/config.y:340: error: dereferencing pointer to 
incomplete type
/usr/src/usr.sbin/config/config.y:341: error: 'fntab' undeclared (first use in 
this function)
/usr/src/usr.sbin/config/config.y:341: error: 'f_next' undeclared (first use in 
this function)
/usr/src/usr.sbin/config/config.y: At top level:
/usr/src/usr.sbin/config/config.y:348: warning: 'struct device_head' declared 
inside parameter list
e type
y.tab.c: In function 'yygrowstack':
y.tab.c:382: error: 'YYSTYPE' undeclared (first use in this function)
y.tab.c:382: error: 'newvs' undeclared (first use in this function)
y.tab.c:397: error: 'yyvs' undeclared (first use in this function)
y.tab.c:397: error: expected expression before ')' token
y.tab.c:402: error: 'yyvsp' undeclared (first use in this function)
y.tab.c: In function 'yyparse':
y.tab.c:456: error: 'yyvsp' undeclared (first use in this function)
y.tab.c:456: error: 'yyvs' undeclared (first use in this function)
y.tab.c:488: error: 'yylval' undeclared (first use in this function)
y.tab.c:569: error: 'yyval' undeclared (first use in this function)
*** Error code 1

Stop in /usr/src/usr.sbin/config.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.

-- 
Stefan Bethke   Fon +49 151 14070811




Re: make world fails in usr.sbin/config?

2010-05-24 Thread Stefan Bethke

Am 24.05.2010 um 14:09 schrieb Jeremy Chadwick:

> On Mon, May 24, 2010 at 01:59:14PM +0200, Stefan Bethke wrote:
>> I have a feeling I screwed something up, but I can't find anything wrong 
>> locally.
>> ...
>> ===> usr.sbin/config (obj,depend,all,install)
>> /usr/obj/usr/src/tmp/usr/src/usr.sbin/config created for 
>> /usr/src/usr.sbin/config
>> yacc -d /usr/src/usr.sbin/config/config.y
>> cp y.tab.c config.c
>> lex -t  /usr/src/usr.sbin/config/lang.l > lang.c
>> file2c 'char kernconfstr[] = {' ',0};' < 
>> /usr/src/usr.sbin/config/kernconf.tmpl > kernconf.c
>> rm -f .depend
>> mkdep -f .depend -a -I. -I/usr/src/usr.sbin/config 
>> -I/usr/obj/usr/src/tmp/legacy/usr/include config.c 
>> /usr/src/usr.sbin/config/main.c lang.c /usr/src/usr.sbin/config/mkmakefile.c 
>> /usr/src/usr.sbin/config/mkheaders.c /usr/src/usr.sbin/config/mkoptions.c 
>> kernconf.c
>> echo config: /usr/lib/libc.a /usr/lib/libl.a /usr/lib/libsbuf.a 
>> /usr/obj/usr/src/tmp/legacy/usr/lib/libegacy.a >> .depend
>> cc -O2 -pipe -I. -I/usr/src/usr.sbin/config   
>> -I/usr/obj/usr/src/tmp/legacy/usr/include -c config.c
>> config.c:214: error: expected '=', ',', ';', 'asm' or '__attribute__' before 
>> '*' token
>> config.c:215: error: expected '=', ',', ';', 'asm' or '__attribute__' before 
>> 'yyval'
>> config.c:216: error: expected '=', ',', ';', 'asm' or '__attribute__' before 
>> 'yylval'
>> config.c:219: error: expected '=', ',', ';', 'asm' or '__attribute__' before 
>> '*' token
>> /usr/src/usr.sbin/config/config.y: In function 'yyerror':
>> /usr/src/usr.sbin/config/config.y:312: error: 'yyfile' undeclared (first use 
>> in this function)
>> /usr/src/usr.sbin/config/config.y:312: error: (Each undeclared identifier is 
>> reported only once
>> /usr/src/usr.sbin/config/config.y:312: error: for each function it appears 
>> in.)
>> /usr/src/usr.sbin/config/config.y:312: error: 'yyline' undeclared (first use 
>> in this function)
> 
> 1) Have you tried rm -fr /usr/obj/* prior to building world?

/usr/obj is a fresh filesystem (zfs).

> 2) If you already tried that, can you provide your /etc/make.conf and
> /etc/src.conf contents?

I have no src.conf, and this is make.conf, unchanged from previous make worlds.

#
# make world etc.
#
KERNCONF?=  DIESEL
#MODULES_WITH_WORLD=true

BOOT_PXELDR_ALWAYS_SERIAL?= true
BOOT_PXELDR_PROBE_KEYBOARD?=true

# added by use.perl 2009-07-26 23:56:06
PERL_VERSION=5.8.9


-- 
Stefan Bethke   Fon +49 151 14070811




Re: make world fails in usr.sbin/config?

2010-05-24 Thread Stefan Bethke

Am 24.05.2010 um 14:18 schrieb Jeremy Chadwick:

> 1) Were you using any "-j" flags during your make?  If so, try without it.
> Sometimes these are known to cause oddities, even if occasionally.

Nope.

> 2) Make sure your system clock is correct and isn't drifting badly.
> Highly recommend you use ntpdate to set the clock initially, then run
> ntpd at all times.

# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+lokschuppen.zs6 131.188.3.222    2 u   69  512  377   34.115    5.313   0.153
*jachthafen.hans 131.188.3.222    2 u   52  512  377   33.966    4.757   0.554
-ps.bucuo.de     192.53.103.108   2 u  185  512  377   39.567    7.895   0.268
-svr02.teleport- 73.120.242.92    2 u  187  512  377   44.572    6.949   0.542
-netzwerkteufel. 192.53.103.104   2 u  202  512  377   35.338    7.662   0.422
+qraftwerk.de    192.53.103.108   2 u  141  512  377   52.505    5.228   0.256

I'll try a new checkout next.


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: make world fails in usr.sbin/config?

2010-05-24 Thread Stefan Bethke

Am 24.05.2010 um 14:40 schrieb Paul Mather:

> On May 24, 2010, at 8:29 AM, Jeremy Chadwick wrote:
> 
>> For added posterity, it looks like usr.sbin/config has been mostly
>> untouched for quite some time, sans mkoptions.c and mkmakefile.c:
>> 
>> http://www.freebsd.org/cgi/cvsweb.cgi/src/usr.sbin/config/
> 
> Having said that, there is this entry in /usr/src/UPDATING dating from early 
> May:
> 
> 20100502:
>The config(8) command has been updated to maintain compatibility
>with config files from 8.0-RELEASE.  You will need a new version
>of config to build kernels (this version can be used from 8.0-RELEASE
>forward).  The buildworld target will generate it, so following
>the instructions in this file for updating will work glitch-free.
>Merely doing a make buildkernel without first doing a make buildworld
>(or kernel-toolchain), or attempting to build a kernel using
>traditional methods will generate a config version warning, indicating
>you should update.
> 
> 
> Stefan's kernel looks to have last been built on 20th February 2010.  It 
> isn't explicit in the first posting of this thread how Stefan is doing his 
> build, so there is a possibility that he's being affected by the above 
> UPDATING entry.

# make buildworld buildkernel


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: make world fails in usr.sbin/config?

2010-05-24 Thread Stefan Bethke

Am 24.05.2010 um 14:29 schrieb Jeremy Chadwick:

> All that said: I *have* seen the compiler error you've mentioned, but
> usually a 2nd rebuild (after nuking /usr/obj/*) usually works.  Probably
> some weird race condition.

It sure looks like it. Now that I've checked out again, the error has moved to:
cc -O2 -pipe -DHAS_ISBLANK -I. 
-I/usr/src/usr.bin/awk/../../contrib/one-true-awk -DFOPEN_MAX=64   
-I/usr/obj/usr/src/tmp/legacy/usr/include  
-L/usr/obj/usr/src/tmp/legacy/usr/lib 
/usr/src/usr.bin/awk/../../contrib/one-true-awk/maketab.c  -o maketab
In file included from 
/usr/src/usr.bin/awk/../../contrib/one-true-awk/maketab.c:35:
./ytab.h:98: warning: data definition has no type or storage class
./ytab.h:99: error: expected '=', ',', ';', 'asm' or '__attribute__' before 
'yylval'
*** Error code 1

Stop in /usr/src/usr.bin/awk.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.

Again, it appears as if YYSTYPE is defined but empty.  ytab.h looks a bit odd 
to me:

#define DECR 346
#define INCR 347
#define INDIRECT 348
#define LASTTOKEN 349
 YYSTYPE;
extern YYSTYPE yylval;

Line 98 is " YYSTYPE;"


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: make world fails in usr.sbin/config?

2010-05-24 Thread Stefan Bethke

Am 24.05.2010 um 15:13 schrieb Jeremy Chadwick:

> So now the problem has moved from usr.sbin/config to usr.sbin/awk?
> Weird.  Usually this sort of thing indicates excessive clock skew (as in
> rapidly skewing multiple seconds in bursts), or very strange filesystem
> problems.
> 
> Is it possible for your /usr/obj to be made a UFS2 filesystem and for
> you to re-try your build?

I've now checked out via csup, and I've put /usr/obj on UFS.  The error has 
shifted yet again:
cc -O2 -pipe -I. -I/usr/src/usr.bin/lex -std=gnu99   
-I/usr/obj/usr/src/tmp/legacy/usr/include -c parse.c
/usr/src/usr.bin/lex/parse.y: In function 'build_eof_action':
/usr/src/usr.bin/lex/parse.y:786: error: 'MAXLINE' undeclared (first use in 
this function)

I would agree that this looks like time problems or similar, but I don't see 
how that could be the case.

I'll put the source on UFS as well, just to make sure.

> By the way, the buildworld + buildkernel I was running on the FreeBSD VM
> box I have just finished -- no issues.  And that's with make -j2.
> That's an 8.0-RELEASE machine which is being built to upgrade to
> RELENG_8.

A separate make buildworld on another machine is chugging along just fine, so 
there's definitely something odd about this box.

I've just moved from a root on UFS plus data on ZFS setup, to root on ZFS; 
that's the only real difference I can think of.  Although I don't see how that 
would affect building world, especially since I've had src and obj on ZFS 
before.


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: make world fails in usr.sbin/config?

2010-05-24 Thread Stefan Bethke

Am 24.05.2010 um 15:27 schrieb Stefan Bethke:

> I've now checked out via csup, and I've put /usr/obj on UFS.  The error has 
> shifted yet again:
> cc -O2 -pipe -I. -I/usr/src/usr.bin/lex -std=gnu99   
> -I/usr/obj/usr/src/tmp/legacy/usr/include -c parse.c
> /usr/src/usr.bin/lex/parse.y: In function 'build_eof_action':
> /usr/src/usr.bin/lex/parse.y:786: error: 'MAXLINE' undeclared (first use in 
> this function)
> 
> I would agree that this looks like time problems or similar, but I don't see 
> how that could be the case.
> 
> I'll put the source on UFS as well, just to make sure.

Putting the sources on a separate UFS file system "fixed" the build issue.  
Previously, I did have root on UFS, but /usr/src and /usr/obj on ZFS, so I 
don't quite understand what the difference is.


Thanks for all your help!


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: make world fails in usr.sbin/config?

2010-05-25 Thread Stefan Bethke

Am 24.05.2010 um 19:49 schrieb Jeremy Chadwick:

> On Mon, May 24, 2010 at 09:24:00AM -0700, Jeremy Chadwick wrote:
>> Builds are underway now (following /usr/src/Makefile method), I'll
>> report back when those are done.  I'm also adding "time" in front of the
>> "make buildXXX" portions just to see how long things take.
> 
> The build portions finished.  Here are the numbers (quite high due to a
> combination of limited memory constraints (intentional) and the fact
> that VMware Workstation isn't the fastest thing on the planet.  :-) )

For the record: I'm now running -stable as of last night, compiled without 
issue on ZFS filesystems throughout.  I have no idea what caused the issue in 
the first place or what made it disappear, but updating to the correctly 
built -stable made the build on ZFS work again.  (It also involved an 
accidental upgrade and downgrade via -current, since I checked out the wrong 
tag with csup.  Yikes.)

Thanks for all the support to all of you!

Stefan

-- 
Stefan Bethke   Fon +49 151 14070811


Re: Inconsistent IO performance

2010-08-13 Thread Stefan Bethke

Am 13.08.2010 um 18:01 schrieb Kevin Oberman:

> Note the dramatic differences even on the same kernel. For the December
> 6 kernel, for example, I see a maximum of 23,676,086 and a minimum of
> just 18,304,565. 

Are the disks still OK?  If any sectors have been remapped between runs, 
additional seeks would be needed.  I think it's unlikely, but checking with 
smartmontools should only take a few minutes.
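
A minimal sketch of such a check, assuming smartmontools is installed and the 
drive reports the usual ATA attribute table. The helper name realloc_count and 
the device /dev/ad4 are made up for illustration; not every drive exposes 
this attribute.

```shell
#!/bin/sh
# realloc_count (hypothetical helper): read `smartctl -A` output on stdin
# and print the raw value of the Reallocated_Sector_Ct attribute.
# In the ATA attribute table the raw value is the tenth column.
realloc_count() {
    awk '$2 == "Reallocated_Sector_Ct" { print $10 }'
}
```

Usage would be something like `smartctl -A /dev/ad4 | realloc_count`, run 
before and after a benchmark; a value that grows between runs would point at 
sectors being remapped.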


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




ichwd causes freeze instead of reset

2010-08-21 Thread Stefan Bethke

Hi,

somewhat foolishly, I activated watchdogd and ichwd on a remote box, and while 
testing it (by suspending watchdogd), apparently the watchdog triggered.  But 
instead of resetting, the machine does not react anymore on the serial console. 
 I will have to wait until Monday to get physical access, so it might be 
hanging or just switched itself off; I have no way of telling right now.

ichwd probes as:
ichwd0:  on isa0
ichwd0: Intel ICH7 watchdog timer (ICH7 or equivalent)
ppc0: cannot reserve I/O port range

(not sure why ppc0 is getting involved at that point.)

FreeBSD lokschuppen.zs64.net 8.1-PRERELEASE FreeBSD 8.1-PRERELEASE #30: Thu Jul 
15 12:58:20 UTC 2010 
r...@lokschuppen.zs64.net:/usr/obj/usr/src/sys/EISENBOOT  amd64

Once the box is up again, is it worthwhile trying ichwd again, should I try 
SW_WATCHDOG instead, or forget about it?


Thanks,
Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: ichwd causes freeze instead of reset

2010-08-21 Thread Stefan Bethke

Am 21.08.2010 um 23:02 schrieb Andriy Gapon:

> on 21/08/2010 23:33 Stefan Bethke said the following:
>> Hi,
>> 
>> somewhat foolishly, I activated watchdogd and ichwd on a remote box, and
>> while testing it (by suspending watchdogd), apparently the watchdog
>> triggered.  But instead of resetting, the machine does not react anymore on
>> the serial console.  I will have to wait until Monday to get physical access,
>> so it might be hanging or just switched itself off; I have no way of telling
>> right now.
>> 
>> ichwd probes as: ichwd0:  on isa0 ichwd0: Intel
>> ICH7 watchdog timer (ICH7 or equivalent) ppc0: cannot reserve I/O port range
>> 
>> (not sure why ppc0 is getting involved at that point.)
>> 
>> FreeBSD lokschuppen.zs64.net 8.1-PRERELEASE FreeBSD 8.1-PRERELEASE #30: Thu
>> Jul 15 12:58:20 UTC 2010
>> r...@lokschuppen.zs64.net:/usr/obj/usr/src/sys/EISENBOOT  amd64
>> 
>> Once the box is up again, is it worthwhile trying ichwd again, should I try
>> and use SW_WATCHDOG, or forget about it?
> 
> Just test it more while having physical access before making any conclusions.
> There could be a number of radically different possibilities ranging from
> hardware peculiarities to configuration problems to pilot errors to etc.

I guess what I'm looking for is some confirmation that ichwd is working 
properly on this particular hardware: Asus Pundit P4 P5G41 with a G41 chipset.

Below are pciconf -lvb and dmesg:

hos...@pci0:0:0:0:  class=0x06 card=0x836d1043 chip=0x2e308086 rev=0x03 
hdr=0x00
vendor = 'Intel Corporation'
class  = bridge
subclass   = HOST-PCI
vgap...@pci0:0:2:0: class=0x03 card=0x836d1043 chip=0x2e328086 rev=0x03 
hdr=0x00
vendor = 'Intel Corporation'
device = 'Intel G41 express graphics 
(PCIVEN_8086&DEV_2E32&SUBSYS_31031565&REV_033&115)'
class  = display
subclass   = VGA
bar   [10] = type Memory, range 64, base 0xfe40, size 4194304, enabled
bar   [18] = type Prefetchable Memory, range 64, base 0xe000, size 
268435456, enabled
bar   [20] = type I/O Port, range 32, base 0xbc00, size  8, enabled
vgap...@pci0:0:2:1: class=0x038000 card=0x836d1043 chip=0x2e338086 rev=0x03 
hdr=0x00
vendor = 'Intel Corporation'
class  = display
bar   [10] = type Memory, range 64, base 0xfe80, size 1048576, enabled
no...@pci0:0:27:0:  class=0x040300 card=0x82fe1043 chip=0x27d88086 rev=0x01 
hdr=0x00
vendor = 'Intel Corporation'
device = 'IDT High Definition Audio Driver  (BA101897)'
class  = multimedia
subclass   = HDA
bar   [10] = type Memory, range 64, base 0xfe3f8000, size 16384, enabled
pc...@pci0:0:28:0:  class=0x060400 card=0x81791043 chip=0x27d08086 rev=0x01 
hdr=0x01
vendor = 'Intel Corporation'
device = '82801G (ICH7 Family) PCIe Root Port'
class  = bridge
subclass   = PCI-PCI
pc...@pci0:0:28:2:  class=0x060400 card=0x81791043 chip=0x27d48086 rev=0x01 
hdr=0x01
vendor = 'Intel Corporation'
device = '82801G (ICH7 Family) PCIe Root Port'
class  = bridge
subclass   = PCI-PCI
pc...@pci0:0:28:3:  class=0x060400 card=0x81791043 chip=0x27d68086 rev=0x01 
hdr=0x01
vendor = 'Intel Corporation'
device = '82801G (ICH7 Family) PCIe Root Port'
class  = bridge
subclass   = PCI-PCI
uh...@pci0:0:29:0:  class=0x0c0300 card=0x81791043 chip=0x27c88086 rev=0x01 
hdr=0x00
vendor = 'Intel Corporation'
device = '82801G (ICH7 Family) USB Universal Host Controller'
class  = serial bus
subclass   = USB
bar   [20] = type I/O Port, range 32, base 0xb400, size 32, enabled
uh...@pci0:0:29:1:  class=0x0c0300 card=0x81791043 chip=0x27c98086 rev=0x01 
hdr=0x00
vendor = 'Intel Corporation'
device = '82801G (ICH7 Family) USB Universal Host Controller'
class  = serial bus
subclass   = USB
bar   [20] = type I/O Port, range 32, base 0xb480, size 32, enabled
uh...@pci0:0:29:2:  class=0x0c0300 card=0x81791043 chip=0x27ca8086 rev=0x01 
hdr=0x00
vendor = 'Intel Corporation'
device = '82801G (ICH7 Family) USB Universal Host Controller'
class  = serial bus
subclass   = USB
bar   [20] = type I/O Port, range 32, base 0xb800, size 32, enabled
uh...@pci0:0:29:3:  class=0x0c0300 card=0x81791043 chip=0x27cb8086 rev=0x01 
hdr=0x00
vendor = 'Intel Corporation'
device = '82801G (ICH7 Family) USB Universal Host Controller'
class  = serial bus
subclass   = USB
bar   [20] = type I/O Port, range 32, base 0xb880, siz

Re: ichwd causes freeze instead of reset

2010-08-21 Thread Stefan Bethke

Am 21.08.2010 um 23:24 schrieb Mike Tancsa:

> At 05:09 PM 8/21/2010, Stefan Bethke wrote:
> 
>> I guess what I'm looking for is some confirmation that ichwd is working 
>> properly on this particular hardware: Asus Pundit P4 P5G41 with a G41 
>> chipset.
>> 
> 
> Don't know about that particular MB implementation, but I have a number of 
> various ICH7 based boards where ichwd works as expected.  The freeze could 
> be something as simple as the box waiting for keyboard input at the BIOS 
> prompt, or the BIOS option after a watchdog reset being set to power off.  
> However, I have only seen that option in later boards.

Thanks, I'll check that out Monday morning.


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: Apparent dnsbl bug in Sendmail or m4

2010-08-22 Thread Stefan Bethke

Am 22.08.2010 um 09:10 schrieb John Nielsen:

> FEATURE(dnsbl, `bl.spamcop.net', `"550 Mail from " $&{client_addr} " 
> rejected, see http://spamcop.net/bl.shtml?" $&{client_addr}')
> 
> On the FreeBSD 4.x server, this is the corresponding section in the .cf file:
> 
> # DNS based IP address spam list bl.spamcop.net
> R$* $: $&{client_addr}
> R$-.$-.$-.$-$:  $(dnsbl $4.$3.$2.$1.bl.spamcop.net. $: OK $)
> ROK  $: OKSOFAR
> R$+ $: TMPOK
> R$+  $#error $@ 5.7.1 $: "550 Mail from " $&{client_addr} 
> " rejected, see http://spamcop.net/bl.shtml?" $&{client_addr}
> 
> On the FreeBSD 8.x server, this is the corresponding section:
> 
> # DNS based IP address spam list bl.spamcop.net
> R$* $: $&{client_addr}
> R$-.$-.$-.$-$:  $(dnsbl $4.$3.$2.$1.bl.spamcop.net. $: OK $)
> ROK  $: OKSOFAR
> R$+ $: TMPOK

I've got:

FEATURE(`dnsbl', `ix.dnsbl.manitu.net',`"550 Rejected - see 
http://www.heise.de/ix/nixspam/nixspam.blackmatches"')dnl

in my .mc, and get this in my .cf:

# DNS based IP address spam list ix.dnsbl.manitu.net
R$* $: $&{client_addr}
R$-.$-.$-.$-$:  $(dnsbl $4.$3.$2.$1.ix.dnsbl.manitu.net. $: OK $)
ROK  $: OKSOFAR
R$+ $: TMPOK
R$+  $#error $@ 5.7.1 $: "550 Rejected - see 
http://www.heise.de/ix/nixspam/nixspam.blackmatches"

This is on 8.1 from July 15th.  I just ran make all again, and it stays the 
same.

Fired up my VMware image with a three-day old -stable, and put both mine and 
yours in, and yours is missing the error line.

I experimented a bit, and it appears that the macro does not like having the 
$&{client_addr} at the very end of the parameter.  If I add "", it starts 
working.  No idea how or why, but there you go :-)

FEATURE(`dnsbl', `bl.spamcop.net', `"550 " $&{client_addr} "foo" 
$&{client_addr} ""')dnl


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: Apparent dnsbl bug in Sendmail or m4

2010-08-22 Thread Stefan Bethke

Am 22.08.2010 um 10:00 schrieb Stefan Bethke:

> FEATURE(`dnsbl', `bl.spamcop.net', `"550 " $&{client_addr} "foo" 
> $&{client_addr} ""')dnl

The real culprit is the comma.  I believe the problem stems from unquoted use 
of the arguments in some of the ifelses, where the comma turns the single 
argument into two.  Tracing the ifelses with -d aceq I see this for the last 
ifelse in cf/feature/dnsbl.m4:

m4trace: -1- ifelse(`X"550 Mail from " $&{client_addr} " rejected', `see 
http://spamcop.net/bl.shtml?" $&{client_addr}', `Xquarantine', `R$+ 
$#error $@ quarantine $: _DNSBL_SRV_', `X"550 Mail from " $&{client_addr} " 
rejected', `see http://spamcop.net/bl.shtml?" $&{client_addr}', `Xdiscard', `R$+ 
$#discard $: _DNSBL_SRV_', `R$+  $#error $@ 5.7.1 $: _DNSBL_MSG_'
) -> ???
m4trace: -1- ifelse(...) -> `'
m4trace: -1- ifelse ...


I've never managed to really wrap my head around m4 quoting, but the easy fix 
is to use some other character that has no meaning to m4.
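
The splitting can be reproduced outside of sendmail. This is only a sketch, 
assuming an m4 binary is in the path; the macro names count and msg are made 
up for the demonstration and do not appear in cf/feature/dnsbl.m4. When a 
macro is passed unquoted, its expansion is rescanned while the outer macro 
collects its arguments, so a comma in the expansion starts a new argument.

```shell
#!/bin/sh
# Hypothetical demo, not taken from the sendmail sources.
# count expands to the number of arguments it was called with ($#).
# msg expands to a text that contains a comma.
m4_comma_demo() {
    m4 <<'EOF'
define(`count', `$#')dnl
define(`msg', `rejected, see URL')dnl
count(msg)
count(`msg')
EOF
}

m4_comma_demo
```

Run as-is, this should print 2 and then 1: the unquoted call sees two 
arguments, the quoted call sees one.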


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: Serial console problems with stable/8

2010-09-13 Thread Stefan Bethke


Am 12.09.2010 um 17:26 schrieb Oliver Fromme:

> I cannot even su(1) to root because it tries to print
> a message to the console, so it hangs, too.  For the same
> reason I can't use shutdown(8) either.  :-(
> 
> This is what a hanging su(1) command looks like in ps -alxww:
>  UID   PID  PPID CPU PRI NI   VSZ   RSS MWCHAN STAT  TT   TIME COMMAND
>0  1533  1532   0  76  0  3392  3180 ttydcd I+ 00:00.05 su (zsh)
> 
> Interestingly, the KDB sequences CR ~ ^B/^P/^R/ do work,
> which use the "low-level" console.  So only the "high-level"
> console is frozen.

Looking at the WCHAN, I'd speculate that it's waiting for DCD to become active. 
Are you using a proper cable with handshaking, or a three-wire cable?

See what stty thinks the port is set to.  It probably does not have clocal 
set, but should if you are on a three-wire cable.  See if you can unwedge it 
by setting clocal with stty, then pick a proper cable or gettytab entry.
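
A small sketch for reading the flag out of the stty output. The helper name 
clocal_state is made up, and /dev/ttyu0 in the usage line is an assumption 
about which port the console is on.

```shell
#!/bin/sh
# clocal_state (hypothetical helper): read `stty -a` output on stdin and
# print "clocal" if the flag is set, or "-clocal" if it is cleared.
clocal_state() {
    # Put every flag on its own line, then pick out the clocal one.
    tr ' ;\t' '\n\n\n' | grep -E '^-?clocal$' | head -n 1
}
```

On FreeBSD this could be used as `stty -a -f /dev/ttyu0 | clocal_state`. 
The flag can then be changed by passing clocal or -clocal to stty for the 
same device.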


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811


Re: Serial console problems with stable/8

2010-09-13 Thread Stefan Bethke

Am 13.09.2010 um 13:04 schrieb David Evans:

> I can confirm there is much weirdness with the uart on 8-STABLE.

OTOH, I have real hardware where things are working just fine:

$ grep uart /var/run/dmesg.boot 
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
uart0: console (115200,n,8,1)
$ grep ttyu0 /etc/ttys 
ttyu0   "/usr/libexec/getty std.115200" vt100   on  secure

This is -stable from July 15th.  The other end of the serial line is an uftdi 
USB adapter:
uftdi0:  on usbus0


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811


Re: How to predict drive number change for 7.3->8.1 upgrade?

2010-09-16 Thread Stefan Bethke

Am 16.09.2010 um 11:05 schrieb Michael Sperber:

> I just upgraded my desktop system from 7.3 to 8.1, and the main hard
> drive, which was /dev/ad6 before is now /dev/ad10.  Consequently, the
> initial boot failed when trying to mount the root file system from ad6.
> 
> The desktop system is now fixed, but I also have a rented server with
> only a serial console, and I worry that the upgrade is going to leave me
> with a dead machine.  Is there any way to predict how the drive number
> changes?  (Why does it change at all?)  If so, what's the proper way to
> tell the system the initial root device *before* rebooting?

If you have a serial console, you can always enter the root device at the 
prompt, so you can recover there.

If you can figure out the new device name, you can simply change the fstab 
entry for /; that's where loader picks up the root device that it hands to the 
kernel.

Long-term, the best option is to label your filesystems or partitions, and use 
the label entries in fstab instead of the device names.  I don't remember what 
7.3 offers in terms of labels, but glabel should be available.  Check whether 
tunefs offers the -L volname option; that's even better.
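
As a rough sketch of the label approach: the device /dev/ad6s1a, the label 
rootfs and the helper fstab_use_label are all assumed names, so adjust them 
to the actual layout. The helper only prints a rewritten copy of fstab, so 
the result can be reviewed before anything is replaced.

```shell
#!/bin/sh
# Sketch only: /dev/ad6s1a and the label "rootfs" are assumed names.
#
# Step 1 (FreeBSD, filesystem unmounted or mounted read-only):
#   tunefs -L rootfs /dev/ad6s1a
# The filesystem should then also show up as /dev/ufs/rootfs.
#
# Step 2: point the fstab entry at the label instead of the device.
# fstab_use_label DEVICE LABEL FILE prints the rewritten fstab to stdout
# and leaves FILE untouched.
fstab_use_label() {
    awk -v dev="$1" -v lab="/dev/ufs/$2" \
        '$1 == dev { $1 = lab } { print }' "$3"
}
```

For example, `fstab_use_label /dev/ad6s1a rootfs /etc/fstab > /tmp/fstab.new` 
writes a candidate file; once it looks right it can be copied over 
/etc/fstab before the upgrade.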


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: Label question...why does ufs label vanish on mount?

2010-10-12 Thread Stefan Bethke

Am 12.10.2010 um 20:51 schrieb Kevin Oberman:

> For some reason the /dev/ufs/label entry that geom creates for every UFS
> formatted partition is deleted when the device is mounted. This is not
> the case for other file systems, though I have not tried them all. It
> makes the drive much harder to deal with when you have to keep track of
> which physical drive contains the labeled media. It is a particular
> issue for hald and the tools which depend on it.

In 8, only the entries that were not used for the mount are removed; they are 
restored on unmount.  So when you mount the filesystem via its label, that 
entry continues to exist.  And it's not limited to any particular label type; 
I've noticed the same with partitions that I used gmirror on.

IIRC, in 7 only the device entry remained, and all label entries were removed 
on mount.

> Is there a good reason for this odd behavior of UFS? If there is not a
> good reason, could it be changed?

I don't know, but I'm curious myself. When I asked this very question some time 
ago, I didn't get a response.


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: need help with BElkin KVM and USB mouse problems.

2010-10-12 Thread Stefan Bethke

Am 12.10.2010 um 20:17 schrieb Gary Kline:

> The USB keyboard works fine everywhere. But my USB mouse fails on
> the FreeBSD platforms when I try to run X11.  I *have* managed to get
> the mouse working without X [i.e., in console mode]; and yes, the
> cursor and the buttons work fine.  But once I launch X--even simple
> apps like twm or ctwm, the mouse pointer is dead. 

It doesn't sound like this issue is connected to your using a KVM at all, but 
rather your X configuration.  If you have a second USB mouse, try plugging that 
in in addition to the KVM and see if there's any difference; I'm guessing not.

Check your X config, and make sure dbus and hald are enabled in rc.conf and 
started.
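
For reference, a sketch of the rc.conf entries meant here, assuming dbus and 
hal were installed from ports with their stock rc scripts:

```shell
# /etc/rc.conf fragment (sketch): start dbus and hald at boot
dbus_enable="YES"
hald_enable="YES"
```

After adding them, the services still have to be started once by hand or by 
a reboot before X will see the mouse through hald.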


Stefan

-- 
Stefan Bethke   Fon +49 151 14070811




Re: Label question...why does ufs label vanish on mount?

2010-10-12 Thread Stefan Bethke

Am 12.10.2010 um 22:19 schrieb Kevin Oberman:

>> From: Stefan Bethke 
>> Date: Tue, 12 Oct 2010 22:01:24 +0200
>> 
>> Am 12.10.2010 um 20:51 schrieb Kevin Oberman:
>> 
>>> For some reason the /dev/ufs/label entry that geom creates for every UFS
>>> formatted partition is deleted when the device is mounted. This is not
>>> the case for other file systems, though I have not tried them all. It
>>> makes the drive much harder to deal with when you have to keep track of
>>> which physical drive contains the labeled media. It is a particular
>>> issue for hald and the tools which depend on it.
>> 
>> In 8, only the ones that are not mounted are removed; they are
>> restored on unmount.  So when you mount the filesystem via its label,
>> that entry continues to exist.  And it's not limited to any particular
>> label type, I've noticed the same with partitions that I used gmirror
>> on.
> 
> Sorry, but my experience in contrary to that. I mount "/dev/ufs/aux" and
> that device name is returned by df(1), but 'ls /dev/ufs' no longer
> contains 'aux'. This broke gnome-mount and required patching hald to
> ignore device created in /dev/ufs. Otherwise, when the device was
> removed, the /dev/ufs device was re-created, a devd creation event
> occurred and the partition was immediately re-mounted. It made it
> impossible to unplug the USB drive.
> 
> Joe Marcus added a test of the created device to hald so the creation of
> /dev/ufs/aux would be ignored and the device always mounted by the
> hardware device name.
> 
> This all works fine for msdosfs systems. The /dev/msdosfs entry does
> stay around when the device is mounted as /dev/msdosfs and all is
> well. I'd like to see consistent behavior before we get to making
> devicekit work with FreeBSD. (devicekit will replace hald some day.)

This got me curious, so I fired up my -stable VM.  I only tried UFS, but label 
entries do not reappear consistently, at least in the quick test I did.  And I 
still don't understand why they get removed in the first place.

r...@freebsd8:~# uname -a
FreeBSD freebsd8.lassitu.de 8.1-STABLE FreeBSD 8.1-STABLE #2 r212724: Thu Sep 16 15:22:34 UTC 2010 r...@freebsd8.lassitu.de:/usr/obj/usr/src/sys/MINIMAL  amd64

Here's what I tried with a 1 gig stick that probes as:
umass0:  on usbus1
(probe0:umass-sim0:0:0:0): TEST UNIT READY. CDB: 0 0 0 0 0 0 
(probe0:umass-sim0:0:0:0): CAM status: SCSI Status Error
(probe0:umass-sim0:0:0:0): SCSI status: Check Condition
(probe0:umass-sim0:0:0:0): SCSI sense: UNIT ATTENTION asc:28,0 (Not ready to 
ready change, medium may have changed)
da2 at umass-sim0 bus 0 scbus1 target 0 lun 0
da2:  Removable Direct Access SCSI-0 device 
da2: 40.000MB/s transfers
da2: 984MB (2015232 512 byte sectors: 64H 32S/T 984C)

r...@freebsd8:~# gpart create -s gpt /dev/da2
da2 created
r...@freebsd8:~# gpart add -s 256m -l ufs -t freebsd-ufs da2
da2p1 added
r...@freebsd8:~# gpart list da2
Geom name: da2
fwheads: 64
fwsectors: 32
last: 2015198
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: da2p1
   Mediasize: 268435456 (256M)
   Sectorsize: 512
   Mode: r0w0e0
   rawtype: 516e7cb6-6ecf-11d6-8ff8-00022d09712b
   label: ufs
   length: 268435456
   offset: 17408
   type: freebsd-ufs
   index: 1
   end: 524321
   start: 34
Consumers:
1. Name: da2
   Mediasize: 1031798784 (984M)
   Sectorsize: 512
   Mode: r0w0e0
r...@freebsd8:~# ls /dev/gpt
ufs
r...@freebsd8:~# ls /dev/gptid
bba94c8e-d63f-11df-888c-000c295e330a
r...@freebsd8:~# newfs -L ufslabel /dev/da2p1
/dev/da2p1: 256.0MB (524288 sectors) block size 16384, fragment size 2048
using 4 cylinder groups of 64.02MB, 4097 blks, 8256 inodes.
super-block backups (for fsck -b #) at:
 160, 131264, 262368, 393472
r...@freebsd8:~# mount /dev/da2p1 /mnt
r...@freebsd8:~# ls /dev/gpt
r...@freebsd8:~# ls /dev/ufs
r...@freebsd8:~# ls /dev/gptid
r...@freebsd8:~# umount /mnt
r...@freebsd8:~# mount /dev/ufs/ufslabel /mnt
r...@freebsd8:~# ls /dev/da2p1
/dev/da2p1
r...@freebsd8:~# ls /dev/gpt
r...@freebsd8:~# ls /dev/ufs
ufslabel
r...@freebsd8:~# ls /dev/gptid
r...@freebsd8:~# umount /mnt
r...@freebsd8:~# ls -l /dev/da2p1 /dev/gpt /dev/gptid /dev/ufs
crw-r-  1 root  operator    0, 103 Oct 12 20:34 /dev/da2p1

/dev/gpt:
total 0
crw-r-  1 root  operator    0, 111 Oct 12 20:34 ufs

/dev/gptid:
total 0
crw-r-  1 root  operator    0, 112 Oct 12 20:34 bba94c8e-d63f-11df-888c-000c295e330a

/dev/ufs:
total 0
crw-r-  1 root  operator    0, 108 Oct 12 20:34 ufslabel

I then unplugged and replugged the stick:
r...@freebsd8:~# ls -l /dev/da2p1 /dev/gpt /dev/gptid /dev/ufs
crw-r-  1 root  operator    0, 103 Oct 12 20:34 /dev/da2p1

/dev/gpt:
total 0
crw-r-  1 root  operator    0, 111 Oct 12 20:34 ufs

/d

Re: Label question...why does ufs label vanish on mount?

2010-10-12 Thread Stefan Bethke

Am 13.10.2010 um 06:56 schrieb Andrey V. Elsukov:

> On 12.10.2010 22:51, Kevin Oberman wrote:
>> For some reason the /dev/ufs/label entry that geom creates for every UFS
>> formatted partition is deleted when the device is mounted. This is not
>> the case for other file systems, though I have not tried them all. It
>> makes the drive much harder to deal with when you have to keep track of
>> which physical drive contains the labeled media. It is a particular
>> issue for hald and the tools which depend on it.
>> 
>> Is there a good reason for this odd behavior of UFS? If there is not a
>> good reason, could it be changed?
> 
> When you open a provider for writing (i.e. mount the FS), GEOM(4)
> initiates SPOILING, and all consumers attached to this provider
> except one will self-destruct. When you close the provider, GEOM(4)
> initiates TASTING and the consumers can come back. Look at man 4 geom
> for details.

That explains the mechanism, but not the rationale.  Or is it just an 
unintended consequence?  And how is da2p1 different from ufs/mylabel?  (Mount 
da2p1 and ufs/mylabel is removed, but not the other way around.)


Stefan

-- 
Stefan BethkeFon +49 151 14070811




Re: Label question...why does ufs label vanish on mount?

2010-10-13 Thread Stefan Bethke

Am 13.10.2010 um 10:20 schrieb Pawel Jakub Dawidek:

> On Tue, Oct 12, 2010 at 11:33:11PM -0700, Jeremy Chadwick wrote:
>> On Wed, Oct 13, 2010 at 08:29:06AM +0200, Stefan Bethke wrote:
>>> That explains the mechanism, but not the rationale.  Or is it just an 
>>> unintended consequence?  And how is da2p1 different from ufs/mylabel?  
>>> (Mount da2p1 and ufs/mylabel is removed, but not the other way around.)
>> 
>> Pulling in pjd@ who can probably shed some light on this.
> 
> The ufs/mylabel provider is based on da2p1, that's why opening da2p1
> causes ufs/mylabel to be removed and not the other way around.
> 
> The ufs/mylabel provider was created, because when da2p1 provider was
> created and LABEL class tasted it, it discovered that this provider
> contains UFS file system with 'mylabel' volume label, so the LABEL class
> created ufs/mylabel provider. Now when you open da2p1 for writing, the
> LABEL class destroys ufs/mylabel, because you may decide to change
> metadata on da2p1, for example you may choose to destroy UFS in there or
> change the volume label. When write open count on da2p1 goes down to
> zero, the LABEL class will be given da2p1 provider for tasting once
> again, so it can rediscover (possibly modified) volume label.
> 
> The class may choose to ignore the spoil event from GEOM (it is sent on
> first open for write), but only if it isn't based on autodiscovering
> metadata. For example the NOP class ignores this event, because it
> doesn't care about metadata of the provider it is based on.
> 
> If we choose to ignore the spoil event in the LABEL class we will end up
> with stale info, eg. open da2p1 for writing, change its volume label and
> mount it and you will still have old label in /dev/ufs/.

Thanks a lot (and also to Andrey), that really makes it clear to me!

I just wish there was an easy way to keep the labels around even while someone 
has the provider open for writing, but I now understand that this requires some 
significant changes.
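
For anyone who finds the spoil/taste cycle easier to follow as code, here is a 
toy model of what was described above.  It is only a sketch in Python: the 
names Provider, LabelClass, open_for_write and close_write are made up for 
illustration and are not the kernel's GEOM API.  It only models opening da2p1 
directly; opening through ufs/mylabel is not modelled.

```python
# Toy model of GEOM spoiling and tasting as described in this thread.
# All class and function names are illustrative, not the kernel API.

class Provider:
    def __init__(self, name, volume_label):
        self.name = name
        self.volume_label = volume_label  # stands in for on-disk metadata
        self.write_opens = 0


class LabelClass:
    """Tracks which label entries (think /dev/ufs/<label>) currently exist."""

    def __init__(self):
        self.entries = {}  # label name -> name of the underlying provider

    def taste(self, provider):
        # Rediscover the (possibly modified) volume label.
        if provider.volume_label:
            self.entries[provider.volume_label] = provider.name

    def spoil(self, provider):
        # Metadata may be about to change, so drop the derived entries.
        for label, owner in list(self.entries.items()):
            if owner == provider.name:
                del self.entries[label]


def open_for_write(provider, label_class):
    provider.write_opens += 1
    if provider.write_opens == 1:   # spoil is sent on the first write open
        label_class.spoil(provider)


def close_write(provider, label_class):
    provider.write_opens -= 1
    if provider.write_opens == 0:   # last writer gone: taste again
        label_class.taste(provider)


labels = LabelClass()
da2p1 = Provider("da2p1", "mylabel")
labels.taste(da2p1)
after_attach = sorted(labels.entries)

open_for_write(da2p1, labels)       # e.g. mount /dev/da2p1
while_mounted = sorted(labels.entries)

da2p1.volume_label = "newlabel"     # e.g. the label is changed while open
close_write(da2p1, labels)          # e.g. umount
after_umount = sorted(labels.entries)

print(after_attach, while_mounted, after_umount)
```

The point of the sketch is the last step: if spoil were ignored, the entry 
for the old label would still be there after the label had been changed.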


Stefan

-- 
Stefan BethkeFon +49 151 14070811


Re: ZFS write speed

2010-10-27 Thread Stefan Bethke

Am 27.10.2010 um 22:51 schrieb S.N.Grigoriev:

> Hi list,
> 
> I've got very low write speed using ZFS on a SATA disk.
> My HDD configuration is:
> ad4: 70911MB  at ata2-master UDMA100 SATA 3Gb/s
> ad6: 78532MB  at ata3-master UDMA100 SATA 1.5Gb/s
> ad8: 1430799MB  at ata4-master UDMA100 SATA 3Gb/s

The EARS has 4k sectors, if I'm not mistaken.  I don't recall the eventual 
outcome, but there was a long thread on stable or hackers on how to ensure 
proper alignment and (minimum) 4k-sized writes to make sure the disk doesn't 
have to do a read-modify-write cycle, so try and search the archives.
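
To illustrate just the alignment part, here is a small Python sketch.  It 
assumes 512-byte logical sectors and 4096-byte physical sectors; the function 
names are made up for the example, and start sector 34 is simply the first 
usable sector of a default GPT layout.  It says nothing about how ZFS sizes 
its writes.

```python
SECTOR = 512      # logical sector size reported by the drive (assumed)
PHYSICAL = 4096   # physical sector size of a 4k drive (assumed)


def is_aligned(start_lba, sector=SECTOR, physical=PHYSICAL):
    """True if a partition starting at start_lba begins on a physical sector."""
    return (start_lba * sector) % physical == 0


def next_aligned_lba(start_lba, sector=SECTOR, physical=PHYSICAL):
    """Smallest LBA >= start_lba that sits on a physical sector boundary."""
    per_physical = physical // sector
    # Ceiling division, then scale back up to an LBA.
    return -(-start_lba // per_physical) * per_physical


gpt_default_aligned = is_aligned(34)     # 34 * 512 = 17408 bytes
suggested_start = next_aligned_lba(34)
print(gpt_default_aligned, suggested_start, is_aligned(suggested_start))
```

A partition that starts at sector 34 is therefore off by 1024 bytes, and 
every 4k write to it would straddle two physical sectors.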

> ad4 and ad6 are single-slice disks (UFS2 with soft updates)
> 
> ZFS configuration is following:
> zpool create Z ad8
> zfs create Z/music
> zfs create Z/video
> All ZFS parameters are default.
> kern.maxvnodes = 100
> 
> To test my configuration I recursively copied from ad6 to ad8 two directories.
> The first one contains MP3 files (average size = 10MB).
> The second one contains AVI files (average size = 1GB). 
> 
> To compare performance I repeated above tests with ad8 using UFS2 with soft 
> updates.
> 
> 18GB of MP3 files required 10m35s to copy to UFS2 and 21m40s to copy to ZFS.
> 30GB of AVI files required 16m6s to copy to UFS2 and 1h2m39s to copy to ZFS.
> 
> I used for tests FreeBSD 8.1R amd64. Amount of RAM on my machine is 6GB.
> 
> Any tips?
> 
> -- 
> Regards,
> Serguey.

-- 
Stefan BethkeFon +49 151 14070811




Re: Degraded zpool cannot detach old/bad drive

2010-10-28 Thread Stefan Bethke


Am 29.10.2010 um 07:51 schrieb Rumen Telbizov:

> Thanks for your quick response. Unfortunately I already did try this
> approach. Applying -d /dev/gpt only limits the pool to the bare three
> remaining disks, which turns the pool completely unusable (no mfid devices).
> Maybe those labels are removed shortly after something tries to
> import/access them?
> 
> What I don't understand is what exactly makes those gpt labels disappear
> when the pool is imported and otherwise are just fine?!
> Something to do with OpenSolaris ? On top of it all gpart show -l keeps
> showing all the labels right even while the pool is imported.
> 
> Any other clues would be appreciated.

The labels are removed by glabel as soon as something opens the underlying 
provider, i.e. the disk device, for writing.  Since that process could change 
the part of the disk that the label information is extracted from, the label is 
removed.  glabel will re-taste the provider once the process closes it again.

Since you're using gpt labels, I would expect them to continue to be available, 
unless zpool import somehow opens the disk devices (instead of the partition 
devices).


Stefan

-- 
Stefan BethkeFon +49 151 14070811




Re: Abysmal re(4) performance under 8.1-STABLE (mid-August)

2010-11-06 Thread Stefan Bethke

Am 06.11.2010 um 10:37 schrieb Ulrich Spörlein:

> On this new server, I cannot get more than ~280kByte/s up/downstream out of
> re(4) without any tweaking.
> 
> re0: flags=8843 metric 0 mtu 1500
>
> options=389b
>ether 00:21:85:63:74:34
>inet6 fe80::221:85ff:fe63:7434%re0 prefixlen 64 scopeid 0x1 
>inet 46.4.12.147 netmask 0xffc0 broadcast 46.4.12.191
>nd6 options=3
>media: Ethernet autoselect (100baseTX )
>status: active

Me too:

r...@pci0:1:0:0:class=0x02 card=0x82c61043 chip=0x816810ec rev=0x02 
hdr=0x00
vendor = 'Realtek Semiconductor'
device = 'Gigabit Ethernet NIC(NDIS 6.0) (RTL8168/8111/8111c)'
class  = network
subclass   = ethernet
re0:  port 0xd800-0xd8ff 
mem 0xfdfff000-0xfdff,0xfdfe-0xfdfe irq 18 at device 0.0 on pci1
re0: Using 1 MSI messages
re0: Chip rev. 0x3c00
re0: MAC rev. 0x0040
miibus0:  on re0
rgephy0:  PHY 1 on miibus0
rgephy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto
re0: Ethernet address: 00:26:18:d5:2c:23
re0: [FILTER]

I believe that it was working properly some months ago, but reading Rick's 
thread over on -current I checked and transfer over NFS seems to be limited to 
a couple hundred KB/s as well.


Stefan

-- 
Stefan BethkeFon +49 151 14070811




Re: whats best pracfive for ZFS on a whole disc these days ?

2009-11-14 Thread Stefan Bethke

Am 28.10.2009 um 01:41 schrieb Daniel O'Connor:

> On Wed, 28 Oct 2009, jfar...@goldsword.com wrote:
>> Check the archives for stable@ and f...@.  I believe that there was a  
>> thread not that long ago detailing exactly how to do that.  IIRC,  
>> while it took a bit of work, it wasn't difficult.
> 
> Hmm do you have any idea what the subject was? I'm having trouble 
> finding it :(

If you still need it, it was "ZFS pool corrupted on upgrade of -current 
(probably sata  renaming)" on -current back in July.  You probably need to read 
the full thread, and there are some caveats, but it's sometimes possible to 
glabel each device/partition, and zpool replace the original device/partition 
with the labelled one online.

HTH,
Stefan

-- 
Stefan BethkeFon +49 151 14070811


Re: whats best pracfive for ZFS on a whole disc these days ?

2009-11-14 Thread Stefan Bethke

Am 15.11.2009 um 00:58 schrieb Larry Rosenman:

>>> On Wed Jul 15 at 16:22, Freddie Cash  wrote:
>>> Yep.  It's as simple as:
>>> 
>>>  * label all the drives using glabel, while they're still attached to
>>> the pool
>>>  * use "zpool replace pool ad4 label/disk01" to replace 1 drive
>>>  * wait for it to resilver
>>>  * use "zpool replace pool ad6 label/disk02" to replace the next
>>> drive
>>>  * repeat the resilver and replace until all the devices are replaced
>>> 
>>> This is what I did to one of our servers.  Works quite nicely.
>>> 
>>> There's no need to detach anything.
>> 
>> I'll try it when I get home and see how it goes.
> 
> When I try that, I get:

> # glabel label disk01 /dev/ada1
> glabel: Can't store metadata on /dev/ada1: Operation not permitted.

There are some caveats that you need to consider before attempting this: most 
importantly, glabel will re-use the last block of the disk/partition to store 
the label.  Apparently, in many cases, the filesystem (UFS, ZFS) allocates 
blocks in larger chunks (8K or larger), and the last few blocks are unused and 
can be repurposed.  But there's no guarantee, so you might damage the 
filesystem by labeling the device.  I don't understand enough to definitively 
say how to determine whether the last block is available or not, so make sure 
you have a backup before trying.
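
As a rough illustration of where the label ends up, here is a Python sketch 
of the arithmetic.  It assumes the label metadata occupies exactly the last 
sector of the provider; the function names are made up, and the numbers are 
those of a 256M partition with 512-byte sectors holding a filesystem of 
524288 sectors.  Note that the check only tells you whether the filesystem 
claims the last sector, not whether anything is actually stored there.

```python
def label_sector_offset(mediasize, sectorsize):
    """Byte offset of the last sector, where the label metadata would go."""
    return mediasize - sectorsize


def filesystem_claims_last_sector(fs_sectors, mediasize, sectorsize):
    """True if the filesystem's nominal size extends into the last sector."""
    return fs_sectors * sectorsize > label_sector_offset(mediasize, sectorsize)


MEDIASIZE = 268435456   # 256M partition
SECTORSIZE = 512
FS_SECTORS = 524288     # filesystem size in sectors

offset = label_sector_offset(MEDIASIZE, SECTORSIZE)
claimed = filesystem_claims_last_sector(FS_SECTORS, MEDIASIZE, SECTORSIZE)
print(offset, claimed)
```

In this example the filesystem nominally spans the whole partition, so the 
size check alone cannot tell you that labelling is safe.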

Secondly, my limited experience shows that both GEOM and ZFS can get confused 
about devices/partitions/geoms that start on the same block as others.  How 
these are picked up by GEOM and/or ZFS in their probing depends on the order, 
and it wasn't always obvious to me how that worked.  In one case, I couldn't 
get GEOM to pick up the /dev/label entry, since it removed the label entry as 
soon as the physical device node was probed.  I've since come to the conclusion 
that labelled GPT partitions are the way forward, and now that booting off 
RAIDZ pools on GPT partitions works, there's little speaking against it, IMO.

Finally, if you want to label the existing disks, you probably need to take the 
pool offline for the labelling step, using zpool export, so the devices are not 
mounted anymore.

Stefan

-- 
Stefan BethkeFon +49 151 14070811


Re: whats best pracfive for ZFS on a whole disc these days ?

2009-11-14 Thread Stefan Bethke

Am 15.11.2009 um 00:49 schrieb Daniel O'Connor:

> It would be nice if the man page mentioned this case though, currently the 
> "zpool replace" entry covers the case where the new disk has the same device 
> node.

Huh?

>    zpool replace [-f] pool old_device [new_device]
> 
>        Replaces old_device with new_device. This is equivalent to attaching
>        new_device, waiting for it to resilver, and then detaching
>        old_device.


Stefan

-- 
Stefan BethkeFon +49 151 14070811





Fatal trap 9 triggered by zfs?

2009-12-04 Thread Stefan Bethke

I'm getting panics like this every so often (couple weeks, sometimes just a few 
days.) A second machine that has identical hardware and is running the same 
source has no such problems.

FreeBSD XXX.hanse.de 8.0-STABLE FreeBSD 8.0-STABLE #16: Tue Dec  1 14:30:54 UTC 
2009 r...@xxx.hanse.de:/usr/obj/usr/src/sys/EISENBOOT  amd64

# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          ad4s1d    ONLINE       0     0     0
# cat /boot/loader.conf
vfs.zfs.arc_max="512M"
vfs.zfs.prefetch_disable="1"
vfs.zfs.zil_disable="1"

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0x80a39900
stack pointer   = 0x28:0xff80622ddae0
frame pointer   = 0x28:0xff80622ddb10
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 0 (spa_zio)
trap number = 9
panic: general protection fault
cpuid = 0
Uptime: 17h44m5s
Physical memory: 3313 MB
Dumping 1843 MB: 1828 1812 1796 1780 1764 1748 1732 1716 1700 1684 1668 1652 
1636 1620 1604 1588 1572 1556 1540 1524 1508 1492 1476 1460 1444 1428 1412 1396 
1380 1364 1348 1332 1316 1300 1284 1268 1252 1236 1220 1204 1188 1172 1156 1140 
1124 1108 1092 1076 1060 1044 1028 1012 996 980 964 948 932 916 900 884 868 852 
836 820 804 788 772 756 740 724 708 692 676 660 644 628 612 596 580 564 548 532 
516 500 484 468 452 436 420 404 388 372 356 340 324 308 292 276 260 244 228 212 
196 180 164 148 132 116 100 84 68 52 36 20 4

#0  doadump () at pcpu.h:223
223 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0  doadump () at pcpu.h:223
#1  0x803374b9 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:416
#2  0x8033790c in panic (fmt=Variable "fmt" is not available.
)
at /usr/src/sys/kern/kern_shutdown.c:579
#3  0x805cbb8d in trap_fatal (frame=0x9, eva=Variable "eva" is not 
available.
)
at /usr/src/sys/amd64/amd64/trap.c:857
#4  0x805cc6f2 in trap (frame=0xff80622dda30)
at /usr/src/sys/amd64/amd64/trap.c:644
#5  0x805b2223 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:224
#6  0x80a39900 in vdev_queue_agg_io_done (aio=0xff00374562d0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:174
#7  0x80a4be6f in zio_done (zio=0xff00374562d0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:2243
#8  0x80a49e87 in zio_execute (zio=0xff00374562d0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:996
#9  0x809ed603 in taskq_run (arg=0xff008d8d0420, pending=Variable 
"pending" is not available.
)
at 
/usr/src/sys/modules/zfs/../../cddl/compat/opensolaris/kern/opensolaris_taskq.c:108
#10 0x80373533 in taskqueue_run (queue=0xff00017e1400)
at /usr/src/sys/kern/subr_taskqueue.c:239
#11 0x803737b6 in taskqueue_thread_loop (arg=Variable "arg" is not 
available.
)
at /usr/src/sys/kern/subr_taskqueue.c:360
#12 0x8030e0b8 in fork_exit (
callout=0x80373770 , 
arg=0xff00016434e0, frame=0xff80622ddc80)
at /usr/src/sys/kern/kern_fork.c:843
#13 0x805b26fe in fork_trampoline ()
at /usr/src/sys/amd64/amd64/exception.S:561
#14 0x in ?? ()
#15 0x in ?? ()
#16 0x in ?? ()
#17 0x in ?? ()
#18 0x in ?? ()
#19 0x in ?? ()
#20 0x in ?? ()
#21 0x in ?? ()
#22 0x in ?? ()
#23 0x in ?? ()
#24 0x in ?? ()
#25 0x in ?? ()
#26 0x in ?? ()
#27 0x in ?? ()
#28 0x in ?? ()
#29 0x in ?? ()
#30 0x in ?? ()
#31 0x in ?? ()
#32 0x in ?? ()
#33 0x in ?? ()
#34 0x in ?? ()
#35 0x in ?? ()
#36 0x in ?? ()
#37 0x in ?? ()
#38 0x00c6c000 in ?? ()
#39 0x in ?? ()
#40 0x000b in ?? ()
#41 0x80832500 in affinity ()
#42 0xff000173c390 in ?? ()
#43 0xff80622dd240 in ?? ()
#44 0xff80622dd1f8 in ?? ()
#45 0xff00015ecab0 in ?? ()
#46 0x8035aa48 in sched_switch (td=0x80373770, 
newtd=0xff00016434e0, flags=Variable "flags" is not available.
) at /usr/src/sys/kern/sched_ule.c:1858
Previous frame inner to this frame (corrupt stack?)
(kgdb) 

-- 
Stefan BethkeFon +49 151 14070811





Re: Fatal trap 9 triggered by zfs?

2009-12-04 Thread Stefan Bethke

Am 04.12.2009 um 17:52 schrieb Stefan Bethke:

> I'm getting panics like this every so often (couple weeks, sometimes just a 
> few days.) A second machine that has identical hardware and is running the 
> same source has no such problems.
> 
> FreeBSD XXX.hanse.de 8.0-STABLE FreeBSD 8.0-STABLE #16: Tue Dec  1 14:30:54 
> UTC 2009 r...@xxx.hanse.de:/usr/obj/usr/src/sys/EISENBOOT  amd64
> 
> # zpool status
>  pool: tank
> state: ONLINE
> scrub: none requested
> config:
> 
>       NAME        STATE     READ WRITE CKSUM
>       tank        ONLINE       0     0     0
>         ad4s1d    ONLINE       0     0     0
> # cat /boot/loader.conf
> vfs.zfs.arc_max="512M"
> vfs.zfs.prefetch_disable="1"
> vfs.zfs.zil_disable="1"

Got another, different one.  Any tuning suggestions or similar?

#0  doadump () at pcpu.h:223
223 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0  doadump () at pcpu.h:223
#1  0x80337bd9 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:416
#2  0x8033802c in panic (fmt=Variable "fmt" is not available.
)
at /usr/src/sys/kern/kern_shutdown.c:579
#3  0x805cc2ad in trap_fatal (frame=0x9, eva=Variable "eva" is not 
available.
)
at /usr/src/sys/amd64/amd64/trap.c:857
#4  0x805cce12 in trap (frame=0xff80625db030)
at /usr/src/sys/amd64/amd64/trap.c:644
#5  0x805b2943 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:224
#6  0x80586c7a in vm_map_entry_splay (addr=Variable "addr" is not 
available.
)
at /usr/src/sys/vm/vm_map.c:771
#7  0x80587f37 in vm_map_lookup_entry (map=0xff0001e8, 
address=18446743523979624448, entry=0xff80625db170)
at /usr/src/sys/vm/vm_map.c:1021
#8  0x80588aa3 in vm_map_delete (map=0xff0001e8, 
start=18446743523979624448, end=18446743523979689984)
at /usr/src/sys/vm/vm_map.c:2685
#9  0x80588e61 in vm_map_remove (map=0xff0001e8, 
start=18446743523979624448, end=18446743523979689984)
at /usr/src/sys/vm/vm_map.c:2774
#10 0x8057db85 in uma_large_free (slab=0xff005fcc7000)
at /usr/src/sys/vm/uma_core.c:3021
#11 0x80325987 in free (addr=0xff80018b, 
mtp=0x80ac61e0) at /usr/src/sys/kern/kern_malloc.c:471
#12 0x80a36d03 in vdev_cache_evict (vc=0xff0001723ce0, 
ve=0xff003dd52200)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:151
#13 0x80a372ad in vdev_cache_read (zio=0xff005f5ca2d0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:182
#14 0x80a4a954 in zio_vdev_io_start (zio=0xff005f5ca2d0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1814
#15 0x80a4ae87 in zio_execute (zio=0xff005f5ca2d0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:996
#16 0x80a3a080 in vdev_mirror_io_start (zio=0xff005f811b40)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:303
#17 0x80a4ae87 in zio_execute (zio=0xff005f811b40)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:996
#18 0x809ff45a in arc_read_nolock (pio=0xff005f66d5a0, 
spa=0xff000150a000, bp=0xff800a91c440, 
done=0x80a02630 , private=Variable "private" is not 
available.
)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2763
#19 0x809ff8ec in arc_read (pio=0xff005f66d5a0, 
spa=0xff000150a000, bp=0xff800a91c440, pbuf=0xff0042a3ca20, 
done=0x80a02630 , private=0xff005fbfc620, 
priority=0, zio_flags=1, arc_flags=0xff80625db5ec, 
zb=0xff80625db5c0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2508
#20 0x80a02aba in dbuf_read (db=0xff005fbfc620, 
zio=0xff005f66d5a0, flags=2)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:521
#21 0x80a0602c in dmu_buf_hold (os=Variable "os" is not available.
)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:106
#22 0x80a40db5 in zap_lockdir (os=0xff005f937610, obj=247890, 
tx=0x0, lti=RW_READER, fatreader=1, adding=0, zapp=0xff80625db888)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:388
#23 0x80a41724 in zap_cursor_retrieve (zc=0xff80625db880, 
za=0xff80625db8c0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:1004
#24 0x80a61b66 in zfs_freebsd_readdir (ap=Variable "ap" is not 
a

Re: Fatal trap 9 triggered by zfs?

2009-12-04 Thread Stefan Bethke

Am 04.12.2009 um 21:20 schrieb Thomas Backman:

> Bad RAM/motherboard? My first thought when I read your first mail (re: 
> identical hardware) was bad hardware, and this seems to point towards that 
> too, no?
> Have you tried memtest86+?

No, I haven't yet, since I don't have physical access right now, and the box is 
in production service.  I've shifted a couple of services to the other, 
identical box to see if that changes anything in the behavior.

Right now it seems that heavy CPU load triggers panics, so bad RAM, CPU, 
chipset, mainboard, or marginal power supply are all possibilities.


Stefan

-- 
Stefan BethkeFon +49 151 14070811





Re: Fatal trap 9 triggered by zfs?

2009-12-04 Thread Stefan Bethke


Am 04.12.2009 um 21:33 schrieb Jeremy Chadwick:

> You only have one disk in your pool.  I'm not sure how long your system
> stays up before it panics, but could you try doing "zpool scrub tank"
> and let that run for a while?  The first ~5 minutes may show the time to
> completion (from "zpool status") getting worse and worse, but it should
> decrease/catch up.
> 
> If the scrub is able to finish, look for any errors in the resulting
> R/W/CK fields.

Doh, should have thought of that myself. Will get started right away.


Stefan

-- 
Stefan BethkeFon +49 151 14070811





Re: Fatal trap 9 triggered by zfs?

2009-12-04 Thread Stefan Bethke

Am 04.12.2009 um 20:56 schrieb Stefan Bethke:

> Am 04.12.2009 um 17:52 schrieb Stefan Bethke:
> 
>> I'm getting panics like this every so often (couple weeks, sometimes just a 
>> few days.) A second machine that has identical hardware and is running the 
>> same source has no such problems.
>> 
>> FreeBSD XXX.hanse.de 8.0-STABLE FreeBSD 8.0-STABLE #16: Tue Dec  1 14:30:54 
>> UTC 2009 r...@xxx.hanse.de:/usr/obj/usr/src/sys/EISENBOOT  amd64
>> 
>> # zpool status
>> pool: tank
>> state: ONLINE
>> scrub: none requested
>> config:
>> 
>>      NAME        STATE     READ WRITE CKSUM
>>      tank        ONLINE       0     0     0
>>        ad4s1d    ONLINE       0     0     0
>> # cat /boot/loader.conf
>> vfs.zfs.arc_max="512M"
>> vfs.zfs.prefetch_disable="1"
>> vfs.zfs.zil_disable="1"

Third one.  Since there's no mention of ZFS in this one, I'll start looking 
into potential hardware issues.

(kgdb) #0  doadump () at pcpu.h:223
#1  0x80337bd9 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:416
#2  0x8033802c in panic (fmt=Variable "fmt" is not available.
)
at /usr/src/sys/kern/kern_shutdown.c:579
#3  0x805cc2ad in trap_fatal (frame=0x9, eva=Variable "eva" is not 
available.
)
at /usr/src/sys/amd64/amd64/trap.c:857
#4  0x805cce12 in trap (frame=0xff800011ab00)
at /usr/src/sys/amd64/amd64/trap.c:644
#5  0x805b2943 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:224
#6  0x803405fc in msleep_spin (ident=0xff00015ce780, 
mtx=0xff00015ce7b0, wmesg=0x80638ffd "-", timo=0)
at /usr/src/sys/kern/kern_synch.c:312
#7  0x80373ef7 in taskqueue_thread_loop (arg=Variable "arg" is not 
available.
)
at /usr/src/sys/kern/subr_taskqueue.c:89
#8  0x8030e7d8 in fork_exit (
callout=0x80373e90 , 
arg=0xff80002e4768, frame=0xff800011ac80)
at /usr/src/sys/kern/kern_fork.c:843
#9  0x805b2e1e in fork_trampoline ()
at /usr/src/sys/amd64/amd64/exception.S:561


Stefan

-- 
Stefan BethkeFon +49 151 14070811





vmstat and iostat us/sy/id numbers wrong?

2009-12-05 Thread Stefan Bethke

I'm confused about the numbers shown in the last three columns in both vmstat 
and iostat. They should reflect percent of CPU time spent on user processes, 
system threads, and the idle thread (or something like that).

On multiple machines running 8-stable from the last couple of days, the numbers 
do not agree with actual system usage and with numbers shown by top, at all.  
I'm seeing 7 7 87 on one box, 0 0 100 on another, and 10 3 87 on a third.  The 
numbers stay the same even under different loads.

Am I misunderstanding what those numbers should represent?


Stefan

-- 
Stefan BethkeFon +49 151 14070811





Re: vmstat and iostat us/sy/id numbers wrong?

2009-12-07 Thread Stefan Bethke


Am 07.12.2009 um 16:10 schrieb John Baldwin:


> On Saturday 05 December 2009 9:19:06 am Stefan Bethke wrote:
>> I'm confused about the numbers shown in the last three columns in both
>> vmstat and iostat. They should reflect percent of CPU time spent on user
>> processes, system threads, and the idle thread (or something like that).
>> 
>> On multiple machines running 8-stable from the last couple of days, the
>> numbers do not agree with actual system usage and with numbers shown by top,
>> at all.  I'm seeing 7 7 87 on one box, 0 0 100 on another, and 10 3 87 on a
>> third.  The numbers stay the same even under different loads.
>> 
>> Am I misunderstanding what those numbers should represent?
> 
> Are you just running vmstat once or using 'vmstat 1' to have it poll?  If you
> are running it once, note that the numbers vmstat reports are the percentage of
> system/user/idle time since boot rather than during the previous second, which
> is what top reports (and what 'vmstat 1' reports after the first line).

Thanks, I figured that out eventually (also by the nice help received from 
David Wolfskill).

I'm now using "vmstat 30 2" and using the last line to get "current" numbers, 
instead of "vmstat 1".

There seems to be some problem with the numbers though, as the since-boot 
output of vmstat does not seem to add up to 100%, at least in some cases.  
I'll see if I can find out more details later in the month.
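
To make the since-boot versus interval distinction concrete, here is a small 
Python sketch.  The tick counters are invented for the example and the 
function names are made up; this is not how vmstat is implemented, just the 
arithmetic behind the two kinds of output lines.

```python
def percentages(ticks):
    """Turn cumulative tick counters into rounded percentages."""
    total = sum(ticks.values())
    return {state: round(100 * count / total) for state, count in ticks.items()}


def interval(before, after):
    """Ticks accumulated between two samples of the cumulative counters."""
    return {state: after[state] - before[state] for state in after}


# Invented counters: a box that idled for a long time, then got busy.
first_sample = {"us": 7000, "sy": 7000, "id": 86000}
second_sample = {"us": 7900, "sy": 7050, "id": 86050}

since_boot = percentages(second_sample)                       # like the first line
current = percentages(interval(first_sample, second_sample))  # like later lines

print(since_boot)
print(current)
```

The since-boot figures barely move even though the box is 90% busy during 
the interval, which matches the "numbers stay the same" observation.  Each 
column is rounded on its own here, so a row in such a display can sum to 99 
or 101; whether that is what happens in vmstat, I can't say from this sketch.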



Stefan

--
Stefan BethkeFon +49 151 14070811


Re: Fatal trap 9 triggered by zfs?

2009-12-09 Thread Stefan Bethke

Am 04.12.2009 um 17:52 schrieb Stefan Bethke:

> I'm getting panics like this every so often (couple weeks, sometimes just a 
> few days.) A second machine that has identical hardware and is running the 
> same source has no such problems.

Thanks to all who suggested bad hardware: it turned out to be a case of 
capacitor plague.  With a new mainboard, everything appears to be working 
solidly again.


Stefan

-- 
Stefan BethkeFon +49 151 14070811





Re: PCengines ALIX boot0sio serial input failes

2009-12-10 Thread Stefan Bethke


Am 09.12.2009 um 17:13 schrieb Daniel Braniss:


> [B]ooting off the CF (using boot0sio), the input is 'screwy':
> at the selection of partition it is ignored; at the OK: prompt
> from the boot (I had no kernel in the slice), the input is usually
> doubled:
>   sshooww instead of show
> which is probably similar to what is happening with boot0sio, but it
> only echoes # (the current bell).



I've seen this happening when the BIOS also tries to talk to the  
serial port at the same time. Try changing the BIOS to stop doing  
console redirection once it starts booting, or replace boot0sio with  
boot0, and change boot(8) and loader(8) to only use the serial or BIOS  
console, but not both.


Since I don't dual-boot anymore, I've replaced boot0 with a standard  
MBR, and start off with boot(8) on all machines.



HTH,
Stefan

--
Stefan BethkeFon +49 151 14070811


panic on zfs unmount

2009-12-11 Thread Stefan Bethke

I still sometimes get the "lost" .zfs/snapshot directory, with resulting panic, 
and it just happened again.  I have the full crash dump, if anyone wants to 
look at details.

# cd /jail/foo/.zfs
# ls
ls: snapshot: Bad file descriptor
# cd
# zfs umount tank/jail/foo
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xa8
fault code  = supervisor write data, page not present
instruction pointer = 0x20:0x8033fac5
stack pointer   = 0x28:0xff80626cf9d0
frame pointer   = 0x28:0xff80626cf9e0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 38362 (zfs)
trap number = 12
panic: page fault
cpuid = 0
Uptime: 7d3h33m46s
Physical memory: 3313 MB

#0  doadump () at pcpu.h:223
223 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0  doadump () at pcpu.h:223
#1  0x80337bd9 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:416
#2  0x8033802c in panic (fmt=Variable "fmt" is not available.
)
at /usr/src/sys/kern/kern_shutdown.c:579
#3  0x805cc2ad in trap_fatal (frame=0xc, eva=Variable "eva" is not 
available.
)
at /usr/src/sys/amd64/amd64/trap.c:857
#4  0x805cc694 in trap_pfault (frame=0xff80626cf920, usermode=0)
at /usr/src/sys/amd64/amd64/trap.c:773
#5  0x805cd06a in trap (frame=0xff80626cf920)
at /usr/src/sys/amd64/amd64/trap.c:499
#6  0x805b2943 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:224
#7  0x8033fac5 in _sx_xlock (sx=0x90, opts=0, 
file=0x80ac1d30 
"/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c",
 line=1349) at atomic.h:158
#8  0x80a53b85 in zfsctl_umount_snapshots (vfsp=Variable "vfsp" is not 
available.
)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c:1349
#9  0x80a604f9 in zfs_umount (vfsp=0xff00017518d0, fflag=0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1020
#10 0x803c080a in dounmount (mp=0xff00017518d0, flags=0, 
td=Variable "td" is not available.
)
at /usr/src/sys/kern/vfs_mount.c:1294
#11 0x803c1038 in unmount (td=0xff002ed50720, 
uap=0xff80626cfbf0) at /usr/src/sys/kern/vfs_mount.c:1179
#12 0x805cc906 in syscall (frame=0xff80626cfc80)
at /usr/src/sys/amd64/amd64/trap.c:989
#13 0x805b2c21 in Xfast_syscall ()
at /usr/src/sys/amd64/amd64/exception.S:373
#14 0x000800f4ba4c in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) 

-- 
Stefan BethkeFon +49 151 14070811





ahci and ICH7?

2009-12-12 Thread Stefan Bethke

After being very satisfied with ahci(4) on an Intel DG45FC (G45 chipset) with 
8-stable, I tried on a box with an Asus P-P5G41 (G41 chipset), but ahci is not 
attaching.  atapci continues to attach, and I'm wondering whether I'm doing 
something wrong, the BIOS is misconfigured, or ahci can't talk to ICH7-based 
AHCIs?  I'm loading ahci from loader.conf.


atap...@pci0:0:31:1:class=0x01018a card=0x81791043 chip=0x27df8086 rev=0x01 
hdr=0x00
vendor = 'Intel Corporation'
device = '82801G (ICH7 Family) Ultra ATA Storage Controller'
class  = mass storage
subclass   = ATA
atap...@pci0:0:31:2:class=0x01018f card=0x81791043 chip=0x27c08086 rev=0x01 
hdr=0x00
vendor = 'Intel Corporation'
device = '82801GB/GR/GH (ICH7 Family) Serial ATA Storage Controller'
class  = mass storage
subclass   = ATA

atapci0:  port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
ata0:  on atapci0
ata0: [ITHREAD]
atapci1:  port 0xc080-0xc087,0xc000-0xc003,0xbc00-0xbc07,0xb880-0xb883,0xb800-0xb80f irq 19 at device 31.2 on pci0
atapci1: [ITHREAD]
ata2:  on atapci1
ata2: [ITHREAD]
ata3:  on atapci1
ata3: [ITHREAD]
ad4: 953869MB  at ata2-master SATA150


Thanks,
Stefan

-- 
Stefan BethkeFon +49 151 14070811





Re: ahci and ICH7?

2009-12-12 Thread Stefan Bethke

On 12.12.2009 at 12:20, Xin LI wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> Stefan Bethke wrote:
>> After being very satisfied with ahci(4) on an Intel DG45FC (G45 chipset) 
>> with 8-stable, I tried on a box with a Asus P-P5G41 (G41 chipset), but ahci 
>> is not attaching.  atapci continues to attach, and I'm wondering whether I'm 
>> doing something wrong, the BIOS is misconfigured, or ahci can't talk to 
>> ICH7-based AHCIs?  I'm loading ahci from loader.conf.
>> 
>> 
>> atap...@pci0:0:31:1: class=0x01018a card=0x81791043 chip=0x27df8086 rev=0x01 
>> hdr=0x00
>>vendor = 'Intel Corporation'
>>device = '82801G (ICH7 Family) Ultra ATA Storage Controller'
>>class  = mass storage
>>subclass   = ATA
>> atap...@pci0:0:31:2: class=0x01018f card=0x81791043 chip=0x27c08086 rev=0x01 
>> hdr=0x00
>>vendor = 'Intel Corporation'
>>device = '82801GB/GR/GH (ICH7 Family) Serial ATA Storage Controller'
>>class  = mass storage
>>subclass   = ATA
> 
> Have you enabled AHCI in BIOS?  It looks like that your ICH7 is in
> legacy SATA mode.

I'll check the next time I have physical access to the machine.
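
For reference, the "legacy mode" diagnosis can be read off the class= field
in the listing quoted above. The following Python sketch decodes that field,
assuming the standard PCI class code layout (base class, subclass, programming
interface, one byte each) where mass storage is 0x01, IDE is subclass 0x01,
and SATA in AHCI mode is subclass 0x06 with programming interface 0x01. The
function names are invented for this example.

```python
import re

MASS_STORAGE = 0x01
SUBCLASS_NAMES = {0x01: "IDE (legacy)", 0x04: "RAID", 0x06: "SATA"}


def decode_class(line: str) -> tuple:
    """Split a 'class=0xBBSSPP' field into (base class, subclass, prog-if)."""
    match = re.search(r"class=0x([0-9a-fA-F]{6})", line)
    if match is None:
        raise ValueError("no class= field found")
    code = int(match.group(1), 16)
    return (code >> 16) & 0xFF, (code >> 8) & 0xFF, code & 0xFF


def controller_mode(line: str) -> str:
    """Describe how a storage controller presents itself on the PCI bus."""
    base, sub, prog_if = decode_class(line)
    if base != MASS_STORAGE:
        return "not a storage controller"
    if sub == 0x06 and prog_if == 0x01:
        return "AHCI"
    return SUBCLASS_NAMES.get(sub, "other storage")


if __name__ == "__main__":
    # Class field as it appears for the SATA controller in the listing above.
    print(controller_mode("class=0x01018f card=0x81791043 chip=0x27c08086"))
```

A controller that ahci(4) could attach to would be expected to show
class=0x010601 instead.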


Stefan

-- 
Stefan BethkeFon +49 151 14070811





Re: ahci and ICH7?

2009-12-12 Thread Stefan Bethke

On 12.12.2009 at 12:39, Alexander Motin wrote:

> Stefan Bethke wrote:
>> After being very satisfied with ahci(4) on an Intel DG45FC (G45 chipset) 
>> with 8-stable, I tried on a box with a Asus P-P5G41 (G41 chipset), but ahci 
>> is not attaching.  atapci continues to attach, and I'm wondering whether I'm 
>> doing something wrong, the BIOS is misconfigured, or ahci can't talk to 
>> ICH7-based AHCIs?  I'm loading ahci from loader.conf.
> 
> There are several types of ICH7. Most of them don't support AHCI mode.
> Rare exceptions are ICH7R and ICH7M AFAIR. G41 is low-end desktop
> chipset, which uses low-end ICH7 SB, so usually there is no luck.

Looks like you're right, the BIOS does not offer any configuration option to 
enable AHCI or RAID functions for the two SATA ports.


Stefan

-- 
Stefan BethkeFon +49 151 14070811





Re: Problems with Atheros card and hostpd

2009-12-16 Thread Stefan Bethke


On 16.12.2009 at 12:19, Derek Kulinski wrote:


Hello,

I just upgraded my access point (from 7.1 to 8.0) and can't make
hostapd work (looks like wide-dhcp relay also has a problem with  
ath0):


Things got a bit more complicated (and more powerful) with 8.0: you
now have to configure a virtual wireless interface, attached to the
physical one.  Unfortunately, the handbook has not quite caught up with
this change.



Stefan

--
Stefan BethkeFon +49 151 14070811


Basic SMART info "out of the box"

2009-12-16 Thread Stefan Bethke


On 15.12.2009 at 21:24, Jeremy Chadwick wrote:


[1]: It's hardly done and needs a *lot* of work, but I'll eventually
get it into a state where it could be committed and people could hack on
it/improve it.  It's no where near as defined as smartmontools (re: disk
vendor/model one-offs for attribute parsing and so on), but I figured
FreeBSD users might want something out-of-the-box which might give them
stats which are most commonly focused on (sector reallocation, drive
temperature, high spin-up times, CRC errors, etc.).  I guess you could
say I'm a bit proud of myself given that I was able to figure out how to
accomplish it by looking at some smartmontools source (messy, let me
tell you...) and ata(4) bits (since the ioctls aren't documented).

[2]: Yes, I'm still working on writing that doc that explains how to
read SMART data.  Going to have to end up doing it for work as well...
oh the joys.  :-)


Yes please, I'd like to see basic SMART diagnostics out of the box in  
the base system!  I've looked at doing something similar on and off  
for a long time, but never really got beyond the basic ioctl proof of  
concept stage.


Since it appears ata and atacontrol might be replaced by CAM, and SCSI  
devices can also support SMART, would it be possible to add this to  
camcontrol or a similar utility?



Thanks,
Stefan

--
Stefan BethkeFon +49 151 14070811


Many processes stuck in zfs

2010-03-09 Thread Stefan Bethke

Over the past couple of months, I've more or less regularly observed machines 
having more and more processes stuck in the zfs wchan.  The processes never 
recover from that, and trying to reboot only gets the entire system stuck, 
without any console messages.  I can enter the debugger, and I have saved a 
couple of dumps.

The situation seems to be triggered by zfs receive'ing snapshots from the 
sister machine (both synchronize their active ZFS filesystems to each other, 
using zfs send and zfs receive).  It appears it's the receiving causing trouble.
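
To make the setup concrete, here is a rough Python sketch of how one
incremental send/receive step between two hosts might be assembled. It is not
the actual script in use: the dataset, snapshot and host names are
placeholders, and the use of ssh as transport and of "zfs receive -F" are
assumptions for the example. The code only builds the command line and does
not run anything.

```python
import shlex


def replication_pipeline(dataset, prev_snap, new_snap, remote_host):
    """Build the two halves of an incremental zfs send | zfs receive pipeline.

    All arguments are hypothetical placeholders; ssh as the transport and
    the -F flag on the receiving side are assumptions of this sketch.
    """
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    recv = ["ssh", remote_host, "zfs", "receive", "-F", dataset]
    return send, recv


def as_shell(send, recv):
    """Render both halves as one shell pipeline string, quoting as needed."""
    return " | ".join(shlex.join(cmd) for cmd in (send, recv))


if __name__ == "__main__":
    send, recv = replication_pipeline("tank/jail/foo", "monday", "tuesday", "sister")
    print(as_shell(send, recv))
```

In this arrangement each machine would run the sending half for its own
active filesystems and act as the receiver for the other's.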

Both machines run 8-stable from mid-February, with a single-disk ZFS pool, with 
ARC limited to 512M, prefetch and ZIL disabled via loader.conf.

What should I be looking at to further diagnose?


Thanks,
Stefan

-- 
Stefan BethkeFon +49 151 14070811




Re: Many processes stuck in zfs

2010-03-09 Thread Stefan Bethke

On 09.03.2010 at 11:53, Peter Jeremy wrote:

> On 2010-Mar-09 10:15:53 +0100, Stefan Bethke  wrote:
>> Over the past couple of months, I've more or less regularly observed 
>> machines having more and more processes stuck in the zfs wchan.  The 
>> processes never recover from that,
> 
> How long have you waited?

Many hours, sometimes up to 48 hours (when I didn't notice the stuck processes 
at first).

> There seems to be a problem with low free memory handling that causes ZFS
> to turn into cold molasses.  The work-around is to run a program that
> allocates a decent size chunk of memory and then exits.  The original
> suggestion was something like:
>   perl -e '@x = (0) x 100;'
> I've written a short program that allocates and dirties ~100MB and then
> exits and run it from cron.

I'll try that the next time I encounter the stuck processes.
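
A stand-in for the kind of program described in the quote could look like the
Python sketch below. It is not the quoted author's program: the function name
and the one-byte-per-page write pattern are choices made for this example, and
whether running it actually un-sticks ZFS is the quoted claim, not something
the sketch demonstrates.

```python
import mmap


def allocate_and_dirty(size_bytes: int = 100 * 1024 * 1024) -> int:
    """Allocate a buffer and write to every page so the memory is really used.

    Returns the number of pages touched. The buffer is released when the
    function returns (and in any case when the process exits).
    """
    buf = bytearray(size_bytes)
    pages = 0
    for offset in range(0, size_bytes, mmap.PAGESIZE):
        buf[offset] = 1  # one write per page is enough to dirty it
        pages += 1
    return pages


if __name__ == "__main__":
    print("touched", allocate_and_dirty(), "pages")
```

Run from cron, it would briefly push memory demand up by roughly 100 MB and
then exit.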

I'm recording ZFS ARC stats with munin, would I be able to identify such a low 
memory situation from there?  Would it make sense to monitor other stats?


Thanks,
Stefan

-- 
Stefan BethkeFon +49 151 14070811




Re: Many processes stuck in zfs

2010-03-09 Thread Stefan Bethke

On 09.03.2010 at 13:29, Pawel Jakub Dawidek wrote:

> On Tue, Mar 09, 2010 at 10:15:53AM +0100, Stefan Bethke wrote:
>> Over the past couple of months, I've more or less regularly observed 
>> machines having more and more processes stuck in the zfs wchan.  The 
>> processes never recover from that, and trying to reboot only gets the entire 
>> system stuck, without any console messages.  I can enter the debugger, and I 
>> have saved a couple of dumps.
>> 
>> The situation seems to be triggered by zfs receive'ing snapshots from the 
>> sister machine (both synchronize their active ZFS filesystems to each other, 
>> using zfs send and zfs receive).  It appears it's the receiving causing 
>> trouble.
>> 
>> Both machines run 8-stable from mid-February, with a single-disk ZFS pool, 
>> with ARC limited to 512M, prefetch and ZIL disabled via loader.conf.
>> 
>> What should I be looking at to further diagnose?
> 
> What kind of hardware do you have there? There is 3-way deadlock I've a
> fix for which would be hard to trigger on single or dual core machines.

FreeBSD lokschuppen.zs64.net 8.0-STABLE FreeBSD 8.0-STABLE #24: Sat Feb 13 
11:20:03 UTC 2010 r...@lokschuppen.zs64.net:/usr/obj/usr/src/sys/EISENBOOT  
amd64
Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-STABLE #24: Sat Feb 13 11:20:03 UTC 2010
r...@lokschuppen.zs64.net:/usr/obj/usr/src/sys/EISENBOOT amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM)2 Duo CPU E7300  @ 2.66GHz (2666.65-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x10676  Stepping = 6
  Features=0xbfebfbff
  Features2=0x8e39d
  AMD Features=0x20100800
  AMD Features2=0x1
  TSC: P-state invariant
real memory  = 4294967296 (4096 MB)
avail memory = 4081422336 (3892 MB)


> Feel free to try the fix:
> 
>   http://people.freebsd.org/~pjd/patches/zfs_3way_deadlock.patch

I'll give it a shot on one of the two boxes.


Stefan

-- 
Stefan BethkeFon +49 151 14070811




Re: Many processes stuck in zfs

2010-03-10 Thread Stefan Bethke

On 10.03.2010 at 12:35, Ollivier Robert wrote:

> According to Stefan Bethke:
>> The situation seems to be triggered by zfs receive'ing snapshots from the 
>> sister machine (both synchronize their active ZFS filesystems to each other, 
>> using zfs send and zfs receive).  It appears it's the receiving causing 
>> trouble.
> 
> Have you tuned kern.maxvnodes in /etc/sysctl.conf?
> 
> When I move to this new machine, I forgot to get it much higher than the 
> default (now I use 20) and it was locking up pretty soon.  Had not a 
> single lockup now.

I haven't, it's at the default of 10.  How would I be able to tell if that 
limit is being reached?

Right now:
$ sysctl kern.maxvnodes vfs.numvnodes vfs.freevnodes
kern.maxvnodes: 10
vfs.numvnodes: 87287
vfs.freevnodes: 24993
and on the sister host:
$ sysctl kern.maxvnodes vfs.numvnodes vfs.freevnodes
kern.maxvnodes: 10
vfs.numvnodes: 87681
vfs.freevnodes: 7600

Is there a rule of thumb for what maxvnodes should be tuned to?
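
One simple way to watch for the limit is to compare vfs.numvnodes against
kern.maxvnodes over time. The Python sketch below parses sysctl output in the
"name: value" form shown above and computes that ratio. The helper names are
invented, and the sample values are made up for illustration; they are not
taken from either machine.

```python
def parse_sysctl(output: str) -> dict:
    """Parse 'name: value' lines with integer values into a dict."""
    values = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            values[name.strip()] = int(value)
    return values


def vnode_usage(values: dict) -> float:
    """Fraction of the vnode limit currently in use."""
    return values["vfs.numvnodes"] / values["kern.maxvnodes"]


# Hypothetical sample output, for illustration only.
SAMPLE = """\
kern.maxvnodes: 200000
vfs.numvnodes: 150000
vfs.freevnodes: 25000
"""

if __name__ == "__main__":
    print(f"vnode usage: {vnode_usage(parse_sysctl(SAMPLE)):.0%}")
```

Graphing this ratio (for example from a munin plugin) would show whether
numvnodes sits close to the limit around the time of a lockup.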


Stefan

-- 
Stefan BethkeFon +49 151 14070811


Re: ZFS question

2010-03-10 Thread Stefan Bethke

On 10.03.2010 at 19:51, Wiktor Niesiobedzki wrote:

> I've also did some plugins for munin, where based on that, and looking
> on the code, I tried to provide some interpretation of L2ARC/ARC
> statistics (see attached scripts). I'm still not sure, if I have all
> the important stuff visible on the graphs.

The FreeBSD lists strip attachments. Would you mind posting a download link, or 
are they listed in Munin Exchange?


TIA,
Stefan

-- 
Stefan BethkeFon +49 151 14070811




Re: Many processes stuck in zfs

2010-03-27 Thread Stefan Bethke

On 10.03.2010 at 12:02, Pawel Jakub Dawidek wrote:

> Once the deadlock occur, enter DDB and send me the output of:
> 
>   ps
>   show alllocks
>   show lockedvnods
>   show allchains
>   alltrace

panic: deadlkres: possible deadlock detected for 0xff000c66e000, blocked 
for 1801490 ticks

I've saved a core, and can try to look at more things.  The text dump is at 
http://www.lassitu.de/freebsd/core.txt.13

show alllocks and show lockedvnods only gave me a "no such command" error. 
Otherwise, the full output is at http://www.lassitu.de/freebsd/zfs_panic.txt.


Thanks,
Stefan

-- 
Stefan BethkeFon +49 151 14070811




Re: .zfs directory broken on an FS

2010-05-12 Thread Stefan Bethke

On 12.05.2010 at 02:09, Daniel O'Connor wrote:

> I recently switched over to ZFS at work and it seems pretty good (so far at 
> least!) however I was writing a script to do snapshots and one of the file 
> systems now gives..
> 
> [cain 9:37] ~ >ls -la /usr/local/Genesis/archive/.zfs
> ls: snapshot: Bad file descriptor
> total 0
> 
> Other file systems in the same pool work fine though..
> [cain 9:36] ~ >ll /usr/local/Genesis/MeteorData/.zfs/snapshot
> total 4
> drwxrwxr-x  32 metdata  radar  38 Mar 21 03:11 20100512-0900
> 
> (All other snapshot directories are OK)

This appears to be a long-standing issue with no solution.  I used to get this 
a lot during daily rsync backups; since switching to a zfs send/recv based 
script I don't get these problems anymore.

Be aware that trying to unmount that snapshot or its file system might panic or 
hang the system, including when you try to reboot.
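
Since a snapshot script will trip over this, a small pre-flight check may be
useful. The Python sketch below tries to list the .zfs/snapshot directory and
reports failure instead of raising. It assumes the breakage surfaces as an
OSError (such as the "Bad file descriptor" shown above); the function name is
invented for the example.

```python
import os


def snapshot_dir_ok(mountpoint: str) -> bool:
    """Return True if <mountpoint>/.zfs/snapshot can be listed.

    Assumption: a broken control directory makes listdir fail with an
    OSError. A missing directory is reported the same way.
    """
    path = os.path.join(mountpoint, ".zfs", "snapshot")
    try:
        os.listdir(path)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Path taken from the report quoted above.
    print(snapshot_dir_ok("/usr/local/Genesis/archive"))
```

A script could skip (and log) any file system for which the check fails
rather than attempting to unmount it.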


Stefan

-- 
Stefan BethkeFon +49 151 14070811




www/apache22: purpose of WITHOUT_APACHE_OPTIONS?

2010-05-15 Thread Stefan Bethke

Hi,

I was quite surprised that I need to set WITHOUT_APACHE_OPTIONS to have any 
command line options honored by the makefile.  All other ports seem to override 
the config options (that may or may not be set) with the WITH and WITHOUT 
variables specified on the make command line or through pkgtools.conf.  What's 
the reason for this difference?


Thanks,
Stefan

-- 
Stefan BethkeFon +49 151 14070811




APU3 ethernet can't transmit

2021-01-22 Thread Stefan Bethke

I have a weird situation with a PCEngines APU3, where I can't seem to be able 
to transmit packets through any of the igb interfaces. With tcpdump, I can 
see packets arriving, and the interface flags appear to be just fine:

options=e527bb
ether 00:0d:b9:58:xx:xx
inet6 fe80::20d:b9ff::%igb0 prefixlen 64 scopeid 0x1
inet6 2a02:8108:4840::::: prefixlen 64 autoconf
inet 0.0.0.0 netmask 0xff00 broadcast 255.255.255.255
media: Ethernet autoselect (1000baseT )
status: active
nd6 options=23

dhclient is running on that interface, and I got an IPv6 address through RA.

Assigning an address manually doesn't change anything.

igb0@pci0:1:0:0:class=0x02 card=0x8086 chip=0x157b8086 rev=0x03 
hdr=0x00
vendor = 'Intel Corporation'
device = 'I210 Gigabit Network Connection'
class  = network
subclass   = ethernet

# freebsd-version
12.2-RELEASE

I installed 12.2-REL a couple of weeks ago, and haven't done anything since.

Tried all three ports with different cables on different switch ports, 
which are working fine with other machines.

I'm installing updates now via a USB adapter.

Any suggestions?


Stefan

--
Stefan BethkeFon +49 151 14070811




Re: APU3 ethernet can't transmit

2021-01-22 Thread Stefan Bethke

On 22.01.2021 at 22:10, Stefan Bethke wrote:
> 
> I have a weird situation with an PCEngines APU3, where I can't seem to be 
> able to transmit packets through either of the igb interfaces. With tcpdump, 
> I can see packets arriving, and the interface flags appear to be just fine:
>
> options=e527bb
>ether 00:0d:b9:58:xx:xx
>inet6 fe80::20d:b9ff::%igb0 prefixlen 64 scopeid 0x1
>inet6 2a02:8108:4840::::: prefixlen 64 autoconf
>inet 0.0.0.0 netmask 0xff00 broadcast 255.255.255.255
>media: Ethernet autoselect (1000baseT )
>status: active
>nd6 options=23
> 
> dhclient is running on that interfaces, and I got an IPv6 address through RA.
> 
> Assigning an address manually doesn't change anything.
> 
> igb0@pci0:1:0:0:class=0x02 card=0x8086 chip=0x157b8086 
> rev=0x03 hdr=0x00
>vendor = 'Intel Corporation'
>device = 'I210 Gigabit Network Connection'
>class  = network
>subclass   = ethernet
> 
> # freebsd-version
> 12.2-RELEASE
> 
> I installed 12.2-REL a couple of weeks ago, and haven't done anything since.
> 
> Tried the three different port with different cables on different switch 
> ports, which are working fine with other machines.
> 
> I'm installing updates now via a USB adapter.

Updating the firmware to apu2_v4.11.0.6.rom didn't change a thing. Somebody 
suggested turning off LRO, but that didn't help either.

I have another APU2 with 12.1 that is doing just fine. I guess I can downgrade 
and see if that changes anything.


Stefan

--
Stefan BethkeFon +49 151 14070811




Re: APU3 ethernet can't transmit

2021-01-23 Thread Stefan Bethke

On 22.01.2021 at 22:38, Stefan Bethke wrote:
> 
> On 22.01.2021 at 22:10, Stefan Bethke wrote:
>> 
>> I have a weird situation with an PCEngines APU3, where I can't seem to be 
>> able to transmit packets through either of the igb interfaces. With tcpdump, 
>> I can see packets arriving, and the interface flags appear to be just fine:
>>   
>> options=e527bb
>>   ether 00:0d:b9:58:xx:xx
>>   inet6 fe80::20d:b9ff::%igb0 prefixlen 64 scopeid 0x1
>>   inet6 2a02:8108:4840::::: prefixlen 64 autoconf
>>   inet 0.0.0.0 netmask 0xff00 broadcast 255.255.255.255
>>   media: Ethernet autoselect (1000baseT )
>>   status: active
>>   nd6 options=23
>> 
>> dhclient is running on that interfaces, and I got an IPv6 address through RA.
>> 
>> Assigning an address manually doesn't change anything.
>> 
>> igb0@pci0:1:0:0:class=0x02 card=0x8086 chip=0x157b8086 
>> rev=0x03 hdr=0x00
>>   vendor = 'Intel Corporation'
>>   device = 'I210 Gigabit Network Connection'
>>   class  = network
>>   subclass   = ethernet
>> 
>> # freebsd-version
>> 12.2-RELEASE
>> 
>> I installed 12.2-REL a couple of weeks ago, and haven't done anything since.
>> 
>> Tried the three different port with different cables on different switch 
>> ports, which are working fine with other machines.
>> 
>> I'm installing updates now via a USB adapter.
> 
> Updating the firmware to apu2_v4.11.0.6.rom didn't change a thing. Somebody 
> suggested turning off LRO, but that didn't help either.
> 
> I have another APU2 with 12.1 that is doing just fine. I guess I can 
> downgrade and see if that changes anything.

Debian is not happy about the interfaces either, so I'm guessing it's a hardware 
problem.


Stefan

--
Stefan BethkeFon +49 151 14070811




Re: APU3 ethernet can't transmit

2021-01-23 Thread Stefan Bethke


I've tried one more time with this in rc.conf:

network_interfaces="igb0"
ifconfig_igb0="-lro -vlanhwtso -tso4 -vlanhwfilter -rxcsum -txcsum -vlanhwtag 
-vlanmtu -vlanhwcsum -tso6 -txcsum6"

Which produces this: options=802020

No joy. What's interesting is that the left LED on the plug is blinking at 
about 2 Hz, so clearly the PHY or controller is unhappy about something. The 
i210 datasheet has a table with all modes the LEDs support, but it doesn't 
mention this fast blinking. The right LED is on constantly, which should 
indicate a link. This is the same on all three ports, irrespective of 
configuration.

I also just tried netbooting, and that doesn't seem to be working either. So I 
guess it's time to RMA it.

Stefan

> On 23.01.2021 at 11:28, Bob Bishop wrote:
> 
> Hi,
> 
> FWIW I found that applying -lro didn’t work retrospectively, it had to be 
> done when the interface was first configured. Might apply to other options 
> too.
> 
> --
> Bob Bishop
> r...@gid.co.uk
> 
>> On 22 Jan 2021, at 23:42, Graham Menhennitt  wrote:
>> 
>> Try "ifconfig $ifname -rxcsum -txcsum" and possibly " -vlanhwtso -tso4" as 
>> well.
>> 
>> Graham
>> 
>> On 23/01/2021 8:10 am, Stefan Bethke wrote:
>>> I have a weird situation with an PCEngines APU3, where I can't seem to be 
>>> able to transmit packets through either of the igb interfaces. With 
>>> tcpdump, I can see packets arriving, and the interface flags appear to be 
>>> just fine:
>>>
>>> options=e527bb
>>>ether 00:0d:b9:58:xx:xx
>>>inet6 fe80::20d:b9ff::%igb0 prefixlen 64 scopeid 0x1
>>>inet6 2a02:8108:4840::::: prefixlen 64 autoconf
>>>inet 0.0.0.0 netmask 0xff00 broadcast 255.255.255.255
>>>media: Ethernet autoselect (1000baseT )
>>>status: active
>>>nd6 options=23
>>> 
>>> dhclient is running on that interfaces, and I got an IPv6 address through 
>>> RA.
>>> 
>>> Assigning an address manually doesn't change anything.
>>> 
>>> igb0@pci0:1:0:0:class=0x02 card=0x8086 chip=0x157b8086 
>>> rev=0x03 hdr=0x00
>>>vendor = 'Intel Corporation'
>>>device = 'I210 Gigabit Network Connection'
>>>class  = network
>>>subclass   = ethernet
>>> 
>>> # freebsd-version
>>> 12.2-RELEASE
>>> 
>>> I installed 12.2-REL a couple of weeks ago, and haven't done anything since.
>>> 
>>> Tried the three different port with different cables on different switch 
>>> ports, which are working fine with other machines.
>>> 
>>> I'm installing updates now via a USB adapter.
>>> 
>>> Any suggestions?
>>> 
>>> 
>>> Stefan
>>> 
>>> --
>>> Stefan BethkeFon +49 151 14070811
>>> 

--
Stefan BethkeFon +49 151 14070811




Re: APU3 ethernet can't transmit

2021-01-23 Thread Stefan Bethke

Argh! It was my el cheapo desktop switch. I tried two ports and two cables, 
they were working with my laptop, but not with the APU. Only when I hooked up a 
different device this evening and couldn't talk to it did I think of the 
switch. Powercycling "fixed" it. Time for an upgrade, I think.


Stefan

> On 23.01.2021 at 12:50, Stefan Bethke wrote:
> 
> 
> I've tried one more time with this in rc.conf:
> 
> network_interfaces="igb0"
> ifconfig_igb0="-lro -vlanhwtso -tso4 -vlanhwfilter -rxcsum -txcsum -vlanhwtag 
> -vlanmtu -vlanhwcsum -tso6 -txcsum6"
> 
> Whick produces this: options=802020
> 
> No joy. What's interesting is that the left LED on the plug in blinking with 
> about 2Hz, so clearly the PHY or controller is unhappy about something. The 
> i210 datasheet has a table with all modes the LEDs support, but it doesn't 
> mention this fast blinking. The right LED is on constantly, which should 
> indicate a link. This is the same on all three ports, irrespective of 
> configuration.
> 
> I also just tried netbooting, and that doesn't seem to be working either. So 
> I guess it's time to RMA it.
> 
> Stefan
> 
>> On 23.01.2021 at 11:28, Bob Bishop wrote:
>> 
>> Hi,
>> 
>> FWIW I found that applying -lro didn’t work retrospectively, it had to be 
>> done when the interface was first configured. Might apply to other options 
>> too.
>> 
>> --
>> Bob Bishop
>> r...@gid.co.uk
>> 
>>> On 22 Jan 2021, at 23:42, Graham Menhennitt  
>>> wrote:
>>> 
>>> Try "ifconfig $ifname -rxcsum -txcsum" and possibly " -vlanhwtso -tso4" as 
>>> well.
>>> 
>>> Graham
>>> 
>>> On 23/01/2021 8:10 am, Stefan Bethke wrote:
>>>> I have a weird situation with an PCEngines APU3, where I can't seem to be 
>>>> able to transmit packets through either of the igb interfaces. With 
>>>> tcpdump, I can see packets arriving, and the interface flags appear to be 
>>>> just fine:
>>>>   
>>>> options=e527bb
>>>>   ether 00:0d:b9:58:xx:xx
>>>>   inet6 fe80::20d:b9ff::%igb0 prefixlen 64 scopeid 0x1
>>>>   inet6 2a02:8108:4840::::: prefixlen 64 autoconf
>>>>   inet 0.0.0.0 netmask 0xff00 broadcast 255.255.255.255
>>>>   media: Ethernet autoselect (1000baseT )
>>>>   status: active
>>>>   nd6 options=23
>>>> 
>>>> dhclient is running on that interfaces, and I got an IPv6 address through 
>>>> RA.
>>>> 
>>>> Assigning an address manually doesn't change anything.
>>>> 
>>>> igb0@pci0:1:0:0:class=0x02 card=0x8086 chip=0x157b8086 
>>>> rev=0x03 hdr=0x00
>>>>   vendor = 'Intel Corporation'
>>>>   device = 'I210 Gigabit Network Connection'
>>>>   class  = network
>>>>   subclass   = ethernet
>>>> 
>>>> # freebsd-version
>>>> 12.2-RELEASE
>>>> 
>>>> I installed 12.2-REL a couple of weeks ago, and haven't done anything 
>>>> since.
>>>> 
>>>> Tried the three different port with different cables on different switch 
>>>> ports, which are working fine with other machines.
>>>> 
>>>> I'm installing updates now via a USB adapter.
>>>> 
>>>> Any suggestions?
>>>> 
>>>> 
>>>> Stefan
>>>> 
>>>> --
>>>> Stefan BethkeFon +49 151 14070811
>>>> 
> 
> --
> Stefan BethkeFon +49 151 14070811
> 

--
Stefan BethkeFon +49 151 14070811




Updating to 13-stable and existing ZFS pools: any gotchas?

2021-03-14 Thread Stefan Bethke

I'm planning to upgrade three production machines with existing ZFS pools to 
13-stable. Is there anything I need to pay attention to wrt OpenZFS? Or should 
it be fully transparent, apart from updating loader?

My (limited) testing with VMs went without a hitch, but I want to make sure I 
don't paint myself into a corner.


Stefan

--
Stefan BethkeFon +49 151 14070811




Re: Deprecating base system ftpd?

2021-04-06 Thread Stefan Bethke

On 05.04.2021 at 21:01, Patrick M. Hausen wrote:
> 
> But still even on "the Internet", FTP is the most used method for customers
> of static website hosting. You cannot teach these people what an SSH key is.
> Just my experience, but backed by a load of customer interactions over more
> than 20 years ...

Strato did disable FTP access over a year ago, and instructed customers on how 
to use SSH-based access instead, so it's definitely possible, and people are 
moving towards more secure protocols, even when (non-technical) end users are 
affected.


Stefan

--
Stefan BethkeFon +49 151 14070811




Re: Deprecating base system ftpd?

2021-04-06 Thread Stefan Bethke

On 06.04.2021 at 12:08, Helge Oldach wrote:
> 
> Stefan Bethke wrote on Tue, 06 Apr 2021 11:29:34 +0200 (CEST):
>> Strato did disable FTP access over a year ago,
> 
> Actually it was effective October 20, 2020.

You are correct; I was remembering the announcement, not the switch off.

> and instructed customers on how to use SSH-based access instead,
> 
> They have a completely different incentive (avoiding cleartext passwords
> over the Internet, and reportedly they had a number of cases where
> customers were affected by password snooping) than a local admin person
> on a local network not exposed to the public.
> 
>> so it's definitely possible, and people are moving towards more secure
>> protocols, even when (non-technical) end users are affected.
> 
> No doubt about that. Any information about the ticket volume triggered
> by this deprecation?

I have no insight into Strato's operations, but from having to support a bunch 
of non-technical people who are customers, I'd say it was relatively painless, 
because Strato provided good instructions, and the (non-technical) customers 
were using GUI clients already anyway where they only needed to switch from FTP 
to SFTP.

Stefan

--
Stefan BethkeFon +49 151 14070811




Re: 9.1 RELENG_9 Unable to cleanly dismount root partition on shutdown

2012-08-27 Thread Stefan Bethke

On 27.08.2012 at 11:06, Matt Smith wrote:

> I posted on this mailing list two weeks ago and never received any replies so 
> I decided to raise a PR via the web form. But I think I submitted it under 
> the wrong category and it's marked as low priority as well. But I think this 
> is something that is a potential serious problem if I end up getting a 
> corrupted filesystem so I'm posting here again in the hope somebody can help 
> this time. The PR is amd64/170646.
> 
> I'm now running the latest RELENG_9 code as of 25th August as I've done a new 
> buildworld/kernel. I still get the same problem. When I reboot it I get 
> WARNING: / was not properly dismounted and it rebuilds from journal. On 
> shutdown I get the messages pasted below. I'm running amd64 with GPT 
> partitioning, UFS2 with softupdates and softupdates journalling enabled. I 
> have a custom kernel but I don't think I took anything important out of it.
> 
> Syncing disks, vnodes remaining...7 7 2 0 0 done
> All buffers synced.
> fsync: giving up on dirty
> 0xfe0007102780: tag devfs, type VCHR
> usecount 1, writecount 0, refcount 2292 mountedhere 0xfe00729ca00
> flags (VI(0x200))
> v_object 0xfe0005101910 ref 0 pages 23509
> lock type devfs: EXCL by thread 0xfe00018fe08e0 (pid 1)
> dev label/root
> umount of / failed (35)
> 
> Then when the box comes back up again it detects that / was not unmounted
> cleanly and recovers from journal before marking it clean once more.

> My fstab:
> /dev/label/root / ufs rw 1 1
> /dev/label/swap none swap sw 0 0

Is there a particular reason you've decided to glabel your partitions instead 
of using GPT labels? Which device did you do the newfs on, the GPT partition or 
the glabel device?  My hunch is that the label metadata sector at the end of 
the GPT partition is interfering with the filesystem.

I'd try labelling my partitions (gpart modify -i 2 -l root ada0; gpart modify 
-i 3 -l swap ada0), then change fstab to reference the GPT labels 
(/dev/gpt/root) instead of the glabel ones.
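
The fstab change itself is mechanical. A minimal sketch, assuming the GPT 
labels carry the same names as the old glabel ones (root, swap); the function 
name is made up for illustration, and it only prints the rewritten lines so 
they can be reviewed before /etc/fstab is touched:

```shell
# Rewrite glabel device paths (/dev/label/NAME) to GPT label paths
# (/dev/gpt/NAME) in fstab-formatted text read from stdin.
# Assumption: the GPT labels were given the same names as the glabel ones.
fstab_label_to_gpt() {
    sed 's|^/dev/label/|/dev/gpt/|'
}

# Preview using the two fstab lines quoted above; nothing is written to disk.
printf '%s\n' \
    '/dev/label/root / ufs rw 1 1' \
    '/dev/label/swap none swap sw 0 0' \
    | fstab_label_to_gpt
```

On the real system this would be run as fstab_label_to_gpt < /etc/fstab, and 
the result compared with the original before replacing anything.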


Stefan

-- 
Stefan BethkeFon +49 151 14070811


Re: CLANG 3.2 breaks security/pam_ssh_agent_auth on stable/9

2013-02-02 Thread Stefan Bethke

On 30.01.2013 at 07:21, Kimmo Paasiala wrote:

> On Wed, Jan 30, 2013 at 7:27 AM, James  wrote:
>> I was able to correct the problem as well by prefixing strnvis, avoiding the
>> symbol collision. I also found PR: ports/172941 which also has a fix.
>> 
>> Using my patch or the patch in ports/172941 fixes the segfault for me in
>> stable/9. However, I quickly ran into another problem. I can't remember the
>> error message exactly, it was something like "Unable to initialize PAM:
>> Unknown file descriptor". A ktrace didn't reveal anything obvious. I'll try
>> to test it out tomorrow.
>> 
>> --
>> James.
> 
> Try the attached patch. Just drop it into
> /usr/ports/security/pam_ssh_agent_auth/files directory and recompile.
> 
> This will make the port use the system strnvis() with correctly
> ordered arguments if one is available (HAVE_STRNVIS defined) and an
> _openbsd suffixed version if not.
> 
> 
> -Kimmo
> 

Working great for me!

Is this on any committer's radar?  I don't see a PR for it.



Stefan

-- 
Stefan BethkeFon +49 151 14070811




Re: CLANG 3.2 breaks security/pam_ssh_agent_auth on stable/9

2013-02-03 Thread Stefan Bethke


On 03.02.2013 at 10:57, Chris Rees wrote:

> On 3 February 2013 03:55, Kimmo Paasiala  wrote:
>> 
>> There is no PR yet with my fix and therefore no commit to ports tree
>> that would fix the problem. I'll file a PR soon (TM).
> 
> The problem was in base, and is fixed there.

Huh? With -current r246283, I still get a segfault from sudo unless I have 
Kimmo's patch.

Is there some confusion about which problem is addressed by Kimmo's patch?


Stefan

-- 
Stefan BethkeFon +49 151 14070811




ZFS high write IO in single user mode

2018-10-31 Thread Stefan Bethke

I have two hosts that are configured identically (in a kind of manual 
hot-standby configuration), running a set of jails each.  ZFS datasets for the 
jails and bhyve VMs are synced across regularly. When one of the machines 
exhibits a problem, I can shut down the problematic jails or the whole machine, 
and start the jails/VMs on the other host. This has been working really well 
for the past ~10 years.

A couple of years ago, one of the ZFS pools on one of the machines developed 
some logical inconsistencies that were not detected by zpool scrub. The only 
indication that something was amiss was high disk IO, in particular, writes, 
even when no processes were running. I eventually resolved that situation by 
recreating the zpool and restoring the datasets from the working machine.

About a year ago, I upgraded the hardware and in the process created fresh 
pools. This has been running well. Since about two days ago, I now have the 
situation again where I have a steady write rate even in single user mode, with 
the root dataset mounted read only, and the second pool that contains the jail 
datasets not mounted at all.

I only have a video console (via IPMI KVM) so I won’t transcribe the complete 
output, but here’s what I think are significant observations:
gstat reports ~30 writes/sec on each of the two disks that make up the zmirror 
pool.

mount shows the root dataset to be mounted read-only.

zpool status takes a really long time, and then reports that everything is fine 
for both pools (boot/os and jails).

smartctl doesn’t show any problems for either of the disks.

I’m happy to just wipe the pools and start fresh, but I’d like to use this 
opportunity to hopefully figure out why ZFS appears to act weirdly, and 
hopefully find a permanent fix. This is 11-stable from September 13th.


Stefan

-- 
Stefan BethkeFon +49 151 14070811


Re: FreeBSD 12 and Nocona

2019-01-03 Thread Stefan Bethke

> I have under supervision a few old servers running 11.2-STABLE. The
> hardware is almost for retirement, but still in working condition. It's
> all old Nocona NetBurst microarchitecture. I have recently tried to
> upgrade the OS on two of them to 12.0-STABLE, but failed. When I use old
> bootloader the boot freezes on blue highlighted "Booting" stage, when I
> tried to use 12 loader, it freezes earlier, on loading kernel modules.
> The kernel was compiled from fresh sources for CPUTYPE?=nocona.
> 11.2-STABLE is fine with this optimization and the same kernel boots
> fine on newer hardware.
> 
> It is fair, that 11 EOL is expected September 30, 2021 and these servers
> will likely be retired before this date, but some questions arise:
> 
> Is such old hardware still supported? Is it possible (how to) debug the
> booting process?

The first step is to try with known-good bits: can you boot these machines off 
the 12.0 ISO or memstick images? Can you load your kernel and modules with the 
loader from the ISO/memstick? Does GENERIC built without any flags work?

If any of these don’t work, try to be as specific as possible when reporting 
problems. For example, the exact make of mainboard (kenv output) and the BIOS 
version, and any relevant BIOS settings are likely important for problems 
regarding the loader. If the kernel and modules load, you can try a verbose 
boot to see better how far the kernel gets.
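
To collect the board and BIOS details for such a report, something like the 
following may help. It is only a sketch: it assumes kenv prints name="value" 
lines and that the firmware exposes the usual smbios.* variables, which is not 
guaranteed on old hardware. The function name is hypothetical.

```shell
# smbios_summary
# Read kenv output on stdin and keep only the SMBIOS variables that
# identify the BIOS, the mainboard (planar) and the system.
smbios_summary() {
    grep -E '^smbios\.(bios|planar|system)\.(maker|product|vendor|version|reldate)='
}
```

On a running system (for example the still-working 11-STABLE install) this 
would be used as kenv | smbios_summary.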

I’d be really surprised if the CPUs themselves would cause trouble.


Stefan

-- 
Stefan BethkeFon +49 151 14070811


Re: FreeBSD 12 and Nocona

2019-01-08 Thread Stefan Bethke

On 08.01.2019 at 10:34, Marek Zarychta wrote:
> On 03.01.2019 at 14:13, Stefan Bethke wrote:
>>> I have under supervision a few old servers running 11.2-STABLE. The
>>> hardware is almost for retirement, but still in working condition. It's
>>> all old Nocona NetBurst microarchitecture. I have recently tried to
>>> upgrade the OS on two of them to 12.0-STABLE, but failed. When I use old
>>> bootloader the boot freezes on blue highlighted "Booting" stage, when I
>>> tried to use 12 loader, it freezes earlier, on loading kernel modules.
>>> The kernel was compiled from fresh sources for CPUTYPE?=nocona.
>>> 11.2-STABLE is fine with this optimization and the same kernel boots
>>> fine on newer hardware.
>>> 
>>> It is fair, that 11 EOL is expected September 30, 2021 and these servers
>>> will likely be retired before this date, but some questions arise:
>>> 
>>> Is such old hardware still supported? Is it possible (how to) debug the
>>> booting process?
>> 
>> The first step is to try with known-good bits: can you boot these machines 
>> off the 12.0 ISO or memstick images? Can you load your kernel and modules 
>> with the loader from the ISO/memstick? Does GENERIC built without any flags 
>> work?
>> 
>> If any of these don’t work, try to be as specific as possible when reporting 
>> problems. For example, the exact make of mainboard (kenv output) and the 
>> BIOS version, and any relevant BIOS settings are likely important for 
>> problems regarding the loader. If the kernel and modules load, you can try a 
>> verbose boot to see better how far the kernel gets.
>> 
>> I’d be really surprised if the CPUs themselves would cause trouble.
> 
> 
> The first step is done. The affected hardware doesn't boot from official
> 12.0-RELEASE CD either. Loader also freezes at the stage of loading
> kernel modules. These servers are old Maxdata Platinum 500 and 3200.
> Some time ago I have submitted dmesgs to NYC BUG dmesg repository[1][2].
> 
> Both configurations are fine with 11-STABLE, so I am not going to
> upgrade them and I am replying only FYI.
> 
> 
> [1] https://dmesgd.nycbug.org/index.cgi?do=view&id=3790
> [2] https://dmesgd.nycbug.org/index.cgi?do=view&id=4111

I think it would be great to get some input from someone familiar with the new 
loader. I’ve cc’ed Warner, Kyle and Toomas, as they were listed in the 
quarterly status report.


Stefan

-- 
Stefan BethkeFon +49 151 14070811


Trouble booting from EFI with 12-stable

2019-01-11 Thread Stefan Bethke

The loader stumbles over this error and then drops to the prompt:
efi-autoresizecons not found

module_path is then not set, and loader can’t load the kernel. Typing in 
everything by hand will boot the system OK.

I just did a regular make installworld installkernel (previous install was from 
mid-December). Do I need to update the boot blocks or the EFI partition?


Stefan

-- 
Stefan BethkeFon +49 151 14070811


Re: Trouble booting from EFI with 12-stable

2019-01-11 Thread Stefan Bethke



> On 11.01.2019 at 15:04, Kyle Evans wrote:
> 
> On Fri, Jan 11, 2019 at 5:05 AM Stefan Bethke  wrote:
>> 
>> The loader stumbles over this error and then drops to the prompt:
>> efi-autoresizecons not found
>> 
>> module_path is then not set, and loader can’t load the kernel. Typing in 
>> everything by hand will boot the system OK.
>> 
>> I just did a regular make installworld installkernel (previous install was 
>> from mid-december). Do I need to update the boot blocks or the EFI partition?
>> 
> 
> Hi,
> 
> Interesting; this is generally an indicator that your loader
> (/boot/loader.efi in 12.0 EFI-land) is out-of-date with respect to
> scripts. For that I'd go ahead and double-check that /boot/loader.efi
> was actually updated *and* update the contents of the ESP -- that
> particular change was paired with another one that stopped doing any
> resizing in boot1.

I thought as much. Is there a succinct step-by-step to install/update 
everything involved in the UEFI boot process? The Handbook appears to have very 
little on UEFI booting…


Stefan

-- 
Stefan BethkeFon +49 151 14070811


Re: Trouble booting from EFI with 12-stable

2019-01-11 Thread Stefan Bethke



> On 11.01.2019 at 19:35, Stefan Bethke wrote:
> 
> 
> 
>> On 11.01.2019 at 15:04, Kyle Evans wrote:
>> 
>> On Fri, Jan 11, 2019 at 5:05 AM Stefan Bethke  wrote:
>>> 
>>> The loader stumbles over this error and then drops to the prompt:
>>> efi-autoresizecons not found
>>> 
>>> module_path is then not set, and loader can’t load the kernel. Typing in 
>>> everything by hand will boot the system OK.
>>> 
>>> I just did a regular make installworld installkernel (previous install was 
>>> from mid-december). Do I need to update the boot blocks or the EFI 
>>> partition?
>>> 
>> 
>> Hi,
>> 
>> Interesting; this is generally an indicator that your loader
>> (/boot/loader.efi in 12.0 EFI-land) is out-of-date with respect to
>> scripts. For that I'd go ahead and double-check that /boot/loader.efi
>> was actually updated *and* update the contents of the ESP -- that
>> particular change was paired with another one that stopped doing any
>> resizing in boot1.
> 
> I thought as much. Is there a succinct step-by-step to install/update 
> everything involved in the UEFI boot process? The Handbook appears to have 
> very little on UEFI booting…

The UEFI man page has a good explanation of which files are involved in booting:
https://www.freebsd.org/cgi/man.cgi?query=uefi&sektion=8&manpath=freebsd-release-ports

I mounted the ESP and copied /boot/boot1.efi to /boot/efi/EFI/BOOT/BOOTX64.EFI. 
Surprisingly, the new boot1.efi is much smaller than what I had before 
(according to the timestamp from November), but using that, booting seems to be 
restored.

# grep efi /etc/fstab
/dev/ada0p2 /boot/efi   msdos   rw,noauto   0   0
# mount /boot/efi
# ls -l /boot/efi/EFI/BOOT/BOOTX64.EFI /boot/efi/EFI/BOOT/bak/BOOTX64.EFI 
-rwxr-xr-x  1 root  wheel   81920 Jan 11 18:43 /boot/efi/EFI/BOOT/BOOTX64.EFI*
-rwxr-xr-x  1 root  wheel  410112 Nov 25 16:27 /boot/efi/EFI/BOOT/bak/BOOTX64.EFI*
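
For the archives, the copy step can be wrapped up roughly like this. It is a 
sketch of what I did by hand, not an official procedure; the function name and 
the bak/ directory are just my own convention, and the paths are passed as 
arguments rather than assumed.

```shell
# update_esp_loader SRC ESP_BOOT_DIR
# Copy SRC to ESP_BOOT_DIR/BOOTX64.EFI. If a BOOTX64.EFI is already there,
# keep a copy of it in ESP_BOOT_DIR/bak/ first so it can be restored.
update_esp_loader() {
    src=$1
    dir=$2
    if [ ! -f "$src" ]; then
        echo "update_esp_loader: missing $src" >&2
        return 1
    fi
    if [ -f "$dir/BOOTX64.EFI" ]; then
        mkdir -p "$dir/bak" || return 1
        cp "$dir/BOOTX64.EFI" "$dir/bak/BOOTX64.EFI" || return 1
    fi
    cp "$src" "$dir/BOOTX64.EFI"
}
```

With the ESP mounted on /boot/efi as per the fstab entry above, that would be 
update_esp_loader /boot/boot1.efi /boot/efi/EFI/BOOT.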


Thanks,
Stefan

-- 
Stefan BethkeFon +49 151 14070811


Re: Boot from one drive and load FreeBSD from another

2019-01-12 Thread Stefan Bethke

On 11.01.2019 at 23:08, Walter Parker wrote:
> If I create a FreeBSD-boot partition on the SAS drive and a FreeBSD-zfs
> partition on the ZFS mirror, will the boot partition loader automatically
> find the ZFS pool? If not, is there anything special I can do to force a
> boot?

Set up a UFS filesystem on one of the disks that the BIOS can access and put 
everything under /boot into it. Install boot or gptboot (not zfsboot or 
gptzfsboot) with gpart, since loader will only work on that UFS filesystem.

Since loader can’t find your root file system (as the BIOS has no access to 
those disks), you need to set the path to the root filesystem in loader.conf 
(see loader.conf(5), vfs.root.mountfrom). For ZFS, that’s something like 
zfs:poolname/path/to/rootfs. This will instruct the kernel to mount root from 
that spec. Normally, loader figures this out automatically, by probing the 
disks for metadata (ZFS) or by analyzing fstab (UFS), but in your case, it 
can’t.

You’ll probably want to add an entry for /boot to your fstab, so updates will 
update the boot partition instead of the /boot directory on your ZFS root.
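
As an illustration of the loader.conf setting, the small helper below just 
prints the line for a given dataset. The dataset name tank/ROOT/default is a 
made-up placeholder, not something from your setup, and the helper name is 
hypothetical.

```shell
# zfs_mountfrom_line DATASET
# Print the loader.conf line that tells the kernel to mount root from the
# given ZFS dataset (pool name plus path to the root filesystem).
zfs_mountfrom_line() {
    printf 'vfs.root.mountfrom="zfs:%s"\n' "$1"
}

# Placeholder dataset name; substitute the real one before adding the
# line to /boot/loader.conf on the UFS boot filesystem.
zfs_mountfrom_line tank/ROOT/default
```

The matching fstab entry for the UFS boot filesystem would be an ordinary ufs 
line with /boot as the mount point, using whatever device name the partition 
ends up with.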


HTH,
Stefan

-- 
Stefan BethkeFon +49 151 14070811


Re: poudriere(-devel) ports updating question

2019-03-06 Thread Stefan Bethke



> On 05.03.2019 at 15:09, tech-lists wrote:
> 
> Hi,
> 
> There are several categories of ports I'd like to avoid for some
> architectures. For example, I don't want x11 for mips.mips64. Or astronomy. 
> But let's say, for this architecture, I want to build everything else.
> 
> I can't see a way of excluding categories with poudriere ports when
> updating the ports tree - the only workaround I can see is to download 
> another tree, call it something and then manually edit that tree, and then 
> set the build off with -p port-treename. Every time I want to
> make a bulk run.
> 
> Basically I'm looking for exclude mask functionality when updating a
> ports tree with poudriere ports.
> 
> Do I need to do this manually or have I missed something?

I don’t think it’s easy to do that. How would you handle dependencies? (For 
example, some ports require X11 libs and stuff, even though they’re in a 
different category.)

Do you want to save time on builds by excluding pkgs that you know you’ll never 
need? Or what is your goal with this?

In my setup, I rely on the regular packages from the official repo, but for 
those pkg that I need built with different options, I run a custom list.

You could try to produce a filtered list of all ports, removing those that 
you’d never select manually, and let poudriere figure out what needs to be 
built. Something along the lines of:
- update ports
- list all ports | grep -v '^x11/'
- run poudriere with resulting list
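
The filtering step could look roughly like this. It is a sketch that assumes 
you already have a list of port origins (category/name, one per line); how you 
generate that list is up to you, and the function name is made up.

```shell
# exclude_categories CATEGORY...
# Read port origins (category/name, one per line) on stdin and drop every
# origin whose category is named as an argument.
exclude_categories() {
    # Join the arguments into an alternation, e.g. "x11|astro".
    pattern=$(printf '%s\n' "$@" | paste -sd'|' -)
    grep -Ev "^(${pattern})/"
}

# Small demonstration with three origins; only the last one survives.
printf '%s\n' x11/xterm astro/stellarium shells/zsh \
    | exclude_categories x11 astro
```

The resulting file can then be handed to poudriere bulk with -f, along with 
your usual -j and -p arguments. Note that this only stops poudriere from 
selecting those ports directly; anything another port depends on will still 
be built.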


Stefan

-- 
Stefan BethkeFon +49 151 14070811


carp can't delete address

2019-05-23 Thread Stefan Bethke

I’ve just set up carp (for the first time) and it seems the virtual address is 
not being removed on the backup host:

May 23 20:55:09 xxx kernel: carp: 1@igb0: INIT -> BACKUP (initialization complete)
May 23 20:55:12 xxx kernel: carp: 1@igb0: BACKUP -> MASTER (master timed out)
May 23 20:56:33 xxx kernel: carp: 1@igb0: MASTER -> BACKUP (more frequent advertisement received)
May 23 20:56:33 xxx kernel: igb0: deletion failed: 3

ifconfig shows the address as active:
# ifconfig igb0
igb0: flags=8943 metric 0 mtu 1500
options=e527bb
ether ac:1f:6b:12:34:56
inet 212.12.xxx.xxx/24 broadcast 212.12.xxx.xxx 
inet 212.12.xxx.yyy/32 broadcast 212.12.xxx.yyy vhid 1 
inet6 fe80::ae1f:6bff:...%igb0/64 scopeid 0x1 
carp: BACKUP vhid 1 advbase 1 advskew 200
media: Ethernet autoselect (1000baseT )
status: active
nd6 options=21

Is there a configuration I can/need to adjust?

uname -a
FreeBSD foo.example.com 12.0-STABLE FreeBSD 12.0-STABLE r344052 EISENBOOT  amd64



Stefan

-- 
Stefan BethkeFon +49 151 14070811


make kernel ignore broken SATA disk

2020-04-12 Thread Stefan Bethke

I have a server I don't have physical access to right now, which has a broken 
SATA disk that produces mostly errors (but not entirely).

The disk has two partitions that are part of a zpool each. I can't bring the 
system up with this disk being online, because ZFS is trying its darndest to 
use it.

I already renamed the GPT partitions in the hope that ZFS would not find them 
anymore, but it does.

I can't gpart destroy -f ada1 because "device busy".

Is there a way, ideally in the loader, to tell the kernel to ignore ada1 and/or 
ahcich5? Or can I force ZFS some other way to ignore the disk? I do have a 
spare disk I can use to replace the failed one, but I can't get the machine 
into a state where I could even issue the zpool replace command.


Stefan

--
Stefan BethkeFon +49 151 14070811




Re: make kernel ignore broken SATA disk

2020-04-12 Thread Stefan Bethke



> On 12.04.2020 at 16:45, Eugene Grosbein wrote:
> 
> 12.04.2020 21:37, Stefan Bethke wrote:
> 
>> I have a server I don't have physical access to right now, which has a 
>> broken SATA disk that produces mostly errors (but not entirely).
>> 
>> The disk has two partitions that are part of a zpool each. I can't bring the 
>> system up with this disk being online, because ZFS is trying its darndest to 
>> use it.
>> 
>> I already renamed the GPT partitions in the hope that ZFS would not find 
>> them anymore, but it does.
>> 
>> I can't gpart destroy -f ada1 because "device busy".
>> 
>> Is there a way, ideally in the loader, to tell the kernel to ignore ada1 
>> and/or ahcich5? Or can I force ZFS some other way to ignore the disk? I do 
>> have a spare disk I can use to replace the failed one, but I can't get the 
>> machine into a state where I could even issue the zpool replace command.
> 
> It depends on the HDD controller the disk is attached to. What controller and 
> driver does it have?

This is from an identical machine without disk issues:

# camcontrol devlist
  at scbus4 target 0 lun 0 (ada0,pass0)
  at scbus5 target 0 lun 0 (ada1,pass1)
  at scbus6 target 0 lun 0 (ada2,pass2)
   at scbus8 target 0 lun 0 (pass3)
# pciconf -lv
...
ahci0@pci0:0:23:0:  class=0x010601 card=0x088415d9 chip=0xa1028086 rev=0x31 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Q170/Q150/B150/H170/H110/Z170/CM236 Chipset SATA Controller [AHCI Mode]'
    class      = mass storage
    subclass   = SATA
...

dmesg:
ahci0:  port 0xf050-0xf057,0xf040-0xf043,0xf020-0xf03f mem 0xdf41-0xdf411fff,0xdf41e000-0xdf41e0ff,0xdf41d000-0xdf41d7ff irq 16 at device 23.0 on pci0
ahci0: AHCI v1.31 with 8 6Gbps ports, Port Multiplier not supported
ahcich0:  at channel 0 on ahci0
ahcich1:  at channel 1 on ahci0
ahcich2:  at channel 2 on ahci0
ahcich3:  at channel 3 on ahci0
ahcich4:  at channel 4 on ahci0
ahcich5:  at channel 5 on ahci0
ahcich6:  at channel 6 on ahci0
ahcich7:  at channel 7 on ahci0
ahciem0:  on ahci0

ada0 at ahcich4 bus 0 scbus4 target 0 lun 0
ada0:  ACS-2 ATA SATA 3.x device
ada0: Serial Number Z1F4GVC3
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 2861588MB (5860533168 512 byte sectors)
ada0: quirks=0x1<4K>
ada1 at ahcich5 bus 0 scbus5 target 0 lun 0
ada1:  ACS-2 ATA SATA 3.x device
ada1: Serial Number W1F5180B
ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 2861588MB (5860533168 512 byte sectors)
ada1: quirks=0x1<4K>
ada2 at ahcich6 bus 0 scbus6 target 0 lun 0
ada2:  ACS-2 ATA SATA 3.x device
ada2: Serial Number Z1F4EJEQ
ada2: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 2861588MB (5860533168 512 byte sectors)
ada2: quirks=0x1<4K>


Stefan

--
Stefan BethkeFon +49 151 14070811




Re: make kernel ignore broken SATA disk

2020-04-12 Thread Stefan Bethke

On 12.04.2020 at 17:43, Slawa Olhovchenkov wrote:
> 
> On Sun, Apr 12, 2020 at 04:37:06PM +0200, Stefan Bethke wrote:
> 
>> I have a server I don't have physical access to right now, which has a 
>> broken SATA disk that produces mostly errors (but not entirely).
>> 
>> The disk has two partitions that are part of a zpool each. I can't bring the 
>> system up with this disk being online, because ZFS is trying its darndest to 
>> use it.
>> 
>> I already renamed the GPT partitions in the hope that ZFS would not find 
>> them anymore, but it does.
>> 
>> I can't gpart destroy -f ada1 because "device busy".
>> 
>> Is there a way, ideally in the loader, to tell the kernel to ignore ada1 
>> and/or ahcich5? Or can I force ZFS some other way to ignore the disk? I do 
>> have a spare disk I can use to replace the failed one, but I can't get the 
>> machine into a state where I could even issue the zpool replace command.
> 
> `zpool offline pool device` if you have enough redundancy?

I do, but the command doesn't return. Instead, I'm getting loads of SATA error 
messages.

--
Stefan BethkeFon +49 151 14070811




Re: make kernel ignore broken SATA disk

2020-04-12 Thread Stefan Bethke

On 12.04.2020 at 18:29, Eugene Grosbein wrote:
> 
> 12.04.2020 21:57, Stefan Bethke wrote:
> 
>>>> Is there a way, ideally in the loader, to tell the kernel to ignore ada1 
>>>> and/or ahcich5? Or can I force ZFS some other way to ignore the disk? I do 
>>>> have a spare disk I can use to replace the failed one, but I can't get the 
>>>> machine into a state where I could even issue the zpool replace command.
>>> 
>>> It depends on the HDD controller the disk is attached to. What controller 
>>> and driver does it have?
>> 
>> This is from an identical machine without disk issues:
>> 
>> # camcontrol devlist
>>   at scbus4 target 0 lun 0 (ada0,pass0)
>>   at scbus5 target 0 lun 0 (ada1,pass1)
>>   at scbus6 target 0 lun 0 (ada2,pass2)
>>at scbus8 target 0 lun 0 (pass3)
>> # pciconf -lv
>> ...
>> ahci0@pci0:0:23:0:class=0x010601 card=0x088415d9 chip=0xa1028086 rev=0x31 
>> hdr=0x00
>>vendor = 'Intel Corporation'
>>device = 'Q170/Q150/B150/H170/H110/Z170/CM236 Chipset SATA Controller 
>> [AHCI Mode]'
>>class  = mass storage
>>subclass   = SATA
> 
> And your FreeBSD version?

FreeBSD 12.1-STABLE r358833 amd64

Stefan

--
Stefan BethkeFon +49 151 14070811




Re: make kernel ignore broken SATA disk

2020-04-12 Thread Stefan Bethke

> On 12.04.2020 at 18:31, Slawa Olhovchenkov wrote:
> 
> On Sun, Apr 12, 2020 at 06:24:09PM +0200, Stefan Bethke wrote:
> 
>> On 12.04.2020 at 17:43, Slawa Olhovchenkov wrote:
>>> 
>>> On Sun, Apr 12, 2020 at 04:37:06PM +0200, Stefan Bethke wrote:
>>> 
>>>> I have a server I don't have physical access to right now, which has a 
>>>> broken SATA disk that produces mostly errors (but not entirely).
>>>> 
>>>> The disk has two partitions that are part of a zpool each. I can't bring 
>>>> the system up with this disk being online, because ZFS is trying its 
>>>> darndest to use it.
>>>> 
>>>> I already renamed the GPT partitions in the hope that ZFS would not find 
>>>> them anymore, but it does.
>>>> 
>>>> I can't gpart destroy -f ada1 because "device busy".
>>>> 
>>>> Is there a way, ideally in the loader, to tell the kernel to ignore ada1 
>>>> and/or ahcich5? Or can I force ZFS some other way to ignore the disk? I do 
>>>> have a spare disk I can use to replace the failed one, but I can't get the 
>>>> machine into a state where I could even issue the zpool replace command.
>>> 
>>> `zpool offline pool device` if you have enough redundancy?
>> 
>> I do, but the command doesn't return. Instead, I'm getting loads of SATA 
>> error messages.
> 
> What is your zpool configuration?

This is from the working system. The identifiers are slightly different, but 
the structure is identical.

# zpool status
  pool: data
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
the pool may no longer be accessible by software that does not support
the features. See zpool-features(7) for details.
  scan: resilvered 176K in 0 days 00:01:28 with 0 errors on Sun May 26 21:24:54 2019
config:

        NAME              STATE     READ WRITE CKSUM
        data              ONLINE       0     0     0
          mirror-0        ONLINE       0     0     0
            gpt/ls0data   ONLINE       0     0     0
            gpt/ls1data   ONLINE       0     0     0
        logs
          gpt/data0log    ONLINE       0     0     0
        cache
          gpt/data0cache  ONLINE       0     0     0

errors: No known data errors

  pool: ls-host
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
the pool may no longer be accessible by software that does not support
the features. See zpool-features(7) for details.
  scan: scrub repaired 0 in 0 days 00:06:33 with 0 errors on Sun Apr 12 11:46:25 2020
config:

        NAME              STATE     READ WRITE CKSUM
        ls-host           ONLINE       0     0     0
          mirror-0        ONLINE       0     0     0
            gpt/ls0host   ONLINE       0     0     0
            gpt/ls1host   ONLINE       0     0     0
        logs
          gpt/host0log    ONLINE       0     0     0
        cache
          gpt/host0cache  ONLINE       0     0     0

errors: No known data errors

--
Stefan BethkeFon +49 151 14070811


Re: make kernel ignore broken SATA disk

2020-04-12 Thread Stefan Bethke

On 12.04.2020 at 19:03, Slawa Olhovchenkov wrote:
> 
> On Sun, Apr 12, 2020 at 06:38:10PM +0200, Stefan Bethke wrote:
> 
>> 
>> 
>>> On 12.04.2020 at 18:31, Slawa Olhovchenkov wrote:
>>> 
>>> On Sun, Apr 12, 2020 at 06:24:09PM +0200, Stefan Bethke wrote:
>>> 
>>>> On 12.04.2020 at 17:43, Slawa Olhovchenkov wrote:
>>>>> 
>>>>> On Sun, Apr 12, 2020 at 04:37:06PM +0200, Stefan Bethke wrote:
>>>>> 
>>>>>> I have a server I don't have physical access to right now, which has a 
>>>>>> broken SATA disk that produces mostly errors (but not entirely).
>>>>>> 
>>>>>> The disk has two partitions that are part of a zpool each. I can't bring 
>>>>>> the system up with this disk being online, because ZFS is trying its 
>>>>>> darndest to use it.
>>>>>> 
>>>>>> I already renamed the GPT partitions in the hope that ZFS would not find 
>>>>>> them anymore, but it does.
>>>>>> 
>>>>>> I can't gpart destroy -f ada1 because "device busy".
>>>>>> 
>>>>>> Is there a way, ideally in the loader, to tell the kernel to ignore ada1 
>>>>>> and/or ahcich5? Or can I force ZFS some other way to ignore the disk? I 
>>>>>> do have a spare disk I can use to replace the failed one, but I can't 
>>>>>> get the machine into a state where I could even issue the zpool replace 
>>>>>> command.
>>>>> 
>>>>> `zpool offline pool device` if you have enough redundancy?
>>>> 
>>>> I do, but the command doesn't return. Instead, I'm getting loads of SATA 
>>>> error messages.
>>> 
>>> What you zpool configuration?
>> 
>> This is from the working system. The identifiers are slightly different, but 
>> the structure is identical.
> 
> what about `zpool detach  ` ?

Now I can't boot into single user mode anymore; ZFS just waits forever, and the 
kernel is printing an endless chain of SATA error messages.

I really need a way to remove the broken disk before ZFS tries to access it, or 
a way to stop ZFS from trying to access the disk.


Stefan

--
Stefan Bethke   Fon +49 151 14070811



Re: make kernel ignore broken SATA disk

2020-04-12 Thread Stefan Bethke


On 12.04.2020 at 18:53, Ian Lepore wrote:
> 
> On Sun, 2020-04-12 at 16:37 +0200, Stefan Bethke wrote:
>> I have a server I don't have physical access to right now, which has
>> a broken SATA disk that produces mostly errors (but not entirely).
>> 
>> The disk has two partitions that are part of a zpool each. I can't
>> bring the system up with this disk being online, because ZFS is
>> trying its darndest to use it.
>> 
>> I already renamed the GPT partitions in the hope that ZFS would not
>> find them anymore, but it does.
>> 
>> I can't gpart destroy -f ada1 because "device busy".
>> 
>> Is there a way, ideally in the loader, to tell the kernel to ignore
>> ada1 and/or ahcich5? Or can I force ZFS some other way to ignore the
>> disk? I do have a spare disk I can use to replace the failed one, but
>> I can't get the machine into a state where I could even issue the
>> zpool replace command.
> 
> At the loader prompt (or in loader.conf without 'set'):
> 
> set hint.ada.1.disabled=1

Doesn't seem to have any effect. ada1 is still probed, and still prints error 
messages to the console.


Stefan

--
Stefan Bethke   Fon +49 151 14070811



Re: make kernel ignore broken SATA disk

2020-04-12 Thread Stefan Bethke

On 12.04.2020 at 19:24, Eugene Grosbein wrote:
> 
> Try something like this at loader prompt:
> 
> set hint.ahcich.5.disabled=1

Thank you, that did the trick!
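
For the archive, a minimal sketch of making the same hint persistent, assuming the failing disk sits on AHCI channel 5 (ahcich5) as it does in this thread; the unit number will differ on other machines. In loader.conf the hint is written without 'set':

```
# /boot/loader.conf
# Assumption: the broken disk hangs off AHCI channel 5 (ahcich5).
# Disabling the channel keeps the kernel from attaching it, so the
# disk behind it is never probed and ZFS never sees its partitions.
hint.ahcich.5.disabled="1"
```

With the channel disabled, the pool should come up degraded on its remaining mirror member, which leaves the machine in a state where the zpool replace command can be issued. The line should be removed again once the broken disk has been swapped, otherwise whatever is connected to that channel stays invisible.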


Stefan

--
Stefan Bethke   Fon +49 151 14070811



Re: make kernel ignore broken SATA disk

2020-04-12 Thread Stefan Bethke

On 12.04.2020 at 19:59, Warner Losh wrote:
> 
> Boot single user. ZFS won't import and you can do what you need.

Not if you have root on ZFS, and it's on the affected pool.


Stefan

--
Stefan Bethke   Fon +49 151 14070811


