page fault

2009-07-28 Thread Dan

Hi,

One of my servers running 7.2-RELEASE-p2 is crashing about every 2 or 3 
days with the following backtrace.  This particular one is from July 
24. I still have all the vmcores available if any further info is required.



Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code  = supervisor read, page not present
instruction pointer = 0x20:0xc077b9e8
stack pointer   = 0x28:0xe846ab34
frame pointer   = 0x28:0xe846ab58
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 42384 (perl5.10.0)
trap number = 12
panic: page fault
cpuid = 1
Uptime: 1d23h35m17s
Physical memory: 2023 MB
Dumping 257 MB: 242 226 210 194 178 162 146 130 114 98 82 66 50 34 18 2

Reading symbols from /boot/kernel/accf_http.ko...Reading symbols from 
/boot/kernel/accf_http.ko.symbols...done.

done.
Loaded symbols for /boot/kernel/accf_http.ko
Reading symbols from /boot/kernel/acpi.ko...Reading symbols from 
/boot/kernel/acpi.ko.symbols...done.

done.
Loaded symbols for /boot/kernel/acpi.ko
#0  doadump () at pcpu.h:196
196 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb)

(kgdb) where
#0  doadump () at pcpu.h:196
#1  0xc07ddbe7 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xc07ddeb9 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3  0xc0a818ac in trap_fatal (frame=0xe846aaf4, eva=0)
at /usr/src/sys/i386/i386/trap.c:939
#4  0xc0a81b10 in trap_pfault (frame=0xe846aaf4, usermode=0, eva=0)
at /usr/src/sys/i386/i386/trap.c:852
#5  0xc0a82492 in trap (frame=0xe846aaf4) at 
/usr/src/sys/i386/i386/trap.c:530

#6  0xc0a675ab in calltrap () at /usr/src/sys/i386/i386/exception.s:159
#7  0xc077b9e8 in pfs_ioctl (va=0xe846ab88)
at /usr/src/sys/fs/pseudofs/pseudofs_vnops.c:247
#8  0xc0a965e2 in VOP_IOCTL_APV (vop=0xc0be6200, a=0xe846ab88)
at vnode_if.c:795
#9  0xc086d68d in vn_ioctl (fp=0xc58d45f0, com=1076655123, data=0xc61b0440,
active_cred=0xcb9d8500, td=0xc5e5e000) at vnode_if.h:437
#10 0xc0816a05 in kern_ioctl (td=0xc5e5e000, fd=3, com=1076655123,
data=0xc61b0440 "") at file.h:269
#11 0xc0816b64 in ioctl (td=0xc5e5e000, uap=0xe846acfc)
at /usr/src/sys/kern/sys_generic.c:571
#12 0xc0a81e65 in syscall (frame=0xe846ad38)
at /usr/src/sys/i386/i386/trap.c:1090
#13 0xc0a67610 in Xint0x80_syscall () at 
/usr/src/sys/i386/i386/exception.s:255

#14 0x0033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

(kgdb) l *0xc077b9e8
0xc077b9e8 is in pfs_ioctl (/usr/src/sys/fs/pseudofs/pseudofs_vnops.c:248).
243 static int
244 pfs_ioctl(struct vop_ioctl_args *va)
245 {
246 struct vnode *vn = va->a_vp;
247 struct pfs_vdata *pvd = vn->v_data;
248 struct pfs_node *pn = pvd->pvd_pn;
249 struct proc *proc;
250 int error;
251
252 PFS_TRACE(("%s: %lx", pn->pn_name, va->a_command));


Re: page fault

2009-07-28 Thread Dan

Kostik Belousov wrote:

On Tue, Jul 28, 2009 at 10:33:16AM -0400, Dan wrote:


I highly suspect that your problem was fixed by r194815.


Will this fix work on a release, or would it be best to upgrade to the
stable branch?
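
For reference, one way to try a single commit against a release tree, rather than
moving to stable, is to apply just that revision as a patch and rebuild the kernel.
A minimal sketch, assuming the fix is self-contained in r194815 and applies cleanly
to the 7.2 sources in /usr/src:

svn diff -c 194815 svn://svn.freebsd.org/base/head > /tmp/r194815.diff
cd /usr/src && patch -p0 < /tmp/r194815.diff
make buildkernel KERNCONF=GENERIC && make installkernel KERNCONF=GENERIC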



Re: nfsmb survey

2006-10-16 Thread Dan
On Monday 16 October 2006 18:15, Andriy Gapon wrote:
> In STABLE and upcoming 6.2 (and in CURRENT, of course) there is a new
> SMB driver for NForce2/3/4 chipsets, nfsmb, developed by Ruslan
> Ermilov. However, the driver doesn't currently work on all hardware
> that it is to support. The problem is with choosing proper BAR
> registers, which, as it seems, might be different for different
> chipsets/SMB controllers.
>
> If you have a system based on NForce2/3/4, could you please share the
> following information:
>
> 1. find out pci handle of your SMB controller:
> $ pciconf -l | fgrep 0x0c0500
> and also note chip field value, it should match 00(64|84|d4|e4|52)10de.
> E.g.:
> [EMAIL PROTECTED]:1:1:class=0x0c0500 card=0x1c02147b chip=0x006410de
> rev=0xa2 hdr=0x00

(root)# uname -a
FreeBSD storm 6.1-STABLE FreeBSD 6.1-STABLE #1: Mon May 22 23:55:36 BST 
2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC  i386

(root)# pciconf -lv | fgrep 0x0c0500 -A 4
[EMAIL PROTECTED]:1:1: class=0x0c0500 card=0x0c111458 chip=0x005210de rev=0xa2 
hdr=0x00
vendor   = 'NVIDIA Corporation'
device   = 'nForce4 SMBus'
class= serial bus
subclass = SMBus

> 2. read values in BARs 4 and 5 and also registers 0x50 and 0x54 like
> follows:
> $ pciconf -r pci0:1:1 0x20
> $ pciconf -r pci0:1:1 0x24
> $ pciconf -r pci0:1:1 0x50
> $ pciconf -r pci0:1:1 0x54

(root)# pciconf -r pci0:1:1 0x20
1c01 
(root)# pciconf -r pci0:1:1 0x24
1c41 
(root)# pciconf -r pci0:1:1 0x50
1c01 
(root)# pciconf -r pci0:1:1 0x54
1c41 



MESSAGE NOT DELIVERED: [EMAIL PROTECTED]

2005-10-17 Thread dan
Your message could not be delivered. The User is out of space. Please try to 
send your message again at a later time.


Re: Watching DVD's in -stable

2001-08-07 Thread Dan
Daichi GOTO wrote:

[snip]
> > > But my built vlc can not run. It starts with core and down
> > > immediately. Did you meet the same situation?
[snip]

I used to have that problem when I was running XFree86 4.0.3 -
there was a problem with the Xv (hardware video support) driver
for my graphics card (3DFX Voodoo 5). I believe XFree86 4 uses
the same 'tdfx' driver for all 3DFX Voodoo cards, so if yours is
one of these, this may be your problem also.

When I upgraded to XFree86 4.1.0 from the ports, the problem
went away and vlc runs fine now (the picture is a little jerky, but
I'm blaming that on a slow data rate from the DVD drive.)

You can diagnose problems of this type by running 'xvinfo' and
checking for any obvious problems in the information it gives
about your Xv setup and graphics card. For example, under
X 4.0.3, 'xvinfo' on my system included the line:

 maximum XvImage size: 1024 x 0

...which was evidently not a useful size. Under 4.1.0 it reports

 maximum XvImage size: 2048 x 2048

HTH,

Dan






ZFS - benchmark & tuning before and after doubling RAM

2011-01-08 Thread Dan Langille
I've been running a ZFS array for about 10 months on a system with 4GB 
of RAM.  I'm about to add another 4GB of RAM.


I think this might be an opportune time to run some simple benchmarks 
and do some tuning.  Getting more out of the system is not a priority 
for me.  It does what I need now.  However, I do see some merit in 
writing something up for others to see/follow/learn.


The system is running FreeBSD 8.2-PRERELEASE #1: Tue Nov 30 22:07:59 EST 
2010 on a 64 bit box.  The ZFS array consists of 7x2TB commodity drives 
on two SiI3124 SATA controllers.  The OS runs off a gmirror RAID-1.


More details here: http://www.freebsddiary.org/zfs-benchmark.php

First up, I've done a simple bonnie++ benchmark before I add more RAM.
 I ran this on two different datasets; one with compression enabled, 
one without.
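
For anyone who wants to reproduce the comparison, a minimal sketch of the
two-dataset setup, assuming a pool named "storage" with default mountpoints and
bonnie++ from ports (the dataset names and the 16g working-set size are placeholders):

zfs create -o compression=off storage/bench-nocomp
zfs create -o compression=on  storage/bench-comp
bonnie++ -d /storage/bench-nocomp -u root -s 16g
bonnie++ -d /storage/bench-comp   -u root -s 16g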


If anyone has suggestions for various tests, option settings, etc, I'm 
happy to run them and include the results.  We have lots of time to play 
with this.


--
Dan Langille - http://langille.org/


Re: ZFS - benchmark & tuning before and after doubling RAM

2011-01-08 Thread Dan Langille

On 1/8/2011 4:33 PM, Mehmet Erol Sanliturk wrote:



On Sat, Jan 8, 2011 at 3:37 PM, Dan Langille <d...@langille.org> wrote:

I've been running a ZFS array for about 10 months on a system with
4GB of RAM.  I'm about to add another 4GB of RAM.

I think this might be an opportune time to run some simple
benchmarks and do some tuning.  Getting more out of the system is
not a priority for me.  It does what I need now.  However, I do see
some merit in writing something up for others to see/follow/learn.

The system is running FreeBSD 8.2-PRERELEASE #1: Tue Nov 30 22:07:59
EST 2010 on a 64 bit box.  The ZFS array consists of 7x2TB commodity
drives on two SiI3124 SATA controllers.  The OS runs off a gmirror
RAID-1.

More details here: http://www.freebsddiary.org/zfs-benchmark.php

First, up, I've done a simple bonnie++ benchmark before I add more
RAM.  I ran this on two different datasets; one with compression
enabled, one without.

If anyone has suggestions for various tests, option settings, etc,
I'm happy to run them and include the results.  We have lots of time
to play with this.

--
Dan Langille - http://langille.org/




I think you know the following pages:

http://hub.opensolaris.org/bin/view/Community+Group+zfs/zfstestsuite
http://dlc.sun.com/osol/test/downloads/current/
http://hub.opensolaris.org/bin/view/Community+Group+testing/testsuites
http://hub.opensolaris.org/bin/view/Community+Group+testing/zones

Some of the links may disappear spontaneously because of restructuring
of their respective sites.


Looking briefly, they seem to be more aimed at regression testing than at
benchmarking.  They all seem to be the same thing (just different instances).


Perhaps I am mistaken, but I will look closer.

--
Dan Langille - http://langille.org/


Re: ZFS - hot spares : automatic or not?

2011-01-10 Thread Dan Langille

On 1/4/2011 11:52 AM, John Hawkes-Reed wrote:

On 04/01/2011 03:08, Dan Langille wrote:

Hello folks,

I'm trying to discover if ZFS under FreeBSD will automatically pull in a
hot spare if one is required.

This raised the issue back in March 2010, and refers to a PR opened in
May 2009

* http://lists.freebsd.org/pipermail/freebsd-fs/2010-March/007943.html
* http://www.freebsd.org/cgi/query-pr.cgi?pr=134491

In turn, the PR refers to this March 2010 post referring to using devd
to accomplish this task.

http://lists.freebsd.org/pipermail/freebsd-stable/2010-March/055686.html

Does the above represent the current state?

I ask because I just ordered two more HDD to use as spares. Whether they
sit on the shelf or in the box is open to discussion.


As far as our testing could discover, it's not automatic.

I wrote some Ugly Perl that's called by devd when it spots a drive-fail
event, which seemed to DTRT when simulating a failure by pulling a drive.


Without such a script, what is the value in creating hot spares?
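
For context, what a devd-triggered script ultimately runs is a single zpool
operation; a minimal sketch of the manual version, assuming a pool named tank,
a failed member da3, and a configured spare da7:

zpool status tank            # confirm which vdev is FAULTED/UNAVAIL
zpool replace tank da3 da7   # resilver onto the spare
zpool detach tank da3        # after resilvering completes, drop the dead disk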

--
Dan Langille - http://langille.org/


Re: ZFS - hot spares : automatic or not?

2011-01-19 Thread Dan Langille

On 1/11/2011 11:10 AM, John Hawkes-Reed wrote:

On 11/01/2011 03:38, Dan Langille wrote:

On 1/4/2011 11:52 AM, John Hawkes-Reed wrote:

On 04/01/2011 03:08, Dan Langille wrote:

Hello folks,

I'm trying to discover if ZFS under FreeBSD will automatically pull
in a
hot spare if one is required.

This raised the issue back in March 2010, and refers to a PR opened in
May 2009

* http://lists.freebsd.org/pipermail/freebsd-fs/2010-March/007943.html
* http://www.freebsd.org/cgi/query-pr.cgi?pr=134491

In turn, the PR refers to this March 2010 post referring to using devd
to accomplish this task.

http://lists.freebsd.org/pipermail/freebsd-stable/2010-March/055686.html


Does the above represent the current state?

I ask because I just ordered two more HDD to use as spares. Whether
they
sit on the shelf or in the box is open to discussion.


As far as our testing could discover, it's not automatic.

I wrote some Ugly Perl that's called by devd when it spots a drive-fail
event, which seemed to DTRT when simulating a failure by pulling a
drive.


Without such a script, what is the value in creating hot spares?


We went through that loop in the office.

We're used to the way the Netapps work here, where often one's first
notice of a failed disk is a visit from the courier with a replacement.
(I'm only half joking)

In the end, writing enough perl to swap in the spare disk made much more
sense than paging the relevant admin on disk-fail and expecting them to
be able to type straight at 4AM.

Our thinking is that having a hot spare allows us to do the physical
disk-swap in office hours, rather than (for instance) running in a
degraded state over a long weekend.

If it's of interest, I'll see if I can share the code.


I think this is very much of interest.  :)


--
Dan Langille - http://langille.org/


Re: link aggregation - bundling 2 lagg interfaces together

2011-02-04 Thread Dan Nelson
In the last episode (Feb 04), Damien Fleuriot said:
> I have a firewall with 2x Intel pro dual port cards.
> 
> On Intel A , port 1 goes to switch 1, port 2 goes to switch 2
> On Intel B , port 1 goes to switch 1, port 2 goes to switch 2
> 
> I have created the following 2 lagg devices using LACP:
> 
> lagg0 = A1 + B1
> lagg1 = A2 + B2
> 
> This works fine.
> 
> Now, what I had in mind was creating a lagg2 device using lagg0 and
> lagg1 with failover.
> 
> That would provide redundancy in case of a switch failure.
> 
> ifconfig won't let me though:
> 
> # ifconfig lagg2 laggproto failover laggport lagg0 laggport lagg1
> ifconfig: SIOCSLAGGPORT: Invalid argument
> 
> I suppose it's not possible to aggregate lagg interfaces ?

Apparently not: http://fxr.watson.org/fxr/source/net/if_lagg.c#L516

It looks like there is preliminary code under #ifdef LAGG_PORT_STACKING, but
it claims to be untested.
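
Single-level aggregation does work as expected, though; a minimal sketch of the
supported setup, with the interface names and address as placeholders:

ifconfig lagg0 create
ifconfig lagg0 laggproto lacp laggport em0 laggport em1 192.0.2.10/24 up
ifconfig lagg1 create
ifconfig lagg1 laggproto failover laggport em2 laggport em3 up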

-- 
Dan Nelson
dnel...@allantgroup.com


Re: 3TB disc and block alignment

2011-02-17 Thread Dan Nelson
In the last episode (Feb 17), Daniel Kalchev said:
> >>> da0:  Fixed Direct Access SCSI-5 device
> >>> da0: 2861588MB (5860533168 512 byte sectors: 255H 63S/T 364801C)
> >
> > Thanks -- is it also possible to have something like
> >
> > da0: 2861588MB (732566646 4096 byte sectors: 255H 63S/T 364801C)
>
> According to Hitachi, this is an 512b drive.

Correct.  This isn't a 4k drive.  Datasheet:

http://www.hgst.com/internal-drives/enterprise/ultrastar/ultrastar-7k3000

Sector size (variable, Bytes/sector): 512

-- 
Dan Nelson
dnel...@allantgroup.com


using freebsd-update to update jails and their host

2011-02-27 Thread Dan Naumov
I have a 8.0 host system with a few jails (using ezjail) that I am gearing
to update to 8.2. I have used freebsd-update a few times in the past to
upgrade a system between releases, but how would I go about using it to
also upgrade a few jails made using ezjail? I would obviously need to point
freebsd-update to use /basejail as root which I assume isn't too hard, but
what about having it merge the new/changed /etc files in individual jails?

I've also discovered the "ezjail-admin install -h file://" option which
installs a basejail using the host system as base, am I right in thinking I
could also use this by first upgrading my host and then running this command
to write the /basejail over with the updated files from the host to bring
them into sync? I still don't know how I would then fix the /etc under each
individual jail though.
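
freebsd-update does accept an alternate root via -b, so one possible approach for
the basejail, sketched here on the assumption that ezjail keeps it in
/usr/jails/basejail (the paths, release number, and mergemaster step are
assumptions, not a tested recipe):

freebsd-update -b /usr/jails/basejail -r 8.2-RELEASE upgrade
freebsd-update -b /usr/jails/basejail install   # repeat as prompted
# then merge each jail's /etc by hand, e.g.:
mergemaster -D /usr/jails/<jailname>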


- Sincerely,
Dan Naumov


Re: Network throughput: Never get more than 112MB/s über two NICs

2011-04-12 Thread Dan Nelson
In the last episode (Apr 12), Denny Schierz said:
> On Monday, 11.04.2011 at 21:52 +0200, Denny Schierz wrote:
> > On 11.04.2011 at 20:06, Tim Daneliuk wrote:
> > > Are you certain you are not somehow running active-passive instead of
> > > active-active ...  just a thought...
> > 
> > 150% sure. I used two dedicated NICs WITHOUT any loadbalancing. The sum
> > has to be more than 112MB/s.
> 
> it must be the network. I tested two crossover connections and I've got
> 220MB/s :-)

Check to see whether your switch ports are oversubscribed (common for older
blade switches, or very high-density blades); sometimes there will be
rectangles enclosing groups of 6-8 ports, which means that they are
controlled by a single chip internally.  Moving each of your test machines
to a separate group may improve your performance.
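
A quick way to tell whether the limit is the switch or the hosts is to run several
parallel TCP streams; a minimal sketch using benchmarks/iperf (the host name is a
placeholder):

iperf -s                       # on the receiver
iperf -c testhost -P 4 -t 30   # on the sender: 4 parallel streams for 30 seconds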

-- 
Dan Nelson
dnel...@allantgroup.com


Re: Network throughput: Never get more than 112MB/s über two NICs

2011-04-12 Thread Dan Nelson
In the last episode (Apr 12), Dan Nelson said:
> In the last episode (Apr 12), Denny Schierz said:
> > On Monday, 11.04.2011 at 21:52 +0200, Denny Schierz wrote:
> > > On 11.04.2011 at 20:06, Tim Daneliuk wrote:
> > > > Are you certain you are not somehow running active-passive instead of
> > > > active-active ...  just a thought...
> > > 
> > > 150% sure. I used two dedicated NICs WITHOUT any loadbalancing. The sum
> > > has to be more than 112MB/s.
> > 
> > it must be the network. I tested two crossover connections and I've got
> > 220MB/s :-)
> 
> Check to see whether your switch ports are oversubscribed (common for older
> blade switches, or very high-density blades); sometimes there will be
> rectangles enclosing groups of 6-8 ports, which means that they are
> controlled by a single chip internally.  Moving each of your test machines
> to a separate group may improve your performance.

.. I missed a line in your original post:

> > All are connected through a Cisco Catalyst WS-X4515.

This is a supervisor module for a 4500 series chassis, but only has two SFP
ports on it.  Your servers are unlikely to be plugged into it.  They're
probably plugged into another module.  This page lists some gigabit ethernet
modules that oversubscribe their ports, and which ports belong to which
groups:

http://www.cisco.com/en/US/docs/switches/lan/catalyst4500/hardware/module/guide/03instal.html#wpxref23495

-- 
Dan Nelson
dnel...@allantgroup.com


Re: Large number of SATA commits (MFCs) to RELENG_8

2011-04-21 Thread Dan Nelson
In the last episode (Apr 21), Doug Barton said:
> On 04/20/2011 19:43, Lystopad Olexandr wrote:
> > May be we need another one file, like src/ChangeLog ?
> 
> Users who run a -stable branch are expected to read 
> freebsd-stable@FreeBSD.org (note, not just subscribe), AND read the 
> commit mail for their branch; just like users who run HEAD are expected 
> to read freebsd-current@ and the relevant commit mail.

I use a small shell script called "update" that does a "svn update", and
also prints a line at the end that you can copy&paste into another terminal
to get the log of what was just pulled.

#! /bin/sh
stat=$(svn status --depth empty -v -u)
localrev=$(echo "$stat" | cut -c10- | awk 'NR==1 {print $2}')
latestrev=$(echo "$stat" | awk 'NR==2 {print $4}')
repo=$(svn info | sed -ne '/^URL/s/^.*: //p')
echo "$stat"
svn info | grep Revision
svn update
if [ "$localrev" != "$latestrev" ] ; then
  echo "Log:"
  echo "svn log -v -r $(($localrev+1)):$latestrev $repo"
fi

Sample output:

(root@dan) /usr/src # ./update
 M  220902   220902 jilles   .
Status against revision: 220927
Revision: 220902
Usbin/conscontrol/conscontrol.c
Usbin/conscontrol/conscontrol.8
 U   sbin/conscontrol
Usys/kern/uipc_sockbuf.c
Usys/kern/kern_exit.c
Usys/netgraph/ng_base.c
 U   sys/contrib/pf
 U   sys/contrib/dev/acpica
 U   sys/cddl/contrib/opensolaris
 U   sys/amd64/include/xen
Usys/sys/proc.h
 U   sys
Updated to revision 220927.
Log:
svn log -v -r 220903:220927 svn://svn.freebsd.org/base/stable/8

-- 
Dan Nelson
dnel...@allantgroup.com


Re: Unstable ARP responses times

2011-05-13 Thread Dan Nelson
In the last episode (May 13), Bartosz Woronicz said:
> Since I moved from 7.3-stable to 8.2-stable I get strange long responses
> from arp, with arping.
> I.e.
> root@Korbotron82|pts/3|13:35:35|/home/mastier # arping -i vlan92 
> 79.110.194.140
> ARPING 79.110.194.140
> 60 bytes from 00:15:17:a2:ea:38 (79.110.194.140): index=0 time=1.579 msec
> 60 bytes from 00:15:17:a2:ea:38 (79.110.194.140): index=1 time=653.326 msec
> 60 bytes from 00:15:17:a2:ea:38 (79.110.194.140): index=2 time=7.153 usec

arping has a usleep(1) call in its read loop, which can cause delays like
this if there are other processes running and the scheduler decides to run
another process.  Try removing the usleep(1) on line 916 of arping.c and see
if that helps.  The best solution would be to use the kernel-provided
timestamps from the pcap header, rather than calling gettimeofday() in
userland.  If you run "tcpdump arp", you should be able to see the packet
timestamps as the kernel sees them.
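
A minimal sketch of that check, printing kernel timestamps with per-packet deltas,
assuming the vlan92 interface from the report above:

tcpdump -n -i vlan92 -ttt arp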

-- 
Dan Nelson
dnel...@allantgroup.com


Re: ZFS I/O errors

2011-05-30 Thread Dan Nelson
In the last episode (May 30), Olaf Seibert said:
> On Mon 30 May 2011 at 03:33:49 -0700, Jeremy Chadwick wrote:
> > On Mon, May 30, 2011 at 12:10:51PM +0200, Olaf Seibert wrote:
> > I'm not sure why this didn't actually map to a filename on the system
> > however.  I've never quite understood what the hexadecimal values shown
> > represent (I have ideas but it'd be useful to know what they meant).
> 
> The scrub is starting to add some filenames to the list. So far they are
> two filenames in snapshots (where current versions of the file have been
> modified since then).
> 
> > Try running without compression and see if that improves things.
> 
> That sounds like a good idea.
> 
> My theory so far is that it ran out of memory while compressing, with
> incorrect compressed data written to the disk.

The ZFS compression code will panic if it can't allocate the buffer needed
to store the compressed data, so that's unlikely to be your problem.  The
only time I have seen an "illegal byte sequence" error was when trying to
copy raw disk images containing ZFS pools to different disks, and the
destination disk was a different size than the original.  I wasn't even able
to import the pool in that case, though.  

The zfs IO code overloads the EILSEQ error code and uses it as a "checksum
error" code.  Returning that error for the same block on all disks is
definitely weird.  Could you have run a partitioning tool, or some other
program that would have done direct writes to all of your component disks?

Your scrub is also a bit worrying - 24k checksum errors definitely shouldn't
occur during normal usage.
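
A minimal sketch of the follow-up checks implied above, assuming the pool is named
tank and its members are ada0-ada2 (all names are placeholders):

zpool status -v tank    # list files with permanent errors once the scrub finishes
smartctl -a /dev/ada0   # rule out the individual drives
zpool clear tank        # reset the counters, then re-run: zpool scrub tank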
 
-- 
Dan Nelson
dnel...@allantgroup.com


Re: stable/8 nfsd: is it normal to have one worker regardless of -n setting?

2011-07-31 Thread Dan Nelson
In the last episode (Aug 01), Dmitry Morozovsky said:
> just noticed that contemporary nfsd does not fork children in accordance
> to -n setting:
> 
> stable/8:
> 
> root@beaver:/usr/local/tb/scripts# pid nfs
>  1745  ??  Is 0:00.02 nfsd: master (nfsd)
>  1746  ??  S  0:03.29 nfsd: server (nfsd)
> root@beaver:/usr/local/tb/scripts# grep nfs_server_flags /etc/rc.conf
> nfs_server_flags="-u -t -n 4"

They are threads now:

# ps axw | grep nfsd
 1373  ??  Is 0:00.02 nfsd: master (nfsd)
 1374  ??  S  5:47.14 nfsd: server (nfsd)
# ps axwH | grep nfsd
 1373  ??  Is 0:00.02 nfsd: master (nfsd)
 1374  ??  S  1:25.79 nfsd: server (nfsd)
 1374  ??  S  1:26.65 nfsd: server (nfsd)
 1374  ??  S  1:27.67 nfsd: server (nfsd)
 1374  ??  S  1:27.04 nfsd: server (nfsd)

-- 
Dan Nelson
dnel...@allantgroup.com


bad sector in gmirror HDD

2011-08-19 Thread Dan Langille
System in question: FreeBSD 8.2-STABLE #3: Thu Mar  3 04:52:04 GMT 2011

After a recent power failure, I'm seeing this in my logs:

Aug 19 20:36:34 bast smartd[1575]: Device: /dev/ad2, 2 Currently unreadable 
(pending) sectors

And gmirror reports:

# gmirror status
  NameStatus  Components
mirror/gm0  DEGRADED  ad0 (100%)
  ad2

I think the solution is: gmirror rebuild

Comments?
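
For the record, a minimal sketch of that approach, assuming ad2 is the consumer to
resynchronize from ad0:

gmirror rebuild gm0 ad2   # force a full resync of ad2
gmirror status            # watch it climb back to COMPLETE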



Searching on that error message, I was led to believe that identifying the bad 
sector and
running dd to read it would cause the HDD to reallocate that bad block.

  http://smartmontools.sourceforge.net/badblockhowto.html

However, since ad2 is one half of a gmirror, I don't think this is the best 
approach.

Comments?




More information:

smartd, gpart, dh, diskinfo, and fdisk output at 
http://beta.freebsddiary.org/smart-fixing-bad-sector.php

also:

# gmirror list
Geom name: gm0
State: DEGRADED
Components: 2
Balance: round-robin
Slice: 4096
Flags: NONE
GenID: 0
SyncID: 1
ID: 3362720654
Providers:
1. Name: mirror/gm0
   Mediasize: 40027028992 (37G)
   Sectorsize: 512
   Mode: r6w5e14
Consumers:
1. Name: ad0
   Mediasize: 40027029504 (37G)
   Sectorsize: 512
   Mode: r1w1e1
   State: SYNCHRONIZING
   Priority: 0
   Flags: DIRTY, SYNCHRONIZING
   GenID: 0
   SyncID: 1
   Synchronized: 100%
   ID: 949692477
2. Name: ad2
   Mediasize: 40027029504 (37G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 0
   Flags: DIRTY, BROKEN
   GenID: 0
   SyncID: 1
   ID: 3585934016



-- 
Dan Langille - http://langille.org



Re: bad sector in gmirror HDD

2011-08-19 Thread Dan Langille

On Aug 19, 2011, at 7:21 PM, Jeremy Chadwick wrote:

> On Fri, Aug 19, 2011 at 04:50:01PM -0400, Dan Langille wrote:
>> System in question: FreeBSD 8.2-STABLE #3: Thu Mar  3 04:52:04 GMT 2011
>> 
>> After a recent power failure, I'm seeing this in my logs:
>> 
>> Aug 19 20:36:34 bast smartd[1575]: Device: /dev/ad2, 2 Currently unreadable 
>> (pending) sectors
> 
> I doubt this is related to a power failure.
> 
>> Searching on that error message, I was led to believe that identifying the 
>> bad sector and
>> running dd to read it would cause the HDD to reallocate that bad block.
>> 
>>  http://smartmontools.sourceforge.net/badblockhowto.html
> 
> This is incorrect (meaning you've misunderstood what's written there).
> 
> Unreadable LBAs can be a result of the LBA being actually bad (as in
> uncorrectable), or the LBA being marked "suspect".  In either case the
> LBA will return an I/O error when read.
> 
> If the LBAs are marked "suspect", the drive will perform re-analysis of
> the LBA (to determine if the LBA can be read and the data re-mapped, or
> if it cannot then the LBA is marked uncorrectable) when you **write** to
> the LBA.
> 
> The above smartd output doesn't tell me much.  Providing actual SMART
> attribute data (smartctl -a) for the drive would help.  The brand of the
> drive, the firmware version, and the model all matter -- every drive
> behaves a little differently.

Information such as this?  
http://beta.freebsddiary.org/smart-fixing-bad-sector.php


-- 
Dan Langille - http://langille.org



Re: bad sector in gmirror HDD

2011-08-20 Thread Dan Langille
On Aug 19, 2011, at 11:24 PM, Jeremy Chadwick wrote:

> On Fri, Aug 19, 2011 at 09:39:17PM -0400, Dan Langille wrote:
>> 
>> On Aug 19, 2011, at 7:21 PM, Jeremy Chadwick wrote:
>> 
>>> On Fri, Aug 19, 2011 at 04:50:01PM -0400, Dan Langille wrote:
>>>> System in question: FreeBSD 8.2-STABLE #3: Thu Mar  3 04:52:04 GMT 2011
>>>> 
>>>> After a recent power failure, I'm seeing this in my logs:
>>>> 
>>>> Aug 19 20:36:34 bast smartd[1575]: Device: /dev/ad2, 2 Currently 
>>>> unreadable (pending) sectors
>>> 
>>> I doubt this is related to a power failure.
>>> 
>>>> Searching on that error message, I was led to believe that identifying the 
>>>> bad sector and
>>>> running dd to read it would cause the HDD to reallocate that bad block.
>>>> 
>>>> http://smartmontools.sourceforge.net/badblockhowto.html
>>> 
>>> This is incorrect (meaning you've misunderstood what's written there).
>>> 
>>> Unreadable LBAs can be a result of the LBA being actually bad (as in
>>> uncorrectable), or the LBA being marked "suspect".  In either case the
>>> LBA will return an I/O error when read.
>>> 
>>> If the LBAs are marked "suspect", the drive will perform re-analysis of
>>> the LBA (to determine if the LBA can be read and the data re-mapped, or
>>> if it cannot then the LBA is marked uncorrectable) when you **write** to
>>> the LBA.
>>> 
>>> The above smartd output doesn't tell me much.  Providing actual SMART
>>> attribute data (smartctl -a) for the drive would help.  The brand of the
>>> drive, the firmware version, and the model all matter -- every drive
>>> behaves a little differently.
>> 
>> Information such as this?  
>> http://beta.freebsddiary.org/smart-fixing-bad-sector.php
> 
> Yes, perfect.  Thank you.  First thing first: upgrade smartmontools to
> 5.41.  Your attributes will be the same after you do this (the drive is
> already in smartmontools' internal drive DB), but I often have to remind
> people that they really need to keep smartmontools updated as often as
> possible.  The changes between versions are vast; this is especially
> important for people with SSDs (I'm responsible for submitting some
> recent improvements for Intel 320 and 510 SSDs).

Done.

> Anyway, the drive (albeit an old PATA Maxtor) appears to have three
> anomalies:
> 
> 1) One confirmed reallocated LBA (SMART attribute 5)
> 
> 2) One "suspect" LBA (SMART attribute 197)
> 
> 3) A very high temperature of 51C (SMART attribute 194).  If this drive
> is in an enclosure or in a system with no fans this would be
> understandable, otherwise this is a bit high.  My home workstation which
> has only one case fan has a drive with more platters than your Maxtor,
> and it idles at ~38C.  Possibly this drive has been undergoing constant
> I/O recently (which does greatly increase drive temperature)?  Not sure.
> I'm not going to focus too much on this one.

This is an older system.  I suspect insufficient ventilation.  I'll look at 
getting
a new case fan, if not some HDD fans.

> The SMART error log also indicates an LBA failure at the 26000 hour mark
> (which is 16 hours prior to when you did smartctl -a /dev/ad2).  Whether
> that LBA is the remapped one or the suspect one is unknown.  The LBA was
> 5566440.
> 
> The SMART tests you did didn't really amount to anything; no surprise.
> short and long tests usually do not test the surface of the disk.  There
> are some drives which do it on a long test, but as I said before,
> everything varies from drive to drive.
> 
> Furthermore, on this model of drive, you cannot do a surface scans via
> SMART.  Bummer.  That's indicated in the "Offline data collection
> capabilities" section at the top, where it reads:
> 
>   No Selective Self-test supported.
> 
> So you'll have to use the dd method.  This takes longer than if surface
> scanning was supported by the drive, but is acceptable.  I'll get to how
> to go about that in a moment.

FWIW, I've done a dd read of the entire suspect disk already.  Just two errors.
From the URL mentioned above:

[root@bast:~] # dd of=/dev/null if=/dev/ad2 bs=1m conv=noerror
dd: /dev/ad2: Input/output error
2717+0 records in
2717+0 records out
2848980992 bytes transferred in 127.128503 secs (22410246 bytes/sec)
dd: /dev/ad2: Input/output error
38170+1 records in
38170+1 records out
40025063424 bytes transferred in 1544.671423 secs (25911701 bytes/sec)
[root@bast:~] # 

That seems to indicate two problems.  Are those the values I should be using
with dd?

Re: bad sector in gmirror HDD

2011-08-20 Thread Dan Langille

On Aug 20, 2011, at 1:54 PM, Alex Samorukov wrote:

>> [root@bast:~] # dd of=/dev/null if=/dev/ad2 bs=1m conv=noerror
>> dd: /dev/ad2: Input/output error
>> 2717+0 records in
>> 2717+0 records out
>> 2848980992 bytes transferred in 127.128503 secs (22410246 bytes/sec)
>> dd: /dev/ad2: Input/output error
>> 38170+1 records in
>> 38170+1 records out
>> 40025063424 bytes transferred in 1544.671423 secs (25911701 bytes/sec)
>> [root@bast:~] #
>> 
>> That seems to indicate two problems.  Are those the values I should be using
>> with dd?
>> 
> 


> You can run long self-test in smartmontools (-t long). Then you can get 
> failed sector number from the smartmontools (-l selftest) and then you can 
> use DD to write zero to the specific sector.

Already done: http://beta.freebsddiary.org/smart-fixing-bad-sector.php

Search for 786767

Or did you mean something else?

That doesn't seem to map to a particular sector though... I ran it for a 
while...

# time dd of=/dev/null if=/dev/ad2 bs=512 iseek=786767 
^C4301949+0 records in
4301949+0 records out
2202597888 bytes transferred in 780.245828 secs (2822954 bytes/sec)

real13m0.256s
user0m22.087s
sys 3m24.215s



> Also i am highly recommending to setup smartd as daemon and to monitor number 
> of relocated sectors. If they will grow again - then it is a good time to 
> utilize this disk.

It is running, but with nothing custom in the .conf file.
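
For completeness, a minimal sketch of the write-to-remap step Alex describes,
assuming the failing LBA from the self-test log is 5566440 on /dev/ad2 with
512-byte sectors (zeroing the sector destroys whatever the filesystem had there,
so it only makes sense right before a gmirror rebuild from the good disk):

dd if=/dev/zero of=/dev/ad2 bs=512 oseek=5566440 count=1
smartctl -A /dev/ad2 | egrep 'Reallocated_Sector_Ct|Current_Pending_Sector'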

-- 
Dan Langille - http://langille.org



Re: bad sector in gmirror HDD

2011-08-20 Thread Dan Langille

On Aug 20, 2011, at 2:04 PM, Diane Bruce wrote:

> On Sat, Aug 20, 2011 at 01:34:41PM -0400, Dan Langille wrote:
>> On Aug 19, 2011, at 11:24 PM, Jeremy Chadwick wrote:
>> 
>>> On Fri, Aug 19, 2011 at 09:39:17PM -0400, Dan Langille wrote:
> ...
>>>> Information such as this?  
>>>> http://beta.freebsddiary.org/smart-fixing-bad-sector.php
> ...
>>> 3) A very high temperature of 51C (SMART attribute 194).  If this drive
>>> is in an enclosure or in a system with no fans this would be
> 
> ...
> 
> eh? What's the temperature of the second drive?

Roughly the same:


[root@bast:/home/dan/tmp] # smartctl -a /dev/ad2 | grep -i temp
194 Temperature_Celsius 0x0022   080   076   042    Old_age   Always       -       51

[root@bast:/home/dan/tmp] # smartctl -a /dev/ad0 | grep -i temp
194 Temperature_Celsius 0x0022   081   074   042    Old_age   Always       -       49
[root@bast:/home/dan/tmp] # 


FYI, when I first set up smartd, I questioned those values.  The HDD in 
question, at the time,
did not feel hot to the touch.

> 
> ...
> 
>> This is an older system.  I suspect insufficient ventilation.  I'll look at 
>> getting
>> a new case fan, if not some HDD fans.
> 
> ...
> 
>>> I still suggest you replace the drive, although given its age I doubt
> 
> Older drive and errors starting to happen, replace ASAP.
> 
>>> you'll be able to find a suitable replacement.  I tend to keep disks
>>> like this around for testing/experimental purposes and not for actual
>>> use.
>> 
>> I have several unused 80GB HDD I can place into this system.  I think that's
>> what I'll wind up doing.  But I'd like to follow this process through and 
>> get it documented
>> for future reference.
> 
> If the data is valuable, the sooner the better. 
> It's actually somewhat saner if the two drives are not from the same lot.

Noted.

-- 
Dan Langille - http://langille.org



Re: bad sector in gmirror HDD

2011-08-20 Thread Dan Langille
On Aug 20, 2011, at 2:36 PM, Jeremy Chadwick wrote:

> Dan, I will respond to your reply sometime tomorrow.  I do not have time
> to review the Email today (~7.7KBytes), but will have time tomorrow.


No worries.  Thank you.

-- 
Dan Langille - http://langille.org



Re: bad sector in gmirror HDD

2011-08-20 Thread Dan Langille
On Aug 20, 2011, at 3:57 PM, Jeremy Chadwick wrote:

>>> I still suggest you replace the drive, although given its age I doubt
>>> you'll be able to find a suitable replacement.  I tend to keep disks
>>> like this around for testing/experimental purposes and not for actual
>>> use.
>> 
>> I have several unused 80GB HDD I can place into this system.  I think that's
>> what I'll wind up doing.  But I'd like to follow this process through and 
>> get it documented
>> for future reference.
> 
> Yes, given the behaviour of the drive I would recommend you simply
> replace it at this point in time.  What concerns me the most is
> Current_Pending_Sector incrementing, but it's impossible for me to
> determine if that incrementing means there are other LBAs which are bad,
> or if the drive is behaving how its firmware is designed.
> 
> Keep the drive around for further experiments/tinkering if you're
> interested.  Stuff like this is always interesting/fun as long as your
> data isn't at risk, so doing the replacement first would be best
> (especially if both drives in your mirror were bought at the same time
> from the same place and have similar manufacturing plants/dates on
> them).


I'm happy to send you this drive for your experimentation pleasure.

If so, please email me an address offline.  You don't have a disk with 
errors, and it seems you should have one.

After I wipe it.  I'm sure I have a destroyer CD here somewhere

-- 
Dan Langille - http://langille.org



Re: Something missing in truss

2011-12-03 Thread Dan Nelson
In the last episode (Dec 02), Eivind Evensen said:
> Does anybody else see this or know why?
> 
> The machine here is running :
> 
> > uname -a
> FreeBSD elg.hjerdalen.lokalnett 8.2-STABLE FreeBSD 8.2-STABLE #36: Wed Nov 30 
> 22:03:07 CET 2011 
> rumrunner@elg.hjerdalen.lokalnett:/usr/obj/usr/src/sys/RUM  amd64
> 
> While trying to weed out some firefox problems, I've noticed
> that truss doesn't recognise certain syscalls :
> 
> getpid()   = 1519 (0x5ef)
> clock_gettime(4,{48496.335142903 })= 0 (0x0)
> kevent(20,{0x23,EVFILT_READ,EV_ADD,0,0x0,0x809ec9d80},1,{0x15,EVFILT_READ,0x0,0,0x1,0x809ec9e80},64,0x0)
>  = 1 (0x1)
> clock_gettime(4,{48496.335293202 })= 0 (0x0)
> read(21,"\0",1)= 1 (0x1)
> clock_gettime(4,{48496.335382599 })= 0 (0x0)
> umask(0x80a52ee20,0x8,0x0,0x80a52ee00,0x7f1f9eb0,0x80a52ee00) = 116 (0x74)
> -- UNKNOWN SYSCALL -14704864 --
> syscall(0x7f1f9ec0,0x0,0x18745,0x7f1f9eb0,0x1,0x7f1f9e90) = 454 
> (0x1c6)
> umask(0x80a52ee20,0x8,0x0,0x80a52ee00,0x7f1f9eb0,0x80a52ee00) = 116 (0x74)
> -- UNKNOWN SYSCALL -14704864 --
> syscall(0x7f1f9ec0,0x0,0x18745,0x7f1f9eb0,0x1,0x7f1f9e90) = 454 
> (0x1c6)
> umask(0x80a52ee20,0x8,0x0,0x80a52ee00,0x7f1f9eb0,0x80a52ee00) = 116 (0x74)
> -- UNKNOWN SYSCALL -14704864 --
> syscall(0x7f1f9ec0,0x0,0x18745,0x7f1f9eb0,0x1,0x7f1f9e90) = 454 
> (0x1c6)
> umask(0x80a52ee20,0x8,0x0,0x80a52ee00,0x7f1f9eb0,0x80a52ee00) = 116 (0x74)
> -- UNKNOWN SYSCALL -14704864 --
> syscall(0x7f1f9ec0,0x0,0x18745,0x7f1f9eb0,0x1,0x7f1f9e90) = 454 
> (0x1c6)
> umask(0x80a52ee20,0x8,0x0,0x80a52ee00,0x7f1f9eb0,0x80a52ee00) = 116 (0x74)
> -- UNKNOWN SYSCALL -14704864 --
> syscall(0x7f1f9ec0,0x0,0x18745,0x7f1f9eb0,0x1,0x7f1f9e90) = 454 
> (0x1c6)

Two problems: truss gets confused when you attach to a process that's
currently executing a syscall, and it gets even more confused when you have
a threaded process waiting in many syscalls at once.

The following patch fixes problem #1, but problem #2 involves keeping more
per-thread state and ends up touching a lot of the truss code.  See
http://www.evoy.net/FreeBSD/truss.diff for one solution (and more syscall
decodes).

Index: setup.c
===================================================================
--- setup.c (revision 228242)
+++ setup.c (working copy)
@@ -202,8 +202,10 @@
 		find_thread(info, lwpinfo.pl_lwpid);
 		switch(WSTOPSIG(waitval)) {
 		case SIGTRAP:
-			info->pr_why = info->curthread->in_syscall?S_SCX:S_SCE;
-			info->curthread->in_syscall = 1 - info->curthread->in_syscall;
+			if ((lwpinfo.pl_flags&(PL_FLAG_SCE|PL_FLAG_SCX)) == 0)
+				err(1,"pl_flags=%x contains neither PL_FLAG_SCE or PL_FLAG_SCX", lwpinfo.pl_flags);
+			info->pr_why = (lwpinfo.pl_flags&PL_FLAG_SCE) ? S_SCE:S_SCX;
+			info->curthread->in_syscall = (info->pr_why == S_SCE) ? 1:0;
 			break;
 		default:
 			info->pr_why = S_SIG;


-- 
Dan Nelson
dnel...@allantgroup.com


Re: kernel: negative sbsize for uid = 0

2011-12-14 Thread Dan Nelson
In the last episode (Dec 13), Doug Barton said:
> I'm running 8.2-RELEASE-p4 i386 on some web servers that are generally
> lightly-moderately loaded, but occasionally see some heavy spikes where
> load average goes way up.  When that is happening, but sometimes even when
> it's not, I get hundreds of this message spewing into the logs:
> 
> kernel: negative sbsize for uid = 0
> 
> I haven't found anything particularly useful by searching for that
> message, the one reference was to mbufs, but that seems not to be the
> problem.  Here is the output of 'netstat -m' during one of the load
> spikes:
[...]
> So is this message something to worry about? If so, how can I diagnose
> what's happening, and how do I fix it?

I've seen it ocassionally too.  The error message is printed in
/sys/kern/kern_resource.c when the ui_sbsize resource counter goes negative. 
There's probably insufficient locking somewhere in the functions that call
chgsbsize.  The increment/decrement is done atomically, but the data pointed
to by the "hiwat" argument is read then updated later without an explicit
lock, so if that value changes while the function is executing, it could
cause problems.  ui_sbsize is only used by the resource limiting code,
though, so unless you're enforcing an sbsize rlimit, it should be harmless.

-- 
Dan Nelson
dnel...@allantgroup.com


Re: swi4: clock taking 40% cpu?!?

2011-12-15 Thread Dan Nelson
In the last episode (Dec 15), Jeremy Chadwick said:
> On Thu, Dec 15, 2011 at 12:51:28PM -0800, Doug Barton wrote:
> > Web server under heavy'ish load (7 on a 2 cpu system) running
> > 8.2-RELEASE-p4 i386 I'm seeing this:
> > 
> >  PID USERNAME  PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> >   12 root      -32    -     0K   112K WAIT    0 129:01 39.99% {swi4: clock}
> > 
> > Any ideas why the clock should be taking so much cpu? HZ=100 if that
> > makes a difference ...
> 
> Could be wrong, but I believe this correlates with IRQ 4.  What does
> vmstat -i show for a total and rate for irq4 if you run it, wait a few
> seconds, then run it again?  Does the number greatly/rapidly increase?

That would be "irq4" in that case, though.  "swi4" is just a software
interrupt thread, and "clock" is the softclock callout handler.  There are
both KTR and DTrace logging functions in kern_timeout.c, so you could use
either one to get a handle on what's eating your CPU.  Busy-looping
"procstat -k 12" for a few seconds might get you some useful stacks, as
well.
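
A minimal sketch of the DTrace route, assuming DTrace support is compiled into the
kernel; this just samples kernel stacks for ten seconds and prints the twenty
hottest, which should show what the callouts are doing:

dtrace -n 'profile-997 /arg0 != 0/ { @[stack()] = count(); }
           tick-10s { trunc(@, 20); printa(@); exit(0); }'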
 
> Shot in the dark here, but the only thing I can think of that might
> cause this is software being extremely aggressive with calls to things
> like gettimeofday(2) or clock_gettime(2).  Really not sure.  ntpd maybe
> (unlikely but possible)?  Sort of grasping at straws here.

-- 
Dan Nelson
dnel...@allantgroup.com


ACPI broke going from 8 to 9

2011-12-31 Thread Dan Allen
(** Originally posted to freebsd-curr...@freebsd.org but I noticed that there 
are 9.0 RC3 questions here on freebsd-stable; I am not sure which forum is 
appropriate. **)

I just upgraded my Dell OptiPlex GX270 from RELENG_8 to RELENG_9.  The machine 
no longer boots.  However, if I put

 hint.acpi.0.disabled=1

in /boot/loader.conf then the machine runs fine.  With RELENG_8 the machine had 
no loader.conf, and the power button worked on my desktop machine.  Now with 
ACPI disabled my power button does not work.  I have found that the machine 
hangs at boot during a scan of the PCI bus, but if I disable that 
(hw.acpi.disable=pci) then the machine cannot find a boot drive.

So I have lost functionality that worked fine in BSD 8.

Thoughts?  Suggestions?

Thanks,

Dan Allen


Re: ACPI broke going from 8 to 9

2011-12-31 Thread Dan Allen

On 31 Dec 2011, at 10:57 AM, Jeremy Chadwick wrote:

> Do you have a necessary reason to upgrade to 9 given this situation?
> Given the conditions I would stay you should stay with 8.

This philosophy seems wrong, but it may be the way to go.

My Toshiba Satellite U205 used to work great with RELENG_7, but the boot code 
of RELENG_8 will not recognize the 2nd core of my Core Duo (not Core 2 Duo) 
processor.  Nobody seems to care as few machines have Core Duo, or few people 
use this era of Toshiba BIOS, or whatever.

Now my Dell GX270 ACPI code is pre 2.0 (so Garrett tells me), so RELENG_9 is 
out.

I guess I should run all of my older machines on RELENG_7 but -- and this is 
where the philosophy you suggest seems wrong -- I still want the latest apps, 
security fixes, etc.  If the stable tree updates ls or tcsh or awk, I want 
these, but the core OS seems to have moved on from 2004 machines.

In other words, there is no tree for me.

Dan



Re: ACPI broke going from 8 to 9

2011-12-31 Thread Dan Allen

On 31 Dec 2011, at 1:01 PM, Adrian Chadd wrote:

> So what I can only suggest is that you build and boot a variety of
> -HEAD kernels. Start with HEAD from say, Jan 1 2011. Boot it, see if
> it works. If it doesn't, go back 3 months at a time. If it does, go
> forward three months until it breaks.

Fair enough.  I will see what I can accomplish.  Thanks!

Dan




Re: ACPI broke going from 8 to 9

2011-12-31 Thread Dan Allen

On 31 Dec 2011, at 12:34 PM, Garrett Cooper wrote:

> Not yet. Add 'nooptions NEW_PCIB' to your KERNCONF, recompile, and
> try booting the new kernel. See if this works.

It worked!  No hang, power button works.  Nice.  I hope this experimental 
option stays in.
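
For anyone else who hits this, a minimal sketch of what that change looks like,
assuming a custom i386 config is used (the GX270 name is a placeholder):

# /usr/src/sys/i386/conf/GX270
include GENERIC
ident   GX270
nooptions NEW_PCIB

cd /usr/src
make buildkernel KERNCONF=GX270 && make installkernel KERNCONF=GX270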

Thank you everyone for your help.  Happy New Years!

Dan



Re: ACPI broke going from 8 to 9

2011-12-31 Thread Dan Allen

On 31 Dec 2011, at 4:31 PM, Jeremy Chadwick wrote:

> In the meantime: Dan, when you say in your original mail, "I just
> upgraded my Dell OptiPlex GX270 from RELENG_8 to RELENG_9", can you
> please provide uname -a output from the system when it was running
> RELENG_8?  I'm looking specifically for the exact time when the kernel
> was built

Almost every day I csup from RELENG_x and build.  The traces of RELENG_8 are 
gone, so no, unfortunately I cannot give you a uname -a from those days.

However, I have a build log file, and I see that I moved from RELENG_8 to 
RELENG_9 on Friday, Dec 23, 2011.  I csup'd at 12:24:26 MST and discovered the 
failure at 15:41 MST.

This "nooptions NEW_PCIB" fix does seem rather tenuous if it is not documented. 
 Wouldn't a better route be something like

  if (ACPI < 2.0)
          oldCode();
  else
          newCodeForNewACPI();

so that it will always work for everyone without having to build a special 
kernel?  After all, I went from a working system to a hung system which is not 
the best upgrade path... ;-)

Dan





Re: 9-stable from i386 to amd64

2012-02-10 Thread Dan Nelson
In the last episode (Feb 10), Randy Bush said:
> is there a recipe for moving from i386 to amd64?
> 
> on a very remote system, i made the migration from 7.4 to 8.2 to 9.0, all
> 32-bit.  it was done with repeated
> 
> make buildworld
> make kernel.new [0]
> nextboot -k kernel.new
> reboot
> make installworld
> etc
> 
> [0] - well, there were some mv(1)s in there :)
> 
> so after it was happy with 9.0 i386, i went to move to amd64 with
> 
> make buildworld TARGET=amd64
> make kernel TARGET=amd64 DESTDIR=kernel.new [0]
> nextboot -k kernel.new
> reboot
> 
> it did not come back from the reboot, and required a manual reset.  i have
> no console access to the machine, not my choice.
> 
> clue bat please.

You probably got bit by a mismatched /libexec/ld-elf.so. The kernel expects
that to be the "native" version, and on a 64-bit kernel it also expects a
ld-elf32.so to be the "compat" 32-bit version.  When you rebooted onto the
64-bit kernel, it couldn't find /libexec/ld-elf32.so to run any of the
32-bit binaries on the system.  My guess is that your reboot attempt died in
/sbin/init, prompting for a path to /bin/sh.  If you compiled with a static
/bin/sh for performance, it probably died very early in /etc/rc.

I think copying ld-elf.so over to ld-elf32.so might have been all you needed
to boot, but that would end up with a 64-bit kernel running a true 32-bit
userland with all the libraries in the "wrong" place, and your
"installworld" step would replace them with their 64-bit equivalents and
your install would die halfway through, leaving you with a large mess to
clean up.

The cleanest upgrade path is to prepare your 32-bit root to be bootable by
both 32- and 64-bit kernels: copy the ld-elf32.so that was built during your
buildworld over to /libexec/ld-elf32.so, and also make copies of
/lib and /usr/lib to /lib32 and /usr/lib32 respectively.  That way when you
reboot to a 64-bit kernel, your 32-bit executables will be running
"correctly" out of compat32 paths and your installworld should succeed.

When I did all this on a local system, I made judicious use of ZFS snapshots
and clones, preserving a bootable clone of my original system plus
intermediate versions all the way until I was happy with the result.  I've
never done it completely remotely, but if you do a trial run or two on a
local machine or VM, you should be able to do it confidently remotely.

-- 
Dan Nelson
dnel...@allantgroup.com


Re: Floppy disks don't work with FreeBSD 9.0

2012-03-27 Thread Dan Daley


Pretty sure I have a brand new, never used floppy drive laying around 
that I could send to you :)  I don't think I have any discs though.


On 03/27/2012 17:03, Mark Felder wrote:

On Tue, 27 Mar 2012 16:48:26 -0500, Thomas Laus  wrote:



It looks like we both have confirmed that the floppy disk operation 
works up

to FreeBSD 8.3 RC1.  I will need to file a PR for FreeBSD 9.0 in the bug
system.
Thanks for the help.



Could this be related to CAM system issues that shipped with FreeBSD 
9.0 and were fixed in -STABLE? Like the CDROM issues? I'd probably 
test in -STABLE first. Unfortunately I don't have any floppy drives to 
test this with or I'd lend a hand.

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: ntpd couldn't resolve host name on system boot

2012-05-19 Thread Dan Daley


I haven't had a chance to read through this entire thread yet, but 
wanted to post this in case it helps someone.


ntpd was working fine for me for a while, and then I started getting 
this exact same error.  After a few weeks, I finally started 
troubleshooting and it turned out that I had, at some point, commented 
this out in my rc.conf


defaultrouter="192.168.1.1"

Everything else seemed to work fine (I use DHCP, so assume that it 
figured out the router from that).  Once I uncommented that line, ntpd 
started working again.  Maybe the netwait and/or dhcp sync would allow 
ntpd to work without having to specify a default router in rc.conf, but 
I haven't played with that yet.
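
For anyone who would rather lean on netwait than hard-code a default router,
the knobs live in rc.conf; something along these lines (the address and
interface here are examples only):

  netwait_enable="YES"
  netwait_ip="192.168.1.1"    # an address that should answer pings once the link is up
  netwait_if="em0"            # interface to wait on before the rest of rc runs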




On 05/18/2012 08:28, Matthew Doughty wrote:

Hello Bjoern,

It's still a problem for me.  Here is a list of my system information:

Hostname freenas.local  FreeNAS Build FreeNAS-8.0.4-RELEASE-p1-x64 (11059)
Platform AMD Athlon(tm) II X2 250 Processor  Memory 8144MB  System Time Fri
May 18 09:07:22 2012  Uptime 9:07AM up 6 mins, 0 users  Load Average 0.00,
0.21, 0.16  OS Version FreeBSD 8.2-RELEASE-p6
Are you refering to the OS version? 9.0 or 8.3?  Looks like I'm using 8.2.

Best regards,
Matthew


On 17 May 2012 16:48, Bjoern A. Zeeb  wrote:


On 17. May 2012, at 15:03 , Matthew Doughty wrote:


Dear Jeremy,

Whilst searching for a solution to a problem, I found your post:


http://lists.freebsd.org/pipermail/freebsd-stable/2011-October/064350.html

Please could you explain how I can implement the netwait script to solve
the problem?  I'm new to freenas/BSD but am willing to try working from

the

Cmd line.

ntpd in head 9.0 and later and 8.3 and later should not exhibit that
problem
anymore as it was fixed.   Could you please tell me if that is not the
case?

/bz

--
Bjoern A. Zeeb You have to have visions!
   It does not matter how good you are. It matters what good you do!





___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Netflix's New Peering Appliance Uses FreeBSD

2012-06-05 Thread Dan Daley

I didn't see a link to this information in the e-mail below.  I found this info 
detailed here: 


https://signup.netflix.com/openconnect/software






From: Benjamin Francom 
To: freebsd-stable@freebsd.org
Sent: Tue, June 5, 2012 11:00:01 AM
Subject: Netflix's New Peering Appliance Uses FreeBSD

I just saw this, and thought I'd share:

Open Connect Appliance Software

Netflix delivers streaming content using a combination of intelligent
clients, a central control system, and a network of Open Connect appliances.

When designing the Open Connect Appliance Software, we focused on these
fundamental design goals:

   - Use of Open Source software
   - Ability to efficiently read from disk and write to network sockets
   - High-performance HTTP delivery
   - Ability to gather routing information via BGP

Operating System

For the operating system, we use FreeBSD  version
9.0. This was selected for its balance of stability and features, a strong
development community and staff expertise. We will contribute changes we
make as part of our project to the community through the FreeBSD committers
on our team.
Web server

We use the nginx  web server for its proven
scalability and performance. Netflix audio and video is served via HTTP.
Routing intelligence proxy

We use the BIRD Internet routing daemon  to enable
the transfer of network topology from ISP networks to the Netflix control
system that directs clients to sources of content.
Acknowledgements

We would like to express our thanks to the FreeBSD community, the
nginx community, and Ondrej and the BIRD team for providing excellent open
source software. We also work directly with Igor, Maxim, Andrew, Sergey,
Ruslan and the rest of the team at nginx.com , who
provide superb development support for our project."

-- 
Benjamin Francom
Information Technology Professional
http://www.benfrancom.com
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Netflix's New Peering Appliance Uses FreeBSD

2012-06-05 Thread Dan Daley


Maybe their knowledge of Linux drove them to use FreeBSD.  Sorry, 
couldn't resist ;)


On 06/05/2012 19:42, David Magda wrote:

On Jun 5, 2012, at 20:16, Scott Long wrote:


If you have any questions, let me know or follow the information links on the
OpenConnect web site.

Out of curiosity, given that Linux seems popular in so many other places 
(Google, Facebook), is there any particular reason why FreeBSD was chosen for 
this?

I'm sure Linux is used in many other places (much of Netflix's IT 
infrastructure is on Amazon IIRC), so I'm kind of surprised that they went with 
FreeBSD when they probably already have so much knowledge with Linux.

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Documenting 'make config' options

2012-06-06 Thread Dan Daley


I usually use portmaster to install ports. The options dialogs that pop 
up are often for dependencies. The options dialog gives the name of the 
port for which the options are being selected, but no description or 
indication as to why this is being installed (this could be a dependency 
of a dependency of some dependency of the port I am installing). It's 
probably too much for this dialog to show why this port is being 
installed (what other port required this port that is being installed), 
but a description of what this current port is would be helpful.


But, if possible, some breadcrumb across the top showing the 
dependencies which prompted this install would be great:


Port A --> Port B --> Port C --> Current Port for which options are 
being chosen




On 06/06/2012 21:47, Charles Sprickman wrote:

On Jun 6, 2012, at 7:43 PM, Warren Block wrote:


On Wed, 6 Jun 2012, Vincent Hoffman wrote:


On 06/06/2012 22:23, Glen Barber wrote:

On Wed, Jun 06, 2012 at 02:14:46PM -0700, Doug Barton wrote:

On 06/06/2012 11:59, Dave Hayes wrote:

I'm describing more of a use case here, not attempting to specify an
implementation. If a user invokes 'make', a window is presented to them
with various options. It's probably very common that this is met with an
initial reaction of "what the hell do these do?", even from the most
seasoned of admins (presuming they are unfamiliar with the software they
have been asked to install). I claim it would be an improvement to have
that information at the fingertips of the make invoker.

What manner of providing this information would meet your needs?


IMHO, something informing what "THAT" is in devel/subversion option
MOD_DONTDOTHAT would be nice.  :)


Not something I had bothered looking up till now as I hadnt wanted to
use it but the 2nd hit on google,
http://lists.freebsd.org/pipermail/freebsd-ports-bugs/2009-April/161673.html
describes it quite well.
I tend to go with, If i dont know what it is, and its not default, I
probably dont need it.
Unless it looks interesting, then I google it ;)

Maybe an (optional) new file with a longer descriptions of the make
options so as not to crowd the make config dialog?
I dont mind looking up compile time options for software I am installing
but I can see how having a precis available locally might be handy.

Here's an idea: if the description is too long to show in the very limited space, cut it off, 
show a "...", and show the entire description in a two- or three-line text box below 
the main one.  The > and < indicate a highlight here:

  ---
  >[ ] GOOFY Build with support for the...<
   [ ] EXAMPLES  Install the examples

  ---
   <   OK>   

   -
   Build with support for the GOOFY framework
   that provides concurrent whoopsies integrated
   with a Perubython interpreter, and stuff.
  ---

The description at the bottom is from whatever option is currently highlighted, 
and changes as the user scrolls through the options.  It would be blank if the 
entire description could be displayed in the space available above.

The advantage of this is that it would work with existing ports, and give the 
ability to use longer descriptions.  The disadvantage is that dialog(1) would 
probably need modifications.
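
For what it's worth, dialog(1) already has an --item-help mode that puts a
per-item help string on the bottom line, which is close to the mockup above;
a rough sketch, assuming the dialog in use supports it (option names and
text invented for illustration):

  dialog --item-help --checklist "Options for foo-1.0" 15 70 6 \
      GOOFY    "Build with GOOFY support" off \
               "Build with support for the GOOFY framework, whoopsies and all" \
      EXAMPLES "Install the examples"     on \
               "Install example configuration files under share/examples/foo"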

If we're talking about changing dialog(1), let's make sure there's also an "uncheck 
all"/"check all" option.

I'm looking at you, ghostscript:

 wc -l /var/db/ports/ghostscript9/options
  315 /var/db/ports/ghostscript9/options

Charles


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

--
Charles Sprickman
NetEng/SysAdmin
Bway.net - New York's Best Internet www.bway.net
sp...@bway.net - 212.655.9344





___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Documenting 'make config' options

2012-06-08 Thread Dan Daley

Thanks.  I'll check this script out.





From: Oliver Fromme 
To: freebsd-stable@FreeBSD.ORG; Dan Daley ; Charles 
Sprickman 
; Warren Block ; Vincent Hoffman 

Sent: Fri, June 8, 2012 2:47:37 AM
Subject: Re: Documenting 'make config' options

Dan Daley  wrote:
> I usually use portmaster to install ports. The options dialogs that pop 
> up are often for dependencies. The options dialog gives the name of the 
> port for which the options are being selected, but no description or 
> indication as to why this is being installed (this could be a dependency 
> of a dependency of some dependency of the port I am installing). It's 
> probably too much for this dialog to show why this port is being 
> installed (what other port required this port that is being installed), 
> but a description of what this current port is would be helpful.
> 
> But, if possible, some breadcrumb across the top showing the 
> dependencies which prompted this install would be great:
> 
> Port A --> Port B --> Port C --> Current Port for which options are 
> being chosen

You might want to have a look at my "portup" script.  It can
be used to install ports, and the -w option causes it to use
a split-screen display:  The bottom 80% contain the usual
output from "make", and the top 20% show the progress of the
build, including information about dependencies.  This might
be exactly the "breadcrumb across the top" that you requested.

You can download the current version from here:

http://www.secnetix.de/olli/scripts/portup

For FreeBSD >= 8.x, the -w option requires the "window" port
to be installed (from /usr/ports/misc/window) which was removed
from the base system in FreeBSD 8.x.

Usage for installing ports is simple:

# cd /usr/ports/category/foo
# portup -wy .

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

I suggested holding a "Python Object Oriented Programming Seminar",
but the acronym was unpopular.
-- Joseph Strout

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Make ZFS auto-destroy snapshots when the out of space?

2010-05-29 Thread Dan Nelson
In the last episode (May 29), Kirk Strauser said:
> I found some nice scripts to regularly snapshot all the filesystems in my
> ZFS pool at
> http://www.neces.com/blog/technology/integrating-freebsd-zfs-and-periodic-snapshots-and-scrubs
> . One thing bothers me, though: I have to intentionally set how many 
> months' worth of snapshots I want to keep. Too many and I run out of 
> room. Too few and I lose some of the benefits of easy recovery of 
> deleted data. My computer is better at bookkeeping than I am, so why not 
> let it?
> 
> I'd propose standardizing on an attribute like
> org.freebsd:allowautodestroy.  Modify ZFS's disk full behavior to scan for
> snapshots with that attribute set and destroy the oldest one, and continue
> until there's enough free space to complete a write requests or until out
> of "expendable" snapshots to destroy (at which time the normal disk full
> handler would run).  Also run a daily periodic script to ensure that the
> free space stays below a configurable threshold each day so that ZFS isn't
> constantly butting up against completely full drives.

If the kernel does the snapshot deleting itself, why not add a pool-level
property that sets the amount of free space at which the deletion starts? 
That way you don't need the cleanup script.  Alternatively, make the
org.freebsd:allowautodestroy property hold the trigger freespace amount. 
That way you can have monthly/daily/hourly snapshots but set it so the
hourly ones disappear first, then the dailies (by setting the destroy
trigger slightly higher for the ones you want to expire first).
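
The auto-destroy behaviour itself would be new, but user properties (any
property name containing a colon) can already be set and read on a
reasonably recent ZFS, so the knob could be carried per dataset today; a
sketch, with an invented dataset and threshold:

  zfs set org.freebsd:allowautodestroy=10G tank/backups
  zfs get org.freebsd:allowautodestroy tank/backups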

-- 
Dan Nelson
dnel...@allantgroup.com
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Why is NFSv4 so slow? (root/toor)

2010-06-29 Thread Dan Nelson
In the last episode (Jun 29), Rick C. Petty said:
> On Tue, Jun 29, 2010 at 10:20:57AM -0500, Adam Vande More wrote:
> > On Tue, Jun 29, 2010 at 9:58 AM, Rick Macklem  wrote:
> > 
> > > I suppose if the FreeBSD world feels that "root" and "toor" must both
> > > exist in the password database, then "nfsuserd" could be hacked to
> > > handle the case of translating uid 0 to "root" without calling
> > > getpwuid().  It seems ugly, but if deleting "toor" from the password
> > > database upsets people, I can do that.
> > 
> > I agree with Ian on this.  I don't use toor either, but have seen people
> > use it, and sometimes it will get recommended here for various reasons
> > e.g.  running a root account with a different default shell.  It
> > wouldn't bother me having to do this provided it was documented, but
> > having to do so would be a POLA violation to many users I think.
> 
> To be fair, I'm not sure this is even a problem.  Rick M. only suggested
> it as a possibility.  I would think that getpwuid() would return the first
> match which has always been root.  At least that's what it does when
> scanning the passwd file; I'm not sure about NIS.  If someone can prove
that this will cause a problem with NFSv4, we could consider hacking it.
> Otherwise I don't think we should change this behavior yet.

If there are multiple users that map to the same userid, nscd on Linux will
select one name at random and return it for getpwuid() calls.  I haven't
seen this behaviour on FreeBSD or Solaris, though.  They always seem to
return the first entry in the passwd file.
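
A quick way to see which name a given box hands back for uid 0 (and so,
presumably, what an nfsuserd-style getpwuid() lookup would see):

  $ id -nu 0
  root
  $ getent passwd 0
  root:*:0:0:Charlie &:/root:/bin/csh

That's the output from a stock password file; a reordered file or an odd
NIS map may well answer differently.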

-- 
Dan Nelson
dnel...@allantgroup.com
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Authentication tried for XXX with correct key but not from a permitted host

2010-07-10 Thread Dan Langille

This is more for the record than asking a specific question.

Today I upgraded a system to FreeBSD 8.1-PRERELEASE.  Then I started 
seeing these messages when I ssh to said box with an ssh-agent enabled 
connection:


Jul 11 03:43:06 ngaio sshd[30290]: Authentication tried for dan with 
correct key but not from a permitted host (host=laptop.example.org, 
ip=10.0.0.100).


Jul 11 03:43:07 ngaio sshd[30290]: Authentication tried for dan with 
correct key but not from a permitted host (host=laptop.example.org, 
ip=10.0.0.100).


Jul 11 03:43:07 ngaio sshd[30290]: Accepted publickey for dan from 
10.0.0.100 port 53525 ssh2


My questions were:

1 - how do I set a permitted host?
2 - why is the message logged twice?

That asked, I know if I move the key to the top of the 
~/.ssh/authorized_keys file, the message is no longer logged. Further 
investigation reveals that if a line of the form:


from="10..etc"

appears before the key being used to log in, the message will appear.

Solution: move the from= line to the  bottom of the file.  Ugly, but it 
works.
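
In other words, an authorized_keys laid out like this avoids the warning
(keys truncated, and the from= pattern is only an example):

  # unrestricted key first
  ssh-rsa AAAAB3Nza...xyz dan@laptop
  # host-restricted keys after it
  from="10.0.0.*" ssh-rsa AAAAB3Nza...abc dan@desktop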


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Problems replacing failing drive in ZFS pool

2010-07-19 Thread Dan Langille

On 7/19/2010 12:15 PM, Freddie Cash wrote:

On Mon, Jul 19, 2010 at 8:56 AM, Garrett Moore  wrote:

So you think it's because when I switch from the old disk to the new disk,
ZFS doesn't realize the disk has changed, and thinks the data is just
corrupt now? Even if that happens, shouldn't the pool still be available,
since it's RAIDZ1 and only one disk has gone away?


I think it's because you pull the old drive, boot with the new drive,
the controller re-numbers all the devices (ie da3 is now da2, da2 is
now da1, da1 is now da0, da0 is now da6, etc), and ZFS thinks that all
the drives have changed, thus corrupting the pool.  I've had this
happen on our storage servers a couple of times before I started using
glabel(8) on all our drives (dead drive on RAID controller, remove
drive, reboot for whatever reason, all device nodes are renumbered,
everything goes kablooey).


Can you explain a bit about how you use glabel(8) in conjunction with 
ZFS?  If I can retrofit this into an exist ZFS array to make things 
easier in the future...


8.0-STABLE #0: Fri Mar  5 00:46:11 EST 2010

]# zpool status
  pool: storage
 state: ONLINE
 scrub: none requested
config:

NAME        STATE     READ WRITE CKSUM
storage     ONLINE       0     0     0
  raidz1    ONLINE       0     0     0
    ad8     ONLINE       0     0     0
    ad10    ONLINE       0     0     0
    ad12    ONLINE       0     0     0
    ad14    ONLINE       0     0     0
    ad16    ONLINE       0     0     0


Of course, always have good backups.  ;)


In my case, this ZFS array is the backup.  ;)

But I'm setting up a tape library, real soon now

--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Problems replacing failing drive in ZFS pool

2010-07-21 Thread Dan Langille

On 7/21/2010 2:54 AM, Charles Sprickman wrote:

On Wed, 21 Jul 2010, Charles Sprickman wrote:


On Tue, 20 Jul 2010, alan bryan wrote:




--- On Mon, 7/19/10, Dan Langille  wrote:


From: Dan Langille 
Subject: Re: Problems replacing failing drive in ZFS pool
To: "Freddie Cash" 
Cc: "freebsd-stable" 
Date: Monday, July 19, 2010, 7:07 PM
On 7/19/2010 12:15 PM, Freddie Cash
wrote:
> On Mon, Jul 19, 2010 at 8:56 AM, Garrett
Moore wrote:
>> So you think it's because when I switch from the
old disk to the new disk,
>> ZFS doesn't realize the disk has changed, and
thinks the data is just
>> corrupt now? Even if that happens, shouldn't the
pool still be available,
>> since it's RAIDZ1 and only one disk has gone
away?
> > I think it's because you pull the old drive, boot with
the new drive,
> the controller re-numbers all the devices (ie da3 is
now da2, da2 is
> now da1, da1 is now da0, da0 is now da6, etc), and ZFS
thinks that all
> the drives have changed, thus corrupting the
pool.  I've had this
> happen on our storage servers a couple of times before
I started using
> glabel(8) on all our drives (dead drive on RAID
controller, remove
> drive, reboot for whatever reason, all device nodes
are renumbered,
> everything goes kablooey).

Can you explain a bit about how you use glabel(8) in
conjunction with ZFS?  If I can retrofit this into an
exist ZFS array to make things easier in the future...

8.0-STABLE #0: Fri Mar  5 00:46:11 EST 2010

]# zpool status
  pool: storage
state: ONLINE
scrub: none requested
config:

NAME        STATE     READ WRITE CKSUM
storage     ONLINE       0     0     0
  raidz1    ONLINE       0     0     0
    ad8     ONLINE       0     0     0
    ad10    ONLINE       0     0     0
    ad12    ONLINE       0     0     0
    ad14    ONLINE       0     0     0
    ad16    ONLINE       0     0     0

> Of course, always have good backups.  ;)

In my case, this ZFS array is the backup.  ;)

But I'm setting up a tape library, real soon now

-- Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org
mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to
"freebsd-stable-unsubscr...@freebsd.org"



Dan,

Here's how to do it after the fact:

http://unix.derkeiler.com/Mailing-Lists/FreeBSD/current/2009-07/msg00623.html



Two things:

-What's the preferred labelling method for disks that will be used
with zfs these days? geom_label or gpt labels? I've been using the
latter and I find them a little simpler.

-I think that if you already are using gpt partitioning, you can add a
gpt label after the fact (ie: gpart -i index# -l your_label adaX).
"gpart list" will give you a list of index numbers.


Oops.

That should be "gpart modify -i index# -l your_label adax".


I'm not using gpt partitioning.  I think I'd like to try that.  To do 
just that, I've ordered two more HDD.  They'll be arriving today.


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Problems replacing failing drive in ZFS pool

2010-07-21 Thread Dan Langille

On 7/19/2010 10:50 PM, Adam Vande More wrote:

On Mon, Jul 19, 2010 at 9:07 PM, Dan Langille  wrote:


I think it's because you pull the old drive, boot with the new drive,

the controller re-numbers all the devices (ie da3 is now da2, da2 is
now da1, da1 is now da0, da0 is now da6, etc), and ZFS thinks that all
the drives have changed, thus corrupting the pool.  I've had this
happen on our storage servers a couple of times before I started using
glabel(8) on all our drives (dead drive on RAID controller, remove
drive, reboot for whatever reason, all device nodes are renumbered,
everything goes kablooey).




Can you explain a bit about how you use glabel(8) in conjunction with ZFS?
  If I can retrofit this into an exist ZFS array to make things easier in the
future...



If you've used whole disks in ZFS, you can't retrofit it if by retrofit you
mean an almost painless method of resolving this.  GEOM setup stuff
generally should happen BEFORE the file system is on it.

You would create your partition(s) slightly smaller than the disk, label it,
then use the resulting device as your zfs device when creating the pool.  If
you have an existing full disk install, that means restoring the data after
you've done those steps.  It works just as well with MBR style partitioning,
there's nothing saying you have to use GPT.  GPT is just better though in
terms of ease of use IMO among other things.


FYI, this is exactly what I'm doing to do.  I have obtained addition HDD 
to serve as temporary storage.  I will also use them for practicing the 
commands before destroying the original array.  I'll post my plan to the 
list for review.


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Using GTP and glabel for ZFS arrays

2010-07-21 Thread Dan Langille

I hope my terminology is correct

I have a ZFS array which uses raw devices.  I'd rather it use glabel and 
supply the GEOM devices to ZFS instead.  In addition, I'll also 
partition the HDD to avoid using the entire HDD: leave a little bit of 
space at the start and end.


Why use glabel?

 * So ZFS can find and use the correct HDD should the HDD device ever
   get renumbered for whatever reason.  e.g. /dev/da0 becomes /dev/da6
   when you move it to another controller.

Why use partitions?

 * Primarily: two HDD of a given size, say 2TB, do not always provide
   the same amount of available space.  If you use a slightly smaller
   partition instead of the entire physical HDD, you're much more
   likely to have a happier experience when it comes time to replace an
   HDD.

 * There seems to be a consensus amongst some that leaving the start and
   end of your HDD empty is wise.  Give the rest to ZFS.

Things I've read that led me to the above reasons:

* 
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=399538+0+current/freebsd-stable
* 
http://lists.freebsd.org/pipermail/freebsd-stable/2010-February/055008.html

* http://lists.freebsd.org/pipermail/freebsd-geom/2009-July/003620.html

To test this plan, I'm going to play with just two HDD, because 
that's what I have available.  Let's assume these two HDD are ad0 and 
ad1.  I am not planning to boot from these HDD; they are for storage only.


First, create a new GUID Partition Table partition scheme on the HDD:

  gpart create -s GPT ad0


Let's see how much space we have.  This output will be used to determine 
SOMEVALUE in the next command.


  gpart show


Create a new partition within that scheme:

  gpart add -b 34 -s SOMEVALUE -t freebsd-zfs ad0

Why '-b 34'?  Randi pointed me to 
http://en.wikipedia.org/wiki/GUID_Partition_Table where it explains what 
the first 33 LBA are used for.  It's not for us to use here.


Where SOMEVALUE is the number of blocks to use.  I plan not to use all 
the available blocks but leave a few hundred MB free at the end. 
That'll allow for the variance in HDD size.



Now, label the thing:

  glabel label -v disk00 /dev/ad0

Repeat the above with ad1 to get disk01.  Repeat for all other HDD...

Then create your zpool:

 zpool create bigtank disk00 disk01 ... etc


Any suggestions/comments?  Is there any advantage to using the -l option 
on 'gpart add' instead of the glabel above?


Thanks


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-21 Thread Dan Langille

On 7/21/2010 11:05 PM, Dan Langille wrote (something close to this):


First, create a new GUID Partition Table partition scheme on the HDD:

gpart create -s GPT ad0


Let's see how much space we have. This output will be used to determine
SOMEVALUE in the next command.

gpart show


Create a new partition within that scheme:

gpart add -b 34 -s SOMEVALUE -t freebsd-zfs ad0


Now, label the thing:

glabel label -v disk00 /dev/ad0


Or, is this more appropriate?

  glabel label -v disk00 /dev/ad0s1

--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-21 Thread Dan Langille

On 7/21/2010 11:39 PM, Adam Vande More wrote:

On Wed, Jul 21, 2010 at 10:34 PM, Adam Vande More wrote:



Also if you have an applicable SATA controller, running the ahci module
with give you more speed.  Only change one thing a time though.
Virtualbox makes a great testbed for this, you don't need to allocate
the VM a lot of RAM just make sure it boots and such.


I'm not sure of the criteria, but this is what I'm running:

atapci0:  port 0xdc00-0xdc0f mem 
0xfbeffc00-0xfbeffc7f,0xfbef-0xfbef7fff irq 17 at device 4.0 on pci7


atapci1:  port 0xac00-0xac0f mem 
0xfbbffc00-0xfbbffc7f,0xfbbf-0xfbbf7fff irq 19 at device 4.0 on pci3


I added ahci_load="YES" to loader.conf and rebooted.  Now I see:

ahci0:  port 
0x8000-0x8007,0x7000-0x7003,0x6000-0x6007,0x5000-0x5003,0x4000-0x400f 
mem 0xfb3fe400-0xfb3fe7ff irq 22 at device 17.0 on pci0


Which is the onboard SATA from what I can tell, not the controllers I 
installed to handle the ZFS array.  The onboard SATA runs a gmirror 
array which handles /, /tmp, /usr, and /var (i.e. the OS).  ZFS runs 
only on on my /storage mount point.


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-22 Thread Dan Langille

On 7/22/2010 2:59 AM, Andrey V. Elsukov wrote:

On 22.07.2010 10:32, Dan Langille wrote:

I'm not sure of the criteria, but this is what I'm running:

atapci0:  port 0xdc00-0xdc0f mem
0xfbeffc00-0xfbeffc7f,0xfbef-0xfbef7fff irq 17 at device 4.0 on pci7

atapci1:  port 0xac00-0xac0f mem
0xfbbffc00-0xfbbffc7f,0xfbbf-0xfbbf7fff irq 19 at device 4.0 on pci3

I added ahci_load="YES" to loader.conf and rebooted.  Now I see:


You can add siis_load="YES" to loader.conf for SiI 3124.


Ahh, thank you.

I'm afraid to do that now, before I label my ZFS drives for fear that 
the ZFS array will be messed up.  But I do plan to do that for the 
system after my plan is implemented.  Thank you.  :)


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-22 Thread Dan Langille

On 7/22/2010 3:08 AM, Jeremy Chadwick wrote:

On Thu, Jul 22, 2010 at 03:02:33AM -0400, Dan Langille wrote:

On 7/22/2010 2:59 AM, Andrey V. Elsukov wrote:

On 22.07.2010 10:32, Dan Langille wrote:

I'm not sure of the criteria, but this is what I'm running:

atapci0:   port 0xdc00-0xdc0f mem
0xfbeffc00-0xfbeffc7f,0xfbef-0xfbef7fff irq 17 at device 4.0 on pci7

atapci1:   port 0xac00-0xac0f mem
0xfbbffc00-0xfbbffc7f,0xfbbf-0xfbbf7fff irq 19 at device 4.0 on pci3

I added ahci_load="YES" to loader.conf and rebooted.  Now I see:


You can add siis_load="YES" to loader.conf for SiI 3124.


Ahh, thank you.

I'm afraid to do that now, before I label my ZFS drives for fear
that the ZFS array will be messed up.  But I do plan to do that for
the system after my plan is implemented.  Thank you.  :)


They won't be messed up.  ZFS will figure out, using its metadata, which
drive is part of what pool despite the device name changing.


I now have:
siis0:  port 0xdc00-0xdc0f mem 
0xfbeffc00-0xfbeffc7f,0xfbef-0xfbef7fff irq 17 at device 4.0 on pci7


siis1:  port 0xac00-0xac0f mem 
0xfbbffc00-0xfbbffc7f,0xfbbf-0xfbbf7fff irq 19 at device 4.0 on pci3


And my zpool is now:

$ zpool status
  pool: storage
 state: ONLINE
 scrub: none requested
config:

NAME        STATE     READ WRITE CKSUM
storage     ONLINE       0     0     0
  raidz1    ONLINE       0     0     0
    ada0    ONLINE       0     0     0
    ada1    ONLINE       0     0     0
    ada2    ONLINE       0     0     0
    ada3    ONLINE       0     0     0
    ada4    ONLINE       0     0     0

Whereas previously, it was ad devices (see 
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=399538+0+current/freebsd-stable).


Thank you (and to Andrey V. Elsukov who posted the same suggestion at 
the same time you did).  I appreciate it.


> I don't use glabel or GPT so I can't comment on whether or not those work
> reliably in this situation (I imagine they would, but I keep seeing
> problem reports on the lists when people have them in use...)


Really?  The whole basis of the action plan I'm highlighting in this 
post is to avoid ZFS-related problems when devices get renumbered and 
ZFS is using device names (e.g. /dev/ad0) instead of labels (e.g. 
gpt/disk00).


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-22 Thread Dan Langille

On 7/22/2010 3:30 AM, Charles Sprickman wrote:

On Thu, 22 Jul 2010, Dan Langille wrote:


On 7/22/2010 2:59 AM, Andrey V. Elsukov wrote:

On 22.07.2010 10:32, Dan Langille wrote:

I'm not sure of the criteria, but this is what I'm running:

atapci0: port 0xdc00-0xdc0f mem
0xfbeffc00-0xfbeffc7f,0xfbef-0xfbef7fff irq 17 at device 4.0 on
pci7

atapci1: port 0xac00-0xac0f mem
0xfbbffc00-0xfbbffc7f,0xfbbf-0xfbbf7fff irq 19 at device 4.0 on
pci3

I added ahci_load="YES" to loader.conf and rebooted. Now I see:


You can add siis_load="YES" to loader.conf for SiI 3124.


Ahh, thank you.

I'm afraid to do that now, before I label my ZFS drives for fear that
the ZFS array will be messed up. But I do plan to do that for the
system after my plan is implemented. Thank you. :)


You may even get hotplug support if you're lucky. :)

I just built a box and gave it a spin with the "old" ata stuff and then
with the "new" (AHCI) stuff. It does perform a bit better and my BIOS
claims it supports hotplug with ahci enabled as well... Still have to
test that.


Well, I don't have anything to support hotplug.  All my stuff is internal.

http://sphotos.ak.fbcdn.net/hphotos-ak-ash1/hs430.ash1/23778_106837706002537_10289239443_171753_3508473_n.jpg



--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-22 Thread Dan Langille

On 7/22/2010 4:03 AM, Charles Sprickman wrote:

On Thu, 22 Jul 2010, Dan Langille wrote:


On 7/22/2010 3:30 AM, Charles Sprickman wrote:

On Thu, 22 Jul 2010, Dan Langille wrote:


On 7/22/2010 2:59 AM, Andrey V. Elsukov wrote:

On 22.07.2010 10:32, Dan Langille wrote:

I'm not sure of the criteria, but this is what I'm running:

atapci0: port 0xdc00-0xdc0f mem
0xfbeffc00-0xfbeffc7f,0xfbef-0xfbef7fff irq 17 at device 4.0 on
pci7

atapci1: port 0xac00-0xac0f mem
0xfbbffc00-0xfbbffc7f,0xfbbf-0xfbbf7fff irq 19 at device 4.0 on
pci3

I added ahci_load="YES" to loader.conf and rebooted. Now I see:


You can add siis_load="YES" to loader.conf for SiI 3124.


Ahh, thank you.

I'm afraid to do that now, before I label my ZFS drives for fear that
the ZFS array will be messed up. But I do plan to do that for the
system after my plan is implemented. Thank you. :)


You may even get hotplug support if you're lucky. :)

I just built a box and gave it a spin with the "old" ata stuff and then
with the "new" (AHCI) stuff. It does perform a bit better and my BIOS
claims it supports hotplug with ahci enabled as well... Still have to
test that.


Well, I don't have anything to support hotplug. All my stuff is internal.

http://sphotos.ak.fbcdn.net/hphotos-ak-ash1/hs430.ash1/23778_106837706002537_10289239443_171753_3508473_n.jpg



The frankenbox I'm testing on is a retrofitted 1U (it had a scsi
backplane, now has none).

I am not certain, but I think with 8.1 (which it's running) and all the
cam integration stuff, hotplug is possible. Is a special backplane
required? I seriously don't know... I'm going to give it a shot though.

Oh, you also might get NCQ. Try:

[r...@h21 /tmp]# camcontrol tags ada0
(pass0:ahcich0:0:0:0): device openings: 32


# camcontrol tags ada0
(pass0:siisch2:0:0:0): device openings: 31


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-22 Thread Dan Langille

On 7/22/2010 4:03 AM, Charles Sprickman wrote:

On Thu, 22 Jul 2010, Dan Langille wrote:


On 7/22/2010 3:30 AM, Charles Sprickman wrote:

On Thu, 22 Jul 2010, Dan Langille wrote:


On 7/22/2010 2:59 AM, Andrey V. Elsukov wrote:

On 22.07.2010 10:32, Dan Langille wrote:

I'm not sure of the criteria, but this is what I'm running:

atapci0: port 0xdc00-0xdc0f mem
0xfbeffc00-0xfbeffc7f,0xfbef-0xfbef7fff irq 17 at device 4.0 on
pci7

atapci1: port 0xac00-0xac0f mem
0xfbbffc00-0xfbbffc7f,0xfbbf-0xfbbf7fff irq 19 at device 4.0 on
pci3

I added ahci_load="YES" to loader.conf and rebooted. Now I see:


You can add siis_load="YES" to loader.conf for SiI 3124.


Ahh, thank you.

I'm afraid to do that now, before I label my ZFS drives for fear that
the ZFS array will be messed up. But I do plan to do that for the
system after my plan is implemented. Thank you. :)


You may even get hotplug support if you're lucky. :)

I just built a box and gave it a spin with the "old" ata stuff and then
with the "new" (AHCI) stuff. It does perform a bit better and my BIOS
claims it supports hotplug with ahci enabled as well... Still have to
test that.


Well, I don't have anything to support hotplug. All my stuff is internal.

http://sphotos.ak.fbcdn.net/hphotos-ak-ash1/hs430.ash1/23778_106837706002537_10289239443_171753_3508473_n.jpg



The frankenbox I'm testing on is a retrofitted 1U (it had a scsi
backplane, now has none).

I am not certain, but I think with 8.1 (which it's running) and all the
cam integration stuff, hotplug is possible. Is a special backplane
required? I seriously don't know... I'm going to give it a shot though.

Oh, you also might get NCQ. Try:

[r...@h21 /tmp]# camcontrol tags ada0
(pass0:ahcich0:0:0:0): device openings: 32


# camcontrol tags ada0
(pass0:siisch2:0:0:0): device openings: 31

resending with this:

ada{0..4} give the above.

# camcontrol tags ada5
(pass5:ahcich0:0:0:0): device openings: 32

That's part of the gmirror array for the OS, along with ad6 which has 
similar output.


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-22 Thread Dan Langille

On 7/22/2010 4:03 AM, Charles Sprickman wrote:

On Thu, 22 Jul 2010, Dan Langille wrote:


On 7/22/2010 3:30 AM, Charles Sprickman wrote:

On Thu, 22 Jul 2010, Dan Langille wrote:


On 7/22/2010 2:59 AM, Andrey V. Elsukov wrote:

On 22.07.2010 10:32, Dan Langille wrote:

I'm not sure of the criteria, but this is what I'm running:

atapci0: port 0xdc00-0xdc0f mem
0xfbeffc00-0xfbeffc7f,0xfbef-0xfbef7fff irq 17 at device 4.0 on
pci7

atapci1: port 0xac00-0xac0f mem
0xfbbffc00-0xfbbffc7f,0xfbbf-0xfbbf7fff irq 19 at device 4.0 on
pci3

I added ahci_load="YES" to loader.conf and rebooted. Now I see:


You can add siis_load="YES" to loader.conf for SiI 3124.


Ahh, thank you.

I'm afraid to do that now, before I label my ZFS drives for fear that
the ZFS array will be messed up. But I do plan to do that for the
system after my plan is implemented. Thank you. :)


You may even get hotplug support if you're lucky. :)

I just built a box and gave it a spin with the "old" ata stuff and then
with the "new" (AHCI) stuff. It does perform a bit better and my BIOS
claims it supports hotplug with ahci enabled as well... Still have to
test that.


Well, I don't have anything to support hotplug. All my stuff is internal.

http://sphotos.ak.fbcdn.net/hphotos-ak-ash1/hs430.ash1/23778_106837706002537_10289239443_171753_3508473_n.jpg



The frankenbox I'm testing on is a retrofitted 1U (it had a scsi
backplane, now has none).

I am not certain, but I think with 8.1 (which it's running) and all the
cam integration stuff, hotplug is possible. Is a special backplane
required? I seriously don't know... I'm going to give it a shot though.

Oh, you also might get NCQ. Try:

[r...@h21 /tmp]# camcontrol tags ada0
(pass0:ahcich0:0:0:0): device openings: 32


# camcontrol tags ada0
(pass0:siisch2:0:0:0): device openings: 31

resending with this:

ada{0..4} give the above.

# camcontrol tags ada5
(pass5:ahcich0:0:0:0): device openings: 32

That's part of the gmirror array for the OS, along with ad6 which has 
similar output.


And again with this output from one of the ZFS drives:

# camcontrol identify ada0
pass0:  ATA-8 SATA 2.x device
pass0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)

protocol  ATA/ATAPI-8 SATA 2.x
device model  Hitachi HDS722020ALA330
firmware revision JKAOA28A
serial number JK1130YAH531ST
WWN   5000cca221d068d5
cylinders 16383
heads 16
sectors/track 63
sector size   logical 512, physical 512, offset 0
LBA supported 268435455 sectors
LBA48 supported   3907029168 sectors
PIO supported PIO4
DMA supported WDMA2 UDMA6
media RPM 7200

Feature                        Support  Enabled  Value     Vendor
read ahead                     yes      yes
write cache                    yes      yes
flush cache                    yes      yes
overlap                        no
Tagged Command Queuing (TCQ)   no       no
Native Command Queuing (NCQ)   yes               32 tags
SMART                          yes      yes
microcode download             yes      yes
security                       yes      no
power management               yes      yes
advanced power management      yes      no       0/0x00
automatic acoustic management  yes      no       254/0xFE  128/0x80
media status notification      no       no
power-up in Standby            yes      no
write-read-verify              no       no       0/0x0
unload                         no       no
free-fall                      no       no
data set management (TRIM)     no


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"



Re: Using GTP and glabel for ZFS arrays

2010-07-22 Thread Dan Langille
Thank you to all the helpful discussion.  It's been very helpful and 
educational.  Based on the advice and suggestions, I'm going to adjust 
my original plan as follows.


NOTE: glabel will not be used.


First, create a new GUID Partition Table partition scheme on the HDD:

gpart create -s GPT ad0


Let's see how much space we have. This output will be used to determine
SOMEVALUE in the next command.

gpart show


Create a new partition within that scheme:

gpart add -b 1024 -s SOMEVALUE -t freebsd-zfs -l disk00 ad0

The -b 1024 ensures alignment on a 4KB boundary.

SOMEVALUE will be set so approximately 200MB is left empty at the end of 
the HDD.  That's more than necessary to accommodate the differing 
actual sizes of 2TB HDDs.


Repeat the above with ad1 to get disk01. Repeat for all other HDD...

Then create your zpool:

zpool create bigtank gpt/disk00 gpt/disk01 ... etc


This plan will be applied to an existing 5 HDD ZFS pool.  I have two new 
empty HDD which will be added to this new array (giving me 7 x 2TB HDD). 
 The array is raidz1 and I'm wondering if I want to go to raidz2.  That 
would be about 10TB and I'm only using up 3.1TB at present.  That 
represents about 4 months of backups.


I do not think I can adjust the existing zpool on the fly.  I think I 
need to copy everything elsewhere (i.e the 2 empty drives).  Then start 
the new zpool from scratch.


The risk: when the data is on the 2 spare HDD, there is no redundancy. 
I wonder if my friend Jerry has a spare 2TB HDD I could borrow for the 
evening.


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-22 Thread Dan Langille

On 7/22/2010 9:22 PM, Pawel Tyll wrote:

I do not think I can adjust the existing zpool on the fly.  I think I
need to copy everything elsewhere (i.e the 2 empty drives).  Then start
the new zpool from scratch.



You can, and you should (for educational purposes if not for fun :>),
unless you wish to change raidz1 to raidz2. Replace, wait for
resilver, if redoing used disk then offline it, wipe magic with dd
(16KB at the beginning and end of disk/partition will do), carry on
with GPT, rinse and repeat with next disk. When last vdev's replace
finishes, your pool will grow automagically.
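
For the "wipe magic with dd" step, something like this does it (ad0 and the
sector count are placeholders; take "mediasize in sectors" from
'diskinfo -v /dev/ad0' for the disk in question):

  # clear 16KB at the start of the disk
  dd if=/dev/zero of=/dev/ad0 bs=16k count=1
  # clear 16KB at the end: 3907029168 sectors on this disk, minus 32
  dd if=/dev/zero of=/dev/ad0 bs=512 seek=3907029136 count=32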


So... the smaller size won't mess things up...

--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-23 Thread Dan Langille

On 7/22/2010 9:51 PM, Pawel Tyll wrote:

So... the smaller size won't mess things up...

If by smaller size you mean smaller size of existing
drives/partitions, then growing zpools by replacing smaller vdevs
with larger ones is supported and works. What isn't supported is
basically everything else:
- you can't change number of raid columns (add/remove vdevs from raid)
- you can't change number of parity columns (raidz1->2 or 3)
- you can't change vdevs to smaller ones, even if pool's free space
would permit that.


Isn't what I'm doing breaking the last one?



Good news is these features are planned/being worked on.

If you can attach more drives to your system without disconnecting
existing drives, then you can grow your pool pretty much risk-free.


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"




--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-23 Thread Dan Langille

On 7/22/2010 8:47 PM, Dan Langille wrote:

Thank you to all the helpful discussion.  It's been very helpful and
educational. Based on the advice and suggestions, I'm going to adjust my
original plan as follows.

NOTE: glabel will not be used.


First, create a new GUID Partition Table partition scheme on the HDD:

gpart create -s GPT ad0


Let's see how much space we have. This output will be used to determine
SOMEVALUE in the next command.

gpart show


Create a new partition within that scheme:

gpart add -b 1024 -s SOMEVALUE -t freebsd-zfs -l disk00 ad0

The -b 1024 ensures alignment on a 4KB boundary.

SOMEVALUE will be set so approximately 200MB is left empty at the end of
the HDD. That's more than necessary to accommodate the differing
actual sizes of 2TB HDDs.

Repeat the above with ad1 to get disk01. Repeat for all other HDD...

Then create your zpool:

zpool create bigtank gpt/disk00 gpt/disk01 ... etc


This plan will be applied to an existing 5 HDD ZFS pool. I have two new
empty HDD which will be added to this new array (giving me 7 x 2TB HDD).
The array is raidz1 and I'm wondering if I want to go to raidz2. That
would be about 10TB and I'm only using up 3.1TB at present. That
represents about 4 months of backups.

I do not think I can adjust the existing zpool on the fly. I think I
need to copy everything elsewhere (i.e the 2 empty drives). Then start
the new zpool from scratch.

The risk: when the data is on the 2 spare HDD, there is no redundancy. I
wonder if my friend Jerry has a spare 2TB HDD I could borrow for the
evening.



The work is in progress.  Updates are at 
http://beta.freebsddiary.org/zfs-with-gpart.php which will be updated 
frequently as the work continues.


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-23 Thread Dan Langille

On 7/22/2010 9:22 PM, Pawel Tyll wrote:

I do not think I can adjust the existing zpool on the fly.  I think I
need to copy everything elsewhere (i.e the 2 empty drives).  Then start
the new zpool from scratch.



You can, and you should (for educational purposes if not for fun :>),
unless you wish to change raidz1 to raidz2. Replace, wait for
resilver, if redoing used disk then offline it, wipe magic with dd
(16KB at the beginning and end of disk/partition will do), carry on
with GPT, rinse and repeat with next disk. When last vdev's replace
finishes, your pool will grow automagically.


Pawell and I had an online chat about part of my strategy.  To be clear:

I have a 5x2TB raidz1 array.

I have 2x2TB empty HDD

My goal was to go to raidz2 by:
- copy data to empty HDD
- redo the zpool to be raidz2
- copy back the data
- add in the two previously empty HDD to the zpol

I now understand that after a raidz array has been created, you can't 
add a new HDD to it.  I'd like to, but it sounds like you cannot.


"It is not possible to add a disk as a column to a RAID-Z, RAID-Z2, or 
RAID-Z3 vdev." http://en.wikipedia.org/wiki/ZFS#Limitations


So, it seems I have a 5-HDD zpool and it's going to stay that way.





--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-23 Thread Dan Langille

On 7/23/2010 10:25 PM, Freddie Cash wrote:

On Fri, Jul 23, 2010 at 6:33 PM, Dan Langille  wrote:

Pawell and I had an online chat about part of my strategy.  To be clear:

I have a 5x2TB raidz1 array.

I have 2x2TB empty HDD

My goal was to go to raidz2 by:
- copy data to empty HDD
- redo the zpool to be raidz2
- copy back the data
- add in the two previously empty HDD to the zpol

I now understand that after a raidz array has been created, you can't add a
new HDD to it.  I'd like to, but it sounds like you cannot.

"It is not possible to add a disk as a column to a RAID-Z, RAID-Z2, or
RAID-Z3 vdev." http://en.wikipedia.org/wiki/ZFS#Limitations

So, it seems I have a 5-HDD zpool and it's going to stay that way.


You can fake it out by using sparse files for members of the new
raidz2 vdev (when creating the vdev), then offline the file-based
members so that you are running a degraded pool, copy the data to the
pool, then replace the file-based members with physical harddrives.


So I'm creating a 7-drive pool, with 5 real drive members and two 
file-based members.



I've posted a theoretical method for doing so here:
http://forums.freebsd.org/showpost.php?p=93889&postcount=7

It's theoretical as I have not investigated how to create sparse files
on FreeBSD, nor have I done this.  It's based on several posts to the
zfs-discuss mailing list where several people have done this on
OpenSolaris.


I see no downside.  There is no risk that "it won't work and I'll lose 
all the data".
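
A sketch of how that might look here, borrowing the truncate/mdconfig recipe
that comes up later in this thread (names and the size are placeholders; to
be safe, make the sparse files the same size as the freebsd-zfs partitions
so the eventual replace onto the real disks can't be refused as too small):

  truncate -s 1862g /backup/fake0
  truncate -s 1862g /backup/fake1
  mdconfig -a -t vnode -f /backup/fake0 -u 10
  mdconfig -a -t vnode -f /backup/fake1 -u 11
  zpool create bigtank raidz2 gpt/disk00 gpt/disk01 gpt/disk02 \
      gpt/disk03 gpt/disk04 md10 md11
  zpool offline bigtank md10
  zpool offline bigtank md11
  # copy the data over, then once the old disks are free:
  #   zpool replace bigtank md10 gpt/disk05
  #   zpool replace bigtank md11 gpt/disk06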


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-23 Thread Dan Langille

On 7/23/2010 10:42 PM, Daniel O'Connor wrote:


On 24/07/2010, at 11:55, Freddie Cash wrote:

It's theoretical as I have not investigated how to create sparse files
on FreeBSD, nor have I done this.  It's based on several posts to the
zfs-discuss mailing list where several people have done this on
OpenSolaris.


FYI you would do..
truncate -s 1T /tmp/fake-disk1
mdconfig -a -t vnode -f /tmp/fake-disk1

etc..

Although you'd want to determine the exact size of your real disks from geom 
and use that.



 $ dd if=/dev/zero of=/tmp/sparsefile1.img bs=1 count=0 oseek=2000G
0+0 records in
0+0 records out
0 bytes transferred in 0.25 secs (0 bytes/sec)

$ ls -l /tmp/sparsefile1.img
-rw-r--r--  1 dan  wheel  2147483648000 Jul 23 22:49 /tmp/sparsefile1.img

$ ls -lh /tmp/sparsefile1.img
-rw-r--r--  1 dan  wheel   2.0T Jul 23 22:49 /tmp/sparsefile1.img


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-23 Thread Dan Langille

On 7/23/2010 10:51 PM, Dan Langille wrote:

On 7/23/2010 10:42 PM, Daniel O'Connor wrote:


On 24/07/2010, at 11:55, Freddie Cash wrote:

It's theoretical as I have not investigated how to create sparse
files on FreeBSD, nor have I done this. It's based on several
posts to the zfs-discuss mailing list where several people have
done this on OpenSolaris.


FYI you would do.. truncate -s 1T /tmp/fake-disk1 mdconfig -a -t
vnode -f /tmp/fake-disk1

etc..

Although you'd want to determine the exact size of your real disks
from geom and use that.



$ dd if=/dev/zero of=/tmp/sparsefile1.img bs=1 count=0 oseek=2000G
0+0 records in 0+0 records out 0 bytes transferred in 0.25 secs
(0 bytes/sec)

$ ls -l /tmp/sparsefile1.img -rw-r--r-- 1 dan wheel 2147483648000
Jul 23 22:49 /tmp/sparsefile1.img

$ ls -lh /tmp/sparsefile1.img -rw-r--r-- 1 dan wheel 2.0T Jul 23
22:49 /tmp/sparsefile1.img


Going a bit further, and actually putting 30MB of data in there:


$ rm sparsefile1.img
$ dd if=/dev/zero of=/tmp/sparsefile1.img bs=1 count=0
oseek=2000G
0+0 records in
0+0 records out
0 bytes transferred in 0.30 secs (0 bytes/sec)

$ ls -lh /tmp/sparsefile1.img
-rw-r--r--  1 dan  wheel   2.0T Jul 23 22:59 /tmp/sparsefile1.img

$ dd if=/dev/zero of=sparsefile1.img bs=1M count=30 conv=notrunc
30+0 records in
30+0 records out
31457280 bytes transferred in 0.396570 secs (79323405 bytes/sec)

$ ls -l sparsefile1.img
-rw-r--r--  1 dan  wheel  2147483648000 Jul 23 23:00 sparsefile1.img

$ ls -lh sparsefile1.img
-rw-r--r--  1 dan  wheel   2.0T Jul 23 23:00 sparsefile1.img
$


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GTP and glabel for ZFS arrays

2010-07-24 Thread Dan Langille

On 7/24/2010 7:56 AM, Pawel Tyll wrote:

Easiest way to create sparse eg 20 GB assuming test.img doesn't exist
already


You trim posts too much... there is no way to compare without opening 
another email.


Adam wrote:


truncate -s 20g test.img
ls -sk test.img
1 test.img




No no no. Easiest way to do what you want to do:
mdconfig -a -t malloc -s 3t -u 0
mdconfig -a -t malloc -s 3t -u 1


In what way is that easier?  Now I have /dev/md0 and /dev/md1 as opposed 
to two sparse files.



Just make sure to offline and delete mds ASAP, unless you have 6TB of
RAM waiting to be filled ;) - note that with RAIDZ2 you have no
redundancy with two fake disks gone, and if going with RAIDZ1 this
won't work at all. I can't figure out a safe way (data redundancy all
the way) of doing things with only 2 free disks and 3.5TB data - third
disk would make things easier, fourth would make them trivial; note
that temporary disks 3 and 4 don't have to be 2TB, 1.5TB will do.


The lack of redundancy is noted and accepted.  Thanks.  :)

--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GPT and glabel for ZFS arrays

2010-07-24 Thread Dan Langille

On 7/22/2010 4:11 AM, Dan Langille wrote:

On 7/22/2010 4:03 AM, Charles Sprickman wrote:

On Thu, 22 Jul 2010, Dan Langille wrote:


On 7/22/2010 3:30 AM, Charles Sprickman wrote:

On Thu, 22 Jul 2010, Dan Langille wrote:


On 7/22/2010 2:59 AM, Andrey V. Elsukov wrote:

On 22.07.2010 10:32, Dan Langille wrote:

I'm not sure of the criteria, but this is what I'm running:

atapci0: port 0xdc00-0xdc0f mem
0xfbeffc00-0xfbeffc7f,0xfbef-0xfbef7fff irq 17 at device 4.0 on
pci7

atapci1: port 0xac00-0xac0f mem
0xfbbffc00-0xfbbffc7f,0xfbbf-0xfbbf7fff irq 19 at device 4.0 on
pci3

I added ahci_load="YES" to loader.conf and rebooted. Now I see:


You can add siis_load="YES" to loader.conf for SiI 3124.


Ahh, thank you.

I'm afraid to do that now, before I label my ZFS drives for fear that
the ZFS array will be messed up. But I do plan to do that for the
system after my plan is implemented. Thank you. :)


You may even get hotplug support if you're lucky. :)

I just built a box and gave it a spin with the "old" ata stuff and then
with the "new" (AHCI) stuff. It does perform a bit better and my BIOS
claims it supports hotplug with ahci enabled as well... Still have to
test that.


Well, I don't have anything to support hotplug. All my stuff is
internal.

http://sphotos.ak.fbcdn.net/hphotos-ak-ash1/hs430.ash1/23778_106837706002537_10289239443_171753_3508473_n.jpg




The frankenbox I'm testing on is a retrofitted 1U (it had a scsi
backplane, now has none).

I am not certain, but I think with 8.1 (which it's running) and all the
cam integration stuff, hotplug is possible. Is a special backplane
required? I seriously don't know... I'm going to give it a shot though.

Oh, you also might get NCQ. Try:

[r...@h21 /tmp]# camcontrol tags ada0
(pass0:ahcich0:0:0:0): device openings: 32


# camcontrol tags ada0
(pass0:siisch2:0:0:0): device openings: 31

resending with this:

ada{0..4} give the above.

# camcontrol tags ada5
(pass5:ahcich0:0:0:0): device openings: 32

That's part of the gmirror array for the OS, along with ad6 which has
similar output.

And again with this output from one of the ZFS drives:

# camcontrol identify ada0
pass0:  ATA-8 SATA 2.x device
pass0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)

protocol ATA/ATAPI-8 SATA 2.x
device model Hitachi HDS722020ALA330
firmware revision JKAOA28A
serial number JK1130YAH531ST
WWN 5000cca221d068d5
cylinders 16383
heads 16
sectors/track 63
sector size logical 512, physical 512, offset 0
LBA supported 268435455 sectors
LBA48 supported 3907029168 sectors
PIO supported PIO4
DMA supported WDMA2 UDMA6
media RPM 7200

Feature Support Enable Value Vendor
read ahead yes yes
write cache yes yes
flush cache yes yes
overlap no
Tagged Command Queuing (TCQ) no no
Native Command Queuing (NCQ) yes 32 tags
SMART yes yes
microcode download yes yes
security yes no
power management yes yes
advanced power management yes no 0/0x00
automatic acoustic management yes no 254/0xFE 128/0x80
media status notification no no
power-up in Standby yes no
write-read-verify no no 0/0x0
unload no no
free-fall no no
data set management (TRIM) no


Does this support NCQ?

--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using GPT and glabel for ZFS arrays

2010-07-24 Thread Dan Langille

On 7/23/2010 7:42 AM, John Hawkes-Reed wrote:

Dan Langille wrote:

Thank you to all the helpful discussion. It's been very helpful and
educational. Based on the advice and suggestions, I'm going to adjust
my original plan as follows.


[ ... ]

Since I still have the medium-sized ZFS array on the bench, testing this
GPT setup seemed like a good idea.
bonnie -s 5
The hardware's a Supermicro X8DTL-iF m/b + 12Gb memory, 2x 5502 Xeons,
3x Supermicro USASLP-L8I 3G SAS controllers and 24x Hitachi 2Tb drives.

Partitioning the drives with the command-line:
gpart add -s 1800G -t freebsd-zfs -l disk00 da0[1] gave the following
results with bonnie-64: (Bonnie -r -s 5000|2|5)[2]


What test is this?  I just installed benchmarks/bonnie and I see no -r 
option.  Right now, I'm trying this: bonnie -s 5



--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


gpart -b 34 versus gpart -b 1024

2010-07-24 Thread Dan Langille
You may have seen my cunning plan: 
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=883310+0+current/freebsd-stable


I've been doing some testing today.  The first of my tests comparing 
partitions aligned on a 4KB boundary is in.  I created a 5x2TB zpool, 
each drive of which was set up like this:


gpart add -b 1024 -s 3906824301 -t freebsd-zfs -l disk01 ada1
or
gpart add -b   34 -s 3906824301 -t freebsd-zfs -l disk01 ada1

Repeat for all 5 HDD.  And then:

zpool create storage raidz2 gpt/disk01 gpt/disk02 gpt/disk03 gpt/disk04 
gpt/disk05


Two Bonnie-64 tests:

First, with -b 34:

# ~dan/bonnie-64-read-only/Bonnie -s 5000
File './Bonnie.12315', size: 524288
Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
   ---Sequential Output ---Sequential Input-- --Random--
   -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
GB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU  /sec %CPU
 5 110.6 80.5 115.3 15.1  60.9  8.5  68.8 46.2 326.7 15.3   469  1.4




And then with -b 1024

# ~dan/bonnie-64-read-only/Bonnie -s 5000
File './Bonnie.21095', size: 524288
Writing with putc()...done
Rewriting...^[[1~done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
   ---Sequential Output ---Sequential Input-- --Random--
   -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
GB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU  /sec %CPU
 5 130.9 94.2 118.3 15.6  61.1  8.5  70.1 46.8 241.2 12.7   473  1.4


My reading of this:  All M/sec rates are faster except sequential input. 
 Comments?


I'll run -s 2 and -s 5 tests overnight and will post them in the 
morning.


Sunday, I'll try creating a 7x2TB array consisting of 5HDD and two 
sparse files and see how that goes. Here's hoping.


Full logs here, including a number of panics:

  http://beta.freebsddiary.org/zfs-with-gpart.php

--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: gpart -b 34 versus gpart -b 1024

2010-07-24 Thread Dan Langille

On 7/24/2010 10:44 PM, Dan Langille wrote:


I'll run -s 2 and -s 5 tests overnight and will post them in the
morning.


The -s 2 results are in:

-b 34:

   ---Sequential Output ---Sequential Input-- --Random--
   -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
GB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU  /sec %CPU
20 114.1 82.7 110.9 14.1  62.5  8.9  73.1 48.8 153.6  9.9   195  0.9

-b 1024:

   ---Sequential Output ---Sequential Input-- --Random--
   -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
GB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU  /sec %CPU
20 111.0 81.2 114.7 15.1  62.6  8.9  71.9 47.9 135.3  8.7   180  1.1


Hmmm, seems like the first test was better...

--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: gpart -b 34 versus gpart -b 1024

2010-07-24 Thread Dan Langille

On 7/24/2010 10:44 PM, Dan Langille wrote:

You may have seen my cunning plan:
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=883310+0+current/freebsd-stable


I've been doing some testing today. The first of my tests comparing
partitions aligned on a 4KB boundary is in. I created a 5x2TB zpool,
each drive of which was set up like this:

gpart add -b 1024 -s 3906824301 -t freebsd-zfs -l disk01 ada1
or
gpart add -b 34 -s 3906824301 -t freebsd-zfs -l disk01 ada1

Repeat for all 5 HDD. And then:

zpool create storage raidz2 gpt/disk01 gpt/disk02 gpt/disk03 gpt/disk04
gpt/disk05

Two Bonnie-64 tests:

First, with -b 34:

# ~dan/bonnie-64-read-only/Bonnie -s 5000
File './Bonnie.12315', size: 524288
Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
---Sequential Output ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
GB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU /sec %CPU
5 110.6 80.5 115.3 15.1 60.9 8.5 68.8 46.2 326.7 15.3 469 1.4




And then with -b 1024

# ~dan/bonnie-64-read-only/Bonnie -s 5000
File './Bonnie.21095', size: 524288
Writing with putc()...done
Rewriting...^[[1~done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
---Sequential Output ---Sequential Input-- --Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
GB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU /sec %CPU
5 130.9 94.2 118.3 15.6 61.1 8.5 70.1 46.8 241.2 12.7 473 1.4


My reading of this: All M/sec rates are faster except sequential input.
Comments?

I'll run -s 2 and -s 5 tests overnight and will post them in the
morning.


Well, it seems I'm not sleeping yet, so:

-b 34

   ---Sequential Output ---Sequential Input-- --Random--
   -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
GB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU  /sec %CPU
50 113.1 82.4 114.6 15.2  63.4  8.9  72.7 48.2 142.2  9.5   126  0.7


-b 1024
   ---Sequential Output ---Sequential Input-- --Random--
   -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
GB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU  /sec %CPU
50 110.5 81.0 112.8 15.0  62.8  9.0  72.9 48.5 139.7  9.5   144  0.9

Here, the results aren't much better either...  am I not aligning this 
partition correctly?  Missing something else?  Or... are they both 4K 
block aligned?
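
One quick sanity check (a rough sketch, assuming 512-byte sectors): a GPT 
partition is 4K-aligned when its start sector times 512 is a multiple of 4096, 
so -b 34 is not aligned and -b 1024 is.  Whether that matters depends on the 
drives' physical sector size, which camcontrol identify reports.

gpart show ada1                # note the start sector of the freebsd-zfs partition
echo $(( 34 * 512 % 4096 ))    # 1024 -> not 4K-aligned
echo $(( 1024 * 512 % 4096 ))  # 0    -> 4K-aligned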


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


zpool destroy causes panic

2010-07-25 Thread Dan Langille
I'm trying to destroy a zfs array which I recently created.  It contains 
nothing of value.


# zpool status
  pool: storage
 state: ONLINE
status: One or more devices could not be used because the label is 
missing or

invalid.  Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: none requested
config:

NAME  STATE READ WRITE CKSUM
storage   ONLINE   0 0 0
  raidz2  ONLINE   0 0 0
gpt/disk01ONLINE   0 0 0
gpt/disk02ONLINE   0 0 0
gpt/disk03ONLINE   0 0 0
gpt/disk04ONLINE   0 0 0
gpt/disk05ONLINE   0 0 0
/tmp/sparsefile1.img  UNAVAIL  0 0 0  corrupted data
/tmp/sparsefile2.img  UNAVAIL  0 0 0  corrupted data


errors: No known data errors

Why sparse files?  See this post:

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=1007077+0+archive/2010/freebsd-stable/20100725.freebsd-stable

The two tmp files were created via:

dd if=/dev/zero of=/tmp/sparsefile1.img bs=1 count=0  oseek=1862g
dd if=/dev/zero of=/tmp/sparsefile2.img bs=1 count=0  oseek=1862g

And the array created with:

zpool create -f storage raidz2 gpt/disk01 gpt/disk02 gpt/disk03  \
gpt/disk04 gpt/disk05 /tmp/sparsefile1.img /tmp/sparsefile2.img

The -f flag was required to avoid this message:

invalid vdev specification
use '-f' to override the following errors:
mismatched replication level: raidz contains both files and devices


I tried to offline one of the sparse files:

 zpool offline storage /tmp/sparsefile2.img

That caused a panic: http://www.langille.org/tmp/zpool-offline-panic.jpg

After rebooting, I rm'd both /tmp/sparsefile1.img  and 
/tmp/sparsefile2.img without thinking they were still in the zpool.  Now 
I am unable to destroy the pool.  The system panics.  I disabled ZFS via 
/etc/rc.conf, rebooted, recreated the two sparse files, then did a 
forcestart of zfs.  Then I saw:


# zpool status
  pool: storage
 state: ONLINE
status: One or more devices could not be used because the label is 
missing or

invalid.  Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: none requested
config:

NAME  STATE READ WRITE CKSUM
storage   ONLINE   0 0 0
  raidz2  ONLINE   0 0 0
gpt/disk01ONLINE   0 0 0
gpt/disk02ONLINE   0 0 0
gpt/disk03ONLINE   0 0 0
gpt/disk04ONLINE   0 0 0
gpt/disk05ONLINE   0 0 0
/tmp/sparsefile1.img  UNAVAIL  0 0 0  corrupted data
/tmp/sparsefile2.img  UNAVAIL  0 0 0  corrupted data


errors: No known data errors


Another attempt to destroy the array created a panic.

Suggestions as to how to remove this array and get started again?

--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zpool destroy causes panic

2010-07-25 Thread Dan Langille

On 7/25/2010 1:58 PM, Dan Langille wrote:

I'm trying to destroy a zfs array which I recently created.  It contains
nothing of value.


Oh... I left this out:

FreeBSD kraken.unixathome.org 8.0-STABLE FreeBSD 8.0-STABLE #0: Fri Mar 
 5 00:46:11 EST 2010 
d...@kraken.example.org:/usr/obj/usr/src/sys/KRAKEN  amd64





# zpool status
pool: storage
state: ONLINE
status: One or more devices could not be used because the label is
missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-4J
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
raidz2 ONLINE 0 0 0
gpt/disk01 ONLINE 0 0 0
gpt/disk02 ONLINE 0 0 0
gpt/disk03 ONLINE 0 0 0
gpt/disk04 ONLINE 0 0 0
gpt/disk05 ONLINE 0 0 0
/tmp/sparsefile1.img UNAVAIL 0 0 0 corrupted data
/tmp/sparsefile2.img UNAVAIL 0 0 0 corrupted data

errors: No known data errors

Why sparse files? See this post:

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=1007077+0+archive/2010/freebsd-stable/20100725.freebsd-stable


The two tmp files were created via:

dd if=/dev/zero of=/tmp/sparsefile1.img bs=1 count=0 oseek=1862g
dd if=/dev/zero of=/tmp/sparsefile2.img bs=1 count=0 oseek=1862g

And the array created with:

zpool create -f storage raidz2 gpt/disk01 gpt/disk02 gpt/disk03 \
gpt/disk04 gpt/disk05 /tmp/sparsefile1.img /tmp/sparsefile2.img

The -f flag was required to avoid this message:

invalid vdev specification
use '-f' to override the following errors:
mismatched replication level: raidz contains both files and devices


I tried to offline one of the sparse files:

zpool offline storage /tmp/sparsefile2.img

That caused a panic: http://www.langille.org/tmp/zpool-offline-panic.jpg

After rebooting, I rm'd both /tmp/sparsefile1.img and
/tmp/sparsefile2.img without thinking they were still in the zpool. Now
I am unable to destroy the pool. The system panics. I disabled ZFS via
/etc/rc.conf, rebooted, recreated the two sparse files, then did a
forcestart of zfs. Then I saw:

# zpool status
pool: storage
state: ONLINE
status: One or more devices could not be used because the label is
missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-4J
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
raidz2 ONLINE 0 0 0
gpt/disk01 ONLINE 0 0 0
gpt/disk02 ONLINE 0 0 0
gpt/disk03 ONLINE 0 0 0
gpt/disk04 ONLINE 0 0 0
gpt/disk05 ONLINE 0 0 0
/tmp/sparsefile1.img UNAVAIL 0 0 0 corrupted data
/tmp/sparsefile2.img UNAVAIL 0 0 0 corrupted data

errors: No known data errors


Another attempt to destroy the array created a panic.

Suggestions as to how to remove this array and get started again?




--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zpool destroy causes panic

2010-07-25 Thread Dan Langille

On 7/25/2010 4:37 PM, Volodymyr Kostyrko wrote:

25.07.2010 23:18, Jeremy Chadwick wrote:

Footnote: can someone explain to me how ZFS would, upon reboot, know
that /tmp/sparsefile[12].img are part of the pool? How would ZFS taste
metadata in this situation?


Just hacking it.

Each ZFS device which is part of the pool tracks all other devices which
are part of the pool with their sizes, device ids, last known points. It
doesn't know that /tmp/sparsefile[12].img is part of the pool, yet it
does know that the pool had some /tmp/sparsefile[12].img before, and that now
they can't be found or their current contents don't look like a ZFS device.
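
One way to see that bookkeeping (gpt/disk01 is just an example member) is to
dump the vdev label from any surviving device; the label lists every child of
the top-level vdev, including the now-missing file-backed ones:

# zdb -l /dev/gpt/disk01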

Can you try moving current files to /tmp/sparsefile[34].img and then
readd them to the pool with zpool replace? One by one please.


I do not know what the above paragraph means.

--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zpool destroy causes panic

2010-07-25 Thread Dan Langille

On 7/25/2010 4:49 PM, Volodymyr Kostyrko wrote:

25.07.2010 20:58, Dan Langille wrote:

NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
raidz2 ONLINE 0 0 0
gpt/disk01 ONLINE 0 0 0
gpt/disk02 ONLINE 0 0 0
gpt/disk03 ONLINE 0 0 0
gpt/disk04 ONLINE 0 0 0
gpt/disk05 ONLINE 0 0 0
/tmp/sparsefile1.img UNAVAIL 0 0 0 corrupted data
/tmp/sparsefile2.img UNAVAIL 0 0 0 corrupted data


OK, I'll try it from here. UNAVAIL means ZFS can't locate the correct vdev
for this pool member. Even if the file exists, it's not used by ZFS
because it lacks ZFS headers/footers.

You can (I think) reinsert the empty file into the pool with:

# zpool replace storage /tmp/sparsefile1.img /tmp/sparsefile1.img

(arguments: pool, old ZFS vdev name, current file)

If you replace both files you can theoretically bring the pool back to a
fully consistent state.

Also you can use md to convert files to devices:

# mdconfig -a -t vnode -f /tmp/sparsefile1.img
md0

And you can use md0 with your pool.


FYI, tried this, got a panic:

errors: No known data errors
# mdconfig -a -t vnode -f /tmp/sparsefile1.img
md0
# mdconfig -a -t vnode -f /tmp/sparsefile2.img
md1
# zpool replace storage /tmp/sparsefile1.img /dev/md0


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zpool destroy causes panic

2010-07-25 Thread Dan Langille

On 7/25/2010 1:58 PM, Dan Langille wrote:

I'm trying to destroy a zfs array which I recently created.  It contains
nothing of value.

# zpool status
pool: storage
state: ONLINE
status: One or more devices could not be used because the label is
missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-4J
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
raidz2 ONLINE 0 0 0
gpt/disk01 ONLINE 0 0 0
gpt/disk02 ONLINE 0 0 0
gpt/disk03 ONLINE 0 0 0
gpt/disk04 ONLINE 0 0 0
gpt/disk05 ONLINE 0 0 0
/tmp/sparsefile1.img UNAVAIL 0 0 0 corrupted data
/tmp/sparsefile2.img UNAVAIL 0 0 0 corrupted data

errors: No known data errors

Why sparse files? See this post:

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=1007077+0+archive/2010/freebsd-stable/20100725.freebsd-stable


The two tmp files were created via:

dd if=/dev/zero of=/tmp/sparsefile1.img bs=1 count=0 oseek=1862g
dd if=/dev/zero of=/tmp/sparsefile2.img bs=1 count=0 oseek=1862g

And the array created with:

zpool create -f storage raidz2 gpt/disk01 gpt/disk02 gpt/disk03 \
gpt/disk04 gpt/disk05 /tmp/sparsefile1.img /tmp/sparsefile2.img

The -f flag was required to avoid this message:

invalid vdev specification
use '-f' to override the following errors:
mismatched replication level: raidz contains both files and devices


I tried to offline one of the sparse files:

zpool offline storage /tmp/sparsefile2.img

That caused a panic: http://www.langille.org/tmp/zpool-offline-panic.jpg

After rebooting, I rm'd both /tmp/sparsefile1.img and
/tmp/sparsefile2.img without thinking they were still in the zpool. Now
I am unable to destroy the pool. The system panics. I disabled ZFS via
/etc/rc.conf, rebooted, recreated the two sparse files, then did a
forcestart of zfs. Then I saw:

# zpool status
pool: storage
state: ONLINE
status: One or more devices could not be used because the label is
missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-4J
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
raidz2 ONLINE 0 0 0
gpt/disk01 ONLINE 0 0 0
gpt/disk02 ONLINE 0 0 0
gpt/disk03 ONLINE 0 0 0
gpt/disk04 ONLINE 0 0 0
gpt/disk05 ONLINE 0 0 0
/tmp/sparsefile1.img UNAVAIL 0 0 0 corrupted data
/tmp/sparsefile2.img UNAVAIL 0 0 0 corrupted data

errors: No known data errors


Another attempt to destroy the array created a panic.

Suggestions as to how to remove this array and get started again?


I fixed this by:

* setting zfs_enable="NO" in /etc/rc.conf and rebooting
* rm /boot/zfs/zpool.cache
* wiping the first and last 16KB of each partition involved in the array 
  (sketch below)
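
A sketch of that wipe (gpt/disk01 is one member; repeat for each; it is 
destructive, so only for a pool being abandoned):

dd if=/dev/zero of=/dev/gpt/disk01 bs=1k count=16
total=$(diskinfo /dev/gpt/disk01 | awk '{print $3}')   # provider size in bytes
dd if=/dev/zero of=/dev/gpt/disk01 bs=1k count=16 oseek=$(( total / 1024 - 16 ))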

Now I'm trying mdconfig instead of sparse files.  Making progress, but 
not all the way there yet.  :)


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Where's the space? raidz2

2010-08-02 Thread Dan Langille
 type: freebsd-swap
   index: 2
   end: 14680063
   start: 2097152
3. Name: mirror/gm0s1d
   Mediasize: 4294967296 (4.0G)
   Sectorsize: 512
   Mode: r1w1e1
   rawtype: 7
   length: 4294967296
   offset: 7516192768
   type: freebsd-ufs
   index: 4
   end: 23068671
   start: 14680064
4. Name: mirror/gm0s1e
   Mediasize: 4294967296 (4.0G)
   Sectorsize: 512
   Mode: r1w1e1
   rawtype: 7
   length: 4294967296
   offset: 11811160064
   type: freebsd-ufs
   index: 5
   end: 31457279
   start: 23068672
5. Name: mirror/gm0s1f
   Mediasize: 63920202240 (60G)
   Sectorsize: 512
   Mode: r1w1e1
   rawtype: 7
   length: 63920202240
   offset: 16106127360
   type: freebsd-ufs
   index: 6
   end: 156301424
   start: 31457280
Consumers:
1. Name: mirror/gm0s1
   Mediasize: 80026329600 (75G)
   Sectorsize: 512
   Mode: r5w5e9

Geom name: ada0
fwheads: 16
fwsectors: 63
last: 3907029134
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: ada0p1
   Mediasize: 2000188135936 (1.8T)
   Sectorsize: 512
   Mode: r1w1e2
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: disk06-live
   length: 2000188135936
   offset: 1048576
   type: freebsd-zfs
   index: 1
   end: 3906619500
   start: 2048
Consumers:
1. Name: ada0
   Mediasize: 2000398934016 (1.8T)
   Sectorsize: 512
   Mode: r1w1e3

Geom name: ada6
fwheads: 16
fwsectors: 63
last: 3907029134
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: ada6p1
   Mediasize: 2000188135936 (1.8T)
   Sectorsize: 512
   Mode: r1w1e2
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: disk07-live
   length: 2000188135936
   offset: 1048576
   type: freebsd-zfs
   index: 1
   end: 3906619500
   start: 2048
Consumers:
1. Name: ada6
   Mediasize: 2000398934016 (1.8T)
   Sectorsize: 512
   Mode: r1w1e3

Geom name: ada1
fwheads: 16
fwsectors: 63
last: 3907029134
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: ada1p1
   Mediasize: 2000188135936 (1.8T)
   Sectorsize: 512
   Mode: r1w1e2
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: disk01-live
   length: 2000188135936
   offset: 1048576
   type: freebsd-zfs
   index: 1
   end: 3906619500
   start: 2048
Consumers:
1. Name: ada1
   Mediasize: 2000398934016 (1.8T)
   Sectorsize: 512
   Mode: r1w1e3

Geom name: ada3
fwheads: 16
fwsectors: 63
last: 3907029134
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: ada3p1
   Mediasize: 2000188135936 (1.8T)
   Sectorsize: 512
   Mode: r1w1e2
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: disk03-live
   length: 2000188135936
   offset: 1048576
   type: freebsd-zfs
   index: 1
   end: 3906619500
   start: 2048
Consumers:
1. Name: ada3
   Mediasize: 2000398934016 (1.8T)
   Sectorsize: 512
   Mode: r1w1e3

Geom name: ada4
fwheads: 16
fwsectors: 63
last: 3907029134
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: ada4p1
   Mediasize: 2000188135936 (1.8T)
   Sectorsize: 512
   Mode: r1w1e2
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: disk04-live
   length: 2000188135936
   offset: 1048576
   type: freebsd-zfs
   index: 1
   end: 3906619500
   start: 2048
Consumers:
1. Name: ada4
   Mediasize: 2000398934016 (1.8T)
   Sectorsize: 512
   Mode: r1w1e3

Geom name: ada5
fwheads: 16
fwsectors: 63
last: 3907029134
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: ada5p1
   Mediasize: 2000188135936 (1.8T)
   Sectorsize: 512
   Mode: r1w1e2
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: disk05-live
   length: 2000188135936
   offset: 1048576
   type: freebsd-zfs
   index: 1
   end: 3906619500
   start: 2048
Consumers:
1. Name: ada5
   Mediasize: 2000398934016 (1.8T)
   Sectorsize: 512
   Mode: r1w1e3



--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Where's the space? raidz2

2010-08-02 Thread Dan Langille

On 8/2/2010 7:11 PM, Dan Langille wrote:

I recently altered an existing raidz2 pool from using 7 vdevs of about
931G each to 1.81TB each. In fact, the existing pool used half of each HDD. I
then wanted to go to using [almost] all of each HDD.

I offline'd each vdev, adjusted the HDD partitions using gpart, then
replaced the vdev. After letting the resilver occur, I did the next vdev.
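
(One iteration of that cycle amounts to something like the following; device
and label names are illustrative, gpart is left to size the new partition to
the remaining free space, and -f may be needed on the replace if the old label
is still visible.)

zpool offline storage gpt/disk01
gpart delete -i 1 ada1
gpart add -b 2048 -t freebsd-zfs -l disk01 ada1
zpool replace storage gpt/disk01
zpool status storage      # wait for the resilver before moving to the next disk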

The space available after this process did not go up as I expected. I
have about 4TB in the pool, not the 8 or 9TB I expected.


This fixed it:

# df -h
FilesystemSizeUsed   Avail Capacity  Mounted on
/dev/mirror/gm0s1a989M508M402M56%/
devfs 1.0K1.0K  0B   100%/dev
/dev/mirror/gm0s1e3.9G500K3.6G 0%/tmp
/dev/mirror/gm0s1f 58G4.6G 48G 9%/usr
/dev/mirror/gm0s1d3.9G156M3.4G 4%/var
storage   512G1.7G510G 0%/storage
storage/pgsql 512G1.7G510G 0%/storage/pgsql
storage/bacula3.7T3.2T510G87%/storage/bacula
storage/Retored   510G 39K510G 0%/storage/Retored


# zpool export storage
# zpool import storage

# df -h
FilesystemSizeUsed   Avail Capacity  Mounted on
/dev/mirror/gm0s1a989M508M402M56%/
devfs 1.0K1.0K  0B   100%/dev
/dev/mirror/gm0s1e3.9G500K3.6G 0%/tmp
/dev/mirror/gm0s1f 58G4.6G 48G 9%/usr
/dev/mirror/gm0s1d3.9G156M3.4G 4%/var
storage   5.0T1.7G5.0T 0%/storage
storage/Retored   5.0T 39K5.0T 0%/storage/Retored
storage/bacula8.2T3.2T5.0T39%/storage/bacula
storage/pgsql 5.0T1.7G5.0T 0%/storage/pgsql



--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: 8.1R ZFS almost locking up system

2010-08-21 Thread Dan Nelson
In the last episode (Aug 21), Tim Bishop said:
> I've had a problem on a FreeBSD 8.1R system for a few weeks. It seems
> that ZFS gets in to an almost unresponsive state. Last time it did it
> (two weeks ago) I couldn't even log in, although the system was up, this
> time I could manage a reboot but couldn't stop any applications (they
> were likely hanging on I/O).

Could your pool be very close to full?  Zfs will throttle itself when it's
almost out of disk space.  I know it's "saved" me from filling up my
filesystems a couple times :)
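
A quick way to check is zpool list; the CAP column shows how close each pool
is to full:

# zpool list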

> A few items from top, including zfskern:
> 
>   PID USERNAME  THR PRI NICE   SIZERES STATE   C   TIME   WCPU COMMAND
> 5 root4  -8- 0K60K zio->i  0  54:38  3.47% zfskern
> 91775 70  1  440 53040K 31144K tx->tx  1   2:11  0.00% postgres
> 39661 tdb 1  440 55776K 32968K tx->tx  0   0:39  0.00% mutt
> 14828 root1  470 14636K  1572K tx->tx  1   0:03  0.00% zfs
> 11188 root1  510 14636K  1572K tx->tx  0   0:03  0.00% zfs
> 
> At some point during this process my zfs snapshots have been failing to
> complete:
> 
> root5  0.8  0.0 060  ??  DL7Aug10  54:43.83 [zfskern]
> root 8265  0.0  0.0 14636  1528  ??  D10:00AM   0:03.12 zfs snapshot 
> -r po...@2010-08-21_10:00:01--1d
> root11188  0.0  0.1 14636  1572  ??  D11:00AM   0:02.93 zfs snapshot 
> -r po...@2010-08-21_11:00:01--1d
> root14828  0.0  0.1 14636  1572  ??  D12:00PM   0:03.04 zfs snapshot 
> -r po...@2010-08-21_12:00:00--1d
> root17862  0.0  0.1 14636  1572  ??  D 1:00PM   0:01.96 zfs snapshot 
> -r po...@2010-08-21_13:00:01--1d
> root20986  0.0  0.1 14636  1572  ??  D 2:00PM   0:02.07 zfs snapshot 
> -r po...@2010-08-21_14:00:01--1d

procstat -k on some of these processes might help to pinpoint what part of
the zfs code they're all waiting in.
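
For example (PIDs taken from the listing above):

# procstat -kk 5 91775 39661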

-- 
Dan Nelson
dnel...@allantgroup.com
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


kernel MCA messages

2010-08-22 Thread Dan Langille

What does this mean?

kernel: MCA: Bank 4, Status 0x940c4001fe080813
kernel: MCA: Global Cap 0x0105, Status 0x
kernel: MCA: Vendor "AuthenticAMD", ID 0xf5a, APIC ID 0
kernel: MCA: CPU 0 COR BUSLG Source RD Memory
kernel: MCA: Address 0x7ff6b0

FreeBSD 7.3-STABLE #1: Sun Aug 22 23:16:43

--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: kernel MCA messages

2010-08-22 Thread Dan Langille

On 8/22/2010 9:18 PM, Dan Langille wrote:

What does this mean?

kernel: MCA: Bank 4, Status 0x940c4001fe080813
kernel: MCA: Global Cap 0x0105, Status 0x
kernel: MCA: Vendor "AuthenticAMD", ID 0xf5a, APIC ID 0
kernel: MCA: CPU 0 COR BUSLG Source RD Memory
kernel: MCA: Address 0x7ff6b0

FreeBSD 7.3-STABLE #1: Sun Aug 22 23:16:43


And another one:

kernel: MCA: Bank 4, Status 0x9459c0014a080813
kernel: MCA: Global Cap 0x0105, Status 0x
kernel: MCA: Vendor "AuthenticAMD", ID 0xf5a, APIC ID 0
kernel: MCA: CPU 0 COR BUSLG Source RD Memory
kernel: MCA: Address 0x7ff670


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: kernel MCA messages

2010-08-23 Thread Dan Langille

On 8/22/2010 10:05 PM, Dan Langille wrote:

On 8/22/2010 9:18 PM, Dan Langille wrote:

What does this mean?

kernel: MCA: Bank 4, Status 0x940c4001fe080813
kernel: MCA: Global Cap 0x0105, Status 0x
kernel: MCA: Vendor "AuthenticAMD", ID 0xf5a, APIC ID 0
kernel: MCA: CPU 0 COR BUSLG Source RD Memory
kernel: MCA: Address 0x7ff6b0

FreeBSD 7.3-STABLE #1: Sun Aug 22 23:16:43


And another one:

kernel: MCA: Bank 4, Status 0x9459c0014a080813
kernel: MCA: Global Cap 0x0105, Status 0x
kernel: MCA: Vendor "AuthenticAMD", ID 0xf5a, APIC ID 0
kernel: MCA: CPU 0 COR BUSLG Source RD Memory
kernel: MCA: Address 0x7ff670


kernel: MCA: Bank 4, Status 0x947ec000d8080a13
kernel: MCA: Global Cap 0x0105, Status 0x
kernel: MCA: Vendor "AuthenticAMD", ID 0xf5a, APIC ID 0
kernel: MCA: CPU 0 COR BUSLG Responder RD Memory
kernel: MCA: Address 0xbfa9930

Another one.

These errors started appearing after upgrading to 8.1-STABLE from 
7.2-something.  I suspect the MCA reporting functionality was added around then.



--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: kernel MCA messages

2010-08-23 Thread Dan Langille

On 8/23/2010 7:47 PM, Andriy Gapon wrote:

on 24/08/2010 02:43 Dan Langille said the following:

On 8/22/2010 10:05 PM, Dan Langille wrote:

On 8/22/2010 9:18 PM, Dan Langille wrote:

What does this mean?

kernel: MCA: Bank 4, Status 0x940c4001fe080813
kernel: MCA: Global Cap 0x0105, Status 0x
kernel: MCA: Vendor "AuthenticAMD", ID 0xf5a, APIC ID 0
kernel: MCA: CPU 0 COR BUSLG Source RD Memory
kernel: MCA: Address 0x7ff6b0

FreeBSD 7.3-STABLE #1: Sun Aug 22 23:16:43


And another one:

kernel: MCA: Bank 4, Status 0x9459c0014a080813
kernel: MCA: Global Cap 0x0105, Status 0x
kernel: MCA: Vendor "AuthenticAMD", ID 0xf5a, APIC ID 0
kernel: MCA: CPU 0 COR BUSLG Source RD Memory
kernel: MCA: Address 0x7ff670


kernel: MCA: Bank 4, Status 0x947ec000d8080a13
kernel: MCA: Global Cap 0x0105, Status 0x
kernel: MCA: Vendor "AuthenticAMD", ID 0xf5a, APIC ID 0
kernel: MCA: CPU 0 COR BUSLG Responder RD Memory
kernel: MCA: Address 0xbfa9930

Another one.

These errors started appearing after upgrading to 8.1-STABLE from 7.2..
something.  I suspect the functionality was added about then


Please stop the flood :-)


Sure.  Three emails is hardly a flood.  :)


Depending on hardware there could be hundreds of such errors per day.
Either replace memory modules or learn to live with these messages.


I was just posting a remark.  Thought I'd include one more that I 
noticed.  Surely you can cope.  :)


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: kernel MCA messages

2010-08-24 Thread Dan Langille

On 8/22/2010 9:18 PM, Dan Langille wrote:

What does this mean?

kernel: MCA: Bank 4, Status 0x940c4001fe080813
kernel: MCA: Global Cap 0x0105, Status 0x
kernel: MCA: Vendor "AuthenticAMD", ID 0xf5a, APIC ID 0
kernel: MCA: CPU 0 COR BUSLG Source RD Memory
kernel: MCA: Address 0x7ff6b0

FreeBSD 7.3-STABLE #1: Sun Aug 22 23:16:43


FYI, these are occurring every hour, almost to the second. e.g. 
xx:56:yy, where yy is 09, 10, or 11.


Checking logs, I don't see anything that correlates with this point in 
the hour (i.e 56 minutes past) that doesn't also occur at other times.


It seems very odd to occur so regularly.

--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: kernel MCA messages

2010-08-24 Thread Dan Langille

On 8/24/2010 7:38 PM, Jeremy Chadwick wrote:

On Tue, Aug 24, 2010 at 07:13:23PM -0400, Dan Langille wrote:

On 8/22/2010 9:18 PM, Dan Langille wrote:

What does this mean?

kernel: MCA: Bank 4, Status 0x940c4001fe080813
kernel: MCA: Global Cap 0x0105, Status 0x
kernel: MCA: Vendor "AuthenticAMD", ID 0xf5a, APIC ID 0
kernel: MCA: CPU 0 COR BUSLG Source RD Memory
kernel: MCA: Address 0x7ff6b0

FreeBSD 7.3-STABLE #1: Sun Aug 22 23:16:43


FYI, these are occurring every hour, almost to the second. e.g.
xx:56:yy, where yy is 09, 10, or 11.

Checking logs, I don't see anything that correlates with this point
in the hour (i.e 56 minutes past) that doesn't also occur at other
times.

It seems very odd to occur so regularly.


1) Why haven't you replaced the DIMM in Bank 4 -- or better yet, all
the DIMMs just to be sure?  Do this and see if the problem goes
away.  If not, no harm done, and you've narrowed it down.


For good reason: time and distance.   I've not had the time or 
opportunity to buy new RAM.  Today is Tuesday.  The problem appeared 
about 48 hours ago after upgrading to 8.1 stable from 7.x.  The box is 
in Austin.  I'm in Philadelphia.  You know the math.  ;)  When I can get 
the time to fly to Austin, I will if required.


I'm sorry, I'm not meaning to be flippant.  I'm just glad I documented 
as much as I could 4 years ago.



2) What exact manufacturer and model of motherboard is this?  If
you can provide a link to a User Manual that would be great.


 This is a box from iXsystems that I obtained back when 6.1-RELEASE was 
the latest.  I know it has four sticks of 2GB.


   http://www.freebsddiary.org/dual-opteron.php

Sadly, many of the links are now invalid. The board is an AccelerTech 
ATO2161-DC, also known as a RioWorks HDAMA-G.


See also:

  http://www.freebsddiary.org/dual-opteron-dmidecode.txt

And we have a close up of the RAM and the m/b:

  http://www.freebsddiary.org/showpicture.php?id=85
  http://www.freebsddiary.org/showpicture.php?id=84

I am quite sure it's very close to this:

  http://www.accelertech.com/2007/amd_mb/opteron/ato2161i-dc_pic.php

With the manual here:

  http://www.accelertech.com/2007/amd_mb/opteron/ato2161i-dc_manual.php


3) Please go into your system BIOS and find where "ECC ChipKill"
options are available (likely under a Memory, Chipset, or
Northbridge section).  Please write down and provide here all
of the options and what their currently selected values are.

4) Please make sure you're running the latest system BIOS.  I've seen
on certain Rackable AMD-based systems where Northbridge-related
features don't work quite right (at least with Solaris), resulting
in atrocious memory performance on the system.  A BIOS upgrade
solved the problem.


3 & 4 are just as hard as #1 at the moment.


There's a ChipKill feature called "ECC BG Scrubbing" that's vague in
definition, given that it's a "background memory scrub" that happens at
intervals which are unknown to me.  Maybe 60 minutes?  I don't know.
This is why I ask question #3.

For John and other devs: I assume the decoded MCA messages indicate with
absolute certainty that the ECC error is coming from external DRAM and
not, say, bad L1 or L2 cache?


Nice question.

--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: kernel MCA messages

2010-08-25 Thread Dan Langille

On 8/25/2010 3:11 AM, Andriy Gapon wrote:


Have you read the decoded message?
Please re-read it.

I still recommend reading at least the summary of the RAM ECC research article
to make your own judgment about need to replace DRAM.


Andriy: What is your interpretation of the decoded message?  What is 
your view on replacing DRAM?  What do you conclude from the summary?


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


ACPI Warning: Optional field Pm2ControlBlock has zero address

2010-08-28 Thread Dan Langille

Is this something to be concerned about:

ACPI Warning: Optional field Pm2ControlBlock has zero address or length: 
0x/0x1 (20100331/tbfadt-655)


--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: ACPI Warning: Optional field Pm2ControlBlock has zero address

2010-08-28 Thread Dan Langille

On 8/28/2010 8:30 PM, Jeremy Chadwick wrote:

On Sat, Aug 28, 2010 at 04:35:58PM -0400, Dan Langille wrote:

Is this something to be concerned about:

ACPI Warning: Optional field Pm2ControlBlock has zero address or
length: 0x/0x1 (20100331/tbfadt-655)


CC'ing freebsd-acpi.  OS version is unknown.


FreeBSD-Stable 8.1

--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: 8.1R ZFS almost locking up system

2010-08-31 Thread Dan Nelson
In the last episode (Aug 31), Tim Bishop said:
> On Sat, Aug 21, 2010 at 05:24:29PM -0500, Dan Nelson wrote:
> > In the last episode (Aug 21), Tim Bishop said:
> > > A few items from top, including zfskern:
> > > 
> > >   PID USERNAME  THR PRI NICE   SIZERES STATE   C   TIME   WCPU COMMAND
> > > 5 root4  -8- 0K60K zio->i  0  54:38  3.47% zfskern
> > > 91775 70  1  440 53040K 31144K tx->tx  1   2:11  0.00% 
> > > postgres
> > > 39661 tdb 1  440 55776K 32968K tx->tx  0   0:39  0.00% mutt
> > > 14828 root1  470 14636K  1572K tx->tx  1   0:03  0.00% zfs
> > > 11188 root1  510 14636K  1572K tx->tx  0   0:03  0.00% zfs
> > > 
> > > At some point during this process my zfs snapshots have been failing to
> > > complete:
> > > 
> > > root5  0.8  0.0 060  ??  DL7Aug10  54:43.83 [zfskern]
> > > root 8265  0.0  0.0 14636  1528  ??  D10:00AM   0:03.12 zfs 
> > > snapshot -r po...@2010-08-21_10:00:01--1d
> > > root11188  0.0  0.1 14636  1572  ??  D11:00AM   0:02.93 zfs 
> > > snapshot -r po...@2010-08-21_11:00:01--1d
> > > root14828  0.0  0.1 14636  1572  ??  D12:00PM   0:03.04 zfs 
> > > snapshot -r po...@2010-08-21_12:00:00--1d
> > > root17862  0.0  0.1 14636  1572  ??  D 1:00PM   0:01.96 zfs 
> > > snapshot -r po...@2010-08-21_13:00:01--1d
> > > root20986  0.0  0.1 14636  1572  ??  D 2:00PM   0:02.07 zfs 
> > > snapshot -r po...@2010-08-21_14:00:01--1d
> > 
> > procstat -k on some of these processes might help to pinpoint what part of
> > the zfs code they're all waiting in.
> 
> It happened again this Saturday (clearly something in the weekly
> periodic run is triggering the issue). procstat -kk shows the following
> for processes doing something zfs related (where zfs related means the
> string 'zfs' in the procstat -kk output):
> 
> 0 100084 kernel   zfs_vn_rele_task mi_switch+0x16f 
> sleepq_wait+0x42 _sleep+0x31c taskqueue_thread_loop+0xb7 fork_exit+0x118 
> fork_trampoline+0xe 
> 5 100031 zfskern  arc_reclaim_thre mi_switch+0x16f 
> sleepq_timedwait+0x42 _cv_timedwait+0x129 arc_reclaim_thread+0x2d1 
> fork_exit+0x118 fork_trampoline+0xe 
> 5 100032 zfskern  l2arc_feed_threa mi_switch+0x16f 
> sleepq_timedwait+0x42 _cv_timedwait+0x129 l2arc_feed_thread+0x1be 
> fork_exit+0x118 fork_trampoline+0xe 
> 5 100085 zfskern  txg_thread_enter mi_switch+0x16f 
> sleepq_wait+0x42 _cv_wait+0x111 txg_thread_wait+0x79 txg_quiesce_thread+0xb5 
> fork_exit+0x118 fork_trampoline+0xe 
> 5 100086 zfskern  txg_thread_enter mi_switch+0x16f 
> sleepq_wait+0x42 _cv_wait+0x111 zio_wait+0x61 dsl_pool_sync+0xea 
> spa_sync+0x355 txg_sync_thread+0x195 fork_exit+0x118 fork_trampoline+0xe 
>17 100040 syncer   -mi_switch+0x16f 
> sleepq_wait+0x42 _cv_wait+0x111 txg_wait_synced+0x7c zil_commit+0x416 
> zfs_sync+0xa6 sync_fsync+0x184 sync_vnode+0x16b sched_sync+0x1c9 
> fork_exit+0x118 fork_trampoline+0xe 
>  2210 100156 syslogd  -mi_switch+0x16f 
> sleepq_wait+0x42 _cv_wait+0x111 txg_wait_open+0x85 zfs_freebsd_write+0x378 
> VOP_WRITE_APV+0xb2 vn_write+0x2d7 dofilewrite+0x85 kern_writev+0x60 
> writev+0x41 syscall+0x1e7 Xfast_syscall+0xe1 
>  3500 100177 syslogd  -mi_switch+0x16f 
> sleepq_wait+0x42 _cv_wait+0x111 txg_wait_open+0x85 zfs_freebsd_write+0x378 
> VOP_WRITE_APV+0xb2 vn_write+0x2d7 dofilewrite+0x85 kern_writev+0x60 
> writev+0x41 syscall+0x1e7 Xfast_syscall+0xe1 
>  3783 100056 syslogd  -mi_switch+0x16f 
> sleepq_wait+0x42 _cv_wait+0x111 txg_wait_open+0x85 zfs_freebsd_write+0x378 
> VOP_WRITE_APV+0xb2 vn_write+0x2d7 dofilewrite+0x85 kern_writev+0x60 
> writev+0x41 syscall+0x1e7 Xfast_syscall+0xe1 
>  4064 100165 mysqld   initial thread   mi_switch+0x16f 
> sleepq_wait+0x42 _cv_wait+0x111 txg_wait_open+0x85 dmu_tx_assign+0x16c 
> zfs_inactive+0xd9 zfs_freebsd_inactive+0x1a vinactive+0x6a vputx+0x1cc 
> vn_close+0xa1 vn_closefile+0x5a _fdrop+0x23 closef+0x3b kern_close+0x14d 
> syscall+0x1e7 Xfast_syscall+0xe1 
>  4441 100224 python2.6initial thread   mi_switch+0x16f 
> sleepq_wait+0x42 _cv_wait+0x111 txg_wait_open+0x85 dmu_tx_assign+0x16c 
> zfs_inactive+0xd9 zfs_freebsd_inactive+0x1a vinactive+0x6a vputx+0x1cc 
> null_reclaim+0xbc vgonel+0x12e vrecycle+0x7d null_inactive+0x1f 
> vinactive+0x6a vputx+0x1cc vn_close+0xa1 vn_closefile+0x5a _fdrop+0x23 
>   100227 python2.6initial thread   mi_swit

Re: 8.1R ZFS almost locking up system

2010-09-02 Thread Dan Nelson
In the last episode (Sep 01), Tim Bishop said:
> On Tue, Aug 31, 2010 at 10:58:29AM -0500, Dan Nelson wrote:
> > In the last episode (Aug 31), Tim Bishop said:
> > > It happened again this Saturday (clearly something in the weekly
> > > periodic run is triggering the issue).  procstat -kk shows the
> > > following for processes doing something zfs related (where zfs related
> > > means the string 'zfs' in the procstat -kk output):
> > > 
> > > 0 100084 kernel   zfs_vn_rele_task mi_switch+0x16f 
> > > sleepq_wait+0x42 _sleep+0x31c taskqueue_thread_loop+0xb7 fork_exit+0x118 
> > > fork_trampoline+0xe 
> > > 5 100031 zfskern  arc_reclaim_thre mi_switch+0x16f 
> > > sleepq_timedwait+0x42 _cv_timedwait+0x129 arc_reclaim_thread+0x2d1 
> > > fork_exit+0x118 fork_trampoline+0xe 
> > > 5 100032 zfskern  l2arc_feed_threa mi_switch+0x16f 
> > > sleepq_timedwait+0x42 _cv_timedwait+0x129 l2arc_feed_thread+0x1be 
> > > fork_exit+0x118 fork_trampoline+0xe 
> > > 5 100085 zfskern  txg_thread_enter mi_switch+0x16f 
> > > sleepq_wait+0x42 _cv_wait+0x111 txg_thread_wait+0x79 
> > > txg_quiesce_thread+0xb5 fork_exit+0x118 fork_trampoline+0xe 
> > > 5 100086 zfskern  txg_thread_enter mi_switch+0x16f 
> > > sleepq_wait+0x42 _cv_wait+0x111 zio_wait+0x61 dsl_pool_sync+0xea 
> > > spa_sync+0x355 txg_sync_thread+0x195 fork_exit+0x118 fork_trampoline+0xe 
> > >17 100040 syncer   -mi_switch+0x16f 
> > > sleepq_wait+0x42 _cv_wait+0x111 txg_wait_synced+0x7c zil_commit+0x416 
> > > zfs_sync+0xa6 sync_fsync+0x184 sync_vnode+0x16b sched_sync+0x1c9 
> > > fork_exit+0x118 fork_trampoline+0xe 
> > >  2210 100156 syslogd  -mi_switch+0x16f 
> > > sleepq_wait+0x42 _cv_wait+0x111 txg_wait_open+0x85 
> > > zfs_freebsd_write+0x378 VOP_WRITE_APV+0xb2 vn_write+0x2d7 
> > > dofilewrite+0x85 kern_writev+0x60 writev+0x41 syscall+0x1e7 
> > > Xfast_syscall+0xe1 
> > >  3500 100177 syslogd  -mi_switch+0x16f 
> > > sleepq_wait+0x42 _cv_wait+0x111 txg_wait_open+0x85 
> > > zfs_freebsd_write+0x378 VOP_WRITE_APV+0xb2 vn_write+0x2d7 
> > > dofilewrite+0x85 kern_writev+0x60 writev+0x41 syscall+0x1e7 
> > > Xfast_syscall+0xe1 
> > >  3783 100056 syslogd  -mi_switch+0x16f 
> > > sleepq_wait+0x42 _cv_wait+0x111 txg_wait_open+0x85 
> > > zfs_freebsd_write+0x378 VOP_WRITE_APV+0xb2 vn_write+0x2d7 
> > > dofilewrite+0x85 kern_writev+0x60 writev+0x41 syscall+0x1e7 
> > > Xfast_syscall+0xe1 
> > >  4064 100165 mysqld   initial thread   mi_switch+0x16f 
> > > sleepq_wait+0x42 _cv_wait+0x111 txg_wait_open+0x85 dmu_tx_assign+0x16c 
> > > zfs_inactive+0xd9 zfs_freebsd_inactive+0x1a vinactive+0x6a vputx+0x1cc 
> > > vn_close+0xa1 vn_closefile+0x5a _fdrop+0x23 closef+0x3b kern_close+0x14d 
> > > syscall+0x1e7 Xfast_syscall+0xe1 
> > >  4441 100224 python2.6initial thread   mi_switch+0x16f 
> > > sleepq_wait+0x42 _cv_wait+0x111 txg_wait_open+0x85 dmu_tx_assign+0x16c 
> > > zfs_inactive+0xd9 zfs_freebsd_inactive+0x1a vinactive+0x6a vputx+0x1cc 
> > > null_reclaim+0xbc vgonel+0x12e vrecycle+0x7d null_inactive+0x1f 
> > > vinactive+0x6a vputx+0x1cc vn_close+0xa1 vn_closefile+0x5a _fdrop+0x23 
> > >   100227 python2.6initial thread   mi_switch+0x16f 
> > > sleepq_wait+0x42 _cv_wait+0x111 txg_wait_open+0x85 dmu_tx_assign+0x16c 
> > > zfs_inactive+0xd9 zfs_freebsd_inactive+0x1a vinactive+0x6a vputx+0x1cc 
> > > null_reclaim+0xbc vgonel+0x12e vrecycle+0x7d null_inactive+0x1f 
> > > vinactive+0x6a vputx+0x1cc vn_close+0xa1 vn_closefile+0x5a _fdrop+0x23 
> > >  4445 100228 python2.6initial thread   mi_switch+0x16f 
> > > sleepq_wait+0x42 _cv_wait+0x111 txg_wait_open+0x85 dmu_tx_assign+0x16c 
> > > zfs_inactive+0xd9 zfs_freebsd_inactive+0x1a vinactive+0x6a vputx+0x1cc 
> > > null_reclaim+0xbc vgonel+0x12e vrecycle+0x7d null_inactive+0x1f 
> > > vinactive+0x6a vputx+0x1cc vn_close+0xa1 vn_closefile+0x5a _fdrop+0x23 
> > >  4446 100229 python2.6initial thread   mi_switch+0x16f 
> > > sleepq_wait+0x42 _cv_wait+0x111 txg_wait_open+0x85 dmu_tx_assign+0x16c 
> > > zfs_inactive+0xd9 zfs_freebsd_inactive+0x1a vinactive+0x6a vputx+0x1cc 
> > > null_reclaim+0xbc vgonel+0x12e vrecycle+0x7d null_inactive+0x1f 
> > > vinactive+0x6a vputx+0x1cc vn_close+0xa1 vn_closefile+0x5a _fdrop+0x23 
> > >  4447 100

Re: MFC of ZFSv15

2010-09-19 Thread Dan Mack

But I should be able to boot my ZFSv14 root pool using the ZFSv15 build of 
FreeBSD, correct?   But the problem scenario would be when I've upgraded my 
root pool to v15 and I attempt to boot it with v14 boot loader.  At least that 
is what I think ...

I guess what I'm getting at is ... you should be able to buildworld, 
installkernel, reboot, installworld, reboot without worry.   But after you run 
'zpool upgrade', you will need to re-write the bootcode using gpart on each of 
your root pool ZFS disks.
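
(In command form, the ordering I'm describing is roughly this; the pool name
and disk name are placeholders:)

make buildworld
make installkernel
shutdown -r now
make installworld
shutdown -r now
zpool upgrade zroot
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad4   # once per root-pool disk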

Am I understanding this correctly ?

Thanks for all the work on ZFS BTW, it's great!

Dan
On Sep 16, 2010, at 10:59 AM, Martin Matuska wrote:

> Don't forget to read the general "ZFS notes" section in UPDATING:
> 
> ZFS notes
> -
> When upgrading the boot ZFS pool to a new version, always follow
> these two steps:
> 
> 1.) recompile and reinstall the ZFS boot loader and boot block
> (this is part of "make buildworld" and "make installworld")
> 
> 2.) update the ZFS boot block on your boot drive
> 
> The following example updates the ZFS boot block on the first
> partition (freebsd-boot) of a GPT partitioned drive ad0:
> "gpart bootcode -p /boot/gptzfsboot -i 1 ad0"
> 
> Non-boot pools do not need these updates.
> 
> On 16. 9. 2010 17:43, Mike Tancsa wrote:
>> At 11:18 AM 9/16/2010, jhell wrote:
>>> On 09/16/2010 09:55, Mike Tancsa wrote:
>>>> 
>>>> Thanks again for all the ZFS fixes and enhancements! Are there any
>>>> caveats to upgrading ?
>>>> 
>>>> Do I just do
>>>> 
>>>> zpool upgrade -a
>>>> zfs upgrade -a
>>>> 
>>>> or are there any extra steps ?
>>>> 
>>> 
>>> Hi Mike,
>>> 
>>> No-one knows your bootcode better than you. So if you are upgrading
>>> don't forget if you are on a ZFS root then your bootcode might need
>>> updating.
>> 
>> 
>> Hi,
>> I am booting off UFS right now so no bootcode updates for me :) I did
>> look at UPDATING which does mention
>> 
>> 20100915:
>> A new version of ZFS (version 15) has been merged.
>> This version uses a python library for the following subcommands:
>> zfs allow, zfs unallow, zfs groupspace, zfs userspace.
>> For full functionality of these commands the following port must
>> be installed: sysutils/py-zfs
>> 
>> ---Mike
>> 
>> 
>> --------
>> Mike Tancsa, tel +1 519 651 3400
>> Sentex Communications, m...@sentex.net
>> Providing Internet since 1994 www.sentex.net
>> Cambridge, Ontario Canada www.sentex.net/mike
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Dan
--
Dan Mack
m...@macktronics.com




___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: MFC of ZFSv15

2010-09-19 Thread Dan Mack
But I should be able to boot my ZFSv14 root pool using the ZFSv15 build of 
FreeBSD, correct?   But the problem scenario would be when I've upgraded my 
root pool to v15 and I attempt to boot it with v14 boot loader.  At least that 
is what I think ...

I guess what I'm getting at is ... you should be able to buildworld, 
installkernel, reboot, installworld, reboot without worry.   But when after 
your run 'zpool upgrade', you will need to re-write the bootcode using gpart on 
each of your root pool ZFS disks.

Am I understanding this correctly ?

Thanks for all the work on ZFS BTW, it's great!

Dan

On Sep 16, 2010, at 2:03 PM, Henri Hennebert wrote:

> On 09/16/2010 17:18, jhell wrote:
>> On 09/16/2010 09:55, Mike Tancsa wrote:
>>> 
>>> Thanks again for all the ZFS fixes and enhancements!   Are there any
>>> caveats to upgrading ?
>>> 
>>> Do I just do
>>> 
>>> zpool upgrade -a
>>> zfs upgrade -a
>>> 
>>> or are there any extra steps ?
>>> 
>> 
>> Hi Mike,
>> 
>> No-one knows your bootcode better than you. So if you are upgrading
>> don't forget if you are on a ZFS root then your bootcode might need
>> updating.
>> 
> I was bitten by this problem in a previous ZFS upgrade.
> 
> To be sure, I have added this patch to zfsimpl.c so that, at boot, I know if 
> a zpool/zfs upgrade will be OK.
> 
> Henri
>> 
>> Regards, UPDATING should have anything else.
>> 
> 
> _______
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Dan
--
Dan Mack
m...@macktronics.com




___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: MFC of ZFSv15

2010-09-19 Thread Dan Mack
Thanks for the confirmation.  This worked fine and I did notice that "zpool 
upgrade zroot" was nice enough to emit the reminder:

  gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0

which is slightly different than the recipe given in /usr/src/UPDATING:

"gpart bootcode -p /boot/gptzfsboot -i 1 ad0"

Since the recipe for my root/zfs system included pmbr and gptzfsboot, I used 
the example emitted from the zpool command instead of the one from UPDATING.


e.g.

  pool: zroot
 state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
pool will no longer be accessible on older software versions.
 scrub: none requested
config:

NAME   STATE READ WRITE CKSUM
zroot  ONLINE   0 0 0
  mirror   ONLINE   0 0 0
gpt/disk0  ONLINE   0 0 0
gpt/disk1  ONLINE   0 0 0

errors: No known data errors

(zfs) ~ # zpool upgrade zroot
This system is currently running ZFS pool version 15.

Successfully upgraded 'zroot' from version 14 to version 15

If you boot from pool 'zroot', don't forget to update boot code.
Assuming you use GPT partitioning and da0 is your boot disk
the following command will do it:

gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0

(zfs) ~/zfs # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad4
ad4 has bootcode
(zfs) ~/zfs # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad5
ad5 has bootcode
(zfs) ~/zfs # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad6
ad6 has bootcode
(zfs) ~/zfs # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad7
ad7 has bootcode
(zfs) ~/zfs # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad8
ad8 has bootcode

(zfs) ~/zfs # reboot
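
As an aside, if you're ever unsure which index to hand to -i, gpart can tell 
you (the disk name below is just an example):

  gpart show ad4    # the index listed beside the freebsd-boot partition is the one -i wants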



Dan

On Sep 19, 2010, at 12:41 PM, Matthew Seaman wrote:

> On 19/09/2010 17:36:01, Dan Mack wrote:
>> But I should be able to boot my ZFSv14 root pool using the ZFSv15
>> build of FreeBSD, correct?   But the problem scenario would be when
>> I've upgraded my root pool to v15 and I attempt to boot it with v14
>> boot loader.  At least that is what I think ...
> 
> Yes.  The bootloader is not prescient, so a bootloader compiled against
> v14 can't cope with a zpool using v15.  It's only the on-disk format
> that counts in this: zfs software will operate perfectly well with older
> on-disk data formats.
> 
>> I guess what I'm getting at is ... you should be able to buildworld,
>> installkernel, reboot, installworld, reboot without worry.   But
>> after you run 'zpool upgrade', you will need to re-write the
>> bootcode using gpart on each of your root pool ZFS disks.
> 
> If you want to be completely paranoid, you could update the bootcode on
> your boot drive (or one out of a mirror pair, if that's what you're
> using) at the point of running installkernel and way before you run
> 'zpool upgrade'.  In theory, should this go horribly wrong and you end
> up with an unbootable system, you can recover by booting the 8.0 install
> media into FIXIT mode and reinstalling the bootblocks from there (or
> booting from the other disk in your mirror set).  Once you've got a
> system you know will reboot with the new bootblocks, then go ahead
> with installworld and updating the zpool version.
> 
>> Am I understanding this correctly ?
> 
> Yep.  That's quite right.  Running 'zpool upgrade -a' is one of those
> operations you can't easily reverse, so designing an upgrade plan where
> you can stop and back-out at any point is quite tricky.  Fortunately,
> the risk of things going wrong at the point of running zpool upgrade is
> really very small, so for most purposes, just ploughing ahead and
> accepting that small risk is going to be acceptable.
> 
>   Cheers,
> 
>   Matthew
> 
> -- 
> Dr Matthew J Seaman MA, D.Phil.   7 Priory Courtyard
>  Flat 3
> PGP: http://www.infracaninophile.co.uk/pgpkey Ramsgate
> JID: matt...@infracaninophile.co.uk   Kent, CT11 9PW
> 

Dan
--
Dan Mack
m...@macktronics.com




___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


zfs send/receive: is this slow?

2010-09-29 Thread Dan Langille
 device
cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)
SMP: AP CPU #2 Launched!
cd0: Attempt to query device size failed: NOT READY, Medium not present -
tray closed

GEOM_MIRROR: Device mirror/gm0 launched (1/2).
GEOM_MIRROR: Device gm0: rebuilding provider ada7.
GEOM: mirror/gm0s1: geometry does not match label (16h,63s != 255h,63s).
Trying to mount root from ufs:/dev/mirror/gm0s1a
WARNING: / was not properly dismounted
ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is
present;
to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf.
ZFS filesystem version 4
ZFS storage pool version 15
WARNING: /tmp was not properly dismounted
WARNING: /usr was not properly dismounted
WARNING: /var was not properly dismounted


-- 
Dan Langille -- http://langille.org/

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zfs send/receive: is this slow?

2010-09-29 Thread Dan Langille

On 9/29/2010 3:57 PM, Artem Belevich wrote:

On Wed, Sep 29, 2010 at 11:04 AM, Dan Langille  wrote:

It's taken about 15 hours to copy 800GB.  I'm sure there's some tuning I
can do.

The system is now running:

# zfs send storage/bac...@transfer | zfs receive storage/compressed/bacula


Try piping zfs data through mbuffer (misc/mbuffer in ports). I've
found that it does help a lot to smooth out data flow and increase
send/receive throughput even when send/receive happens on the same
host. Run it with a buffer large enough to accommodate a few seconds'
worth of write throughput for your target disks.
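
For a copy that stays on one host, that pipeline would look something like 
this (dataset names are placeholders, not the ones used above):

  # a 1 GB buffer filled in 128k chunks sits between send and receive;
  # size -m to a few seconds of the target disks' write throughput
  zfs send pool/src@snap | mbuffer -s 128k -m 1G | zfs receive pool/dst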


Thanks.  I just installed it.  I'll use it next time.  I don't want to 
interrupt this one.  I'd like to see how long it takes.  Then compare.



Here's an example:
http://blogs.everycity.co.uk/alasdair/2010/07/using-mbuffer-to-speed-up-slow-zfs-send-zfs-receive/


That looks really good. Thank you.

--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zfs send/receive: is this slow?

2010-10-01 Thread Dan Langille

On Wed, September 29, 2010 3:57 pm, Artem Belevich wrote:
> On Wed, Sep 29, 2010 at 11:04 AM, Dan Langille  wrote:
>> It's taken about 15 hours to copy 800GB.  I'm sure there's some tuning I
>> can do.
>>
>> The system is now running:
>>
>> # zfs send storage/bac...@transfer | zfs receive
>> storage/compressed/bacula
>
> Try piping zfs data through mbuffer (misc/mbuffer in ports). I've
> found that it does help a lot to smooth out data flow and increase
> send/receive throughput even when send/receive happens on the same
> host. Run it with a buffer large enough to accommodate few seconds
> worth of write throughput for your target disks.
>
> Here's an example:
> http://blogs.everycity.co.uk/alasdair/2010/07/using-mbuffer-to-speed-up-slow-zfs-send-zfs-receive/

I'm failing.  In one session:

# mbuffer -s 128k -m 1G -I 9090 | zfs receive
storage/compressed/bacula-mbuffer
Assertion failed: ((err == 0) && (bsize == sizeof(rcvsize))), function
openNetworkInput, file mbuffer.c, line 1358.
cannot receive: failed to read from stream


In the other session:

# time zfs send storage/bac...@transfer | mbuffer -s 128k -m 1G -O
10.55.0.44:9090
Assertion failed: ((err == 0) && (bsize == sizeof(sndsize))), function
openNetworkOutput, file mbuffer.c, line 897.
warning: cannot send 'storage/bac...@transfer': Broken pipe
Abort trap: 6 (core dumped)

real0m17.709s
user0m0.000s
sys 0m2.502s



-- 
Dan Langille -- http://langille.org/

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zfs send/receive: is this slow?

2010-10-01 Thread Dan Langille

On Fri, October 1, 2010 11:45 am, Dan Langille wrote:
>
> On Wed, September 29, 2010 3:57 pm, Artem Belevich wrote:
>> On Wed, Sep 29, 2010 at 11:04 AM, Dan Langille  wrote:
>>> It's taken about 15 hours to copy 800GB.  I'm sure there's some tuning
>>> I
>>> can do.
>>>
>>> The system is now running:
>>>
>>> # zfs send storage/bac...@transfer | zfs receive
>>> storage/compressed/bacula
>>
>> Try piping zfs data through mbuffer (misc/mbuffer in ports). I've
>> found that it does help a lot to smooth out data flow and increase
>> send/receive throughput even when send/receive happens on the same
>> host. Run it with a buffer large enough to accommodate few seconds
>> worth of write throughput for your target disks.
>>
>> Here's an example:
>> http://blogs.everycity.co.uk/alasdair/2010/07/using-mbuffer-to-speed-up-slow-zfs-send-zfs-receive/
>
> I'm failing.  In one session:
>
> # mbuffer -s 128k -m 1G -I 9090 | zfs receive
> storage/compressed/bacula-mbuffer
> Assertion failed: ((err == 0) && (bsize == sizeof(rcvsize))), function
> openNetworkInput, file mbuffer.c, line 1358.
> cannot receive: failed to read from stream
>
>
> In the other session:
>
> # time zfs send storage/bac...@transfer | mbuffer -s 128k -m 1G -O
> 10.55.0.44:9090
> Assertion failed: ((err == 0) && (bsize == sizeof(sndsize))), function
> openNetworkOutput, file mbuffer.c, line 897.
> warning: cannot send 'storage/bac...@transfer': Broken pipe
> Abort trap: 6 (core dumped)
>
> real0m17.709s
> user0m0.000s
> sys 0m2.502s

My installed mbuffer was out of date.  After an upgrade:

# mbuffer -s 128k -m 1G -I 9090 | zfs receive
storage/compressed/bacula-mbuffer
mbuffer: warning: unable to set socket buffer size: No buffer space available
in @  0.0 kB/s, out @  0.0 kB/s, 1897 MB total, buffer 100% full


# time zfs send storage/bac...@transfer | mbuffer -s 128k -m 1G -O ::1:9090
mbuffer: warning: unable to set socket buffer size: No buffer space available
in @ 4343 kB/s, out @ 2299 kB/s, 3104 MB total, buffer  85% full
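
(That "unable to set socket buffer size" warning usually just means mbuffer 
asked for a larger socket buffer than the kernel's ceiling allows; if it 
matters, the limit can be raised -- the value below is only an example:)

  sysctl kern.ipc.maxsockbuf               # show the current ceiling
  sysctl kern.ipc.maxsockbuf=16777216      # raise it to 16 MB for this boot
  # add kern.ipc.maxsockbuf=16777216 to /etc/sysctl.conf to make it persistent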


-- 
Dan Langille -- http://langille.org/

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zfs send/receive: is this slow?

2010-10-01 Thread Dan Langille

On Wed, September 29, 2010 2:04 pm, Dan Langille wrote:
> $ zpool iostat 10
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> storage     7.67T  5.02T    358     38  43.1M  1.96M
> storage     7.67T  5.02T    317    475  39.4M  30.9M
> storage     7.67T  5.02T    357    533  44.3M  34.4M
> storage     7.67T  5.02T    371    556  46.0M  35.8M
> storage     7.67T  5.02T    313    521  38.9M  28.7M
> storage     7.67T  5.02T    309    457  38.4M  30.4M
> storage     7.67T  5.02T    388    589  48.2M  37.8M
> storage     7.67T  5.02T    377    581  46.8M  36.5M
> storage     7.67T  5.02T    310    559  38.4M  30.4M
> storage     7.67T  5.02T    430    611  53.4M  41.3M

Now that I'm using mbuffer:

$ zpool iostat 10
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
storage     9.96T  2.73T  2.01K    131   151M  6.72M
storage     9.96T  2.73T    615    515  76.3M  33.5M
storage     9.96T  2.73T    360    492  44.7M  33.7M
storage     9.96T  2.73T    388    554  48.3M  38.4M
storage     9.96T  2.73T    403    562  50.1M  39.6M
storage     9.96T  2.73T    313    468  38.9M  28.0M
storage     9.96T  2.73T    462    677  57.3M  22.4M
storage     9.96T  2.73T    383    581  47.5M  21.6M
storage     9.96T  2.72T    142    571  17.7M  15.4M
storage     9.96T  2.72T     80    598  10.0M  18.8M
storage     9.96T  2.72T    718    503  89.1M  13.6M
storage     9.96T  2.72T    594    517  73.8M  14.1M
storage     9.96T  2.72T    367    528  45.6M  15.1M
storage     9.96T  2.72T    338    520  41.9M  16.4M
storage     9.96T  2.72T    348    499  43.3M  21.5M
storage     9.96T  2.72T    398    553  49.4M  14.4M
storage     9.96T  2.72T    346    481  43.0M  6.78M

If anything, it's slower.

The above was without -s 128.  The following used that setting:

$ zpool iostat 10
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
storage     9.78T  2.91T  1.98K    137   149M  6.92M
storage     9.78T  2.91T    761    577  94.4M  42.6M
storage     9.78T  2.91T    462    411  57.4M  24.6M
storage     9.78T  2.91T    492    497  61.1M  27.6M
storage     9.78T  2.91T    632    446  78.5M  22.5M
storage     9.78T  2.91T    554    414  68.7M  21.8M
storage     9.78T  2.91T    459    434  57.0M  31.4M
storage     9.78T  2.91T    398    570  49.4M  32.7M
storage     9.78T  2.91T    338    495  41.9M  26.5M
storage     9.78T  2.91T    358    526  44.5M  33.3M
storage     9.78T  2.91T    385    555  47.8M  39.8M
storage     9.78T  2.91T    271    453  33.6M  23.3M
storage     9.78T  2.91T    270    456  33.5M  28.8M


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zfs send/receive: is this slow?

2010-10-01 Thread Dan Langille
FYI: this is all on the same box. 

-- 
Dan Langille
http://langille.org/


On Oct 1, 2010, at 5:56 PM, Artem Belevich  wrote:

> Hmm. It did help me a lot when I was replicating ~2TB worth of data
> over GigE. Without mbuffer things were roughly in the ballpark of your
> numbers. With mbuffer I've got around 100MB/s.
> 
> Assuming that you have two boxes connected via ethernet, it would be
> good to check that nobody generates PAUSE frames. Some time back I've
> discovered that el-cheapo switch I've been using for some reason could
> not keep up with traffic bursts and generated tons of PAUSE frames
> that severely limited throughput.
> 
> If you're using Intel adapters, check xon/xoff counters in "sysctl
> dev.em.0.mac_stats". If you see them increasing, that may explain slow
> speed.
> If you have a switch between your boxes, try bypassing it and connect
> boxes directly.
> 
> --Artem
> 
> 
> 
> On Fri, Oct 1, 2010 at 11:51 AM, Dan Langille  wrote:
>> 
>> On Wed, September 29, 2010 2:04 pm, Dan Langille wrote:
>>> $ zpool iostat 10
>>>capacity operationsbandwidth
>>> pool used  avail   read  write   read  write
>>> --  -  -  -  -  -  -
>>> storage 7.67T  5.02T358 38  43.1M  1.96M
>>> storage 7.67T  5.02T317475  39.4M  30.9M
>>> storage 7.67T  5.02T357533  44.3M  34.4M
>>> storage 7.67T  5.02T371556  46.0M  35.8M
>>> storage 7.67T  5.02T313521  38.9M  28.7M
>>> storage 7.67T  5.02T309457  38.4M  30.4M
>>> storage 7.67T  5.02T388589  48.2M  37.8M
>>> storage 7.67T  5.02T377581  46.8M  36.5M
>>> storage 7.67T  5.02T310559  38.4M  30.4M
>>> storage 7.67T  5.02T430611  53.4M  41.3M
>> 
>> Now that I'm using mbuffer:
>> 
>> $ zpool iostat 10
>>   capacity operationsbandwidth
>> pool used  avail   read  write   read  write
>> --  -  -  -  -  -  -
>> storage 9.96T  2.73T  2.01K131   151M  6.72M
>> storage 9.96T  2.73T615515  76.3M  33.5M
>> storage 9.96T  2.73T360492  44.7M  33.7M
>> storage 9.96T  2.73T388554  48.3M  38.4M
>> storage 9.96T  2.73T403562  50.1M  39.6M
>> storage 9.96T  2.73T313468  38.9M  28.0M
>> storage 9.96T  2.73T462677  57.3M  22.4M
>> storage 9.96T  2.73T383581  47.5M  21.6M
>> storage 9.96T  2.72T142571  17.7M  15.4M
>> storage 9.96T  2.72T 80598  10.0M  18.8M
>> storage 9.96T  2.72T718503  89.1M  13.6M
>> storage 9.96T  2.72T594517  73.8M  14.1M
>> storage 9.96T  2.72T367528  45.6M  15.1M
>> storage 9.96T  2.72T338520  41.9M  16.4M
>> storage 9.96T  2.72T348499  43.3M  21.5M
>> storage 9.96T  2.72T398553  49.4M  14.4M
>> storage 9.96T  2.72T346481  43.0M  6.78M
>> 
>> If anything, it's slower.
>> 
>> The above was without -s 128.  The following used that setting:
>> 
>>  $ zpool iostat 10
>>   capacity operationsbandwidth
>> pool used  avail   read  write   read  write
>> --  -  -  -  -  -  -
>> storage 9.78T  2.91T  1.98K137   149M  6.92M
>> storage 9.78T  2.91T761577  94.4M  42.6M
>> storage 9.78T  2.91T462411  57.4M  24.6M
>> storage 9.78T  2.91T492497  61.1M  27.6M
>> storage 9.78T  2.91T632446  78.5M  22.5M
>> storage 9.78T  2.91T554414  68.7M  21.8M
>> storage 9.78T  2.91T459434  57.0M  31.4M
>> storage 9.78T  2.91T398570  49.4M  32.7M
>> storage 9.78T  2.91T338495  41.9M  26.5M
>> storage 9.78T  2.91T358526  44.5M  33.3M
>> storage 9.78T  2.91T385555  47.8M  39.8M
>> storage 9.78T  2.91T271453  33.6M  23.3M
>> storage 9.78T  2.91T270456  33.5M  28.8M
>> 
>> 
>> ___
>> freebsd-stable@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>> 
> 
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zfs send/receive: is this slow?

2010-10-01 Thread Dan Langille

On 10/1/2010 7:00 PM, Artem Belevich wrote:

On Fri, Oct 1, 2010 at 3:49 PM, Dan Langille  wrote:

FYI: this is all on the same box.


In one of the previous emails you've used this command line:

# mbuffer -s 128k -m 1G -I 9090 | zfs receive


You've used mbuffer in network client mode. I assumed that you were doing
your transfer over the network.

If you're running send/receive locally just pipe the data through
mbuffer -- zfs send|mbuffer|zfs receive


As soon as I opened this email I knew what it would say.


# time zfs send storage/bac...@transfer | mbuffer | zfs receive 
storage/compressed/bacula-mbuffer

in @  197 MB/s, out @  205 MB/s, 1749 MB total, buffer   0% full


$ zpool iostat 10 10
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
storage     9.78T  2.91T  1.11K    336  92.0M  17.3M
storage     9.78T  2.91T    769    436  95.5M  30.5M
storage     9.78T  2.91T    797    853  98.9M  78.5M
storage     9.78T  2.91T    865    962   107M  78.0M
storage     9.78T  2.91T    828    881   103M  82.6M
storage     9.78T  2.90T   1023  1.12K   127M  91.0M
storage     9.78T  2.90T  1.01K  1.01K   128M  89.3M
storage     9.79T  2.90T    962  1.08K   119M  89.1M
storage     9.79T  2.90T  1.09K  1.25K   139M  67.8M


Big difference.  :)




--Artem



--
Dan Langille
http://langille.org/


On Oct 1, 2010, at 5:56 PM, Artem Belevich  wrote:


Hmm. It did help me a lot when I was replicating ~2TB worth of data
over GigE. Without mbuffer things were roughly in the ballpark of your
numbers. With mbuffer I've got around 100MB/s.

Assuming that you have two boxes connected via ethernet, it would be
good to check that nobody generates PAUSE frames. Some time back I've
discovered that el-cheapo switch I've been using for some reason could
not keep up with traffic bursts and generated tons of PAUSE frames
that severely limited throughput.

If you're using Intel adapters, check xon/xoff counters in "sysctl
dev.em.0.mac_stats". If you see them increasing, that may explain slow
speed.
If you have a switch between your boxes, try bypassing it and connect
boxes directly.

--Artem



On Fri, Oct 1, 2010 at 11:51 AM, Dan Langille  wrote:


On Wed, September 29, 2010 2:04 pm, Dan Langille wrote:

$ zpool iostat 10
capacity operationsbandwidth
pool used  avail   read  write   read  write
--  -  -  -  -  -  -
storage 7.67T  5.02T358 38  43.1M  1.96M
storage 7.67T  5.02T317475  39.4M  30.9M
storage 7.67T  5.02T357533  44.3M  34.4M
storage 7.67T  5.02T371556  46.0M  35.8M
storage 7.67T  5.02T313521  38.9M  28.7M
storage 7.67T  5.02T309457  38.4M  30.4M
storage 7.67T  5.02T388589  48.2M  37.8M
storage 7.67T  5.02T377581  46.8M  36.5M
storage 7.67T  5.02T310559  38.4M  30.4M
storage 7.67T  5.02T430611  53.4M  41.3M


Now that I'm using mbuffer:

$ zpool iostat 10
   capacity operationsbandwidth
pool used  avail   read  write   read  write
--  -  -  -  -  -  -
storage 9.96T  2.73T  2.01K131   151M  6.72M
storage 9.96T  2.73T615515  76.3M  33.5M
storage 9.96T  2.73T360492  44.7M  33.7M
storage 9.96T  2.73T388554  48.3M  38.4M
storage 9.96T  2.73T403562  50.1M  39.6M
storage 9.96T  2.73T313468  38.9M  28.0M
storage 9.96T  2.73T462677  57.3M  22.4M
storage 9.96T  2.73T383581  47.5M  21.6M
storage 9.96T  2.72T142571  17.7M  15.4M
storage 9.96T  2.72T 80598  10.0M  18.8M
storage 9.96T  2.72T718503  89.1M  13.6M
storage 9.96T  2.72T594517  73.8M  14.1M
storage 9.96T  2.72T367528  45.6M  15.1M
storage 9.96T  2.72T338520  41.9M  16.4M
storage 9.96T  2.72T348499  43.3M  21.5M
storage 9.96T  2.72T398553  49.4M  14.4M
storage 9.96T  2.72T346481  43.0M  6.78M

If anything, it's slower.

The above was without -s 128.  The following used that setting:

  $ zpool iostat 10
   capacity operationsbandwidth
pool used  avail   read  write   read  write
--  -  -  -  -  -  -
storage 9.78T  2.91T  1.98K137   149M  6.92M
storage 9.78T  2.91T761577  94.4M  42.6M
storage 9.78T  2.91T462411  57.4M  24.6M
storage 9.78T  2.91T492497  61.1M  27.6M
storage 9.78T  2.91T632446  78.5M  22.5M
storage 9.78T  2.91T554414  68.7M  21.8M
storage 9.78T  2.91T459434  57.0M  31.4M
storage 9.78T  2.91T398570  49.4M  32.7M
storage 9.78T  2.91T338495  41.9M  26.5M
storage 9.78T  2.91T358526  44.5M  33.3M
stora

out of HDD space - zfs degraded

2010-10-02 Thread Dan Langille
Overnight I was running a zfs send | zfs receive (both within the same 
system / zpool).  The system ran out of space, a drive went offline, 
and the system is degraded.


This is a raidz2 array running on FreeBSD 8.1-STABLE #0: Sat Sep 18 
23:43:48 EDT 2010.


The following logs are also available at 
http://www.langille.org/tmp/zfs-space.txt <- no line wrapping


This is what was running:

# time zfs send storage/bac...@transfer | mbuffer | zfs receive 
storage/compressed/bacula-mbuffer
in @  0.0 kB/s, out @  0.0 kB/s, 3670 GB total, buffer 100% full
cannot receive new filesystem stream: out of space
mbuffer: error: outputThread: error writing to  at offset 0x395917c4000: Broken pipe


summary: 3670 GByte in 10 h 40 min 97.8 MB/s
mbuffer: warning: error during output to : Broken pipe
warning: cannot send 'storage/bac...@transfer': Broken pipe

real640m48.423s
user8m52.660s
sys 211m40.862s


Looking in the logs, I see this:

Oct  2 00:50:53 kraken kernel: (ada0:siisch0:0:0:0): lost device
Oct  2 00:50:54 kraken kernel: siisch0: Timeout on slot 30
Oct  2 00:50:54 kraken kernel: siisch0: siis_timeout is 0004 ss 
4000 rs 4000 es  sts 801f0040 serr 

Oct  2 00:50:54 kraken kernel: siisch0: Error while READ LOG EXT
Oct  2 00:50:55 kraken kernel: siisch0: Timeout on slot 30
Oct  2 00:50:55 kraken kernel: siisch0: siis_timeout is 0004 ss 
4000 rs 4000 es  sts 801f0040 serr 

Oct  2 00:50:55 kraken kernel: siisch0: Error while READ LOG EXT
Oct  2 00:50:56 kraken kernel: siisch0: Timeout on slot 30
Oct  2 00:50:56 kraken kernel: siisch0: siis_timeout is 0004 ss 
4000 rs 4000 es  sts 801f0040 serr 

Oct  2 00:50:56 kraken kernel: siisch0: Error while READ LOG EXT
Oct  2 00:50:57 kraken kernel: siisch0: Timeout on slot 30
Oct  2 00:50:57 kraken kernel: siisch0: siis_timeout is 0004 ss 
4000 rs 4000 es  sts 801f0040 serr 

Oct  2 00:50:57 kraken kernel: siisch0: Error while READ LOG EXT
Oct  2 00:50:58 kraken kernel: siisch0: Timeout on slot 30
Oct  2 00:50:58 kraken kernel: siisch0: siis_timeout is 0004 ss 
4000 rs 4000 es  sts 801f0040 serr 

Oct  2 00:50:58 kraken kernel: siisch0: Error while READ LOG EXT
Oct  2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage 
path=/dev/gpt/disk06-live offset=270336 size=8192 error=6


Oct  2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): Synchronize cache 
failed

Oct  2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): removing device entry

Oct  2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage 
path=/dev/gpt/disk06-live offset=2000187564032 size=8192 error=6
Oct  2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage 
path=/dev/gpt/disk06-live offset=2000187826176 size=8192 error=6


$ zpool status
  pool: storage
 state: DEGRADED
 scrub: scrub in progress for 5h32m, 17.16% done, 26h44m to go
config:

NAME STATE READ WRITE CKSUM
storage  DEGRADED 0 0 0
  raidz2 DEGRADED 0 0 0
gpt/disk01-live  ONLINE   0 0 0
gpt/disk02-live  ONLINE   0 0 0
gpt/disk03-live  ONLINE   0 0 0
gpt/disk04-live  ONLINE   0 0 0
gpt/disk05-live  ONLINE   0 0 0
gpt/disk06-live  REMOVED  0 0 0
gpt/disk07-live  ONLINE   0 0 0

$ zfs list
NAMEUSED  AVAIL  REFER  MOUNTPOINT
storage6.97T  1.91T  1.75G  /storage
storage/bacula 4.72T  1.91T  4.29T  /storage/bacula
storage/compressed 2.25T  1.91T  46.9K  /storage/compressed
storage/compressed/bacula  2.25T  1.91T  42.7K  /storage/compressed/bacula
storage/pgsql  5.50G  1.91T  5.50G  /storage/pgsql

$ sudo camcontrol devlist
Password:
  at scbus2 target 0 lun 0 (pass1,ada1)
  at scbus3 target 0 lun 0 (pass2,ada2)
  at scbus4 target 0 lun 0 (pass3,ada3)
  at scbus5 target 0 lun 0 (pass4,ada4)
  at scbus6 target 0 lun 0 (pass5,ada5)
  at scbus7 target 0 lun 0 (pass6,ada6)
 at scbus8 target 0 lun 0 (pass7,ada7)
at scbus9 target 0 lun 0 (cd0,pass8)
   at scbus10 target 0 lun 0 (pass9,ada8)

I'm not yet sure if the drive is fully dead or not.  This is not a 
hot-swap box.


I'm guessing the first step is to get ada0 back online and then in the 
zpool.  However, I'm reluctant to do a 'camcontrol scan' on this box as 
it froze up the system the last time I tried that:


  http://docs.freebsd.org/cgi/mid.cgi?4C78FF01.5020500

Any suggestions for getting the drive back online and the zpool stabilized?
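
For what it's worth, once ada0 is visible to the OS again (after a reboot, if 
a rescan is off the table), something along these lines is the usual way to 
bring it back -- a sketch only, and it assumes the disk itself is healthy:

  zpool online storage gpt/disk06-live    # re-attach the REMOVED vdev
  zpool clear storage                     # clear the logged I/O errors
  zpool status storage                    # watch the resilver/scrub verify it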

--
Dan Langille - http://langille.org/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
