Re: ZFS - moving from a zraid1 to zraid2 pool with 1.5tb disks

2011-01-05 Thread Artem Belevich
On Wed, Jan 5, 2011 at 1:55 PM, Damien Fleuriot  wrote:
> Well actually...
>
> raidz2:
> - 7x 1.5 tb = 10.5tb
> - 2 parity drives
>
> raidz1:
> - 3x 1.5 tb = 4.5 tb
> - 4x 1.5 tb = 6 tb , total 10.5tb
> - 2 parity drives in split thus different raidz1 arrays
>
> So really, in both cases 2 different parity drives and same storage...

In the second case you get better performance, but lose some data
protection. It's still raidz1, and it cannot survive every
combination of two drive failures: if both failed drives are in the
same vdev, your entire pool is gone.  Granted, it's better than a
single-vdev raidz1, but it's *not* as good as raidz2.
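
For reference, the two layouts being compared would be created roughly
like this (pool and device names are only illustrative):

  # one raidz2 vdev -- any two drives may fail
  zpool create tank raidz2 da0 da1 da2 da3 da4 da5 da6

  # two raidz1 vdevs -- at most one failure per vdev
  zpool create tank raidz1 da0 da1 da2 raidz1 da3 da4 da5 da6

(zpool will probably want -f for the second one because the two raidz1
vdevs have different widths.)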

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: ZFS - moving from a zraid1 to zraid2 pool with 1.5tb disks

2011-01-07 Thread Artem Belevich
On Fri, Jan 7, 2011 at 3:16 AM, Matthew D. Fuller
 wrote:
> On Thu, Jan 06, 2011 at 03:45:04PM +0200 I heard the voice of
> Daniel Kalchev, and lo! it spake thus:
>>
>> You should also know that having large L2ARC requires that you also
>> have larger ARC, because there are data pointers in the ARC that
>> point to the L2ARC data. Someone will do good to the community to
>> publish some reasonable estimates of the memory needs, so that
>> people do not end up with large but unusable L2ARC setups.
>
> Estimates I've read in the past are that L2ARC consumes ARC space at
> around 1-2%.

Each record in L2ARC takes about 250 bytes in ARC. If I understand it
correctly, not all records are 128K, which is the default record size
on ZFS. If you end up with a lot of small records (for instance,
because you have a lot of small files, a lot of synchronous writes, or
a record size set to a lower value), then you could end up with much
higher ARC requirements.

So, 1-2% seems to be a reasonable estimate assuming that ZFS deals
with records of ~10K-20K most of the time. If you mostly store large
files, your ratio would probably be much better.
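
To put rough numbers on it, using the ~250 bytes/record figure above
(the 100GB case below is just an illustration):

  128K records:  250 / 131072  ~ 0.2% of L2ARC size needed in ARC
   16K records:  250 / 16384   ~ 1.5%
    8K records:  250 / 8192    ~ 3%
    4K records:  250 / 4096    ~ 6%

So a 100GB L2ARC full of 8K records would eat roughly 3GB of ARC just
for the L2ARC headers.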

One way to get a specific ratio for *your* pool would be to collect
record size statistics from your pool using "zdb -L -b <poolname>" and
then calculate the L2ARC:ARC ratio based on the average record size.
I'm not sure, though, whether L2ARC stores records in compressed or
uncompressed form.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: drives >2TB on mpt device

2011-04-04 Thread Artem Belevich
2011/4/4 Gerrit Kühn :
> On Mon, 4 Apr 2011 14:36:25 +0100 Bruce Cran  wrote
> about Re: drives >2TB on mpt device:
>
> Hi Bruce,
>
> BC> It looks like a known issue:
> BC> http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/147572
>
> Hm, I don't know if this is exactly what I'm seeing here (although the
> cause may be the same):
> I do not use mptutil. The controller is "dumb" (without actual raid
> processor), and I intend to use it with zfs. However, I cannot even get
> gpart to create a partition larger than 2TB, because mpt comes back with
> only 2TB after probing the drive. As this is a problem that already exists
> with 1 drive, I cannot use gstripe or zfs to get around this.
> But the PR above states that this limitation is already built into mpt, so
> my only chance is probably to try a different controller/driver (any
> suggestions for a cheap 8port controller to use with zfs?), or to wait
> until mpt is updated to support larger drives. Does anyone know if there
> is already ongoing effort to do this?

You're probably out of luck as far as 2TB+ support for 1068-based HBAs
is concerned:
http://kb.lsi.com/KnowledgebaseArticle16399.aspx

Newer controllers based on LSI2008 (mps driver?) should not have that limit.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Kernel memory leak in 8.2-PRERELEASE?

2011-04-04 Thread Artem Belevich
On Mon, Apr 4, 2011 at 1:56 PM, Boris Kochergin  wrote:
> The problem persists, I'm afraid, and seems to have crept up a lot more
> quickly than before:
>
> # uname -a
> FreeBSD exodus.poly.edu 8.2-STABLE FreeBSD 8.2-STABLE #3: Sat Apr  2
> 11:48:43 EDT 2011     sp...@exodus.poly.edu:/usr/obj/usr/src/sys/EXODUS
>  amd64
>
> Mem: 314M Active, 955M Inact, 6356M Wired, 267M Cache, 828M Buf, 18M Free
>
> Any ideas for a diagnostic recourse?

My bet would be that the wired memory is used by the ZFS ARC. In your
vmstat -m output you can see that ~2.2G were allocated for the
'solaris' subsystem. Because ARC allocations tend to be randomly
sized, quite a bit of memory is wasted on fragmentation. There were a
few patches floating around on stable@ and fs@ that were supposed to
mitigate the issue, but so far there's no good out-of-the-box
solution. The general advice is to tune the ARC so that it works in
your case. The key loader tunables are vfs.zfs.arc_min and
vfs.zfs.arc_max. Don't set min too high, and experimentally set max to
the largest value that does not cause problems.
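
As a starting point only (the numbers are illustrative, not a
recommendation), on an 8GB box something like this in
/boot/loader.conf caps the ARC at 4GB:

  vfs.zfs.arc_max="4G"
  vfs.zfs.arc_min="512M"

Then keep an eye on kstat.zfs.misc.arcstats.size and the Wired counter
in top(1) after a reboot, and adjust from there.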

One of the factors that makes things noticeably worse is the presence
of I/O activity on non-ZFS filesystems. The regular filesystem cache
competes with ZFS for RAM. In the past ZFS used to give up memory way
too easily. Currently it's a bit more balanced, but still far from
perfect.

By the way, if you don't have on-disk swap configured, I'd recommend
adding some. It may help avoid processes being killed during
intermittent memory shortages.

It would also help if you could post your /boot/loader.conf and the
output of "zfs-stats -a" (available in ports).

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: ZFS vs OSX Time Machine

2011-04-28 Thread Artem Belevich
On Thu, Apr 28, 2011 at 6:08 PM, Jeremy Chadwick
 wrote:
> I will note something, however: your ARC max is set to 3072MB, yet Wired
> is around 4143MB.  Do you have something running on this box that takes
> up a lot of RAM?  mysqld, etc..?  I'm trying to account for the "extra
> gigabyte" in Wired.  "top -o res" might help here, but we'd need to see
> the process list.
>
> I'm thinking something else on your machine is also taking up Wired,
> because your arcstats shows:
>
>> kstat.zfs.misc.arcstats.c: 3221225472
>> kstat.zfs.misc.arcstats.c_min: 402653184
>> kstat.zfs.misc.arcstats.c_max: 3221225472
>> kstat.zfs.misc.arcstats.size: 3221162968
>
> Which is about 3072MB (there is always some degree of variance).

The difference is probably due to fragmentation (most ARC
allocations are served from power-of-2 zones, if I'm not mistaken),
plus a lot of wired memory sitting in slab allocator caches (the FREE
column in vmstat -z). On a system with an ARC size of ~16G I regularly
see ~22GB wired. On a smaller box I get about 7GB wired at around a
5.5GB ARC size.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: resilvering takes ages on 8.2 (amd64 from 18.04.2011)

2011-05-06 Thread Artem Belevich
On Fri, May 6, 2011 at 5:43 AM, Holger Kipp  wrote:
> Resilvering a disk in raidz2 ZFS is taking ages. Any ideas? I had replaced a 
> different disk this morning (da7) and it took only about 1 hour alltogether. 
> Any ideas? Or did I do something very wrong (tm)?

Don't believe everything you see. On my pool I often see scrub time
estimates of a few hundred hours even though it always takes about 6
hours.
Once ZFS is done thrashing the disks while it scrubs metadata and
starts doing bulk data transfer, the estimate will eventually converge
to a sensible value.

> Disks   da0   da1   da2   da3   da4   da5   da6    328004 wire
> KB/t   0.67  4.92  4.68  4.46  4.53  4.66  4.91     84728 act
> tps     151   138   150   151   146   149   140     39788 inact
> MB/s   0.10  0.66  0.69  0.66  0.65  0.68  0.67      1520 cache
> %busy   101    41    44    42    41    40    39   7647296 free

Yup. It does look like your pool is in the "thrashing" stage -- lots of seeking.

--Artem

>
> Best regards,
> Holger
>
> 
>
>  8.2-STABLE FreeBSD 8.2-STABLE #12: Mon Apr 18 12:48:56 CEST 2011
>
> # zpool status
>  pool: tank
>  state: DEGRADED
> status: One or more devices is currently being resilvered.  The pool will
>        continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>  scrub: resilver in progress for 0h21m, 1.10% done, 32h19m to go
> config:
>
>        NAME           STATE     READ WRITE CKSUM
>        tank           DEGRADED     0     0     0
>          raidz2       DEGRADED     0     0     0
>            replacing  DEGRADED     0     0     0
>              da0/old  OFFLINE      0     0     0
>              da0      ONLINE       0     0     0  158M resilvered
>            da1        ONLINE       0     0     0
>            da2        ONLINE       0     0     0
>            da7        ONLINE       0     0     0
>            da3        ONLINE       0     0     0
>            da4        ONLINE       0     0     0
>            da5        ONLINE       0     0     0
>            da6        ONLINE       0     0     0
>
> errors: No known data errors
>
>
>
> -
>    1 users    Load  0.00  0.00  0.00                  May  6 14:37
>
> Mem:KB    REAL            VIRTUAL                       VN PAGER   SWAP PAGER
>        Tot   Share      Tot    Share    Free           in   out     in   out
> Act  119320   18980  1362144    25756 7648824  count
> All  344920   23296 1075218k    57012          pages
> Proc:                                                            Interrupts
>  r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Flt        cow    4765 total
>             76      9171   11  130  766  952             zfod        atkbd0 1
>                                                          ozfod       ata0 irq14
>  0.7%Sys   0.3%Intr  0.0%User  0.0%Nice 99.0%Idle        %ozfod     1 uhci0 16
> |    |    |    |    |    |    |    |    |    |    |       daefr       uhci1 17
>                                                          prcfr     1 twe0 irq24
>                                         7 dtbuf     1106 totfr  1999 cpu0: time
> Namei     Name-cache   Dir-cache    206492 desvn          react   763 isp0 256
>   Calls    hits   %    hits   %      3598 numvn          pdwak     2 em0 irq257
>      12      11  92                  1382 frevn          pdpgs  1999 cpu1: time
>                                                          intrn
> Disks   da0   da1   da2   da3   da4   da5   da6    328004 wire
> KB/t   0.67  4.92  4.68  4.46  4.53  4.66  4.91     84728 act
> tps     151   138   150   151   146   149   140     39788 inact
> MB/s   0.10  0.66  0.69  0.66  0.65  0.68  0.67      1520 cache
> %busy   101    41    44    42    41    40    39   7647296 free
>                                                    60464 buf
>
>
> --
>
>
> Holger Kipp
> Diplom-Mathematiker
> Senior Consultant
>
>
>                [alogis]
>
> Tel.    :       +49 30 436 58 114
> Fax     :       +49 30 436 58 214
> Mobil   :       +49 178 36 58 114
>
> E-Mail  :       holger.k...@alogis.com
>                alogis AG
> Alt-Moabit 90 B
> D- 10559 Berlin
>
> Web: www.alogis.com
>
>
>
> alogis AG
> Sitz/Registergericht: Berlin/AG Charlottenburg, HRB 71484
> Vorstand: Arne Friedrichs, Joern Samuelson
> Aufsichtsratsvorsitzender: Reinhard Mielke
>
>
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: PCIe SATA HBA for ZFS on -STABLE

2011-05-31 Thread Artem Belevich
On Tue, May 31, 2011 at 7:31 AM, Freddie Cash  wrote:
> On Tue, May 31, 2011 at 5:48 AM, Matt Thyer  wrote:
>
>> What do people recommend for 8-STABLE as a PCIe SATA II HBA for someone
>> using ZFS ?
>>
>> Not wanting to break the bank.
>> Not interested in SATA III 6GB at this time... though it could be useful if
>> I add an SSD for... (is it ZIL ?).
>> Can this be added at any time ?
>>
>> The main issue is I need at least 10 ports total for all existing drives...
>> ZIL would require 11 so ideally we are talking a 6 port HBA.
>>
>
> SuperMicro AOC-USAS2-L8i works exceptionally well.  These are 8-port HBAs
> using the LSI1068 chipset, supported by the mpt(4) driver.  Support 3 Gpbs
> SATA/SAS, using multi-lane cables (2 connectors on the card, each connector
> supports 4 SATA ports), hot-plug, hot-swap.
>
> These are UIO cards, so the bracket that comes with it doesn't work with
> normal cases (the bracket is on the wrong side of the card; they're made for
> SuperMicro's UIO-based motherboards).  However, these are normal PCIe cards
> and work in any PCIe slot.  You either have to remove the bracket, or you
> can purchase separate brackets online.
>
> These cards are recommended on the zfs-discuss mailing list.  They are only
> ~$120 CDN at places like cdw.ca and newegg.ca.

+1 for the LSI1068(e) controller + mpt driver. It's cheap and it works.
Those LSI controllers often hide behind other brands. The SuperMicro
card mentioned above is one; Intel would be another -- search for Intel
SASUC8I. Tyan also sells one as the TYAN P3208SR. LSI-branded controllers
tend to be a bit more expensive than the rebranded ones, though the
functionality is the same and you can often cross-flash the firmware.

Keep in mind that HBAs based on LSI1068(e) can't handle hard drives
larger than 2TB and will truncate larger drive capacity to 2TB.

As for the SSD, you may want to hook it up to an on-board SATA port.
In my not-very-scientific benchmark, an Intel X25-M SSD connected to an
on-board SATA port on ICH10 was able to deliver ~20% more reads/sec
than the same SSD connected to an LSI1068-based controller.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Fileserver panic - FreeBSD 8.1-stable and zfs

2011-06-02 Thread Artem Belevich
On Thu, Jun 2, 2011 at 12:31 PM, Torfinn Ingolfsen
 wrote:
> FYI, in case it is interesting
> my zfs fileserver[1] just had a panic: (transcribed from screen)
> panic: kmem_malloc(131072): kmem_map too small: 1324613632 total allocated
> cpuid = 1

It's probably one of the most frequently reported issues with ZFS.
While things have gotten quite a bit better lately, you still need to
bump up the kernel VM size with the vm.kmem_size tunable. I typically
set vm.kmem_size to ~2x the physical memory size.
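
For example, for a box with 8GB of RAM something like this in
/boot/loader.conf would follow the ~2x rule (values illustrative,
scale them to your own RAM):

  vm.kmem_size="16G"
  vm.kmem_size_max="16G"

These are boot-time tunables, so a reboot is required for them to take
effect.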

> The machine runs:
> root@kg-f2# uname -a
> FreeBSD kg-f2.kg4.no 8.1-STABLE FreeBSD 8.1-STABLE #4: Fri Oct 29 12:11:48 
> CEST 2010     r...@kg-f2.kg4.no:/usr/obj/usr/src/sys/GENERIC  amd64

In general you may want to update to the latest -stable. There were a
lot of ZFS fixes committed.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: recover file from destroyed zfs snapshot - is it possible?

2011-06-09 Thread Artem Belevich
On Thu, Jun 9, 2011 at 1:00 PM, Greg Bonett  wrote:
> Hi all,
> I know this is a long shot, but I figure it's worth asking. Is there
> anyway to recover a file from a zfs snapshot which was destroyed? I know
> the name of the file and a unique string that should be in it. The zfs
> pool is on geli devices so I can't dd the raw device and look for it.
> Any suggestions?
>
> Thanks for the help.

Theoretically it may be possible. Practically, it will not be trivial.

ZFS state at any given point in time is determined by its uberblock.
ZFS keeps a number of previous uberblocks (and thus -- previous ZFS
states). If you're familiar with ZDB and with the ZFS filesystem
layout, you may be able to use the information in the last uberblock
saved before the snapshot was nuked and, with some luck, find the data
that was in your file. That, however, relies on the optimistic
assumptions that a) such an uberblock is still around and b) the
appropriate disk blocks have not been reused by more recent
transactions. I believe recent ZFS versions (v28 should qualify) make
an effort to keep recent transaction groups in a consistent state to
improve the chances of recovery with "zpool import -F" in case disks
lied about cache flushes.

In case you do want to dig in, ZFS on-disk data structures are documented here:
http://opensolaris.org/os/community/zfs/docs/ondiskformat0822.pdf

This script (WARNING -- DESTRUCTIVE) should allow you to roll back the
ZFS state to an earlier point in time on single-device pools:
http://www.solarisinternals.com/wiki/index.php/ZFS_forensics_scrollback_script

Good luck.

--Artem

P.S. You wouldn't happen to have a backup of your file, would you? That
would make things so much easier. :-)
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: recover file from destroyed zfs snapshot - is it possible?

2011-06-09 Thread Artem Belevich
On Thu, Jun 9, 2011 at 3:43 PM, Greg Bonett  wrote:
> One question though, you say it's necessary that "appropriate
>  disk blocks have not been reused by more recent transactions"
> Is it not possible for me to just read all the disk blocks looking for
> the filename and string it contained? How big are disk blocks, is it
> possible the whole 16k file is on one or a few contiguous blocks?

Whether all your data is in a single block depends on how large the
file is and how exactly it was written out. If it's something that was
written all at once, chances are it ended up located sequentially
somewhere on disk. If the file was written more than once, you may
find several variants of it. Telling which one is the most recent
without parsing ZFS metadata would be up to you.

Another question is whether the content will be easy to identify. If
you have compression turned on, then simply grepping for the content
will not work.

So, if your pool does not have compression on, you know what was in
the file, and you are reasonably sure you will be able to tell whether
the data you recover is consistent or not, then by all means start by
searching for this content in the raw data. The default ZFS record
size is 128K, so for a small file written at once there's a good
chance that it was written out in a single chunk.
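
A crude way to do that search (device name and search string are
placeholders; the .eli providers give you the decrypted view, so scan
those rather than the raw disks):

  strings -a -t d /dev/da0.eli | grep 'some unique string'

The -t d flag prints the decimal byte offset of every match, which you
can then feed to dd to carve out the surrounding 128K or so. Expect it
to take a long time on a multi-TB device.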

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: "log_sysevent: type 19 is not implemented" messages during boot

2011-06-17 Thread Artem Belevich
On Fri, Jun 17, 2011 at 6:06 AM, Bartosz Stec  wrote:
> W dniu 2011-06-11 18:43, Sergey Kandaurov pisze:
>>
>> On 11 June 2011 20:01, Rolf Nielsen  wrote:
>>>
>>> Hi all,
>>>
>>> After going from 8.2-RELEASE to 8-STABLE (to get ZFS v28), I get
>>>
>>> log_sysevent: type 19 is not implemented
>>>
>>> exactly 20 times during boot. What does that message mean? Need I worry
>>> about it? And even if it's harmless, it annoys me, so can I get rid of
>>> it,
>>> and if so, how?
>>>
>> Hi.
>> This warning indeed came with ZFS v28 recently merged to 8-STABLE.
>> AFAIK it's rather harmless. It was silenced in current recently (see svn
>> r222343), and the fix is expected to be merged to 8-STABLE soon.
>>
> Are you sure that it's harmless? It appeared for me as an evidence of pool
> breakage. I had these messages when I ran any zpool command on broken pool.
> I do't havesingle one after pool is fixed. Here's my thread on freebsd-fs :
> http://lists.freebsd.org/pipermail/freebsd-fs/2011-June/011639.html

Indeed. Same story here. Last week I got my pool corrupted due to a
bad memory stick.  Then I got tons of these "log_sysevent: type
19..." messages. After re-importing the pool with -F the messages went
away. So, from where I stand, those messages do seem to correlate with
a problem and should not be hushed up by default.

Instead, they should probably be converted to something easier for
humans to understand. Something like "Oops. I do hope you had a backup
of this pool." should do the trick.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: disable 64-bit dma for one PCI slot only?

2011-07-19 Thread Artem Belevich
On Tue, Jul 19, 2011 at 6:31 AM, John Baldwin  wrote:
> The only reason it might be nice to stick with two fields is due to the line
> length (though the first line is over 80 cols even in the current format).  
> Here
> are two possible suggestions:
>
> old:
>
> hostb0@pci0:0:0:0:      class=0x06 card=0x20108086 chip=0x01008086 
> rev=0x09 hdr=0x00
> pcib1@pci0:0:1:0:       class=0x060400 card=0x20108086 chip=0x01018086 
> rev=0x09 hdr=0x01
> pcib2@pci0:0:1:1:       class=0x060400 card=0x20108086 chip=0x01058086 
> rev=0x09 hdr=0x01
> none0@pci0:0:22:0:      class=0x078000 card=0x47428086 chip=0x1c3a8086 
> rev=0x04 hdr=0x00
> em0@pci0:0:25:0:        class=0x02 card=0x8086 chip=0x15038086 
> rev=0x04 hdr=0x00
> ...
>
> A)
>
> hostb0@pci0:0:0:0:      class=0x06 vendor=0x8086 device=0x0100 
> subvendor=0x8086 subdevice=0x2010 rev=0x09 hdr=0x00
> pcib1@pci0:0:1:0:       class=0x060400 vendor=0x8086 device=0x0101 
> subvendor=0x8086 subdevice=0x2010 rev=0x09 hdr=0x01
> pcib2@pci0:0:1:1:       class=0x060400 vendor=0x8086 device=0x0105 
> subvendor=0x8086 subdevice=0x2010 rev=0x09 hdr=0x01
> none0@pci0:0:22:0:      class=0x078000 vendor=0x8086 device=0x1c3a 
> subvendor=0x8086 subdevice=0x4742 rev=0x04 hdr=0x00
> em0@pci0:0:25:0:        class=0x02 vendor=0x8086 device=0x1503 
> subvendor=0x8086 subdevice=0x rev=0x04 hdr=0x00
> ...
>
> B)
>
> hostb0@pci0:0:0:0:      class=0x06 devid=0x8086:0100 subid=0x8086:2010 
> rev=0x09 hdr=0x00
> pcib1@pci0:0:1:0:       class=0x060400 devid=0x8086:0101 subid=0x8086:2010 
> rev=0x09 hdr=0x01
> pcib2@pci0:0:1:1:       class=0x060400 devid=0x8086:0105 subid=0x8086:2010 
> rev=0x09 hdr=0x01
> none0@pci0:0:22:0:      class=0x078000 devid=0x8086:1c3a subid=0x8086:4742 
> rev=0x04 hdr=0x00
> em0@pci0:0:25:0:        class=0x02 devid=0x8086:1503 subid=0x8086: 
> rev=0x04 hdr=0x00
> ...
>
> I went with vendor word first for both A) and B) as in my experience that is
> the more common ordering in driver tables, etc.

Do we need to print (class|devid|device|subvendor|etc.)= on every
line? IMHO those belong in a header line. Something like this:

Driver  Handle       Class     Vnd:Dev      Sub Vnd:Dev  Rev   Hdr
-------------------------------------------------------------------
hostb0  pci0:0:0:0   0x06      0x8086:0100  0x8086:2010  0x09  0x00
pcib1   pci0:0:1:0   0x060400  0x8086:0101  0x8086:2010  0x09  0x01
pcib2   pci0:0:1:1   0x060400  0x8086:0105  0x8086:2010  0x09  0x01
none0   pci0:0:22:0  0x078000  0x8086:1c3a  0x8086:4742  0x04  0x00
em0     pci0:0:25:0  0x02      0x8086:1503  0x8086:      0x04  0x00

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: can not boot from RAIDZ with 8-STABLE

2011-08-17 Thread Artem Belevich
2011/8/17 Daniel Kalchev :
> On 17.08.11 16:35, Miroslav Lachman wrote:
>>
>> I tried mfsBSD installation on Dell T110 with PERC H200A and 4x 500GB SATA
>> disks. If I create zpool with RAIDZ, the boot immediately hangs with
>> following error:
>>
> May be it that the BIOS does not see all drives at boot?

Indeed. On one of my systems the BIOS only allows access to the first
four HDDs in the BIOS boot priority list. What's especially annoying
is that the BIOS keeps rearranging the boot list every time a new
device is added or removed, or if a SATA controller card is moved to
another slot. Every time that happens I have to go back and rearrange
the drives so that my RAIDZ drives are at the top of the list.

If you can boot off a CD or USB stick, how many drives does the
bootloader report just before it gets to the menu?
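
(An easy way to check: escape to the loader prompt from the boot menu
and run "lsdev" -- it lists every disk the loader can reach through
the BIOS. If some of your RAIDZ members are missing from that list,
the loader has no chance of assembling the pool.)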

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: can not boot from RAIDZ with 8-STABLE

2011-08-17 Thread Artem Belevich
On Wed, Aug 17, 2011 at 12:40 PM, Miroslav Lachman <000.f...@quip.cz> wrote:
> Thank you guys, you are right. The BIOS provides only 1 disk to the loader!
> I checked it from loader prompt by lsdev (booted from USB external HDD).
>
> So I will try to make a small zpool mirror for root and boot (if ZFS mirror
> can be made of 4 providers instead of two) and the rest will be in RAIDZ.
>
> If that fails, I will go my old way with internal USB flash disk with UFS
> for booting and RAIDZ of 4 disks for storage as I did it few years ago with
> 7.0 or 7.1.

You seem to be booting from disks attached to some sort of add-on
card. Sometimes those have a per-disk 'bootable' option in their own
extension ROM; you may want to investigate yours. Perhaps all you need
to do is tweak the controller settings.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: "High" cpu usage when using ZFS cache device

2011-10-11 Thread Artem Belevich
On Tue, Oct 11, 2011 at 2:34 AM, Steven Hartland
 wrote:
> - Original Message - From: "Mickaël Maillot"
> 
>
>
>> same problem here after ~ 30 days with a production server and 2 SSD Intel
>> X25M as L2.
>> so we update and reboot the 8-STABLE server every month.
>
> Old thread but also seeing this on 8.2-RELEASE so looks like this
> may still be an issue.
>
> In our case this machine was running mysql with 2 x 60GB cache
> SSD's. I checked for usage when the machine was idle just before
> reboot to fix and the l2arc thread was still using 100% of a core
> even with no disk access happening.
>
> Was a PR ever raised for this?

No, there was no PR.

L2ARC CPU hogging after ~24 days was fixed in r218180 in -HEAD and was
MFC'ed to 8-STABLE in r218429 early in February '11.

If you're using 8-RELEASE, upgrading to 8-STABLE would be something to
consider as there were other ZFS-related issues fixed there that
didn't make it into -RELEASE.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: "High" cpu usage when using ZFS cache device

2011-10-11 Thread Artem Belevich
On Tue, Oct 11, 2011 at 10:21 AM, Steven Hartland
 wrote:
> Thanks for the confirmation there Artem, we currently can't use 8-STABLE
> due to the serious routing issue, seem like every packet generates a
> RTM_MISS routing packet to be sent, which causes high cpu load.
>
> Thread: "Re: serious packet routing issue causing ntpd high load?"

It's a bummer. If you can build your own kernel, cherry-picking the
following revisions may help with long-term stability:
r218429 - fixes the original overflow causing CPU hogging by the L2ARC
feeding thread. It will keep you up and running longer, until you hit
the next overflow. If I remember correctly, that one will hit you at
around 100 days of uptime.

The following changes were made after the ZFSv28 import, so they will
not apply directly to 8-RELEASE, but the idea applies to ZFSv15 as
well. The changes should be easy to backport.

r223412 - avoids more early overflows in time routines.
r224647 - avoids time overflow in TXG processing.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: "High" cpu usage when using ZFS cache device

2011-10-11 Thread Artem Belevich
On Tue, Oct 11, 2011 at 1:17 PM, Steven Hartland
 wrote:
>> It's a bummer. If you can build your own kernel cherry-picking
>> following revisions may help with long-term stability:
>> r218429 - fixes original overflow causing CPU hogging by l2arc feeding
>> thread. It will keep you up and running for longer until you hit
>> another overflow. If I remember correctly, it will hit you around
>> 100-days of uptime.
>
> This is the main issue we have been keeping an eye out for as we've
> seen it several times, we don't have too many machines with L2ARC so
> was surprised to see this with just 26 days up time in this case.
>
>> Following changes were done after ZFSv28 import, so they will not
>> apply directly to 8-RELEASE, but the idea applies to ZFSv15 as well.
>> The changes should be easy to backport.
>>
>> r223412 - avoids more early overflows in time routines.
>> r224647 - avoids time overflow in TXG processing.
>
> We already maintain a custom set of patches for our 8.2 installs so
> shouldn't be an issue to add these so thanks for the info :)
>
> With all three should we expect no uptime overflow issues or still are
> we still going to look at ~100 day reboots required?

Those should get you through the known (to me) sources of LBOLT and
clock_t related overflows.
Can't say whether you'll run into some other problems.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: 8.1 xl + dual-speed Netgear hub = yoyo

2011-10-23 Thread Artem Belevich
On Sun, Oct 23, 2011 at 8:54 AM, Matthew Seaman
 wrote:
> On the other hand, for anything Gb capable nowadays connected to a
> switch autoneg pretty much just works -- em(4), bce(4) are excellent,
> and even re(4) gets this stuff right.

There are still cases of incompatibility. I've got a cheap D-Link GigE
switch that consistently autonegotiates bge(4) devices to 100/FD
unless I force mastership on the NIC end with 'ifconfig bge0 link0'.
An em(4) device negotiates with the switch just fine, though.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: ld: kernel.debug: Not enough room for program headers (allocated 5, need 6)

2011-11-17 Thread Artem Belevich
On Thu, Nov 17, 2011 at 6:41 AM, David Wolfskill  wrote:
> MAKE=/usr/obj/usr/src/make.i386/make sh /usr/src/sys/conf/newvers.sh GENERIC
> cc -c -O -pipe  -std=c99 -g -Wall -Wredundant-decls -Wnested-externs 
> -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline 
> -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc  -I. 
> -I/usr/src/sys -I/usr/src/sys/contrib/altq -D_KERNEL 
> -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common 
> -finline-limit=8000 --param inline-unit-growth=100 --param 
> large-function-growth=1000  -mno-align-long-strings 
> -mpreferred-stack-boundary=2  -mno-mmx -mno-3dnow -mno-sse -mno-sse2 
> -mno-sse3 -ffreestanding -fstack-protector -Werror  vers.c
> linking kernel.debug
> ld: kernel.debug: Not enough room for program headers (allocated 5, need 6)
> ld: final link failed: Bad value
> *** Error code 1

> I'm rather left wondering "room" where, precisely?

Room for the program headers at the beginning of the ELF file. Look at
sys/conf/ldscript.* and search for SIZEOF_HEADERS.
One way to work around the issue is to replace SIZEOF_HEADERS with a
fixed value. Try 0x1000.
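
For example, something along these lines before relinking the kernel
(the i386 ldscript path is assumed from your build output; adjust for
your architecture, or just edit the file by hand):

  sed -i .bak 's/SIZEOF_HEADERS/0x1000/' /usr/src/sys/conf/ldscript.i386

The point is only to reserve a bit more space than the linker's own
estimate.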

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Using mmap(2) with a hint address

2011-12-20 Thread Artem Belevich
Hi,

On Tue, Dec 20, 2011 at 7:03 AM, Andriy Gapon  wrote:
> on 20/12/2011 16:31 Ganael LAPLANCHE said the following:
>> On Tue, 20 Dec 2011 15:02:01 +0100 (CET), Ganael LAPLANCHE wrote
>>
>>> But there is still something I don't understand : on the Linux
>>> machine where I ran my test program, the current RLIMIT_DATA
>>> is set to 0x/0x and I can manage to mmap at
>>> address 0x2000. If I set the same limit on FreeBSD, I
>>> won't get the mapping at 0x2000. So, there *is* a
>>> difference of behaviour between the two systems, but I don't
>>> understand why.
>>
>> Well, in fact, two things remain not very clear for me :
>>
>> - Why are mmap()s performed *after* data segment ?
>>   => It seems they can go within, on GNU/Linux and NetBSD.
>>
>> - Why do we have such a default value for datasize (8.2, amd64) :
>>
>> $ limits
>> Resource limits (current):
>>   cputime              infinity secs
>>   filesize             infinity kB
>>   datasize             33554432 kB
>>
>> this is HUGE !
>
> Just a guess - this might be some sort of optimization to keep virtual address
> range of dynamic allocations untouched by unrelated mmap calls.  Not sure if
> that's so and how useful could that be.

Something like that. In the past the heap allocator used to get memory
from the system via sbrk(). It still may do so if you set
MALLOC_OPTIONS=D. The problem is that sbrk() can't advance past an
area used by something else (i.e., an mmapped region), so the kernel
makes an effort to leave a lot of unused address space that sbrk() may
claim later on. These days malloc() uses mmap by default, so if you
don't force it to use sbrk() you can probably lower MAXDSIZE and let
the kernel use most of the address space for hinted mmaps.

That said, unless you use MAP_FIXED, mmap is not guaranteed to pay
attention to hints, and the app must be able to deal with that. FreeBSD
kernel behavior is just one possible scenario that may affect mmap
behavior. Behavior may also change between architectures, or due to
preceding mmaps (think of the dynamic linker mapping in shared
libraries). If an application relies on hints having an effect without
MAP_FIXED, it's the app that needs fixing, IMHO.

--Artem

> svn log / svn annotate of the file may reveal more details.
>
> --
> Andriy Gapon
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Performance problems with pagedaemon

2012-01-02 Thread Artem Belevich
On Mon, Jan 2, 2012 at 5:41 AM, Victor Balada Diaz  wrote:
...
> System wide totals computed every five seconds: (values in kilobytes)
> ===
> Processes:              (RUNQ: 2 Disk Wait: 0 Page Wait: 0 Sleep: 51)
> Virtual Memory:         (Total: 1098017100K, Active 24065448K)
> Real Memory:            (Total: 21157668K Active 20971144K)
> Shared Virtual Memory:  (Total: 27740K Active: 8632K)
> Shared Real Memory:     (Total: 6128K Active: 4928K)
> Free Memory Pages:      315636K

On a system with 24GB of RAM you seem to have almost all of it active.
It appears that you're simply on the edge of running out of memory,
and thus the page daemon wakes up constantly trying to find more pages...

...
> Top:
>
> last pid: 24777;  load averages:  3.26,  4.07,  4.49      up 34+19:43:58  14:32:05
> 66 processes:  5 running, 61 sleeping
> CPU:  0.2% user,  0.0% nice, 37.6% system,  0.0% interrupt, 62.2% idle
> Mem: 18G Active, 1908M Inact, 3008M Wired, 73M Cache, 2465M Buf, 232M Free
> Swap: 4096M Total, 4096M Free
>
>  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>  1059 mysql      22  44    0 21004M 19741M ucond   3  71.2H 124.51% mysqld

Mysql uses more than 20G of RAM. You may want to tune it down a bit so
that there is a bit of free RAM around.

The page daemon is trying to maintain free + cached pages at v_free_target + v_cache_min.

>vm.v_free_target: 161771
>vm.v_cache_min: 161771

In your case that would be about 1.2GB (2 x 161771 pages x 4KB). If
v_free_count + v_cache_count drops below that, the page daemon will
periodically wake up and start scanning the active/inactive lists,
trying to find pages it can reclaim. In your case, with most of the
memory in active use, the page daemon's job is almost pointless and
just wastes CPU time.

On large-memory systems the default tuning for v_free/cache_min/target
is probably somewhat conservative. You may try setting them somewhat
lower via sysctl and see if you can find an equilibrium with mysql
happy, the pagedaemon sleeping, and the system up and running. The
danger of tuning these parameters too low is that if you don't have
enough memory available for allocation without having to sleep, things
will start falling apart and will eventually hang or crash your box.
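
As a rough illustration only -- the numbers below are made up, so
start conservatively and watch top/vmstat after each change:

  sysctl vm.v_free_target=81920
  sysctl vm.v_cache_min=40960

That would drop the combined target from ~1.2GB to roughly 0.5GB. If
the system starts struggling to satisfy allocations, go back to the
defaults.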

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zfs arc and amount of wired memory

2012-02-08 Thread Artem Belevich
On Wed, Feb 8, 2012 at 4:28 PM, Jeremy Chadwick
 wrote:
> On Thu, Feb 09, 2012 at 01:11:36AM +0100, Miroslav Lachman wrote:
...
>> ARC Size:
>>          Current Size:             1769 MB (arcsize)
>>          Target Size (Adaptive):   512 MB (c)
>>          Min Size (Hard Limit):    512 MB (zfs_arc_min)
>>          Max Size (Hard Limit):    3584 MB (zfs_arc_max)
>>
>> The target size is going down to the min size and after few more
>> days, the system is so slow, that I must reboot the machine. Then it
>> is running fine for about 107 days and then it all repeat again.
>>
>> You can see more on MRTG graphs
>> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/
>> You can see links to other useful informations on top of the page
>> (arc_summary, top, dmesg, fs usage, loader.conf)
>>
>> There you can see nightly backups (higher CPU load started at
>> 01:13), otherwise the machine is idle.
>>
>> It coresponds with ARC target size lowering in last 5 days
>> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_arcstats_size.html
>>
>> And with ARC metadata cache overflowing the limit in last 5 days
>> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_vfs_meta.html
>>
>> I don't know what's going on and I don't know if it is something
>> know / fixed in newer releases. We are running a few more ZFS
>> systems on 8.2 without this issue. But those systems are in
>> different roles.
>
> This sounds like the... damn, what is it called... some kind of internal
> "counter" or "ticks" thing within the ZFS code that was discovered to
> only begin happening after a certain period of time (which correlated to
> some number of days, possibly 107).  I'm sorry that I can't be more
> specific, but it's been discussed heavily on the lists in the past, and
> fixes for all of that were committed to RELENG_8.  I wish I could
> remember the name of the function or macro or variable name it pertained
> to, something like LTHAW or TLOCK or something like that.  I would say
> "I don't know why I can't remember", but I do know why I can't remember:
> because I gave up trying to track all of these problems.
>
> Does someone else remember this issue?  CC'ing Martin who might remember
> for certain.

It's LBOLT. :-)

And there was more than one related integer overflow. One of them
manifested itself as the L2ARC feeding thread hogging CPU time after
about a month of uptime. Another one caused an issue with ARC reclaim
after 107 days. See more details in this thread:

http://lists.freebsd.org/pipermail/freebsd-fs/2011-May/011584.html

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Can't read a full block, only got 8193 bytes.

2012-02-19 Thread Artem Belevich
On Sat, Feb 18, 2012 at 10:10 PM, Ask Bjørn Hansen  wrote:
> Hi everyone,
>
> We're recycling an old database server with room for 16 disks as a backup 
> server (our old database servers had 12-20 15k disks; the new ones one or two 
> SSDs and they're faster).
>
> We have a box running FreeBSD 8.2 with 7 disks in a ZFS raidz2 (and a spare). 
>  It's using an older 3ware card with all the disks (2TB WD green "ears" ones) 
> setup as a "single" unit on the 3ware controller and though slow is basically 
> working great.  We have a small program to smartly purge old snapshots that I 
> wrote after a year and tens of thousands of snapshots: 
> https://github.com/abh/zfs-snapshot-cleaner
>
> The new box is running 9.0 with a 3ware 9690SA-4I4E card with the latest 
> firmware (4.10.00.024).  We're using Seagate 3TB barracuda disks (big and 
> cheap; good for backups).
>
> Now for the problem: When running bonnie++ we get a few ZFS checksum errors 
> and (weirder) we get this error from bonnie:
>
> "Can't read a full block, only got 8193 bytes."

That's probably just a side effect of the ZFS checksum errors. ZFS
will happily read the file until it hits a record with a checksum
error. If redundant info is available (raidz or mirror), ZFS will
attempt to recover your data. If there's no redundancy, you will get a
read error. If you run "zpool status -v" you should see the list of
files affected by corruption.

>
> This seems to only be when testing a single ZFS disk or a UFS partition.  
> Testing a raidz1 we just get checksum errors noted in zpool status, but no 
> errors reading (though read speeds are ~10MB/second across four disks -- 
> writing sequentially was ~230MB/second).
>
> Any ideas where to start look?

You need to figure out why you're getting checksum errors. Alas,
there's probably no easy way to troubleshoot it. The issue could be
hardware-related, and possible culprits include bad RAM, bad SATA
cables, and quirks of a particular firmware revision on the disk
controller and/or hard drives.

> Our best guess is that the 3ware controller can't play nicely with the disks; 
> we're planning to try some older/smaller disks on Monday and then trying the 
> same system and disks with Linux to see if the 3ware driver there works 
> differently.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zpool - low speed write

2010-08-05 Thread Artem Belevich
On Wed, Aug 4, 2010 at 9:47 PM, Alex V. Petrov  wrote:
...
>> > vfs.zfs.cache_flush_disable=1
>> > vfs.zfs.zil_disable=1
>>
>> I question both of these settings, especially the latter.  Please remove
>> them both and re-test your write performance.
>
> I removed all settings of zfs.
> Now it default.
>

ZFS will throttle writes if it thinks that not enough memory is
available. Did you by any chance tinker with VM parameters, too? Could
you post the output of the following commands?

sysctl vm | grep kmem
sysctl vfs.zfs
sysctl kstat.zfs  (before and after you run some of your write speed tests)

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zpool - low speed write

2010-08-07 Thread Artem Belevich
On Sat, Aug 7, 2010 at 4:51 AM, Ivan Voras  wrote:
> On 5.8.2010 6:47, Alex V. Petrov wrote:
>
>> camcontrol identify ada2
>> pass2:  ATA-8 SATA 2.x device
>
> Aren't those 4k sector drives?

EADS drives use regular 512-byte sectors, AFAIK. It's the EA*R*S models
that use 4K sectors.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: kernel MCA messages

2010-08-24 Thread Artem Belevich
IMHO the key here is whether the hardware is broken or not. The only
case where correctable ECC errors are OK is when a bit gets flipped by
a high-energy particle. That's a normal but fairly rare event. If you
get bit flips often enough that you can recall details of more than
one of them on the same hardware, my guess would be that you're
dealing with something else -- bad/marginal memory, signal integrity
issues, power issues, overheating... The list goes on. In all those
cases the hardware does *not* work correctly. Whether you can (or want
to) keep running stuff on hardware that is broken is another question.

--Artem



On Tue, Aug 24, 2010 at 1:15 AM, Andriy Gapon  wrote:
> on 24/08/2010 09:14 Ronald Klop said the following:
>>
>> A little off topic, but what is 'a low rate of corrected ECC errors'? At work
>> one machine has them like ones per day, but runs ok. Is ones per day much?
>
> That's up to your judgment.  It's like after how many remapped sectors do you
> replace HDD.
> You may find this interesting:
> http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf
>
> --
> Andriy Gapon
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Still getting kmem exhausted panic

2010-09-28 Thread Artem Belevich
On Tue, Sep 28, 2010 at 3:22 PM, Andriy Gapon  wrote:
> BTW, have you seen my posts about UMA and ZFS on hackers@ ?
> I found it advantageous to use UMA for ZFS I/O buffers, but only after 
> reducing
> size of per-CPU caches for the zones with large-sized items.
> I further modified the code in my local tree to completely disable per-CPU
> caches for items > 32KB.

Do you have an updated patch disabling per-CPU caches for large items?
I've just rebuilt FreeBSD-8 with your uma-2.diff (it needed r209050
from -head to compile) and so far things look good. I'll re-enable UMA
for ZFS and see how it flies in a couple of days.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zfs send/receive: is this slow?

2010-09-29 Thread Artem Belevich
On Wed, Sep 29, 2010 at 11:04 AM, Dan Langille  wrote:
> It's taken about 15 hours to copy 800GB.  I'm sure there's some tuning I
> can do.
>
> The system is now running:
>
> # zfs send storage/bac...@transfer | zfs receive storage/compressed/bacula

Try piping the zfs data through mbuffer (misc/mbuffer in ports). I've
found that it helps a lot to smooth out the data flow and increase
send/receive throughput, even when send and receive happen on the same
host. Run it with a buffer large enough to accommodate a few seconds'
worth of write throughput for your target disks.

Here's an example:
http://blogs.everycity.co.uk/alasdair/2010/07/using-mbuffer-to-speed-up-slow-zfs-send-zfs-receive/
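
For a local send/receive the pipeline would look something like this
(dataset names and sizes are illustrative):

  zfs send pool/fs@snap | mbuffer -s 128k -m 1G | zfs receive pool/backup/fs

-m sets the buffer size; a gigabyte or so is usually plenty for
disk-backed pools. For a transfer between two hosts you'd run a second
mbuffer on the receiving side (-I port) and point the sender at it
(-O host:port).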

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zfs send/receive: is this slow?

2010-10-01 Thread Artem Belevich
Hmm. It did help me a lot when I was replicating ~2TB worth of data
over GigE. Without mbuffer things were roughly in the ballpark of your
numbers. With mbuffer I got around 100MB/s.

Assuming that you have two boxes connected via ethernet, it would be
good to check that nobody is generating PAUSE frames. Some time back I
discovered that an el-cheapo switch I'd been using could not, for some
reason, keep up with traffic bursts and generated tons of PAUSE frames
that severely limited throughput.

If you're using Intel adapters, check the xon/xoff counters in "sysctl
dev.em.0.mac_stats". If you see them increasing, that may explain the
slow speed.
If you have a switch between your boxes, try bypassing it and
connecting the boxes directly.
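
Something like this makes the counters easy to watch (interface
name/number is illustrative):

  sysctl dev.em.0.mac_stats | grep xo

Run it a few times during a transfer; steadily climbing xon/xoff
counts mean flow control is kicking in somewhere on the path.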

--Artem



On Fri, Oct 1, 2010 at 11:51 AM, Dan Langille  wrote:
>
> On Wed, September 29, 2010 2:04 pm, Dan Langille wrote:
>> $ zpool iostat 10
>>                capacity     operations    bandwidth
>> pool         used  avail   read  write   read  write
>> --  -  -  -  -  -  -
>> storage     7.67T  5.02T    358     38  43.1M  1.96M
>> storage     7.67T  5.02T    317    475  39.4M  30.9M
>> storage     7.67T  5.02T    357    533  44.3M  34.4M
>> storage     7.67T  5.02T    371    556  46.0M  35.8M
>> storage     7.67T  5.02T    313    521  38.9M  28.7M
>> storage     7.67T  5.02T    309    457  38.4M  30.4M
>> storage     7.67T  5.02T    388    589  48.2M  37.8M
>> storage     7.67T  5.02T    377    581  46.8M  36.5M
>> storage     7.67T  5.02T    310    559  38.4M  30.4M
>> storage     7.67T  5.02T    430    611  53.4M  41.3M
>
> Now that I'm using mbuffer:
>
> $ zpool iostat 10
>               capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> --  -  -  -  -  -  -
> storage     9.96T  2.73T  2.01K    131   151M  6.72M
> storage     9.96T  2.73T    615    515  76.3M  33.5M
> storage     9.96T  2.73T    360    492  44.7M  33.7M
> storage     9.96T  2.73T    388    554  48.3M  38.4M
> storage     9.96T  2.73T    403    562  50.1M  39.6M
> storage     9.96T  2.73T    313    468  38.9M  28.0M
> storage     9.96T  2.73T    462    677  57.3M  22.4M
> storage     9.96T  2.73T    383    581  47.5M  21.6M
> storage     9.96T  2.72T    142    571  17.7M  15.4M
> storage     9.96T  2.72T     80    598  10.0M  18.8M
> storage     9.96T  2.72T    718    503  89.1M  13.6M
> storage     9.96T  2.72T    594    517  73.8M  14.1M
> storage     9.96T  2.72T    367    528  45.6M  15.1M
> storage     9.96T  2.72T    338    520  41.9M  16.4M
> storage     9.96T  2.72T    348    499  43.3M  21.5M
> storage     9.96T  2.72T    398    553  49.4M  14.4M
> storage     9.96T  2.72T    346    481  43.0M  6.78M
>
> If anything, it's slower.
>
> The above was without -s 128.  The following used that setting:
>
>  $ zpool iostat 10
>               capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> --  -  -  -  -  -  -
> storage     9.78T  2.91T  1.98K    137   149M  6.92M
> storage     9.78T  2.91T    761    577  94.4M  42.6M
> storage     9.78T  2.91T    462    411  57.4M  24.6M
> storage     9.78T  2.91T    492    497  61.1M  27.6M
> storage     9.78T  2.91T    632    446  78.5M  22.5M
> storage     9.78T  2.91T    554    414  68.7M  21.8M
> storage     9.78T  2.91T    459    434  57.0M  31.4M
> storage     9.78T  2.91T    398    570  49.4M  32.7M
> storage     9.78T  2.91T    338    495  41.9M  26.5M
> storage     9.78T  2.91T    358    526  44.5M  33.3M
> storage     9.78T  2.91T    385    555  47.8M  39.8M
> storage     9.78T  2.91T    271    453  33.6M  23.3M
> storage     9.78T  2.91T    270    456  33.5M  28.8M
>
>
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zfs send/receive: is this slow?

2010-10-01 Thread Artem Belevich
On Fri, Oct 1, 2010 at 3:49 PM, Dan Langille  wrote:
> FYI: this is all on the same box.

In one of the previous emails you've used this command line:
> # mbuffer -s 128k -m 1G -I 9090 | zfs receive

You've used mbuffer in network client mode, so I assumed that you did
your transfer over the network.

If you're running send/receive locally, just pipe the data through
mbuffer -- zfs send | mbuffer | zfs receive

--Artem

>
> --
> Dan Langille
> http://langille.org/
>
>
> On Oct 1, 2010, at 5:56 PM, Artem Belevich  wrote:
>
>> Hmm. It did help me a lot when I was replicating ~2TB worth of data
>> over GigE. Without mbuffer things were roughly in the ballpark of your
>> numbers. With mbuffer I've got around 100MB/s.
>>
>> Assuming that you have two boxes connected via ethernet, it would be
>> good to check that nobody generates PAUSE frames. Some time back I've
>> discovered that el-cheapo switch I've been using for some reason could
>> not keep up with traffic bursts and generated tons of PAUSE frames
>> that severely limited throughput.
>>
>> If you're using Intel adapters, check xon/xoff counters in "sysctl
>> dev.em.0.mac_stats". If you see them increasing, that may explain slow
>> speed.
>> If you have a switch between your boxes, try bypassing it and connect
>> boxes directly.
>>
>> --Artem
>>
>>
>>
>> On Fri, Oct 1, 2010 at 11:51 AM, Dan Langille  wrote:
>>>
>>> On Wed, September 29, 2010 2:04 pm, Dan Langille wrote:
>>>> $ zpool iostat 10
>>>>                capacity     operations    bandwidth
>>>> pool         used  avail   read  write   read  write
>>>> --  -  -  -  -  -  -
>>>> storage     7.67T  5.02T    358     38  43.1M  1.96M
>>>> storage     7.67T  5.02T    317    475  39.4M  30.9M
>>>> storage     7.67T  5.02T    357    533  44.3M  34.4M
>>>> storage     7.67T  5.02T    371    556  46.0M  35.8M
>>>> storage     7.67T  5.02T    313    521  38.9M  28.7M
>>>> storage     7.67T  5.02T    309    457  38.4M  30.4M
>>>> storage     7.67T  5.02T    388    589  48.2M  37.8M
>>>> storage     7.67T  5.02T    377    581  46.8M  36.5M
>>>> storage     7.67T  5.02T    310    559  38.4M  30.4M
>>>> storage     7.67T  5.02T    430    611  53.4M  41.3M
>>>
>>> Now that I'm using mbuffer:
>>>
>>> $ zpool iostat 10
>>>               capacity     operations    bandwidth
>>> pool         used  avail   read  write   read  write
>>> --  -  -  -  -  -  -
>>> storage     9.96T  2.73T  2.01K    131   151M  6.72M
>>> storage     9.96T  2.73T    615    515  76.3M  33.5M
>>> storage     9.96T  2.73T    360    492  44.7M  33.7M
>>> storage     9.96T  2.73T    388    554  48.3M  38.4M
>>> storage     9.96T  2.73T    403    562  50.1M  39.6M
>>> storage     9.96T  2.73T    313    468  38.9M  28.0M
>>> storage     9.96T  2.73T    462    677  57.3M  22.4M
>>> storage     9.96T  2.73T    383    581  47.5M  21.6M
>>> storage     9.96T  2.72T    142    571  17.7M  15.4M
>>> storage     9.96T  2.72T     80    598  10.0M  18.8M
>>> storage     9.96T  2.72T    718    503  89.1M  13.6M
>>> storage     9.96T  2.72T    594    517  73.8M  14.1M
>>> storage     9.96T  2.72T    367    528  45.6M  15.1M
>>> storage     9.96T  2.72T    338    520  41.9M  16.4M
>>> storage     9.96T  2.72T    348    499  43.3M  21.5M
>>> storage     9.96T  2.72T    398    553  49.4M  14.4M
>>> storage     9.96T  2.72T    346    481  43.0M  6.78M
>>>
>>> If anything, it's slower.
>>>
>>> The above was without -s 128.  The following used that setting:
>>>
>>>  $ zpool iostat 10
>>>               capacity     operations    bandwidth
>>> pool         used  avail   read  write   read  write
>>> --  -  -  -  -  -  -
>>> storage     9.78T  2.91T  1.98K    137   149M  6.92M
>>> storage     9.78T  2.91T    761    577  94.4M  42.6M
>>> storage     9.78T  2.91T    462    411  57.4M  24.6M
>>> storage     9.78T  2.91T    492    497  61.1M  27.6M
>>> storage     9.78T  2.91T    632    446  78.5M  22.5M
>>> storage     9.78T  2.91T    554    414  68.7M  21.8M
>>> storage     9.78T  2.91T    459    434  57.0M  31.4M
>>> storage     9.78T  2.91T    398    570  49.4M  32.7M
>>> storage     9.78T  2.91T    338    495  41.9M  26.5M
>>> storage     9.78T  2.91T    358    526  44.5M  33.3M
>>> storage     9.78T  2.91T    385    555  47.8M  39.8M
>>> storage     9.78T  2.91T    271    453  33.6M  23.3M
>>> storage     9.78T  2.91T    270    456  33.5M  28.8M
>>>
>>>
>>> ___
>>> freebsd-stable@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>>> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>>>
>>
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zfs send/receive: is this slow?

2010-10-01 Thread Artem Belevich
> As soon as I opened this email I knew what it would say.
>
>
> # time zfs send storage/bac...@transfer | mbuffer | zfs receive
> storage/compressed/bacula-mbuffer
> in @  197 MB/s, out @  205 MB/s, 1749 MB total, buffer   0% full
...
> Big difference.  :)

I'm glad it helped.

Does anyone know why sending/receiving stuff via the loopback
interface is so much slower than a pipe?

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zfs send/receive: is this slow?

2010-10-02 Thread Artem Belevich
I've just tested this on my box and the loopback interface does not seem to be
the bottleneck. I can easily push ~400MB/s through two
instances of mbuffer.

--Artem



On Fri, Oct 1, 2010 at 7:51 PM, Sean  wrote:
>
> On 02/10/2010, at 11:43 AM, Artem Belevich wrote:
>
>>> As soon as I opened this email I knew what it would say.
>>>
>>>
>>> # time zfs send storage/bac...@transfer | mbuffer | zfs receive
>>> storage/compressed/bacula-mbuffer
>>> in @  197 MB/s, out @  205 MB/s, 1749 MB total, buffer   0% full
>> ..
>>> Big difference.  :)
>>
>> I'm glad it helped.
>>
>> Does anyone know why sending/receiving stuff via loopback is so much
>> slower compared to pipe?
>
>
> Up and down the entire network stack, in and out of TCP buffers at both 
> ends... might add some overhead, and other factors in limiting it.
>
> Increasing TCP buffers, and disabling delayed acks might help. Nagle might 
> also have to be disabled too. (delayed acks and nagle in combination can 
> interact in odd ways)
>
>
>>
>> --Artem
>> ___
>> freebsd-stable@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zfs send/receive: is this slow?

2010-10-03 Thread Artem Belevich
On Sun, Oct 3, 2010 at 6:11 PM, Dan Langille  wrote:
> I'm rerunning my test after I had a drive go offline[1].  But I'm not
> getting anything like the previous test:
>
> time zfs send storage/bac...@transfer | mbuffer | zfs receive
> storage/compressed/bacula-buffer
>
> $ zpool iostat 10 10
>               capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> --  -  -  -  -  -  -
> storage     6.83T  5.86T      8     31  1.00M  2.11M
> storage     6.83T  5.86T    207    481  25.7M  17.8M

It may be worth checking individual disk activity using gstat -f 'da.$'

Some time back I had one drive that was noticeably slower than the
rest of the drives in a RAID-Z2 vdev and was holding everything back.
SMART looked OK, there were no obvious errors and yet performance was
much worse than what I'd expect. gstat clearly showed that one drive
was almost constantly busy with much lower number of reads and writes
per second than its peers.

Perhaps the previously fast transfer rates were due to caching effects.
I.e., if all the metadata had already made it into the ARC, subsequent "zfs send"
commands would avoid a lot of random seeks and would show much better
throughput.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: VirtualBox OpenSolaris guest

2010-10-10 Thread Artem Belevich
On Sun, Oct 10, 2010 at 5:25 PM, Alex Goncharov
 wrote:
> (It only www/opera stopped crashing on File/Exit now...)

I think I've accidentally stumbled on a workaround for this crash on exit issue.

Once you've started opera and opened a page (any page), turn print
preview on and off (Menu->Print->Print preview). Once it's done, opera
will exit cleanly. It beats me why print preview has anything to do
with exiting opera, but in my case it certainly does.

Another option is to delete liboperagtk.so once opera has been
installed. The downside is that file dialogs will be horrible. I
personally stick with the print preview workaround.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: VirtualBox OpenSolaris guest

2010-10-11 Thread Artem Belevich
On Mon, Oct 11, 2010 at 4:32 AM, Jakub Lach  wrote:
> Remedy for ugly file dialogs is skin with skinned ones.
>
> e.g. http://my.opera.com/community/customize/skins/info/?id=10071

That may make them look better, but the main issue was that the file
open/save dialogs turned into a simple text input field without any way to
browse for files. I think a few other dialogs lost their
functionality, too.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Degraded zpool cannot detach old/bad drive

2010-10-27 Thread Artem Belevich
Are you interested in what's wrong or in how to fix it?

If fixing is the priority, I'd boot from OpenSolaris live CD and would
try importing the array there. Just make sure you don't upgrade ZFS to
a version that is newer than the one FreeBSD supports.

Opensolaris may be able to fix the array. Once it's done, export it,
boot back to FreeBSD and re-import it.
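
A minimal sketch of that round trip, assuming the pool is named "tank" as
in your status output (the first three steps on the live CD, the last one
back on FreeBSD):

# zpool import -f tank      (-f since the pool was last used on another host)
  ... let it repair things: "zpool clear", "zpool scrub", etc. as needed ...
# zpool export tank
# zpool import tank         (back on FreeBSD)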

--Artem



On Wed, Oct 27, 2010 at 4:22 PM, Rumen Telbizov  wrote:
> No ideas whatsoever?
>
> On Tue, Oct 26, 2010 at 1:04 PM, Rumen Telbizov  wrote:
>
>> Hello everyone,
>>
>> After a few days of struggle with my degraded zpool on a backup server I
>> decided to ask for
>> help here or at least get some clues as to what might be wrong with it.
>> Here's the current state of the zpool:
>>
>> # zpool status
>>
>>   pool: tank
>>  state: DEGRADED
>> status: One or more devices has experienced an error resulting in data
>>         corruption.  Applications may be affected.
>> action: Restore the file in question if possible.  Otherwise restore the
>>         entire pool from backup.
>>    see: http://www.sun.com/msg/ZFS-8000-8A
>>  scrub: none requested
>> config:
>>
>>         NAME                          STATE     READ WRITE CKSUM
>>         tank                          DEGRADED     0     0     0
>>           raidz1                      DEGRADED     0     0     0
>>             spare                     DEGRADED     0     0     0
>>               replacing               DEGRADED     0     0     0
>>                 17307041822177798519  UNAVAIL      0   299     0  was
>> /dev/gpt/disk-e1:s2
>>                 gpt/newdisk-e1:s2     ONLINE       0     0     0
>>               gpt/disk-e2:s10         ONLINE       0     0     0
>>             gpt/disk-e1:s3            ONLINE      30     0     0
>>             gpt/disk-e1:s4            ONLINE       0     0     0
>>             gpt/disk-e1:s5            ONLINE       0     0     0
>>           raidz1                      ONLINE       0     0     0
>>             gpt/disk-e1:s6            ONLINE       0     0     0
>>             gpt/disk-e1:s7            ONLINE       0     0     0
>>             gpt/disk-e1:s8            ONLINE       0     0     0
>>             gpt/disk-e1:s9            ONLINE       0     0     0
>>           raidz1                      ONLINE       0     0     0
>>             gpt/disk-e1:s10           ONLINE       0     0     0
>>             gpt/disk-e1:s11           ONLINE       0     0     0
>>             gpt/disk-e1:s12           ONLINE       0     0     0
>>             gpt/disk-e1:s13           ONLINE       0     0     0
>>           raidz1                      DEGRADED     0     0     0
>>             gpt/disk-e1:s14           ONLINE       0     0     0
>>             gpt/disk-e1:s15           ONLINE       0     0     0
>>             gpt/disk-e1:s16           ONLINE       0     0     0
>>             spare                     DEGRADED     0     0     0
>>               replacing               DEGRADED     0     0     0
>>                 15258738282880603331  UNAVAIL      0    48     0  was
>> /dev/gpt/disk-e1:s17
>>                 gpt/newdisk-e1:s17    ONLINE       0     0     0
>>               gpt/disk-e2:s11         ONLINE       0     0     0
>>           raidz1                      ONLINE       0     0     0
>>             gpt/disk-e1:s18           ONLINE       0     0     0
>>             gpt/disk-e1:s19           ONLINE       0     0     0
>>             gpt/disk-e1:s20           ONLINE       0     0     0
>>             gpt/disk-e1:s21           ONLINE       0     0     0
>>           raidz1                      ONLINE       0     0     0
>>             gpt/disk-e1:s22           ONLINE       0     0     0
>>             gpt/disk-e1:s23           ONLINE       0     0     0
>>             gpt/disk-e2:s0            ONLINE       0     0     0
>>             gpt/disk-e2:s1            ONLINE       0     0     0
>>           raidz1                      ONLINE       0     0     0
>>             gpt/disk-e2:s2            ONLINE       0     0     0
>>             gpt/disk-e2:s3            ONLINE       0     0     0
>>             gpt/disk-e2:s4            ONLINE       0     0     0
>>             gpt/disk-e2:s5            ONLINE       0     0     0
>>           raidz1                      ONLINE       0     0     0
>>             gpt/disk-e2:s6            ONLINE       0     0     0
>>             gpt/disk-e2:s7            ONLINE       0     0     0
>>             gpt/disk-e2:s8            ONLINE       0     0     0
>>             gpt/disk-e2:s9            ONLINE       0     0     0
>>         spares
>>           gpt/disk-e2:s10             INUSE     currently in use
>>           gpt/disk-e2:s11             INUSE     currently in use
>>           gpt/disk-e1:s2              UNAVAIL   cannot open
>>           gpt/newdisk-e1:s17          INUSE     currently in use
>>
>> errors: 4 data errors, use '-v' for a list
>>
>>
>> The problem is: after replacing the bad drives and resilveri

Re: Degraded zpool cannot detach old/bad drive

2010-10-28 Thread Artem Belevich
> but only those 3 devices in /dev/gpt and absolutely nothing in /dev/gptid/
> So is there a way to bring all the gpt labeled partitions back into the pool
> instead of using the mfidXX devices?

Try re-importing the pool with "zpool import -d /dev/gpt". This will
tell ZFS to use only devices found within that path and your pool
should be using gpt labels again.
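
For example, assuming the pool is named "tank" and is currently imported:

# zpool export tank
# zpool import -d /dev/gpt tank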

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Degraded zpool cannot detach old/bad drive

2010-10-29 Thread Artem Belevich
On Thu, Oct 28, 2010 at 10:51 PM, Rumen Telbizov  wrote:
> Hi Artem, everyone,
>
> Thanks for your quick response. Unfortunately I already did try this
> approach.
> Applying -d /dev/gpt only limits the pool to the bare three remaining disks
> which turns the
> pool completely unusable (no mfid devices). Maybe those labels are removed
> shortly after
> they are imported/accessed?

In one of the previous emails you've clearly listed many devices in
/dev/gpt and said that they've disappeared after pool import.
Did you do "zpool import -d /dev/gpt" while /dev/gpt entries were present?

> What I don't understand is what exactly makes those gpt labels disappear
> when the pool is imported and otherwise are just fine?!

This is the way GEOM works. If something (ZFS in this case) uses the raw
device, the derived GEOM entities disappear.

Try exporting the pool. Your /dev/gpt entries should be back. Now try
to import with -d option and see if it works.

You may try bringing the labels back the hard way by detaching each raw
drive and then re-attaching it via its label, but resilvering one
drive at a time will take a while.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Degraded zpool cannot detach old/bad drive

2010-10-29 Thread Artem Belevich
On Fri, Oct 29, 2010 at 11:34 AM, Rumen Telbizov  wrote:
> The problem I think comes down to what I have written in the zpool.cache
> file.
> It stores the mfid path instead of the gpt/disk one.
>       children[0]
>              type='disk'
>              id=0
>              guid=1641394056824955485
>              path='/dev/mfid33p1'
>              phys_path='/p...@0,0/pci8086,3...@1c/pci15d9,c...@0/s...@1,0:a'
>              whole_disk=0
>              DTL=55

Yes, phys_path does look like something that came from solaris.

> Compared to a disk from a partner server which is fine:
>       children[0]
>              type='disk'
>              id=0
>              guid=5513814503830705577
>              path='/dev/gpt/disk-e1:s6'
>              whole_disk=0

If you have an old copy of /boot/zfs/zpool.cache you could try using "zpool
import -c old-cache-file".

I don't think zpool.cache is needed for import. Import should work
without it just fine. Just remove /boot/zfs/zpool.cache (or move it
somewhere else) and then try importing with -d /dev/gpt again.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Degraded zpool cannot detach old/bad drive

2010-10-29 Thread Artem Belevich
On Fri, Oct 29, 2010 at 2:19 PM, Rumen Telbizov  wrote:
> You're right. zpool export tank seems to remove the cache file so import has
> nothing to consult so doesn't make any difference.
> I guess my only chance at this point would be to somehow manually edit
> the zpool configuration, via the zpool.cache file or not, and substitute
> mfid with gpt/disk?!
> Is there a way to do this?

I'm not aware of any tools to edit zpool.cache.

What's really puzzling is why GPT labels disappear in the middle of
zpool import. I'm fresh out of ideas why that would happen.

What FreeBSD version are you running? The SVN revision of the sources
would be good, but a date may also work.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Degraded zpool cannot detach old/bad drive

2010-10-29 Thread Artem Belevich
On Fri, Oct 29, 2010 at 4:42 PM, Rumen Telbizov  wrote:
> FreeBSD 8.1-STABLE #0: Sun Sep  5 00:22:45 PDT 2010
> That's when I csuped and rebuilt world/kernel.

There were a lot of ZFS-related MFCs since then. I'd suggest updating
to the most recent -stable and try again.

I've got another idea that may or may not work. Assuming that GPT
labels disappear because zpool opens one of the /dev/mfid* devices,
you can try to do "chmod a-rw /dev/mfid*" on them and then try
importing the pool again.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: How to tell whether ECC (memory) is enabled?

2010-11-06 Thread Artem Belevich
On Sat, Nov 6, 2010 at 9:09 AM, Thomas Zander
 wrote:
> This means for now I have to trust the BIOS that ECC is enabled and I
> should see MCA reports in the dmesg output once a bit error is
> detected?

Well, you don't have to take the BIOS' word for it -- you can test whether ECC
really works. All you need is to intentionally make one data bit bad. Put
some tape on one of the data pads on the DIMM and run memtest. That
would conclusively prove whether the motherboard has ECC enabled.

See here for more details:
http://bluesmoke.sourceforge.net/testing.html

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: How to tell whether ECC (memory) is enabled?

2010-11-06 Thread Artem Belevich
I would agree with you if one tried to use electrical tape. It
would be unsuitable for this purpose because of its thickness and
because of the residue it tends to leave.

I believe there are better options. For what it's worth, at home
scotch tape ("invisible" matte kind) worked well enough for me. It
didn't get damaged by the slot connector and it didn't leave any
residue. YMMV, caveat emptor, beware, use at your own risk, you know
the drill..

--Artem



On Sat, Nov 6, 2010 at 5:00 PM,   wrote:
> Artem Belevich  wrote:
>
>> All you need is intentionally make one data bit bad.  Put some
>> tape on one of the data pads on the DIMM and run memtest ...
>
> and then spend the next couple of hours cleaning the gunk off of
> the DIMM and out of the slot :(
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: DTrace (or other monitor) access to LBA of a block device

2010-12-03 Thread Artem Belevich
On Thu, Dec 2, 2010 at 1:33 PM, Thomas Zander
 wrote:
> Hi,
>
> do we have any way to monitor which LBAs of which block device are
> read/written at a given time?
>
> I stumbled upon this,
> http://southbrain.com/south/2008/02/fun-with-dtrace-and-zfs-mirror.html
>
> which is pretty intriguing. Unfortunately on FreeBSD we do not have
> the DTrace io provider, so his dtrace script would not work.
> Do we have another option to monitor block device access in a similar fashion?

GEOM sounds like a good candidate for probing of that kind.

sudo dtrace -n 'fbt:kernel:g_io_deliver:entry
    { printf("%s %d %d %d\n", stringof(args[0]->bio_from->geom->name),
        args[0]->bio_cmd, args[0]->bio_offset, args[0]->bio_length); }'

Keep in mind that g_io_deliver will be called for each GEOM node from
top to bottom for each completed request. You may need to add some
filtering on device name to avoid redundant info.
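
For instance, a predicate keyed on the provider name (here "ada0" -- a
made-up example, substitute your own disk) limits the output to a single
GEOM node:

sudo dtrace -n 'fbt:kernel:g_io_deliver:entry
    /stringof(args[0]->bio_from->geom->name) == "ada0"/
    { printf("%d %d %d\n", args[0]->bio_cmd, args[0]->bio_offset,
        args[0]->bio_length); }'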

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: DTrace (or other monitor) access to LBA of a block device

2010-12-05 Thread Artem Belevich
> GEOM sounds like a good candidate for probing of that kind.
>
> sudo dtrace -n 'fbt:kernel:g_io_deliver:entry { printf("%s %d %d
> %d\n",stringof(args[0]->bio_from->geom->name), args[0]->bio_cmd,
> args[0]->bio_offset, args[0]->bio_length); }'

By the way, in order for this to work one would need r207057 applied
to -8. Any chance that could be MFC'ed?

http://svn.freebsd.org/viewvc/base?view=revision&revision=207057

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: DTrace (or other monitor) access to LBA of a block device

2010-12-05 Thread Artem Belevich
On Sun, Dec 5, 2010 at 11:31 AM, Andriy Gapon  wrote:
>> By the way, in order for this to work one would need r207057 applied
>> to -8. Any chance that could be MFC'ed?
>>
>> http://svn.freebsd.org/viewvc/base?view=revision&revision=207057
>
> Nice catch.
>
> Alexander,
> can that commit be trivially MFC-ed or are there any complications around this
> change?

r207057 seems to depend on r206082:
http://svn.freebsd.org/viewvc/base?view=revision&revision=206082

I'm using a simpler version of the changes equivalent to those in r207057
that I picked up on one of the FreeBSD lists some time back:

diff --git a/sys/conf/kmod.mk b/sys/conf/kmod.mk
index 56ef3ef..e9b5879 100644
--- a/sys/conf/kmod.mk
+++ b/sys/conf/kmod.mk
@@ -132,6 +132,10 @@ CFLAGS+=   -mlongcall -fno-omit-frame-pointer
 CFLAGS+=   -G0 -fno-pic -mno-abicalls -mlong-calls
 .endif

+.if defined(DEBUG) || defined(DEBUG_FLAGS)
+CTFFLAGS+= -g
+.endif
+
 .if defined(FIRMWS)
 .if !exists(@)
 ${KMOD:S/$/.c/}: @
@@ -197,6 +201,9 @@ ${KMOD}.kld: ${OBJS}
 ${FULLPROG}: ${OBJS}
 .endif
${LD} ${LDFLAGS} -r -d -o ${.TARGET} ${OBJS}
+.if defined(CTFMERGE)
+   ${CTFMERGE} ${CTFFLAGS} -o ${.TARGET} ${OBJS}
+.endif
 .if defined(EXPORT_SYMS)
 .if ${EXPORT_SYMS} != YES
 .if ${EXPORT_SYMS} == NO

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: New ZFSv28 patchset for 8-STABLE

2011-01-01 Thread Artem Belevich
On Sat, Jan 1, 2011 at 10:18 AM, Attila Nagy  wrote:
> What I see:
> - increased CPU load
> - decreased L2 ARC hit rate, decreased SSD (ad[46]), therefore increased
> hard disk load (IOPS graph)
>
...
> Any ideas on what could cause these? I haven't upgraded the pool version and
> nothing was changed in the pool or in the file system.


The fact that the L2ARC is full does not mean that it contains the right
data. Initial L2ARC warm-up happens at a much higher rate than the
rate at which the L2ARC is updated once it has been filled. Even
accelerated warm-up took almost a day in your case. In order for L2ARC
to warm up properly you may have to wait quite a bit longer. My guess
is that it should slowly improve over the next few days as data goes
through L2ARC and those bits that are hit more often take residence
there. The larger your data set, the longer it will take for L2ARC to
catch the right data.

Do you have similar graphs from pre-patch system just after reboot? I
suspect that it may show similarly abysmal L2ARC hit rates initially,
too.
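
In the meantime, one way to watch the warm-up from the FreeBSD side is to
sample the L2ARC counters every now and then and look at the deltas (sysctl
names as exported on my box; adjust if your arcstats differ):

sysctl kstat.zfs.misc.arcstats.l2_hits \
       kstat.zfs.misc.arcstats.l2_misses \
       kstat.zfs.misc.arcstats.l2_size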

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: constant zfs data corruption

2008-10-20 Thread Artem Belevich
> all right and understood but shouldn't something as fsck should correct the
> error? Seems kind of problematic to me mounting zfs in single user mode,
> deleting the file and restarting the OS ?

According to Sun's documents, removing the corrupted file seems to be
the 'official' way to get rid of the problem:
http://docs.sun.com/app/docs/doc/819-5461/gbctx?a=view

   If the damage is within a file data block, then the file can safely be
   removed, thereby clearing the error from the system.

If the files have something important and you want to recover
uncorrupted parts, here's what you can try:
http://blogs.sun.com/relling/entry/holy_smokes_a_holey_file
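
A rough local equivalent of the approach described there: copy the file with
dd told to skip over unreadable records (those come out as zero-filled
blocks in the copy; the file names below are just placeholders):

# dd if=damaged-file of=recovered-copy bs=128k conv=noerror,sync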

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Western Digital hard disks and ATA timeouts

2008-11-07 Thread Artem Belevich
> Note that Western Digital's "RAID edition" drives claim to take up to 7
> seconds to reallocate sectors, using something they call TLER, which
> force-limits the amount of time the drive can spend reallocating.  TLER
> cannot be disabled:

TLER can be enabled/disabled on recent WD drives (SE16/RE2/GP). SE16/GP
come with TLER off, RE2 with TLER on. Google WDTLER utility.
It can apparently be obtained from WD by asking them nicely.
Or, yet again, google is your friend. Here's one example -
http://www.hardforum.com/archive/index.php/t-1191548.html

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Support for SAS/SATA non-RAID adapters

2009-11-17 Thread Artem Belevich
>  LSI SAS 3080X-R    8-port SATA/SATA PCI-X

This one uses the LSI1068 chip, which is supported by the mpt driver. I'm using
a motherboard with an on-board equivalent of this and don't have much to
complain about. I did see some CRC errors with SATA drives in 3Gbps
mode, but those went away after updating the firmware to 1.29.0.0. I've
seen some comments on the zfs-discuss mailing list that the -IR variant of the
firmware (the one that provides RAID0/1 capabilities) has some
stability issues; they recommended going with the simpler -IT version (just
pass-through disks).

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Support for SAS/SATA non-RAID adapters

2009-11-17 Thread Artem Belevich
In general, I've found the following page very informative about what's available:
http://www.hardforums.com/showthread.php?t=1413050

I've also tried AOC-SAT2-MV8:
http://www.supermicro.com/products/accessories/addon/AOC-SAT2-MV8.cfm

It's based on the Marvell 88SX6081 chipset. Technically it is supported by
FreeBSD, but I'd rather stay away from it, at least for now. The main
issue is that the largest transfer size is 32K. ZFS does push this
card hard, and under load this card showed a noticeably slower transfer
rate than the LSI1068 under the same circumstances. Folks on the zfs-discuss
list also mentioned issues with hot-swap on this card at the controller
level. The somewhat better news is that NetBSD does seem to have a much
better driver for this Marvell chip. If someone gets around to porting it to
FreeBSD, the card may be a pretty decent choice for those who have a PCI-X
slot on-board.

--Artem



On Tue, Nov 17, 2009 at 6:12 PM, Artem Belevich  wrote:
>>  LSI SAS 3080X-R    8-port SATA/SATA PCI-X
>
> This one uses LSI1068 chip which is supported by mpt driver. I'm using
> motherboard with an on-board equivalent of this and don't have much to
> complain about. I did see some CRC errors with SATA drives in 3Gbps
> mode, but those went away after updating firmware to 1.29.0.0. I've
> seen some comments on zfs-discuss mailing list that -IR variant of the
> firmware (the one that provides RAID0/1 capabilities) does have some
> stability issues and recommended going with simpler -IT version (just
> pass-through disks).
>
> --Artem
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Support for SAS/SATA non-RAID adapters

2009-11-17 Thread Artem Belevich
Hi,

> If that one uses the LSI1068 chipset, do you know which one uses the
> LSI1078 chipset?

Supermicro's AOC-USAS-H8iR uses LSI1078:
http://www.supermicro.com/products/accessories/addon/AOC-USAS-H8iR.cfm

Dell PERC 6/i is based on LSI1078 as well.
http://www.dell.com/content/topics/topic.aspx/global/products/pvaul/topics/en/us/raid_controller?c=us&l=en&cs=555

However, these cards are full-blown RAID controllers with their own
CPU, memory and corresponding price.

>  I've seen that number in the comments in one of the mf* drivers (think it 
> was mfi).

Yes, it is indeed mfi that supports LSI1078.

> How does one determine which actual chipset is in which controller?  Do they 
> have that buried in the docs somewhere?

The docs, if you're lucky. Cards listed above mention controller chip
explicitly on the product pages.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: ZFS performance degradation over time

2010-01-08 Thread Artem Belevich
Keep an eye on ARC size and on active/inactive/cache/free memory lists:

sysctl kstat.zfs.misc.arcstats.size
sysctl vm.stats.vm.v_inactive_count
sysctl vm.stats.vm.v_active_count
sysctl vm.stats.vm.v_cache_count
sysctl vm.stats.vm.v_free_count

ZFS performance does degrade a lot if ARC becomes too small. Writes
also get throttled if ZFS thinks the system is running low on memory.

One way to help the situation somewhat is to bump the vfs.zfs.arc_min
tunable. It would make ZFS somewhat less eager to give up memory.
However, write throttling seems to rely on the amount of memory on the
free list. FreeBSD appears to have somewhat different semantics for
"free" compared to Solaris, and that makes ZFS think that we're running
low on memory while there's plenty of it sitting on the inactive/cache
lists that could be used.
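
For what it's worth, bumping arc_min is just another loader tunable; for
example, in /boot/loader.conf (the value here is purely illustrative --
size it to your RAM and workload):

vfs.zfs.arc_min="1G"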

One rather crude way to get ZFS back in shape in this situation is to
temporarily cause a real memory shortage on the system. That would force
trimming of the active/inactive lists (and the ARC, too), but once it's done,
the ARC would be free to grow and that may restore ZFS performance for a
while.

The following command will allocate about 8G of memory on my system --
enough to start a swapout:
perl -e '$x="x" x (8*1024*1024*1024)'

--Artem


On Fri, Jan 8, 2010 at 8:31 AM, Garrett Moore  wrote:
>
> No, I haven't isolated the cause to only be uptime related. In my original
> email I mentioned that "as suggested by someone in the thread, it's probably
> not directly related to system uptime, but instead related to usage - the
> more usage, the worse the performance."
>
> I've been starting my system with different combinations of applications
> running to see what access patterns cause the most slowdown. So far, I don't
> have enough data to give anything concrete.
>
> This weekend I'll try some tests such as the one you describe, and see what
> happens. I have a strong suspicion that rTorrent is to blame, since I
> haven't seen major slowdowns in the last few days with rTorrent not running.
> rTorrent preallocates the space needed for the file download (and I'm
> downloading large 4GB+ files using it), and then writes to them in an
> unpredictable pattern, so maybe ZFS doesn't like being touched this way?
>
>
>
> On Wed, Jan 6, 2010 at 2:21 PM, Ivan Voras  wrote:
>
> > On 3.1.2010 17:42, Garrett Moore wrote:
> >
> > > I'm having problems with ZFS performance. When my system comes up,
> > > read/write speeds are excellent (testing with dd if=/dev/zero
> > > of=/tank/bigfile and dd if=/tank/bigfile of=/dev/null); I get at least
> > > 100MB/s on both reads and writes, and I'm happy with that.
> > >
> > > The longer the system is up, the worse my performance gets. Currently my
> > > system has been up for 4 days, and read/write performance is down to
> > about
> > > 10MB/s at best.
> >
> > Are you sure you have isolated the cause to be only the uptime of the
> > machine? Is there no other change between the runs? E.g. did you stop
> > all other services and applications on the machine before doing the test
> > for the second time? Can you create a big file (2x memory size) when the
> > machine boots, measure the time to read it, then read it again after a
> > few days when you notice performance problems?
> >
> >
> > ___
> > freebsd-stable@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> > To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
> >
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: 8.0-RELEASE/amd64 - full ZFS install - low read and write disk performance

2010-01-25 Thread Artem Belevich
The AOC-SAT2-MV8 was somewhat slower compared to ICH9 or LSI1068
controllers when I tried it with 6 and 8 disks.
I think the problem is that the MV8 only does 32K per transfer and that
does seem to matter when you have 8 drives hooked up to it. I don't
have hard numbers, but peak throughput of the MV8 with an 8-disk raidz2 was
noticeably lower than that of the LSI1068 in the same configuration. Both
the LSI1068 and the MV8 were on the same PCI-X bus. It could be a driver
limitation. The driver for Marvell SATA controllers in NetBSD seems a
bit more advanced compared to what's in FreeBSD.

I wish Intel would make a cheap multi-port PCIe SATA card based on their
AHCI controllers.

--Artem

On Mon, Jan 25, 2010 at 3:29 AM, Pete French
 wrote:
>> I like to use pci-x with aoc-sat2-mv8 cards or pci-e cards; that way you
>> get a lot more bandwidth..
>
> I would go along with that - I have precisely the same controller, with
> a pair of eSATA drives, running ZFS mirrored. But I get a nice 100
> meg/second out of them if I try. My controller is, however on PCI-X, not
> PCI. It's a shame PCI-X appears to have gone the way of the dinosaur :-(
>
> -pete.
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: ZFS panic on RELENG_7/i386

2010-01-26 Thread Artem Belevich
> will do, thank you. is fletcher4 faster?
Not necessarily. But it works much better as a checksum. See the
following link for the details.

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6740597

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: ATA_CAM + ZFS gives short 1-2 seconds system freeze on disk load

2010-02-08 Thread Artem Belevich
> I'd like a technical explanation of exactly what this loader.conf
> tunable does.  The sysctl -d explanation is useful if one has insight to
> what purpose it serves.  I can find mention on Solaris lists about "txg
> timeout", but the context is over my head (intended for those very
> familiar with the inner workings of ZFS).

Ben Rockwood's blog has a pretty decent explanation of how transaction
groups in ZFS work:
http://www.cuddletech.com/blog/pivot/entry.php?id=1015

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: ZFS ARC being limited below what is defined in /boot/loader.conf

2010-02-12 Thread Artem Belevich
Check your vm.kmem_size. The default setting is way too low. Set it to at
least double the desired ARC size.

--Artem



On Fri, Feb 12, 2010 at 10:31 AM, Steve Polyack  wrote:
> Has anyone had an issue with the ZFS ARC max being limited below what has
> been defined in /boot/loader.conf?  I just upgraded the RAM in a
> ZFS-equipped system and attempted to devote 4GB to the ARC cache by placing
> the following in loader.conf:
>  vfs.zfs.arc_max="4096M"
>
> However, after rebooting, querying the sysctl gives me this:
> $ sysctl vfs.zfs.arc_max
> vfs.zfs.arc_max: 1726489600
>
> or about 1.7GB, an odd number that I can't find any references to.  For
> reference, I'm running 8-STABLE (as of Jan 19th) on an amd64 system with 8GB
> of RAM.  The system was previously very stable with 4GB of RAM and a 512MB
> arc_max.  I have not modified vm.kmem_size_max (defaults to ~330GB on amd64)
> or any other ZFS tunables.  I'd also like to avoid syncing up to the current
> 8-STABLE if at all possible.
>
> Thanks,
> Steve Polyack
>
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: ZFS ARC being limited below what is defined in /boot/loader.conf

2010-02-12 Thread Artem Belevich
vm.kmem_size_max/vm.kmem_size_min define the range that vm.kmem_size can be set to.
vm.kmem_size specifies the actual kmem size.

ARC size is in turn limited by vm.kmem_size.

If you want to bump ARC size, you do need to bump vm.kmem_size.
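
As a sketch for the case in this thread (8GB of RAM, 4GB of ARC wanted),
the two tunables could go into /boot/loader.conf together; the values are
only an illustration of the "kmem at least double the ARC" rule of thumb:

vm.kmem_size="8G"
vfs.zfs.arc_max="4096M"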

--Artem



On Fri, Feb 12, 2010 at 11:36 AM, Steve Polyack  wrote:
> On 02/12/10 13:47, Artem Belevich wrote:
>>
>> On Fri, Feb 12, 2010 at 10:31 AM, Steve Polyack
>>  wrote:
>>
>>
>>>
>>> Has anyone had an issue with the ZFS ARC max being limited below what has
>>> been defined in /boot/loader.conf?  I just upgraded the RAM in a
>>> ZFS-equipped system and attempted to devote 4GB to the ARC cache by
>>> placing
>>> the following in loader.conf:
>>>  vfs.zfs.arc_max="4096M"
>>>
>>> However, after rebooting, querying the sysctl gives me this:
>>> $ sysctl vfs.zfs.arc_max
>>> vfs.zfs.arc_max: 1726489600
>>>
>>> or about 1.7GB, an odd number that I can't find any references to.  For
>>> reference, I'm running 8-STABLE (as of Jan 19th) on an amd64 system with
>>> 8GB
>>> of RAM.  The system was previously very stable with 4GB of RAM and a
>>> 512MB
>>> arc_max.  I have not modified vm.kmem_size_max (defaults to ~330GB on
>>> amd64)
>>> or any other ZFS tunables.  I'd also like to avoid syncing up to the
>>> current
>>> 8-STABLE if at all possible.
>>>
>>> Thanks,
>>> Steve Polyack
>>>
>>> ___
>>> freebsd-stable@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>>> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>>>
>>>
>>
>> Check your vm.kmem_size. Default setting is way too low. Set it to at
>> least double of desired arc size.
>>
>> --Artem
>
> I mentioned it briefly, but vm.kmem_size_max was left at the default for
> amd64.  At 330GB it is way above and beyond what will ever be allocated to
> ARC:
> $ sysctl vm.kmem_size_max
> vm.kmem_size_max: 329853485875
>
>
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: More zfs benchmarks

2010-02-14 Thread Artem Belevich
Can you check whether the kstat.zfs.misc.arcstats.memory_throttle_count sysctl
increments during your tests?

ZFS self-throttles writes if it thinks system is running low on
memory. Unfortunately on FreeBSD the 'free' list is a *very*
conservative indication of available memory so ZFS often starts
throttling before it's really needed. With only 2GB in the system,
that's probably what slows you down.

The code is in arc_memory_throttle() in
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c, if anyone's
curious.
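
A simple way to check, assuming your dd test runs in another terminal -- a
steadily growing counter means writes are being throttled:

while :; do
    sysctl kstat.zfs.misc.arcstats.memory_throttle_count
    sleep 10
done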

--Artem



On Sun, Feb 14, 2010 at 9:28 AM, Jonathan Belson  wrote:
> Hiya
>
> After reading some earlier threads about zfs performance, I decided to test 
> my own server.  I found the results rather surprising...
>
> The machine is a Dell SC440, dual core 2GHz E2180, 2GB of RAM and ICH7 
> SATA300 controller.  There are three Hitachi 500GB drives (HDP725050GLA360) 
> in a raidz1 configuration (version 13).  I'm running amd64 7.2-STABLE from 
> 14th Jan.
>
>
> First of all, I tried creating a 200MB file on / (the only non-zfs partition):
>
> # dd if=/dev/zero of=/root/zerofile.000 bs=1M count=200
> 200+0 records in
> 200+0 records out
> 209715200 bytes transferred in 6.158355 secs (34053769 bytes/sec)
>
> # dd if=/dev/zero of=/root/zerofile.000 bs=1M count=200
> 200+0 records in
> 200+0 records out
> 209715200 bytes transferred in 5.423107 secs (38670674 bytes/sec)
>
> # dd if=/dev/zero of=/root/zerofile.000 bs=1M count=200
> 200+0 records in
> 200+0 records out
> 209715200 bytes transferred in 6.113258 secs (34304982 bytes/sec)
>
>
> Next, I tried creating a 200MB file on a zfs partition:
>
> # dd if=/dev/zero of=/tank/test/zerofile.000 bs=1M count=200
> 200+0 records in
> 200+0 records out
> 209715200 bytes transferred in 58.540571 secs (3582391 bytes/sec)
>
> # dd if=/dev/zero of=/tank/test/zerofile.000 bs=1M count=200
> 200+0 records in
> 200+0 records out
> 209715200 bytes transferred in 46.867240 secs (4474665 bytes/sec)
>
> # dd if=/dev/zero of=/tank/test/zerofile.000 bs=1M count=200
> 200+0 records in
> 200+0 records out
> 209715200 bytes transferred in 21.145221 secs (9917853 bytes/sec)
>
> # dd if=/dev/zero of=/tank/test/zerofile.000 bs=1M count=200
> 200+0 records in
> 200+0 records out
> 209715200 bytes transferred in 19.387938 secs (10816787 bytes/sec)
>
> # dd if=/dev/zero of=/tank/test/zerofile.000 bs=1M count=200
> 200+0 records in
> 200+0 records out
> 209715200 bytes transferred in 21.378161 secs (9809787 bytes/sec)
>
> # dd if=/dev/zero of=/tank/test/zerofile.000 bs=1M count=200
> 200+0 records in
> 200+0 records out
> 209715200 bytes transferred in 23.774958 secs (8820844 bytes/sec)
>
> Ouch!  Ignoring the first result, that's still over three times slower than 
> the non-zfs test.
>
>
> With a 2GB test file:
>
> # dd if=/dev/zero of=/tank/test/zerofile.000 bs=1M count=2000
> 2000+0 records in
> 2000+0 records out
> 2097152000 bytes transferred in 547.901945 secs (3827605 bytes/sec)
>
> # dd if=/dev/zero of=/tank/test/zerofile.000 bs=1M count=2000
> 2000+0 records in
> 2000+0 records out
> 2097152000 bytes transferred in 595.052017 secs (3524317 bytes/sec)
>
> # dd if=/dev/zero of=/tank/test/zerofile.000 bs=1M count=2000
> 2000+0 records in
> 2000+0 records out
> 2097152000 bytes transferred in 517.326470 secs (4053827 bytes/sec)
>
> Even worse :-(
>
>
> Reading 2GB from a raw device:
>
> dd if=/dev/ad4s1a of=/dev/null bs=1M count=2000
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes transferred in 13.914145 secs (77169084 bytes/sec)
>
>
> Reading 2GB from a zfs partition (unmounting each time):
>
> dd if=/tank/test/zerofile.000 of=/dev/null bs=1M count=2000
> 2000+0 records in
> 2000+0 records out
> 2097152000 bytes transferred in 29.905155 secs (70126772 bytes/sec)
>
> dd if=/tank/test/zerofile.000 of=/dev/null bs=1M count=2000
> 2000+0 records in
> 2000+0 records out
> 2097152000 bytes transferred in 32.557361 secs (64414066 bytes/sec)
>
> dd if=/tank/test/zerofile.000 of=/dev/null bs=1M count=2000
> 2000+0 records in
> 2000+0 records out
> 2097152000 bytes transferred in 34.137874 secs (61431828 bytes/sec)
>
> For reading, there seems to be much less of a disparity in performance.
>
> I notice that one drive is on atapci0 and the other two are on atapci1, but 
> surely it wouldn't make this much of a difference to write speeds?
>
> Cheers,
>
> --Jon
>
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: hardware for home use large storage

2010-02-14 Thread Artem Belevich
> your ZFS pool of SATA disks has 120gb worth of L2ARC space

Keep in mind that housekeeping for a 120G L2ARC may require a
fair amount of RAM, especially if you're dealing with tons of small
files.

See this thread:
http://www.mail-archive.com/zfs-disc...@opensolaris.org/msg34674.html

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: hardware for home use large storage

2010-02-15 Thread Artem Belevich
>> * vm.kmem_size
>> * vm.kmem_size_max
>
> I tried kmem_size_max on -current (this year), and I got a panic during use,
> I changed kmem_size to the same value I have for _max and it didn't panic
> anymore. It looks (from mails on the lists) that _max is supposed to give a
> max value for auto-enhancement, but at least it was not working with ZFS
> last month (and I doubt it works now).

It used to be that vm.kmem_size_max needed to be bumped to allow for
larger vm.kmem_size. It's no longer needed on amd64. Not sure about
i386.

vm.kmem_size still needs tuning, though. While vm.kmem_size_max is no
longer a limit, there are other checks in place that result in default
vm.kmem_size being a bit on the conservative side for ZFS.

>> Then, when it comes to debugging problems as a result of tuning
>> improperly (or entire lack of), the following counters (not tunables)
>> are thrown into the mix as "things people should look at":
>>
>>  kstat.zfs.misc.arcstats.c
>>  kstat.zfs.misc.arcstats.c_min
>>  kstat.zfs.misc.arcstats.c_max
>
> c_max is vfs.zfs.arc_max, c_min is vfs.zfs.arc_min.
>
>>  kstat.zfs.misc.arcstats.evict_skip
>>  kstat.zfs.misc.arcstats.memory_throttle_count
>>  kstat.zfs.misc.arcstats.size
>
> I'm not very sure about size and c... both represent some kind of current
> size, but they are not the same.

arcstats.c -- adaptive ARC target size. I.e. that's what ZFS thinks it
can grow ARC to. It's dynamically adjusted based on when/how ZFS is
back-pressured for memory.
arcstats.size -- current ARC size
arcstats.p -- portion of arcstats.c that's used by "Most Recently
Used" items. What's left of arcstats.c is used by "Most Frequently
Used" items.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: hardware for home use large storage

2010-02-15 Thread Artem Belevich
> How much ram are you running with?

8GB on amd64. kmem_size=16G, zfs.arc_max=6G

> In a latest test with 8.0-R on i386 with 2GB of ram, an install to a ZFS
> root *will* panic the kernel with kmem_size too small with default
> settings. Even dropping down to Cy Schubert's uber-small config will panic
> the kernel (vm.kmem_size_max = 330M, vfs.zfs.arc_size = 40M,
> vfs.zfs.vdev.cache_size = 5M); the system is currently stable using DIST
> kernel, vm.kmem_size/max = 512M, arc_size = 40M and vdev.cache_size = 5M.

On i386 you don't really have much wiggle room. Your address space is
32-bit and, to make things more interesting, it's split between
user-land and kernel. You can keep bumping KVA_PAGES only so far and
that's what limits your vm.kmem_size_max which is the upper limit for
vm.kmem_size.
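
For reference, that knob lives in the i386 kernel config and needs a kernel
rebuild; each unit is 4MB of kernel address space, so something like the
line below gives you 2GB of KVA (at the expense of user address space):

options KVA_PAGES=512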

The bottom line -- if you're planning to use ZFS, do switch to amd64.
Even with only 2GB of physical RAM available, your box will behave
better. At the very least it will be possible to avoid the panics
caused by kmem exhaustion.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: puc(4) timedia baudrate problem

2010-04-27 Thread Artem Belevich
I've got another PCI UART card based on OX16PCI952 that needs its
clock multiplied by 8 in order to work correctly. It was some
el-cheapo card I've got at Fry's.

p...@pci0:1:0:0:class=0x070006 card=0x00011415 chip=0x95211415
rev=0x00 hdr=0x00
vendor = 'Oxford Semiconductor Ltd'
device = 'OX16PCI952 Integrated Dual UART'
class  = simple comms
subclass   = UART
bar   [10] = type I/O Port, range 32, base 0xd480, size  8, enabled
bar   [14] = type I/O Port, range 32, base 0xd400, size  8, enabled
bar   [18] = type I/O Port, range 32, base 0xd080, size 32, enabled
bar   [1c] = type Memory, range 32, base 0xf9ffd000, size 4096, enabled
bar   [20] = type Memory, range 32, base 0xf9ffc000, size 4096, enabled

Perhaps we can add some sort of tunable to override UART clock, if necessary?

--Artem



On Tue, Apr 27, 2010 at 9:46 PM, Marcel Moolenaar  wrote:
>
> On Apr 27, 2010, at 12:47 PM, Paul Schenkeveld wrote:
>
>>    puc0:  port 
>> 0xe500-0xe51f,0xe520-0xe52f,0xe530-0xe537,0xe538-0xe53f,0xe540-0xe547,0xe548-0xe54f
>>  irq 10 at device 14.0 on pci0
> *snip*
>> The first two ports work correctly but the baudrate of the other six
>> is incorrect, i.e. I have to use 'tip -76800 uart5' to get the port
>> to communicate at 9600 baud.  I 'know' that this particular hardware
>> has a baudrate multiplier on the first two ports but not on the other
>> six.
> *snip*
>
> Can you show me the output of ``pciconf -lbv'' for this device so that
> I can create a patch for you to test?
>
> Also: do you happen to know if all 8-port Timedia cards have a non-
> uniform RCLK or only a select set (maybe only yours)?
>
> Thanks,
>
> --
> Marcel Moolenaar
> xcl...@mac.com
>
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Freebsd 8.0 kmem map too small

2010-05-10 Thread Artem Belevich
> On this specific system, it has 32 GB physical memory and has
> vfs.zfs.arc_max="2G" and vm.kmem_size="64G" in /boot/loader.conf.  The
> latter was added per earlier suggestions on this list, but appears to be
> ignored as "sysctl vm.kmem_size" returns about 2 GB (2172452864) anyway.

Set vm.kmem_size to slightly below 2x the amount of physical memory
your kernel *sees* (sysctl hw.physmem). Chances are that the real amount
of physical memory available to the kernel is slightly below 32G, so your
tunable is ignored. My guess would be that vm.kmem_size=63G would work
much better.
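
Concretely -- the 63G figure is just the "slightly below 2x of 32GB" rule
of thumb applied to your box:

# sysctl hw.physmem

and then in /boot/loader.conf:

vm.kmem_size="63G"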

--Artem



On Mon, May 10, 2010 at 8:55 AM, Mike Andrews  wrote:
> On 5/5/10 11:19 AM, Freddie Cash wrote:
>>
>> On Tue, May 4, 2010 at 11:32 PM, Giulio Ferro
>>  wrote:
>>
>>> Giulio Ferro wrote:
>>>
 Thanks, I'll try these settings.

 I'll keep you posted.

>>>
>>> Nope, it's happened again... Now I've tried to rise vm.kmem_size to 6G...
>>> I'm really astounded at how unstable zfs is, it's causing me a lot of
>>> problem.
>>> Why isn't it stated in the handbook that zfs isn't up to production yet?
>>>
>>
>> As with everything related to computers, it all depends on your uses.
>
> Sorry to semi-hijack this, but...  I'm also running into frequent "kmem_map
> too small" panics on 8-STABLE, such as:
>
> panic: kmem_malloc(131072): kmem_map too small: 2023780352 total allocated
> panic: kmem_malloc(131072): kmem_map too small: 2011525120 total allocated
> panic: kmem_malloc(114688): kmem_map too small: 1849356288 total allocated
> panic: kmem_malloc(114688): kmem_map too small: 1849356288 total allocated
> panic: kmem_malloc(114688): kmem_map too small: 1849356288 total allocated
> panic: kmem_malloc(131072): kmem_map too small: 2020409344 total allocated
> panic: kmem_malloc(536576): kmem_map too small: 2022957056 total allocated
>
> (those are over the course of 3-4 days)
>
> On this specific system, it has 32 GB physical memory and has
> vfs.zfs.arc_max="2G" and vm.kmem_size="64G" in /boot/loader.conf.  The
> latter was added per earlier suggestions on this list, but appears to be
> ignored as "sysctl vm.kmem_size" returns about 2 GB (2172452864) anyway.
>
> Unfortunately I have not yet found a way to reliably reproduce the panic on
> demand, so it's hard to iteratively narrow down which date the potentially
> offending commit was on.  It happened at least twice overnight, where
> several memory-intensive jobs run.  Backing out to a previous 8-STABLE
> kernel from March 28 appears (so far) to have stabilized things, so the
> offending code was likely somewhere between then and May 5 or so.  I do have
> KDB/DDB and serial console on this box, if there's any more info I can give
> to help troubleshoot other than just a one-month vague date range :)
>
> This is happening to more than just one system, but I figure it's easier to
> troubleshoot memory settings on the 32 GB i7 server instead of the 2 GB Atom
> one...
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Freebsd 8.0 kmem map too small

2010-05-10 Thread Artem Belevich
The vm.kmem_size limitation has been this way for a pretty long time.

What's changed recently is that the ZFS ARC now uses UMA for its memory
allocations. If I understand it correctly, this should make the ARC's
memory use more efficient, as allocated chunks will end up in a zone
tuned for allocations of a particular size.

Increased fragmentation could be a side effect of this change, but
I'm guessing here.
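
If anyone wants to see those per-size zones on a running system, UMA
statistics are visible via vmstat; on my box the ZFS buffers show up as
zio_buf_/zio_data_buf_ zones (the pattern below is just a guess at a
useful filter):

# vmstat -z | egrep 'ITEM|zio'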

--Artem



On Mon, May 10, 2010 at 1:45 PM, Mike Andrews  wrote:
> On Mon, 10 May 2010, Steve Polyack wrote:
>
>> On 05/10/10 11:55, Mike Andrews wrote:
>>>
>>> On 5/5/10 11:19 AM, Freddie Cash wrote:

 On Tue, May 4, 2010 at 11:32 PM, Giulio Ferro
 wrote:

> Giulio Ferro wrote:
>
>> Thanks, I'll try these settings.
>>
>> I'll keep you posted.
>>
>
> Nope, it's happened again... Now I've tried to rise vm.kmem_size to
> 6G...
> I'm really astounded at how unstable zfs is, it's causing me a lot of
> problem.
> Why isn't it stated in the handbook that zfs isn't up to production
> yet?
>

 As with everything related to computers, it all depends on your uses.
>>>
>>> Sorry to semi-hijack this, but...  I'm also running into frequent
>>> "kmem_map too small" panics on 8-STABLE, such as:
>>>
>>> panic: kmem_malloc(131072): kmem_map too small: 2023780352 total
>>> allocated
>>> panic: kmem_malloc(131072): kmem_map too small: 2011525120 total
>>> allocated
>>> panic: kmem_malloc(114688): kmem_map too small: 1849356288 total
>>> allocated
>>> panic: kmem_malloc(114688): kmem_map too small: 1849356288 total
>>> allocated
>>> panic: kmem_malloc(114688): kmem_map too small: 1849356288 total
>>> allocated
>>> panic: kmem_malloc(131072): kmem_map too small: 2020409344 total
>>> allocated
>>> panic: kmem_malloc(536576): kmem_map too small: 2022957056 total
>>> allocated
>>>
>>> (those are over the course of 3-4 days)
>>>
>>> On this specific system, it has 32 GB physical memory and has
>>> vfs.zfs.arc_max="2G" and vm.kmem_size="64G" in /boot/loader.conf.  The
>>> latter was added per earlier suggestions on this list, but appears to be
>>> ignored as "sysctl vm.kmem_size" returns about 2 GB (2172452864) anyway.
>>>
>>>
>> As Artem stated in another reply, you will need to set vm.kmem_size
>> slightly under 2x the physical memory.  The kernel will default to 2GB if
>> you pass this limit.  1.5x physical memory size should be sufficient, so try
>> "48G" and verify that it gets set correctly on the next boot.
>
>
> OK, I've got vm.kmem_size set a bit lower and it now accepts it.  It's still
> not clear why this just recently (April?) became necessary to do at all :)
>
> Meanwhile, I'll see if things get more stable now...
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Freebsd 8.0 kmem map too small

2010-05-10 Thread Artem Belevich
You can try disabling ZIO_USE_UMA in sys/modules/zfs/Makefile

Comment out the following line in that file:
CFLAGS+=-DZIO_USE_UMA

This should revert the memory allocation method to its previous mode.
Let us know whether it helps or not.
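
After the edit the module has to be rebuilt and the box rebooted; a sketch,
assuming sources live in /usr/src and your kernel config is GENERIC:

# cd /usr/src && make buildkernel KERNCONF=GENERIC && make installkernel KERNCONF=GENERIC

or, if you only use zfs.ko as a module:

# cd /usr/src/sys/modules/zfs && make && make install

then reboot.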

--Artem



On Mon, May 10, 2010 at 4:14 PM, Richard Perini  wrote:
> On Wed, May 05, 2010 at 03:33:02PM +0200, Pawel Jakub Dawidek wrote:
>> On Wed, May 05, 2010 at 10:46:31AM +0200, Giulio Ferro wrote:
>> > On 05.05.2010 09:52, Jeremy Chadwick wrote:
>> >
>> > Nope, it's happened again... Now I've tried to rise vm.kmem_size to 6G...
>> >
>
> [ ... ]
>
>> Could you try to track down the commit that is causing your problems?
>> Could you try 8-STABLE kernel from before r206815?
>
> A quick note to say "same here", but on i386.
>
> FreeBSD 8.0-STABLE as of 8/5/2010 paniced last night with same symptoms,
> approx 48 hours uptime.
>
> Previous kernel was FreeBSD 8.0-STABLE from Sun Mar  7 14:31:45 EST 2010,
> perfectly stable for intervening 2 months, about 2 months uptime.
>
> Please let me know if full details would help (as opposed to just adding 
> noise :-)
>
> --
> Richard Perini                                       Internet:  ...@ci.com.au
> Corinthian Engineering Pty Ltd                       PHONE:   +61 2 9552 5500
> Sydney, Australia                                    FAX:     +61 2 9552 5549
> ___
> freebsd...@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscr...@freebsd.org"
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: ZFS corruption due to lack of space?

2012-10-31 Thread Artem Belevich
On Wed, Oct 31, 2012 at 10:55 AM, Steven Hartland
 wrote:
> At that point with the test seemingly successful I went
> to delete test files which resulted in:-
> rm random*
> rm: random1: Unknown error: 122

ZFS is a copy-on-write filesystem. Even removing a file requires
some free space to write a new record saying that the file is not
referenced any more.

One way out of this jam is to try truncating some large file in place.
Make sure that file is not part of any snapshot.
Something like this may do the trick:
#dd if=/dev/null of=existing_large_file

Or, perhaps even something as simple as 'echo -n > large_file' may work.
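
To double-check the snapshot angle first -- if anything listed here still
references the files you are deleting, their space won't actually be freed:

# zfs list -t snapshot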

Good luck,
--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: how to destroy zfs parent filesystem without destroying children - corrupted file causing kernel panick

2012-12-28 Thread Artem Belevich
On Fri, Dec 28, 2012 at 12:46 PM, Greg Bonett  wrote:

> However, I can't figure out how to destroy the /tank filesystem without
> destroying /tank/tempfs (and the other /tank children).  Is it possible to
> destroy a parent without destroying the children? Or, create a new parent
> zfs file system on the same zpool and move the /tank children there before
> destroying /tank?
>

It is possible only when the parent is not the top-most zfs filesystem
(i.e. not the pool's root filesystem).

I.e. if your zfs filesystem layout looked like zfs-pool/tank/tempfs, then
you could simply do "zfs rename zfs-pool/tank/tempfs zfs-pool/tempfs" and
would then be free to destroy zfs-pool/tank. Alas, this rename trick breaks
down here: there is nothing above a pool's root filesystem to rename the
children into, and I don't think ZFS would allow you to promote an inner
filesystem to become the pool's root, which is what you seem to want.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: how to destroy zfs parent filesystem without destroying children - corrupted file causing kernel panick

2012-12-29 Thread Artem Belevich
On Sat, Dec 29, 2012 at 12:35 AM, Greg Bonett  wrote:
>
> >
> > Does:
> >
> > cat /dev/null > bad.file
> >
> > Cause a kernel panic?
> >
> >
> >
> ah, sadly that does cause a kernel panic. I hadn't tried it though, thanks
> for the suggestion.

It's probably a long shot, but you may try removing the bad file using an
illumos (ex-OpenSolaris) system (or live CD). In the past it used to be
a little bit more robust than FreeBSD when it came to dealing with
filesystem corruption. In one case illumos printed out a message about a
corrupt pool and remained up and running, while FreeBSD would just crash
when I tried mounting the pool.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: FreeBSD history

2013-06-24 Thread Artem Belevich
On Sun, Jun 16, 2013 at 10:00 AM, Andy Farkas  wrote:

> On 16/06/13 20:30, Jeremy Chadwick wrote:
> > * Output from: strings /boot/kernel/kernel | egrep ^option Thanks.
>
> I stumbled across this one about a week ago:
>
>  strings /boot/kernel/kernel | head -1
>
> and was wondering about the history of where it came from / what it means.
>
> I can see it was added to Makefile.i386 in September 1998 but the commit
> comment mentions the defunct alpha port and searching SVN for things in the
> Attic is a PITA.
>

The key in the log message is that the kernel became a dynamic executable.
In order to launch a typical dynamic executable, the kernel would actually
launch the dynamic linker specified in the INTERP program header of the ELF
file. By default it's /libexec/ld-elf.so.1. The dynamic linker in turn would
load the app and the shared libraries it requires.

The kernel is, obviously, not a typical executable. My guess is that the
idea behind changing the dynamic linker to /red/herring was to make it
obvious that the file is not a typical app and that, despite being an ELF
executable, it should not be executed as a regular program.

It's just a guess, though.
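
One way to check the INTERP part of that guess, assuming readelf (from
binutils or elftoolchain) is available:

  $ readelf -l /boot/kernel/kernel | grep -i interp

The PT_INTERP program header should list /red/herring as the requested
program interpreter, which would also explain the string that
"strings | head -1" turns up.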

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: stopping amd causes a freeze

2013-07-22 Thread Artem Belevich
On Mon, Jul 22, 2013 at 2:50 AM, Dominic Fandrey wrote:

> Occasionally stopping amd freezes my system. It's a rare occurrence,
> and I haven't found a reliable way to reproduce it.
>
> It's also a real freeze, so there's no way to get into the debugger
> or grab a core dump. I only can perform the 4 seconds hard shutdown to
> revive the system.
>
> I run amd through sysutils/automounter, which is a scripting solution
> that generates an amd.map file based on encountered devices and devd
> events. The SIGHUP it sends to amd to tell it the map file was updated
> does not cause problems, only a SIGKILL may cause the freeze.
>
> Nothing was mounted (by amd) during the last freeze.
>
>
amd itself is a primitive NFS server as far as the system is concerned, and
amd mount points are mounted from it. If you just KILL it without giving it
a chance to clean things up, you'll potentially end up in a situation
similar to mounting from a remote NFS server that's unresponsive. From
mount_nfs(8):

 If the server becomes unresponsive while an NFS file system is mounted,
 any new or outstanding file operations on that file system will hang
 uninterruptibly until the server comes back.  To modify this default
 behaviour, see the intr and soft options.



> I don't see any angle to tackle this, but I'm throwing it out here
> any way, in the hopes that someone actually has an idea how to approach
> the issue.
>

Don't use KILL, or make sure that nobody touches the amd mount points until
the new instance starts. Manually unmounting them before killing amd may
help. Why not let amd do it itself with "/etc/rc.d/amd stop"?
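
A minimal sketch of a safer restart sequence (assuming the stock rc script
and that nothing is actively using the automounted paths at that moment):

  # /etc/rc.d/amd stop     (amd unmounts its own mount points on SIGTERM)
  # mount -t nfs           (verify no amd-backed mounts were left behind)
  # /etc/rc.d/amd start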

--Artem


>
> # uname -a
> FreeBSD mobileKamikaze.norad 9.2-PRERELEASE FreeBSD 9.2-PRERELEASE #0
> r253413: Wed Jul 17 13:12:46 CEST 2013 
> root@mobileKamikaze.norad:/usr/obj/HP6510b-91/amd64/usr/src/sys/HP6510b-91
>  amd64
>
> That's amd's starting message:
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  no logfile defined; using
> stderr
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  AM-UTILS VERSION
> INFORMATION:
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  Copyright (c) 1997-2006
> Erez Zadok
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  Copyright (c) 1990
> Jan-Simon Pendry
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  Copyright (c) 1990
> Imperial College of Science, Technology & Medicine
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  Copyright (c) 1990 The
> Regents of the University of California.
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  am-utils version 6.1.5
> (build 901505).
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  Report bugs to
> https://bugzilla.am-utils.org/ or am-ut...@am-utils.org.
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  Configured by David
> O'Brien  on date 4-December-2007 PST.
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  Built by
> root@mobileKamikaze.norad.
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  cpu=amd64 (little-endian),
> arch=amd64, karch=amd64.
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  full_os=freebsd9.2,
> os=freebsd9, osver=9.2, vendor=undermydesk, distro=The FreeBSD Project.
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  domain=norad,
> host=mobileKamikaze, hostd=mobileKamikaze.norad.
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  Map support for: root,
> passwd, union, nis, ndbm, file, exec, error.
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  AMFS: nfs, link, nfsx,
> nfsl, host, linkx, program, union, ufs, cdfs,
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:pcfs, auto, direct,
> toplvl, error, inherit.
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  FS: cd9660, nfs, nfs3,
> nullfs, msdosfs, ufs, unionfs.
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  Network 1:
> wire="192.168.1.0" (netnumber=192.168.1).
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  Network 2:
> wire="192.168.0.0" (netnumber=192.168).
> Jul 22 11:32:28 mobileKamikaze amd[8176]/info:  My ip addr is 127.0.0.1
>
> amd is called with the flags -r -p -a -c 4 -w 2
>
> --
> A: Because it fouls the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing on usenet and in e-mail?
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: stopping amd causes a freeze

2013-07-23 Thread Artem Belevich
On Tue, Jul 23, 2013 at 10:43 AM, Dominic Fandrey wrote:

> > Don't use KILL or make sure that nobody tries to use amd mountpoints
> until
> > new instance starts. Manually unmounting them before killing amd may
> help.
> > Why not let amd do it itself with "/etc/rc.d/amd stop" ?
>
> That was a typo, I'm using SIGTERM. Sorry about that.
>
>
On SIGTERM amd will attempt to unmount its mount points. If someone is
using them, the unmount may not succeed. I have no idea what amd does in
that case.

The point is that you should treat an amd restart as a reboot of an NFS
server. Reloading the amd maps does not really require restarting amd. In
some cases you may have to manually unmount an automounted filesystem if
the underlying map has changed, but that's the only case I can think of off
the top of my head. In most cases "amq -f" worked well enough for me.

By the way, are you absolutely sure that the script that restarts amd is
guaranteed not to touch anything mounted with amd? Otherwise you're risking
a deadlock. For example, if PATH contains an amd-mounted directory, then
when it's time to execute the next command the script may touch that path
and hang waiting for a response from amd that will never come, because the
hung script is the one that was supposed to start it.

Now, back to debugging your problem. One way to check what's going on would
be to figure out where the processes get stuck.
Start with "ps -axl" and look at the STAT field. Chances are that the stuck
processes will be in uninterruptible sleep state 'D'. Check the MWCHAN
field for those. Hitting '^T', which normally sends SIGINFO, should also
produce a message that includes the process' wait channel and is convenient
when you have the console where you started the app that is hung.

Dig further into the sleeping process with "procstat -kk PID" -- it will
give you an in-kernel stack trace of the process' threads, which should
show what's going on. You may want to do it from a root login with a local
home directory and a minimalistic PATH so it does not touch any amd mount
points.
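
For example, a minimal sequence might look like this (the PID is made up;
take it from the ps output):

  # ps -axl | grep ' D'    (eyeball the STAT and MWCHAN columns of stuck processes)
  # procstat -kk 1234      (in-kernel stack trace of the stuck process)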

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: stopping amd causes a freeze

2013-07-26 Thread Artem Belevich
On Fri, Jul 26, 2013 at 10:10 AM, Dominic Fandrey wrote:

> Amd exhibits several very strange behaviours.
>
> a)
> During the first start it writes the wrong PID into the pidfile,
> it however still reacts to SIGTERM.
>
> b)
> After starting it again, it no longer reacts to SIGTERM.
>

amd does block signals in some of its sub-processes. For instance, the amd
process that works as the NFS server and handles the amd mount points
blocks INT/TERM/CHLD/HUP. See /usr/src/contrib/amd/amd/nfs_start.c
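
If you want to see that on a live system and your procstat supports the -i
and -j flags, they will show the signal state (8176 is just an example,
taken from the amd startup log earlier in this thread):

  # procstat -i 8176    (per-process signal dispositions)
  # procstat -j 8176    (per-thread signal masks, i.e. what is blocked)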


>
> c)
> It appear to be no longer reacting to SIGHUP, which is required to
> tell it that the amd.map was updated.
>
>
Try using 'amq -f', which asks amd to reload its maps via RPC and should
work regardless of whether you know the right PID.

Strangely enough, the amd man page does not mention SIGHUP at all.
amd/doc/am-utils.texi in the source tree does, but only when it talks about
hlfsd or about 'type:=auto' maps with the 'cache' option.
The documentation on am-utils.org matches am-utils.texi.

As far as I can tell, 'amq -f' is the official way to tell amd that it
should reload its maps.

--Artem



> d)
> It doesn't work at all, I only get:
> # cd /media/ufs/FreeBSD_Install
> /media/ufs/FreeBSD_Install: Too many levels of symbolic links.
>
> e)
> A SIGKILL without load will terminate the process. A SIGKILL while
> there is heavy file system load panics the system.
>
>
I'll try a clean buildworld buildkernel and repeat.
>
> --
> A: Because it fouls the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing on usenet and in e-mail?
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: current zfs tuning in RELENG_7 (AMD64) suggestions ?

2009-05-02 Thread Artem Belevich
>> This information is outdated.  The current max in RELENG_7 for amd64 is
>> ~3.75GB.

Technically, RELENG_7 should allow kmem_size of up to 6G, but the
sysctl variables used for tuning are 32-bit and *that* limits
kmem_size to ~4G.
It's been fixed in -current and can easily be fixed in RELENG_7 (if
it's not fixed yet).

As far as I can tell, all necessary code to support large kmem_size is
already in RELENG_7. It's easy enough to allow even larger kmem_size.
See the attached diff that I'm using. With that diff you can set
vm.kmem_size to ~16G.
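
For reference, the knobs in question are loader tunables, so once the limit
is out of the way they go into /boot/loader.conf; the values below are
purely illustrative and need to be sized to the machine's RAM:

  vm.kmem_size="16G"
  vm.kmem_size_max="16G"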

--Artem


vm-large.diff
Description: Binary data
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: ZFS MFC heads down

2009-05-27 Thread Artem Belevich
I had the same problem on -current. Try the attached patch. It may not
apply cleanly on -stable, but it should be easy enough to make the
equivalent changes there.

--Artem



On Wed, May 27, 2009 at 3:00 AM, Henri Hennebert  wrote:
> Kip Macy wrote:
>>
>> On Wed, May 20, 2009 at 2:59 PM, Kip Macy  wrote:
>>>
>>> I will be MFC'ing the newer ZFS support some time this afternoon. Both
>>> world and kernel will need to be re-built. Existing pools will
>>> continue to work without upgrade.
>>>
>>>
>>> If you choose to upgrade a pool to take advantage of new features you
>>> will no longer be able to use it with sources prior to today. 'zfs
>>> send/recv' is not expected to inter-operate between different pool
>>> versions.
>>
>>
>> The MFC went in r192498. Please let me know if you have any problems.
>
> No a real problem but maybe worth mentioning:
>
> on FreeBSD morzine.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Tue May 26
> 15:37:48 CEST 2009 r...@morzine.restart.bel:/usr/obj/usr/src/sys/MORZINE
>  i386
>
> [r...@morzine ~]# zdb rpool
>    version=13
>    name='rpool'
>    state=0
>    txg=959
>    pool_guid=17669857244588609348
>    hostid=2315842372
>    hostname='unset'
>    vdev_tree
>        type='root'
>        id=0
>        guid=17669857244588609348
>        children[0]
>                type='mirror'
>                id=0
>                guid=3225603179255348056
>                metaslab_array=23
>                metaslab_shift=28
>                ashift=9
>                asize=51534888960
>                is_log=0
>                children[0]
>                        type='disk'
>                        id=0
>                        guid=17573085726489368265
>                        path='/dev/da0p2'
>                        whole_disk=0
>                children[1]
>                        type='disk'
>                        id=1
>                        guid=2736169600077218893
>                        path='/dev/da1p2'
>                        whole_disk=0
> Assertion failed: (?Ąuč? ėŪ¨´&), function mp->m_owner == NULL, file
> /usr/src/cddl/lib/libzpool/../../../cddl/contrib/opensolaris/lib/libzpool/common/kernel.c,
> line 112.
> Abort trap: 6
>
>
> and on FreeBSD avoriaz.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Mon May
> 25 12:06:07 CEST 2009 r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ
>  amd64
>
> [r...@avoriaz ~]# zdb rpool
>    version=13
>    name='rpool'
>    state=0
>    txg=3467
>    pool_guid=536117255064806899
>    hostid=1133576597
>    hostname='unset'
>    vdev_tree
>        type='root'
>        id=0
>        guid=536117255064806899
>        children[0]
>                type='mirror'
>                id=0
>                guid=3124217685892976292
>                metaslab_array=23
>                metaslab_shift=30
>                ashift=9
>                asize=155741847552
>                is_log=0
>                children[0]
>                        type='disk'
>                        id=0
>                        guid=11099413743436480159
>                        path='/dev/ad4p2'
>                        whole_disk=0
>                children[1]
>                        type='disk'
>                        id=1
>                        guid=12724983687805955432
>                        path='/dev/ad6p2'
>                        whole_disk=0
> Segmentation fault: 11
>
> By the way, to help prepare a boot/root pool does a utility to display the
> content of zpool.cache exist ?
>
>
> Henri
>>
>> Thanks,
>> Kip
>> ___
>> freebsd-stable@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
diff -r 99a13064f0d8 cddl/contrib/opensolaris/lib/libzpool/common/kernel.c
--- a/cddl/contrib/opensolaris/lib/libzpool/common/kernel.c	Tue May 26 13:17:51 2009 -0700
+++ b/cddl/contrib/opensolaris/lib/libzpool/common/kernel.c	Tue May 26 13:19:10 2009 -0700
@@ -104,17 +104,17 @@ zmutex_init(kmutex_t *mp)
 	mp->initialized = B_TRUE;
 	(void) _mutex_init(&mp->m_lock, USYNC_THREAD, NULL);
 }
 
 void
 zmutex_destroy(kmutex_t *mp)
 {
 	ASSERT(mp->initialized == B_TRUE);
-	ASSERT(mp->m_owner == NULL);
+//	ASSERT(mp->m_owner == NULL);
 	(void) _mutex_destroy(&(mp)->m_lock);
 	mp->m_owner = (void *)-1UL;
 	mp->initialized = B_FALSE;
 }
 
 void
 mutex_enter(kmutex_t *mp)
 {
@@ -163,16 +163,17 @@ mutex_owner(kmutex_t *mp)
  */
 /*ARGSUSED*/
 void
 rw_init(krwlock_t *rwlp, char *name, int type, void *arg)
 {
 	rwlock_init(&rwlp->rw_lock, USYNC_THREAD, NULL);
 	rwlp->rw_owner = NULL;
 	rwlp->initialized = B_TRUE;
+	rwlp->rw_count = 0;
 }
 
 void
 rw_destroy(krwlock_t *rwlp)
 {
 	rwlock_destroy(&rwlp->rw_lock);
 	rwlp-

Re: ZFS MFC heads down

2009-05-27 Thread Artem Belevich
Did you by any chance do that from single-user mode? ZFS seems to rely
on hostid being set.
Try running "/etc/rc.d/hostid start" and then re-try your zfs commands.

--Artem



On Wed, May 27, 2009 at 1:06 PM, Henri Hennebert  wrote:
> Artem Belevich wrote:
>>
>> I had the same problem on -current. Try attached patch. It may not
>> apply cleanly on -stable, but should be easy enough to make equivalent
>> changes on -stable.
>
> The patch is ok for stable.
>
> now I get for the pool with my root:
>
> [r...@morzine libzpool]# zdb rpool
>    version=13
>    name='rpool'
>    state=0
>    txg=959
>    pool_guid=17669857244588609348
>    hostid=2315842372
>    hostname='unset'
>    vdev_tree
>        type='root'
>        id=0
>        guid=17669857244588609348
>        children[0]
>                type='mirror'
>                id=0
>                guid=3225603179255348056
>                metaslab_array=23
>                metaslab_shift=28
>                ashift=9
>                asize=51534888960
>                is_log=0
>                children[0]
>                        type='disk'
>                        id=0
>                        guid=17573085726489368265
>                        path='/dev/da0p2'
>                        whole_disk=0
>                children[1]
>                        type='disk'
>                        id=1
>                        guid=2736169600077218893
>                        path='/dev/da1p2'
>                        whole_disk=0
> WARNING: pool 'rpool' could not be loaded as it was last accessed by another
> system (host: unset hostid: 0x8a08f344). See:
> http://www.sun.com/msg/ZFS-8000-EY
> zdb: can't open rpool: No such file or directory
>
> But rpool have been used for many boot now - strange ...
>
> Thanks for your patch and time
>
> Henri
>
>
>>
>> --Artem
>>
>>
>>
>> On Wed, May 27, 2009 at 3:00 AM, Henri Hennebert  wrote:
>>>
>>> Kip Macy wrote:
>>>>
>>>> On Wed, May 20, 2009 at 2:59 PM, Kip Macy  wrote:
>>>>>
>>>>> I will be MFC'ing the newer ZFS support some time this afternoon. Both
>>>>> world and kernel will need to be re-built. Existing pools will
>>>>> continue to work without upgrade.
>>>>>
>>>>>
>>>>> If you choose to upgrade a pool to take advantage of new features you
>>>>> will no longer be able to use it with sources prior to today. 'zfs
>>>>> send/recv' is not expected to inter-operate between different pool
>>>>> versions.
>>>>
>>>> The MFC went in r192498. Please let me know if you have any problems.
>>>
>>> No a real problem but maybe worth mentioning:
>>>
>>> on FreeBSD morzine.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Tue May
>>> 26
>>> 15:37:48 CEST 2009 r...@morzine.restart.bel:/usr/obj/usr/src/sys/MORZINE
>>>  i386
>>>
>>> [r...@morzine ~]# zdb rpool
>>>   version=13
>>>   name='rpool'
>>>   state=0
>>>   txg=959
>>>   pool_guid=17669857244588609348
>>>   hostid=2315842372
>>>   hostname='unset'
>>>   vdev_tree
>>>       type='root'
>>>       id=0
>>>       guid=17669857244588609348
>>>       children[0]
>>>               type='mirror'
>>>               id=0
>>>               guid=3225603179255348056
>>>               metaslab_array=23
>>>               metaslab_shift=28
>>>               ashift=9
>>>               asize=51534888960
>>>               is_log=0
>>>               children[0]
>>>                       type='disk'
>>>                       id=0
>>>                       guid=17573085726489368265
>>>                       path='/dev/da0p2'
>>>                       whole_disk=0
>>>               children[1]
>>>                       type='disk'
>>>                       id=1
>>>                       guid=2736169600077218893
>>>                       path='/dev/da1p2'
>>>                       whole_disk=0
>>> Assertion failed: (?Ąuč? ėŪ¨´&), function mp->m_owner == NULL, file
>>>
>>> /usr/src/cddl/lib/libzpool/../../../cddl/contrib/opensolaris/lib/libzpool/common/kernel.c,
>>> line 112.
>>>

Re: zpool scrub hangs on 7.2-stable

2009-09-20 Thread Artem Belevich
Do you have the ZIL disabled? I think I saw the same scrub stall on -7
when I had vfs.zfs.zil_disable=1. After re-enabling the ZIL, the scrub
proceeded normally.
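
A quick way to check, and to re-enable it if it is off (whether the sysctl
can be flipped at runtime depends on the ZFS version; the loader.conf route
works after a reboot):

  # sysctl vfs.zfs.zil_disable    (1 means the ZIL is disabled)
  then remove vfs.zfs.zil_disable="1" from /boot/loader.conf, or set it
  to "0", and reboot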

--Artem



On Sun, Sep 20, 2009 at 2:42 PM, Christof Schulze
 wrote:
> Hello,
>
> currently I am running a 7.2 stable with zfs v13.
> Things work nicely except that zpool scrub hangs without disk activity.
> I do not get any error messages in dmesg or /var/log/messages and therefore I
> do not know where to look further.
>
> Is this a known issue or should I investigate? If the latter is the case I
> would need some help doing so.
>
> % uname -a                                           ~
> FreeBSD ccschu935 7.2-STABLE FreeBSD 7.2-STABLE #0: Tue Jul  7 04:56:00 CEST
> 2009     r...@ccschu935:/usr/obj/usr/src/sys/GENERIC  amd64
> % zpool status                                       ~
>  pool: tank
>  state: ONLINE
>  scrub: scrub in progress for 0h3m, 0,00% done, 3370h48m to go
> config:
>
>        NAME        STATE     READ WRITE CKSUM
>        tank        ONLINE       0     0     0
>          ad0s6     ONLINE       0     0     0
>          ad0s3f    ONLINE       0     0     0
>          ad0s3e    ONLINE       0     0     0
>        cache
>          mmcsd0    UNAVAIL      0     0     0  cannot open
>
> errors: No known data errors
>
>
> kind Regards
>
> Christof
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: 8.0-RC1 ZFS-root installscript

2009-10-06 Thread Artem Belevich
> - untested support for raidz, I can't test this because virtualbox only
> provides one BIOS disk to the bootloader and raidz needs at least two
> disks for booting :-/

You can try creating a raidz pool from multiple GPT partitions on the same disk.
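
A rough sketch for boot-loader testing only; the disk name, partition sizes
and pool name are made up, and older gpart may want the size given as a
sector count:

  # gpart create -s gpt ada0
  # gpart add -t freebsd-zfs -s 2G ada0   (repeat twice more -> ada0p1..ada0p3)
  # zpool create testz raidz ada0p1 ada0p2 ada0p3

For an actual boot test you would presumably also want the usual
freebsd-boot partition with gptzfsboot on the same disk.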

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: whats best pracfive for ZFS on a whole disc these days ?

2009-10-26 Thread Artem Belevich
> Unfortunately it appears ZFS doesn't search for GPT partitions so if you
> have them and swap the drives around you need to fix it up manually.

When I used raw disks or GPT partitions, if the disk order changed the
pool would come up in DEGRADED or UNAVAILABLE state. Even then, all
that had to be done was to export and re-import the pool; after the
re-import it was back to ONLINE.

Now I'm using GPT labels (gpart add -l) specifically because that avoids
issues with disk order or driver changes. The pool I've built from GPT
labels has survived several migrations between different
controllers/drivers, adX (ata) -> daX (SATA disks on mpt) -> adaX
(ahci), and multiple drive permutations without any manual intervention
at all. All that was done on 8-RC1/amd64.
I have also successfully imported the pool on OpenSolaris and back
again on FreeBSD.
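
A minimal sketch of the label-based setup (device names, label names and
the mirror layout are just examples):

  # gpart add -t freebsd-zfs -l disk01 ada0
  # gpart add -t freebsd-zfs -l disk02 ada1
  # zpool create tank mirror /dev/gpt/disk01 /dev/gpt/disk02

After that the pool members are referenced as /dev/gpt/disk01 and
/dev/gpt/disk02 no matter which driver or controller the disks end up on.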

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"