Re: Panic with amr and 5.4-PRERELEASE

2005-03-15 Thread Pierre DAVID
On Tue, Mar 15, 2005 at 11:12:49AM +0100, Constant, Benjamin wrote:
> 
> Fyi,
> 
> The tool is working fine but doesn't show up rebuild state (just removed a
> drive and card bios is showing up rebuild on hotspare).
> I think it's coming from the driver as it doesn't say that logical device is
> in rebuild state.
> 
> [EMAIL PROTECTED]:~: dmesg | grep amrd0
> amrd0:  on amr0
> amrd0: 35073MB (71829504 sectors) RAID 1 (degraded)
> Mounting root from ufs:/dev/amrd0s1a
> 
> [EMAIL PROTECTED]:~: ./amrstat -l0
> Drive 0:34.25 GB, RAID1 
> degraded
>

You're right. The driver doesn't report the rebuild state.

> 
> By the way, this tool seems promising, thanks for the effort!
> 

Coupled with a simple Perl script which periodically calls amrstat
on all volumes, it has proved very useful in the past.

However, under heavy I/O load, the amr driver does not always return
meaningful information. This is why amrstat.c makes a few attempts.
Since the result is only used to send an alert mail, we ignore these
"false positives".
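
The retry idea can be sketched as follows. This is only an illustration
in Python, not the Perl script or the amrstat.c code mentioned above.
The function names, the number of attempts, and the delay are
assumptions; the only things taken from this thread are the
"./amrstat -l0" invocation and the word "degraded" in its output.

```python
import subprocess
import time


def is_degraded(output: str) -> bool:
    """Return True if the amrstat output mentions a degraded volume."""
    return "degraded" in output.lower()


def run_amrstat(volume: int, binary: str = "./amrstat") -> str:
    """Run amrstat for one logical volume, e.g. './amrstat -l0'."""
    result = subprocess.run(
        [binary, f"-l{volume}"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


def confirmed_degraded(volume, query=run_amrstat, attempts=3, delay=1.0):
    """Report a volume as degraded only if every attempt says so.

    Assumption: a single bad answer under heavy I/O load may be noise,
    so one healthy answer is enough to cancel the alert.
    """
    for attempt in range(attempts):
        if not is_degraded(query(volume)):
            return False
        if attempt + 1 < attempts:
            time.sleep(delay)
    return True
```

A periodic job could call confirmed_degraded(0) for each volume and
send a mail only when it returns True. The query parameter exists so
that the logic can be exercised without a real controller.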

Pierre David
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Panic with amr and 5.4-PRERELEASE

2005-03-16 Thread Pierre DAVID
On Wed, Mar 16, 2005 at 04:46:01PM +0100, Rutger Bevaart wrote:
> 
> can't you get that information by using the combination of compat4x
> package, the amrcontrol tool from E Moore? (see
> http://people.freebsd.org/~emoore/MegaRAID_SCSI/). i've actually succeeded
> in rebuilding a RAID on a Dell PE2850 using the MEGAMGR application which
> provides the same GUI as the BIOS interface does. get the ASCII character
> set of your terminal right (search google) and you can manage.
> 
> there seems to be no sourcecode available of the amrcontrol tool that
> allows a clean build unfortunately.
> 

This amrcontrol tool was not available at the time I wrote amrstat.c,
according to "Release History.txt" (I remember I didn't find anything
on www.google.com/bsd).

However, we will try it. Thanks for the information.

Pierre David


Show stopper for large disks with 5.4-RELEASE

2005-06-08 Thread Pierre DAVID
Hi,

we are setting up a mail server for ~50 000 users, with around 1.8TB
on a DAS storage (HP MSA 500).

We were planning to use FreeBSD (5.4-RELEASE), as with all other
servers in our machine room.

However, we are encountering a show stopper: after an unclean
shutdown, the snapshot that fsck creates takes too long (more than
20 minutes). For most of that time, all I/O to this large disk is
frozen, so the server cannot serve our clients. Our SLA constraints
do not allow such recovery times.

The options used to create the file system were the standard sysinstall
options. During normal operation, performance is very good.

Our tests showed that Linux doesn't present the same problem. With
ReiserFS, the reboot after crash takes only 20 seconds to read the
journal and recover the file system.

Is the snapshot time for large volumes a known problem? We don't
see such a long time with smaller file systems.

Do you have any suggestions that would let us stay with FreeBSD
rather than switch to Linux for this service?


Philippe Pegon & Pierre David


Re: Show stopper for large disks with 5.4-RELEASE

2005-06-08 Thread Pierre DAVID
On Wed, Jun 08, 2005 at 08:52:39PM +0800, Xin LI wrote:
> 
> Would you please provide a bit more of information so we can investigate
> what was happening, like:
> 
>   - first few lines dumpfs(8) output from your storage filesystem
>   - df -i on your storage filesystem
>   - dmesg.boot from your /var/run
> 
> I think the delay is too long.  Taking a snapshot on a large volume
> is slow, but should not be that slow :-)  We run similar mail server
> at company, but with ~10x of users and 2x storage (divided into two
> RAID groups).  Hope I would be able to provide some help.
> 

The server with the MSA 500 is not available at this time (there is
another OS on it, sigh...).

We reproduced the problem with the planned backup server, with ~2TB
on a DAS storage (HP MSA 20).

seth ~ # newfs -O 2 -U /dev/da1
...
seth ~ # mount /dev/da1 /mail
seth ~ # df -g
Filesystem  1G-blocks Used Avail Capacity  Mounted on
/dev/da0s1a         0    0     0      65%  /
devfs               0    0     0     100%  /dev
/dev/da0s1g         9    0     8       0%  /local
/dev/da0s1f         0    0     0       0%  /tmp
/dev/da0s1d         7    0     6       9%  /usr
/dev/da0s1e         9    0     8       0%  /var
/dev/da1         1983    1  1823       0%  /mail

And the problem is worse:

seth ~ # time mksnap_ffs /mail /mail/.snap/snap1

real    69m18.083s
user    0m0.000s
sys     0m21.121s

First lines of "dumpfs /dev/da1" are:
magic   19540119 (UFS2) time    Wed Jun  8 17:03:14 2005
superblock location 65536   id  [ 42a6f77a a2574d5f ]
ncg     11413   size    1073741823  blocks  1039959213
bsize   16384   shift   14  mask    0xc000
fsize   2048    shift   11  mask    0xf800
frag    8   shift   3   fsbtodb 2
minfree 8%  optim   time    symlinklen 120
maxbsize 16384  maxbpg  2048    maxcontig 8 contigsumsize 8
nbfree  129917894   ndir    2   nifree  268798970   nffree  27
bpg     11761   fpg     94088   ipg     23552
nindir  2048    inopb   64  maxfilesize 140806241583103
sbsize  2048    cgsize  16384   csaddr  3000    cssize  184320
sblkno  40  cblkno  48  iblkno  56  dblkno  3000
cgrotor 8470    fmod    0   ronly   0   clean   0
avgfpdir 64 avgfilesize 16384
flags   soft-updates
fsmnt   /mail
volname         swuid   0
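
To put these figures in perspective, here is a small back-of-the-envelope
calculation in Python. It uses only numbers from this message (the
mksnap_ffs timing, and ncg, size, blocks and fsize from dumpfs). It
assumes that "size" and "blocks" are counted in fragments, and the
"seconds per cylinder group" figure is only a rough way to look at the
cost, on the assumption that snapshot creation work grows with the
number of cylinder groups. The helper name parse_time is made up for
this sketch.

```python
def parse_time(value: str) -> float:
    """Convert a shell `time` value such as '69m18.083s' to seconds."""
    minutes, seconds = value.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Figures copied from the mksnap_ffs timing and the dumpfs header.
real_seconds = parse_time("69m18.083s")
ncg = 11413               # number of cylinder groups
size_frags = 1073741823   # file system size (assumed to be in fragments)
data_frags = 1039959213   # "blocks" field (assumed to be in fragments)
fsize = 2048              # fragment size in bytes

total_bytes = size_frags * fsize
data_kib = data_frags * fsize // 1024
seconds_per_cg = real_seconds / ncg

print(f"file system size:   {total_bytes / 2**40:.2f} TiB")
print(f"data area:          {data_kib} 1K-blocks")
print(f"snapshot time:      {real_seconds:.0f} s")
print(f"per cylinder group: {seconds_per_cg:.3f} s")
```

The data area computed this way can be compared with the 1K-blocks
value that "df -i" reports for /mail, which is a quick check that the
fragment-unit assumption holds.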

The complete "dumpfs /dev/da1" output is available on:
ftp://ftp8.fr.freebsd.org/pub/tmp/dumpfs


Here is the "df -i" output:

seth mail # df -i
Filesystem   1K-blocks    Used      Avail Capacity    iused     ifree %iused   Mounted on
/dev/da0s1a    1012974  607370     324568      65%     1399    139911     1%   /
devfs                1       1          0     100%        0         0   100%   /dev
/dev/da0s1g   10063838    4142    9254590       0%        6   1318904     0%   /local
/dev/da0s1f    1012974      14     931924       0%        8    141302     0%   /tmp
/dev/da0s1d    8122126  682654    6789702       9%   116342    943496    11%   /usr
/dev/da0s1e   10154158   25732    9316094       0%      258   1318652     0%   /var
/dev/da1    2079918426 1232068 1912292884       0%        4 268798970     0%   /mail


The dmesg.boot is attached to this mail.

Just in case, we have also attached the kernel configuration file
to this mail.


Thanks in advance for your help,


Philippe Pegon & Pierre David
Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.4-STABLE #2: Thu May 19 21:43:04 CEST 2005
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/SETH
ACPI APIC Table: 
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2790.96-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf29  Stepping = 9
  Features=0xbfebfbff
  Hyperthreading: 2 logical CPUs
real memory  = 1073717248 (1023 MB)
avail memory = 1041145856 (992 MB)
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0  irqs 0-15 on motherboard
ioapic1  irqs 16-31 on motherboard
ioapic2  irqs 32-47 on motherboard
ioapic3  irqs 48-63 on motherboard
npx0:  on motherboard
npx0: INT 16 interface
acpi0:  on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x920-0x923 on acpi0
cpu0:  on acpi0
pcib0:  on acpi0
pci0:  on pcib0
pci0:  at device 3.0 (no driver attached)
pci0:  at device 4.0 (no driver attached)
pci0:  at device 4.2 (no driver attached)
isab0:  at device 15.