On 26/07/2011 15:16, Jeremy Chadwick wrote:
On Tue, Jul 26, 2011 at 02:50:28PM +0200, Jerome Herman wrote:
[very large snip]
So here I am starting to think that my disklabel and fsck are not in
sync with my kernel.
I've never heard of either of these utilities (bsdlabel/disklabel, nor
fsck) having to be "in sync with the kernel". My opinion at this moment
in time is that you're barking up the wrong tree.
Actually fsck and disklabel needed to be heavily modified in order to
support gvinum fully, mainly because the underlying device is very
different from a standard drive.
As I'm not familiar with the vinum infrastructure, GEOM-based or
without, others will have to assist with that. However, I'm still not
able to discern what your "type" of gvinum volume is -- is it a mirror,
a stripe, or a raid5?
Actually it is RAID 10 of a sort. The first halves of the three disks
are concatenated and mirrored onto the second halves of the same drives.
Others who are more familiar with vinum are probably going to ask you to
provide the full configuration details of your vinum setup, including
all the commands you issued to create it. "gvinum printconfig" would be
a great start.
Here is the output of gvinum printconfig:
drive c device /dev/ad7
drive b device /dev/ad6
drive a device /dev/ad5
volume backup
plex name backup.p1 org striped 1024s vol backup
plex name backup.p0 org striped 1024s vol backup
sd name backup.p1.s2 drive b len 1465137152s driveoffset 1465137417s plex backup.p1 plexoffset 2048s
sd name backup.p1.s1 drive a len 1465137152s driveoffset 1465137417s plex backup.p1 plexoffset 1024s
sd name backup.p1.s0 drive c len 1465137152s driveoffset 1465137417s plex backup.p1 plexoffset 0s
sd name backup.p0.s2 drive c len 1465137152s driveoffset 265s plex backup.p0 plexoffset 2048s
sd name backup.p0.s1 drive b len 1465137152s driveoffset 265s plex backup.p0 plexoffset 1024s
sd name backup.p0.s0 drive a len 1465137152s driveoffset 265s plex backup.p0 plexoffset 0s
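
As a sanity check on the layout above, the subdisk offsets can be verified with a few lines of Python. This is only a sketch: the numbers are copied from the printconfig output, the variable and function names are illustrative and not part of gvinum, and it assumes the "s" suffix means 512-byte sectors and that a striped plex with equal subdisks exposes the sum of their lengths.

```python
# Values copied from the "gvinum printconfig" output (unit: sectors).
SUBDISK_LEN = 1465137152   # "len" of every subdisk
P0_OFFSET = 265            # "driveoffset" of the backup.p0 subdisks
P1_OFFSET = 1465137417     # "driveoffset" of the backup.p1 subdisks
SECTOR_BYTES = 512         # assumption: gvinum "s" suffix = 512-byte sectors
SUBDISKS_PER_PLEX = 3


def extent(offset, length):
    """Return the half-open sector range [start, end) used on a drive."""
    return (offset, offset + length)


def overlaps(a, b):
    """True if two half-open ranges share at least one sector."""
    return a[0] < b[1] and b[0] < a[1]


p0 = extent(P0_OFFSET, SUBDISK_LEN)
p1 = extent(P1_OFFSET, SUBDISK_LEN)

# Assumption: plex size is the sum of its (equal-sized) subdisks.
plex_sectors = SUBDISKS_PER_PLEX * SUBDISK_LEN
plex_bytes = plex_sectors * SECTOR_BYTES

print("p0 extent on each drive:", p0)
print("p1 extent on each drive:", p1)
print("halves overlap:", overlaps(p0, p1))
print("p1 starts right after p0:", p1[0] == p0[1])
print("plex size in bytes:", plex_bytes)
```

On every drive the second subdisk starts at the exact sector where the first one ends, so the two halves are adjacent and do not overlap.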
By the way, I did run make buildworld and make installworld.
Results:
a) it did reboot and started fine
b) it did reboot in 43 seconds (according to monitoring) instead of
8+ minutes.
c) fsck is now working fine, in under 10 minutes.
Boy, I love it when I do something completely stupid and it works. (This
is a test machine by the way, I would not do this in production.)
Furthermore, could you please provide the data I asked for with regards
to your storage devices? In this case, /dev/ad5, /dev/ad6, and /dev/ad7
(assuming those are all which are on the system)? Let's try to rule out
ANY underlying disk issues first, otherwise the rest of the above may
be wasted effort.
I completely agree with "removing underlying issues first", that is why
when I realized that my base install was borked I went for the make
installworld first.
The dmesg is very long (it holds about 12 reboots),
but here is the rest:
*> /etc/fstab*
# Device            Mountpoint  FStype  Options  Dump  Pass#
/dev/ad4s1a         /           ufs     rw       1     1
/dev/ad4s1b         none        swap    sw       0     0
/dev/ad4s1d         /var        ufs     rw       2     2
/dev/ad4s1e         /usr        ufs     rw       2     2
/dev/ad4s1f         /data       ufs     rw       2     2
/dev/gvinum/backup  /backup     ufs     rw       2     2
proc                /proc       procfs  rw       0     0
*> sysctl kern.disks*
kern.disks: ad7 ad6 ad5 ad4
*> atacontrol list*
ATA channel 0:
Master: no device present
Slave: no device present
ATA channel 2:
Master: ad4 <ST31500341AS/CC1H> SATA revision 2.x
Slave: ad5 <ST31500341AS/CC1H> SATA revision 2.x
ATA channel 3:
Master: ad6 <ST31500341AS/CC1H> SATA revision 2.x
Slave: ad7 <ST31500341AS/CC1H> SATA revision 2.x
*> atacontrol cap ad5*
Protocol SATA revision 2.x
device model ST31500341AS
serial number 9VS4QNSC
firmware revision CC1H
cylinders 16383
heads 16
sectors/track 63
lba supported 268435455 sectors
lba48 supported 2930277168 sectors
dma supported
overlap not supported
Feature Support Enable Value Vendor
write cache yes yes
read ahead yes yes
Native Command Queuing (NCQ) yes - 31/0x1F
Tagged Command Queuing (TCQ) no no 31/0x1F
SMART yes yes
microcode download yes yes
security yes no
power management yes yes
advanced power management no no 0/0x00
automatic acoustic management yes yes 254/0xFE 254/0xFE
ad6 and ad7 are identical except for the serial number.
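
The reported capacity can be cross-checked against the gvinum layout with a little arithmetic. The sketch below is illustrative only: the sector counts are copied from the atacontrol and printconfig output in this message, the names are made up for the example, and a 512-byte sector size is assumed.

```python
# Figures copied from this message (unit: sectors unless noted).
LBA48_SECTORS = 2930277168   # "lba48 supported" from atacontrol cap ad5
SUBDISK_LEN = 1465137152     # gvinum subdisk "len"
P1_OFFSET = 1465137417       # gvinum "driveoffset" of the second-half subdisks
SECTOR_BYTES = 512           # assumption: 512-byte sectors

# Total drive capacity in bytes.
capacity_bytes = LBA48_SECTORS * SECTOR_BYTES

# First sector past the end of the second subdisk on each drive.
last_used = P1_OFFSET + SUBDISK_LEN

# Sectors left unused at the end of the drive; must not be negative.
slack = LBA48_SECTORS - last_used

print("capacity in bytes:", capacity_bytes)
print("end of second subdisk:", last_used)
print("unused sectors at end of drive:", slack)
print("layout fits on the drive:", slack >= 0)
```

With these numbers both subdisks fit inside the drive, with a small number of sectors left over at the end.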
*> smartctl -a /dev/ad5*
smartctl 5.41 2011-06-09 r3365 [FreeBSD 8.2-RELEASE amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.11
Device Model: ST31500341AS
Serial Number: 9VS4QNSC
LU WWN Device Id: 5 000c50 02d019b97
Firmware Version: CC1H
User Capacity: 1,500,301,910,016 bytes [1.50 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Tue Jul 26 14:21:07 2011 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection:
Enabled.
Self-test execution status: ( 0) The previous self-test routine
completed
without error or no self-test
has ever
been run.
Total time to complete Offline
data collection: ( 617) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection
on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 255) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x103f) SCT Status supported.
SCT Error Recovery Control
supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   115   099   006    Pre-fail  Always   -           87140948
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always   -           0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always   -           24
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always   -           0
  7 Seek_Error_Rate         0x000f   074   060   030    Pre-fail  Always   -           28683116
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always   -           5418
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always   -           24
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always   -           0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always   -           0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always   -           1
189 High_Fly_Writes         0x003a   099   099   000    Old_age   Always   -           1
190 Airflow_Temperature_Cel 0x0022   060   047   045    Old_age   Always   -           40 (Min/Max 38/53)
194 Temperature_Celsius     0x0022   040   053   000    Old_age   Always   -           40 (0 18 0 0)
195 Hardware_ECC_Recovered  0x001a   039   023   000    Old_age   Always   -           87140948
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always   -           0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline  -           261580688201002
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline  -           3949374025
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline  -           4071332224
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed without error  00%        5417             -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
*>smartctl -a /dev/ad6*
smartctl 5.41 2011-06-09 r3365 [FreeBSD 8.2-RELEASE amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.11
Device Model: ST31500341AS
Serial Number: 9VS4MXD5
LU WWN Device Id: 5 000c50 02cd319bf
Firmware Version: CC1H
User Capacity: 1,500,301,910,016 bytes [1.50 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Tue Jul 26 14:22:34 2011 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection:
Enabled.
Self-test execution status: ( 0) The previous self-test routine
completed
without error or no self-test
has ever
been run.
Total time to complete Offline
data collection: ( 609) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection
on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 255) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x103f) SCT Status supported.
SCT Error Recovery Control
supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   116   099   006    Pre-fail  Always   -           107861899
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always   -           0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always   -           23
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always   -           0
  7 Seek_Error_Rate         0x000f   074   060   030    Pre-fail  Always   -           26454013
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always   -           5419
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always   -           23
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always   -           0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always   -           0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always   -           0
189 High_Fly_Writes         0x003a   096   096   000    Old_age   Always   -           4
190 Airflow_Temperature_Cel 0x0022   056   044   045    Old_age   Always   In_the_past 44 (0 14 56 39)
194 Temperature_Celsius     0x0022   044   056   000    Old_age   Always   -           44 (0 18 0 0)
195 Hardware_ECC_Recovered  0x001a   057   029   000    Old_age   Always   -           107861899
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always   -           0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline  -           161821482816811
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline  -           2546745907
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline  -           3981257233
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed without error  00%        5417             -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
*> smartctl -a /dev/ad7*
smartctl 5.41 2011-06-09 r3365 [FreeBSD 8.2-RELEASE amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.11
Device Model: ST31500341AS
Serial Number: 9VS4FSDY
LU WWN Device Id: 5 000c50 0274ee0d7
Firmware Version: CC1H
User Capacity: 1,500,301,910,016 bytes [1.50 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Tue Jul 26 14:23:08 2011 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection:
Enabled.
Self-test execution status: ( 0) The previous self-test routine
completed
without error or no self-test
has ever
been run.
Total time to complete Offline
data collection: ( 617) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection
on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 255) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x103f) SCT Status supported.
SCT Error Recovery Control
supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   110   099   006    Pre-fail  Always   -           26689916
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always   -           0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always   -           24
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always   -           8
  7 Seek_Error_Rate         0x000f   073   060   030    Pre-fail  Always   -           22747051
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always   -           5401
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always   -           0
 12 Power_Cycle_Count       0x0032   100   037   020    Old_age   Always   -           24
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always   -           0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always   -           0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always   -           1
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always   -           0
190 Airflow_Temperature_Cel 0x0022   058   045   045    Old_age   Always   In_the_past 42 (Min/Max 35/55)
194 Temperature_Celsius     0x0022   042   055   000    Old_age   Always   -           42 (0 18 0 0)
195 Hardware_ECC_Recovered  0x001a   041   028   000    Old_age   Always   -           26689916
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always   -           0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline  -           18356690227381
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline  -           125910856
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline  -           1003871140
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed without error  00%        5399             -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
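
For anyone skimming three SMART dumps like these, a small script can pull out the rows worth a second look. The following is a rough sketch rather than a diagnostic tool: it assumes one attribute per line (as smartctl prints them natively), the function name and the watch list of attributes are choices made for this example, and the sample rows are copied from the ad7 output above.

```python
import re

# One attribute row: ID, name, flag, value, worst, thresh, type,
# updated, when_failed, and the leading integer of the raw value.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]{4})\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)

# Assumption: attributes whose non-zero raw value deserves attention.
# This list is illustrative, not an official smartmontools list.
WATCH = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "Reported_Uncorrect",
}


def flag_rows(text):
    """Return (attribute, reason) pairs for rows that look noteworthy."""
    findings = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # not an attribute row
        name, when_failed, raw = m.group(2), m.group(9), int(m.group(10))
        if name in WATCH and raw > 0:
            findings.append((name, "raw=%d" % raw))
        if when_failed != "-":
            findings.append((name, "WHEN_FAILED=" + when_failed))
    return findings


# Three rows copied from the ad7 output.
SAMPLE = """\
  5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 8
190 Airflow_Temperature_Cel 0x0022 058 045 045 Old_age Always In_the_past 42 (Min/Max 35/55)
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
"""

for name, reason in flag_rows(SAMPLE):
    print(name, reason)
```

On the sample rows this reports the 8 reallocated sectors and the past airflow temperature threshold crossing on ad7, and leaves the zero pending-sector count alone.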