Re: [zfs-discuss] zpool does not like iSCSI ?

2010-11-03 Thread Andreas Koppenhoefer
After a while I was able to track down the problem.
During the boot process the service filesystem/local gets enabled long before 
iscsi/initiator. The start method of filesystem/local mounts ufs, swap and 
everything else from /etc/vfstab.
Some recent patch added a "zfs mount -a" to the filesystem/local start method.
Of course "zfs mount -a" will not find our iSCSI zpools and should do nothing 
at all. But in our case it finds and imports a zpool "data" located on a 
locally attached disk.

I suspect that the zpool management information stored within zpool "data" 
contains a pointer to the (at that time) inaccessible zpool "iscsi1", which 
gets marked as "FAULTED".
Therefore "zpool list" shows the missing zpool as "FAULTED", since all devices 
of iscsi1 are inaccessible.
Some time later in the boot process iscsi/initiator gets enabled. After all 
iscsi targets come online, one would expect the zpool to change state, because 
all devices are accessible now. But that does not happen.

As a workaround I've written a service manifest and start method. The service 
is fired up after iscsi/initiator. The start method scans the output of 
"zpool list" for faulted pools with a size of "-". For each faulted zpool it 
does a "zpool export pool-name" followed by a re-import with "zpool import 
pool-name".
After that the previously faulted iscsi zpool will be online again (unless you 
have other problems).

But be aware, there is a race condition: you have to wait some time (at least 
10 seconds) between the export and the re-import of a zpool. Without a wait in 
between, the fault condition will not get cleared.
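
For illustration, here is a minimal ksh sketch of such a start method (not our 
exact script; the health/size test and the 15-second sleep are assumptions 
based on the behaviour described above):

#!/bin/ksh
# Re-import any pool that came up FAULTED because its iSCSI devices
# were not yet reachable when "zfs mount -a" ran during boot.
zpool list -H -o name,size,health | while read name size health; do
    if [ "$health" = "FAULTED" ] && [ "$size" = "-" ]; then
        zpool export "$name" || continue
        sleep 15        # give the fault condition time to clear (see above)
        zpool import "$name"
    fi
done
exit 0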

Again: you may encounter this case only if...
... you're running some recent kernel patch level (in our case 142909-17)
... *and* you have placed zpools on both iscsi and non-iscsi devices.

- Andreas
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool does not like iSCSI ?

2010-11-03 Thread Markus Kovero
> Again: you may encounter this case only if...
> ... you're running some recent kernel patch level (in our case 142909-17)
> ... *and* you have placed zpools on both iscsi and non-iscsi devices.

Witnessed the same behavior with osol_134, but it seems to be fixed in 147 at 
least. No idea about Solaris, though.

Yours
Markus Kovero



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [OpenIndiana-discuss] format dumps the core

2010-11-03 Thread Cindy Swearingen

Moazam,

Thanks for the update. I hope this is Roy's issue too.

I can see that format would freak out over ext3, but it
shouldn't core dump.

Cindy

On 11/02/10 17:00, Moazam Raja wrote:

Fixed!

It turns out the problem was that we pulled these two disks from a
Linux box and they were formatted with ext3 on partition 0 for the
whole disk, which was somehow causing 'format' to freak out.

So, we fdisk'ed the p0 slice to delete the Linux partition and then created a
SOLARIS2 type partition on it. That worked, and there are no more crashes
during the format command.
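
In case it helps others, a rough sketch of that step (the exact invocation may
differ; the device name is just a placeholder for the affected disk):

# replace the Linux fdisk table with a single Solaris2 partition
# spanning the whole disk
fdisk -B /dev/rdsk/cXtYdZp0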

Cindy, please let the format team know about this since I'm sure
others will also run into this problem at some point if they have a
mixed Linux/Solaris environment.


-Moazam

On Tue, Nov 2, 2010 at 3:15 PM, Cindy Swearingen wrote:

Hi Moazam,

The initial diagnosis is that the LSI controller is reporting bogus
information. It looks like Roy is using a similar controller.

You might report this problem to LSI, but I will pass this issue
along to the format folks.

Thanks,

Cindy

On 11/02/10 15:26, Moazam Raja wrote:

I'm having the same problem after adding 2 SSD disks to my machine.
The controller is LSI SAS9211-8i PCI Express.

# format
Searching for disks...Arithmetic Exception (core dumped)



# pstack core.format.1016
core 'core.format.1016' of 1016:format
 fee62e4a UDiv (4, 0, 8046bf0, 8046910, 80469a0, 80469c0) + 2a
 08079799 auto_sense (4, 0, 8046bf0, 1c8) + 281
 080751a6 add_device_to_disklist (8047930, 8047530, feffb8f4, 804716c) + 62a
 080746ff do_search (0, 1, 8047d98, 8066576) + 273
 0806658d main (1, 8047dd0, 8047dd8, 8047d8c) + c1
 0805774d _start   (1, 8047e88, 0, 8047e8f, 8047e99, 8047ead) + 7d


I'm on b147.

# uname -a
SunOS geneva5 5.11 oi_147 i86pc i386 i86pc Solaris


On Tue, Nov 2, 2010 at 7:17 AM, Joerg Schilling wrote:

Roy Sigurd Karlsbakk wrote:


Hi all

(crossposting to zfs-discuss)

This error also seems to occur on osol 134. Any idea what this might be?


ioctl(4, USCSICMD, 0x08046910) = 0
ioctl(4, USCSICMD, 0x08046900) = 0
ioctl(4, USCSICMD, 0x08046570) = 0
ioctl(4, USCSICMD, 0x08046570) = 0
Incurred fault #8, FLTIZDIV %pc = 0xFEE62E4A
siginfo: SIGFPE FPE_INTDIV addr=0xFEE62E4A
Received signal #8, SIGFPE [default]
siginfo: SIGFPE FPE_INTDIV addr=0xFEE62E4A
r...@tos-backup:~#

You need to find out at which source line the integer division by zero
occurs.
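
One way to do that (illustrative only; this assumes the format binary and the
core file shown in the pstack output above, and that mdb can resolve the libc
symbol) is:

mdb /usr/sbin/format core.format.1016
> $C                  # C stack backtrace of the faulting thread
> UDiv+0x2a::dis      # disassemble around the offset reported by pstack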

Jörg

--
 EMail: jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
        j...@cs.tu-berlin.de (uni)
        joerg.schill...@fokus.fraunhofer.de (work)
 Blog:  http://schily.blogspot.com/
 URL:   http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive and locking

2010-11-03 Thread Chris Mosetick
Sorry I'm not able to provide more insight, but I thought some of the concepts
in this article might help you, as well as Mike's replication script, also
available on this page:
http://blog.laspina.ca/ubiquitous/provisioning_disaster_recovery_with_zfs

You might also want to look at InfraGeeks' auto-replicate script:

http://www.infrageeks.com/groups/infrageeks/wiki/8fb35/zfs_autoreplicate_script.html

If the auto-replicate script works for you without locking things up, then you
are probably doing something wrong in your own setup.  This script works great
for me on b134 hosts.  One thing you might do to enable faster transfer speeds
is to enable blowfish-cbc in your /etc/ssh/sshd_config and then modify the
script to use that cipher.
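
For example (a hedged sketch, not the actual script; the pool, snapshot, and
host names are placeholders):

# On the receiving host, allow the cheaper cipher in /etc/ssh/sshd_config,
# e.g. "Ciphers blowfish-cbc,aes128-ctr,aes128-cbc", and restart sshd.
# Then have the script send over ssh with that cipher:
zfs send tank/data@snap1 | ssh -c blowfish-cbc backuphost zfs receive -F backup/data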

Cheers,

-Chris
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS/iSCSI Issues when migrating.

2010-11-03 Thread kzlli
So I'm going to try to outline the problem as thoroughly as possible and 
hopefully someone can help.

I have a b134 box with a zpool on which I created a block device and set it to 
export via iscsi. That block device was imported onto FreeNAS (ZFS v6, V0.7.1 
5771) and set up as a simple stripe. Now I want to migrate from the *BSD-based 
FreeNAS to another b134 box. I set up the initiator on the b134 box and can 
see the target:

lbari...@alpha:~# iscsiadm list target -S
Target: iqn.1986-03.com.sun:02:086cf3d8-5e35-612f-c778-db39ffa7cccf
Alias: -
TPGT: 1
ISID: 402a
Connections: 1
LUN: 0
 Vendor:  SUN 
 Product: COMSTAR 
 OS Device Name: /dev/rdsk/c6t600144F0180005004C1D12070001d0s2


__


lbari...@alpha:~# iscsiadm list target -v
Target: iqn.1986-03.com.sun:02:086cf3d8-5e35-612f-c778-db39ffa7cccf
Alias: -
TPGT: 1
ISID: 402a
Connections: 1
CID: 0
  IP address (Local): 192.168.2.21:49276
  IP address (Peer): 192.168.2.12:3260
  Discovery Method: SendTargets 
  Login Parameters (Negotiated):
Data Sequence In Order: yes
Data PDU In Order: yes
Default Time To Retain: 20
Default Time To Wait: 2
Error Recovery Level: 0
First Burst Length: 65536
Immediate Data: yes
Initial Ready To Transfer (R2T): yes
Max Burst Length: 262144
Max Outstanding R2T: 1
Max Receive Data Segment Length: 32768
Max Connections: 1
Header Digest: NONE
Data Digest: NONE

The problem is that when I try to import it in opensol I get the following:

lbari...@alpha:~# zpool import
  pool: iscsi
id: 3764978642237279050
 state: UNAVAIL
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

        iscsi                                  UNAVAIL  insufficient replicas
          c6t600144F0180005004C1D12070001d0p0  UNAVAIL  cannot open


_
lbari...@alpha:~# zpool import -f iscsi
cannot import 'iscsi': invalid vdev configuration

In case this is useful, output from zdb for labels:
lbari...@alpha:~# zdb -l /dev/rdsk/c6t600144F0180005004C1D12070001d0p0 

LABEL 0

version: 6
name: 'iscsi'
state: 0
txg: 12626435
pool_guid: 3764978642237279050
hostid: 2046694327
hostname: 'freenas.local'
top_guid: 13220149227215223439
guid: 13220149227215223439
vdev_tree:
type: 'disk'
id: 0
guid: 13220149227215223439
path: '/dev/da1'
whole_disk: 0
metaslab_array: 14
metaslab_shift: 31
ashift: 9
asize: 22762248208384

LABEL 1

version: 6
name: 'iscsi'
state: 0
txg: 12626435
pool_guid: 3764978642237279050
hostid: 2046694327
hostname: 'freenas.local'
top_guid: 13220149227215223439
guid: 13220149227215223439
vdev_tree:
type: 'disk'
id: 0
guid: 13220149227215223439
path: '/dev/da1'
whole_disk: 0
metaslab_array: 14
metaslab_shift: 31
ashift: 9
asize: 22762248208384

LABEL 2

failed to read label 2

LABEL 3

failed to read label 3



Although I can still access this data on the FreeNAS box, I made sure (via a 
few VMs) that a v6 pool can be upgraded to v22 of ZFS, and that works without 
a problem. Short of getting another large amount of space and copying the data 
over to a new share, I'm not really sure what to do to get this pool to 
integrate with the b134 osol initiator; any help would be great. Thanks!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss