Persistent reservation behaviour/compliance with redundant controllers

2013-12-25 Thread Matthias Eble
Hi all,

I'm experiencing a behaviour that, from my point of view, doesn't comply with
the SPC-3/4 standards. I have read the T10 drafts to understand SCSI-3
persistent reservations (PR). Probably I simply got the standard wrong, but
maybe somebody can shed some light on the situation.

My understanding of SPC-3/4 is that with PR, registrations should happen on any
I_T Nexus accessing a volume. To me, in a dm-multipath environment, this
translates to "register every single path".

But that doesn't work on our 3Par 7400.
Now the question is, who is wrong? Me (likely :-), or HP/3Par (unlikely).


Here's the dmmp map
360002aca6e6b dm-6 3PARdata,VV
size=2.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 3:0:1:4  sdg  8:96    active ready running
  |- 3:0:3:4  sdl  8:176   active ready running
  |- 5:0:3:4  sdbg 67:160  active ready running
  `- 5:0:1:4  sdce 69:32   active ready running


Here are the commands:
1: starting with a clean state:
   # sg_persist --in --read-keys /dev/sdg
 3PARdata  VV  3122
 Peripheral device type: disk
 PR generation=0x3a, there are NO registered reservation keys

2: first registration (sdg) works fine:
   # sg_persist -d /dev/sdg --no-inquiry --out --register \
--param-sark=0x420480a02967

3: however registering sdl fails:
   # sg_persist -d /dev/sdl --no-inquiry --out --register \
--param-sark=0x420480a0296c
  persistent reserve out: scsi status: Reservation Conflict

When I --register-*ignore* the second device, the command succeeds.
But the first registration key for sdg gets substituted by the new one for sdl.
The same thing happens the other way around when sdg is register-ignore'd
again.

There can only be two registrations at a time: (sdg XOR sdl) and (sdbg XOR sdce).
Now my question is: does this comply with the standard?

My core problem is that I'd like to ensure that no registration is missing
by accident.
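
A rough sketch of how such a check could look, assuming every path registers its own distinct key (as in the commands above, where sdg and sdl use different keys). The function name and the dictionary layout are illustrative assumptions and not part of sg3_utils; the list of registered keys would have to come from something like `sg_persist --in --read-keys`.

```python
def missing_registrations(expected_keys, registered_keys):
    """Return the paths whose expected key is not among the registered keys.

    expected_keys:   dict mapping a path name (e.g. "sdg") to its reservation key
    registered_keys: iterable of keys as reported by the target
    """
    registered = set(registered_keys)
    return sorted(path for path, key in expected_keys.items()
                  if key not in registered)


# Keys taken from the commands above; the registered list mimics the
# observed state in which only the sdl key survived.
expected = {"sdg": 0x420480A02967, "sdl": 0x420480A0296C}
print(missing_registrations(expected, [0x420480A0296C]))
```

This only helps on a target that keeps one registration per path, which is exactly what is in question here.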

I hope that somebody on this list is kind enough to answer my question or
give me a hint. HP was not able to direct it to a capable person in the
last 9 months. *sigh*


Any help is appreciated!

Thanks in advance,
Matthias



3Par specific information:
3Par systems have a transparent controller (node) failover feature.
In the example above, scsi host3 has two paths to the same volume.
The paths are provided by two different controller nodes.
If one node fails, the other node can take over the path transparently.
To me it looks like the failover implementation is too transparent when it
comes to SCSI-3 PR.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Persistent reservation behaviour/compliance with redundant controllers

2014-01-06 Thread Matthias Eble
2014/1/6 Lee Duncan:
> On 12/25/2013 03:00 PM, Matthias Eble wrote:
>> Here's the dmmp map
>> 360002aca6e6b dm-6 3PARdata,VV
>> size=2.0T features='0' hwhandler='0' wp=rw
>> `-+- policy='round-robin 0' prio=1 status=active
>>   |- 3:0:1:4  sdg  8:96    active ready running
>>   |- 3:0:3:4  sdl  8:176   active ready running
>>   |- 5:0:3:4  sdbg 67:160  active ready running
>>   `- 5:0:1:4  sdce 69:32   active ready running
>>
>> There can only be two registrations at a time: (sdg XOR sdl) and (sdbg XOR 
>> sdce)
>> Now my question is: Does this comply to the standard?
>>
>
> I _believe_ the problem is that you are re-registering the same
> I_T_Nexus through /dev/sdl, your second attempt at registration, as you
> did when you used /dev/sdg, your original registration.


Can sdg and sdl be the same I_T_Nexus at the same time?
Right now, they are handled like that.
In my understanding, every SCSI disk device represents an I_T_Nexus.


# lsscsi -t | egrep '/dev/sd(g|l|bg|ce)'
[3:0:1:4]    disk    fc:0x20120002ac006e6b,0x14ad40  /dev/sdg
[3:0:3:4]    disk    fc:0x21120002ac006e6b,0x14ad80  /dev/sdl
[5:0:1:4]    disk    fc:0x22110002ac006e6b,0x0aad40  /dev/sdce
[5:0:3:4]    disk    fc:0x23110002ac006e6b,0x0aad80  /dev/sdbg


> What are you really trying to do? Are you testing that persistent
> reservations "work" or trying to figure them out?

I am testing PR on a specific storage system, which seems to behave differently
from the ones I used before.

> I have a "persistent reservations for dummies" document I wrote that I
> can send you off list, if you like.

I think I know how PRs work. Still, I'd be happy to read your document.


Re: Persistent reservation behaviour/compliance with redundant controllers

2014-01-06 Thread Matthias Eble
2014/1/7 James Bottomley:
> On Mon, 2014-01-06 at 23:53 +0100, Matthias Eble wrote:
>>
>> Can sdg and sdl be the same I_T_Nexus at a time?
>> Right now, they are handled like that.
>> In my understanding, every scsi disk device represents an I_T_Nexus.
>
> No, every SCSI disk is an I_T_L nexus.  There's no actual device object
> in Linux for an I_T nexus.

So, PR registrations are made for an I_T nexus using an I_T_L nexus.
Probably my previous systems had a 1:1 relation between I_T and I_T_L.

Is there a way to identify which I_T_L nexuses belong to the same I_T nexus?
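
One possible approach, sketched under an assumption: in the Linux H:C:T:L address, the host number stands for the initiator port and channel:target for the remote target port, so paths that share H:C:T would share an I_T nexus and differ only in the LUN. The helper name is made up for illustration.

```python
from collections import defaultdict


def group_by_it_nexus(paths):
    """Group device names by the H:C:T part of their H:C:T:L address.

    paths: dict mapping a device name to its "H:C:T:L" string.
    Assumption: H:C:T identifies the I_T nexus, L the logical unit.
    """
    groups = defaultdict(list)
    for dev, hctl in paths.items():
        host, channel, target, _lun = hctl.split(":")
        groups[(int(host), int(channel), int(target))].append(dev)
    return {nexus: sorted(devs) for nexus, devs in groups.items()}


# Addresses from the multipath map quoted earlier in the thread.
paths = {"sdg": "3:0:1:4", "sdl": "3:0:3:4", "sdbg": "5:0:3:4", "sdce": "5:0:1:4"}
for nexus, devs in sorted(group_by_it_nexus(paths).items()):
    print(nexus, devs)
```

With these four paths every device ends up in its own group, so by this criterion sdg and sdl would not share an I_T nexus.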


Re: Persistent reservation behaviour/compliance with redundant controllers

2014-01-22 Thread Matthias Eble
2014/1/7 James Bottomley:
>
> On Mon, 2014-01-06 at 23:53 +0100, Matthias Eble wrote:
> > 2014/1/6 Lee Duncan:
> > > On 12/25/2013 03:00 PM, Matthias Eble wrote:
> > >> Here's the dmmp map
> > >> 360002aca6e6b dm-6 3PARdata,VV
> > >> size=2.0T features='0' hwhandler='0' wp=rw
> > >> `-+- policy='round-robin 0' prio=1 status=active
> > >>   |- 3:0:1:4  sdg  8:96    active ready running
> > >>   |- 3:0:3:4  sdl  8:176   active ready running
> > >>   |- 5:0:3:4  sdbg 67:160  active ready running
> > >>   `- 5:0:1:4  sdce 69:32   active ready running
> > >>
> > >> There can only be two registrations at a time: (sdg XOR sdl) and (sdbg 
> > >> XOR sdce)
> > >> Now my question is: Does this comply to the standard?
> > >>
> > >
> > > I _believe_ the problem is that you are re-registering the same
> > > I_T_Nexus through /dev/sdl, your second attempt at registration, as you
> > > did when you used /dev/sdg, your original registration.
> >
> >
> > Can sdg and sdl be the same I_T_Nexus at a time?
> > Right now, they are handled like that.
> > In my understanding, every scsi disk device represents an I_T_Nexus.
>
> No, every SCSI disk is an I_T_L nexus.  There's no actual device object
> in Linux for an I_T nexus.


Hi All,

I'd like to document the progress and findings from many off-list emails with
HP's T10 members.
Maybe someone on the net will face the same problem.

First of all, the SPC wording isn't 100% precise. For most commands, the LUN
context is implicit. So where the standards say "I_T Nexus", I_T_L Nexuses are
meant, as the reservation commands are always LUN-specific.

That said, PR-registrations need to be done for every
I_T_L Nexus -> every single dmmp path (/dev/sdX)
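
As a small illustration of "one registration per path", the sketch below builds one sg_persist REGISTER command line per device node, using only the options that appear in the commands in this thread. The function name and the path-to-key mapping are assumptions for illustration; nothing is executed.

```python
def register_commands(path_keys):
    """Build one sg_persist REGISTER argv per path.

    path_keys: dict mapping a device node (e.g. "/dev/sdg") to its key.
    Only builds the argument lists; no command is run here.
    """
    return [
        ["sg_persist", "-d", dev, "--no-inquiry", "--out", "--register",
         "--param-sark=" + hex(key)]
        for dev, key in sorted(path_keys.items())
    ]


for argv in register_commands({"/dev/sdg": 0x420480A02967,
                               "/dev/sdl": 0x420480A0296C}):
    print(" ".join(argv))
```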

So we started to test the behaviour of the 3Par system.
It seems that there are some quirks in the 3Par implementation.
The error that led to my initial question is that the target port
identifier isn't included in the target's reservation handling.
Thus all PR commands from one host port are treated as coming from the same
nexus, regardless of the target port over which they were received
(as seen in commands #5 and #6 below, after issuing #2).
Note that the investigations haven't been finished.
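
The effect can be illustrated with a toy model. This is a sketch, not the 3Par implementation: the class and its names are invented, the host port name is a placeholder, and the REGISTER rule used here (the supplied reservation key must match the nexus' current key, with zero meaning "not registered") is a simplified reading of SPC-3. The only difference between the two modes is whether the target port is part of the lookup key.

```python
class ReservationConflict(Exception):
    """Stands in for the SCSI status RESERVATION CONFLICT."""


class PrTable:
    """Toy registration table for one logical unit."""

    def __init__(self, per_target_port=True):
        # per_target_port=False mimics the observed quirk: the target port
        # is left out of the lookup, so only the host port counts.
        self.per_target_port = per_target_port
        self.keys = {}

    def _nexus(self, host_port, target_port):
        return (host_port, target_port) if self.per_target_port else (host_port,)

    def register(self, host_port, target_port, rk=0, sark=0, ignore=False):
        nexus = self._nexus(host_port, target_port)
        current = self.keys.get(nexus, 0)  # 0 means "not registered"
        if not ignore and rk != current:
            raise ReservationConflict(nexus)
        if sark:
            self.keys[nexus] = sark      # register or change the key
        else:
            self.keys.pop(nexus, None)   # a zero SARK unregisters


# Target port names from the lsscsi output earlier in the thread;
# "host3" is a placeholder for the real initiator port name.
SDG = ("host3", "0x20120002ac006e6b")
SDL = ("host3", "0x21120002ac006e6b")

quirky = PrTable(per_target_port=False)
quirky.register(*SDG, sark=0x420480A02967)
try:
    quirky.register(*SDL, sark=0x420480A0296C)
except ReservationConflict:
    print("second path: Reservation Conflict")
```

With per_target_port=True the same two calls leave two independent registrations, which is the behaviour expected per path.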


For those who are interested, here are the findings (verbose output stripped):


1.# sg_persist --in --read-keys /dev/sdl
  3PARdata  VV  3122
  Peripheral device type: disk
  PR generation=0x44, there are NO registered reservation keys

register via sdl:
2.# sg_persist -vvv -d /dev/sdl --no-inquiry --out --register \
      --param-sark=0x420480a0296c
PR out: command (Register) successful

test for spc3r23 table 33 compliance (same key on registered I_T Nexus
should succeed): False
3.# sg_persist -vvv -d /dev/sdl --no-inquiry --out --register \
      --param-sark=0x420480a0296c
persistent reserve out: scsi status: Reservation Conflict
PR out: command failed

now with a *different key* (should conflict): True
4.# sg_persist -vvv -d /dev/sdl --no-inquiry --out --register \
      --param-sark=0x420480a0296d
persistent reserve out: scsi status: Reservation Conflict
PR out: command failed

Same behaviour using another path/I_T_L Nexus (should succeed in both cases):
5.# sg_persist -vvv -d /dev/sdg --no-inquiry --out --register \
      --param-sark=0x420480a0296c
persistent reserve out: scsi status: Reservation Conflict
PR out: command failed
6.# sg_persist -vvv -d /dev/sdg --no-inquiry --out --register \
      --param-sark=0x420480a0296d
persistent reserve out: scsi status: Reservation Conflict
PR out: command failed

Unregister via sdg :-/
7.# sg_persist -vvv -d /dev/sdg --no-inquiry --out --register \
      --param-rk=0x420480a0296c
PR out: command (Register) successful

Additionally, the READ FULL STATUS service action and ALL_TG_PT are currently
not supported.


That's it for now.

Thanks for your replies,
Matthias


Open/INQUIRY fails on RESERVE'd tape device

2014-01-23 Thread Matthias Eble
Hi list,

When a tape device is reserved with the legacy RESERVE/RELEASE commands,
INQUIRY only works through the SCSI generic (sg) device. On the SCSI tape (st)
devices, open() already fails:

# lsscsi -g | grep st15
[2:0:6:0]    tape    HP       Ultrium 5-SCSI   I5DZ  /dev/st15  /dev/sg17

# sg_vpd -vvv /dev/st15
  open /dev/st15 with flags=0x800
  error opening file: /dev/st15: Input/output error

# sg_vpd -vvv /dev/nst15
  open /dev/nst15 with flags=0x800
  error opening file: /dev/nst15: Input/output error

# sg_vpd -vvv /dev/sg17
open /dev/sg17 with flags=0x800
Supported VPD pages VPD page:
inquiry cdb: 12 01 00 00 fc 00
  duration=2 ms
inquiry: requested 252 bytes but got 22 bytes
   [PQual=0  Peripheral device type: tape]
  Supported VPD pages [sv]
  Unit serial number [sn]
  ...


So: should open() fail on a reserved tape device?
SPC-2 states that INQUIRY should never conflict.
Or does that only apply to the generic device?
Okay, it doesn't conflict, but open() fails. A SunOS st man page I found
states that INQUIRY shall be possible on reserved devices.

Of course, the INQUIRY succeeds once the reservation has been released.


Thanks in advance
Matthias


Re: Open/INQUIRY fails on RESERVE'd tape device

2014-01-24 Thread Matthias Eble
Hi all,

2014/1/24 Jeremy Linton:
> On 1/23/2014 4:02 PM, Matthias Eble wrote:
>> So: should open() fail on a reserved tape device?
>
> Yes, this is expected behavior for tape devices, reserve 6/release is
> sometimes used by backup applications in SAN environments as an arbitration
> mechanism across multiple machines.

You hit the nail on the head. The problem is that our backup application
issues INQUIRY on /dev/nst*, which breaks when the same application uses
RESERVE/RELEASE.

> Its not that the INQUIRY is failing, its that the st open sequence is
> doing a reserve/TUR/etc during the open.

This is exactly what I am facing. I just thought that it might not be
OK to issue these commands in st_open. But I guess there is no right or
wrong; it's just implemented that way, so applications need to deal with
it and use the generic device.

> If that fails then you can't open the drives sufficiently to send a
> inquiry via pass-through. In some environments you can bypass that
> processing with O_NDELAY/O_NONBLOCK. Or you just use the sg device which
> doesn't perform the tape open processing that st does.

I guess by "environments" you mean operating systems, as sg_vpd
also uses O_NONBLOCK, which doesn't help here:
  open("/dev/st15", O_RDONLY|O_NONBLOCK)  = -1 EIO (Input/output error)

But as this behaviour has been there for a long time, the backup vendor
needs to fix it IMO.
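
A minimal sketch of what "deal with it and use the generic device" could look like in an application: try the tape node first and fall back to the sg node when open() fails. The function name and the fake opener are made up for illustration; the fake opener only mimics the EIO seen above so the logic can be tried without a reserved drive.

```python
import errno
import os


def open_first_usable(candidates, opener=os.open):
    """Try each device node in order and return (path, fd) for the first
    one that opens; raise the last error if none does.

    opener is injectable so the logic can be exercised without hardware.
    """
    last_error = None
    for path in candidates:
        try:
            return path, opener(path, os.O_RDONLY | os.O_NONBLOCK)
        except OSError as exc:
            last_error = exc
    if last_error is None:
        raise ValueError("no candidate device nodes given")
    raise last_error


def fake_opener(path, flags):
    # Mimics the observation above: st/nst nodes fail with EIO, sg works.
    if "/st" in path or "/nst" in path:
        raise OSError(errno.EIO, os.strerror(errno.EIO), path)
    return 3  # dummy descriptor, not a real file


print(open_first_usable(["/dev/nst15", "/dev/sg17"], opener=fake_opener))
```

On a real system the default opener (os.open) would be used, and the caller would be responsible for closing the returned descriptor.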

Thanks to all of you
Matthias