Hi Jake,
Thanks for this. I have been going through it and now have a pretty
good idea of what you are doing. I may be missing something in your
scripts, but I'm still not quite understanding how you are making
sure that locking is happening with the ESXi ATS SCSI command.
From this slide
http://xo4t.mjt.lu/link/xo4t/gzyhtx3/1/_9gJVMUrSdvzGXYaZfCkVA/aHR0cHM6Ly93aWtpLmNlcGguY29tL0BhcGkvZGVraS9maWxlcy8zOC9oYW1tZXItY2VwaC1kZXZlbC1zdW1taXQtc2NzaS10YXJnZXQtY2x1c3RlcmluZy5wZGY
(Page 8)
It seems to indicate that, for a true active/active setup, the two
targets need to be aware of each other and exchange locking
information for it to work reliably. I've also watched the video
from the Ceph developer summit where this is discussed, and it seems
that Ceph and the kernel need changes to allow this locking to be
pushed down to the RBD layer so it can be shared between targets.
From what I can see browsing the Linux git repo, these patches
haven't made it into the mainline kernel yet.
Can you shed any light on this? As tempting as active/active is, I'm
wary of using the configuration until I understand how the locking
works and whether edge cases involving multiple ESXi hosts writing
to the same LUN through different targets could spell disaster.
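For reference, the ESXi side at least lets you confirm whether ATS
(hardware-assisted locking) is enabled and reported as supported for
a given device; the commands below are a sketch, the naa ID is a
placeholder, and the exact output varies by ESXi version:

# Is hardware accelerated locking (ATS) enabled host-wide?
esxcli system settings advanced list -o /VMFS3/HardwareAcceleratedLocking

# VAAI/ATS status for a specific device (placeholder naa ID)
esxcli storage core device vaai status get -d naa.60000000000000000000000000000001

Of course, this only shows that ESXi is offloading the locking; it
says nothing about whether the two target gateways coordinate on the
back end, which is the question here.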
Many thanks,
Nick
*From:* Jake Young [mailto:jak3...@gmail.com]
*Sent:* 14 January 2015 16:54
*To:* Nick Fisk
*Cc:* Giuseppe Civitella; ceph-users
*Subject:* Re: [ceph-users] Ceph, LIO, VMWARE anyone?
Yes, it's active/active and I found that VMWare can switch from
path to path with no issues or service impact.
I posted some config files here: github.com/jak3kaj/misc
One set is from my LIO nodes, both the primary and secondary
configs, so you can see what I needed to make unique. The other set
(targets.conf) is from my tgt nodes. Both are 4-LUN configs.
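To give a rough idea of the shape of such a config, here is a
minimal sketch of a single-LUN targets.conf entry. It is not taken
from Jake's actual files: the IQN, image name and serial are
invented, and the directive names should be checked against the
tgt-admin documentation. The point is that everything except the
node's portal IP stays identical on both target nodes:

# Sketch of a tgt targets.conf entry (values invented for illustration)
<target iqn.2015-01.com.example:rbd-vmware>
    driver iscsi
    bs-type rbd
    <backing-store rbd/vmware-lun1>
        lun 1
        # scsi_id/scsi_sn must match on both target nodes so ESXi
        # sees a single device reachable over two paths
        scsi_id IET_00010001
        scsi_sn beaf11
    </backing-store>
</target>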
Like I said in my previous email, there is no performance
difference between LIO and tgt. The only service I'm running on
these nodes is a single iscsi target instance (either LIO or tgt).
Jake
On Wed, Jan 14, 2015 at 8:41 AM, Nick Fisk <n...@fisk.me.uk> wrote:
Hi Jake,
I can't remember the exact details, but it was something to do
with a potential problem when using the Pacemaker resource
agents. I think it was a potential hanging issue where, if one
LUN on a shared target failed, the agent would try to kill all
the other LUNs in order to fail the whole target over to another
host. That can leave the TCM part of LIO holding a lock on the
RBD, which then can't fail over either.
That said, I did try multiple LUNs on one target as a test and
didn't experience any problems.
I'm interested in the way you have your setup configured,
though. Are you saying you effectively have an active/active
configuration with a path going to each host, or are you
failing the iSCSI IP over between hosts? If it's the former,
have you had any problems with SCSI locking/reservations, etc.
between the two targets?
I can see the advantage of that configuration, as it would
reduce or eliminate a lot of the trouble I have had with
resources failing over.
Nick
*From:* Jake Young [mailto:jak3...@gmail.com]
*Sent:* 14 January 2015 12:50
*To:* Nick Fisk
*Cc:* Giuseppe Civitella; ceph-users
*Subject:* Re: [ceph-users] Ceph, LIO, VMWARE anyone?
Nick,
Where did you read that having more than 1 LUN per target
causes stability problems?
I am running 4 LUNs per target.
For HA I'm running two Linux iSCSI target servers that map
the same 4 rbd images. The two targets have the same serial
numbers, T10 address, etc. I copy the primary's config to
the backup and change the IPs. This way VMWare thinks they are
different target IPs on the same host. This has worked very
well for me.
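One way to sanity-check this from the ESXi side is to confirm
that both target IPs show up as paths under a single device, and
optionally put that device into round robin. These are standard
esxcli commands; the naa ID below is a placeholder:

# Both portals should appear as paths under the same naa device
esxcli storage nmp device list

# Optional: round robin path selection for that device (placeholder naa ID)
esxcli storage nmp device set -d naa.60000000000000000000000000000001 --psp VMW_PSP_RR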
One suggestion I have is to try using rbd-enabled tgt. The
performance is equivalent to LIO, but I found it is much
better at recovering from a cluster outage. I've had LIO lock
up the kernel or simply not recognize that the rbd images are
available, whereas tgt will eventually present the rbd images
again.
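If you want to confirm that your tgt build actually has the Ceph
backend compiled in before going down that road, the system-mode
show (as far as I recall the invocation) should list rbd among
the supported backing stores:

# "rbd" should appear under "Backing stores:" if tgt was built with Ceph support
tgtadm --lld iscsi --mode system --op show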
I have been slowly adding servers and am expanding my test
setup into a production setup (a nice thing about Ceph). I now
have 6 OSD hosts with 7 disks each. I'm using the LSI Nytro
cache RAID controller, so I don't have a separate journal, and
I have 40Gb networking. I plan to add another 6 OSD hosts in
another rack in the next 6 months (and then another 6 next
year). I'm doing 3x replication, so I want to end up with 3
racks.
Jake
On Wednesday, January 14, 2015, Nick Fisk <n...@fisk.me.uk> wrote:
Hi Giuseppe,
I am working on something very similar at the moment. I
currently have it running on some test hardware and it seems
to be working reasonably well.
I say reasonably because I have had a few instabilities, but
these have been on the HA side; the LIO and RBD side of things
has been rock solid so far. The main problems I have had
seem to be around recovering from failures, with resources
ending up in an unmanaged state. I'm not currently using
fencing, so this may be part of the cause.
As a brief description of my configuration:
4 hosts, each with 2 OSDs, also running the monitor role.
3 additional hosts in an HA cluster acting as iSCSI proxy
nodes.
I'm using the IP, RBD, iSCSITarget and iSCSILUN resource
agents to provide an HA iSCSI LUN which maps back to an RBD.
All the agents for each RBD are in a group so they follow
each other between hosts.
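In crm shell terms, each of those per-RBD groups looks roughly
like the sketch below. The names, IP, IQN and parameter names
are illustrative only and should be checked with "crm ra info"
for each agent; note also that the stock resource-agents name
for the LUN agent is iSCSILogicalUnit:

# Sketch only: one Pacemaker group per RBD/LUN (parameter names indicative)
primitive p_rbd_lun1 ocf:ceph:rbd \
        params pool=rbd name=lun1
primitive p_target_lun1 ocf:heartbeat:iSCSITarget \
        params implementation=lio iqn=iqn.2015-01.example:lun1
primitive p_lu_lun1 ocf:heartbeat:iSCSILogicalUnit \
        params implementation=lio target_iqn=iqn.2015-01.example:lun1 \
               lun=0 path=/dev/rbd/rbd/lun1
primitive p_ip_lun1 ocf:heartbeat:IPaddr2 \
        params ip=10.0.0.10 cidr_netmask=24
group g_lun1 p_rbd_lun1 p_target_lun1 p_lu_lun1 p_ip_lun1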
I’m using 1 LUN per target as I read somewhere there are
stability problems using more than 1 LUN per target.
Performance seems OK; I can get about 1.2k random IOPS out of
the iSCSI LUN. This seems to be about right for the Ceph
cluster size, so I don't think the LIO part is causing any
significant overhead.
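As a rough sanity check on that figure, and assuming the 8 OSDs
above are single spinning disks at roughly 100-150 random IOPS
each (an assumption, not something stated here):

  8 OSDs x ~150 IOPS per disk ≈ 1,200 random read IOPS aggregate

which lines up with the ~1.2k observed, so the iSCSI layer does
not look like the bottleneck. Random writes would come in lower
once replication is factored in.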
We should be getting our production hardware shortly, which
will have 40 OSDs with journals and an SSD caching tier, so
within the next month or so I will have a better idea of how
it runs in a production environment and what the performance
of the system is.
Hope that helps. If you have any questions, please let me
know.
Nick
*From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf Of* Giuseppe Civitella
*Sent:* 13 January 2015 11:23
*To:* ceph-users
*Subject:* [ceph-users] Ceph, LIO, VMWARE anyone?
Hi all,
I'm working on a lab setup where Ceph serves rbd images as
iSCSI datastores to VMWARE via a LIO box. Has anyone already
done something similar and would be willing to share some
knowledge? Any production deployments? What about LIO's HA
and LUN performance?
Thanks
Giuseppe
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com