Just to chime in: it will look fine, feel fine, but underneath it's
quite easy to get VMFS corruption. Happened in our tests.
Also if you're running LIO, from time to time expect a kernel panic
(haven't tried with the latest upstream, as I've been using
Ubuntu 14.04 on my "export" hosts for the test, so might have improved...).
As of now I would not recommend this setup without being aware of the
risks involved.
There have been a few upstream patches getting the LIO code into better
cluster-aware shape, but I have no idea whether they have been merged
yet. I know Red Hat has a guy on this.
On 01/21/2015 02:40 PM, Nick Fisk wrote:
Hi Jake,
Thanks for this. I have been going through it and have a pretty good
idea of what you are doing now. I may be missing something in your
scripts, but I’m still not quite understanding how you make sure
locking happens correctly with the ESXi ATS SCSI command.
From this slide
http://xo4t.mjt.lu/link/xo4t/gzyhtx3/1/_9gJVMUrSdvzGXYaZfCkVA/aHR0cHM6Ly93aWtpLmNlcGguY29tL0BhcGkvZGVraS9maWxlcy8zOC9oYW1tZXItY2VwaC1kZXZlbC1zdW1taXQtc2NzaS10YXJnZXQtY2x1c3RlcmluZy5wZGY
(Page 8)
It seems to indicate that for a true active/active setup the two
targets need to be aware of each other and exchange locking
information for it to work reliably. I’ve also watched the video from
the Ceph developer summit where this is discussed, and it seems that
Ceph and the kernel need changes to allow this locking to be pushed
back to the RBD layer so it can be shared. From what I can see browsing
through the Linux Git repo, these patches haven’t made the mainline
kernel yet.
Can you shed any light on this? As tempting as having active/active
is, I’m wary about using the configuration until I understand how the
locking is working and if fringe cases involving multiple ESXi hosts
writing to the same LUN on different targets could spell disaster.
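To make the concern concrete, here is a toy model of the race, not a description of how LIO or tgt actually implement anything. ATS is carried by the SCSI COMPARE AND WRITE command, which has to be atomic per LUN. In this sketch the `Target` class, the `backing` dict standing in for the shared RBD image, and the host names are all hypothetical. Each target only serializes its own commands, so two compare phases can both succeed before either write lands.

```python
class Target:
    """One iSCSI target; it knows nothing about commands on the other target."""

    def __init__(self, name, backing):
        self.name = name
        self.backing = backing  # shared dict standing in for the RBD image

    def compare(self, lba, expected):
        # First half of COMPARE AND WRITE: check the current on-disk value.
        return self.backing[lba] == expected

    def write(self, lba, value):
        # Second half: write the new value once the compare has succeeded.
        self.backing[lba] = value


# One shared image with a single "lock" block that is currently free.
backing = {0: "free"}
target_a = Target("target-a", backing)
target_b = Target("target-b", backing)

# Two ESXi hosts try to take the same lock through different targets.
# Without cross-target coordination the two halves can interleave like this:
ok_a = target_a.compare(0, "free")
ok_b = target_b.compare(0, "free")
if ok_a:
    target_a.write(0, "esxi-1")
if ok_b:
    target_b.write(0, "esxi-2")

# Both hosts were told the compare succeeded, but only one value is on disk.
print(ok_a, ok_b, backing[0])
```

With a single target, or with targets that share lock state, the second compare would run after the first write and fail, which is the behaviour ESXi relies on.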
Many thanks,
Nick
*From:*Jake Young [mailto:jak3...@gmail.com]
*Sent:* 14 January 2015 16:54
*To:* Nick Fisk
*Cc:* Giuseppe Civitella; ceph-users
*Subject:* Re: [ceph-users] Ceph, LIO, VMWARE anyone?
Yes, it's active/active and I found that VMWare can switch from path
to path with no issues or service impact.
I posted some config files here: github.com/jak3kaj/misc
One set is from my LIO nodes, both the primary and secondary configs,
so you can see what I needed to make unique. The other set
(targets.conf) is from my tgt nodes. They are both 4 LUN configs.
Like I said in my previous email, there is no performance difference
between LIO and tgt. The only service I'm running on these nodes is a
single iSCSI target instance (either LIO or tgt).
Jake
On Wed, Jan 14, 2015 at 8:41 AM, Nick Fisk <n...@fisk.me.uk> wrote:
Hi Jake,
I can’t remember the exact details, but it was something to do
with a potential problem when using the pacemaker resource agents.
I think it was to do with a potential hanging issue when one LUN
on a shared target failed and it then tried to kill all the other
LUNs to fail the target over to another host. This then leaves the
TCM part of LIO locking the RBD, which also can’t fail over.
That said, I did try multiple LUNs on one target as a test and
didn’t experience any problems.
I’m interested in the way you have your setup configured though.
Are you saying you effectively have an active/active configuration
with a path going to either host, or are you failing the iSCSI IP
between hosts? If it’s the former, have you had any problems with
SCSI locking/reservations, etc. between the two targets?
I can see the advantage to that configuration as you
reduce/eliminate a lot of the troubles I have had with resources
failing over.
Nick
*From:*Jake Young [mailto:jak3...@gmail.com]
*Sent:* 14 January 2015 12:50
*To:* Nick Fisk
*Cc:* Giuseppe Civitella; ceph-users
*Subject:* Re: [ceph-users] Ceph, LIO, VMWARE anyone?
Nick,
Where did you read that having more than 1 LUN per target causes
stability problems?
I am running 4 LUNs per target.
For HA I'm running two Linux iSCSI target servers that map the
same 4 rbd images. The two targets have the same serial numbers,
T10 address, etc. I copy the primary's config to the backup and
change IPs. This way VMWare thinks they are different target IPs
on the same host. This has worked very well for me.
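A minimal sketch of the "copy the primary's config and change IPs" step described above. The dict layout, the IQN, the image names, the serials and the addresses are all made up for illustration (the real configs are in the repository linked elsewhere in this thread), and keeping the IQN identical on both nodes is an assumption of this sketch. The point is only that everything identifying the LUNs stays the same and the portal address is the one thing that differs.

```python
import copy


def make_secondary(primary, ip_map):
    """Return a copy of the primary target config with only the portal IPs swapped."""
    secondary = copy.deepcopy(primary)
    secondary["portals"] = [ip_map[ip] for ip in primary["portals"]]
    return secondary


# Hypothetical primary config: one target exporting two rbd-backed LUNs.
primary = {
    "iqn": "iqn.2015-01.com.example:rbd-target",  # placeholder IQN
    "portals": ["192.0.2.10"],                     # documentation-range address
    "luns": [
        {"rbd_image": "rbd/vmware-lun0", "serial": "serial-0000"},
        {"rbd_image": "rbd/vmware-lun1", "serial": "serial-0001"},
    ],
}

secondary = make_secondary(primary, {"192.0.2.10": "192.0.2.20"})

# LUN identity (image and serial) is unchanged; only the portal IP differs.
print(secondary["portals"], secondary["luns"] == primary["luns"])
```

Because the LUN identity is the same on both nodes, the initiator should treat the two portals as two paths to the same devices rather than as separate datastores.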
One suggestion I have is to try using rbd-enabled tgt. The
performance is equivalent to LIO, but I found it is much better at
recovering from a cluster outage. I've had LIO lock up the kernel
or simply not recognize that the rbd images are available, whereas
tgt will eventually present the rbd images again.
I have been slowly adding servers and am expanding my test setup
into a production setup (a nice thing about Ceph). I now have 6 OSD
hosts with 7 disks each. I'm using the LSI Nytro cache RAID
controller, so I don't have a separate journal, and I have 40Gb
networking. I plan to add another 6 OSD hosts in another rack in
the next 6 months (and then another 6 next year). I'm doing 3x
replication, so I want to end up with 3 racks.
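As a back-of-the-envelope check on those numbers, the sketch below counts OSDs and the share of raw capacity left after replication. It assumes one OSD per disk and that the future hosts also carry 7 disks each; neither is stated above, and disk sizes are unknown, so the result is expressed in "disks' worth" of usable space rather than terabytes.

```python
def cluster_summary(hosts, disks_per_host, replicas):
    """Count OSDs and the usable share of raw capacity under N-way replication."""
    osds = hosts * disks_per_host           # assumes one OSD per disk
    return {
        "osds": osds,
        "usable_disks": osds / replicas,    # raw disks' worth of usable space
    }


# Today: 6 hosts x 7 disks, 3x replication.
current = cluster_summary(hosts=6, disks_per_host=7, replicas=3)

# Planned: three racks of 6 hosts each, assuming the same 7 disks per host.
planned = cluster_summary(hosts=18, disks_per_host=7, replicas=3)

print(current, planned)
```

With 3x replication and 3 racks, each rack can hold one copy of every object, provided the CRUSH rule uses the rack as its failure domain.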
Jake
On Wednesday, January 14, 2015, Nick Fisk <n...@fisk.me.uk> wrote:
Hi Giuseppe,
I am working on something very similar at the moment. I
currently have it running on some test hardware, and it seems to
be working reasonably well.
I say reasonably as I have had a few instabilities, but these
are on the HA side; the LIO and RBD side of things has been
rock solid so far. The main problems I have had seem to be
around recovering from failure, with resources ending up in an
unmanaged state. I’m not currently using fencing, so this may
be part of the cause.
A brief description of my configuration:
4 hosts, each having 2 OSDs and also running the monitor role
3 additional hosts in an HA cluster which act as iSCSI proxy nodes.
I’m using the IP, RBD, iSCSITarget and iSCSILUN resource
agents to provide an HA iSCSI LUN which maps back to an RBD. All
the agents for each RBD are in a group so they follow each
other between hosts.
I’m using 1 LUN per target as I read somewhere there are
stability problems using more than 1 LUN per target.
Performance seems OK; I can get about 1.2k random IOPS out of the
iSCSI LUN. This seems to be about right for the Ceph cluster
size, so I don’t think the LIO part is causing any significant
overhead.
We should be getting our production hardware shortly, which will
have 40 OSDs with journals and an SSD caching tier, so within
the next month or so I will have a better idea of running it
in a production environment and of the performance of the system.
Hope that helps, if you have any questions, please let me know.
Nick
*From:*ceph-users [mailto:ceph-users-boun...@lists.ceph.com]
*On Behalf Of *Giuseppe Civitella
*Sent:* 13 January 2015 11:23
*To:* ceph-users
*Subject:* [ceph-users] Ceph, LIO, VMWARE anyone?
Hi all,
I'm working on a lab setup with Ceph serving rbd images
as iSCSI datastores to VMware via a LIO box. Has anyone
already done something similar and is willing to share some
knowledge? Any production deployments? What about LIO's HA and
LUN performance?
Thanks
Giuseppe
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com