Correct me if I'm wrong, but tgt doesn't have full SCSI-3 persistent
reservation support when _not_ using the LIO backend for it, right?

AFAIK you can either run tgt with its own iSCSI implementation or you can use tgt to manage your LIO targets.

I assume that when you're running tgt with the rbd backend code you're skipping all the in-kernel LIO parts, in which case the Red Hat patches won't help a bit. You also won't have proper active-active support, since the initiators have no way to synchronize state (and, more importantly, no way to synchronize write caching! [I can think
of some really ugly hacks to get around that, though...]).
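
For reference, a userspace rbd-backed LUN under tgt is created roughly like this; the target name and the pool/image name below are invented for illustration, and the exact flags should be checked against your tgt build:

    # Create an iSCSI target and attach an RBD image as LUN 1 entirely in
    # userspace (librbd): no krbd mapping and no in-kernel LIO involved.
    tgtadm --lld iscsi --mode target --op new --tid 1 \
           --targetname iqn.2015-01.com.example:rbd-lun0
    tgtadm --lld iscsi --mode logicalunit --op new --tid 1 --lun 1 \
           --bstype rbd --backing-store vmware/lun0
    tgtadm --lld iscsi --mode target --op bind --tid 1 --initiator-address ALL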

On 01/23/2015 05:46 PM, Jake Young wrote:
Thanks for the feedback Nick and Zoltan,

I have been seeing periodic kernel panics when I used LIO. It was either due to LIO or the kernel rbd mapping. I have seen this on Ubuntu precise with kernel 3.14.14 and again on Ubuntu trusty with the utopic kernel (currently 3.16.0-28). Ironically, this is the primary reason I started exploring a redundancy solution for my iSCSI proxy node. So, yes, these crashes have nothing to do with running the Active/Active setup.

I am moving my entire setup from LIO to rbd-enabled tgt, which I've found to be much more stable while giving equivalent performance.

I've been testing active/active LIO with VMware since July of 2014 and I've never seen any VMFS corruption. I am now convinced (thanks Nick) that it is possible. The reason I have not seen any corruption may have to do with how VMware happens to be configured.

Originally, I had made a point of using round robin path selection on the VMware hosts; but as I did performance testing, I found that it didn't actually help. When the host switches iSCSI targets there is a short "spin up time" before LIO gets to 100% IO capability, and since round robin switches targets every 30 seconds (60 seconds? I forget), this seemed to be significant.

A secondary goal for me was to end up with a config that required minimal tuning of VMware and the target software, so the obvious choice was to leave VMware's path selection at the default, which is Fixed and picks the first target in ASCII-betical order. That means I am actually functioning in Active/Passive mode.
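
For what it's worth, the per-device path selection policy can be checked and pinned from the ESXi CLI; something like the following should work, with the naa device ID below being a placeholder for the real LUN ID:

    # Show the paths and the current path selection policy for the LUN
    esxcli storage nmp device list -d naa.60000000000000000000000000000001

    # Pin the policy explicitly (VMW_PSP_FIXED here, or VMW_PSP_RR for
    # round robin) rather than relying on the claim-rule default
    esxcli storage nmp device set -d naa.60000000000000000000000000000001 \
        --psp VMW_PSP_FIXED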

Jake




On Fri, Jan 23, 2015 at 8:46 AM, Zoltan Arnold Nagy <zol...@linux.vnet.ibm.com <mailto:zol...@linux.vnet.ibm.com>> wrote:

    Just to chime in: it will look fine, feel fine, but underneath
    it's quite easy to get VMFS corruption. Happened in our tests.
    Also, if you're running LIO, expect a kernel panic from time to
    time (I haven't tried with the latest upstream, as I've been using
    Ubuntu 14.04 on my "export" hosts for the test, so it might have
    improved...).

    As of now I would not recommend this setup without being aware of
    the risks involved.

    There have been a few upstream patches getting the LIO code into
    better cluster-aware shape, but I have no idea if they have been
    merged yet. I know Red Hat has a guy on this.

    On 01/21/2015 02:40 PM, Nick Fisk wrote:

    Hi Jake,

    Thanks for this. I have been going through it and have a pretty
    good idea of what you are doing now. However, I may be missing
    something looking through your scripts: I’m still not quite
    understanding how you are managing to make sure locking happens
    with the ESXi ATS SCSI command.

    From this slide (page 8):

    http://xo4t.mjt.lu/link/xo4t/gzyhtx3/1/_9gJVMUrSdvzGXYaZfCkVA/aHR0cHM6Ly93aWtpLmNlcGguY29tL0BhcGkvZGVraS9maWxlcy8zOC9oYW1tZXItY2VwaC1kZXZlbC1zdW1taXQtc2NzaS10YXJnZXQtY2x1c3RlcmluZy5wZGY

    It seems to indicate that for a true active/active setup the two
    targets need to be aware of each other and exchange locking
    information for it to work reliably. I’ve also watched the video
    from the Ceph developer summit where this is discussed, and it
    seems that Ceph and the kernel need changes to allow this locking
    to be pushed back to the RBD layer so it can be shared. From what
    I can see browsing through the Linux git repo, these patches
    haven’t made the mainline kernel yet.

    Can you shed any light on this? As tempting as having
    active/active is, I’m wary about using the configuration until I
    understand how the locking is working and if fringe cases
    involving multiple ESXi hosts writing to the same LUN on
    different targets could spell disaster.

    Many thanks,

    Nick

    *From:* Jake Young [mailto:jak3...@gmail.com]
    *Sent:* 14 January 2015 16:54


    *To:* Nick Fisk
    *Cc:* Giuseppe Civitella; ceph-users
    *Subject:* Re: [ceph-users] Ceph, LIO, VMWARE anyone?

    Yes, it's active/active and I found that VMWare can switch from
    path to path with no issues or service impact.

    I posted some config files here: github.com/jak3kaj/misc
    <http://xo4t.mjt.lu/link/xo4t/gzyhtx3/2/_P2HWj3RxQZC1v5DQ_206Q/aHR0cDovL2dpdGh1Yi5jb20vamFrM2thai9taXNj>

    One set is from my LIO nodes, both the primary and secondary
    configs, so you can see what I needed to make unique.  The other
    set (targets.conf) is from my tgt nodes.  Both are 4-LUN configs.

    Like I said in my previous email, there is no performance
    difference between LIO and tgt.  The only service I'm running on
    these nodes is a single iscsi target instance (either LIO or tgt).

    Jake

    On Wed, Jan 14, 2015 at 8:41 AM, Nick Fisk <n...@fisk.me.uk
    <mailto:n...@fisk.me.uk>> wrote:

        Hi Jake,

        I can’t remember the exact details, but it was something to
        do with a potential problem when using the pacemaker resource
        agents. I think it was a potential hanging issue when one LUN
        on a shared target failed and it then tried to kill all the
        other LUNs to fail the target over to another host. This then
        leaves the TCM part of LIO locking the RBD, which also can’t
        fail over.

        That said, I did try multiple LUNs on one target as a test
        and didn’t experience any problems.

        I’m interested in the way you have your setup configured,
        though. Are you saying you effectively have an active/active
        configuration with a path going to either host, or are you
        failing the iSCSI IP over between hosts? If it’s the former,
        have you had any problems with SCSI locking/reservations etc.
        between the two targets?

        I can see the advantage to that configuration as you
        reduce/eliminate a lot of the troubles I have had with
        resources failing over.

        Nick

        *From:* Jake Young [mailto:jak3...@gmail.com
        <mailto:jak3...@gmail.com>]
        *Sent:* 14 January 2015 12:50
        *To:* Nick Fisk
        *Cc:* Giuseppe Civitella; ceph-users
        *Subject:* Re: [ceph-users] Ceph, LIO, VMWARE anyone?

        Nick,

        Where did you read that having more than 1 LUN per target
        causes stability problems?

        I am running 4 LUNs per target.

        For HA I'm running two Linux iSCSI target servers that map
        the same 4 rbd images. The two targets have the same serial
        numbers, T10 address, etc.  I copy the primary's config to
        the backup and change the IPs. This way VMware thinks they
        are different target IPs on the same host. This has worked
        very well for me.
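
        As a rough illustration of that kind of targets.conf (not the
        actual files in the repo linked later in this thread; the IQN
        and the pool/image name are invented, and the option names
        should be checked against the example targets.conf shipped
        with tgt):

            <target iqn.2014-04.com.example:vmware-rbd>
                driver iscsi
                <backing-store vmware/lun0>
                    # pool/image, served through librbd in userspace
                    bs-type rbd
                    lun 1
                    # Keep the ID and serial identical on the primary
                    # and the backup node so ESXi treats both portals
                    # as extra paths to the same LUN rather than as a
                    # new device.
                    scsi_id   rbd-lun0
                    scsi_sn   0001234567
                </backing-store>
            </target>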

        One suggestion I have is to try using rbd-enabled tgt. The
        performance is equivalent to LIO, but I found it is much
        better at recovering from a cluster outage. I've had LIO lock
        up the kernel or simply not recognize that the rbd images are
        available, whereas tgt will eventually present the rbd images
        again.

        I have been slowly adding servers and am expanding my test
        setup into a production setup (a nice thing about Ceph). I
        now have 6 OSD hosts with 7 disks each. I'm using the LSI
        Nytro cache RAID controller, so I don't have a separate
        journal, and I have 40Gb networking. I plan to add another 6
        OSD hosts in another rack in the next 6 months (and then
        another 6 next year). I'm doing 3x replication, so I want to
        end up with 3 racks.

        Jake

        On Wednesday, January 14, 2015, Nick Fisk <n...@fisk.me.uk
        <mailto:n...@fisk.me.uk>> wrote:

            Hi Giuseppe,

            I am working on something very similar at the moment. I
            currently have it running on some test hardware and it
            seems to be working reasonably well.

            I say reasonably because I have had a few instabilities,
            but these are on the HA side; the LIO and RBD side of
            things has been rock solid so far. The main problems I
            have had seem to be around recovering from failure, with
            resources ending up in an unmanaged state. I’m not
            currently using fencing, so this may be part of the cause.

            A brief description of my configuration:

            4 hosts, each having 2 OSDs, also running the monitor role.

            3 additional hosts in an HA cluster which act as iSCSI
            proxy nodes.

            I’m using the IP, RBD, iSCSITarget and iSCSILUN resource
            agents to provide an HA iSCSI LUN which maps back to an
            RBD. All the agents for each RBD are in a group so they
            follow each other between hosts.
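
            A rough sketch of that kind of group for a single LUN,
            using the crm shell: this is not the actual
            configuration, and the agent and parameter names
            (particularly for the ocf:ceph:rbd agent) are written
            from memory, so they should be verified against each
            agent's metadata before use.

                # One LUN = one group: virtual IP, mapped RBD, iSCSI
                # target and logical unit fail over together.
                primitive p_vip_lun0 ocf:heartbeat:IPaddr2 \
                    params ip=192.168.10.50 cidr_netmask=24
                primitive p_rbd_lun0 ocf:ceph:rbd \
                    params name=lun0 pool=vmware \
                           cephconf=/etc/ceph/ceph.conf
                primitive p_tgt_lun0 ocf:heartbeat:iSCSITarget \
                    params implementation=lio \
                           iqn=iqn.2015-01.com.example:lun0
                primitive p_lu_lun0 ocf:heartbeat:iSCSILogicalUnit \
                    params implementation=lio \
                           target_iqn=iqn.2015-01.com.example:lun0 \
                           lun=0 path=/dev/rbd/vmware/lun0
                group g_lun0 p_vip_lun0 p_rbd_lun0 p_tgt_lun0 p_lu_lun0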

            I’m using 1 LUN per target as I read somewhere there are
            stability problems using more than 1 LUN per target.

            Performance seems OK; I can get about 1.2k random IOs out
            of the iSCSI LUN. This seems to be about right for the
            Ceph cluster size, so I don’t think the LIO part is
            causing any significant overhead.
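
            For comparison, a random-IO figure like that can be
            measured from an initiator with something along these
            lines (the device path is a placeholder for the iSCSI LUN
            as seen on the initiator; direct=1 bypasses the page
            cache so the target path is actually exercised):

                fio --name=randwrite --filename=/dev/sdX --direct=1 \
                    --rw=randwrite --bs=4k --iodepth=32 --numjobs=4 \
                    --runtime=60 --time_based --group_reporting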

            We should be getting our production hardware shortly,
            which will have 40 OSDs with journals and an SSD caching
            tier, so within the next month or so I will have a better
            idea of how it runs in a production environment and of
            the performance of the system.

            Hope that helps, if you have any questions, please let me
            know.

            Nick

            *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com]
            *On Behalf Of* Giuseppe Civitella
            *Sent:* 13 January 2015 11:23
            *To:* ceph-users
            *Subject:* [ceph-users] Ceph, LIO, VMWARE anyone?

            Hi all,

            I'm working on a lab setup regarding Ceph serving rbd
            images as iSCSI datastores to VMWARE via a LIO box. Has
            anyone already done something similar and is willing to
            share some knowledge? Any production deployments? What
            about LIO's HA and LUN performance?

            Thanks

            Giuseppe









_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
