On Sun, Nov 4, 2018 at 12:49 PM The Lee-Man <[email protected]> wrote:
>
> On Thursday, November 1, 2018 at 11:54:31 AM UTC-7, [email protected] wrote:
>>
>> Hi,
>>
>> We use open-iscsi in our "ESOS" project. We have a custom rc/init script
>> (not RHEL / Debian based), which starts the iSCSI initiator stack at boot.
>> The current start() function in that script looks like this:
>> --snip--
>> start() {
>>     if [ ! -f "${INAME_FILE}" ]; then
>>         /bin/echo "Generating unique iSCSI initiator name..."
>>         /bin/echo "InitiatorName=`${ISCSI_INAME}`" > ${INAME_FILE} || exit 1
>>     fi
>>     /bin/echo "Starting iSCSI initiator service..."
>>     eval ${ISCSID} ${USER_OPTS} || exit 1
>>     /bin/touch ${ISCSID_LOCK}
>>     /bin/echo "Setting up iSCSI targets..."
>>     ${ISCSIADM} -m node --loginall=automatic || exit 1
>> }
>> --snip--
>>
>> This all works just fine, but there is one scenario that occurs, and I'm
>> looking for advice / best practices on how to overcome it:
>> - Let's say the remote iSCSI target server is down/unavailable
>> - An initiator (client) machine boots up, and the init script runs start()
>>   and the very last line of the function above (--loginall=automatic)
>>   attempts to log in to all node/session records
>> - That command eventually times out, of course, if the iSCSI target that
>>   it's attempting to log in to isn't up
>> - The storage target comes up sometime later, after the "iscsiadm
>>   --loginall=automatic ..." has already failed/returned
>> - The client/initiator machine then never logs in to the target without
>>   intervention (e.g., running login by hand)
>>
>> I'd like to address this so there is no manual intervention needed, or at
>> least not after some "reasonable" amount of time (e.g., 12 hours, or
>> whatever). My first thought is to tweak the initial login timeout values in
>> iscsid.conf, and then background the "iscsiadm --loginall=automatic ..."
>> that runs in the start() function of our rc/init script. But I'm wondering
>> if there is a better, perhaps cleaner, way of achieving this behavior. I see
>> some options related to "discoveryd" and retry attempts, but I'm not
>> convinced that applies to my situation after reading some documentation /
>> examples of that. The portal / target are all static on the initiator side,
>> so I don't need to discover new targets each time, just log in to existing
>> target/portal/node records.
>>
>> Any help or guidance would be greatly appreciated. Thanks for your time.
>>
>> --Marc
>>
>
> Hi Marc:
>
> Good question. Others have asked this, and there is no good answer. There is
> nothing I know of (built into open-iscsi) that will do as you wish, i.e. "try
> to log in to a node now, and if the node is not currently present, keep
> trying again (forever, or for some amount of time) until it works".
>
> But ... if you successfully have a session, and then the iscsid daemon goes
> away and comes back and finds evidence of these now-stale sessions, it will
> in fact retry forever to reconnect to them. Which is kind of what you want,
> if I understand correctly.
>
> There _are_ some approaches to working around this, I believe. We could do
> one of:
>
> - Use iSNS, if your target supports it. The iSNS daemon is like a name
>   service for targets and initiators, and can send asynchronous messages to
>   a client when a target goes down or comes up. This is probably the best
>   long-term solution, but needs some testing as proof-of-concept, as iSNS
>   isn't used much right now.
> - Or we could use the discovery daemon, which isn't really a daemon but a
>   fork of the iscsid daemon that re-runs discovery for you, periodically. I
>   _believe_ it can be set up to log into nodes it finds, though, and it also
>   needs some more testing.
> - Or you can create some custom, out-of-band test for (1) if the target is
>   up, try to log in, else (2) sleep a while and try again, until (3) we
>   connect or give up after a certain time or number of tries.
> - Or we can enhance iscsid to have this capability, since it's similar to
>   what it does now when recovering a "stale" session upon startup.
>
> I really like the iSNS solution, but I fear this won't work in many end-user
> environments, since many customers will say "I have to run what daemon? I
> never heard of that and don't like it". It is at the top of my todo list to
> investigate this approach, since I believe it's the best solution, going
> (way) forward.
>
> As far as the discovery daemon, it's been a few years since I looked at it,
> but if memory serves me correctly, it is flawed. I plan to investigate this,
> with the iSNS solution, for comparison.
>
> I believe the last solution is probably the most logical, but would require a
> bit of time to develop (i.e., more than I usually have).
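The custom, out-of-band option mentioned above (try to log in, sleep a while, try again, give up after some number of tries) could be sketched roughly as follows. This is only an illustration, not something open-iscsi ships: the helper name retry_until_ok and its arguments are hypothetical, and it assumes the wrapped command exits non-zero whenever a login attempt fails.

```shell
#!/bin/sh
# Hypothetical helper (not part of open-iscsi): run a command repeatedly
# until it succeeds or the allowed number of tries is used up.
# Usage: retry_until_ok MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
retry_until_ok() {
    max_tries=$1
    delay=$2
    shift 2
    try=1
    while :; do
        # Success: stop retrying and report it to the caller.
        if "$@"; then
            return 0
        fi
        # Out of tries: give up with a failure status.
        if [ "${try}" -ge "${max_tries}" ]; then
            return 1
        fi
        try=$((try + 1))
        sleep "${delay}"
    done
}
```

In an rc/init script like the one quoted above, the last line of start() might then become something like `retry_until_ok 144 300 ${ISCSIADM} -m node --loginall=automatic &`, which would keep trying in the background for roughly 12 hours of sleep time (plus however long each login attempt itself takes) without blocking boot. The numbers 144 and 300 are arbitrary examples.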
Thank you for the detailed response; it's much appreciated. I think for now
I'm going to go with my lame idea of just making the '--loginall' iscsiadm
call hang out in the background of the rc/init script. I can control how long
it waits before timing out / retrying in the background by updating the
attributes of the node with iscsiadm. This will be the quickest solution for
me to continue testing my POC. If everything goes well, I'll take a look at
your suggested implementations and see if I can make some time to contribute.

--Marc

-- 
You received this message because you are subscribed to the Google Groups
"open-iscsi" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/open-iscsi.
For more options, visit https://groups.google.com/d/optout.
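For readers following the thread, the "update the node attributes, then background the login" idea from Marc's reply might look something like the sketch below. The function name tune_login_timeouts is hypothetical. The two setting names are the ones that appear in the stock iscsid.conf, where the initial login window is described as roughly login_timeout multiplied by initial_login_retry_max; treat that, and the example values, as assumptions to verify against your open-iscsi version.

```shell
#!/bin/sh
# Hypothetical helper: lengthen the initial login window on all existing
# node records before the (backgrounded) login is started.
# Usage: tune_login_timeouts ISCSIADM_PATH LOGIN_TIMEOUT_SECS RETRY_MAX
tune_login_timeouts() {
    iscsiadm_bin=$1
    login_timeout=$2
    retry_max=$3
    # Seconds to wait for a single login attempt to complete.
    "${iscsiadm_bin}" -m node -o update \
        -n 'node.conn[0].timeo.login_timeout' -v "${login_timeout}" &&
    # Number of times the initial login is retried before giving up.
    "${iscsiadm_bin}" -m node -o update \
        -n node.session.initial_login_retry_max -v "${retry_max}"
}
```

With that in place, the end of start() could be something like `tune_login_timeouts "${ISCSIADM}" 30 120 && ${ISCSIADM} -m node --loginall=automatic &`, so that the login keeps retrying in the background (on the order of an hour with these example values) while the rest of boot continues.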
