On Thursday, November 1, 2018 at 11:54:31 AM UTC-7, [email protected] wrote:
>
> Hi,
>
> We use open-iscsi in our "ESOS" project. We have a custom rc/init script
> (not RHEL / Debian based), which starts the iSCSI initiator stack at boot.
> The current start() function in that script looks like this:
> --snip--
> start() {
>     if [ ! -f "${INAME_FILE}" ]; then
>         /bin/echo "Generating unique iSCSI initiator name..."
>         /bin/echo "InitiatorName=`${ISCSI_INAME}`" > ${INAME_FILE} || exit 1
>     fi
>     /bin/echo "Starting iSCSI initiator service..."
>     eval ${ISCSID} ${USER_OPTS} || exit 1
>     /bin/touch ${ISCSID_LOCK}
>     /bin/echo "Setting up iSCSI targets..."
>     ${ISCSIADM} -m node --loginall=automatic || exit 1
> }
> --snip--
>
> This all works just fine, but there is one behavior / scenario that
> occurs, and I'm looking for advice / best practices on how to overcome this:
> - Let's say the remote iSCSI target server is down/unavailable
> - An initiator (client) machine boots up, and the init script runs start()
>   and the very last line of the function above (--loginall=automatic)
>   attempts to login to all node/session records
> - That command eventually times out of course if the iSCSI target that
>   it's attempting to login to isn't up
> - The storage target comes up sometime later, after the "iscsiadm
>   --loginall=automatic ..." has already failed/returned
> - The client/initiator machine then never logs in to the target without
>   intervention (eg, running login by hand)
>
> I'd like to address this so there is no manual intervention needed, or at
> least not after some "reasonable" amount of time (eg, 12 hours, or
> whatever). My first thought is to tweak the initial login timeout values in
> iscsid.conf, and then background the "iscsiadm --loginall=automatic ..."
> that runs in the start() function of our rc/init script. But I'm wondering
> if there is a better, perhaps cleaner way, of achieving this behavior. I
> see some options related to "discoveryd" and retry attempts, but I'm not
> convinced that applies to my situation after reading some documentation /
> examples of that. The portal / target are all static on the initiator side,
> so I don't need to discover new targets each time, just login to existing
> target/portal/node records.
>
> Any help or guidance would be greatly appreciated. Thanks for your time.
>
> --Marc
>

Hi Marc:

Good question. Others have asked this, and there is no good answer.

There is nothing I know of (built into open-iscsi) that will do as you wish, i.e. "try to login to a node now, and if the node is not currently present, keep trying again (forever, or for some amount of time) until it works". But ... if you successfully have a session, and then the iscsid daemon goes away and comes back and finds evidence of these now-stale sessions, it will in fact retry forever to reconnect to them. Which is kind of what you want, if I understand correctly.

There _are_ some approaches to working around this, I believe. We could do one of:

- Use iSNS, if your target supports it. The iSNS daemon is like a name service for targets and initiators, and can send asynchronous messages to a client when a target goes down or comes up. This is probably the best long-term solution, but needs some testing as proof-of-concept, as iSNS isn't used much right now.
- Or we could use the discovery daemon, which isn't really a daemon but a fork of the iscsid daemon that re-runs discovery for you, periodically. I _believe_ it can be set up to log into the nodes it finds, though it also needs some more testing.
- Or you can create some custom, out-of-band test: (1) if the target is up, try to login, else (2) sleep a while and try again, until (3) we connect or give up after a certain time or number of tries.
- Or we can enhance iscsid to have this capability, since it's similar to what it does now when recovering a "stale" session upon startup.

I really like the iSNS solution, but I fear this won't work in many end-user environments, since many customers will say "I have to run what daemon? I never heard of that and don't like it". It is at the top of my todo list to investigate this approach, since I believe it's the best solution, going (way) forward.

As far as the discovery daemon, it's been a few years since I looked at it, but if memory serves me correctly, it is flawed. I plan to investigate this, along with the iSNS solution, for comparison.

I believe the last solution (enhancing iscsid) is probably the most logical, but would require a bit of time to develop (i.e. more than I usually have).
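
In the meantime, the out-of-band retry approach (the third option in the list) could be sketched roughly as follows, in plain sh to match your init script. This is only a sketch under some assumptions: retry_login, RETRY_MAX and RETRY_DELAY are names I made up for illustration and are not part of open-iscsi, and it assumes a non-zero exit status from the login command means "try again later". The defaults (144 tries, 300 seconds apart) are just one way to land near your 12-hour example.

```shell
#!/bin/sh
# Hypothetical helper, not part of open-iscsi: run a command repeatedly
# until it succeeds or the attempt budget is used up.
RETRY_MAX="${RETRY_MAX:-144}"      # assumed default: number of attempts
RETRY_DELAY="${RETRY_DELAY:-300}"  # assumed default: seconds between attempts

retry_login() {
    attempt=1
    while [ "${attempt}" -le "${RETRY_MAX}" ]; do
        # "$@" is whatever login command the caller passed in.
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        # Only sleep if another attempt is still coming.
        if [ "${attempt}" -le "${RETRY_MAX}" ]; then
            sleep "${RETRY_DELAY}"
        fi
    done
    # Every attempt failed; let the caller decide what to do.
    return 1
}
```

In your start() function, the last line could then become something like `retry_login ${ISCSIADM} -m node --loginall=automatic &` so that boot is not held up while the target is down. Backgrounding means the `|| exit 1` check no longer applies to the login step, so you would probably want to log a failure from the background job instead.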
