Thanks for the detailed information. I will see if I can use a shell wrapper program as you mentioned.

I have used LSF a lot, for about five years, and I still use it: bsub, bjobs, bkill, lim, sbatchd, mbatchd, etc. It is easy to understand and use. As for lsmake, I do not want to use IBM's proprietary stuff. Thanks for your suggestions.

Nikhil

On Mon, Sep 2, 2019 at 11:10 PM David Boyce <david.s.bo...@gmail.com> wrote:

> I'm not going to address the remote execution topic since it sounds like
> you already have the solution and are not looking for help. However, I do
> have fairly extensive experience with the NFS/retry area so will try to
> contribute there.
>
> First, I don't think what Paul says:
>
> > As for your NFS issue, another option would be to enable the .ONESHELL
> > feature available in newer versions of GNU make: that will ensure that
> > all lines in a recipe are invoked in a single shell, which means that
> > they should all be invoked on the same remote host.
>
> is sufficient. Consider the typical case of compiling foo.c to foo.o and
> linking it into foo.exe. Typically, and correctly, those actions would be
> in two separate recipes, which in a distributed-build scenario could run
> on different hosts, so the linker may not find the .o file from a previous
> recipe. Here .ONESHELL cannot help since they're different recipes.
>
> In my day job we use a product from IBM called LSF (Load Sharing Facility,
> https://www.ibm.com/support/knowledgecenter/en/SSETD4_9.1.3/lsf_welcome.html)
> which exists to distribute jobs over a server farm (typically using NFS)
> according to various metrics like load and free memory and so on. Part of
> the LSF package is a program called lsmake
> (https://www.ibm.com/support/knowledgecenter/en/SSETD4_9.1.3/lsf_command_ref/lsmake.1.html)
> which under the covers is a version of GNU make with enhancements to
> enable remote/distributed recipes (and which also adds the
> retry-with-delay feature Nikhil requested). Since GNU make is GPL, IBM is
> required to make its package of enhancements available under GPL as well.
> Much of it is not of direct interest to the open source community because
> it's all about communicating with IBM's proprietary daemons, but their
> retry logic could probably be taken directly from the patch. At the very
> least, if retries were to be added to GNU make per se, it would be nice if
> the flags were compatible with lsmake.
>
> However, my personal belief is that retries are God's way of telling us to
> think harder and better. Retrying (and worse, delay-and-retry) is a form
> of defeatism which I call "sleep and hope". Computers are deterministic;
> there's always a root cause which can usually be found and addressed with
> sufficient analysis. Granted, there are cases where you understand the
> problem but can't address it for administrative/permissions/business
> reasons, but that can't be known until the problem is understood.
>
> NFS caching is the root cause of unreliable distributed builds, as you've
> already described, but most or all of these issues can be addressed with a
> less blunt instrument than sleep-and-retry. Even LSF engineers threw up
> their hands and did retries, but what we did here was take their patch,
> which at last check was still targeted to 3.81, and while porting it to
> 4.1 added some of the cache-flushing strategies detailed below. This has
> solved most if not all of our NFS sync problems. Caveat: most of our
> people still use the LSF retry logic in addition, because they're not as
> absolutist as I am and just want to get their jobs done (go figure), which
> makes it harder to determine what percentage of problems are solved by
> cache flushing vs retries, but I'm pretty sure flushing has resolved the
> great majority of problems.
>
> One problem with sleep-and-hope is that there's no amount of time
> guaranteed to be enough, so you're just driving the incidence rate down,
> not fixing it.
> Since we were already working with a hacked version of GNU make we found
> it most convenient to implement flushing directly in the make program, but
> it can also be done within recipes. In fact we have three different
> implementations of the same NFS cache flushing logic:
>
> 1. Directly within our enhanced version of lsmake.
> 2. In a standalone binary called "nfsflush".
> 3. In a Python script called nfsflush.py.
>
> The Python script is a lab for trying out new strategies but it's too slow
> for production use. The binary is a faster version of the same techniques
> for direct use in recipes, and that same C code is linked directly into
> lsmake as well. Here's the usage message of our Python script:
>
>     $ nfsflush.py --help
>     usage: nfsflush.py [-h] [-f] [-l] [-r] [-t] [-u] [-V] path [path ...]
>
>     positional arguments:
>       path             directory paths to flush
>
>     optional arguments:
>       -h, --help       show this help message and exit
>       -f, --fsync      Use fsync() on named files
>       -l, --lock       Lock and unlock the named files
>       -r, --recursive  flush recursively
>       -t, --touch      additional flush action - touch and remove a temp file
>       -u, --upflush    flush parent directories back to the root
>       -V, --verbose    increment verbosity level
>
>     Flush the NFS filehandle caches of NFS directories. Newly created
>     files are sometimes unavailable via NFS for a period of time due to
>     filehandle caching, leading to apparent race problems. See
>     http://tss.iki.fi/nfs-coding-howto.html#fhcache for details. This
>     script forces a flush using techniques mentioned in the URL. It can
>     optionally do so recursively. This always does an opendir/closedir
>     sequence on each directory visited, as described in the URL, because
>     that's cheap and safe and often sufficient. Other strategies, such as
>     creating and removing a temp file, are optional.
>
>     EXAMPLES:
>         nfsflush.py /nfs/path/...
>
> The most important thing is to read the URL given above and/or to google
> for similar resources, of which there are many. While I'm not an NFS guru
> myself, the summary of my understanding is that NFS caches all sorts of
> things (metadata like atime/mtime, directory updates, etc.) with varying
> degrees of aggression according to NFS vendor and internal configuration.
> We've seen substantial variation between NAS providers such as NetApp,
> EMC, etc., so much depends on whose NFS server you're using. However, the
> NFS spec _requires_ that caches be flushed on a write operation, so all
> implementations will do this.
>
> Bottom line, the most common failure case is as mentioned above: foo.o is
> compiled on host A and immediately linked on host B. The close() system
> call following the final write() of foo.o on host A will cause its data to
> be flushed. Similarly I *believe* the directory write (assuming foo.o is
> newly created and not just updated) will cause the filehandle cache to be
> flushed. Thus, after these two write ops (directory and file) the NFS
> server will know about the new foo.o as soon as it's created.
>
> The problem typically arises on host B: because no write operation has
> taken place there after foo.o was created on A, no one has told it to
> update its caches, so it doesn't know foo.o exists and the link fails with
> ENOENT. All the flushing techniques in the script above are attempts to
> address this. One takeaway from all this is that even if you do retries, a
> "dumb" retry is immeasurably enhanced by adding a flush. In other words
> the most efficient retry formula in a distributed build scenario would be:
>
>     <recipe> || (flush && <recipe>)
>
> This never flushes a cache unless the first attempt fails.
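[Editor's note: to make the flush step concrete, here is a minimal shell sketch of the two cheapest strategies listed in the usage message above: the always-on directory read (the shell-level analog of an opendir/closedir sequence) and the optional `--touch` style directory write. The function names and details are illustrative assumptions, not taken from the actual nfsflush implementation.]

```shell
# Minimal sketch (illustrative, not the real nfsflush).

# Always-on strategy: reading the directory forces the NFS client to
# revalidate its filehandle cache for it - cheap, safe, often sufficient.
flush_dir() {
    ls -a "$1" > /dev/null
}

# Optional "--touch" style strategy: the NFS spec requires caches to be
# flushed on a write, so creating and removing a temp file forces one.
touch_flush() {
    _tmp="$1/.nfsflush.$$" &&
    : > "$_tmp" &&
    rm -f "$_tmp"
}
```

A recipe could then use, e.g., `cc -o foo.exe foo.o || (flush_dir . && cc -o foo.exe foo.o)`.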
> It presumes that NFS implementors and admins know what they're doing and
> thus that caching helps with performance, so it's not done unless needed.
> This is what we built into our variant of lsmake. However, the same can
> also be done in the shell.
>
> Details about implemented cache flushing techniques: the filehandle cache
> is the biggest source of problems in distributed builds, and the simplest
> solution for it seems to be opening and reading the directory entry. Thus
> our script and its parallel C implementation always do that. We've also
> seen cases where forcing a directory write operation is required, which
> the -t, --touch option does. Sometimes you can't easily enumerate all
> directories involved (vpath etc.) so the recurse-downward (-r) and
> recurse-upward (-u) flags may be helpful, though they (especially -u) may
> also be overkill. The -f and -l options were added based on advice found
> on the net but have not been shown to be helpful in our environment.
>
> Some techniques may be of limited utility because they require write
> and/or ownership privileges. For instance I've seen statements that
> umounts, even failed umounts, will force flushes. Thus a command like
> "cd <dir> && umount $(pwd)" would have to fail since the mount is busy,
> but would flush as a side effect. However, I believe this requires root
> privileges so is not helpful in the normal case.
>
> In summary: although I don't believe in retries, if they're going to be
> used I think they should be implemented in a shell wrapper program which
> could be passed to make as SHELL=<wrapper>, and the wrapper should use
> flushing in addition to, or instead of, retries. We didn't do it that way,
> but I think our nfsflush program could just as well have been implemented
> as, say, "nfsshell", such that "nfsshell [other-options] -c <recipe>"
> would run the recipe along with added flushing and retrying options.
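[Editor's note: such a wrapper can be sketched in a few lines of shell. This is a hypothetical sketch of the "nfsshell" idea described above, not an existing program. It relies only on the documented fact that make invokes its SHELL as `<shell> -c '<recipe>'`; the single-retry policy and the choice to flush only the working directory are assumptions.]

```shell
#!/bin/sh
# nfsshell (hypothetical sketch): a SHELL replacement for make that
# retries a failed recipe once, after flushing the NFS filehandle
# cache of the working directory.

# First attempt: succeed fast in the common (non-racy) case.
if sh "$@"; then
    exit 0
fi

# The recipe failed: flush by forcing a readdir of the working
# directory, then retry once, preserving the retry's exit status.
ls -a . > /dev/null 2>&1
exec sh "$@"
```

It could then be used without touching the makefile, e.g. `make SHELL=/path/to/nfsshell`.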
> I agree with Paul that I see no reason to implement any of these features,
> retry and/or flush, directly in make.
>
> David
>
> On Mon, Sep 2, 2019 at 6:05 AM Paul Smith <psm...@gnu.org> wrote:
>
>> On Sun, 2019-09-01 at 23:23 -0700, Kaz Kylheku (gmake) wrote:
>> > If your R&D team would allow you to add just one line to the
>> > legacy GNU Makefile to assign the SHELL variable, you can assign that
>> > to a shell wrapper program which performs command re-trying.
>>
>> You don't have to add any lines to the makefile. You can reset SHELL
>> on the command line, just like any other make variable:
>>
>>     make SHELL=/my/special/sh
>>
>> You can even override it only for specific targets using the --eval
>> command line option:
>>
>>     make --eval 'somerule: SHELL := /my/special/sh'
>>
>> Or, you can add '-f mymakefile.mk -f Makefile' options to the command
>> line to force reading of a personal makefile before the standard
>> makefile.
>>
>> Clearly you can modify the command line, otherwise adding new options
>> to control a putative retry-on-error option would not be possible.
>>
>> As for your NFS issue, another option would be to enable the .ONESHELL
>> feature available in newer versions of GNU make: that will ensure that
>> all lines in a recipe are invoked in a single shell, which means that
>> they should all be invoked on the same remote host. This can also be
>> done from the command line, as above. If your recipes are written well
>> it should Just Work. If they aren't, and you can't fix them, then
>> obviously this solution won't work for you.
>>
>> Regarding changes to get re-invocation on failure, at this time I don't
>> believe it's something I'd be willing to add to GNU make directly,
>> especially not an option that simply retries every failed job. This is
>> almost never useful (why would you want to retry a compile, or link, or
>> similar? It will always just fail again, take longer, and generate
>> confusing duplicate output--at best).
>> The right answer for this problem is to modify the makefile to properly
>> retry those specific rules which need it.
>>
>> I commiserate with you that your environment is static and you're not
>> permitted to modify it; however, adding new specialized capabilities to
>> GNU make so that makefiles don't have to be modified isn't a design
>> philosophy I want to adopt.
>>
>> _______________________________________________
>> Help-make mailing list
>> Help-make@gnu.org
>> https://lists.gnu.org/mailman/listinfo/help-make