On 10/26/2016 at 14:08 Ricardo Wurmus writes:

> At the MDC we’re using SGE and users specify their software environment
> in the job script.  The software environment is a Guix profile, so the
> job script usually contains a line to source the profile’s
> “etc/profile”, which has the effect of setting up the required
> environment variables.
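To make that concrete, a job script along these lines would do it (a
sketch only — the profile path, job name, and analysis command are
hypothetical examples, not taken from this thread):

```shell
# Sketch of an SGE job script that activates a Guix profile before
# running the analysis.  Writing it to job.sh for illustration.
cat > job.sh <<'EOF'
#!/bin/sh
#$ -N example-analysis
#$ -cwd
# Sourcing the profile's etc/profile sets PATH and the other
# environment variables the profile's packages need.
GUIX_PROFILE="$HOME/.guix-profile"
. "$GUIX_PROFILE/etc/profile"
exec my-analysis input.dat
EOF
```

The script would then be submitted with qsub as usual.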
Cool.  How do you deal with the tendency of users' profiles to be
"moving targets"?  IOW, I am wondering how one would reproduce a result
at a later date when one's profile has changed.

> I don’t know of anyone who uses VMs or VM images to specify software
> environments.

One rationale I can think of for VM images is to "archive" them along
with the analysis results to provide brute-force reproducibility.  An
example I know of is a group whose cluster consists of VMs on VMware.
The VMs run a mix of OSes provisioned with varying levels of resources
(e.g. #CPUs, amount of memory, installed software).

>> Based on my experiments with Guix/Debian, GuixSD, VMs, and VM images
>> it is not obvious to me which of these levels of abstraction is
>> appropriate.
>
> FWIW we’re using Guix on top of CentOS 6.8.  The store is mounted
> read-only on all cluster nodes.

Nice.  Do you attempt to "protect" your users from variations in the
CentOS configuration?

>> The most forward-thinking group that I know discarded their cluster
>> hardware a year ago to replace it with starcluster
>> (http://star.mit.edu/cluster/).  Starcluster automates the creation,
>> care, and feeding of HPC clusters on AWS using the Grid Engine
>> scheduler and AMIs.  The group has a full-time "starcluster jockey"
>> who manages their cluster, and they seem quite happy with the
>> approach.  So you may want to consider starcluster as a model when
>> you think about cluster management requirements.
>
> When using starcluster, are software environments transferred to AWS
> on demand?  Does this happen on a per-job basis?  Are any of the
> instantiated machines persistent, or are they discarded after use?

In the application I refer to, the cluster is kept spun up.  I am not
sure whether they have built a custom Amazon VM image (AMI) or whether
they start with a "stock" AMI and configure the compute hosts during
spin-up.
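P.S. Coming back to the "moving target" question above: one approach I
have seen is to keep the profile's manifest alongside the results, so
the profile can be rebuilt later from the same package list.  A sketch
(the package names are examples only, not from this thread):

```shell
# Hypothetical Guix manifest; package names are examples only.
cat > manifest.scm <<'EOF'
(specifications->manifest
 (list "gcc-toolchain"
       "samtools"))
EOF
# The profile could then be rebuilt later from the same manifest,
# e.g. (shown for illustration; requires Guix):
#   guix package --manifest=manifest.scm -p ./analysis-profile
```

For full reproducibility one would also need to record which Guix
commit the profile was built from.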