Hi Will
This may be a good starting point:
http://www.clustermonkey.net/
although you will find both recent and old articles there.
The simplest, least-effort clustering software is probably
Rocks Clusters:
http://www.rocksclusters.org/wordpress/
It is very easy to use if you want to set up a cluster from
scratch and are willing to stick to its configuration,
which is not very flexible.
However, it may be trickier to adapt it to
an existing cluster.
It also only works on diskful/stateful clusters,
and installs a RHEL/CentOS/SL OS release that matches
the Rocks release version (i.e. it won't work with Debian/Ubuntu,
Fedora, SLES, etc).
Other clustering software packages (some support diskless nodes
and a Linux distribution of your choice) are:
xCAT, open source, supported by IBM/Lenovo:
http://sourceforge.net/p/xcat/wiki/Main_Page/
LBNL's Warewulf:
http://warewulf.lbl.gov/trac
and Perceus:
http://www.perceus.org/
Of course you can use other provisioning tools to do it yourself.
**
IMHO, opening the compute nodes for direct ssh access
bypassing the job scheduler is inviting a disaster,
and not needed at all.
Most if not all job schedulers, including Torque/Maui,
have interactive job functionality.
On Torque/Maui you can do "qsub -I -l nodes=1"; the -I stands for
interactive.
"man qsub" helps.
This opens a login session for the user on an available
node, whilst not breaking the priorities and hierarchy of
the queueing policy.
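If you want to hand users a small wrapper instead of asking them to
remember the flags, something like the Python sketch below could work.
It only builds the qsub command line for an interactive job. The
function name, the default ppn and walltime values, and the idea of a
wrapper at all are my own assumptions for illustration, not anything
Torque ships; the flags themselves (-I, -l, -q) are standard qsub
options.

```python
import shlex


def interactive_qsub_cmd(nodes=1, ppn=1, walltime="01:00:00", queue=None):
    """Build the argument list for an interactive Torque job request.

    Hypothetical helper: the name and the defaults are illustrative only.
    """
    # -l takes a comma-separated resource list; nodes/ppn are joined by ':'
    resources = f"nodes={nodes}:ppn={ppn},walltime={walltime}"
    cmd = ["qsub", "-I", "-l", resources]
    if queue is not None:
        # -q selects a specific queue; omit it to use the default queue
        cmd += ["-q", queue]
    return cmd


if __name__ == "__main__":
    cmd = interactive_qsub_cmd(nodes=1, ppn=4, walltime="02:00:00")
    # Print the command so it can be reviewed before running it
    print(shlex.join(cmd))
    # On a login node with Torque installed, subprocess.call(cmd)
    # would start the interactive session.
```

Because the request still goes through qsub, the scheduler decides when
and where the session starts, so the walltime limit and the queue
policy apply just as they do for batch jobs.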
Actually, Torque can be configured to build a PAM module, which
you can use to prevent direct ssh to the nodes,
although you can reach the same goal using other methods.
I hope this helps,
Gus Correa
On 03/19/2015 02:06 PM, Will Dennis wrote:
Hi all,
I have inherited admin duties on a Linux cluster here at work, that is
getting long in the tooth and is in need of refactoring. It’s about 60
nodes, all netbooted off a “head node”, with Torque/Maui job scheduler
(PBS) system and a Gluster filesystem, but not a traditional “single
image” cluster in that folks can (and do) SSH directly into the compute
nodes and do work there (others do use the job scheduler to run their jobs.)
Having never admin’d a cluster before, I’d like to know if there’s some
good resources on learning the different sorts of cluster architectures
that are out there (really need to know about small clusters only, not
looking to build a 100’s/1000’s-of-nodes cluster here) and decide what
would possibly fit the bill here for “cluster 2.0” :) I’ve looked on the
‘net, and also in Safari Books Online, and I’m seeing a lot of info
published back in the early 2000’s, but nothing too recent (say past
2010 or so.) I can’t really believe technology hasn’t changed much since
then :-P Whatever I do up the road, I want to do “right”, just have to
learn what “right” is...
Thanks!
Will
_______________________________________________
Tech mailing list
Tech@lists.lopsa.org
https://lists.lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/