The beowulf list is another great resource:

http://www.beowulf.org/mailman/listinfo/beowulf

There's a good mix of theoretical and practical people on the list.

While I agree in principle about disabling ssh access to nodes, in practice
it doesn't work. People really like to be able to log in and run things like
top to watch their job in real-time. We've settled on a compromise where
direct ssh logins have very strict resource limits for CPU and memory use,
so it's good enough to run basic stuff but not as a way to subvert the
scheduler.
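
To illustrate the kind of per-login cap described above, here is a minimal
sketch in Python. It is not our actual configuration: the function name
run_with_limits and the specific limit values are made up for the example,
and on a real cluster you would more likely enforce this at login time
(for instance through PAM limits or cgroups) rather than wrapping commands
by hand. It only shows the underlying mechanism, which is the kernel's
per-process CPU-time and address-space limits.

```python
import resource
import subprocess
import sys


def run_with_limits(cmd, cpu_seconds, mem_bytes):
    """Run cmd with hard caps on CPU time and address space (Unix only).

    Hypothetical helper for illustration; names and values are assumptions.
    """
    def apply_limits():
        # Executed in the child process just before exec, so the limits
        # apply to cmd and to anything it spawns.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))

    return subprocess.run(cmd, preexec_fn=apply_limits,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Ask a child process to report the CPU limit it inherited.
    child = "import resource; print(resource.getrlimit(resource.RLIMIT_CPU)[0])"
    result = run_with_limits([sys.executable, "-c", child],
                             cpu_seconds=5, mem_bytes=2 << 30)
    print(result.stdout.strip())
```

A process started this way can still run something light like top, but a
long-running compute job gets killed by the kernel once it exceeds the CPU
cap, which is the behavior that keeps people from bypassing the scheduler.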

Skylar

On Thu, Mar 19, 2015 at 2:39 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:

> Hi Will
>
> This may be a good starting point:
>
> http://www.clustermonkey.net/
>
> although you will find recent and old articles there.
>
> The simplest, lowest-effort clustering software is probably
> Rocks Clusters:
> http://www.rocksclusters.org/wordpress/
>
> It is very easy to use if you want to set up a cluster from
> scratch, and are willing to stick to their configuration,
> which is not very flexible.
> However, it may be trickier to adapt it to
> an existing cluster.
> It also only works on diskful/stateful clusters,
> and installs a RHEL/CentOS/SL OS release that matches
> the Rocks release version (e.g. it won't work with Debian/Ubuntu,
> Fedora, SLES, etc.).
>
> Other clustering software options (some support diskless nodes
> and a Linux distribution of your choice) are:
>
> xCAT, open source, supported by IBM/Lenovo:
> http://sourceforge.net/p/xcat/wiki/Main_Page/
>
> LBNL's Warewulf:
> http://warewulf.lbl.gov/trac
>
> and Perceus:
> http://www.perceus.org/
>
> Of course you can use other provisioning tools to do it yourself.
>
> **
>
> IMHO, opening the compute nodes for direct ssh access
> bypassing the job scheduler is inviting a disaster,
> and not needed at all.
>
> Most if not all job schedulers, including Torque/Maui,
> have interactive job functionality.
> On Torque/Maui you can run "qsub -I -l nodes=1"; the -I stands for
> interactive.
> "man qsub" helps.
> This opens a login session for the user on an available
> node, whilst not breaking the priorities and hierarchy of
> the queueing policy.
>
> Actually, Torque can be configured to build a PAM module, which
> you can use to prevent direct ssh to the nodes,
> although you can reach the same goal using other methods.
>
> I hope this helps,
> Gus Correa
>
>
> On 03/19/2015 02:06 PM, Will Dennis wrote:
>
>> Hi all,
>>
>> I have inherited admin duties on a Linux cluster here at work, that is
>> getting long in the tooth and is in need of refactoring. It's about 60
>> nodes, all netbooted off a "head node", with Torque/Maui job scheduler
>> (PBS) system and a Gluster filesystem, but not a traditional "single
>> image" cluster in that folks can (and do) SSH directly into the compute
>> nodes and do work there (others do use the job scheduler to run their
>> jobs.)
>>
>> Having never admin'd a cluster before, I'd like to know if there are some
>> good resources on learning the different sorts of cluster architectures
>> that are out there (really need to know about small clusters only, not
>> looking to build a 100's/1000's-of-nodes cluster here) and decide what
>> would possibly fit the bill here for "cluster 2.0" :) I've looked on the
>> 'net, and also in Safari Books Online, and I'm seeing a lot of info
>> published back in the early 2000's, but nothing too recent (say past
>> 2010 or so.) I can't really believe technology hasn't changed much since
>> then :-P Whatever I do up the road, I want to do "right", just have to
>> learn what "right" is...
>>
>> Thanks!
>>
>> Will
>>
>>
>>
>> _______________________________________________
>> Tech mailing list
>> Tech@lists.lopsa.org
>> https://lists.lopsa.org/cgi-bin/mailman/listinfo/tech
>> This list provided by the League of Professional System Administrators
>>   http://lopsa.org/
>>
>>