Hi,

On 20.11.2006 at 13:12, Epitropakis Mixalis 00064 wrote:

Hello everyone!

I am a member of a research laboratory at the University of Patras, Greece.
We have ordered our first cluster and it will arrive in the next few
days. So we will need the help of experts in order to decide which
cluster management and job scheduling software is the most suitable for it
:) .

Each computer in the cluster consists of 2 x Dual-Core Intel Xeon
Processor 5060 (3.2 GHz, 1066 MHz bus), an S5000PAL motherboard,
4GB ECC RAM and a 250GB HDD. All machines are interconnected through a
Gigabit Ethernet switch.

What we need is your opinion of, and experience with, a software package
(or a collection of packages) that will make it easier to use the cluster
(job scheduling) and to administer it (updating and upgrading
the OS, installing new software, user administration, etc.). We are
proficient with Linux administration (any distro).

In our first search on the internet, we found some packages, but they do not combine both job scheduling and administration. If there is a package that combines both and that you could suggest, we would be really happy.
We would prefer the software packages to be open source. :)

Some of those we have found and studied so far are the following:

[1] TORQUE Resource Manager

http://www.clusterresources.com/pages/products/torque-resource-manager.php
[2] http://gridengine.sunsource.net/
[3] http://oscar.openclustergroup.org/
[4] http://dcc.irb.hr/

I think this question would reach a broader audience on the beowulf.org mailing list, but anyway: what are you using in the cluster besides OpenMPI? Although I'm biased, I would suggest SGE (Grid Engine), as it supports more parallel libraries than Torque through its qrsh replacement, e.g. Linda or PVM. The integration between the qmaster and the scheduler is also tighter. In Torque you have two commands: "qstat" and "showq". The former shows the cluster as seen by Torque, the latter as seen by the Maui scheduler, and sometimes I observe that they disagree about what is running in the cluster and what is not (we use SGE, but we have access to some clusters at other locations that prefer Torque).

Support for SGE will be in OpenMPI 1.2, AFAIK.

Question: do you have a central file server in the cluster to serve the home directories to the nodes, which could also act as the NIS, NTP and SGE qmaster server? You mentioned only the nodes.

-- Reuti


These are some of our thoughts. We know that the choice of distribution
and of cluster management software will be made only ONCE and that we
will not be able to test or change it easily...

Thanks very much for your time. I am sure that your opinion will be
of great help to us!

Michael
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
