Hi,

On 12.04.2017 at 03:20, Mun Johl wrote:

> […]Here's what I do:
> 
>     firewall-cmd --permanent --add-port=6444/tcp
>     firewall-cmd --permanent --add-port=6445/tcp
>     firewall-cmd --reload
> 
> 
> Hmm, that didn't help; thus, I have something even worse to deal with.  BTW, 
> I executed the firewall-cmd commands on both the qmaster and the execution 
> host.  FYI, here is the error I get when the execution host tries to qping 
> the qmaster:
> 
> % qping SGEMASTER 6444 qmaster 1
> endpoint SGEMASTER.company.com/qmaster/1 at port 6444: can't find connection
> got select error: No route to host
> got select error: closing "SGEMASTER.company.com/qmaster/1"

Does the qmaster machine have more than one network interface? On which one 
does SGE run, and what is the primary name of the machine? Sometimes it's 
necessary to use a host_aliases file (`man host_aliases`), so that SGE thinks 
it operates on eth1 inside the cluster, even though the machine got its name 
from eth0:

https://arc.liv.ac.uk/SGE/howto/multi_intrfcs.html
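
As a rough sketch of what such an entry could look like: each line of the 
host_aliases file starts with the name SGE should use, followed by the aliases 
that are mapped onto it. The internal name `sgemaster-int` is made up, and 
which interface `SGEMASTER.company.com` actually belongs to is only an 
assumption here, so adjust both to your setup.

```shell
#!/bin/sh
# Sketch only: both host names are placeholders, not taken from a real setup.
INTERNAL_NAME="${INTERNAL_NAME:-sgemaster-int}"          # hypothetical name on eth1
EXTERNAL_NAME="${EXTERNAL_NAME:-SGEMASTER.company.com}"  # assumed to be the eth0 name

# One line per host: the name SGE should use first, then the aliases mapped onto it.
host_aliases_line() {
    printf '%s %s\n' "$1" "$2"
}

# Print the line; it would go into <sge_root>/<cell>/common/host_aliases.
host_aliases_line "$INTERNAL_NAME" "$EXTERNAL_NAME"
```

With the defaults above this prints `sgemaster-int SGEMASTER.company.com`.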


> I believe IT has a bridge configured on the SGEMASTER; therefore, I need to 
> discuss that aspect of the network with our IT folks to see if that may be 
> impeding my success in some fashion.

This looks more like a routing problem now. Nevertheless: do you run a firewall 
on all machines? Often a computing cluster is like a black box:

There is a login node with a firewall to the outside world:

- ssh access for the users to log in
- outgoing ntp requests to adjust the time of the cluster
- a port for the backup agent (in our case a tape library)
- maybe an outgoing printer port to access a printer somewhere in the department
- outgoing email to reach any smtp server in the department/company
- incoming email only from the above-mentioned smtp server in the 
  department/company (to receive errors)

But inside the cluster (on a second network port of the login server) there is 
no firewall, nor is there one on the computing nodes. Short of someone 
attaching a cable to the switch, there is no way to gain access without 
already having been granted access to the login node.
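
With firewalld, such a layout could be sketched roughly as follows. This is 
only an illustration under assumptions: `eth1` as the cluster-facing interface 
is a guess, and the script just prints the commands unless `DRY_RUN=0` is set. 
Outgoing connections (ntp, smtp, printer) are allowed by firewalld's default 
policy, so they need no extra rule here.

```shell
#!/bin/sh
# Sketch only: the interface name and the zone assignment are assumptions.
INTERNAL_IF="${INTERNAL_IF:-eth1}"   # hypothetical cluster-facing interface
DRY_RUN="${DRY_RUN:-1}"              # default: print commands instead of running them

# Either echo the command (dry run) or execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

login_node_firewall() {
    # Outside world: allow incoming ssh in the public zone.
    run firewall-cmd --permanent --zone=public --add-service=ssh
    # Cluster side: the "trusted" zone accepts all connections.
    run firewall-cmd --permanent --zone=trusted --change-interface="$INTERNAL_IF"
    # Activate the permanent configuration.
    run firewall-cmd --reload
}

login_node_firewall
```

Putting the internal interface into the `trusted` zone has the same effect as 
having no firewall inside the cluster, while the outside-facing interface stays 
filtered.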

The next issues that could arise are NFS inside the cluster and any MPI 
applications (and also mail inside the cluster, since the exechost sends it). 
With a firewall in place, the MPI applications have to be told which port 
range to use, and both they and the firewall have to be configured 
accordingly. IIRC there is also no way to force SGE's `qrsh` to use only 
certain ports.
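
For the MPI part, a minimal sketch assuming Open MPI and its TCP BTL: the port 
numbers are arbitrary examples, and the parameter names should be checked 
against `ompi_info` for your version. Other MPI libraries use different 
settings, and the runtime's own out-of-band connections may need a separate 
setting. As before, the firewall commands are only printed unless `DRY_RUN=0`.

```shell
#!/bin/sh
# Sketch only: port numbers are arbitrary; Open MPI is an assumption.
PORT_MIN="${PORT_MIN:-50000}"      # hypothetical first port for MPI traffic
PORT_COUNT="${PORT_COUNT:-100}"    # hypothetical number of ports to reserve
PORT_MAX=$((PORT_MIN + PORT_COUNT))  # slight superset, in case the upper bound is inclusive
DRY_RUN="${DRY_RUN:-1}"            # default: print commands instead of running them

# Either echo the command (dry run) or execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Open MPI reads MCA parameters from OMPI_MCA_<name> environment variables.
export OMPI_MCA_btl_tcp_port_min_v4="$PORT_MIN"
export OMPI_MCA_btl_tcp_port_range_v4="$PORT_COUNT"

open_mpi_ports() {
    # firewalld accepts a port range in the form first-last/protocol.
    run firewall-cmd --permanent --add-port="${PORT_MIN}-${PORT_MAX}/tcp"
    run firewall-cmd --reload
}

open_mpi_ports
```

The same range would have to be opened on every node that takes part in the 
MPI job, and the exported variables must reach the job's environment.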

If your cluster essentially has such a configuration, maybe you can convince 
your IT folks to switch the firewall off inside the cluster.

-- Reuti

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users