Hi,

On 12.04.2017 at 03:20, Mun Johl wrote:

> […]Here's what I do:
>
> firewall-cmd --permanent --add-port=6444/tcp
> firewall-cmd --permanent --add-port=6445/tcp
> firewall-cmd --reload
>
>
> Hmm, that didn't help; thus, I have something even worse to deal with. BTW,
> I executed the firewall-cmd commands on both the qmaster and the execution
> host. FYI, here is the error I get when the execution host tries to qping
> the qmaster:
>
> % qping SGEMASTER 6444 qmaster 1
> endpoint SGEMASTER.company.com/qmaster/1 at port 6444: can't find connection
> got select error: No route to host
> got select error: closing "SGEMASTER.company.com/qmaster/1"

Does the qmaster machine have more than one network interface? On which one does SGE run, and what is the primary name of the machine? Sometimes it's necessary to use a host_aliases file (`man host_aliases`), so that SGE thinks it operates on eth1 inside the cluster, although the machine got its name from eth0:

https://arc.liv.ac.uk/SGE/howto/multi_intrfcs.html

> I believe IT has a bridge configured on the SGEMASTER; therefore, I need to
> discuss that aspect of the network with our IT folks to see if that may be
> impeding my success in some fashion.

This looks more like a routing problem now. Nevertheless: do you run a firewall on all machines? Often a computing cluster is like a black box. There is a login node with a firewall to the outside world, which allows only:

- ssh access for the users to log in
- outgoing ntp requests to adjust the time of the cluster
- a port for the backup agent (in our case a tape library)
- maybe an outgoing printer port to access a printer somewhere in the department
- outgoing email to any smtp server in the department/company
- incoming email only from the above-mentioned smtp server in the department/company (to receive errors)

But inside the cluster (on a second network port of the login server) there is no firewall, and none on the compute nodes either.
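
If firewalld has to stay enabled on the nodes, one possible way to get the "no firewall inside the cluster" effect is to move the cluster-internal interface into firewalld's trusted zone. The following is only a sketch under assumptions: eth1 as the internal interface is a guess based on the example above, and the helper name `trust_internal_iface_cmds` is made up for illustration. It only prints the commands so they can be reviewed first; it changes nothing by itself.

```shell
#!/bin/sh
# Sketch: print the firewall-cmd calls that would move one interface into
# firewalld's "trusted" zone (all traffic on that interface is accepted).
# ASSUMPTION: eth1 is the cluster-internal interface; adjust to your setup.
# Nothing is changed here; the commands are only printed for review.

trust_internal_iface_cmds() {
    iface="$1"
    if [ -z "$iface" ]; then
        echo "usage: trust_internal_iface_cmds INTERFACE" >&2
        return 1
    fi
    # Bind the interface to the trusted zone in the permanent configuration.
    printf 'firewall-cmd --permanent --zone=trusted --change-interface=%s\n' "$iface"
    # Load the permanent configuration into the running firewall.
    printf 'firewall-cmd --reload\n'
    # Show which zone each interface ended up in.
    printf 'firewall-cmd --get-active-zones\n'
}

trust_internal_iface_cmds eth1
```

After checking the printed lines, they could be run as root on the qmaster and on each execution host. The outward-facing interface stays in its restrictive zone, so the port list for the outside world is not affected.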

Unless someone attaches a cable to the cluster's switch, there is no way to get in without having already been granted access to the login node.

The next issue that could arise is NFS inside the cluster and any MPI applications (or also mail inside the cluster, as the exechost sends it). It's necessary to tell the MPI applications which port range to use, and to configure both them and the firewall accordingly. IIRC there is also no way to force SGE's `qrsh` to use only certain ports.

So if your cluster essentially has such a configuration, maybe you can convince your IT folks to switch the firewall off.

-- Reuti