Hello all,

We have a rather old installation of SGE that has been running for years without any problems. For the last 2-3 weeks I've been seeing an odd problem: when I issue any command (qsub, qstat, qping, etc.) I get the following error:

error: commlib error: access denied (server host resolves destination host "<server address>" as "(HOST_NOT_RESOLVABLE)")
error: unable to contact qmaster using port 6444 on host "<server address>"

There are several odd things about this:

* Nothing has changed on the server or the clients in the months before the error started appearing.
* This happens from most of the clients, but not all.
* The error persists for 5-10 minutes, and then everything works fine.
* Both gethostbyname and gethostbyaddr return the correct values from the client while the error occurs (I haven't had a chance to try them from the master during these episodes).

I have a feeling that this has something to do with DNS and reverse lookup, but I don't know where to start debugging it. Does anyone have any idea what I should look at?

Thanks,

-- 
Valerio Luccio (212) 998-8736
Center for Brain Imaging
4 Washington Place, Room 157
New York University
New York, NY 10003

"In an open world, who needs windows or gates ?"
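
P.S. To catch the next episode from the master's side, something along these lines could be left running there. This is only a minimal sketch, assuming Python 3 is available on the master; `check_resolution`, `watch`, and the host name passed in are placeholders of mine and not part of SGE. It does the same forward lookup (gethostbyname) followed by a reverse lookup (gethostbyaddr) that I tried by hand from the client.

```python
import socket
import time


def check_resolution(host):
    """Forward-resolve host, then reverse-resolve the address that came back."""
    result = {"host": host, "address": None, "reverse_name": None, "error": None}
    try:
        # Forward lookup: name -> IPv4 address.
        result["address"] = socket.gethostbyname(host)
        # Reverse lookup: address -> primary host name.
        result["reverse_name"] = socket.gethostbyaddr(result["address"])[0]
    except OSError as exc:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        result["error"] = str(exc)
    return result


def watch(host, interval_seconds=30, rounds=20):
    """Print a timestamped lookup result every interval_seconds, rounds times."""
    for _ in range(rounds):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(stamp, check_resolution(host))
        time.sleep(interval_seconds)
```

Calling `watch("client-host-name")` on the master, with one of the affected clients' names substituted for the placeholder, would show whether the master's own lookups fail during the 5-10 minute windows.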
_______________________________________________ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users