Sorry again. One can always add a login node to the Rocks cluster that will act as a submit node for SGE.
regards
On 3/28/2012 6:21 AM, "Hung-Sheng Tsao (Lao Tsao 老曹) Ph.D." wrote:
On 3/28/2012 5:53 AM, Reuti wrote:

On 27.03.2012 at 23:27, Hung-sheng Tsao wrote:

It may be good to add an explanation: to me it looks like the original poster installed a separate SGE cluster on just one machine, including the qmaster daemon, and hence it is just running locally, which explains the job ID being 1.

Maybe just copy /opt/gridengine from one of the compute nodes.
Add this as a submit host from the frontend.

Sorry, if one just copies /opt/gridengine from the compute nodes, then it will have the full directories /opt/gridengine/default/common and /opt/gridengine/bin. Yes, there is also default/spool, which one could delete. The daemon should not run! Of course one will need the home directory, UIDs etc. from the Rocks frontend. IMHO, it is much simpler than installing a new version of SGE. Of course, if the submit host is not running the same CentOS/Red Hat as the compute nodes, that is another story.

regards

To add a submit host to an existing cluster it isn't necessary to have any daemon running on it, and installing a different version of SGE will most likely not work either, as the internal protocol changes between releases. I suggest to:

- Stop the daemons you started on the new submit host
- Remove the compilation you did
- Share the users from the existing cluster by NIS/LDAP (unless you want to define them all by hand on the new machine too)
- Mount /home from the existing cluster
- Mount /usr/sge or /opt/grid, wherever you have SGE installed in the existing cluster
- Add the machine in question as submit host in the original cluster
- Source $SGE_ROOT/default/common/settings.sh during login on the submit machine

Then you should be able to submit jobs from this machine.

As there is no built-in file staging in SGE, it's most common to share /home.

==

Nevertheless, it would be possible to have a separate single-machine cluster (with a different version of SGE) and use file staging (which you have to implement on your own), but it's too much overhead for adding just this particular machine, IMO.
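The mount, register and verify steps listed above could be sketched as a small shell script. This is a minimal sketch under stated assumptions, not a tested procedure: HEADNODE and SUBMITNODE are placeholder host names taken from the prompts in the original post, /opt/gridengine is the Rocks location mentioned in this thread, and the head node is assumed to export /home and the SGE directory over NFS. The function names and the DRY_RUN switch are made up for illustration; by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: with DRY_RUN=1 (the default) commands are printed, not executed.
HEAD=${HEAD:-HEADNODE}                # placeholder: head node (qmaster) name
SUBMIT=${SUBMIT:-SUBMITNODE}          # placeholder: new submit host name
SGE_DIR=${SGE_DIR:-/opt/gridengine}   # SGE location on the existing cluster (assumption)
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the new submit host, as root.
# Assumes the head node exports both directories over NFS.
mount_shared() {
    run mount -t nfs "$HEAD:/home" /home
    run mount -t nfs "$HEAD:$SGE_DIR" "$SGE_DIR"
}

# On the head node, as an SGE manager: add the machine as a submit host.
register_submit_host() {
    run qconf -as "$SUBMIT"
}

# On the submit host, after $SGE_DIR/default/common/settings.sh has been
# sourced at login: the submit host list and the execution hosts of the
# existing cluster should now be visible.
verify_submit_host() {
    run qconf -ss
    run qhost
}
```

Sourcing the file and calling mount_shared prints the two mount commands; setting DRY_RUN=0 would run them instead. Sharing the users via NIS/LDAP and sourcing settings.sh at login still have to be done as described in the list.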
It's a suitable setup to combine clusters by the use of a transfer queue this way. I did it once and used the job context to name the files which have to be copied back and forth, and then copied them on my own in a starter method.

-- Reuti

LT

Sent from my iPhone

On Mar 27, 2012, at 4:36 PM, Robert Chase <[email protected]> wrote:

Hello,

A number of years ago, our group created a Rocks cluster consisting of a head node, a data node and eight execution nodes. The eight execution nodes can only be accessed by the head node.

My goal is to add a submit node to the existing cluster. I have downloaded GE2011.11 and compiled it from source without errors. When I try the command:

qsub simple.sh

I get the error:

Unable to run job: warning: root your job is not allowed to run in any queue

When I look at qstat I get:

job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
      1 0.55500 simple.sh  root         qw    03/27/2012 09:41:11                                    1

I have added the new submit node to the list of submit nodes on the head node using the command

qconf -as

When I run qconf -ss on the new submit node I see the head node, the data node and the new submit node.

When I run qconf -ss on the head node, I see the head node, the data node, the new submit node and all eight execution nodes.

When I run qhost on the new submit node, I get

HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -

Other posts have asked about the output of qconf -sq all.q...
[root@HEADNODE jobs]# qconf -sq all.q
qname                 all.q
hostlist              @allhosts
seq_no                0
load_thresholds       np_load_avg=1.75
suspend_thresholds    NONE
nsuspend              1
suspend_interval      00:05:00
priority              0
min_cpu_interval      00:05:00
processors            UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list             NONE
pe_list               make mpi mpich multicore orte
rerun                 FALSE
slots                 1,[compute-0-0.local=16],[compute-0-1.local=16], \
                      [compute-0-2.local=16],[compute-0-3.local=16], \
                      [compute-0-4.local=16],[compute-0-6.local=16], \
                      [compute-0-7.local=16]
tmpdir                /tmp
shell                 /bin/csh
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        NONE
suspend_method        NONE
resume_method         NONE
terminate_method      NONE
notify                00:00:60
owner_list            NONE
user_lists            NONE
xuser_lists           NONE
subordinate_list      NONE
complex_values        NONE
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         default
s_rt                  INFINITY
h_rt                  INFINITY
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
h_rss                 INFINITY
s_vmem                INFINITY
h_vmem                INFINITY

[root@SUBMITNODE jobs]# qconf -sq all.q
qname                 all.q
hostlist              @allhosts
seq_no                0
load_thresholds       np_load_avg=1.75
suspend_thresholds    NONE
nsuspend              1
suspend_interval      00:05:00
priority              0
min_cpu_interval      00:05:00
processors            UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list             NONE
pe_list               make
rerun                 FALSE
slots                 1
tmpdir                /tmp
shell                 /bin/csh
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        NONE
suspend_method        NONE
resume_method         NONE
terminate_method      NONE
notify                00:00:60
owner_list            NONE
user_lists            NONE
xuser_lists           NONE
subordinate_list      NONE
complex_values        NONE
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         default
s_rt                  INFINITY
h_rt                  INFINITY
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
h_rss                 INFINITY
s_vmem                INFINITY
h_vmem                INFINITY

I would like to know how to get qsub working.

Thanks,
-Robert Paul Chase
Channing Labs

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
--
Hung-Sheng Tsao Ph D.
Founder & Principal
HopBit GridComputing LLC
cell: 9734950840
http://laotsao.blogspot.com/
http://laotsao.wordpress.com/
http://blogs.oracle.com/hstsao/
