Hello Reuti,

this is the output of ps -e f:

master@sgemstr:~$ ps -e f
  PID TTY      STAT   TIME COMMAND
    2 ?        S      0:00 [kthreadd]
    3 ?        S      0:00  \_ [ksoftirqd/0]
    4 ?        S      0:00  \_ [kworker/0:0]
    5 ?        S<     0:00  \_ [kworker/0:0H]
    7 ?        S      0:00  \_ [migration/0]
    8 ?        S      0:00  \_ [rcu_bh]
    9 ?        S      0:00  \_ [rcuob/0]
   10 ?        S      0:00  \_ [rcuob/1]
   11 ?        S      0:00  \_ [rcuob/2]
   12 ?        S      0:00  \_ [rcuob/3]
   13 ?        S      0:00  \_ [rcuob/4]
   14 ?        S      0:00  \_ [rcuob/5]
   15 ?        S      0:00  \_ [rcuob/6]
   16 ?        S      0:00  \_ [rcuob/7]
   17 ?        S      0:00  \_ [rcu_sched]
   18 ?        S      0:00  \_ [rcuos/0]
   19 ?        S      0:00  \_ [rcuos/1]
   20 ?        S      0:00  \_ [rcuos/2]
   21 ?        S      0:00  \_ [rcuos/3]
   22 ?        S      0:00  \_ [rcuos/4]
   23 ?        S      0:00  \_ [rcuos/5]
   24 ?        S      0:00  \_ [rcuos/6]
   25 ?        S      0:00  \_ [rcuos/7]
   26 ?        S      0:00  \_ [watchdog/0]
   27 ?        S      0:00  \_ [watchdog/1]
   28 ?        S      0:00  \_ [migration/1]
   29 ?        S      0:00  \_ [ksoftirqd/1]
   30 ?        S      0:00  \_ [kworker/1:0]
   31 ?        S<     0:00  \_ [kworker/1:0H]
   32 ?        S      0:00  \_ [watchdog/2]
   33 ?        S      0:00  \_ [migration/2]
   34 ?        S      0:00  \_ [ksoftirqd/2]
   35 ?        S      0:00  \_ [kworker/2:0]
   36 ?        S<     0:00  \_ [kworker/2:0H]
   37 ?        S      0:00  \_ [watchdog/3]
   38 ?        S      0:00  \_ [migration/3]
   39 ?        S      0:00  \_ [ksoftirqd/3]
   40 ?        S      0:00  \_ [kworker/3:0]
   41 ?        S<     0:00  \_ [kworker/3:0H]
   42 ?        S<     0:00  \_ [khelper]
   43 ?        S      0:00  \_ [kdevtmpfs]
   44 ?        S<     0:00  \_ [netns]
   45 ?        S<     0:00  \_ [writeback]
   46 ?        S<     0:00  \_ [kintegrityd]
   47 ?        S<     0:00  \_ [bioset]
   49 ?        S<     0:00  \_ [kblockd]
   50 ?        S<     0:00  \_ [ata_sff]
   51 ?        S      0:00  \_ [khubd]
   52 ?        S<     0:00  \_ [md]
   53 ?        S<     0:00  \_ [devfreq_wq]
   54 ?        S      0:00  \_ [kworker/3:1]
   55 ?        S      0:00  \_ [kworker/2:1]
   57 ?        S      0:00  \_ [khungtaskd]
   58 ?        S      0:00  \_ [kswapd0]
   59 ?        SN     0:00  \_ [ksmd]
   60 ?        SN     0:00  \_ [khugepaged]
   61 ?        S      0:00  \_ [fsnotify_mark]
   62 ?        S      0:00  \_ [ecryptfs-kthrea]
   63 ?        S<     0:00  \_ [crypto]
   75 ?        S<     0:00  \_ [kthrotld]
   79 ?        S<     0:00  \_ [dm_bufio_cache]
   99 ?        S<     0:00  \_ [deferwq]
  100 ?        S<     0:00  \_ [charger_manager]
  101 ?        S      0:00  \_ [kworker/0:1]
  273 ?        S      0:00  \_ [scsi_eh_0]
  274 ?        S      0:00  \_ [scsi_eh_1]
  275 ?        S      0:00  \_ [scsi_eh_2]
  276 ?        S      0:00  \_ [scsi_eh_3]
  277 ?        S      0:00  \_ [scsi_eh_4]
  278 ?        S      0:00  \_ [scsi_eh_5]
  281 ?        S      0:00  \_ [kworker/u16:5]
  283 ?        S      0:00  \_ [kworker/u16:7]
  310 ?        S      0:00  \_ [jbd2/sda7-8]
  311 ?        S<     0:00  \_ [ext4-rsv-conver]
  312 ?        S<     0:00  \_ [ext4-unrsv-conv]
  599 ?        S<     0:00  \_ [kmemstick]
  601 ?        S      0:00  \_ [irq/45-mei_me]
  607 ?        S<     0:00  \_ [kpsmoused]
  621 ?        S<     0:00  \_ [rpciod]
  624 ?        S      0:00  \_ [kworker/1:2]
  659 ?        S<     0:00  \_ [ktpacpid]
  686 ?        S<     0:00  \_ [cfg80211]
  695 ?        S<     0:00  \_ [nfsiod]
  806 ?        S<     0:00  \_ [kworker/u17:1]
  809 ?        S<     0:00  \_ [hci0]
  810 ?        S<     0:00  \_ [hci0]
  811 ?        S<     0:00  \_ [kworker/u17:2]
  824 ?        S<     0:00  \_ [hd-audio0]
  888 ?        S      0:00  \_ [wl_event_handle]
  937 ?        S<     0:00  \_ [ttm_swap]
  968 ?        S<     0:00  \_ [krfcommd]
 1177 ?        S<     0:00  \_ [nfsd4]
 1178 ?        S<     0:00  \_ [nfsd4_callbacks]
 1179 ?        S      0:00  \_ [lockd]
 1182 ?        S      0:00  \_ [nfsd]
 1183 ?        S      0:00  \_ [nfsd]
 1184 ?        S      0:00  \_ [nfsd]
 1185 ?        S      0:00  \_ [nfsd]
 1186 ?        S      0:00  \_ [nfsd]
 1187 ?        S      0:00  \_ [nfsd]
 1188 ?        S      0:00  \_ [nfsd]
 1189 ?        S      0:00  \_ [nfsd]
    1 ?        Ss     0:00 /sbin/init
  386 ?        S      0:00 upstart-udev-bridge --daemon
  388 ?        Ss     0:00 /sbin/udevd --daemon
  547 ?        S      0:00  \_ /sbin/udevd --daemon
  548 ?        S      0:00  \_ /sbin/udevd --daemon
  632 ?        Ss     0:00 /usr/sbin/sshd -D
  873 ?        Sl     0:00 rsyslogd -c5
  874 ?        Ss     0:00 rpc.idmapd
  882 ?        S      0:00 upstart-socket-bridge --daemon
  885 ?        Ss     0:00 dbus-daemon --system --fork --activation=upstart
  920 ?        Ss     0:00 /usr/sbin/bluetoothd
  922 ?        Ss     0:00 rpcbind -w
  967 ?        Ss     0:00 /usr/sbin/modem-manager
  983 ?        S      0:00 avahi-daemon: running [sgemstr.local]
  985 ?        S      0:00  \_ avahi-daemon: chroot helper
 1000 ?        Ss     0:00 /usr/sbin/cupsd -F
 1004 ?        Ss     0:00 rpc.statd -L
 1008 ?        Ssl    0:00 NetworkManager
 2058 ?        S      0:00  \_ /usr/sbin/dnsmasq --no-resolv --keep-in-foregroun
 1016 ?        Sl     0:00 /usr/lib/policykit-1/polkitd --no-debug
 1085 tty4     Ss+    0:00 /sbin/getty -8 38400 tty4
 1092 tty5     Ss+    0:00 /sbin/getty -8 38400 tty5
 1094 ?        Ss     0:02 /sbin/wpa_supplicant -B -P /run/sendsigs.omit.d/wpasu
 1113 tty2     Ss+    0:00 /sbin/getty -8 38400 tty2
 1114 tty3     Ss+    0:00 /sbin/getty -8 38400 tty3
 1116 tty6     Ss+    0:00 /sbin/getty -8 38400 tty6
 1122 ?        Ss     0:00 acpid -c /etc/acpi/events -s /var/run/acpid.socket
 1127 ?        Ss     0:00 cron
 1128 ?        Ss     0:00 atd
 1134 ?        Ssl    0:00 lightdm
 1172 tty7     Ssl+   0:04  \_ /usr/bin/X :0 -auth /var/run/lightdm/root/:0 -nol
 1599 ?        Sl     0:00  \_ lightdm --session-child 12 19
 1852 ?        Ssl    0:00  \_ gnome-session --session=ubuntu
 1898 ?        Ss     0:00  \_ /usr/bin/ssh-agent /usr/bin/dbus-launch -
 1912 ?        Sl     0:00  \_ /usr/lib/gnome-settings-daemon/gnome-sett
 1936 ?        S      0:00  | \_ syndaemon -i 2.0 -K -R -t
 1929 ?        Sl     0:05  \_ compiz
 2037 ?        Ss     0:00  | \_ /bin/sh -c /usr/bin/compiz-decorator
 2038 ?        Sl     0:00  | \_ /usr/bin/gtk-window-decorator
 1959 ?        Sl     0:00  \_ nautilus -n
 1961 ?        Sl     0:00  \_ bluetooth-applet
 1962 ?        Sl     0:00  \_ /usr/lib/gnome-settings-daemon/gnome-fall
 1963 ?        Sl     0:00  \_ nm-applet
 1972 ?        Sl     0:00  \_ /usr/lib/policykit-1-gnome/polkit-gnome-a
 2233 ?        Sl     0:00  \_ /usr/lib/gnome-disk-utility/gdu-notificat
 2236 ?        Sl     0:00  \_ telepathy-indicator
 2254 ?        Sl     0:00  \_ zeitgeist-datahub
 2427 ?        Sl     0:00  \_ update-notifier
 2482 ?        Sl     0:00  \_ /usr/lib/deja-dup/deja-dup/deja-dup-monit
 1135 ?        Ss     0:00 /usr/sbin/irqbalance
 1175 ?        Ssl    0:00 whoopsie
 1193 ?        Ss     0:00 /usr/sbin/rpc.mountd --manage-gids
 1365 ?        Sl     0:00 /opt/sge/bin/lx-amd64/sge_qmaster
 1405 ?        Sl     0:00 /usr/lib/accountsservice/accounts-daemon
 1432 ?        Sl     0:00 /usr/sbin/console-kit-daemon --no-daemon
 1555 ?        Sl     0:00 /usr/lib/upower/upowerd
 1752 ?        SNl    0:00 /usr/lib/rtkit/rtkit-daemon
 1768 ?        Sl     0:00 /usr/lib/x86_64-linux-gnu/colord/colord
 1841 ?        Sl     0:00 /usr/bin/gnome-keyring-daemon --daemonize --login
 1901 ?        S      0:00 /usr/bin/dbus-launch --exit-with-session gnome-sessio
 1902 ?        Ss     0:00 //bin/dbus-daemon --fork --print-pid 5 --print-addres
 1920 ?        S      0:00 /usr/lib/gvfs/gvfsd
 1922 ?        Sl     0:00 /usr/lib/gvfs//gvfs-fuse-daemon -f /home/master/.gvfs
 1941 ?        S<l    0:00 /usr/bin/pulseaudio --start --log-target=syslog
 1946 ?        S      0:00  \_ /usr/lib/pulseaudio/pulse/gconf-helper
 1943 ?        S      0:00 /usr/lib/x86_64-linux-gnu/gconf/gconfd-2
 1948 ?        S      0:00 /usr/lib/gvfs/gvfsd-metadata
 1971 ?        S      0:00 /usr/lib/gvfs/gvfs-gdu-volume-monitor
 1979 ?        Sl     0:00 /usr/lib/udisks/udisks-daemon
 1980 ?        S      0:00  \_ udisks-daemon: not polling any devices
 1984 ?        Sl     0:00 /usr/lib/gvfs/gvfs-afc-volume-monitor
 1987 ?        S      0:00 /usr/lib/gvfs/gvfs-gphoto2-volume-monitor
 2003 ?        Sl     0:00 /usr/lib/notify-osd/notify-osd
 2007 ?        S      0:00 /usr/lib/gvfs/gvfsd-trash --spawner :1.5 /org/gtk/gvf
 2011 ?        Sl     0:00 /usr/bin/gnome-screensaver --no-daemon
 2016 ?        S      0:00 /usr/lib/gvfs/gvfsd-burn --spawner :1.5 /org/gtk/gvfs
 2019 ?        Sl     0:00 /usr/lib/bamf/bamfdaemon
 2043 ?        Sl     0:00 /usr/lib/unity/unity-panel-service
 2045 ?        Sl     0:00 /usr/lib/indicator-appmenu/hud-service
 2064 ?        Sl     0:00 /usr/lib/indicator-session/indicator-session-service
 2066 ?        Sl     0:00 /usr/lib/indicator-datetime/indicator-datetime-servic
 2068 ?        Sl     0:00 /usr/lib/indicator-messages/indicator-messages-servic
 2070 ?        Sl     0:00 /usr/lib/indicator-sound/indicator-sound-service
 2079 ?        Sl     0:00 /usr/lib/indicator-printers/indicator-printers-servic
 2080 ?        Sl     0:00 /usr/lib/indicator-application/indicator-application-
 2109 ?        S      0:00 /usr/lib/geoclue/geoclue-master
 2112 ?        Sl     0:00 /usr/lib/ubuntu-geoip/ubuntu-geoip-provider
 2190 ?        S      0:00 /opt/sge/bin/lx-amd64/sge_shadowd
 2226 tty1     Ss+    0:00 /sbin/getty -8 38400 tty1
 2243 ?        Sl     0:00 /usr/lib/telepathy/mission-control-5
 2248 ?        Sl     0:00 /usr/lib/gnome-online-accounts/goa-daemon
 2262 ?        Sl     0:00 /usr/bin/zeitgeist-daemon
 2268 ?        Sl     0:00 /usr/lib/zeitgeist/zeitgeist-fts
 2276 ?        S      0:00  \_ /bin/cat
 2287 ?        Sl     0:00 /usr/lib/unity-lens-applications/unity-applications-d
 2290 ?        Sl     0:00 /usr/bin/python /usr/lib/unity-lens-video/unity-lens-
 2292 ?        Sl     0:00 /usr/lib/unity-lens-files/unity-files-daemon
 2293 ?        Sl     0:00 /usr/lib/unity-lens-music/unity-music-daemon
 2322 ?        Sl     0:01 gnome-terminal
 2327 ?        S      0:00  \_ gnome-pty-helper
 2329 pts/0    Ss     0:00  \_ bash
 2616 pts/0    R+     0:00  \_ ps -e f
 2385 ?        Sl     0:00 /usr/bin/python /usr/lib/unity-scope-video-remote/uni
 2387 ?        Sl     0:00 /usr/lib/unity-lens-music/unity-musicstore-daemon
 2438 ?        S      0:00 /usr/bin/python /usr/lib/system-service/system-servic
 2442 ?        SNl    0:04 /usr/bin/python /usr/bin/update-manager --no-focus-on
 2448 ?        Sl     0:00 /usr/lib/dconf/dconf-service
 2463 ?        SN     0:01 /usr/bin/python /usr/sbin/aptd
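
A minimal sketch, assuming the process listing above is available as text, of how one might pick the Grid Engine daemons out of such a long `ps -e f` output. The function name `find_sge_daemons` is hypothetical; `sge_qmaster` and `sge_shadowd` appear in the listing, and `sge_execd` is assumed here as the name of the execution daemon binary one would expect on the exec hosts.

```python
# Daemon binary names to look for. sge_qmaster and sge_shadowd appear in the
# listing; sge_execd is an assumption (expected on the exec hosts).
SGE_DAEMONS = ("sge_qmaster", "sge_shadowd", "sge_execd")


def find_sge_daemons(ps_text):
    """Return {daemon_name: [pids]} for daemons found in `ps -e f` style text."""
    found = {name: [] for name in SGE_DAEMONS}
    for line in ps_text.splitlines():
        # Columns: PID TTY STAT TIME COMMAND (COMMAND may contain spaces).
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # header line or anything that is not a process row
        pid = int(fields[0])
        # Drop the tree-drawing prefix ("\_", "|") that `ps f` adds for children.
        command = fields[4].lstrip("|\\_ ")
        if not command:
            continue
        # Compare only the basename of the executable, ignoring its arguments.
        base = command.split()[0].rsplit("/", 1)[-1]
        if base in found:
            found[base].append(pid)
    return found


if __name__ == "__main__":
    sample = """\
  PID TTY      STAT   TIME COMMAND
 1365 ?        Sl     0:00 /opt/sge/bin/lx-amd64/sge_qmaster
 2190 ?        S      0:00 /opt/sge/bin/lx-amd64/sge_shadowd
 2329 pts/0    Ss     0:00  \\_ bash
"""
    for name, pids in find_sge_daemons(sample).items():
        print(name, pids if pids else "not running")
```

On the master host an empty list for `sge_execd` is not necessarily a problem; the same check would be more telling when run against the listing from each exec host.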

As for the port: it is open and in the LISTEN state. Does that mean there is no problem with the port?

root@sgemstr:~# netstat -nltp |grep 644
tcp        0      0 0.0.0.0:6444            0.0.0.0:*               LISTEN      1365/sge_qmaster

Finally, about the firewall: I never changed anything or added a rule, and this is the output:

root@sgemstr:~# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Doesn't that mean I have an empty iptables ruleset?

Many regards.


On Wednesday, October 29, 2014 9:01 PM, Reuti <re...@staff.uni-marburg.de> wrote:

Please keep the list posted.

On 29.10.2014 at 18:47, Disny Disny wrote:

> Hello Reuti
> this is the output of qhost and qstat -f, but I don't know what it means,
> so I'm hoping you can help.
>
> Kind regards.
>
> root@sgemstr:~# qhost
> HOSTNAME   ARCH      NCPU NSOC NCOR NTHR NLOAD MEMTOT MEMUSE SWAPTO SWAPUS
> --------------------------------------------------------------------------
> global     -            -    -    -    -     -      -      -      -      -
> gcl1       lx-amd64     4    1    4    4     -   3.8G      -   6.7G      -
> gcl2       lx-amd64     4    1    4    4     -   3.7G      -   3.8G      -
> gcl3       lx-amd64     4    1    4    4     -   1.9G      -   6.7G      -
> shdwgcl4   lx-amd64     4    1    4    4     -   3.8G      -   3.8G      -
> root@sgemstr:~# qstat -f
> queuename          qtype resv/used/tot. np_load  arch      states
> ---------------------------------------------------------------------------------
> all.q@gcl1         BIP   0/0/4          -NA-     lx-amd64  au
> ---------------------------------------------------------------------------------
> all.q@gcl2         BIP   0/0/4          -NA-     lx-amd64  au
> ---------------------------------------------------------------------------------
> all.q@gcl3         BIP   0/0/4          -NA-     lx-amd64  au
> ---------------------------------------------------------------------------------
> all.q@shdwgcl4     BIP   0/0/4          -NA-     lx-amd64  au

This looks like there is no communication between the qmaster and the execds. Does the output of:

$ ps -e f

show `sgemaster` and `sgexecd`, respectively, running on the systems?

Do you have a firewall in place? Maybe ports 6444 and 6445 need to be opened.

-- Reuti

>
> ############################################################################
>  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> ############################################################################
>       4 0.00000 Sleeper    root         qw    10/23/2014 09:20:09     1
> root@sgemstr:~#
>
>
> On Thursday, October 23, 2014 6:38 PM, Reuti <re...@staff.uni-marburg.de>
> wrote:
>
>
> Please check the state of the machines in `qhost` and `qstat -f`, i.e.
> whether the execd can be reached, which shows as suitable values being
> returned for the machines. - Reuti
>
> On 23.10.2014 at 17:35, Disny Disny wrote:
>
> > Yes, during the exec installation it added a startup script, but is there
> > another startup script I need to add manually?
> >
> >
> > From: Reuti <re...@staff.uni-marburg.de>;
> > To: Disny Disny <disny.wo...@yahoo.com>;
> > Cc: grid Engine Mailing List <users@gridengine.org>;
> > Subject: Re: Queue instances dropped
> > Sent: Thu, Oct 23, 2014 3:29:58 PM
> >
> > On 23.10.2014 at 17:23, Disny Disny wrote:
> >
> > > I have a problem with SGE. After installing the cluster everything
> > > worked fine, but after I shut down the PCs, started them again at a
> > > later time and tried to submit a job, I got this message:
> > >
> > > queue instance "all.q@gcl2" dropped because it is temporarily not available
> > > queue instance "all.q@gcl3" dropped because it is temporarily not available
> > > queue instance "all.q@shdwgcl4" dropped because it is temporarily not available
> > > queue instance "all.q@gcl1" dropped because it is temporarily not available
> > > All queues dropped because of overload or full.
> > >
> > > I appreciate any help.
> >
> > Are the execd's running on the nodes - maybe they need to be added to your
> > startup mechanism to do it automatically in case you shut down the machines?
> >
> > -- Reuti
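
A LISTEN line in `netstat` only shows that the qmaster is bound to the port locally; it does not show that the other machines can reach it. A minimal sketch, assuming plain TCP reachability is what is in question, of how the ports mentioned above could be probed from another host. The function name `port_reachable` is hypothetical. The host names are taken from the `qhost` output, 6444 is the port `sge_qmaster` listens on according to the `netstat` output, and 6445 is assumed here to be the port of the execd on the exec hosts.

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Run on an exec host: can the qmaster be reached?
    print("sgemstr:6444", port_reachable("sgemstr", 6444))
    # Run on the master: can each execd be reached? (6445 is an assumption.)
    for node in ("gcl1", "gcl2", "gcl3", "shdwgcl4"):
        print(f"{node}:6445", port_reachable(node, 6445))
```

If the check from an exec host to `sgemstr:6444` fails while `netstat` on the master shows LISTEN, a firewall or a name resolution problem between the machines would be the next thing to look at; if the checks to port 6445 fail, the execd is probably not running on that node.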