Sir, the same virt-manager is being used on all the PCs. No, I did not configure Open MPI with --enable-heterogeneous. Yes, the Open MPI version is the same on all nodes, since they are all installed from the same kickstart file. OK... actually, sir, Rocks itself installed and configured Open MPI and MPICH on its own through the HPC roll.
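For what it is worth, a quick way to verify both points directly, rather than relying on the kickstart file, is to ask ompi_info on each node. This is only a sketch: it assumes passwordless ssh between the nodes and that the machinefile mf used in the commands below lists one compute-node IP per line (the exact label printed for heterogeneous support can differ slightly between Open MPI versions):

  shell$ ompi_info | grep -E 'Open MPI:|Heterogeneous support'
  shell$ while read node rest; do
             echo "== $node =="
             ssh $node "which mpirun; ompi_info | grep 'Open MPI:'"
         done < mf

If every node reports the same version and the same mpirun path, the "same kickstart" assumption is confirmed; "Heterogeneous support: no" means the build was not configured with --enable-heterogeneous.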
On Fri, Apr 4, 2014 at 9:25 AM, Ralph Castain <r...@open-mpi.org> wrote:
>
> On Apr 3, 2014, at 8:03 PM, Nisha Dhankher -M.Tech(CSE) <nishadhankher-coaese...@pau.edu> wrote:
>
> Thank you, Ralph. Yes, the cluster is heterogeneous...
>
> And did you configure OMPI --enable-heterogeneous? And are you running it
> with --hetero-nodes? What version of OMPI are you using anyway?
>
> Note that we don't care if the host PCs are hetero - what we care about
> is the VM. If all the VMs are the same, then it shouldn't matter. However,
> most VM technologies don't handle hetero hardware very well - i.e., you
> can't emulate an x86 architecture on top of a Sparc or Power chip or vice
> versa.
>
> And I haven't made compute nodes directly on the physical nodes (PCs),
> because in college it is not possible to take the whole lab of 32 PCs for
> my work, so I ran on VMs.
>
> Yes, but at least it would let you test the setup to run MPI across even a
> couple of PCs - this is simple debugging practice.
>
> In a Rocks cluster the frontend gives the same kickstart to all the PCs,
> so the Open MPI version should be the same, I guess.
>
> Guess? or know? Makes a difference - might be worth testing.
>
> Sir, mpiformatdb is a command to distribute database fragments to
> different compute nodes after partitioning of the database. And sir, have
> you run mpiblast?
>
> Nope - but that isn't the issue, is it? The issue is with the MPI setup.
>
> On Fri, Apr 4, 2014 at 4:48 AM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> What is "mpiformatdb"? We don't have an MPI database in our system, and I
>> have no idea what that command means.
>>
>> As for that error - it means that the identifier we exchange between
>> processes is failing to be recognized. This could mean a couple of things:
>>
>> 1. the OMPI version on the two ends is different - could be you aren't
>> getting the right paths set on the various machines
>>
>> 2. the cluster is heterogeneous
>>
>> You say you have "virtual nodes" running on various PCs? That would be
>> an unusual setup - VMs can be problematic given the way they handle TCP
>> connections, so that might be another source of the problem if my
>> understanding of your setup is correct. Have you tried running this across
>> the PCs directly - i.e., without any VMs?
>>
>> On Apr 3, 2014, at 10:13 AM, Nisha Dhankher -M.Tech(CSE) <nishadhankher-coaese...@pau.edu> wrote:
>>
>> I first formatted my database with the mpiformatdb command, then I ran:
>>
>>   mpirun -np 64 -machinefile mf mpiblast -d all.fas -p blastn -i query.fas -o output.txt
>>
>> It then gave error 113 from some hosts and continued to run on the
>> others, but with no results even after 2 hours had elapsed... This is on
>> a Rocks 6.0 cluster with 12 virtual nodes on the PCs (2 on each, created
>> with virt-manager, 1 GB of RAM each).
>>
>> On Thu, Apr 3, 2014 at 10:41 PM, Nisha Dhankher -M.Tech(CSE) <nishadhankher-coaese...@pau.edu> wrote:
>>
>>> I also made a machinefile containing the IP addresses of all the compute
>>> nodes, plus a .ncbirc file with the paths to mpiblast and to the shared
>>> and local storage. Sir, I ran the same mpirun command on my college
>>> supercomputer (8 nodes, each with 24 processors), but it just kept
>>> running and gave no result even after 3 hours...
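One way to act on the suggestion above of testing the MPI setup separately from mpiblast is a minimal test program run over the same machinefile. This is only a sketch, assuming mpicc from the same Open MPI installation is on the PATH of every node:

  /* hello.c - each rank calls MPI_Init and reports which host it runs on */
  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, size, len;
      char name[MPI_MAX_PROCESSOR_NAME];

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      MPI_Get_processor_name(name, &len);
      printf("rank %d of %d on %s\n", rank, size, name);
      MPI_Finalize();
      return 0;
  }

  shell$ mpicc hello.c -o hello
  shell$ mpirun -np 12 -machinefile mf ./hello

If even this fails with error 113 on the same hosts, the problem is in the network/VM setup rather than in mpiblast.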
>>> On Thu, Apr 3, 2014 at 10:39 PM, Nisha Dhankher -M.Tech(CSE) <nishadhankher-coaese...@pau.edu> wrote:
>>>
>>>> I first formatted my database with the mpiformatdb command, then I ran:
>>>>
>>>>   mpirun -np 64 -machinefile mf mpiblast -d all.fas -p blastn -i query.fas -o output.txt
>>>>
>>>> It then gave error 113 from some hosts and continued to run on the
>>>> others, but with no results even after 2 hours had elapsed... This is
>>>> on a Rocks 6.0 cluster with 12 virtual nodes on the PCs (2 on each,
>>>> created with virt-manager, 1 GB of RAM each).
>>>>
>>>> On Thu, Apr 3, 2014 at 8:37 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>
>>>>> I'm having trouble understanding your note, so perhaps I am getting
>>>>> this wrong. Let's see if I can figure out what you said:
>>>>>
>>>>> * your perl command fails with "no route to host" - but I don't see
>>>>> any host in your cmd. Maybe I'm just missing something.
>>>>>
>>>>> * you tried running a couple of "mpirun", but the mpirun command
>>>>> wasn't recognized? Is that correct?
>>>>>
>>>>> * you then ran mpiblast and it sounds like it successfully started the
>>>>> processes, but then one aborted? Was there an error message beyond just
>>>>> the -1 return status?
>>>>>
>>>>> On Apr 2, 2014, at 11:17 PM, Nisha Dhankher -M.Tech(CSE) <nishadhankher-coaese...@pau.edu> wrote:
>>>>>
>>>>> error btl_tcp_endpoint.c:638 connection failed due to error 113
>>>>> <http://biosupport.se/questions/696/error-btl_tcp_endpintc-638-connection-failed-due-to-error-113>
>>>>>
>>>>> In Open MPI, this error came when I ran my mpiblast program on the
>>>>> Rocks cluster. Connecting to the hosts at 10.1.255.236 and 10.1.255.244
>>>>> failed. And when I run the following command
>>>>>
>>>>>   linux_shell$ perl -e 'die$!=113'
>>>>>
>>>>> this message comes back: "No route to host at -e line 1."
>>>>>
>>>>>   shell$ mpirun --mca btl ^tcp
>>>>>   shell$ mpirun --mca btl_tcp_if_include eth1,eth2
>>>>>   shell$ mpirun --mca btl_tcp_if_include 10.1.255.244
>>>>>
>>>>> were also executed, but these commands were not recognized and
>>>>> aborted. What should I do? When I ran my mpiblast program for the first
>>>>> time it gave an MPI_ABORT error... bailing out of signal -1 on rank 2...
>>>>> then I removed my public Ethernet cable, and after that it gave the
>>>>> btl_tcp_endpoint error 113...
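Since errno 113 on Linux is EHOSTUNREACH ("No route to host"), the first things to check are whether every machinefile entry is actually reachable over the cluster's private network, and whether the TCP BTL is being pointed at that network. The commands below are only a sketch: 10.1.0.0/16 is the usual Rocks private subnet, but check ifconfig on the nodes and adjust (older Open MPI versions may only accept an interface name such as eth0 for btl_tcp_if_include). Note also that mpirun needs the program to launch on the same command line, which is probably why the bare "mpirun --mca ..." attempts above appeared to be not recognized and aborted.

  shell$ for ip in $(awk '{print $1}' mf); do
             ping -c 1 -W 1 $ip > /dev/null && echo "$ip reachable" || echo "$ip UNREACHABLE"
         done

  shell$ mpirun -np 64 -machinefile mf \
             --mca btl_tcp_if_include 10.1.0.0/16 \
             mpiblast -d all.fas -p blastn -i query.fas -o output.txt

If some hosts stay unreachable, look at iptables and at the libvirt network configuration that virt-manager set up on the host PCs: a NAT-ed virtual network will not let VMs on different PCs reach each other directly, which typically produces exactly this "no route to host" pattern.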