On 04.04.2014, at 05:55, Ralph Castain wrote:

> On Apr 3, 2014, at 8:03 PM, Nisha Dhankher -M.Tech(CSE) 
> <nishadhankher-coaese...@pau.edu> wrote:
> 
>> Thank you, Ralph.
>> Yes, the cluster is heterogeneous...
> 
> And did you configure OMPI with --enable-heterogeneous? And are you running it 
> with --hetero-nodes? What version of OMPI are you using anyway?
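> 
> For reference, a heterogeneous setup would look roughly like this (just a 
> sketch - adjust the installation prefix and substitute your actual job):
> 
>   ./configure --prefix=/opt/openmpi --enable-heterogeneous
>   make all install
>   mpirun --hetero-nodes -np 64 -machinefile mf ./your_app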
> 
> Note that we don't care if the host pc's are hetero - what we care about is 
> the VM. If all the VMs are the same, then it shouldn't matter. However, most 
> VM technologies don't handle hetero hardware very well - i.e., you can't 
> emulate an x86 architecture on top of a Sparc or Power chip or vice versa.

Well - you have to emulate the CPU. There were products running a virtual x86 
PC on a Mac with a PowerPC chip. And IBM has a product called PowerVM Lx86 to 
run software compiled for Linux x86 directly on a PowerLinux machine.

-- Reuti


>> And I haven't set up the compute nodes directly on the physical machines 
>> (PCs), because in college it is not possible to take the whole lab of 32 PCs 
>> for your own work, so I ran them in VMs.
> 
> Yes, but at least it would let you test the setup to run MPI across even a 
> couple of PCs - this is simple debugging practice.
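> 
> Something as simple as this would already tell you whether basic MPI startup 
> across two machines works (a sketch - replace pc01/pc02 with two of your lab 
> PCs):
> 
>   echo "pc01 slots=1" >  mf_test
>   echo "pc02 slots=1" >> mf_test
>   mpirun -np 2 -machinefile mf_test hostname
> 
> If both hostnames come back, the launch and wire-up side is fine.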
> 
>> In a Rocks cluster the frontend gives the same kickstart to all the PCs, so 
>> the Open MPI version should be the same, I guess.
> 
> Guess? or know? Makes a difference - might be worth testing.
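> 
> A quick way to check (a sketch - it assumes passwordless ssh and a machinefile 
> that simply lists one hostname or IP per line):
> 
>   for h in $(cat mf); do
>     ssh $h "hostname; which mpirun; mpirun --version | head -1"
>   done
> 
> Every node should report the same path and the same Open MPI version.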
> 
>> Sir, 
>> mpiformatdb is a command to distribute database fragments to the different 
>> compute nodes after partitioning the database (roughly as in the sketch below).
>> And sir, have you used mpiblast?
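>> 
>> For reference, the formatting step was roughly like this (a sketch - the 
>> exact flags and fragment count depend on the mpiBLAST version and database):
>> 
>>   mpiformatdb -N 64 -i all.fas -p F
>> 
>> which partitions all.fas into 64 nucleotide fragments before the mpirun step.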
> 
> Nope - but that isn't the issue, is it? The issue is with the MPI setup.
> 
>> 
>> 
>> On Fri, Apr 4, 2014 at 4:48 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> What is "mpiformatdb"? We don't have an MPI database in our system, and I 
>> have no idea what that command means.
>> 
>> As for that error - it means that the identifier we exchange between 
>> processes is failing to be recognized. This could mean a couple of things:
>> 
>> 1. the OMPI version on the two ends is different - could be you aren't 
>> getting the right paths set on the various machines
>> 
>> 2. the cluster is heterogeneous
>> 
>> You say you have "virtual nodes" running on various PC's? That would be an 
>> unusual setup - VM's can be problematic given the way they handle TCP 
>> connections, so that might be another source of the problem if my 
>> understanding of your setup is correct. Have you tried running this across 
>> the PCs directly - i.e., without any VMs?
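>> 
>> Since error 113 is "no route to host", it is also worth verifying plain 
>> TCP/IP reachability between the VMs before involving MPI at all - for example 
>> (a sketch, assuming the machinefile lists one host per line):
>> 
>>   for h in $(cat mf); do
>>     ping -c 1 -W 2 $h > /dev/null && echo "$h reachable" || echo "$h UNREACHABLE"
>>   done
>> 
>> Every VM has to be able to reach every other VM on the interface MPI uses.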
>> 
>> 
>> On Apr 3, 2014, at 10:13 AM, Nisha Dhankher -M.Tech(CSE) 
>> <nishadhankher-coaese...@pau.edu> wrote:
>> 
>>> I first formatted my database with the mpiformatdb command, then I ran:
>>> mpirun -np 64 -machinefile mf mpiblast -d all.fas -p blastn -i query.fas -o 
>>> output.txt
>>> It then gave this error 113 from some hosts and continued to run on the 
>>> others, but produced no results even after 2 hours had passed. This is on a 
>>> Rocks 6.0 cluster with 12 virtual nodes on PCs (2 on each, created with 
>>> virt-manager, 1 GB RAM each).
>>> 
>>> 
>>> On Thu, Apr 3, 2014 at 10:41 PM, Nisha Dhankher -M.Tech(CSE) 
>>> <nishadhankher-coaese...@pau.edu> wrote:
>>> I also made a machine file which contains the IP addresses of all compute 
>>> nodes, plus a .ncbirc file with the path to mpiblast and the shared and 
>>> local storage paths.
>>> Sir,
>>> I ran the same mpirun command on my college supercomputer (8 nodes, each 
>>> with 24 processors), but it just keeps running and gave no result even 
>>> after 3 hours...
>>> 
>>> 
>>> On Thu, Apr 3, 2014 at 8:37 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>> I'm having trouble understanding your note, so perhaps I am getting this 
>>> wrong. Let's see if I can figure out what you said:
>>> 
>>> * your perl command fails with "no route to host" - but I don't see any 
>>> host in your cmd. Maybe I'm just missing something.
>>> 
>>> * you tried running a couple of "mpirun", but the mpirun command wasn't 
>>> recognized? Is that correct?
>>> 
>>> * you then ran mpiblast and it sounds like it successfully started the 
>>> processes, but then one aborted? Was there an error message beyond just the 
>>> -1 return status?
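>>> 
>>> One note in case it applies: the --mca settings are only parameters - mpirun 
>>> still needs an application to launch on the same command line, roughly like 
>>> this (a sketch - use whatever interface carries your 10.1.255.x cluster 
>>> network on the VMs):
>>> 
>>>   mpirun --mca btl tcp,sm,self --mca btl_tcp_if_include eth0 \
>>>       -np 64 -machinefile mf mpiblast -d all.fas -p blastn -i query.fas \
>>>       -o output.txt
>>> 
>>> Also note that btl_tcp_if_include expects interface names (or, in newer 
>>> versions, subnets in CIDR notation), not a bare host IP like 10.1.255.244.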
>>> 
>>> 
>>> On Apr 2, 2014, at 11:17 PM, Nisha Dhankher -M.Tech(CSE) 
>>> <nishadhankher-coaese...@pau.edu> wrote:
>>> 
>>>> error: btl_tcp_endpoint.c:638 connection failed due to error 113
>>>> 
>>>> In Open MPI, this error came up when I ran my mpiblast program on a Rocks 
>>>> cluster: connecting to the hosts 10.1.255.236 and 10.1.255.244 failed. When 
>>>> I run the command
>>>>   perl -e 'die$!=113'
>>>> the message is: "No route to host at -e line 1."
>>>> I also tried
>>>>   mpirun --mca btl ^tcp
>>>>   mpirun --mca btl_tcp_if_include eth1,eth2
>>>>   mpirun --mca btl_tcp_if_include 10.1.255.244
>>>> but it did not recognize these commands and aborted. What should I do? 
>>>> When I ran my mpiblast program for the first time it gave an MPI_ABORT 
>>>> error, bailing out on signal -1 on rank 2; then I removed my public 
>>>> Ethernet cable, and after that it gave the btl_tcp_endpoint error 113.
>>>> 