Re: [OMPI users] IB Memory Requirements, adjusting for reduced memory consumption

2012-01-12 Thread Nathan Hjelm
I would start by adjusting btl_openib_receive_queues. The default uses a per-peer QP, which can eat up a lot of memory. I recommend using no per-peer queues and several shared receive queues. We use S,4096,1024:S,12288,512:S,65536,512 -Nathan On Thu, 12 Jan 2012, V. Ram wrote: Open MPI IB Gurus, I
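As a rough illustration of the shared-receive-queue setting quoted above, the sketch below estimates how much buffer space that spec asks for. It assumes each colon-separated entry has the form type,buffer-size,buffer-count, and it ignores per-buffer headers and any per-peer multiplication. The helper name estimate_bytes is hypothetical and is not part of Open MPI.

```shell
# Receive-queue spec quoted in the message above.
queues='S,4096,1024:S,12288,512:S,65536,512'

# Hypothetical helper: sum buffer-size * buffer-count over all queues.
# Assumes the second and third fields are size in bytes and buffer count.
estimate_bytes() {
    printf '%s\n' "$1" | tr ':' '\n' | {
        total=0
        while IFS=',' read -r kind size count rest; do
            total=$((total + size * count))
        done
        echo "$total"
    }
}

bytes=$(estimate_bytes "$queues")
echo "approx. receive buffers: $((bytes / 1048576)) MiB"

# One way to apply the setting: MCA parameters can be given through the
# environment (or with "mpirun --mca btl_openib_receive_queues <spec>").
export OMPI_MCA_btl_openib_receive_queues="$queues"
```

This is only a back-of-the-envelope figure per process; actual registered memory depends on the Open MPI version and the other openib BTL parameters.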

[OMPI users] IB Memory Requirements, adjusting for reduced memory consumption

2012-01-12 Thread V. Ram
Open MPI IB Gurus, I have some slightly older InfiniBand-equipped nodes which have less RAM than we'd like, and on which we tend to run jobs that can span 16-32 nodes of this type. The jobs themselves tend to run on the heavy side in terms of their own memory requirements. When we used t

Re: [OMPI users] SIGSEGV on MPI_Test

2012-01-12 Thread devendra rai
Hello All, Continuing my previous mail, I thought attaching this debugger screenshot may help anyone come up with an explanation. The exact location where the segfault happens is also highlighted. Thanks a lot for any help. Best, Devendra From: devendra ra

[OMPI users] SIGSEGV on MPI_Test

2012-01-12 Thread devendra rai
Hello Community: I am running into a strange problem. I get a SIGSEGV when I try to execute MPI_Test: ==21076== Process terminating with default action of signal 11 (SIGSEGV) ==21076==  Bad permissions for mapped region at address 0x43AEE1 ==21076==    at 0x509B957: ompi_request_default_test (re

Re: [OMPI users] Strange TCP latency results on Amazon EC2

2012-01-12 Thread Roberto Rey
With ifconfig I can only see one Ethernet card (eth0) as well as the loopback interface. 2012/1/12 teng ma > Is it possible your EC2 cluster has another "unknown" crappy Ethernet > card (e.g. a 1Gb > Ethernet card)? For small messages, they go through different paths in > NPtcp or MPI over NPmpi. >

Re: [OMPI users] Strange TCP latency results on Amazon EC2

2012-01-12 Thread teng ma
Is it possible your EC2 cluster has another "unknown" crappy Ethernet card (e.g. a 1Gb Ethernet card)? For small messages, they go through different paths in NPtcp or MPI over NPmpi. Teng Ma On Thu, Jan 12, 2012 at 10:28 AM, Roberto Rey wrote: > Thanks for your reply! > > I'm using TCP BTL becaus

Re: [OMPI users] Strange TCP latency results on Amazon EC2

2012-01-12 Thread Roberto Rey
Thanks for your reply! I'm using TCP BTL because I don't have any other option in Amazon with 10 Gbit Ethernet. I also tried with MPICH2 1.4 and I got 60 microseconds...so I am very confused about it... Regarding hyperthreading and process binding settings...I am using only one MPI process in ea

Re: [OMPI users] Strange TCP latency results on Amazon EC2

2012-01-12 Thread Jeff Squyres
Hi Roberto. We've had strange reports of performance from EC2 before; it's actually been on my to-do list to go check this out in detail. I made contact with the EC2 folks at Supercomputing late last year. They've hooked me up with some credits on EC2 to go check out what's happening, but the

Re: [OMPI users] Strange TCP latency results on Amazon EC2

2012-01-12 Thread Roberto Rey
Hi again, Today I was trying another TCP benchmark included in the hpcbench suite, and with a ping-pong test I'm also getting 100us of latency. Then I tried netperf and got the same result. So, in summary, I'm measuring TCP latency with message sizes between 1-32 bytes: Netperf over TC

Re: [OMPI users] ompi + bash + GE + modules

2012-01-12 Thread Mark Suhovecky
Dave- I'm working with Univa support as well. I started out debugging this with a pretty poor grasp of where in the software flow the problem might be. Like most sysadmins, I belong to many community lists, and find them to be of tremendous help in running problems down. They certainly have been

Re: [OMPI users] ompi + bash + GE + modules

2012-01-12 Thread Dave Love
Surely this should be on the gridengine list -- and it's in recent archives -- but there's some ob-openmpi below. Can Notre Dame not get the support they've paid Univa for? Reuti writes: > SGE 6.2u5 can't handle multi line environment variables or functions, > it was fixed in 6.2u6 which isn't

[OMPI users] checkpointing on other transports

2012-01-12 Thread Dave Love
What would be involved in adding checkpointing to other transports, specifically the PSM MTL? Are there (likely to be?) technical obstacles, and would it be a lot of work if not? I'm asking in case it would be easy, and we don't have to exclude QLogic from a procurement, given they won't respond

Re: [OMPI users] Passwordless ssh

2012-01-12 Thread Reuti
On 12.01.2012 at 12:17, Shaandar Nyamtulga wrote: > Dear Reuti > > Then what should I do? I am a novice in ssh and OpenMPI. Can you direct me a little > bit further? I am quite confused. > Thank you $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys on the file server. -- Reuti > > > From: re...
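The command in the reply above can be wrapped so that it is safe to run more than once. This is a minimal sketch, assuming the public key already exists and that sshd requires the usual restrictive permissions on the directory and file. The function name add_key is hypothetical.

```shell
# Hypothetical helper: append a public key to authorized_keys only if
# that exact line is not already present, then tighten permissions.
add_key() {
    pub=$1        # path to the public key, e.g. ~/.ssh/id_rsa.pub
    ssh_dir=$2    # target directory, e.g. ~/.ssh
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -qxF -e "$(cat "$pub")" "$ssh_dir/authorized_keys"; then
        cat "$pub" >> "$ssh_dir/authorized_keys"
    fi
    chmod 600 "$ssh_dir/authorized_keys"
}

# Usage on the file server (not run here):
#   add_key ~/.ssh/id_rsa.pub ~/.ssh
```

Running it twice leaves a single copy of the key, unlike a bare ">>" append.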

Re: [OMPI users] Passwordless ssh

2012-01-12 Thread Shaandar Nyamtulga
Dear Reuti Then what should I do? I am a novice in ssh and OpenMPI. Can you direct me a little bit further? I am quite confused. Thank you > From: re...@staff.uni-marburg.de > Date: Wed, 11 Jan 2012 12:31:07 +0100 > To: us...@open-mpi.org > Subject: Re: [OMPI users] Passwordless ssh > > Hi, > > A