On Mon, Dec 5, 2011 at 16:12, Ralph Castain wrote:
> Sounds like we should be setting this value when starting the process - yes?
> If so, what is the "good" value, and how do we compute it?
I've also been looking at this for the past few days. What I came
up with is a small script psm_shctx
-
Arnaud HERITIER
Meteo France International
+33 561432940
arnaud.herit...@mfi.fr
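
Regarding the question above about what a "good" value is and how to compute it, here is a minimal sketch of one possible approach: divide the node's hardware contexts evenly among the jobs expected to share the node. This is not the psm_shctx script mentioned above, whose contents are not shown here. The environment variable name PSM_SHAREDCONTEXTS_MAX is an assumption (the excerpt never names the setting), and the figures of 16 contexts and 2 jobs per node are purely illustrative.

```python
def contexts_per_job(hw_contexts: int, jobs_per_node: int) -> int:
    """Split a node's hardware contexts evenly across concurrent jobs.

    Both arguments are hypothetical inputs: the real context count
    depends on the adapter, and the job count on the scheduler setup.
    """
    if hw_contexts < 1 or jobs_per_node < 1:
        raise ValueError("both counts must be at least 1")
    # Integer division, but never hand a job fewer than one context.
    return max(1, hw_contexts // jobs_per_node)


def psm_env(hw_contexts: int, jobs_per_node: int) -> dict:
    """Build the environment entry to export before launching a job.

    ASSUMPTION: PSM_SHAREDCONTEXTS_MAX is the variable that caps the
    contexts a single job may take; the thread excerpt does not say so.
    """
    limit = contexts_per_job(hw_contexts, jobs_per_node)
    return {"PSM_SHAREDCONTEXTS_MAX": str(limit)}


if __name__ == "__main__":
    # Illustrative numbers only: 16 contexts shared by 2 jobs per node.
    for name, value in psm_env(16, 2).items():
        print(f"export {name}={value}")
```

The printed line could be sourced in the job script before mpirun is called, so that the first job on a node does not take every context and leave none for the second.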
--
On Mon, Dec 5, 2011 at 6:12 PM, Ralph Castain wrote:
>
> On Dec 5, 2011, at 5:49 AM, arnaud Heritier wrote
On Dec 5, 2011, at 5:49 AM, arnaud Heritier wrote:
> Hello,
>
> I found the solution, thanks to Qlogic support.
>
> The "can't open /dev/ipath, network down (err=26)" message from the ipath
> driver is really misleading.
>
> Actually, this is a hardware context problem on the Qlogic PSM. PSM
Hello,
I found the solution, thanks to Qlogic support.
The "can't open /dev/ipath, network down (err=26)" message from the ipath
driver is really misleading.
Actually, this is a hardware context problem on the Qlogic PSM. PSM can't
allocate any hardware context for the job because other(s) MPI
On Nov 28, 2011, at 11:53 PM, arnaud Heritier wrote:
> I do have a contract and I tried to open a case, but their support is ..
What happens if you put a delay between the two jobs? E.g., if you just delay
a few seconds before the 2nd job starts? Perhaps the ipath device just needs a
litt
I do have a contract and I tried to open a case, but their support is ...
Anyway, I'm still working on the strange error message from mpirun
saying it can't allocate memory when at the same time it also reports that
the memory is unlimited ...
Arnaud
On Tue, Nov 29, 2011 at 4:23 AM, Jeff Squyre
I'm afraid we don't have any contacts left at QLogic to ask them any more... do
you have a support contract, perchance?
On Nov 27, 2011, at 3:11 PM, Arnaud Heritier wrote:
> Hello,
>
> I ran into a strange problem with QLogic OFED and Open MPI. When I submit
> (through SGE) 2 jobs on the same no
Hello,
I ran into a strange problem with QLogic OFED and Open MPI. When I submit
(through SGE) 2 jobs on the same node, the second job ends up with:
(ipath/PSM)[10292]: can't open /dev/ipath, network down (err=26)
I'm pretty sure the InfiniBand fabric is working well, since the other job runs fine.
Here is