On 02/07/13 01:05, Siegmar Gross wrote:
thank you very much for your patch. I have applied the patch to openmpi-1.6.4rc4.
Open MPI: 1.6.4rc4r28022
: [B .][. .] (slot list 0:0)
: [. B][. .] (slot list 0:1)
: [B B][. .] (slot list 0:0-1)
: [. .][B .] (slot list 1:0)
: [. .][. B] (slot list 1:1)
: [. .][B B] (slot list 1:0-1)
: [B B][B B] (slot list 0:0-1,1:0-1)
That looks great. I'll file a CMR to get this patch into 1.6. Unless you indicate otherwise, I'll assume this issue is understood for 1.6.
I get the following output for an unpatched openmpi-1.9.
Open MPI: 1.9a1r28035
: [B/.][./.]
: [B/B][./.]
: [B/B][./.]
: [./.][B/B]
: [./.][./B]
: [./.][B/B]
: [B/B][./.]
Right. There is something else going on for 1.9. I think OMPI 1.9 is corrupting the binding strings. In my case, I said "0:1" and the internal string was "0,1". So, although I should have been bound to only one core (0:1), OMPI was trying to bind to two of them (0,1).
I'm still waiting for a response to my other e-mail, where I asked for hints on where to find the problem in the source code.
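To make the "0:1" versus "0,1" difference concrete, here is a minimal Python sketch. It is not Open MPI's parser. The function names (`parse_slot_list`, `expand_range`, `render`) are hypothetical, the 2-socket by 2-core layout is taken from the output above, and treating an entry without a colon as a flat core index is an assumption made only for this illustration.

```python
def expand_range(text):
    """Expand "1" to [1] and "0-1" to [0, 1]."""
    if "-" in text:
        low, high = text.split("-", 1)
        return list(range(int(low), int(high) + 1))
    return [int(text)]


def parse_slot_list(spec, cores_per_socket=2):
    """Return the sorted (socket, core) pairs named by a slot-list string.

    "socket:cores" entries are separated by commas, as in "0:0-1,1:0-1".
    An entry without a colon is treated as flat core indices; this is an
    assumption for illustration, not confirmed Open MPI behaviour.
    """
    bound = set()
    for entry in spec.split(","):
        if ":" in entry:
            socket_part, core_part = entry.split(":", 1)
            for core in expand_range(core_part):
                bound.add((int(socket_part), core))
        else:
            for flat in expand_range(entry):
                # Map a flat core index onto (socket, core within socket).
                bound.add(divmod(flat, cores_per_socket))
    return sorted(bound)


def render(bound, sockets=2, cores_per_socket=2):
    """Draw the binding in the 1.6-style "[B .][. .]" notation."""
    return "".join(
        "[" + " ".join("B" if (s, c) in bound else "."
                       for c in range(cores_per_socket)) + "]"
        for s in range(sockets)
    )


print(render(parse_slot_list("0:1")))  # [. B][. .]  one core
print(render(parse_slot_list("0,1")))  # [B B][. .]  two cores
```

Under these assumptions, replacing the colon with a comma turns a one-core binding into a two-core binding, which is the kind of change described above for the "0:1" case.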