Patch is built and under review...
Thanks again
Ralph
On Dec 2, 2009, at 5:37 PM, Nicolas Bock wrote:
> Thanks
>
> On Wed, Dec 2, 2009 at 17:04, Ralph Castain wrote:
> Yeah, that's the one all right! Definitely missing from 1.3.x.
>
> Thanks - I'll build a patch for the next bug-fix release
>
Oh bugger, I did miss the obvious.
The "old" code which I had ifdef'd out contained an actual
construction of the list itself.
OBJ_CONSTRUCT(&opal_if_list, opal_list_t);
If I make sure I do one of those, I now get a different
set of messages but we are back to running again.
mpirun -v -
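For anyone following along, the pattern being described is roughly the one sketched below: the opal_if_list object has to be constructed before anything is appended to it. This is only a minimal sketch; the header paths follow the usual Open MPI source layout, and it assumes opal_if_t is set up as an OPAL class with an embedded opal_list_item_t named super, as the other messages in this thread suggest.

#include "opal/class/opal_list.h"
#include "opal/util/if.h"

static opal_list_t opal_if_list;

static void build_if_list_sketch(void)
{
    /* construct the (empty) list first -- skipping this step is what
     * the message above identifies as the cause of the crash */
    OBJ_CONSTRUCT(&opal_if_list, opal_list_t);

    /* allocate one interface entry and append it to the list via its
     * embedded list item (assumed to be the 'super' member) */
    opal_if_t *intf_ptr = OBJ_NEW(opal_if_t);
    opal_list_append(&opal_if_list, &intf_ptr->super);
}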
> I would be leery of the hard-coded stuff.
Indeed, so I changed it to:
intf.if_mask = prefix( sin_addr->sin_addr.s_addr);
which seems to match what the "old" code was doing: still blowing
up though.
> Reason: the IPv6 code has been a continual source of trouble,
> while the IPv4 code has wor
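In this context prefix() is presumably a helper that converts an IPv4 netmask into a prefix length by counting its leading one bits. A standalone sketch of that idea is below; it is written to take a netmask (not an interface address) in network byte order, and it is not the actual helper from opal/util/if.c.

#include <stdint.h>
#include <arpa/inet.h>

/* e.g. a netmask of 255.255.255.0 yields a prefix length of 24 */
static uint32_t prefix_len(uint32_t netmask_be)
{
    uint32_t mask = ntohl(netmask_be);   /* to host byte order */
    uint32_t len = 0;

    while (mask & 0x80000000u) {         /* count leading 1 bits */
        len++;
        mask <<= 1;
    }
    return len;
}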
I would be leery of the hard-coded stuff. Reason: the IPv6 code has been a
continual source of trouble, while the IPv4 code has worked quite well.
Could be a lot of reasons, especially the fact that the IPv6 code is hardly
exercised by the devel team...so changes that cause problems are rarely
Thanks
On Wed, Dec 2, 2009 at 17:04, Ralph Castain wrote:
> Yeah, that's the one all right! Definitely missing from 1.3.x.
>
> Thanks - I'll build a patch for the next bug-fix release
>
>
> On Dec 2, 2009, at 4:37 PM, Abhishek Kulkarni wrote:
>
> > On Wed, Dec 2, 2009 at 5:00 PM, Ralph Castain
> I believe this line is incorrect:
>
>> opal_list_append(&opal_if_list, (opal_list_item_t*) intf_ptr);
>
> It needs to be
>
> opal_list_append(&opal_if_list, &intf_ptr->super);
Didn't seem to change things.
Any thoughts on the:
/*
* hardcoded netmask, adri
Yeah, that's the one all right! Definitely missing from 1.3.x.
Thanks - I'll build a patch for the next bug-fix release
On Dec 2, 2009, at 4:37 PM, Abhishek Kulkarni wrote:
> On Wed, Dec 2, 2009 at 5:00 PM, Ralph Castain wrote:
>> Indeed - that is very helpful! Thanks!
>> Looks like we aren't
I believe this line is incorrect:
> opal_list_append(&opal_if_list, (opal_list_item_t*) intf_ptr);
It needs to be
opal_list_append(&opal_if_list, &intf_ptr->super);
On Dec 2, 2009, at 4:46 PM, kevin.buck...@ecs.vuw.ac.nz wrote:
>> I have actually already taken the IPv6 block and si
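The reason the &intf_ptr->super form is preferred: casting the containing struct to opal_list_item_t* only happens to work when the embedded list item is the first member, whereas taking the address of the embedded member is correct regardless of the struct layout. A generic sketch (the container type here is hypothetical):

#include "opal/class/opal_list.h"

/* hypothetical container; only the embedded list item matters */
struct my_item_t {
    opal_list_item_t super;   /* embedded list item */
    int payload;
};

static void append_sketch(opal_list_t *list, struct my_item_t *p)
{
    /* append via the embedded member rather than casting 'p' itself */
    opal_list_append(list, &p->super);
}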
> I have actually already taken the IPv6 block and simply tried to
> replace any IPv6 stuff with IPv4 "equivalents", eg:
At the risk of showing a lot of ignorance, here's the block I cobbled
together based on the IPv6 block.
I have tried to keep it looking as close to the original IPv6
block as p
On Wed, Dec 2, 2009 at 5:00 PM, Ralph Castain wrote:
> Indeed - that is very helpful! Thanks!
> Looks like we aren't cleaning up high enough - missing the directory level.
> I seem to recall seeing that error go by and that someone fixed it on our
> devel trunk, so this is likely a repair that did
Currently, I am in the process of converting an MPMD program of mine from
LAM to OpenMPI. The old LAM setup used an application schema to handle the
launching of the server and remote processes on all the nodes in the
cluster; however, I have run into an issue due to the difference in how
mpirun w
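For reference, Open MPI's mpirun handles MPMD launches either with colon-separated command blocks or with an appfile passed via --app, which plays roughly the role a LAM application schema did. The executable and host names below are purely illustrative:

mpirun -np 1 ./server : -np 8 ./client

# or, using an appfile (e.g. server_client.app) containing:
#   -np 1 -host node0 ./server
#   -np 8 ./client
mpirun --app server_client.app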
Indeed - that is very helpful! Thanks!
Looks like we aren't cleaning up high enough - missing the directory level. I
seem to recall seeing that error go by and that someone fixed it on our devel
trunk, so this is likely a repair that didn't get moved over to the release
branch as it should have
> Given that it is working for us at the moment, and my current
> priorities, I doubt I'll get to this over the next 2-3 weeks.
> So if you have time and care to look at it before then, please
> do!
I have actually already taken the IPv6 block and simply tried to
replace any IPv6 stuff with IPv4
On Wed, Dec 2, 2009 at 14:23, Ralph Castain wrote:
> Hmm... if you are willing to keep trying, could you perhaps let it run for
> a brief time, ctrl-z it, and then do an ls on a directory from a process
> that has already terminated? The pids will be in order, so just look for an
> early number (
Hmm... if you are willing to keep trying, could you perhaps let it run for a
brief time, ctrl-z it, and then do an ls on a directory from a process that has
already terminated? The pids will be in order, so just look for an early number
(not mpirun or the parent, of course).
It would help if yo
On Wed, Dec 2, 2009 at 12:12, Ralph Castain wrote:
>
> On Dec 2, 2009, at 10:24 AM, Nicolas Bock wrote:
>
>
>
> On Tue, Dec 1, 2009 at 20:58, Nicolas Bock wrote:
>
>>
>>
>> On Tue, Dec 1, 2009 at 18:03, Ralph Castain wrote:
>>
>>> You may want to check your limits as defined by the shell/system
boost.MPI is probably your best bet. They export some nice C++ functionality
through MPI.
On Dec 2, 2009, at 2:37 PM, Ivan Marin wrote:
> Hello all,
>
> I'm developing a groundwater simulation application that will use openmpi to
> distribute the data and solve a linear system. The problem i
Hello all,
I'm developing a groundwater simulation application that will use openmpi
to distribute the data and solve a linear system. The problem is that my
primary data structure is composed of a base class and derived classes, and
they are inserted in a boost ptr_vector, as they are of differe
On Dec 2, 2009, at 10:24 AM, Nicolas Bock wrote:
>
>
> On Tue, Dec 1, 2009 at 20:58, Nicolas Bock wrote:
>
>
> On Tue, Dec 1, 2009 at 18:03, Ralph Castain wrote:
> You may want to check your limits as defined by the shell/system. I can also
> run this for as long as I'm willing to let it r
On Dec 1, 2009, at 11:15 AM, Ashley Pittman wrote:
On Tue, 2009-12-01 at 10:46 -0500, Brock Palen wrote:
The attached code is an example where openmpi/1.3.2 will lock up if
run on 48 cores over IB (4 cores per node).
The code loops over recv from all processors on rank 0 and sends from
all oth
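The attached reproducer itself is Fortran 90 and is not reproduced here; purely as an illustration, the communication pattern being described (rank 0 receiving from every other rank while the rest send) looks roughly like this in C:

#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, i, buf = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        /* rank 0 loops over receives from every other rank */
        for (i = 1; i < size; i++) {
            MPI_Recv(&buf, 1, MPI_INT, i, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }
    } else {
        /* every other rank sends to rank 0 */
        buf = rank;
        MPI_Send(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}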
On Tue, Dec 1, 2009 at 20:58, Nicolas Bock wrote:
>
>
> On Tue, Dec 1, 2009 at 18:03, Ralph Castain wrote:
>
>> You may want to check your limits as defined by the shell/system. I can
>> also run this for as long as I'm willing to let it run, so something else
>> appears to be going on.
>>
>>
>>
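Checking the shell limits mentioned above is quick; for example, under bash:

ulimit -a    # show all current limits
ulimit -n    # max open file descriptors
ulimit -u    # max user processes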
Hi Josh,
In case it helps, I am running 1.3.3 compiled as follows:
../configure --enable-ft-thread --with-ft=cr --enable-mpi-threads
--with-blcr=... --with-blcr-libdir=... --disable-openib-rdmacm --prefix=
I ran my application like this:
mpirun -am ft-enable-cr --hostfile host -np 2 ./a.out
Though I do not test this scenario (using hostfiles) very often, it
used to work. The ompi-restart command takes a --hostfile (or --
machinefile) argument that is passed directly to the mpirun command. I
wonder if something broke recently with this handoff. I can certainly
checkpoint with on
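Based on the above, restarting against a different hostfile would look something like the line below; the snapshot handle shown is only illustrative (use whatever ompi-checkpoint reported):

ompi-restart --hostfile other_hosts ompi_global_snapshot_1234.ckpt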
Sorry to jump into this late -- yes, opal/util/if.c is the exact place for this
stuff.
Ralph is exactly correct that this code has been touched by multiple people
over a few years, so it's possible that it's a little krufty. I certainly hope
it isn't working by accident -- but given the contex
Given that it is working for us at the moment, and my current priorities, I
doubt I'll get to this over the next 2-3 weeks. So if you have time and care to
look at it before then, please do!
Thanks
On Dec 1, 2009, at 8:45 PM, kevin.buck...@ecs.vuw.ac.nz wrote:
>> Interesting - especially since
Hi,
I am trying to use BLCR checkpointing in mpi. I am currently able to run
my application using some hostfile, checkpoint the run, and then restart
the application using the same hostfile. The thing I would like to do is
to restart the application with a different hostfile. But this leads to
The --preload-* options to 'mpirun' currently use the ssh/scp commands (or
rsh/rcp via an MCA parameter) to move files from the machine local to the
'mpirun' command to the compute nodes during launch. This assumes that you have
Open MPI already installed on all of the machines. It was an option
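As an illustration only (the exact option spellings in the --preload-* family should be checked against mpirun --help for the installed version), a launch that stages the binary and an input file might look like:

mpirun -np 4 --preload-binary --preload-files input.dat ./a.out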
I think the issue is that if you *don't* specifically use
pthread_attr_setstacksize the pthread library will (can?) give
each thread a stack of size equal to the stacksize rlimit.
You are correct - this is not specifically an Open MPI issue
although if it is Open MPI spawning the threads, maybe i
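An illustration of using pthread_attr_setstacksize as mentioned above: setting an explicit stack size on the thread attributes means the rlimit-derived default is not used. The 1 MiB figure is arbitrary:

#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg)
{
    (void)arg;
    return NULL;
}

int main(void)
{
    pthread_t tid;
    pthread_attr_t attr;

    /* request an explicit 1 MiB stack instead of the rlimit-derived
     * default the thread would otherwise inherit */
    pthread_attr_init(&attr);
    pthread_attr_setstacksize(&attr, 1024 * 1024);

    if (pthread_create(&tid, &attr, worker, NULL) != 0) {
        perror("pthread_create");
        return 1;
    }
    pthread_join(tid, NULL);
    pthread_attr_destroy(&attr);
    return 0;
}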
On Tue, 2009-12-01 at 05:47 -0800, Tim Prince wrote:
> amjad ali wrote:
> > Hi,
> > thanks T.Prince,
> >
> > Your saying:
> > "I'll just mention that we are well into the era of 3 levels of
> > programming parallelization: vectorization, threaded parallel (e.g.
> > OpenMP), and process parallel
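A compact illustration of those three levels together (MPI for process parallelism, OpenMP for threading, and an inner loop the compiler can vectorize); compile with an OpenMP flag such as -fopenmp:

#include <mpi.h>
#include <stdio.h>

#define N 1024

int main(int argc, char **argv)
{
    int provided, i;
    double a[N], b[N], c[N];

    /* level 1: process parallelism via MPI ranks */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    for (i = 0; i < N; i++) {
        a[i] = i;
        b[i] = 2.0 * i;
    }

    /* level 2: threaded parallelism via OpenMP */
    #pragma omp parallel for
    for (i = 0; i < N; i++) {
        /* level 3: a simple loop body the compiler can vectorize */
        c[i] = a[i] + b[i];
    }

    /* use the result so the loop is not optimized away */
    printf("c[0] = %f\n", c[0]);

    MPI_Finalize();
    return 0;
}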
John R. Cary wrote:
Jeff Squyres wrote:
(for the web archives)
Brock and I talked about this .f90 code a bit off list -- he's going
to investigate with the test author a bit more because both of us are
a bit confused by the F90 array syntax used.
Attached is a simple send/recv code writte
> PBS loves to read the nodes' list backwards.
> If you want to start with WN1,
> put it last on the Torque/PBS "nodes" file.
Nice to know. Thanks Gus for the tip!
Best Regards.
~Belaid.
>
> Gus Correa
> -
> Gustavo Correa
> Lam