Re: [OMPI users] my leak or OpenMPI's leak?

2010-10-18 Thread jody
I had this leak with OpenMPI 1.4.2.

But in my case there is no accumulation - when I repeat the same call,
no additional leak is reported for the second call.

Jody
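
To keep such known OpenMPI-internal reports out of valgrind's output
altogether, a suppressions file can help - a rough sketch, assuming the
stacks bottom out in the frames shown in the report quoted below (the
file name "ompi.supp" and the entry names are placeholders):

{
   ompi-free-list-startup
   Memcheck:Leak
   fun:malloc
   fun:ompi_free_list_grow
   ...
}
{
   ompi-init-time-allocations
   Memcheck:Leak
   ...
   fun:ompi_mpi_init
}

Run with: valgrind --leak-check=full --suppressions=ompi.supp ./app
valgrind can also generate matching entries via --gen-suppressions=all.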

On Mon, Oct 18, 2010 at 1:57 AM, Ralph Castain  wrote:
> There is no OMPI 2.5 - do you mean 1.5?
>
> On Oct 17, 2010, at 4:11 PM, Brian Budge wrote:
>
>> Hi Jody -
>>
>> I noticed this exact same thing the other day when I used OpenMPI v
>> 2.5 built with valgrind support.  I actually ran out of memory due to
>> this.  When I went back to v 2.43, my program worked fine.
>>
>> Are you also using 2.5?
>>
>>  Brian
>>
>> On Wed, Oct 6, 2010 at 4:32 AM, jody  wrote:
>>> Hi
>>> I regularly use valgrind to check for leaks, but I ignore the leaks
>>> clearly created by OpenMPI, because I think most of them are deliberate
>>> efficiency trade-offs (no time is lost cleaning up unimportant
>>> allocations at shutdown).
>>> But I want to make sure no leaks come from my own apps.
>>> In most cases, leaks I am responsible for have the name of one of my
>>> files at the bottom of the stack printed by valgrind, and no internal
>>> OpenMPI calls above it, whereas leaks clearly caused by OpenMPI have
>>> something like ompi_mpi_init, mca_pml_base_open, or PMPI_Init at or
>>> very near the bottom.
>>>
>>> Now I have an application where I am completely unsure where the
>>> responsibility for a particular leak lies. valgrind shows (among
>>> others) this report:
>>>
>>> ==2756== 9,704 (8,348 direct, 1,356 indirect) bytes in 1 blocks are
>>> definitely lost in loss record 2,033 of 2,036
>>> ==2756==    at 0x4005943: malloc (vg_replace_malloc.c:195)
>>> ==2756==    by 0x4049387: ompi_free_list_grow (in
>>> /opt/openmpi-1.4.2.p/lib/libmpi.so.0.0.2)
>>> ==2756==    by 0x41CA613: ???
>>> ==2756==    by 0x41BDD91: ???
>>> ==2756==    by 0x41B0C3D: ???
>>> ==2756==    by 0x408AC9C: PMPI_Send (in
>>> /opt/openmpi-1.4.2.p/lib/libmpi.so.0.0.2)
>>> ==2756==    by 0x8123377: ConnectorBase::send(CollectionBase*,
>>> std::pair,
>>> std::pair >&) (ConnectorBase.cpp:39)
>>> ==2756==    by 0x8123CEE: TileConnector::sendTile() (TileConnector.cpp:36)
>>> ==2756==    by 0x80C6839: TDMaster::init(int, char**) (TDMaster.cpp:226)
>>> ==2756==    by 0x80C167B: main (TDMain.cpp:24)
>>> ==2756==
>>>
>>> At first glance it looks like an OpenMPI-internal leak, because it
>>> happens inside PMPI_Send. But I call ConnectorBase::send() several
>>> times from callers other than TileConnector, and those calls don't
>>> show up in valgrind's output.
>>>
>>> Does anybody have an idea what is happening here?
>>>
>>> Thank you
>>> jody



Re: [OMPI users] my leak or OpenMPI's leak?

2010-10-18 Thread Ralph Castain

On Oct 18, 2010, at 1:41 AM, jody wrote:

> I had this leak with OpenMPI 1.4.2.
> 
> But in my case there is no accumulation - when I repeat the same call,
> no additional leak is reported for the second call.

That's because Open MPI grabs a larger-than-required chunk of memory just in
case you call again. This helps performance by reducing the number of mallocs
your application makes.
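
As a rough illustration of that pattern (not Open MPI's actual code - the
names here are invented), a free list that grows in chunks looks roughly
like this:

#include <cstdlib>
#include <vector>

// Sketch of a chunked free list: the first request mallocs a whole
// chunk of entries; later requests recycle them, so no further
// allocations (and no further valgrind reports) occur.
struct FreeList {
    std::size_t item_size;
    std::vector<void*> free_items;  // recycled entries
    std::vector<void*> chunks;      // backing blocks, freed at exit (if ever)

    explicit FreeList(std::size_t sz) : item_size(sz) {}

    void grow(std::size_t n) {
        char* chunk = static_cast<char*>(std::malloc(n * item_size));
        chunks.push_back(chunk);
        for (std::size_t i = 0; i < n; ++i)
            free_items.push_back(chunk + i * item_size);
    }

    void* get() {
        if (free_items.empty())
            grow(64);               // larger than required, just in case
        void* p = free_items.back();
        free_items.pop_back();
        return p;
    }

    void put(void* p) { free_items.push_back(p); }
};

int main() {
    FreeList fl(128);
    for (int i = 0; i < 1000; ++i) {  // repeated "sends"
        void* req = fl.get();         // mallocs only on the first pass
        fl.put(req);
    }
}   // the chunk is never freed, so valgrind reports it exactly once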





Re: [OMPI users] Typo in man page for MPI_File_iwrite_at

2010-10-18 Thread Jeff Squyres
Fixed.  Thanks!


On Oct 16, 2010, at 1:06 AM, Jeremiah Willcock wrote:

> In the online man page for this function, MPI_MODE_SEQENTIAL should be 
> MPI_MODE_SEQUENTIAL.  Also, 
> http://www.open-mpi.org/doc/v1.4/man3/MPI_File_open.3.php is showing man page 
> source code rather than the rendered text.
> 
> -- Jeremiah Willcock


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] my leak or OpenMPI's leak?

2010-10-18 Thread jody
But shouldn't something like this show up in the other processes as well?
I only see that in the master process, but the slave processes also
send data to each other and to the master.


On Mon, Oct 18, 2010 at 2:48 PM, Ralph Castain  wrote:
>
> On Oct 18, 2010, at 1:41 AM, jody wrote:
>
>> I had this leak with OpenMPI 1.4.2.
>>
>> But in my case there is no accumulation - when I repeat the same call,
>> no additional leak is reported for the second call.
>
> That's because Open MPI grabs a larger-than-required chunk of memory just in
> case you call again. This helps performance by reducing the number of mallocs
> your application makes.



Re: [OMPI users] my leak or OpenMPI's leak?

2010-10-18 Thread Ralph Castain
My guess is that the master process, communicating with multiple slaves, needs
more memory and eventually has to grab more than the initial block we allocate
at startup - whereas the slaves, each talking to only one process, never
exhaust the initial block and therefore never allocate more.

Just guessing without digging into your specific code.
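
One way to test that guess is to run every rank under valgrind with a
per-process log file and compare the reports (command sketch - the
executable name is a placeholder):

mpirun -np 4 valgrind --leak-check=full --log-file=vg.%p.log ./td_app

If only the master's log contains the ompi_free_list_grow blocks, that
would fit the picture above.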

On Oct 18, 2010, at 10:01 AM, jody wrote:

> But shouldn't something like this show up in the other processes as well?
> I only see that in the master process, but the slave processes also
> send data to each other and to the master.