Ralph Castain <r...@open-mpi.org> writes:
> I can't speak to all of the OMPI code, but I can certainly create
> a new configure option --valgrind-friendly that would initialize
> the OOB comm buffers and other RTE-related memory to eliminate such
> warnings.

That would be excellent, thank you for offering.

> I would prefer to configure it out rather than adding a bunch of
> "if-then" checks for envars to avoid having the performance hit when
> not needed.

FWIW, we've solved this before by using function pointers initialized
on load, e.g. (warning, untested pseudocode):

  void mymethod(int stuff) { do_work(stuff); }
  void mymethod_debug(int stuff) {
    internal_consistency_check();
    do_work(stuff);
  }
  void (*method)(int);
  ...
  void init() {
    method = mymethod;
    if(getenv("DEBUGGING") != NULL) {
      method = mymethod_debug;
    }
  }
  void algorithm() {
    ...
    method(42);
    ...
  }

You'd only pay the branch during the one-time init().  Of course, every
call then goes through the pointer, so the method can't be inlined
anymore either.

Anyway, I realize that's quite a bit more work.  It would be my
preference, but the configure check would suffice for most of my needs.

> Would that help?

Tremendously, thank you.

-tom

> On Tue, Jun 9, 2009 at 11:40 AM, tom fogal <tfo...@alumni.unh.edu> wrote:
>
> > jody <jody....@gmail.com> writes:
> > > I made a suppression file for the irrelevant memory leaks of ompi: I
> > > make no claim that it catches all possible ones, but it catches all
> > > that appear in my code.
> > [snip]
> >
> > Thanks, Jody.
> >
> > What are the chances something like this could be added / maintained in
> > the OpenMPI tree?  It would be great to have something 1) maintained by
> > someone more knowledgeable about these errors than me, and 2) installed
> > by default when I set up my toolchain for parallel debugging.
> >
> > > On Tue, Jun 9, 2009 at 3:28 PM, Jeff Squyres<jsquy...@cisco.com> wrote:
> > > > This is worth adding to the FAQ.
> > > >
> > > > On Jun 9, 2009, at 2:31 AM, Ashley Pittman wrote:
> > > >
> > > >> On Mon, 2009-06-08 at 23:41 -0600, tom fogal wrote:
> > > >> > George Bosilca <bosi...@eecs.utk.edu> writes:
> > > >> > > There is a whole page on the valgrind web site about this topic.
> > > >> > > Please read
> > > >> > > http://valgrind.org/docs/manual/manual-core.html#manual-core.suppress
> > > >> > > for more information.
> > > >> >
> > > >> > Even better, Ralph (et al.), would be if we could just make
> > > >> > valgrind think this is defined memory.  One can do this with
> > > >> > client requests:
> > > >> >
> > > >> >   http://valgrind.org/docs/manual/mc-manual.html#mc-manual.clientreqs
> > > >>
> > > >> Using the Valgrind client requests unnecessarily is a very bad idea;
> > > >> they are intended for where applications use their own memory allocator
> > > >> (i.e. replace malloc/free) or are using custom kernel modules or
> > > >> hardware which Valgrind doesn't know about.
> >
> > Okay, sure, I realize it was a bit of an abuse of the intended use of
> > the tool.
> >
> > > >> The correct solution is either to not send un-initialised memory
> > > >> or to suppress the error using a suppression file as George
> > > >> said.  As the error is from MPI_Init() you can safely ignore it
> > > >> from an end-user perspective.
> >
> > As I mentioned in my initial message, MPI_Init is only one such
> > error; I get them in a lot of MPI calls, seemingly anything that does
> > communication.  Though I've heard differently on this list, this led me
> > to believe I was doing something wrong in my code.
> >
> > It seems like the only way I could verify that I'm not causing these
> > errors myself is to grok the call stacks I'm given for each vg error
> > and figure out where the uninitialized memory comes from, and then make
> > a judgement call for myself whether this makes sense to suppress.  Or
> > I could mail the list about every error I see and ask for confirmation
> > that it's benign/suppressable.  Most likely, I'll take the simple
> > approach and just use the suppression file I was given, but that's
> > prone to be fragile and break with a future OpenMPI release.
> >
> > What about an environment variable which enables slower,
> > valgrind-friendly behavior?  There's precedent in other libraries, e.g.
> > glib [1].
> >
> > -tom
> >
> > [1] http://library.gnome.org/devel/glib/stable/glib-running.html
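
For reference, the environment-variable idea quoted above can be combined with the function-pointer dispatch from the top of this message. What follows is only a minimal sketch, not Open MPI code: the variable name DEMO_VALGRIND_FRIENDLY and the functions select_alloc, init_alloc, alloc_fast and alloc_zeroed are all made up for illustration. The choice between a plain malloc and a zero-filling calloc is made once at startup, so callers never re-check the environment.

```c
#include <stdlib.h>

/* Fast path: buffer contents are left uninitialized. */
void *alloc_fast(size_t n) { return malloc(n); }

/* Valgrind-friendly path: every byte is defined (zeroed) before use. */
void *alloc_zeroed(size_t n) { return calloc(1, n); }

/* Chosen once; callers go through this pointer and never branch. */
void *(*alloc_buffer)(size_t) = alloc_fast;

/* Pick the allocator from a flag value; NULL or "" means the fast path. */
void select_alloc(const char *flag)
{
    if (flag != NULL && flag[0] != '\0') {
        alloc_buffer = alloc_zeroed;
    } else {
        alloc_buffer = alloc_fast;
    }
}

/* One-time setup: read the (hypothetical) environment variable. */
void init_alloc(void)
{
    select_alloc(getenv("DEMO_VALGRIND_FRIENDLY"));
}
```

A program would call init_alloc() once before its first allocation and then use alloc_buffer(n) wherever it previously called malloc(n). The indirect call still cannot be inlined, which is the trade-off already noted for the pseudocode above.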